Job Description
Aetheria AI Systems is seeking a visionary Senior Machine Learning Engineer to join our Core Intelligence team in San Francisco. We are building the next generation of autonomous reasoning agents that bridge the gap between large language models (LLMs) and complex enterprise workflows.
As an early hire, you will influence the architectural foundations of our proprietary inference engine, optimize high-throughput model deployments, and solve non-trivial problems in retrieval-augmented generation (RAG) and long-context reasoning. This is a high-impact role for an engineer who thrives at the intersection of cutting-edge research and production-grade software engineering.
Responsibilities
- Architect and implement scalable production pipelines for fine-tuning and deploying state-of-the-art LLMs and multi-modal models.
- Design and optimize RAG architectures using advanced vector databases and semantic caching mechanisms.
- Lead the development of custom evaluation frameworks to benchmark model performance against safety and accuracy KPIs.
- Collaborate with product teams to translate complex business requirements into robust AI-driven features.
- Optimize model latency and throughput for real-time inference using techniques like quantization and pruning.
- Mentor junior engineers and contribute to a culture of technical excellence and rapid iteration.
Qualifications
- Master’s or PhD in Computer Science, Artificial Intelligence, or a related quantitative field.
- 5+ years of experience in software engineering, including at least 3 years focused on machine learning in production environments.
- Deep expertise in Python and deep learning frameworks such as PyTorch or TensorFlow.
- Proven track record of deploying LLMs and working with Transformer architectures.
- Hands-on experience with cloud infrastructure (AWS/GCP) and with container and MLOps tooling such as Docker, Kubernetes, and MLflow.
- Strong understanding of vector databases (e.g., Pinecone, Weaviate) and distributed computing.