Job Description
Are you ready to push the boundaries of artificial intelligence? Nexus AI Research Labs is seeking a high-caliber Senior Machine Learning Engineer to join our core infrastructure team in San Francisco. We build cutting-edge generative models that solve real-world problems at scale.
As a key member of our engineering team, you will design, implement, and optimize deep learning pipelines alongside world-class researchers and distributed systems experts. If you are passionate about scalable AI and about getting the most out of production systems, we want to hear from you.
Responsibilities
- Architect and deploy large-scale distributed training frameworks for foundation models.
- Optimize inference pipelines to minimize latency in production environments.
- Collaborate with research scientists to translate experimental code into scalable production features.
- Implement rigorous testing and monitoring protocols for ML model performance.
- Mentor junior engineers and promote best practices in MLOps and clean code architecture.
- Analyze massive datasets to improve model accuracy and training efficiency.
Qualifications
- M.S. or Ph.D. in Computer Science, Artificial Intelligence, or a related quantitative field.
- 5+ years of experience in production-level machine learning engineering.
- Expert-level proficiency in Python and deep learning frameworks (PyTorch or JAX).
- Strong understanding of CUDA programming and experience optimizing GPU utilization.
- Experience with cloud infrastructure (AWS/GCP) and container orchestration (Kubernetes).
- Proven track record of deploying models that serve millions of daily requests.
- Excellent communication skills, including the ability to explain complex technical concepts clearly.