Introduction
In a presentation at InfoQ Dev Summit Boston, Vivek Gupta, Director of the AI Rotational Program at Microsoft, shared his insights on how to grow and cultivate strong machine learning (ML) engineers. With over 30 years of experience in software development and more than a decade focused on AI and ML, Gupta provided a comprehensive guide for managers and engineers alike. This post summarizes the key takeaways from his talk.
What Managers Need to Know
To effectively lead ML engineers, managers need a broad understanding of various domains, including product management, applied sciences, and engineering. While they don't need to be experts in everything, they must stay current with the latest tools and technologies to guide their teams and foster innovation.
Nourishing Early-Career Engineers
For engineers just starting their careers, Gupta emphasized the following:
- Feedback: Provide regular and constructive feedback on coding, collaboration, and prioritization.
- Time to Learn: Carve out dedicated time for learning and experimentation, such as through hackathons.
- Encourage Questions: Create a safe environment for asking questions so that engineers do not stay stuck on a problem.
- Mentoring: Assign mentors to guide them through their learning journey.
- Collaboration: Foster collaboration across different teams and disciplines to share knowledge and reduce duplicate efforts.
Cultivating Senior Engineers
For senior engineers, the focus shifts to:
- Advanced Mentoring: They should not only receive mentoring from more senior people but also become mentors to junior engineers.
- Wider Collaboration: Encourage them to look for collaboration opportunities across the entire organization.
- Driving Innovation: They should be proposing new ideas for projects and hackathons.
Essential Skills for Production Machine Learning
Gupta highlighted that ML engineers need all the standard engineering skills (coding, testing, DevOps, etc.), but they also require specialized skills for production ML:
- Understanding Data Science: ML engineers need to understand the data science process to build scalable and resilient systems.
- Data Management: This includes tracking data used for training and validation, managing data pipelines, and monitoring for data shifts.
- Privacy and Security: Ensuring that data is handled in a privacy-preserving and secure manner is crucial.
- Training Pipelines: Building automated and consistent pipelines for model training and retraining.
- Model and Prompt Management: Versioning models, managing prompts for LLMs, and ensuring compatibility when models are updated.
- Evaluation: Implementing robust evaluation techniques for models, especially for LLMs, whose outputs are probabilistic and can vary between runs.
- Telemetry: Collecting data on model performance in production to identify when retraining is needed.
- Human in the Loop: Designing systems that incorporate human oversight before accepting automated results.
- User Feedback: Closing the loop by incorporating user feedback to improve models over time.
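To make the data management and telemetry points above more concrete, here is a minimal sketch of one way to monitor for data shifts and flag when retraining may be needed. It is an illustration, not something shown in the talk. It uses the Population Stability Index (PSI), a common drift metric chosen here as an assumption. The function names, the bin count, and the 0.2 threshold (a widely quoted rule of thumb) are all hypothetical choices.

```python
import math
from typing import Sequence


def population_stability_index(
    expected: Sequence[float], actual: Sequence[float], bins: int = 10
) -> float:
    """Compare two samples of one numeric feature; larger values mean more shift.

    `expected` is the training-time sample, `actual` is the production sample.
    Bin edges are derived from `expected` only (an assumption of this sketch).
    """
    if not expected or not actual:
        raise ValueError("both samples must be non-empty")
    lo, hi = min(expected), max(expected)
    if hi == lo:
        raise ValueError("expected sample has no spread; cannot build bins")
    width = (hi - lo) / bins

    def bin_fractions(values: Sequence[float]) -> list[float]:
        counts = [0] * bins
        for v in values:
            # Values outside the training range are clamped to the edge bins.
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # A small floor avoids log(0) and division by zero for empty bins.
        return [max(c / len(values), 1e-6) for c in counts]

    e = bin_fractions(expected)
    a = bin_fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


def needs_retraining(
    expected: Sequence[float], actual: Sequence[float], threshold: float = 0.2
) -> bool:
    """Flag a feature for review when PSI exceeds the (assumed) threshold."""
    return population_stability_index(expected, actual) > threshold


if __name__ == "__main__":
    training = [i / 100 for i in range(100)]
    production = [v + 0.5 for v in training]  # simulated shift
    print("PSI:", population_stability_index(training, production))
    print("Retrain?", needs_retraining(training, production))
```

In a real system a check like this would typically run on telemetry collected per feature, and a flagged result would go to a person for review before any retraining is triggered, in line with the human-in-the-loop point above.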
Conclusion
Growing and cultivating strong ML engineers requires a multifaceted approach that includes providing the right support for early-career engineers, fostering the growth of senior engineers, and developing the specialized skills needed for production machine learning. By focusing on these areas, organizations can build high-performing ML teams that drive innovation and deliver value.