Job Description
Our customer is seeking a highly skilled Senior AI/ML Engineer to design, develop, and deploy AI-driven applications with a strong focus on multi-agent systems, streaming APIs, and efficient deployment architectures. This role requires expertise in using LangChain and LangGraph to build asynchronous multi-agent workflows, deploying optimized ONNX models, and leveraging AWS services such as SageMaker, Lambda, and API Gateway. You will work closely with cross-functional teams to create scalable, reliable, and efficient solutions.
Multi-Agent System Development:
- Design and develop multi-agent systems using LangChain and LangGraph for advanced AI-driven applications.
- Implement asynchronous and streaming functionality within LangChain environments for real-time interaction and data processing.
- Integrate multiple agents so they interact effectively and efficiently, using APIs and modular architectures.
Model Deployment and Optimization:
- Deploy, optimize, and manage ONNX models in production environments, focusing on minimizing latency and maximizing model performance.
- Use AWS SageMaker for model training, tuning, and inference pipeline management.
- Collaborate with data scientists to convert trained models to ONNX format and implement best practices in model compression and latency reduction.
Serverless Architecture and API Management:
- Develop and manage serverless functions and APIs using AWS Lambda and API Gateway, ensuring scalability and low latency.
- Design robust, scalable, and secure APIs to facilitate interactions between multi-agent systems and other application components.
- Oversee deployment pipelines and monitoring solutions to ensure efficient performance across serverless applications.
Collaboration and Documentation:
- Work closely with data scientists, DevOps engineers, and software developers to ensure smooth deployment and integration of models and services.
- Document processes, architectures, and best practices for repeatable and transparent AI/ML deployments.
- Provide mentoring and technical support to other team members on best practices for multi-agent systems, model deployment, and AWS integrations.
Required Qualifications:
- Bachelor’s or Master’s degree in Computer Science, AI/ML, Engineering, or a related field.
- 5+ years of experience in AI/ML engineering or software development, with a focus on model deployment and multi-agent systems.
- Strong experience with LangChain or LangGraph for multi-agent systems, asynchronous functionality, and streaming integrations.
- Proficiency in AWS services, especially SageMaker, Lambda, and API Gateway.
- Demonstrable experience in deploying and optimizing ONNX models in production environments.
- Proficient in Python, with a strong understanding of asynchronous programming.
- Experience with ML model frameworks (e.g., PyTorch, TensorFlow) and converting models to ONNX format.
- Deep understanding of AWS infrastructure, especially for serverless and machine learning services.
- Knowledge of RESTful and GraphQL API development, security, and best practices.
- Excellent problem-solving abilities and attention to detail.
- Strong communication skills, with the ability to work in cross-functional teams.
- Ability to work independently and proactively in a remote, collaborative environment.
Preferred Qualifications:
- Experience with other cloud platforms (e.g., Azure, GCP).
- Familiarity with containerized deployments using Docker or Kubernetes on AWS.
- Prior experience in architecting multi-agent systems for production environments.