Senior Platform/MLOps Engineer
Bright Machines
Platform Engineers at Bright Machines are responsible for defining and implementing the systems that make Software Defined Manufacturing possible and that power our flexible robotic manufacturing lines. Our robots, and the software that controls them, are deployed in a variety of factory conditions and help support the manufacturing operations for some of the biggest names in the industry.
As a Senior Platform/MLOps Engineer, you will build scalable systems that are foundational to the Bright Machines technology stack. With a focus on our AI/ML infrastructure, you will design, implement, and maintain our training pipelines, model deployments, and inference app workloads. Our computer vision and deep learning models are used for defect detection, classification, and visual validation, providing end-to-end inspection solutions that deliver consistent, accurate results under real-world factory conditions. You will collaborate with the Smart Robotics team and the Platform Engineering team to design, implement, and deploy our GPU workloads on Kubernetes. If you are ready to apply exceptional engineering practices and build the platform that will define the next generation of manufacturing, this is your opportunity to “Be Bright”.
WHAT YOU WILL BE DOING
Design, implement, and maintain reliable, scalable, and secure infrastructure, applications, and tooling, with a focus on our ML/AI pipelines and workloads
Write clean, maintainable code and perform peer code reviews
Write clear and concise documentation and engage in cross-team communication and knowledge sharing
Work with other team members to investigate design approaches, prototype new technology, and evaluate technical feasibility
Pair with adjacent teams to understand how your frameworks and infrastructure are actually used in the field, continuously improving them and leveraging recent advances to improve developer velocity
WHAT YOU WILL BRING
At least 5 years of experience in Platform Engineering, DevOps, or Site Reliability Engineering (SRE)
B.S. or M.S. degree (or equivalent) in Computer Science, Engineering, or a related field
Proficiency in at least one modern programming language (Python, JavaScript, C#, Go, etc.)
Demonstrated application of industry best practices in MLOps
Proficiency with CI/CD tools and GitOps workflows
Familiarity with running GPU workloads on Kubernetes
Strong knowledge of Kubernetes (self-hosted and managed) and modern Kubernetes paradigms (e.g., the CNCF ecosystem)
Proficiency with Infrastructure as Code tools (Terraform, etc.) and configuration management tools (Ansible, etc.)
Familiarity with observability stacks (Prometheus, Grafana, OpenTelemetry)
IT WOULD BE GREAT IF YOU HAD
Experience in air-gapped or extremely strict security environments
Experience communicating with users, technical leaders, and management to collect requirements, describe system designs, and architect software systems that meet your stakeholders' needs
Knowledge and demonstrated application of software engineering best practices relating to the SDLC, including code reviews, SCM, CI/CD, testing, and operations
Demonstrated ability to mentor and grow other team members
160,000 - 190,000 USD a year