Staff Data Scientist - Verification & Validation
Zoox
Zoox is on an ambitious journey to develop a full-stack autonomous vehicle system for cities. We are seeking a Staff Data Scientist to join a verification and validation team that evaluates safety-critical AI systems.
You will join a team of software and data engineers who use methods including log data analysis, simulation, and closed-course structured testing. You'll work cross-functionally with AI software, System Design and Mission Assurance, Simulation, Sensors, and other teams to develop, execute, and iterate on validation methods and pipelines. These pipelines evaluate safety-critical systems, are highly visible, and sit on the critical path to launching our service. The ideal candidate combines statistical rigor with an engineering mindset to drive clarity from ambiguity, establish new processes, and propel the team forward.
In this role, you will:
Design Evaluation Frameworks: Architect statistical methodologies for safety-critical AI systems to form objective, rigorous conclusions about their performance and reliability.
Conduct Robust Analysis: Deliver validation evidence to support increasingly complex operations and identify potential edge-case failures.
Inform Strategy: Deliver clear, data-driven insights to development teams to guide system improvement, and to executive leadership to inform milestone-level go/no-go decisions.
Define Metrics: Drive alignment across engineering teams on performance metrics and data extraction strategies.
Lead the Lifecycle: Manage all phases of evaluation including prototyping, requirements capture, design, implementation, and validation.
Scale Pipelines: Partner with engineers to build and maintain scalable data processing and simulation pipelines, applying distributed computing to analyze petabytes of driving data.
Qualifications:
- MS or PhD in Statistics, Computer Science, Machine Learning, Applied Mathematics, or a related quantitative field
- Proficiency in Python and SQL, with experience writing production-quality code
- Demonstrated expertise in statistical methodologies including hypothesis testing, power analysis, spatiotemporal modeling, Bayesian inference, and multivariate analysis
- Experience with large-scale data analysis and statistical modeling
- Proficiency with Git, unit testing, and collaborative development practices
Bonus Qualifications:
- Hands-on experience with production machine learning pipelines: dataset creation, training frameworks, metrics pipelines
- Experience with modern data processing technologies such as Apache Spark, Spark SQL, and Databricks
- Experience designing metrics and delivering actionable insights that drive business decisions
256,000 - 307,000 USD a year