Modern drug discovery generates enormous amounts of complex data, from chemical assay readouts to multi-omics data and public literature-derived databases. This data represents billions of dollars and countless years of work, yet organizing it and extracting insights to accelerate the design of new therapeutics is one of the greatest challenges facing drug discovery teams.

At Inductive Bio, we’re making sense of this data to build machine learning models and tools that help scientists design better drugs faster. Our platform combines cutting-edge AI with intuitive software that allows drug discovery teams to collaborate effectively on the molecular design process. We are enabled by a unique and growing proprietary dataset, and we are already applying our methods to dozens of active drug discovery programs. Backed by leading investors at the intersection of biotechnology and technology and advised by renowned experts in drug discovery, we are growing rapidly and poised to make a major impact in drug discovery.

The Role

We are seeking a Data Engineer to join our talented, ambitious, and kind team. You’ll have the chance to take a leading role in designing and building the data infrastructure that powers our models and platform. You’ll be responsible for creating robust, scalable systems that ingest, transform, and serve diverse scientific datasets. As an early data engineer at a rapidly growing startup, you’ll have the opportunity to define best practices, shape technical strategy, and collaborate closely with ML scientists and chemists.

What you’ll do

Design and implement data pipelines that harmonize, validate, and version scientific data for downstream use in modeling and analysis
Develop tools and schemas for integrating heterogeneous data types (chemical, image-based, genomic, etc.)
Build and maintain scalable data storage systems and APIs to make experimental and model-derived data accessible to scientists and machine learning teams
Collaborate with ML Scientists to prepare and curate datasets for training and evaluating predictive models
Partner with Software Engineers to surface clean, well-structured data to end users through our internal and customer-facing platforms
Establish and enforce best practices for data governance, reproducibility, and lineage tracking

Who you are

4+ years of experience as a Data Engineer, ML Platform Engineer, or similar role
Proficiency building and maintaining data pipelines and ETL processes in Python (e.g., using orchestration tools such as Dagster, Airflow, or Prefect)
Experience with cloud-based storage and compute (AWS S3, ECS, etc., or equivalent)
Outstanding written and oral communication skills
Interest in diving deep into the science of drug discovery and the business of a growing startup
Nice to have: Experience managing and working with scientific data, particularly in chemistry

Working at Inductive

At Inductive Bio, we know that the people on the team are what make us great. We offer competitive salary and equity-based compensation; comprehensive healthcare benefits (including dental and vision); and the opportunity to grow along with a rapidly scaling company. We are a passionate, kind, and mature team. Working at a fast-growing startup is not always a 9–5 job, but we believe that our employees should have full lives beyond their careers.

This job is no longer accepting applications
