Staff Data Engineer (San Diego) #3378

San Diego, CA
Research & Development – Research /
Full-Time /
Remote
Are you a champion of automation interested in using your talents to optimize processes that will make an impact on the fight against cancer? If so, join GRAIL on the Data Integration team in Research!

GRAIL is seeking a Staff Data Engineer to join our team to support the growing data needs of GRAIL’s clinical and research activities. You will leverage your expertise in automation and data engineering to ensure our scientific teams have the data they need to succeed. This role is pivotal in advancing GRAIL's mission by enhancing our data infrastructure and contributing to our early cancer detection efforts.

RESPONSIBILITIES

    • Be a part of a highly collaborative team that focuses on delivering value to cross-functional partners by designing, deploying, and automating secure, efficient, and scalable data infrastructure and tools, reducing manual efforts and streamlining operations.
    • Help model GRAIL data and ensure that it follows FAIR principles (findable, accessible, interoperable, and reusable).
    • Drive the design, deployment, and automated delivery of data infrastructure, standardized data models, datasets, and tools.
    • Integrate automated testing and release processes to improve the quality and velocity of software and data deliveries.
    • Collaborate with cross-functional teams, from Research to Clinical Lab Operations to Software Engineering, to provide comprehensive data solutions from conception to delivery.
    • Ensure all software and data meet high standards for quality, clinical compliance, and privacy.
    • Mentor fellow engineers and scientists, promoting best practices in software and data engineering.

PREFERRED EXPERIENCE

    • B.S. / M.S. in a quantitative field (e.g., Computer Science, Engineering, Mathematics, Physics, Computational Biology) with at least 8 years of related industry experience, or Ph.D. with at least 5 years of related industry experience.
    • Extensive experience with relational databases, data modeling principles, data pipeline tools, and workflow engines (e.g., SQL, dbt, Apache Airflow, AWS Glue, Spark).
    • Extensive experience with DevOps practices, including CI/CD pipelines, containerized deployment (e.g., Kubernetes), and infrastructure-as-code (e.g., Terraform).
    • Experience with supporting data science / machine learning data pipelines, preferably in the context of analysis of biological data.
    • Experience in developing data pipelines using scalable cloud-based data warehouses / data lakes on AWS, Azure, or GCP.
    • Solid programming skills in object-oriented and/or functional programming paradigms.
    • Ability to embrace uncertainty, navigate ambiguity, and collaborate with product teams and stakeholders to refine requirements and drive towards clear engineering objectives and designs.
    • A commitment to constructive dialogue, both in giving and receiving critical feedback, to foster an environment of continuous improvement.

HIGHLY WELCOME EXPERIENCE

    • Prior experience in the healthcare, biotech, or life sciences industry, especially in the context of next-generation sequencing.
    • Experience working in a regulated environment (e.g., FDA, CLIA, GDPR).
    • Proficiency in Python and R.
    • Experience building microservices and web applications.

The estimated, full-time, annual base pay scale for this position is $180,000 - $202,000. Actual base pay will consider skills, experience, and location.