Applied AI Engineer, Clinical Informatics

at Eli Lilly

Eli Lilly, 2 Locations

Job description

At Lilly, we unite caring with discovery to make life better for people around the world. We are a global healthcare leader headquartered in Indianapolis, Indiana. Our employees around the world work to discover and bring life-changing medicines to those who need them, improve the understanding and management of disease, and give back to our communities through philanthropy and volunteerism. We give our best effort to our work, and we put people first. We’re looking for people who are determined to make life better for people around the world.

Therapy areas across Eli Lilly focus on new therapeutic approaches for the treatment of different diseases. You will work with partners across Lilly to discover and develop novel biologic, small molecule, and nucleic acid-based therapeutics. Our focus is the patient: by understanding the biology and pathophysiology underlying disease states, we aim to address the root cause of disease and develop breakthrough therapies. We have one of the strongest pipelines in the industry and a track record of delivering impactful medicines that improve people’s lives.

The Lilly research environment is evolving to centralize the access and analysis of human genetic, omic, and clinical data. This new initiative will define the data, tools, and processes that provide therapy area teams with key evidence for target evaluation and target discovery.

We are seeking a highly specialized Applied AI Engineer / Clinical Informatician to lead research at the intersection of completed clinical trial datasets and biobank-linked population data. This is fundamentally a hands-on research role (not operational trial management), where you will be an individual contributor.

Your core mission is to build the systems and tools that extract, define, and contextualize patient phenotypes from locked trial databases, real-world data, and biobank cohorts, turning archived data into translational insight that shapes the next generation of clinical research. You will work with rich, already-collected datasets: locked trial databases, archived omics profiles, longitudinal electronic health records, and population-scale biobank cohorts. Your mandate is to build the AI and ML systems that make these datasets manageable and ready for detailed analysis. This role suits someone who thinks like a scientist, builds like an engineer, and communicates like a clinician. Apply today!

Key Responsibilities

AI & Machine Learning for Translational Discovery

- Develop and deploy agentic AI applications that enable natural language interaction with clinical data
- Ground AI outputs in validated biological knowledge, for example by implementing RAG pipelines anchored in biomedical ontologies (HPO, Gene Ontology, MeSH, DrugBank), clinical trial registries, and curated pathway databases
- Deploy unsupervised and self-supervised learning approaches (such as clustering, representation learning, and contrastive learning) to discover latent patient archetypes and molecular disease subtypes across trial and biobank data
- Deploy survival models and dynamic treatment regime estimators using combined clinical and omics features
- Build AI tooling to harmonize heterogeneous trial and biobank datasets into common data representations
- Evaluate and monitor model performance, safety, and reliability in production environments
- Manage vendors and contractors, as well as partner relationships with relevant teams across Lilly

Post-Trial Data Research & Analysis

- Build pipelines for locked clinical trial databases (SDTM, ADaM) to conduct secondary and exploratory research beyond primary endpoints
- Deploy ML workflows to identify trial subgroup effects, treatment heterogeneity, and responder/non-responder signatures from completed trial data
- Mine adverse event narratives, clinical notes, and investigator comments in biobank and clinical datasets using NLP to surface latent safety signals not captured in structured endpoints
- Reconstruct patient-level longitudinal trajectories from trial visit data to model disease progression, drug response kinetics, and time-to-event outcomes
- Architect workflows for meta-analytic and cross-trial integrative analyses across multiple completed studies to identify generalizable biological and clinical patterns
- Build connections to large-scale biobank cohorts (UK Biobank, All of Us, etc.) as external validation and enrichment resources for trial-derived findings on clinical phenotypes

Research Rigor, Reproducibility & Governance

- Establish research data management practices that ensure full reproducibility of analyses, including data versioning, containerized compute environments, and audit-ready analysis logs
- Ensure all research activities follow HIPAA, GDPR, and relevant IRB and ethics committee requirements

Basic Qualifications

- M.S. in Biomedical Informatics, Computational Biology, Bioinformatics, Statistical Genetics, Epidemiology, Computer Science, or a closely related quantitative field, or an MD/PhD with equivalent depth in translational data science, with 6+ years of research experience working with clinical trial datasets (SDTM/ADaM), biobank data, or large-scale population health data in an academic, pharmaceutical, or research institute setting
- Or Ph.D. in Biomedical Informatics, Computational Biology, Bioinformatics, Statistical Genetics, Epidemiology, Computer Science, or a closely related quantitative field, or an MD/PhD with equivalent depth in translational data science, with 3+ years of research experience working with clinical trial datasets (SDTM/ADaM), biobank data, or large-scale population health data in an academic, pharmaceutical, or research institute setting

Additional Skills & Preferences

- Demonstrated use of AI tools in production environments for clinical data analysis
- Expert proficiency in Python and/or R for statistical modelling and ML; strong command of SQL and experience with cloud-based research computing environments (ideally DNAnexus, AWS, GCP, Azure, or HPC clusters)
- Familiar with advanced generative AI metho
