Machine Learning Engineer III / Senior Machine Learning Engineer - AI Platform
at Workday
Job description
Your work days are brighter here.

We’re obsessed with making hard work pay off, for our people, our customers, and the world around us. As a Fortune 500 company and a leading AI platform for managing people, money, and agents, we’re shaping the future of work so teams can reach their potential and focus on what matters most.

The minute you join, you’ll feel it. Not just in the products we build, but in how we show up for each other. Our culture is rooted in integrity, empathy, and shared enthusiasm. We’re in this together, tackling big challenges with bold ideas and genuine care. We look for curious minds and courageous collaborators who bring sun-drenched optimism and drive. Whether you’re building smarter solutions, supporting customers, or creating a space where everyone belongs, you’ll do meaningful work with Workmates who’ve got your back.

In return, we’ll give you the trust to take risks, the tools to grow, the skills to develop and the support of a company invested in you for the long haul. So, if you want to inspire a brighter work day for everyone, including yourself, you’ve found a match in Workday, and we hope to be a match for you too.

About the Team

Do you want to build impactful AI features and solutions that will be used by millions of end-users? We are in the AI Platform organization at Workday and we solve meaningful problems that lie at the intersection of machine learning and enterprise-scale software! We build advanced AI solutions that power the core Workday software by modeling user behavior and providing intelligent automation. Come join us and make work easier and more balanced for millions of Workday users!

About the Role

We are looking for a Machine Learning Engineer to help design and build our Agent Platform, the core infrastructure that enables teams to develop, deploy, orchestrate, and operate AI agents in production.

This role is focused on building the systems and tooling required to host and scale agent-based applications powered by LLMs. You will work across the platform stack to create reusable capabilities for agent execution, workflow orchestration, observability, evaluation, reliability, and developer experience.

You’ll partner closely with applied AI, product, and infrastructure teams to define how agents are built and operated across the organization. This is an ideal role for someone who enjoys solving hard engineering problems in a fast-evolving technical space and wants to shape the foundation for the next generation of AI applications.

Responsibilities:

- Design and build the core platform capabilities required to develop, host, and operate AI agents at scale.
- Develop infrastructure and services for agent execution, orchestration, state management, and runtime reliability.
- Build reusable abstractions, frameworks, and workflows in Python to support agent development patterns across teams.
- Design and implement systems for tool use, memory, retrieval, workflow coordination, and human-in-the-loop interactions.
- Build and maintain services deployed on Kubernetes, with a focus on scalability, resiliency, and operational excellence.
- Develop capabilities for evaluation, tracing, observability, debugging, and performance monitoring of agent behavior in production.
- Improve platform performance across latency, throughput, fault tolerance, and cost efficiency.
- Create internal APIs, SDKs, and developer tooling that make it easier for engineering teams to build on the platform.
- Partner with cross-functional teams to productionize new agent use cases and establish common platform patterns and best practices.
- Contribute to technical architecture and help define the roadmap for agent infrastructure and platform evolution.

About You

Basic Qualifications (MLE III):

- 3+ years of experience as part of a data science or machine learning software development team, or relevant work in a PhD or equivalent program.
- 5+ years of experience in Python and experience building reliable, maintainable production services.
- 3+ years of experience with distributed systems, APIs, asynchronous workflows, and service-oriented architecture.
- 3+ years of experience designing systems with a focus on scalability, reliability, observability, and maintainability.

Basic Qualifications (Sr. MLE):

- 6+ years of software engineering experience, including experience building and operating production-grade backend, ML, or platform systems.
- 8+ years of experience in Python and experience building reliable, maintainable production services.
- 5+ years of experience with distributed systems, APIs, asynchronous workflows, and service-oriented architecture.
- 5+ years of experience designing systems with a focus on scalability, reliability, observability, and maintainability.

Preferred Qualifications:

- Experience building or supporting agent platforms, AI infrastructure, or internal developer platforms.
- Experience building and deploying machine learning or LLM-powered applications in production.
- Familiarity with LLM application patterns, including:
  - Tool calling
  - Retrieval-augmented generation (RAG)
  - Memory and context management
  - Multi-step workflows and orchestration
  - Human-in-the-loop systems
- Experience designing and implementing evaluation frameworks for LLM or agent quality.
- Familiarity with vector databases, model serving, prompt/version management, and experimentation tooling.
- Solid knowledge of Data Science principles and their application in NLP
- Experi