Job Description
Microsoft’s mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.
Responsibilities
- Architect & Build: Develop large-scale, highly available data pipelines (batch and streaming) that power real-time machine learning and analytics across Microsoft Ads.
- ML Pipeline Integration: Collaborate with data scientists to integrate models (e.g., LLMs, ranking algorithms, and fraud detection classifiers) into production workflows.
- Optimize & Scale: Leverage Azure big data frameworks (ADF, AML), SCOPE, COSMOS, Spark, or similar technologies to optimize data processing, reduce latency, and manage costs effectively.
- Data Quality & Governance: Implement frameworks for auditing, lineage tracking, and automated validation to ensure data fidelity, compliance, and privacy.
- Reliability & SLAs: Define, monitor, and enforce performance SLAs for mission-critical data flows in a 24×7 environment.
- Automation & Tooling: Develop CI/CD pipelines and monitoring and alerting tools to reduce manual overhead and streamline deployments.
- Dashboards and Visualization: Develop dashboards using Power BI or similar tools to enable visualization of data pipeline operations.
- Leadership & Collaboration: Work cross-functionally with product managers, ML researchers, and software engineers; mentor junior engineers and guide architectural best practices.
Qualifications
Required Qualifications
- Bachelor’s Degree in Computer Science or related technical field AND 4+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python, OR equivalent experience.
- Experience with machine learning workflows and integrating ML models into production pipelines.
- Expertise in distributed systems and big data technologies such as Hive, Presto, Spark, or their Azure equivalents.
- Solid programming skills in C#, .NET, SQL, Python or equivalent, with a focus on scalable and cost-effective solutions.
- Deep understanding of distributed systems, stream processing, and high-performance computing.
- Proven ability to automate data auditing and implement data lineage tracking tools to reduce operational overhead.
- Experience handling large-scale, high-volume datasets with an emphasis on cost optimization.
- Knowledge of CI/CD pipelines, containerized environments, and cloud infrastructure.
Preferred Qualifications
- Familiarity with data visualization tools for delivering operational insights.
- Proven experience in data privacy compliance and governance practices.
- Hands-on experience in building and deploying machine learning models in production environments.
- Solid communication and collaboration skills to work effectively with diverse teams.
Microsoft is an equal opportunity employer. Consistent with applicable law, all qualified applicants will receive consideration for employment without regard to age, ancestry, citizenship, color, family or medical care leave, gender identity or expression, genetic information, immigration status, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran or military status, race, ethnicity, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable local laws, regulations and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application process, read more about requesting accommodations.