Job description

Role overview

We are looking for a practical, hands-on GCP Data Engineer who can contribute to the design and delivery of modern enterprise data and AI platforms on Google Cloud Platform. The role partners with lead engineers, enterprise architects, analytics teams, and AI/ML teams to create scalable, dependable, and AI-ready data solutions for analytics, automation, and digital transformation.

This position suits someone who has a strong foundation in cloud data engineering, distributed processing, and modern data platform development, and who enjoys building reusable, high-quality data assets in a fast-moving enterprise setting. It also offers the chance to work on large-scale modernization, semantic data enablement, and next-generation AI data ecosystems.

Role details

Experience: 3 to 6 years

Notice period: 15 days

Level: L30

Shift: 1:00 PM to 10:00 PM

Locations: Bangalore and Mumbai

Important note: This position is not meant for purely ETL-focused profiles. Strong hands-on experience in data product development is required.

Key responsibilities

Design, build, and maintain batch as well as real-time data pipelines on GCP.
Create ingestion, transformation, and serving flows that support analytics and AI use cases.
Help move legacy workflows into cloud-native data architectures.
Develop reusable data engineering components aligned with architectural standards.
Contribute to event-driven and streaming data processing implementations.
Build reusable, domain-focused data products.
Implement standardized transformation logic and data models for downstream analytics and AI consumption.
Support data quality checks, schema handling, metadata enrichment, data contracts, and reusable transformation frameworks.
Ensure pipelines are robust, scalable, and ready for production use.
Work with GCP services such as BigQuery, Dataflow, Dataproc, Pub/Sub, Cloud Storage, Cloud Composer (Airflow), and Cloud SQL, along with dbt.
Develop ETL and ELT pipelines and improve data processing performance.
Support orchestration and scheduling of enterprise workflows.
Track, troubleshoot, and resolve pipeline issues and operational failures.
Help define semantic models and business-friendly structures for analytics and reporting.
Partner with analytics and BI teams to improve consistency and usability of data assets.
Contribute to standardized metrics, dimensions, reporting datasets, and metadata/catalog integration.
Build AI-ready pipelines for ML and GenAI initiatives.
Support feature engineering and data preparation for AI/ML use cases.
Assist with integrations involving Vertex AI, BigQuery ML, vector databases, and GenAI frameworks.
Contribute to semantic search and AI-assisted data interaction patterns.
Follow coding, architecture, and DevOps standards.
Take part in code reviews, testing, debugging, and performance tuning.
Collaborate with architects, lead engineers, analysts, and client stakeholders.
Prepare engineering documentation, operational runbooks, and technical knowledge-sharing material.
Keep learning and adopting modern cloud, data engineering, and AI platform technologies.
Support monitoring, logging, lineage, and observability practices.
Ensure adherence to enterprise security, governance, and compliance requirements.
Assist with incident resolution, root cause analysis, and platform stability improvements.
Contribute to continuous improvement in operations and delivery quality.

Requirements

Bachelor’s degree in Computer Science, Engineering, Information Systems, or a related discipline.
3 to 6 years of experience in data engineering and cloud-based data platform development.
Hands-on work with Google Cloud Platform data services.
Strong programming ability in SQL and Python.
Experience building scalable ETL/ELT pipelines and distributed data processing workflows.
Understanding of data lakes, data warehouses, and streaming pipeline concepts.
Exposure to analytics, AI/ML, or GenAI-enabled data ecosystems is preferred.
Strong analytical, troubleshooting, and problem-solving skills.
Ability to work in Agile and cross-functional delivery teams.
GCP certifications such as Associate Cloud Engineer or Professional Data Engineer are an added advantage.
Exposure to enterprise-scale data modernization programs is preferred.
Familiarity with modern data engineering and data product development approaches is beneficial.
Experience with semantic layers, reporting platforms, or BI ecosystems is preferred.
Exposure to AI-ready data platforms and modern analytics ecosystems is an advantage.
Experience in retail, marketing, customer analytics, or digital commerce domains is a plus.

Technical expertise

Cloud data engineering on GCP, including BigQuery, Dataflow, Dataproc, Pub/Sub, and Cloud Storage.
Data processing with SQL, Python, PySpark, and dbt.
Streaming and pipeline development with Apache Beam and batch/real-time processing patterns.
Workflow orchestration using Cloud Composer (Airflow) and Workflows.
Semantic and analytics enablement through semantic modeling and reporting datasets; exposure to Looker is preferred.
AI/ML enablement with Vertex AI, BigQuery ML, and awareness of GenAI ecosystems.
Metadata and governance concepts such as Data Catalog, metadata management, and lineage.
DevOps and automation knowledge, including CI/CD, Git, and automation frameworks.

Additional information

The role is at DCF Level L30. Candidates should be comfortable working in a hands-on engineering capacity and should have depth in data product development, not only in traditional ETL execution. Prior exposure to enterprise data modernization, semantic data structures, and AI-ready platforms will be valuable.
