Job Description
Key Responsibilities:
- Databricks Platform Expertise (see the first sketch after this list):
  - Develop, manage, and optimize data pipelines on the Databricks platform.
  - Debug and troubleshoot Spark applications to ensure reliability and performance.
  - Apply Spark best practices to keep compute usage efficient and workloads optimized.
- Python Development (see the second sketch after this list):
  - Write clean, efficient, and reusable Python code using object-oriented programming principles.
  - Design and build APIs to support data integration and application needs.
  - Develop scripts and tools to automate data processing and workflows.
- MongoDB Management (see the third sketch after this list):
  - Integrate, query, and manage data within MongoDB.
  - Ensure efficient storage and retrieval tailored to application requirements.
  - Optimize MongoDB performance for large-scale data handling.
- Collaboration and Problem Solving:
  - Work closely with data scientists, analysts, and other stakeholders to understand data needs and deliver solutions.
  - Proactively identify and address technical challenges in data processing and system design.
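The sketches below are illustrative only, not part of the role's requirements. First, a minimal PySpark example of the kind of Databricks pipeline work described above; the table names (raw_events, analytics.daily_event_counts) and columns are hypothetical.

```python
# Minimal PySpark pipeline sketch; source/target tables and columns are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("daily-events-pipeline").getOrCreate()

# Read a raw source table registered in the workspace catalog.
raw = spark.read.table("raw_events")

# Basic cleanup and a daily aggregation per user.
daily_counts = (
    raw
    .filter(F.col("event_ts").isNotNull())
    .withColumn("event_date", F.to_date("event_ts"))
    .groupBy("event_date", "user_id")
    .count()
)

# Persist as a Delta table, partitioned by date for efficient downstream reads.
(daily_counts.write
    .format("delta")
    .mode("overwrite")
    .partitionBy("event_date")
    .saveAsTable("analytics.daily_event_counts"))
```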
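Second, a minimal sketch of the object-oriented Python and API work described above. FastAPI is used only as one common framework choice; the posting does not name a framework, and the classes, routes, and in-memory store are hypothetical stand-ins for a real backend.

```python
# Hypothetical OO data-access layer exposed through a small HTTP API.
from dataclasses import dataclass
from typing import List

from fastapi import FastAPI, HTTPException


@dataclass
class Event:
    user_id: str
    event_type: str


class EventStore:
    """Tiny in-memory store standing in for a real data backend."""

    def __init__(self) -> None:
        self._events: List[Event] = []

    def add(self, event: Event) -> None:
        self._events.append(event)

    def by_user(self, user_id: str) -> List[Event]:
        return [e for e in self._events if e.user_id == user_id]


app = FastAPI()
store = EventStore()


@app.get("/users/{user_id}/events")
def list_events(user_id: str) -> list:
    # Return the user's events, or 404 if none exist.
    events = store.by_user(user_id)
    if not events:
        raise HTTPException(status_code=404, detail="No events for user")
    return [e.__dict__ for e in events]
```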
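Third, a minimal pymongo sketch of the MongoDB querying and indexing work described above; the connection string, database, collection, and field names are placeholders.

```python
# Hypothetical pymongo example: index the hot query path, then read with a projection.
from pymongo import MongoClient, ASCENDING

client = MongoClient("mongodb://localhost:27017")
collection = client["analytics"]["events"]

# Compound index covering the most common query to keep reads fast at scale.
collection.create_index([("user_id", ASCENDING), ("event_date", ASCENDING)])

# Query recent events for one user, returning only the fields the application needs.
recent = (
    collection.find(
        {"user_id": "u123", "event_date": {"$gte": "2024-01-01"}},
        projection={"_id": 0, "event_type": 1, "event_date": 1},
    )
    .sort("event_date", ASCENDING)
    .limit(100)
)

for doc in recent:
    print(doc)
```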
Required Qualifications:
- Proven experience with Databricks and Apache Spark.
- Proficiency in Python, including object-oriented programming and API development.
- Familiarity with MongoDB, including querying, data modeling, and optimization.
- Strong problem-solving skills and ability to debug and optimize data processing tasks.
- Experience with large-scale data processing and distributed systems.
Preferred Qualifications:
- Knowledge of other big data technologies like Delta Lake, Hadoop, or Kafka.
- Experience with cloud platforms (e.g., AWS, Azure, or GCP).
- Familiarity with CI/CD pipelines and version control systems like Git.
- Strong understanding of data architecture, ETL processes, and data warehousing concepts.