Job Description
This role requires an individual who is obsessed with data quality, comfortable writing Python scripts and libraries, experienced with financial datasets, and proficient in SQL and database management.
Responsibilities
- Serve as a key point of contact for our clients, managing their expectations effectively and delivering top-notch service.
- Develop and maintain ETL pipelines, ensuring data integrity, quality, and timeliness.
- Write, debug, and optimize Python scripts and libraries to automate data processes.
- Handle complex financial datasets, extracting meaningful insights that drive business value.
- Utilize your strong SQL and database skills to design and implement robust data structures that meet client requirements.
- Employ data ingestion patterns, including FTP, APIs, and web scraping, to fetch and process data from various sources.
- Collaborate with the broader team to meet project goals, maintaining a high level of communication, documentation, and cooperation.
- Leverage your knowledge of Google Cloud Platform (or similar cloud technologies) to implement scalable solutions.
- Stay current with industry trends and emerging technologies in data engineering.
- Strictly adhere to best practices and promote a strong SDLC and DevOps culture using a modern cloud data engineering technology stack.
Qualifications
Education
- Bachelor’s or Master’s degree in Computer Science, Data Science, or a related field.
- 3-5 years of proven experience as a Data Engineer or in a similar role, preferably in the financial sector (or an equivalent combination of education and experience).
Experience
- 2+ years of client-facing experience, with a background in managed services.
- 3+ years of experience with Python for writing scripts and libraries.
- Experience working with financial datasets.
Technical Skills
- Strong knowledge of data architecture, data modeling, and data management principles.
- Strong SQL and database management skills.
- Proficiency with Git or similar version control systems.
- Proficiency in database technologies, ETL processes, and data ingestion methods (FTP, APIs, and web scraping).
- Experience with at least one cloud platform (e.g., AWS, Azure, Google Cloud).
- Knowledge of and hands-on expertise in the following areas:
  - Data lake architectures
  - Distributed computing
  - Data modeling for relational and non-relational data stores
- Prior hands-on experience running a workflow orchestration system such as Apache Airflow, Kestra, Prefect, or Mage at scale.
- Experience with cloud data warehouses such as Snowflake, Redshift, Azure Synapse, or Google BigQuery.
- Hands-on coding proficiency in modern Python and Java, along with fluency in JSON and YAML.
Soft Skills
- Expertise in managing client expectations and delivering on them.
- Sense of ownership and responsibility, with the ability to work independently with minimal supervision.
- Demonstrable obsession with data quality and attention to detail.
- Excellent communication and interpersonal skills.
Location Requirement
- Must be within commuting distance of NYC for regular client meetings.