Data Engineer | Hybrid | Fort Mill, SC, USA | New York, NY, USA

Overview

Key Responsibilities

· Design, develop, and maintain ETL pipelines using AWS Glue, Glue Studio, and Glue Catalog.

· Ingest, transform, and load large datasets from structured and unstructured sources into AWS data lakes/warehouses.

· Work with S3, Redshift, Athena, Lambda, and Step Functions for data storage, querying, and orchestration.

· Build and optimize PySpark/Scala scripts within AWS Glue for complex transformations.

· Implement data quality checks, lineage, and monitoring across pipelines.

· Collaborate with business analysts, data scientists, and product teams to deliver reliable data solutions.

· Ensure compliance with data security, governance, and regulatory requirements (experience in banking, financial services, and insurance (BFSI) preferred).

· Troubleshoot production issues and optimize pipeline performance.

Required Qualifications

· 9+ years of experience in Data Engineering, including at least 5 years on AWS cloud data services.

· Strong expertise in AWS Glue, S3, Redshift, Athena, Lambda, Step Functions, and CloudWatch.

· Proficiency in PySpark, Python, and SQL for ETL and data transformations.

· Experience in data modeling (star, snowflake, dimensional models) and performance tuning.

· Hands-on experience with data lake/data warehouse architecture and implementation.

· Strong problem-solving skills and ability to work in Agile/Scrum environments.

Preferred Qualifications

· AWS Certified Data Analytics – Specialty or AWS Solutions Architect certification.

· Familiarity with CI/CD pipelines for data engineering (CodePipeline, Jenkins, GitHub Actions).

· Knowledge of BI/visualization tools such as Tableau, Power BI, or QuickSight.