Job Title: Senior Architect – AWS | Kafka | Glue Streaming | API Consumption
Location: Dallas, Texas (preferred) or Florham Park, New Jersey
Full-time
About the Role:
We are seeking a Senior Technical Lead/Architect with strong expertise in AWS-based streaming data pipelines, Apache Kafka (MSK), AWS Glue, Flink, and PySpark to help solution, design, and implement scalable data ingestion, validation, enrichment, and reconciliation processing, along with a framework for event logging, data observability, and operational KPI tracking.
You will play a key role in solutioning and building out event-driven capabilities, with control gates in place to measure, track, and improve operational SLAs, and in driving the data quality and reconciliation workflows for a high-impact data platform serving financial applications.
Key Responsibilities:
Lead technical solution discovery for new capabilities and functionality.
Assist the Product Owner (PO) with technical user stories to maintain a healthy feature backlog.
Lead the development of real-time data pipelines using AWS DMS, MSK, Kafka, or Glue Streaming for CDC ingestion from multiple SQL Server sources (RDS/on-prem).
Build and optimize streaming and batch data pipelines using AWS Glue (PySpark) to validate, transform, and normalize data into Iceberg and DynamoDB.
Define and enforce data quality, lineage, and reconciliation logic with support for both streaming and batch use cases.
Integrate with S3 Bronze/Silver layers and implement efficient schema evolution and partitioning strategies using Iceberg.
Collaborate with architects, analysts, and downstream application teams to design API and file-based egress layers.
Implement monitoring, logging, and event-based alerting using CloudWatch, SNS, and EventBridge.
Mentor junior developers and enforce best practices for modular, secure, and scalable data pipeline development.
Required Skills:
6+ years of hands-on, expert-level data engineering experience in cloud-based environments (AWS preferred), including event-driven implementations
Strong experience with Apache Kafka / AWS MSK including topic design, partitioning, and Kafka Connect/Debezium
Proficiency in AWS Glue (PySpark) for both batch and streaming ETL
Working knowledge of AWS DMS, S3, Lake Formation, DynamoDB, and Iceberg
Solid grasp of schema evolution, CDC patterns, and data reconciliation frameworks
Experience with infrastructure-as-code (CDK/Terraform) and DevOps practices (CI/CD, Git