Databricks Data Architect
Hybrid · San Rafael, CA, USA · South San Francisco, CA, USA
Role Overview
We are seeking a Databricks Data Architect to support the design, implementation, and optimization of cloud-native data platforms built on the Databricks Lakehouse architecture. This is a hands-on, engineering-driven role requiring deep experience with Apache Spark, Delta Lake, and scalable data pipeline development, combined with early-stage architectural responsibilities.
The role involves close onsite collaboration with client stakeholders, translating analytical and operational requirements into robust, high-performance data architectures, while adhering to best practices for data modeling, governance, reliability, and cost efficiency.

Key Responsibilities
·      Design, develop, and maintain batch and near-real-time data pipelines using Databricks, PySpark, and Spark SQL
·      Implement Medallion (Bronze/Silver/Gold) Lakehouse architectures, ensuring proper data quality, lineage, and transformation logic across layers (a minimal Bronze-to-Silver sketch follows this list)
·      Build and manage Delta Lake tables, including schema evolution, ACID transactions, time travel, and optimized data layouts
·      Apply performance optimization techniques such as partitioning strategies, Z-Ordering, caching, broadcast joins, and Spark execution tuning (see the optimization sketch after this list)
·      Support dimensional and analytical data modeling for downstream consumption by BI tools and analytics applications
·      Assist in defining data ingestion patterns (batch, incremental loads, CDC, and streaming where applicable)
·      Troubleshoot and resolve pipeline failures, data quality issues, and Spark job performance bottlenecks
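
To illustrate the Medallion pattern referenced above, here is a minimal, hypothetical PySpark sketch of a Bronze-to-Silver transformation that writes to a Delta table with additive schema evolution. The paths, table names, and columns (order_id, order_ts) are placeholders for illustration only, not part of this posting's requirements.

```python
# Minimal illustrative sketch; paths, tables, and columns are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()  # on Databricks, `spark` is already provided

# Bronze: raw data landed as-is from the source system
bronze_df = spark.read.format("delta").load("/mnt/lake/bronze/orders")

# Silver: cleansed, deduplicated, and correctly typed records
silver_df = (
    bronze_df
    .dropDuplicates(["order_id"])
    .withColumn("order_ts", F.to_timestamp("order_ts"))
    .filter(F.col("order_id").isNotNull())
)

# Append to a Delta table; mergeSchema allows additive schema evolution
(
    silver_df.write.format("delta")
    .mode("append")
    .option("mergeSchema", "true")
    .saveAsTable("silver.orders")
)
```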
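
Likewise, a hedged sketch of the optimization techniques listed above (Z-Ordering via OPTIMIZE, a broadcast join hint, and caching), assuming Databricks-managed Delta tables with hypothetical names:

```python
# Illustrative only; table and column names are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql.functions import broadcast

spark = SparkSession.builder.getOrCreate()

# Compact small files and co-locate rows frequently filtered by customer_id
spark.sql("OPTIMIZE silver.orders ZORDER BY (customer_id)")

# Broadcast a small dimension table to avoid shuffling the large fact table
orders = spark.table("silver.orders")
customers = spark.table("silver.customers")
enriched = orders.join(broadcast(customers), "customer_id")

# Cache a DataFrame that is reused across multiple downstream aggregations
enriched.cache()
```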

Nice-to-Have Skills
·      Exposure to Databricks Unity Catalog, data governance, and access control models
·      Experience with Databricks Workflows, Apache Airflow, or Azure Data Factory for orchestration
·      Familiarity with streaming frameworks (Spark Structured Streaming, Kafka) and/or CDC patterns (a streaming ingest sketch follows this list)
·      Understanding of data quality frameworks, validation checks, and observability concepts
·      Experience integrating Databricks with BI tools such as Power BI, Tableau, or Looker
·      Awareness of cost optimization strategies in cloud-based data platforms
·      Prior experience in the life sciences domain
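
As a hedged illustration of the streaming/CDC familiarity mentioned above, a minimal Spark Structured Streaming sketch that lands Kafka change events in a Bronze Delta table; the broker address, topic, and checkpoint path are hypothetical.

```python
# Illustrative sketch; broker, topic, and checkpoint path are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# Read change events from Kafka as a stream
raw = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "orders_cdc")
    .load()
)

# Kafka values arrive as bytes; cast to string before downstream parsing
events = raw.select(F.col("value").cast("string").alias("payload"))

# Append to a Bronze Delta table, checkpointing for fault-tolerant delivery
query = (
    events.writeStream.format("delta")
    .option("checkpointLocation", "/mnt/lake/_checkpoints/orders_cdc")
    .outputMode("append")
    .toTable("bronze.orders_cdc")
)
```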