The Senior / Principal Data Engineer will design, develop, and maintain scalable data platforms and analytics solutions across data lakes and operational databases. This role requires hands-on expertise in Azure Databricks, Azure SQL, Python/PySpark, and notebooks, along with a strong understanding of data modeling, ETL/ELT best practices, and CI/CD automation in Azure DevOps. The ideal candidate will have a proven track record of building robust, efficient, and secure data pipelines that enable analytics, reporting, and AI/ML solutions, preferably in the life sciences, clinical research, or healthcare domains.
Responsibilities
Data Architecture & Engineering
- Design and implement end-to-end data pipelines using Azure Databricks, Azure Data Factory, and ADLS Gen2.
- Build scalable and performant data models for data lakes (Medallion architecture), data warehouses, and operational systems.
- Develop ELT/ETL frameworks for ingestion from APIs, relational sources, flat files, and third-party systems (e.g., Dynamics 365, Veeva, EDC).
- Optimize data transformations, partitioning, and Delta Lake performance for analytics workloads (see the sketch after this list).
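For a concrete flavor of this work, here is a minimal sketch of a bronze-to-silver refinement step in PySpark on Delta Lake; the storage paths, table grain, and column names are illustrative assumptions, not part of the posting.

```python
# Minimal bronze-to-silver refinement sketch (paths and columns are hypothetical).
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("bronze_to_silver").getOrCreate()

# Read raw ingested records from the bronze layer (ADLS Gen2 path is illustrative).
bronze = spark.read.format("delta").load(
    "abfss://lake@account.dfs.core.windows.net/bronze/visits"
)

# Cleanse and conform: deduplicate, normalize types, and drop malformed rows.
silver = (
    bronze
    .dropDuplicates(["visit_id"])
    .withColumn("visit_date", F.to_date("visit_date"))
    .filter(F.col("visit_id").isNotNull())
)

# Write partitioned Delta output for downstream analytics workloads.
(
    silver.write.format("delta")
    .mode("overwrite")
    .partitionBy("visit_date")
    .save("abfss://lake@account.dfs.core.windows.net/silver/visits")
)

# Compact small files so query performance stays predictable (Databricks Delta).
spark.sql(
    "OPTIMIZE delta.`abfss://lake@account.dfs.core.windows.net/silver/visits`"
)
```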
Data Integration & Automation
- Leverage Python and PySpark for data ingestion, cleansing, enrichment, and advanced transformations.
- Implement CI/CD pipelines for data workflows using Azure DevOps and Git, including automated testing, deployment, and monitoring.
- Develop and integrate RESTful APIs for cross-system data exchange and automation (an ingestion sketch follows this list).
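As one hedged example of API-to-lake ingestion, the snippet below pulls records from a REST endpoint and lands them in a bronze Delta table; the endpoint URL, payload shape, token handling, and target path are all assumptions for illustration.

```python
# Sketch: ingest records from a REST endpoint into the bronze layer
# (endpoint, payload shape, and target path are hypothetical).
import requests
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("api_ingest").getOrCreate()

resp = requests.get(
    "https://api.example.com/v1/studies",          # illustrative endpoint
    headers={"Authorization": "Bearer <token>"},   # secret would come from Key Vault
    timeout=30,
)
resp.raise_for_status()

# Convert the JSON payload into a DataFrame (schema inferred here for brevity)
# and append it to a bronze Delta table for later refinement.
records = resp.json().get("items", [])
if records:
    df = spark.createDataFrame(records)
    df.write.format("delta").mode("append").save("/mnt/lake/bronze/studies")
```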
Analytics Enablement
- Collaborate with the BI team to ensure clean, high-quality, and accessible data for the Power BI platform.
- Support semantic modeling, metric layer design, and data governance best practices.
- Enable advanced analytics by provisioning data for ML/AI initiatives and predictive insights (see the gold-layer sketch after this list).
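To make analytics enablement concrete, here is a minimal sketch of a gold-layer aggregate shaped for a Power BI semantic model; the source table, grain, and measures are assumed for illustration.

```python
# Sketch: publish a gold-layer aggregate for BI consumption
# (source table, grain, and measures are hypothetical; the "gold"
# database is assumed to exist).
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("gold_activity").getOrCreate()

silver = spark.read.format("delta").load("/mnt/lake/silver/visits")

# Pre-aggregate to the grain the semantic model expects, so Power BI
# measures stay simple and report queries stay fast.
gold = (
    silver
    .groupBy("study_id", F.date_trunc("month", "visit_date").alias("month"))
    .agg(
        F.countDistinct("subject_id").alias("active_subjects"),
        F.count("visit_id").alias("visit_count"),
    )
)

gold.write.format("delta").mode("overwrite").saveAsTable(
    "gold.monthly_study_activity"
)
```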
Cross-Functional Collaboration
- Collaborate with product/system owners, analysts, and business stakeholders to translate analytical requirements into technical data solutions.
- Drive best practices in Agile development, version control, and DevOps workflows.
Education Requirements and Qualifications
- Bachelor’s degree in Computer Science, Information Systems, Engineering, or related field (Master’s preferred).
- Minimum 5–8 years of relevant experience building and maintaining data solutions (data lakes, data warehouses, operational databases).
- Expert-level proficiency in Azure Databricks, PySpark, SQL, and Azure DevOps.
- Proven experience with Azure Data Factory, ADLS Gen2, and Azure SQL.
- Working knowledge of CI/CD automation, version control (Git), and infrastructure as code (ARM or Terraform).
- Experience with Power BI or similar analytics platforms (Tableau, Looker) required; experience with Snowflake, Redshift, or Synapse Analytics is a plus.
- Strong analytical, debugging, and performance-tuning skills.
- Experience in life sciences or healthcare industries is a strong plus.
Skills
- Core Expertise: Expert-level proficiency in Databricks, PySpark, SQL, and Azure DevOps.
- Data Engineering: Data modeling, Delta Lake optimization, ETL/ELT design, distributed processing.
- Integration & Automation: Azure Data Factory, REST APIs, CI/CD pipelines, Git branching strategies.
- Analytics & BI: Power BI (or Tableau), semantic layer design, DAX/SQL tuning.
- Cloud & DevOps: Azure ecosystem (ADF, ADLS, Azure SQL, Synapse), infrastructure as code.
- Data Governance & Quality: Metadata management, data validation frameworks, logging and monitoring (see the sketch after this list).
- Soft Skills: Strong communication, mentoring, Agile teamwork, analytical thinking, and collaboration.
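To make the data-quality and automated-testing expectations concrete, here is a minimal sketch of a validation check that could run as part of an Azure DevOps pipeline; the rule thresholds, column names, and function names are assumptions.

```python
# Sketch: a data-quality check suitable for a CI pipeline, with logging
# (thresholds and column names are hypothetical).
import logging

from pyspark.sql import DataFrame
from pyspark.sql import functions as F

logger = logging.getLogger("dq_checks")


def null_rate(df: DataFrame, column: str) -> float:
    """Return the fraction of rows where `column` is null."""
    total = df.count()
    if total == 0:
        return 0.0
    nulls = df.filter(F.col(column).isNull()).count()
    return nulls / total


def assert_quality(df: DataFrame) -> None:
    """Fail the pipeline run if key columns exceed a 1% null threshold."""
    for column in ("visit_id", "subject_id"):
        rate = null_rate(df, column)
        logger.info("null rate for %s: %.4f", column, rate)
        if rate > 0.01:
            raise ValueError(f"{column} null rate {rate:.2%} exceeds threshold")
```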
About Allucent
Allucent is on a mission to help bring new therapies to light by solving the distinct challenges of small and mid-sized biotech companies. We are purpose-built through the convergence of several leading providers to address this unmet need.