Position Overview
The Data Engineering Lead is responsible for designing and implementing modern, scalable data architectures to support the migration of legacy, file-based analytical systems to AWS cloud-native environments.
This role leads the transformation of legacy SAS-based data storage models—including flat files, batch outputs, and subsystem-specific data artifacts—into structured, governed, and scalable data models optimized for cloud-native processing.
The Data Engineering Lead will ensure data integrity, performance, and visibility across a system-of-systems modernization initiative, while providing technical leadership for data modeling, ingestion patterns, validation frameworks, and transparency reporting.
Expert-level proficiency in Python and strong experience designing AWS-based data architectures are required.
Key Responsibilities
Legacy Data Discovery & Data Model Transformation
- Participate in structured system inventory efforts to document:
  - Legacy file-based storage structures
  - SAS dataset dependencies
  - Subsystem data flows
  - Manual gating and handoff processes
- Analyze legacy storage models and design target-state data models aligned to an AWS cloud-native architecture.
- Replace file-driven batch dependencies (see the sketch after this list) with:
  - API-based ingestion
  - Event-driven workflows
  - Database-backed storage (e.g., Aurora/Postgres)
- Define canonical data schemas and transformation standards.
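For illustration only, a minimal sketch of the event-driven ingestion pattern referenced above, assuming an S3 object-created trigger, a Lambda handler, and an Aurora/Postgres target; the bucket, table, column names, and DATABASE_URL variable are hypothetical, not project specifics:

```python
# Hypothetical sketch: S3 "ObjectCreated" event -> Lambda -> Postgres.
# Bucket, schema, table, and column names are illustrative only.
import csv
import io
import os

import boto3
import psycopg2

s3 = boto3.client("s3")

def handler(event, context):
    """Ingest each new S3 object on arrival, replacing a manual file handoff."""
    conn = psycopg2.connect(os.environ["DATABASE_URL"])  # e.g., Aurora Postgres
    try:
        with conn, conn.cursor() as cur:
            for record in event["Records"]:
                bucket = record["s3"]["bucket"]["name"]
                key = record["s3"]["object"]["key"]
                body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
                for row in csv.DictReader(io.StringIO(body.decode("utf-8"))):
                    cur.execute(
                        "INSERT INTO staging.claims (claim_id, amount) VALUES (%s, %s)",
                        (row["claim_id"], row["amount"]),
                    )
    finally:
        conn.close()
```

The point of the pattern is that data arrival, not a manual gate or a nightly batch window, drives the workflow; the same trigger could equally publish to EventBridge or SQS for fan-out.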
Cloud-Native Data Architecture Design
- Architect scalable AWS data pipelines using services such as S3, Glue, Lambda, EventBridge, SNS/SQS, Aurora/Postgres, AWS Batch, and Athena.
- Design data ingestion, staging, transformation, and validation workflows.
- Establish schema management, versioning, and data lineage practices (see the schema sketch after this list).
- Optimize data storage for performance, scalability, and cost efficiency.
- Support serverless and containerized data processing architectures.
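Purely as illustration, one lightweight way to make schema versioning explicit is to pin a version to the canonical model itself; the Claim model, its fields, and the version string below are hypothetical:

```python
# Hypothetical sketch: a versioned canonical schema (pydantic), so consumers
# depend on an explicit schema_version instead of inferring layout from files.
from pydantic import BaseModel, Field

SCHEMA_VERSION = "2.1.0"  # bumped deliberately on any breaking change

class Claim(BaseModel):
    """Canonical record replacing a legacy SAS flat-file layout (illustrative)."""
    claim_id: str
    amount: float = Field(ge=0)           # basic constraint enforced at ingest
    source_system: str                    # simple lineage: where the record originated
    schema_version: str = SCHEMA_VERSION
```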
Expert Python-Based Data Engineering
- Develop advanced Python-based data transformation and validation pipelines.
- Implement modular, reusable data processing components.
- Optimize large-scale data manipulation for distributed execution.
- Develop high-performance ETL/ELT frameworks.
- Embed automated validation checks directly into data pipelines (see the sketch after this list).
Expert-level Python proficiency is required, particularly for:
- High-volume data processing
- Data validation logic
- Modular data engineering frameworks
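As a hypothetical illustration of modular, embedded validation (the Rule helper, rule names, and column names are invented for this sketch):

```python
# Hypothetical sketch: small, reusable validation rules embedded in a pipeline
# step; the step fails fast and reports per-rule failure counts.
from dataclasses import dataclass
from typing import Callable, Iterable

import pandas as pd

@dataclass(frozen=True)
class Rule:
    name: str
    check: Callable[[pd.DataFrame], pd.Series]  # Boolean Series: True = row passes

def validate(df: pd.DataFrame, rules: Iterable[Rule]) -> pd.DataFrame:
    """Apply every rule; raise with per-rule failure counts if any rows fail."""
    failures = {r.name: int((~r.check(df)).sum()) for r in rules}
    failures = {name: n for name, n in failures.items() if n > 0}
    if failures:
        raise ValueError(f"Validation failed: {failures}")
    return df  # returning df lets validate() sit inline in a pipeline chain

# Illustrative rules:
RULES = [
    Rule("claim_id is present", lambda df: df["claim_id"].notna()),
    Rule("amount is non-negative", lambda df: df["amount"] >= 0),
]
```

Because each rule is a plain callable, the same rules can run unchanged in a Glue job, a Lambda function, or a CI test.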
Data Accuracy, Validation & Visibility
- Design and implement automated data validation frameworks to ensure:
  - Functional equivalence during migration
  - Record-level and aggregate-level consistency
  - Downstream compatibility across subsystems
- Develop dashboards and reporting mechanisms providing:
  - Data accuracy metrics
  - Pipeline health indicators
  - Variance detection summaries
- Enable transparency into data transformation impacts across modernization phases.
- Support regression validation through golden datasets and automated comparisons.
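A minimal, hypothetical sketch of such a golden-dataset comparison (the key and column names are invented), running cheap aggregate screens before the full record-level diff:

```python
# Hypothetical sketch: regression-validate a migrated pipeline's output
# against a frozen "golden" legacy extract.
import pandas as pd

def assert_equivalent(golden: pd.DataFrame, candidate: pd.DataFrame,
                      key: str = "claim_id") -> None:
    # Aggregate-level screens first: row counts and totals must match.
    assert len(golden) == len(candidate), "row-count variance"
    assert abs(golden["amount"].sum() - candidate["amount"].sum()) < 1e-6, \
        "aggregate variance detected"

    # Record-level: identical keyed rows and values, column order ignored.
    g = golden.sort_values(key).reset_index(drop=True)
    c = candidate.sort_values(key).reset_index(drop=True)
    pd.testing.assert_frame_equal(g, c, check_like=True)
```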
System-of-Systems Data Coordination
- Coordinate with Senior Developers and Requirements Engineers to align data models with application modernization.
- Ensure upstream/downstream data contract stability.
- Prevent data thrashing during phased migration.
- Support orchestration of gated workflows through automated triggers rather than manual file exchanges.
- Collaborate across workstreams to establish shared data standards.
DevSecOps & Governance Alignment
- Integrate data pipelines into CI/CD frameworks.
- Support alignment with infrastructure-as-code practices (collaborating on Terraform/CloudFormation).
- Ensure compliance with security controls (IAM, encryption, key management).
- Produce documentation supporting:
  - Architecture review boards
  - Interface control documents
  - Data flow diagrams
- Produce data validation evidence supporting Authority to Operate (ATO) activities.
Requirements
Required Qualifications
- 8+ years of experience in data engineering or data architecture.
- Expert-level proficiency in Python for data engineering.
- Demonstrated experience transforming legacy file-based systems into cloud-native data architectures.
- Experience developing data models for high-volume, data-intensive applications.
- Deep experience with AWS data services (Glue, Lambda, S3, Aurora/Postgres, EventBridge, etc.).
- Experience designing scalable ETL/ELT pipelines.
- Experience building analytical dashboards (e.g., QuickSight or equivalent).
- Experience implementing automated data validation and quality controls.
- Experience working in Agile Scrum teams.
- U.S. citizenship required.
Preferred Qualifications
- Experience modernizing SAS-based data environments.
- Experience supporting system-of-systems integration programs.
- Experience implementing data lineage and metadata management.
- Experience operating in regulated or federal environments.
Key Competencies
- Systems-level thinking across data ecosystems
- Strong schema design and normalization expertise
- Data accuracy and integrity focus
- Automation-first mindset
- Cross-workstream coordination capability
Benefits
- 401(k) with matching, 100% vested
- Health insurance (3 plans to choose from)
- Dental insurance
- Vision insurance
- Health savings account
- Life insurance
- Short-term disability
- Long-term disability
- AD&D
- Paid time off
- Professional development assistance
- Training
- Tuition reimbursement
- Flexible schedule
- Flexible spending account
- Referral program
- Paid legal plan
- And more...
Ignite IT is an Equal Employment Opportunity/Affirmative Action Employer. We evaluate qualified applicants without regard to race, color, religion, sex, national origin, disability, Veteran status, sexual orientation, or any other protected characteristic. In accordance with the EO 13665 Final Rule, Ignite IT will not discharge or in any other manner discriminate against employees or applicants because they have inquired about, discussed, or disclosed their own pay or the pay of another employee or applicant.
Applicants selected may be required to possess and maintain a government security clearance. U.S. citizenship is required.