Senior Data Pipeline Engineer
🌍 Location: Remote [USA]
💰 Salary: $200k–$240k + Bonus
About Clever
Clever is a venture-backed real estate technology company on a mission to revolutionize the way people buy, sell, and manage real estate. We are at the forefront of innovation, combining cutting-edge technology with deep industry expertise to create seamless, efficient, and transparent real estate experiences.
We've built the leading online education platform in real estate and earned a 4.9 Trustpilot rating with over 3,800 reviews, helping consumers save over $210 million in real estate fees. If you're excited to play a key role at a company that transforms how people navigate their real estate journey, we'd love to hear from you!
Role Overview
We’re hiring a Senior Data Pipeline Engineer to own and evolve the data infrastructure that powers Clever's core data products, from the pipelines that feed our agent benchmarking tools to the warehouse architecture behind our company's highest-priority initiatives.
You'll be the primary owner of Clever's data platform: AWS infrastructure, orchestration (Airflow/MWAA), Databricks lakehouse, and the ingestion pipelines that bring operational and third-party data into the warehouse. This is a high-autonomy role where you'll shape infrastructure strategy, not just maintain what exists.
This role reports to the Data Engineering Manager and serves as a critical partner across data engineering, product, and operations, helping Clever scale reliable data infrastructure that directly enables revenue-driving products and company priorities.
Key Responsibilities
- Own and operate Clever's data platform infrastructure. Manage AWS services (EC2, RDS, VPC, S3, MWAA), Terraform-managed infrastructure-as-code, and Databricks administration. You are the go-to person for keeping these systems running securely and cost-effectively.
- Maintain and improve data pipeline reliability. The Airflow/MWAA orchestration layer automates all ETL/ELT jobs feeding Databricks. You'll monitor, triage, and resolve pipeline failures — and proactively improve reliability so failures happen less often.
- Build and extend data ingestion pipelines. Design and implement ingestion for new operational data sources (e.g., telephony, CRM, transaction data) that directly support Clever's speed-to-match initiative (P1 company priority).
- Manage database infrastructure. Administer PostgreSQL/RDS instances, including replica promotion, security group configuration, and VPC peering. Ensure databases are performant, secure, and properly networked.
- Support security and compliance. Maintain infrastructure aligned with SOC-2 requirements, including VPN management (Pritunl), SSO configuration, and access controls. Respond to audit findings that require infrastructure changes.
- Collaborate with the data engineering team. Partner closely with data engineers and data analysts to ensure smooth handoffs between infrastructure and pipeline/transformation work. Provide technical mentorship on infrastructure best practices.
- Drive infrastructure strategy. Evaluate opportunities to reduce complexity (e.g., consolidating orchestration, optimizing cloud spend) and propose a forward-looking platform roadmap.
What Success Looks Like
In 3 months…
- You've onboarded to Clever's AWS environment, Terraform repos, Airflow/MWAA setup, and Databricks workspace, and can independently triage and resolve pipeline alerts.
- You've established a working relationship with the data engineering team and understand how each data product depends on the infrastructure you now own.
- You've completed a documented audit of the current infrastructure state, identifying any immediate risks or technical debt.
In 6 months…
- Pipeline reliability has measurably improved: fewer unresolved alerts, faster mean-time-to-resolution, and no extended outages caused by infrastructure gaps.
- You've delivered at least one new data ingestion pipeline for an operational data source supporting speed-to-match and/or algorithmic content.
- You've proposed and begun executing on an infrastructure roadmap that addresses cost optimization, security posture, and reduced single-points-of-failure.
In 12 months…
- Clever's data platform is more resilient, better documented, and less dependent on any single person than when you started.
- Operational data ingestion is running reliably for multiple new sources, enabling the data team to support Bench Score, Market Score, and speed-to-match with fresh, trustworthy data.
- You've made meaningful progress on at least one strategic infrastructure initiative (e.g., orchestration consolidation, Databricks optimization, or compliance automation).
Ideal Candidate
- You've spent 8+ years in data engineering or data platform roles, with at least 2–3 years focused on infrastructure ownership (not just writing transformations).
- You're comfortable being the primary infrastructure owner on a small team. You see this as an opportunity to shape strategy, not a burden.
- You’ve administered Databricks on AWS and have familiarity with the suite of Databricks tools, including Unity Catalog.
- You have deep, hands-on AWS experience — not just using services, but configuring VPCs, managing security groups, setting up peering, and debugging networking issues.
- You've managed Terraform in production and understand state management, module structure, and the discipline required to keep IaC reliable.
- You've administered Airflow (ideally MWAA) and have opinions about DAG design, monitoring, and failure handling.
- You're pragmatic about tooling: you'll maintain what works, improve what's fragile, and propose changes only when the ROI is clear.
- You communicate clearly with non-infrastructure stakeholders and can explain tradeoffs without jargon.
Qualifications
- 8+ years of experience in data engineering or data platform roles
- Databricks administration and lakehouse architecture
- Apache Airflow administration and DAG development (MWAA preferred)
- AWS (EC2, RDS, VPC, S3, MWAA): our AWS environment is central to all data operations and requires deep, hands-on expertise
- Terraform (infrastructure as code): managing multiple repos across RDS, CDC, and orchestration infrastructure
- Python (PySpark, general scripting)
- PostgreSQL / RDS management, including replica promotion, security group configuration, VPC peering
- VPN setup and management (Pritunl on EC2)
- SSO configuration
- Excellent communication skills, able to influence across teams and levels
- Experience with SOC-2 compliance infrastructure requirements is a plus
- Real estate, fintech, or marketplace industry experience a plus
What We Offer
💰 Salary: $200k–$240k + Bonus
🏥 Comprehensive Benefits: Medical, dental, vision, and life insurance
🌴 Paid Time Off: 18 days of PTO (increases with tenure) plus 10 paid holidays
📚 Professional Development: Annual budget for learning and career growth
🏝️ Tenure Sabbaticals: Paid sabbaticals to celebrate major milestones
🏡 Clever Product Benefit: Exclusive access to Clever homeownership perks
🖥️ Work-From-Home Stipend: Support for your remote workspace
💼 401(k): Retirement plan administered through Guideline
👶 Parental Leave: 6–12 weeks of paid parental leave
💙 Wellness Benefits: Free counseling sessions and optional weekly meditation
Equal Employment Opportunity Statement:
Clever Real Estate provides equal employment opportunities to all employees and applicants for employment and prohibits discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws.