GT was founded in 2019 by a former Apple, Nest, and Google executive. GT’s mission is to connect the world’s best talent with product careers offered by high-growth companies in the UK, USA, Canada, Germany, and the Netherlands.
On behalf of the client, GT is looking for a Data Engineer.
Note: This is a 6-month contract role, with a potential extension to 12 months.
Our client is a leading global management consultancy known for tackling some of the world’s most complex business challenges. With a focus on strategy, transformation, and performance improvement, the firm partners with major organizations across industries to drive lasting impact. Recognized consistently as a top workplace, it combines deep industry expertise with a collaborative, innovative culture. Its centralized European hub plays a key role in supporting operations across the EMEA region, ensuring excellence and efficiency at scale.
The Senior Data Engineer will design, build, and optimize the batch data pipelines and algorithms that power the client's matching engine. This is a hands-on role focused on scale, performance, and accuracy, working with Spark, SQL, and modern orchestration tools. This is not a traditional analytics role; it is deep data-platform and algorithmic engineering at massive scale.
Build Data Pipelines: Design and maintain robust, scalable ETL/ELT pipelines to ingest and process third-party and first-party datasets.
Data Quality & Enrichment: Apply transformation, normalization, and enrichment rules to ensure data consistency and usability.
Collaborate Across Teams: Work with product managers, data architects, and content experts from Coro and Helix to align data structure with business needs.
Operationalize Matching & Merging Logic: Support the implementation of data matching and entity resolution processes using AI/ML tools and proprietary frameworks.
Monitor & Troubleshoot Pipelines: Build alerts, logs, and metrics to ensure data flows remain healthy and issues are identified and resolved quickly.
Documentation & Standards: Contribute to documentation, code quality standards, and internal best practices to ensure maintainability.
Experience: 5+ years of experience in Big Data, Data Platform, or ETL engineering roles.
Tech Stack: Proficiency in SQL and Python, plus experience with Spark (PySpark or Scala), Airflow, Snowflake, and Azure Data Lake or similar technologies.
Cloud Platforms: Familiarity with Azure (preferred) or other major cloud platforms.
Proven experience designing and operating large-scale batch data pipelines.
Solid understanding of distributed systems and algorithms (partitioning, shuffles, joins, scalability trade-offs).
Curiosity & Ownership: Proactive, detail-oriented, and eager to take ownership of projects and continuously improve systems.
Team Player: Comfortable working in a cross-functional environment and open to learning from and supporting teammates.
Experience with, or strong interest in, fuzzy and semantic matching techniques.
Exposure to ML-assisted data pipelines.
Familiarity with search or retrieval systems (e.g., Elasticsearch, OpenSearch, vector databases).
Interview with GT Recruiter
Client intro call
Technical Interview
Final Interview
Join a fast-growing, high-impact team.
Contribute to an ambitious effort to create the highest quality, most comprehensive business directory in the world.
Be part of a startup-style group within the company that's redefining how the firm delivers consulting through productization and data innovation.
Work with cutting-edge data tools, including AI/ML enrichment, semantic matching, and modern cloud-based infrastructure.