Company Overview
We are building Protege to solve the biggest unmet need in AI: getting access to the right training data. The process today is time-intensive, incredibly expensive, and often ends in failure. The Protege platform facilitates the secure, efficient, and privacy-centric exchange of AI training data.
Solving AI’s data problem is a generational opportunity. We’re backed by world-class investors and already powering partnerships with some of the most ambitious teams in AI. The company that succeeds will be one of the largest in AI — and in tech.
We’re a lean, fast-moving, high-trust team of builders who are obsessed with velocity and impact. Our culture is built for people who thrive on ambiguity, own outcomes, and want to shape the future of data and AI.
Role Overview
You will own how Protege understands, represents, and surfaces its data supply—both on-platform and off-platform. You’ll build a unified system of truth for what data exists, what’s accessible but not ingested, what partners can provide, and how supply is discovered and deployed across deals. Your work replaces institutional knowledge and spreadsheets with durable product primitives and sets the foundation for eventual self-service.
What You Will Own
Define the data model across title, asset, and partner levels.
Establish a clear state model (e.g., ingested, accessible, in pipeline, linked/enriched).
Own search and discovery across modalities; set metadata standards with Data Lab and Engineering.
Ensure catalog state is auditable and resilient to refreshes, moves, and deletions.
Create structured visibility into partner supply not yet ingested (potential, cadence, modality coverage, volume estimates).
Enable GTM to scope deals based on available and accessible supply, not only what’s already in-platform.
Provide visibility into partner inclusion in deals, utilization trends, and inventory footprint.
Reduce partner back-and-forth caused by unclear system truth.
Work with Partnerships to ensure relationships are supported by scalable system representations.
Cross-Functional Alignment
Work closely with:
Privacy, Rights & Trust: to represent data eligibility and constraints
Data Access & Delivery: to ensure discoverable supply is deliverable
Solutions Architecture: to identify catalog gaps surfaced by deals
Engineering: to align with the team that owns ingestion execution and infrastructure
Who You Are
PM experience in data platforms, marketplaces, catalogs/search, or supply-side systems
Strong information architecture instincts and durable abstraction design
Comfortable driving cross-functional alignment across Product, Engineering, Partnerships, and Data teams
What Success Looks Like
Teams can answer “what do we have / what can we deliver?” without spreadsheets or memory.
Faster deal scoping and fewer partner inclusion disputes.
A catalog that becomes the backbone for scalable onboarding and eventual self-service.