AI Agents in Ads need a "Common Language"
In my earlier piece, “Embeddings: Next Frontier”, I focused on why embeddings are the right primitive for advertising: they compress meaning, travel fast, and let systems match intent without shipping raw data.
What I didn’t cover is the next constraint that shows up the moment embeddings leave a single platform: interoperability. If a buyer’s agent speaks one embedding language and a seller’s agent speaks another, you don’t just lose signal - you lose the ability to transact. This is where vector space alignment becomes practical infrastructure: projectors/adapters, capability discovery, and negotiated fallbacks that let agents translate meaning across heterogeneous models.
IAB - Agentic Real Time Framework (ARTF)
The Vector Alignment Handshake
A lot of people hear “AI agents” and think chatbots. In advertising, agents are simpler - and more powerful.
How do we describe an audience in a way both sides understand?
Vector Alignment sequence diagram
That’s what the attached diagram shows. It’s a clean 3-phase flow.
The cast
- Buyer Agent: works for the advertiser. It wants the right people at the right price.
- Seller Agent: works for a publisher or seller. It decides what it can offer.
- Agent Registry: a directory for trust and claims (who supports what). Think “verified profiles.”
This approach is being standardized by IAB Tech Lab.
Phase 1: Capability Discovery
Before anyone buys or sells, they do a quick “handshake.”
The Buyer asks:
“I speak ucp-v1 and bert-base. Do you?”
In plain terms:
- UCP is the format for sharing audience signals.
- The model (bert-base here; Model-X and Model-Y in the examples below) is the “math language” used to describe audiences.
The Seller checks what it supports.
If there’s a match
Great. Both sides agree on:
- the same UCP version
- the same vector format (example: 512-dim float32)
Now they can move forward.
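Here is a minimal sketch of that handshake in Python. The class and function names (`Capabilities`, `VectorFormat`, `discover`) are hypothetical and not taken from the UCP spec; the only values borrowed from the text are `ucp-v1`, `bert-base`, and the 512-dim float32 format.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class VectorFormat:
    dim: int     # e.g. 512
    dtype: str   # e.g. "float32"


@dataclass(frozen=True)
class Capabilities:
    protocols: FrozenSet[str]   # e.g. {"ucp-v1"}
    models: FrozenSet[str]      # e.g. {"bert-base"}
    vector_format: VectorFormat


def discover(buyer: Capabilities, seller: Capabilities) -> Optional[dict]:
    """Return the agreed terms, or None when there is no direct match."""
    shared_protocols = buyer.protocols & seller.protocols
    shared_models = buyer.models & seller.models
    if not shared_protocols or not shared_models:
        return None
    if buyer.vector_format != seller.vector_format:
        return None
    # Alphabetical pick keeps the result deterministic; a real agent
    # would presumably rank options by preference instead.
    return {
        "protocol": sorted(shared_protocols)[0],
        "model": sorted(shared_models)[0],
        "vector_format": buyer.vector_format,
    }


fmt = VectorFormat(dim=512, dtype="float32")
buyer = Capabilities(frozenset({"ucp-v1"}), frozenset({"bert-base"}), fmt)
seller = Capabilities(frozenset({"ucp-v1"}), frozenset({"bert-base", "model-y"}), fmt)
agreed = discover(buyer, seller)
```

With these inputs both sides land on `ucp-v1` and `bert-base`. If the model sets did not overlap, `discover` would return `None` and the conversation would move on to the projector question.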
Phase 2: Signal Exchange (Embeddings)
This is where things change vs. old ad tech.
Instead of sending a simple label like:
“Segment = Sports Fans”
…the Buyer and Seller exchange an embedding.
An embedding is like a “meaning fingerprint.” It’s a compact list of numbers that represents a pattern in behavior.
The Seller then does one key step:
Calculate similarity.
That means it scores how close the Buyer’s request is to the Seller’s audience. If it’s close enough, the Seller returns:
- a match score
- and/or eligibility to buy
So the system can make smarter matches without shipping raw personal data around.
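To make the similarity step concrete, here is a small sketch using cosine similarity, the measure referenced later in this article. The 0.8 threshold, the 4-dimensional toy vectors, and the `evaluate_request` name are assumptions for illustration; real vectors would be 512-dimensional and the cutoff would be set by the seller.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors of equal dimension."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def evaluate_request(buyer_vec, seller_vec, threshold=0.8):
    """Score the buyer's request against the seller's audience vector."""
    score = cosine_similarity(buyer_vec, seller_vec)
    return {"match_score": score, "eligible": score >= threshold}


# Toy vectors only; the threshold is a hypothetical seller setting.
buyer_vec = [0.9, 0.1, 0.0, 0.4]
seller_vec = [0.8, 0.2, 0.1, 0.5]
result = evaluate_request(buyer_vec, seller_vec)
```

The seller returns only the score and the eligibility flag, which is the point: the vectors are compared, but no user-level records leave either side.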
When the models don’t match: the “Projector” question
Sometimes the Seller says:
“I only support Model-Y. Do you have a projector?”
A projector is just a translator. It maps Model-X vectors into Model-Y space.
If the Buyer has it, the Seller can:
- apply the projector
- validate the output
- then proceed using embeddings
This is a big deal, because it avoids losing meaning.
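One simple way to picture a projector is as a matrix that maps a Model-X vector into Model-Y space. That is an assumption for illustration: the article does not say projectors must be linear, and the validation checks below (dimension, finite values, non-zero output) are my guess at a sensible minimum, not a list from the spec.

```python
import math
from typing import List, Sequence


def apply_projector(matrix: Sequence[Sequence[float]],
                    vec: Sequence[float]) -> List[float]:
    """Multiply a (target_dim x source_dim) matrix by a source-space vector."""
    if any(len(row) != len(vec) for row in matrix):
        raise ValueError("projector width must equal the source dimension")
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def validate_output(vec: Sequence[float], expected_dim: int) -> bool:
    """Basic sanity checks before the projected vector is used."""
    if len(vec) != expected_dim:
        return False
    if not all(math.isfinite(x) for x in vec):
        return False
    return any(x != 0.0 for x in vec)


# Toy projector: 3-dim "Model-X" space into 2-dim "Model-Y" space.
projector = [
    [1.0, 0.0, 0.5],
    [0.0, 1.0, 0.5],
]
model_x_vec = [0.2, 0.4, 0.6]
model_y_vec = apply_projector(projector, model_x_vec)
is_valid = validate_output(model_y_vec, expected_dim=2)
```

In practice the matrix would be learned from paired examples in both spaces, and the seller would likely add a quality check on top of these structural ones.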
Phase 3: Legacy fallback (AdCOM Segment ID)
If the Buyer does not have a projector, the Seller falls back to the old world:
“OK. Fallback to AdCOM Segment ID 12345.”
This is the key trade-off:
- Embeddings keep meaning rich and flexible.
- Segment IDs are easy, but coarse. They drop detail.
Fallback keeps the system running. But it’s not the future.
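Put together, the three phases reduce to a short decision: use shared embeddings if possible, use a projector if the buyer has one, otherwise fall back to a segment ID. The sketch below is one way to write that down; the function name, the mode labels, and the `(source, target)` projector keys are hypothetical.

```python
from typing import Dict, Set, Tuple


def choose_exchange_mode(
    buyer_models: Set[str],
    seller_models: Set[str],
    buyer_projectors: Set[Tuple[str, str]],
    fallback_segment_id: str,
) -> Dict[str, str]:
    """Pick how the two agents will describe the audience."""
    # Case 1: both sides already share a model.
    shared = buyer_models & seller_models
    if shared:
        return {"mode": "embeddings", "model": sorted(shared)[0]}
    # Case 2: the buyer holds a projector into a model the seller supports.
    for source in sorted(buyer_models):
        for target in sorted(seller_models):
            if (source, target) in buyer_projectors:
                return {"mode": "projected_embeddings",
                        "source": source, "target": target}
    # Case 3: legacy fallback to a segment ID.
    return {"mode": "segment_fallback", "segment_id": fallback_segment_id}


direct = choose_exchange_mode({"model-x"}, {"model-x"}, set(), "12345")
projected = choose_exchange_mode(
    {"model-x"}, {"model-y"}, {("model-x", "model-y")}, "12345")
fallback = choose_exchange_mode({"model-x"}, {"model-y"}, set(), "12345")
```

The ordering is the design choice here: the richest option is tried first, and the coarse one is only used when nothing else works.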
Why this matters (even if you’re not technical)
This handshake flow is how we get to a better ad ecosystem:
- More accuracy: match based on meaning, not just labels
- More portability: agents can work across platforms
- More privacy control: fewer raw signals moving around
- More automation: agents can negotiate and execute, fast
Tip: If you had to pick one priority for the next 12 months: standardize the “language” (UCP), or standardize the “translator” (projectors)?
2/16/2026 - UPDATE: I got a great question from Brian O’Kelley that I think is worth sharing publicly:
“Aligning on the underlying embedding model is important but what about the structure of what’s being embedded? Doesn’t that need to be the same too?”
Short answer: Yes.
The concern is real. Two agents can use the exact same model (e.g. “iab-techlab/ucp-distilled-6L-512d”, or a “HuggingFace-style” model family) and produce vectors in the same space. But if one embeds browsing behavior and the other embeds purchase history, cosine similarity between those vectors is mathematically valid but semantically meaningless.
You get a number. It just doesn’t mean anything.
This is the difference between geometric compatibility and semantic compatibility. The Vector Alignment Contract (VAC) in UCP handles geometric alignment through model_id and space_id matching. But semantic alignment, i.e. ensuring “high-intent auto shopper” means the same thing on both sides, requires something deeper. That’s what the Golden Test Set (GTS) does.
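A tiny sketch shows why the geometric check alone is not enough. The `model_id` and `space_id` fields come from the VAC description above; the `SpaceDescriptor` class, the `embedded_signal` field, and the `space_id` value are hypothetical, added only to make the gap visible.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpaceDescriptor:
    model_id: str
    space_id: str
    embedded_signal: str  # hypothetical: what the vector was built from


def geometrically_compatible(a: SpaceDescriptor, b: SpaceDescriptor) -> bool:
    """Geometric alignment only: same model and same vector space."""
    return a.model_id == b.model_id and a.space_id == b.space_id


buyer = SpaceDescriptor(
    "iab-techlab/ucp-distilled-6L-512d", "example-space-1", "browsing_behavior")
seller = SpaceDescriptor(
    "iab-techlab/ucp-distilled-6L-512d", "example-space-1", "purchase_history")

same_space = geometrically_compatible(buyer, seller)
same_structure = buyer.embedded_signal == seller.embedded_signal
```

Here `same_space` is true and `same_structure` is false: the vectors can be compared, but the comparison says nothing useful.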
More to come in the whitepaper “Agentic Audiences Powered by the Universal Content Protocol (UCP) - A Technical Specification for Embeddings-Based Agent-to-Agent Advertising”.
UPDATE: Q&A with Bosko Milekic
“Why can’t a buyer just ask a seller for what they want? Why vectorize?”
Because what you actually want can’t be said in words.
A buyer can ask for “auto intenders.” They can’t ask for “people whose browsing pattern over the last 72 hours resembles someone approaching a purchase decision, with high cognitive engagement, low price sensitivity, in contexts that pattern-match to past converters.” That’s a 512-dimensional intent state. It has no natural language equivalent. And even if you could describe it, you couldn’t evaluate it against 10 million candidates in 3 milliseconds with a text query.
Vectors don’t replace the ask - they express what the ask can’t.
“This sounds like a data problem, not a vector problem. No single provider has all those signals.”
Exactly. And that’s why you need vectors.
No single provider knows browsing intensity AND TV exposure AND contextual relevance. But three providers do - and vectors are how you compose their partial knowledge into one intent picture.
Practical example: A car brand wants to reach people within 30 days of buying an SUV.
The DSP agent receives one vector from each of the three providers (browsing, TV exposure, context) and computes a composite similarity score against a learned ideal-buyer embedding - in under 3 milliseconds. The bid comes back at $14 CPM, not the $6 that a flat “Auto Intenders” segment would have justified.
No one shared raw data. The vectors composed. The bid reflected multi-signal understanding, not a label.
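As a rough sketch of what “the vectors composed” could look like: normalize each provider's vector, take a weighted average, and compare the result to the ideal-buyer embedding. The averaging rule, the equal weights, and the 3-dimensional toy vectors are all assumptions; the article does not specify how the composite score is computed, and the sketch only makes sense if all three providers encode into the same output space.

```python
import math
from typing import List, Sequence


def normalize(v: Sequence[float]) -> List[float]:
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def compose(vectors: Sequence[Sequence[float]],
            weights: Sequence[float]) -> List[float]:
    """Weighted average of unit-length provider vectors."""
    if len(vectors) != len(weights) or not vectors:
        raise ValueError("need one weight per vector")
    units = [normalize(v) for v in vectors]
    dim = len(units[0])
    if any(len(u) != dim for u in units):
        raise ValueError("all vectors must share one dimension")
    total = sum(weights)
    return [sum(w * u[i] for w, u in zip(weights, units)) / total
            for i in range(dim)]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    ua, ub = normalize(a), normalize(b)
    return sum(x * y for x, y in zip(ua, ub))


# Toy vectors standing in for the three providers' signals.
browsing = [0.9, 0.1, 0.0]
tv_exposure = [0.7, 0.3, 0.0]
context = [0.8, 0.0, 0.2]
ideal_buyer = [1.0, 0.0, 0.0]

composite = compose([browsing, tv_exposure, context], [1.0, 1.0, 1.0])
score = cosine(composite, ideal_buyer)
```

How that score turns into a bid price is a separate question, and one this sketch deliberately leaves out.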
“But don’t the entities contributing vectors need to have the same data inputs for distance calculations to work?”
No - and this is the architectural insight that makes the whole system work. They don’t need the same inputs. They need the same output space.
A browsing signal provider and a TV exposure provider have zero overlapping raw data. But if both encode “high SUV purchase intent” to the same region of vector space, the dot product is meaningful. The browsing provider got there through configurator visits. The TV provider got there through ad exposure patterns. Different evidence, same meaning. Kasper Skou
This is what the Golden Test Set (GTS) validates. Before any two agents exchange embeddings, they independently encode 1,000+ concept pairs and prove their vectors land in the same neighborhoods - cosine similarity ≥ 0.90 on identity pairs, 95% of all pairs within tolerance. If “auto enthusiast” means something different in your space than mine, the GTS catches it before a single real impression is served.
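Here is a simplified sketch of that check. It covers only one reading of the rule: both agents encode the same concepts, each concept's two vectors must reach cosine similarity of at least 0.90, and at least 95% of concepts must pass. The function names and toy vectors are hypothetical, and the real GTS procedure may be defined differently.

```python
import math
from typing import Dict, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def passes_gts(vectors_a: Dict[str, Sequence[float]],
               vectors_b: Dict[str, Sequence[float]],
               min_cosine: float = 0.90,
               min_pass_rate: float = 0.95) -> bool:
    """Each dict maps a concept to that agent's embedding of it."""
    shared = sorted(set(vectors_a) & set(vectors_b))
    if not shared:
        return False
    passed = sum(
        1 for concept in shared
        if cosine(vectors_a[concept], vectors_b[concept]) >= min_cosine
    )
    return passed / len(shared) >= min_pass_rate


# Two toy concepts; a real test set would hold 1,000+.
agent_a = {"auto enthusiast": [1.0, 0.0], "suv intender": [0.0, 1.0]}
aligned_b = {"auto enthusiast": [0.99, 0.05], "suv intender": [0.02, 1.0]}
misaligned_b = {"auto enthusiast": [0.0, 1.0], "suv intender": [0.02, 1.0]}

ok = passes_gts(agent_a, aligned_b)
not_ok = passes_gts(agent_a, misaligned_b)
```

In the misaligned case, “auto enthusiast” lands in a different neighborhood on the two sides, so the pair fails before any real traffic is exchanged.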
**Same meaning. Different evidence. One math operation. That’s the shift from buying segments to composing understanding.** Daniel Landsman
More great questions from the field on Agentic Audiences (UCP). Sharing two that keep coming up:
“How do embeddings from different providers actually get into the ad request?”
**The same way signals flow today - just in a denser format.** Anthony Katsur
A publisher’s SSP already enriches bid requests with contextual signals. Data providers already append audience data via cookie sync or UID match. Measurement companies already contribute exposure signals through integrations. Today those arrive as segment IDs and key-value pairs. In UCP, they arrive as vectors. Same pipes, denser payload. The bid request carries a contextual embedding from the publisher, an identity embedding from the data provider, and a reinforcement embedding from the measurement company. Three vectors, each 2KB, riding the same infrastructure that currently carries segment lists.
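The 2KB figure follows directly from the 512-dim float32 format mentioned earlier: 512 values at 4 bytes each. A quick check with Python's `struct` module, assuming a raw little-endian encoding with no headers or compression (the article does not specify the wire format):

```python
import struct

DIM = 512                      # dimensions per embedding
vector = [0.0] * DIM           # placeholder values

# "<" selects little-endian standard sizes, "f" is a 4-byte float32.
payload = struct.pack(f"<{DIM}f", *vector)

bytes_per_vector = len(payload)        # 2048 bytes, i.e. 2KB
bytes_for_three = 3 * bytes_per_vector # 6144 bytes for all three vectors
```

So the three embeddings add roughly 6KB to a bid request before any compression or quantization.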
“The gradient you’re describing already exists in the component data underneath the segment. A provider like Polk sends robust underlying data, the auto company blends it with their own purchase data, builds a bespoke segment or custom algo, and passes it to the DSP. It works. What’s the use case for vectors?”
Honest answer: for that workflow, I wouldn’t argue vectors are better. One data partner, one advertiser, one pre-built integration, human-orchestrated over days or weeks. That’s a well-oiled machine.
Where it breaks is the combinatorial wall.
Imagine an agent that needs to compose Polk’s intent signal with a publisher’s contextual signal, a measurement company’s competitive exposure signal, and the advertiser’s own first-party data - per impression, in 4 milliseconds, without any of them sharing raw data with each other. You can’t pre-build a custom segment for every possible 3-5 party combination at impression time. The Polk → auto company pipeline works because it’s bilateral and pre-negotiated. Now multiply that by every possible provider combination, for every impression, at 10M+ decisions per second.
That’s the combinatorial wall. The vector is the custom segment, computed on the fly.
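To put a number on the wall, count the possible 3-5 party combinations for a given pool of signal providers. The pool size of 50 is a made-up figure for illustration; the function name is hypothetical too.

```python
import math


def party_combinations(providers: int, min_size: int = 3, max_size: int = 5) -> int:
    """Number of distinct provider groups of size min_size..max_size."""
    return sum(math.comb(providers, k) for k in range(min_size, max_size + 1))


# With a hypothetical pool of 50 providers:
# C(50,3) + C(50,4) + C(50,5) = 19,600 + 230,300 + 2,118,760
total = party_combinations(50)   # 2,368,660
```

That is more than two million segments to pre-build and maintain for a modest provider pool, before accounting for different advertisers or campaigns.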
For a world where humans orchestrate bilateral data partnerships and pre-build segments, the current system works. UCP is infrastructure for a world where agents are doing that composition autonomously, at millisecond speed, across parties that don’t have pre-existing data relationships.
The question isn’t whether segments work today. It’s whether they scale to agentic.