Embeddings: The Next Frontier in Advertising?
Advertising has always been about connection: putting the right message in front of the right person at the right time. Traditionally, that connection has been enabled by identifiers: cookies, device IDs, or deterministic logins. As those signals erode under privacy regulation and platform changes, the industry faces a critical question: what replaces IDs as the connective tissue of digital media?
One answer gaining momentum is embeddings: dense numerical representations of content, users, or behaviors that make semantic similarity computable at scale. By applying mathematical techniques originally honed in natural language processing (NLP), advertisers can build privacy-friendly systems for lookalike audiences, contextual targeting, and creative optimization.
What Are Embeddings?
An embedding is a vector (a list of numbers) that encodes the essence of some object. That object might be a word, a web page, a video, a user profile, or even a household’s TV viewing history. The key property is that embeddings capture semantic relationships: things that are similar in meaning or context end up close together in vector space.
For example, imagine a simple embedding space of three dimensions:
- “Football” → [0.72, 0.10, 0.58]
- “Basketball” → [0.70, 0.12, 0.55]
- “Cooking” → [ 0.45, 0.73, 0.11]
Even without knowing the labels, we can see that football and basketball lie close together, while cooking is farther away. This closeness is not a coincidence: it emerges from the patterns in the underlying data.
The similarity between two embeddings is typically measured with cosine similarity, which evaluates the angle between two vectors. A cosine value of 1.0 means the vectors point in the same direction (highly similar), 0.0 means they are orthogonal (no similarity), and -1.0 means they point in opposite directions. The smaller the angle between two vectors, the closer they are semantically.
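To make the measure concrete, here is a minimal sketch of cosine similarity in plain Python, applied to the three toy vectors from the example above. The function name `cosine_similarity` is an illustrative choice, and the vectors are copied exactly as printed; a production system would more likely rely on a vector library.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy vectors copied from the three-dimensional example in the text.
football = [0.72, 0.10, 0.58]
basketball = [0.70, 0.12, 0.55]
cooking = [0.45, 0.73, 0.11]

# Football vs. basketball should score much higher than football vs. cooking.
print(round(cosine_similarity(football, basketball), 3))
print(round(cosine_similarity(football, cooking), 3))
```

Because cosine similarity divides by both vector lengths, it compares direction only, so a heavy viewer and a light viewer with the same mix of interests would score as similar.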
Why Embeddings Matter in Advertising
In advertising, we deal with a vast array of heterogeneous signals: pages visited, shows streamed, ads viewed, demographics, interests, and more. Each of these can be transformed into an embedding, enabling us to measure semantic similarity between audiences, contexts, and creative.
Three use cases illustrate the power of this approach:
- Lookalike Audiences (LALs) Without IDs: Traditionally, lookalike modeling depends on user IDs: take a seed group of converters, model their attributes, and find other IDs with similar attributes. Embeddings enable the same outcome without identifiers. By vectorizing a seed audience’s behaviors (say, households that watched Formula 1 races and engaged with luxury car ads), we can compute a seed centroid and train a classifier in embedding space. Then any new household whose vector is close to that seed centroid is a potential lookalike. Cosine similarity becomes the arbiter of closeness: if Household A has cosine 0.95 with the seed centroid, it’s more “alike” than Household B with cosine 0.62.
- Contextual Targeting 3.0: Contextual targeting has historically been keyword-based: match an automotive ad to pages tagged “cars.” Embeddings make it possible to measure deeper relationships. A car ad could be matched not only to “cars” pages but also to semantically adjacent contexts like “Formula 1” or “EV charging.” By embedding both creative assets (e.g., an image of a new sedan) and page text, advertisers can dynamically choose the best creative for the impression, maximizing relevance.
- Cross-Media Similarity: Embeddings are modality-agnostic. A household’s CTV viewing profile, a mobile app usage pattern, and a website browsing history can all be embedded into the same space. This creates a lingua franca for measurement and targeting, breaking down silos between channels. For example, Samba TV ACR data can be vectorized to capture linear, streaming, and ad exposures. A household that watches CNBC and Lexus ads ends up near other high-value “luxury intender” households, even if no IDs are shared.
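The contextual use case can be sketched in a few lines: embed the page, embed each candidate creative, and serve the creative with the highest cosine similarity. Everything below is hypothetical, including the creative names, the three-dimensional vectors, and the helper `best_creative`; in practice the vectors would come from a text or image encoder that maps both modalities into one shared space.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Hypothetical creative embeddings (made-up values for illustration only).
creatives = {
    "sedan_hero_image": [0.80, 0.15, 0.40],
    "ev_charging_video": [0.35, 0.10, 0.90],
    "family_suv_banner": [0.30, 0.85, 0.20],
}

# Hypothetical embedding of a page about EV charging.
page = [0.40, 0.12, 0.85]

def best_creative(page_vec, creative_vecs):
    """Return (name, score) for the creative closest to the page."""
    scored = ((name, cosine_similarity(page_vec, vec)) for name, vec in creative_vecs.items())
    return max(scored, key=lambda pair: pair[1])

name, score = best_creative(page, creatives)
print(name, round(score, 3))
```

The same ranking loop works in the other direction (one creative, many candidate pages), which is how a keyword-free contextual match could be scored.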
Under The Hood: Embeddings + Classifier + Scorer
The methodology is straightforward:
1. **Embed Users and Contexts**
   - Users/households: vectors from their behavior (pages, shows, ads).
   - Pages/sites/ads: vectors from text, metadata, and co-viewership patterns.
2. **Train a Classifier:** Define a seed audience (positives) and compare it to a random background (negatives). Train a classifier (logistic regression or a shallow neural net) to separate them in embedding space.
3. **Score New Users/Contexts:** Deploy the classifier as a scorer. For any new user embedding, output a probability of belonging to the target audience. In real-time bidding, this score can be computed in milliseconds.
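The three steps above can be sketched end to end with a small logistic regression written in plain Python. This is an illustration under stated assumptions, not a production recipe: the embeddings are synthetic (a tight cluster of hypothetical seed households and a uniformly random background), and the names `train_logistic` and `predict` are invented for this example.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(x, w, b):
    """Probability that embedding x belongs to the seed audience."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

def train_logistic(X, y, lr=0.5, epochs=500):
    """Fit logistic regression by batch gradient descent on the log loss."""
    n, dim = len(X), len(X[0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for xi, yi in zip(X, y):
            err = predict(xi, w, b) - yi  # gradient of the log loss w.r.t. the logit
            for j in range(dim):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

# Step 1 (assumed done upstream): households already have 4-d embeddings.
rng = random.Random(0)
seed_center = [0.70, 0.20, 0.55, 0.25]  # hypothetical seed-audience center
positives = [[c + rng.gauss(0, 0.05) for c in seed_center] for _ in range(100)]
negatives = [[rng.random() for _ in range(4)] for _ in range(100)]  # random background

# Step 2: train the classifier on seed (1) vs. background (0).
X = positives + negatives
y = [1] * len(positives) + [0] * len(negatives)
w, b = train_logistic(X, y)

# Step 3: score new, unseen household embeddings.
near_seed = [0.70, 0.18, 0.57, 0.27]
far_from_seed = [0.32, 0.81, 0.14, 0.48]
print(round(predict(near_seed, w, b), 3), round(predict(far_from_seed, w, b), 3))
```

Scoring is a single dot product plus a sigmoid, which is why this kind of model fits comfortably inside a real-time bidding latency budget.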
Example: Luxury Auto Intenders
Suppose we define a seed audience of households exposed to Lexus and Mercedes ads during live sports broadcasts. Their aggregated embedding centroid is:
Seed centroid = [0.71, 0.20, 0.55, 0.25]
Now consider two new households:
- HH_A = [0.70, 0.18, 0.57, 0.27]
- HH_B = [ 0.32, 0.81, 0.14, 0.48]
Cosine similarities:
- cos(Seed, HH_A) ≈ 0.999 → very close (ideal lookalike).
- cos(Seed, HH_B) ≈ 0.61 → much less similar (exclude under a strict threshold).
HH_A should be targeted in PMP deals or pre-bid DSP scoring. HH_B should not.
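For readers who want to reproduce the decision, the sketch below scores both households against the seed centroid using the vectors exactly as printed above, then applies a cut-off. The `THRESHOLD` value of 0.9 is a hypothetical choice for illustration; a real campaign would tune it against reach and performance goals.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Vectors copied from the example in the text.
seed_centroid = [0.71, 0.20, 0.55, 0.25]
households = {
    "HH_A": [0.70, 0.18, 0.57, 0.27],
    "HH_B": [0.32, 0.81, 0.14, 0.48],
}

THRESHOLD = 0.9  # hypothetical lookalike cut-off, not from the article

scores = {hh: cosine_similarity(seed_centroid, vec) for hh, vec in households.items()}
targets = [hh for hh, s in scores.items() if s >= THRESHOLD]

for hh, s in scores.items():
    decision = "target" if s >= THRESHOLD else "exclude"
    print(f"{hh}: {s:.3f} -> {decision}")
```

Lowering the threshold widens the audience at the cost of similarity, which is the same reach-versus-precision trade-off that ID-based lookalike models expose.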
Why This Approach Works
- Privacy-safe: No cookies or device IDs. Embeddings abstract behavior into vectors.
- Cross-platform: The same math applies to web, mobile, and CTV.
- Semantically rich: Goes beyond keywords or categories, capturing latent relationships.
- Deployable: Embeddings can be indexed in vector databases and scored in real time.
Closing Thought
As identifiers fade, advertising’s future hinges on semantic similarity. Embeddings transform behavioral and contextual signals into structured numerical space, where cosine similarity can drive targeting, lookalikes, and creative optimization.
But the next leap forward is the fusion of embeddings with Large Language Models (LLMs). While embeddings encode the “where” and “how close” of relationships, LLMs can provide the “why” and “what it means.” For example, an embedding cluster that machines see as vectors might be described by an LLM as: “households drawn to live sports, premium financial news, and luxury automotive brands.”
This synergy bridges raw math with marketing intuition. Embeddings supply the precision, scalability, and privacy-safety needed in the post-ID era. LLMs add interpretability, storytelling, and strategy, turning patterns into narratives that marketers can act on.
The advertising industry’s opportunity is to combine these forces into semantically intelligent, privacy-safe, and cross-platform campaigns. Those who build embedding pipelines today and layer LLM-driven insights on top will not merely replace lost IDs; they will create a new standard of relevance, transparency, and effectiveness for the decade ahead.
**… and finally, industry feedback:** a natural question arises, confirmed by both Brian O’Kelley and Ari Paparo:
“Isn’t the main challenge portability such that the same embeddings can be used across many different players in the ecosystem?”
Indeed, one of the biggest hurdles is that embeddings are often **non-fungible** (BOK compared them to hashes, which is fair play!): the vectors depend on the training data, model architecture, dimensionality, and normalization choices of the platform that produced them. An embedding generated by one provider may not align meaningfully with an embedding from another. This makes direct exchange of vectors between DSPs, SSPs, publishers, and measurement partners difficult.
There are two promising approaches to addressing this challenge:
- Common Embedding Standards: Just as IAB created taxonomies for audience and content categories, the industry could converge on shared embedding frameworks (fixed dimensionality, normalization schemes, and benchmarks) so vectors from different players can interoperate.
- Translation Layers: Alternatively, LLMs or graph neural networks can act as “translators” between embedding spaces. A publisher’s vectors can be mapped into an advertiser’s embedding space via learned alignment models, enabling semantic portability without requiring all parties to retrain from scratch.
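As a rough illustration of the translation-layer idea, the sketch below learns a mapping from one embedding space to another using a handful of anchor items that both parties have embedded. It deliberately swaps in the simplest possible alignment model, a single linear map fit by least squares, rather than an LLM or graph neural network. The two-dimensional spaces, the anchor pairs, and the function names are all hypothetical; here the advertiser space is constructed as a 90-degree rotation of the publisher space so the correct answer is known.

```python
def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in W]

def fit_linear_map(src, dst, lr=0.5, epochs=500):
    """Least-squares fit of W so that W @ src[i] is close to dst[i] (gradient descent)."""
    d_out, d_in, n = len(dst[0]), len(src[0]), len(src)
    W = [[0.0] * d_in for _ in range(d_out)]
    for _ in range(epochs):
        grad = [[0.0] * d_in for _ in range(d_out)]
        for x, y in zip(src, dst):
            pred = matvec(W, x)
            for i in range(d_out):
                err = pred[i] - y[i]
                for j in range(d_in):
                    grad[i][j] += err * x[j]
        for i in range(d_out):
            for j in range(d_in):
                W[i][j] -= lr * grad[i][j] / n
    return W

# Hypothetical anchor items embedded by both a publisher and an advertiser.
publisher_anchors = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]]
advertiser_anchors = [[0.0, 1.0], [-1.0, 0.0], [-0.8, 0.6]]  # same items, rotated space

W = fit_linear_map(publisher_anchors, advertiser_anchors)

# Translate a publisher vector that was NOT among the anchors.
translated = matvec(W, [0.8, 0.6])
print([round(v, 3) for v in translated])
```

Real embedding spaces rarely differ by an exact linear transform, so the quality of such a mapping would have to be validated on held-out anchor pairs before any cross-platform use.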
In practice, embeddings don’t have to be universally fungible to create value. What matters is relative similarity within a given ecosystem. A DSP can still optimize bidding using its own embedding space, while a publisher uses another. Portability becomes most important in cross-platform measurement and data collaboration, where translation layers or federated standards will likely emerge as the connective tissue.