Lookalike Audiences Explained: The Complete Guide to Finding Your Best Customers on Google and Meta

If you have ever wished you could clone your best customers and find thousands more people just like them, you are in luck. That is precisely what lookalike audiences (or similar audiences) are designed to do. They are one of the most powerful tools in digital advertising, yet many marketers either do not fully understand how they work or are not using them to their full potential. In this guide, we will pull back the curtain on lookalike audiences, covering what they are, how to set them up on both Google and Meta, their benefits and limitations, and then get properly technical about what is happening under the hood. By the end, you will understand not just how to use them, but why they work the way they do.

What Are Lookalike Audiences?

At its core, a lookalike audience is a group of people who share similar characteristics with an existing group of your customers or engaged users. Think of it as asking the advertising platform to study your best customers, identify what makes them tick, and then find other people across its network who look and behave in similar ways.

Here's a simple analogy: imagine you run a coffee shop and you notice that your best customers all seem to be creative professionals who work from home, live within a few miles of the shop, and tend to visit mid-morning. You can't personally knock on every door in your area looking for similar people, but with lookalike audiences, the advertising platform essentially does this at scale. It analyses patterns in your existing customer data and finds new people who match those patterns, even if they've never heard of your business.

The concept has been around since Facebook (now Meta) introduced it in 2013, and it's since become an industry standard across most major advertising platforms including Google, LinkedIn, Snapchat, and TikTok.

What Are Lookalike Audiences Actually For?

Lookalike audiences are primarily designed for prospecting, which means finding new potential customers who haven't interacted with your business yet but are likely to be interested. They sit in the middle ground between two other common approaches:

Interest-based targeting is where you manually select demographics, interests, and behaviours you think your ideal customer might have. It's a bit of educated guesswork.

Retargeting focuses on people who've already visited your website or engaged with your brand. Great for conversions, but limited in reach.

Lookalike audiences bridge this gap. They help you reach cold audiences (people who don't know you yet) but with a much higher likelihood of being interested than random targeting. You're not guessing what your ideal customer looks like; you're using actual data from real customers to inform the targeting.

Common use cases include scaling successful campaigns to reach more potential buyers, entering new geographic markets using existing customer data as a template, launching new products to audiences similar to buyers of related products, and building brand awareness amongst the right demographics rather than broadcasting to everyone.

Setting Up Lookalike Audiences on Meta (Facebook and Instagram)

Meta's lookalike audiences have been around the longest and remain one of the most sophisticated implementations. Here's how to set them up and what you need to know.

Requirements

Before you can create a lookalike audience on Meta, you need a source audience (also called a seed audience). This must contain at least 100 people from a single country. However, Meta recommends having between 1,000 and 5,000 people in your source for best results. The more data the algorithm has to work with, the better it can identify meaningful patterns.

Your source audience can come from several places: a customer list you upload (email addresses, phone numbers, addresses), website visitors tracked via the Meta Pixel, app users and their activities, people who've engaged with your Facebook or Instagram content, video viewers, lead form submissions, or people who've interacted with your shopping catalogue.

You'll also need admin or advertiser permissions on the ad account and, if using pixel data, admin access to that pixel.

Step-by-Step Setup

First, log into Meta Ads Manager and go to the Audiences section (you can find this under All Tools or within Meta Business Suite). Click Create Audience and select Lookalike Audience from the dropdown.

Next, choose your source. This is the most important decision you'll make. Meta prefers value-based sources (where you've passed purchase values through the pixel), but you can also use custom audiences. If you're using a value-based source, select your pixel and the relevant event (purchases work best).

Then select the countries or regions where you want to find similar people. Interestingly, you don't need anyone from those locations in your source audience; Meta will find similar patterns regardless.

Finally, choose your audience size using a percentage scale from 1% to 10%. This percentage represents the proportion of the total population in your selected location. A 1% lookalike in the United States represents roughly 2 million people. Smaller percentages (1-2%) mean higher similarity to your source but smaller reach. Larger percentages (5-10%) give you more reach but less precision.

You can create multiple lookalike audiences at different percentage levels simultaneously, which is handy for testing. Once created, it can take anywhere from six to 24 hours for your lookalike audience to be ready.
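
To make the percentage scale concrete, here is a minimal Python sketch of the arithmetic. The 200 million base is an assumption chosen only so that 1% lines up with the "roughly 2 million" figure above; it is not an official Meta number, and the function name is hypothetical.

```python
def lookalike_size(base_population: int, percent: int) -> int:
    """Approximate audience size for a given lookalike percentage."""
    if not 1 <= percent <= 10:
        raise ValueError("the scale described above runs from 1% to 10%")
    return base_population * percent // 100

# Assumption: about 200 million people in the selected country.
ASSUMED_BASE = 200_000_000

sizes = {percent: lookalike_size(ASSUMED_BASE, percent) for percent in (1, 2, 5, 10)}
for percent, size in sizes.items():
    print(f"{percent}% lookalike: about {size:,} people")
```

The point of the exercise is simply that each step up the scale adds millions of progressively less similar people, which is why testing several levels side by side is worthwhile.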

Setting Up Lookalike Audiences on Google

Google's approach to lookalike audiences has evolved significantly. The original feature, called Similar Audiences (or Similar Segments), was retired in August 2023. In its place, Google now offers Lookalike Segments, but with some important differences from the old system and from Meta's implementation.

Requirements

Lookalike Segments on Google are exclusively available for Demand Gen campaigns. If you try to use them in other campaign types like Search or Display, they'll show as "Eligible - Limited" and won't be effective.

You need a seed list with a minimum of 100 active matched people (Google's documentation sometimes mentions 1,000 as ideal). This can come from customer match lists (uploaded customer data), remarketing lists from your website or app, YouTube channel interactions, or your existing data segments.

Unlike the old Similar Segments, which were generated automatically, Lookalike Segments have to be created manually.

Step-by-Step Setup

Create or open a Demand Gen campaign. Within your ad group, edit your audience targeting and look for the "+ New segment" option under the Lookalike segment section of Audience Builder.

Give your lookalike segment a descriptive name that includes details about the seed list, locations, and reach level. This makes it much easier to manage multiple segments later.

Select your seed list. Aim for high-quality, engaged users like purchasers, active app users, or YouTube subscribers rather than broad website visitors.

Choose your target location. As on Meta, you specify the countries you want to reach, and Google lets you include multiple countries in the same segment.

Select your reach level. Google offers three options: Narrow (approximately 2.5% of users in your target location, highest similarity), Balanced (approximately 5% of users, the default option), and Broad (approximately 10% of users, largest reach but less similar).

Save your segment. Google recommends creating your lookalike segment two to three days before your campaign starts, as it takes time to populate. The segment will then refresh automatically every one to two days based on updated customer data.

A key tip: when using lookalike segments, turn off optimised targeting in the ad group settings. This ensures your ads only target users in your lookalike segment rather than allowing Google to expand beyond it.
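
If you manage many segments, a small helper can keep the naming consistent. The sketch below is purely illustrative: the naming pattern and the function are hypothetical conventions rather than anything Google requires, and the percentages are the approximate reach levels quoted above.

```python
# Approximate share of users per reach level, as described above.
REACH_LEVELS = {"narrow": 2.5, "balanced": 5.0, "broad": 10.0}

def segment_name(seed_list: str, countries: list[str], reach: str) -> str:
    """Build a descriptive segment name from seed list, locations and reach."""
    if reach not in REACH_LEVELS:
        raise ValueError(f"reach must be one of {sorted(REACH_LEVELS)}")
    locations = "-".join(sorted(code.upper() for code in countries))
    return f"LAL | {seed_list} | {locations} | {reach.capitalize()} {REACH_LEVELS[reach]:g}%"

name = segment_name("Purchasers 180d", ["de", "at"], "narrow")
print(name)
```

A name built this way tells you at a glance which seed, which markets, and which reach level a segment uses, without opening it.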

The Benefits of Lookalike Audiences

There are several compelling reasons why lookalike audiences have become a staple of digital advertising strategies.

Better ROI from cold traffic: When you target people who share traits with your existing customers, conversion rates from cold audiences tend to be significantly higher. Some studies have shown 1% lookalikes delivering around 26% lower cost-per-acquisition compared to interest-based targeting when creative and budget are held constant.

Removes the guesswork: Instead of manually selecting interest categories and hoping they correlate with purchase intent, you're using actual customer behaviour data. The algorithm is finding patterns you might never have identified yourself.

Scalability: Once you've found what works with a narrow lookalike, you can progressively expand to larger percentages. This gives you a clear path to scaling campaigns whilst maintaining relevance.

Automatic updates: Both Google and Meta automatically refresh lookalike audiences based on your latest customer data, meaning your targeting stays current without manual intervention.

Works across products: You can create different lookalike audiences from different customer segments. Buyers of Product A might look different from buyers of Product B, and you can target each group's lookalikes accordingly.

Flexibility in similarity vs reach: The percentage controls give you the ability to balance precision against scale depending on your campaign objectives and budget.

The Limitations You Need to Know About

Lookalike audiences aren't perfect, and understanding their limitations is crucial for using them effectively.

Quality in, quality out: Your lookalike audience is only as good as your source audience. If your seed list contains a random mix of customers with nothing meaningful in common, the algorithm will struggle to find coherent patterns. A focused seed of 1,000 highly engaged users typically produces better results than a broad list of 50,000 mixed-quality contacts.

Privacy restrictions have reduced effectiveness: This is the elephant in the room. Apple's App Tracking Transparency (ATT), introduced with iOS 14.5, has significantly impacted lookalike audience quality on Meta. With roughly 88% of iOS users worldwide opting out of tracking, there's simply less behavioural data available to build accurate models. The days of hyper-precise lookalikes based on extensive cross-app tracking are largely over for iOS users.

Platform limitations: Google's Lookalike Segments only work in Demand Gen campaigns. Meta's lookalikes work across their ad formats but are confined to their platforms. You can't export a lookalike audience to use elsewhere.

Black box algorithms: You can see the size of your lookalike audience and some basic demographics, but you can't see exactly what criteria the algorithm used. This makes troubleshooting poor performance difficult.

Audience fatigue: Performance typically declines over time as you continuously target the same lookalike audience. You need to regularly refresh source audiences with new customer data and rotate between different lookalike bases.

Minimum size requirements: If your business is new or operates in a niche market, you may struggle to hit the minimum seed size (100 people, though 1,000+ is recommended). Without enough data, the algorithms can't identify meaningful patterns.

Potential for bias: If your current customer base is skewed in some way (perhaps due to your historical marketing targeting), your lookalike will perpetuate that skew rather than help you reach genuinely new demographics.

Under the Hood: How Lookalike Modelling Actually Works

Now let's get into the technical details. Understanding what's happening behind the scenes helps explain both the power and the limitations of lookalike audiences.

The Basic Process

Lookalike modelling is fundamentally a machine learning technique that works through several stages.

Stage 1: Data Collection and Enrichment. When you upload a seed audience or create one from pixel data, the platform enriches this with its own first-party data. Meta, for example, uses over 200 factors to understand user behaviour. This includes demographic information (age, gender, location, education, job title), behavioural signals (purchase history, browsing patterns, app usage, device types), interest indicators (pages liked, groups joined, content engaged with), and network information (who users are connected to, community overlaps).

Stage 2: Feature Engineering. The algorithm identifies which characteristics are most predictive of belonging to your seed audience. This is where the magic happens. Rather than treating all data points equally, the model learns which combinations of features best define your audience. For example, it might discover that your customers aren't just "women aged 25-34" but more specifically "women aged 25-34 who have engaged with fitness content, own Apple devices, and have previously purchased from subscription-based services."

Stage 3: Similarity Scoring. Once the model understands what makes your seed audience distinctive, it scores the broader population on how closely each user matches. This typically uses distance functions or statistical models to calculate a similarity score for every user in the target geography. Users are ranked from most similar to least similar.

Stage 4: Audience Generation. When you select a percentage (say 1%), the platform takes the top 1% of scored users and creates your lookalike audience. A 5% lookalike includes a larger pool but necessarily includes users with lower similarity scores.
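
Stages 3 and 4 can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not how Meta or Google actually score users: it uses cosine similarity to the average seed profile as the similarity function, and the three features and all user data are invented.

```python
import math

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def centroid(vectors):
    """Average feature vector of the seed audience."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]

def build_lookalike(seed, population, percent):
    """Rank users by similarity to the seed and keep the top `percent`."""
    target = centroid(seed)
    ranked = sorted(population, key=lambda uid: cosine(population[uid], target), reverse=True)
    keep = max(1, math.ceil(len(ranked) * percent / 100))
    return ranked[:keep]

# Hypothetical features: [fitness content, subscription purchases, gaming]
seed = [[0.9, 0.8, 0.1], [0.8, 0.9, 0.2]]
population = {
    "u1": [0.9, 0.9, 0.1],
    "u2": [0.1, 0.2, 0.9],
    "u3": [0.7, 0.6, 0.3],
    "u4": [0.2, 0.1, 0.8],
    "u5": [0.5, 0.5, 0.5],
}

print(build_lookalike(seed, population, 20))  # narrowest slice
print(build_lookalike(seed, population, 40))  # wider slice, lower similarity
```

Widening the percentage never changes who ranks first; it only moves the cut-off further down the same ranked list, which is why broader lookalikes necessarily contain less similar users.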

The Machine Learning Algorithms

While the exact algorithms used by Google and Meta are proprietary, lookalike modelling typically employs several machine learning techniques.

Logistic Regression is a foundational approach that predicts the probability of a user belonging to the seed group based on weighted feature combinations. It's computationally efficient and interpretable.

Random Forests combine multiple decision trees to improve predictive accuracy. They're particularly good at handling large datasets with many features and can capture complex, non-linear relationships between variables.

Neural Networks are increasingly used for more sophisticated modelling. They can identify patterns across hundreds of variables simultaneously and learn hierarchical representations of user behaviour.

K-Nearest Neighbours (KNN) is another common approach that uses proximity to make predictions. For each candidate in the general population, it looks at the 'k' most similar users whose status is already known and scores the candidate by how many of those neighbours belong to your seed audience.

Support Vector Machines (SVMs) work by finding the boundary that best separates your seed audience from the general population. Users close to your side of the boundary become your lookalike audience.
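
To make one of these techniques concrete, here is a minimal K-Nearest Neighbours sketch using only Python's standard library. It is an illustration rather than any platform's implementation: the two features, the data, and the choice to treat a sample of the general population as non-seed users are all assumptions.

```python
import math

def knn_score(candidate, labelled, k=3):
    """Fraction of the k nearest labelled users that are seed members."""
    nearest = sorted(labelled, key=lambda row: math.dist(candidate, row[0]))[:k]
    return sum(is_seed for _, is_seed in nearest) / k

# (features, is_seed) pairs; features are hypothetical engagement scores.
labelled = [
    ([0.90, 0.80], 1),
    ([0.80, 0.90], 1),
    ([0.85, 0.70], 1),
    ([0.10, 0.20], 0),
    ([0.20, 0.10], 0),
    ([0.30, 0.30], 0),
]

for candidate in ([0.8, 0.8], [0.55, 0.5], [0.2, 0.2]):
    print(candidate, round(knn_score(candidate, labelled), 2))
```

A candidate surrounded by seed members scores close to 1, one surrounded by ordinary users scores close to 0, and borderline candidates land in between.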

A critical technical point: lookalike modelling is naturally framed as what's called PU learning (Positive-Unlabelled learning). This means the model only has confirmed positive examples (your seed audience) but doesn't have confirmed negative examples (people you know definitely aren't similar). Everyone outside the seed is simply unlabelled. This is different from traditional classification, where you have both positive and negative training data.

This PU learning approach makes lookalike models easy to use (you just need your customer list) but also introduces potential for bias. If your seed audience shares some characteristic that's irrelevant to actual purchase intent (perhaps they all signed up during a specific promotional period), the model might latch onto that characteristic and create a lookalike based on it, even though it won't predict future conversions well.
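
The promotional-period problem can be reproduced with a toy model. The sketch below uses a deliberately naive PU baseline (treat every unlabelled user as a negative and fit a logistic regression by gradient descent); it is an assumption-laden illustration, not a description of what Meta or Google run, and both features are hypothetical.

```python
import math

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(rows, labels, epochs=500, lr=0.5):
    """Plain stochastic gradient descent on the logistic loss."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            error = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) - y
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias

def score(user, weights, bias):
    """Predicted probability that a user belongs to the seed group."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, user)))

# Features: [fitness_engagement, signed_up_during_promo]
seed = [[0.9, 1.0], [0.8, 1.0], [0.7, 1.0]]  # every seed member joined via the promo
unlabelled = [[0.9, 0.0], [0.8, 0.0], [0.2, 0.0], [0.1, 0.0]]

rows = seed + unlabelled
labels = [1] * len(seed) + [0] * len(unlabelled)  # naive step: unlabelled treated as negative
weights, bias = train_logistic(rows, labels)

engaged_no_promo = score([0.9, 0.0], weights, bias)
unengaged_promo = score([0.1, 1.0], weights, bias)
print(f"engaged, no promo:  {engaged_no_promo:.3f}")
print(f"unengaged, promo:   {unengaged_promo:.3f}")
```

In this toy data the promo flag separates the seed from everyone else perfectly, so the model leans on it: a barely engaged user who happened to sign up during the promotion outscores a highly engaged user who didn't. That is exactly the kind of artefact a carefully curated seed is meant to avoid.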

How Meta's Algorithm Works Specifically

Meta's system analyses your source audience and extracts data about demographics, activities, behaviours, and interests. It then tries to generate a similar audience by finding users across Facebook and Instagram who exhibit matching patterns.

Importantly, while the definition of your lookalike audience (source, location, and percentage) stays the same once created, Meta's ad delivery system continues learning within that audience. As people engage with your ads, the algorithm prioritises those more likely to convert, refining performance without changing the audience definition itself. This is why creative quality and offer strength still matter enormously, even with great targeting.

For value-based lookalikes, Meta also considers purchase amounts, allowing it to prioritise finding users similar to your highest-value customers rather than just any customer.

How Google's Algorithm Differs

Google's approach has shifted significantly since retiring Similar Audiences. The new Lookalike Segments focus heavily on first-party data from channels you control (website traffic, CRM data, YouTube engagement) rather than relying on third-party data.

Google analyses your seed list and identifies common characteristics, then finds users who share those traits. The three reach levels (Narrow, Balanced, Broad) correspond to different similarity thresholds, targeting 2.5%, 5%, or 10% of users in your selected location who are most similar to your seed.

Because Google's system is now tied specifically to Demand Gen campaigns, it can optimise across YouTube, Discover, and Gmail placements while maintaining the lookalike targeting.

The Privacy Revolution: Why Lookalikes Aren't What They Used To Be

The effectiveness of lookalike audiences has been fundamentally altered by privacy changes, and understanding this is crucial for modern advertisers.

Apple's App Tracking Transparency, introduced in iOS 14.5, requires apps to ask permission before tracking users across other companies' apps and websites. The opt-out rate has been devastating for advertisers, with roughly 88% of iOS users worldwide choosing not to allow tracking (96% in the United States).

For lookalike audiences, this means Meta has significantly less cross-app behavioural data for iOS users. The ability to create highly granular custom audiences and lookalikes based on detailed user behaviour across apps and websites has become substantially less effective for this large segment of users.

Similarly, restrictions on third-party cookies (blocked by default in Safari and heavily restricted in Firefox, although Google has since dropped its plan to remove them from Chrome) reduce the behavioural signals available for web-based tracking.

The practical impact: advertisers now need much larger audience sizes than before. Whereas an audience of 50,000 to 200,000 users was once sufficient for effective Facebook campaigns, you should now aim for at least 500,000 users to give the algorithm enough data to work with. Campaigns with smaller audiences risk getting stuck in the learning phase indefinitely.

Adapting to the New Reality

Several strategies can help maintain lookalike effectiveness in this privacy-first world.

Prioritise first-party data: Collect customer data directly through your website, email signups, and loyalty programmes. This data is unaffected by tracking restrictions and creates highly targeted lookalikes without relying on third-party trackers.

Use Meta's Conversions API (CAPI): This sends conversion data directly from your server to Meta, bypassing browser-level restrictions. It provides more robust data matching by including hashed email addresses, phone numbers, and other identifiers alongside conversion events.

Leverage platform-native engagement: Custom audiences created from Meta's own sources (video viewers, lead forms, Instagram engagement, Facebook Page interactions) aren't affected by ATT restrictions. These make excellent seeds for lookalikes.

Test broader targeting: Meta's Advantage+ campaigns and Google's optimised targeting increasingly use machine learning to find converters without traditional audience definitions. Some advertisers are finding these perform as well as or better than traditional lookalikes.
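
Since both customer list uploads and server-side tracking rely on hashed identifiers, here is a minimal sketch of what normalising and hashing looks like in Python. The field names and the exact normalisation rules below are assumptions for illustration; check each platform's current documentation for its required format before sending real data.

```python
import hashlib

def normalise_email(email: str) -> str:
    """Trim whitespace and lowercase so the same address always hashes the same."""
    return email.strip().lower()

def normalise_phone(phone: str) -> str:
    """Keep digits only (assumes the country code is already included)."""
    return "".join(ch for ch in phone if ch.isdigit())

def sha256_hex(value: str) -> str:
    return hashlib.sha256(value.encode("utf-8")).hexdigest()

def hash_customer(email: str, phone: str) -> dict:
    """Return hashed identifiers; the key names here are hypothetical."""
    return {
        "email_sha256": sha256_hex(normalise_email(email)),
        "phone_sha256": sha256_hex(normalise_phone(phone)),
    }

record = hash_customer("  Jane.Doe@Example.com ", "+43 660 1234567")
print(record)
```

Normalising before hashing matters because a hash of "Jane.Doe@Example.com" and a hash of "jane.doe@example.com" are completely different strings, and an unmatched identifier simply shrinks your seed.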

Real-World Case Study: How We Built Precision Lookalikes for an Automotive Marketplace

Theory is all well and good, but let's look at how this plays out in practice. We recently worked on a project for Auto-Prammer.at, an Austrian automotive marketplace, where we got rather crafty with event-based audience building and lookalike targeting. The results were impressive, and the approach illustrates just how granular you can get when you think creatively about your data.

The Challenge

Car buyers aren't a homogenous group. Someone shopping for an Audi A3 is a completely different prospect from someone looking at a BMW X5 or a Volkswagen Transporter. They have different budgets, different priorities, and respond to different messaging. The challenge was to move beyond generic "car buyer" targeting and create hyper-specific audiences based on actual browsing behaviour, then use those to find similar prospects at scale.

The Crafty Bit: Event-Based Audience Architecture

Rather than simply tracking page views, we built a sophisticated event tracking system that captured meaningful intent signals. Here's where it gets interesting.

Brand-specific browsing: We created custom events that fired when users viewed vehicles from specific manufacturers. Someone who looked at three different Audi listings got tagged as an "Audi Interested" user. Same for BMW, Mercedes, Volkswagen, and every other brand on the platform.

Scroll depth as an intent signal: Here's where we got clever. A page view doesn't tell you much, but scroll depth tells you far more. We tracked how far users scrolled on individual vehicle listings. Someone who lands on an Audi A3 page and immediately bounces is very different from someone who scrolls through all the photos, reads the full specification list, and spends time on the financing section. We set thresholds: users who scrolled past 75% of a listing were flagged as "high intent" for that specific vehicle type.

Model-level granularity: We didn't stop at brand level. The system tracked interest at the model level too. Users showing deep engagement with Audi A3 listings became part of an "Audi A3 High Intent" audience, separate from "Audi Q5 High Intent" or "Audi A6 High Intent." This gave us incredibly specific segments.

Price bracket segmentation: We also layered in price sensitivity. Users who consistently viewed vehicles in the €15,000 to €25,000 range were categorised differently from those browsing €50,000+ listings. This prevented us from showing premium vehicle ads to budget-conscious shoppers and vice versa.
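
The audience rules described above can be sketched as a small Python function. This is a simplified reconstruction for illustration, not the production tracking code: the data structure, field names, and the price brackets other than the two mentioned above are assumptions.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass

HIGH_INTENT_SCROLL = 0.75  # 75% scroll-depth threshold from the text
BRAND_VIEW_THRESHOLD = 3   # three different listings from one brand

@dataclass
class ListingView:
    user_id: str
    listing_id: str
    brand: str
    model: str
    price_eur: int
    scroll_depth: float  # 0.0 (top of page) to 1.0 (bottom)

def price_bracket(price_eur: int) -> str:
    """Only 15k-25k and 50k+ come from the text; the other cut-offs are assumed."""
    if price_eur < 15_000:
        return "under-15k"
    if price_eur <= 25_000:
        return "15k-25k"
    if price_eur < 50_000:
        return "25k-50k"
    return "50k-plus"

def build_audiences(views):
    """Map audience names to the set of user ids that qualify."""
    audiences = defaultdict(set)
    listings_seen = defaultdict(set)  # (user, brand) -> distinct listing ids
    for view in views:
        listings_seen[(view.user_id, view.brand)].add(view.listing_id)
        if view.scroll_depth >= HIGH_INTENT_SCROLL:
            audiences[f"{view.brand} {view.model} High Intent"].add(view.user_id)
    for (user_id, brand), listings in listings_seen.items():
        if len(listings) >= BRAND_VIEW_THRESHOLD:
            audiences[f"{brand} Interested"].add(user_id)
    return dict(audiences)

def dominant_price_bracket(views, user_id):
    """The bracket a user viewed most often, or None if they have no views."""
    counts = Counter(price_bracket(v.price_eur) for v in views if v.user_id == user_id)
    return counts.most_common(1)[0][0] if counts else None

views = [
    ListingView("u1", "a3-001", "Audi", "A3", 22_000, 0.90),
    ListingView("u1", "a3-002", "Audi", "A3", 24_000, 0.80),
    ListingView("u1", "q5-001", "Audi", "Q5", 48_000, 0.20),
    ListingView("u2", "x5-001", "BMW", "X5", 65_000, 0.10),
]
audiences = build_audiences(views)
print(audiences)
print(dominant_price_bracket(views, "u1"))
```

In this example, user u1 qualifies for both "Audi Interested" and "Audi A3 High Intent" but not for a Q5 audience, because the Q5 listing was only skimmed. User u2 bounced and joins nothing.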

Putting It Into Action: Retargeting Meets Prospecting

With these granular audiences built, we could do two powerful things simultaneously.

Precision retargeting: When someone showed high intent on Audi A3 listings (multiple views, deep scroll depth, time on page), we could retarget them specifically with Audi A3 inventory ads. Not generic "check out our cars" messaging, but "New Audi A3 listings matching your search" with creative featuring that exact model. The relevance was immediately apparent, and click-through rates reflected it.

Lookalike prospecting by vehicle type: Here's where the real magic happened. We took our "Audi A3 High Intent" audience and created a 1% lookalike from it. Think about what that means: we weren't just finding people similar to "car shoppers." We were finding people who shared characteristics with users who had demonstrated specific, deep interest in Audi A3 vehicles. These lookalikes were served Audi A3 focused ads, reaching cold prospects who had never visited the site but matched the profile of engaged Audi A3 shoppers.

We repeated this pattern across the most popular vehicle types on the platform. BMW 3 Series lookalikes got BMW 3 Series ads. Mercedes C-Class lookalikes got Mercedes C-Class ads. The messaging matched the intent signal that had qualified the original seed audience.

Why This Approach Works So Well

The beauty of this setup comes down to seed audience quality and consistency, which, as we discussed in the technical section, is everything for lookalike performance.

By using scroll depth and engagement time rather than simple page views, we ensured our seed audiences contained genuinely interested users, not accidental clicks or bounced visitors. The algorithm had clean, high-quality signals to work with.

By segmenting at the brand and model level, we kept each seed audience internally consistent. Everyone in the "Audi A3 High Intent" audience had demonstrated the same type of behaviour on the same type of content. There was no signal dilution from mixing SUV shoppers with hatchback shoppers or luxury buyers with budget hunters.

And by matching the creative to the lookalike source, we maintained relevance all the way through. The prospects in our Audi A3 lookalike were seeing Audi A3 ads because that's what people like them had shown interest in. The whole funnel was coherent.

The Results

The granular approach significantly outperformed generic automotive targeting. Cost per lead dropped because we weren't wasting impressions on poorly matched prospects. Conversion rates improved because the journey from ad to listing felt seamless and relevant. And we could scale each vehicle category independently, putting more budget behind lookalikes that performed well and pausing those that didn't.

Perhaps most valuably, we built a system that continuously improves itself. As more users engage with the site, the seed audiences grow and refresh. New high-intent users feed into the lookalike models, keeping them current. It's a flywheel effect: better targeting brings better traffic, which creates better seed audiences, which enables better lookalikes.

Key Takeaways for Your Own Implementation

If you're running an e-commerce operation with diverse product categories, this approach translates directly. Think about what behavioural signals indicate genuine interest in your context. Page views are table stakes; scroll depth, time on page, configurator interactions, size selector usage, colour swatch clicks, add-to-wishlist actions: these are the intent signals that separate browsers from buyers.

Build your audience architecture around product categories that have meaningfully different customer profiles. Don't mix them. Create separate seeds and separate lookalikes for each, and match your creative accordingly.

The extra setup work pays dividends. Generic lookalikes from "all website visitors" will always underperform compared to crafted lookalikes from "high-intent visitors to specific product categories." The algorithm is only as smart as the data you feed it, and thoughtful event architecture gives it much better data to work with.

Best Practices for Maximum Results

Based on how lookalike algorithms work, here are practical tips to get the best results.

Start with your best customers, not all customers: A seed list of your top 20% of customers by lifetime value will produce better lookalikes than your entire customer database. The algorithm needs clear patterns to identify, and mixing high-value and low-value customers dilutes those patterns.

Keep seed audiences consistent: If you're an e-commerce business with multiple product categories, create separate lookalikes for each category rather than mixing all purchasers together. Someone who buys gardening supplies looks very different from someone who buys electronics, and mixing them produces a confused lookalike.

Test percentage levels: Don't assume 1% is always best. Sometimes 5% or even 10% lookalikes outperform narrower ones, especially if your seed audience is small or your product has broad appeal. Always A/B test different levels.

Refresh regularly: Update your source audiences with new customer data monthly. Create new lookalikes based on recent conversions. Rotate between different lookalike bases to combat audience fatigue.

Layer thoughtfully: You can combine lookalike audiences with additional targeting like age ranges or geographic restrictions. This is useful when you know your product genuinely doesn't suit certain demographics, but be careful not to over-restrict and reduce your audience to an unworkable size.

Exclude existing customers: When running prospecting campaigns with lookalikes, exclude your existing customer list and recent website visitors. You don't want to pay to reach people who already know you.

Give campaigns time to learn: Don't judge performance after 24 hours. The algorithms need time (typically 3-7 days) to learn which users within your lookalike are most likely to convert. Early results are often misleading.
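
The first tip is easy to put into practice before you upload anything. Here is a minimal Python sketch for picking the top 20% of customers by lifetime value; the customer ids and values are invented, and in practice the input would come from your CRM or order database.

```python
import math

def top_customers_by_ltv(ltv_by_customer: dict, percent: int = 20) -> list:
    """Return the top `percent` of customer ids, ranked by lifetime value."""
    ranked = sorted(ltv_by_customer, key=ltv_by_customer.get, reverse=True)
    keep = max(1, math.ceil(len(ranked) * percent / 100))
    return ranked[:keep]

# Hypothetical lifetime values in your reporting currency.
ltv_by_customer = {f"c{i}": i * 100 for i in range(1, 11)}  # c1=100 ... c10=1000

seed_customers = top_customers_by_ltv(ltv_by_customer)
print(seed_customers)
```

Remember the platform minimums when you apply a cut like this: if the top 20% leaves you with fewer matched people than the seed requirement, widen the percentage rather than uploading a list that is too small to model.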

The Future of Lookalike Audiences

The advertising industry is adapting to a privacy-first world, and lookalike audiences are evolving along with it.

Meta's Andromeda update represents a shift in how their ad system works. The emphasis has moved from precise audience targeting to creative diversification as the primary lever for finding relevant audiences. The algorithm now rewards variety in ad creative and punishes narrow targeting and minor ad variations.

Google's replacement of Similar Audiences with Lookalike Segments (limited to Demand Gen campaigns) and increased reliance on features like Optimised Targeting signals a similar direction. The platforms are increasingly confident that their machine learning can find converters without explicit audience definitions, as long as they have sufficient first-party data and creative signals to work with.

For advertisers, the message is clear: lookalike audiences remain valuable, but they're one tool among many rather than the silver bullet they once were. Success increasingly depends on the quality of your first-party data, the strength of your creative, and your willingness to let algorithms optimise within broader parameters.

Wrapping Up

Lookalike audiences remain one of the most effective ways to find new customers who are likely to be interested in your business. By analysing patterns in your existing customer data and finding similar users at scale, they bridge the gap between random targeting and the limited reach of retargeting.

The key points to remember: your seed audience quality determines your lookalike quality; start with engaged, high-value customers; test different percentage levels; and refresh your audiences regularly to combat fatigue.

The privacy landscape has changed the game. iOS 14.5 and the decline of third-party cookies have reduced the behavioural data available for building precise lookalikes. Adapt by prioritising first-party data collection, using server-side tracking like Conversions API, and leveraging platform-native engagement sources.

Understanding the machine learning behind lookalike modelling helps explain both their power and their limitations. These algorithms find patterns you'd never identify manually, but they can also latch onto irrelevant characteristics if your seed data isn't carefully curated.

Whether you're using Google's Lookalike Segments in Demand Gen campaigns or Meta's lookalike audiences across Facebook and Instagram, the fundamental principle is the same: give the algorithm high-quality data about your best customers, and it will find more people like them. In an increasingly automated advertising landscape, that data quality is your competitive advantage.

Need help setting up effective lookalike audiences or improving your paid media performance? I build data-driven advertising strategies that actually convert. Let's talk about finding more customers like your best ones.