How to bypass Cloudflare (updated for 2025)

Cloudflare is the final boss of web scraping. Here's how to beat it with Crawlee, Playwright, and Camoufox.

Ask any developer who has built bots—whether for automation or web scraping—and they'll tell you that Cloudflare is the biggest challenge. That's because its anti-bot solutions are highly effective and continuously updated to thwart most automated scripts.

Plus, with the rise of AI agents, more and more sites are adopting Cloudflare to protect against bots. So it's becoming an increasingly critical hurdle.

In this guide, you'll learn how to build a solution to bypass Cloudflare efficiently and consistently across any protected site. We'll walk you through the entire process, from setting up your project to testing your Cloudflare-ready scraper in the cloud!


What are Cloudflare's anti-scraping defenses?

Cloudflare is a global network that many websites rely on for performance and web security services. It's best known for protecting sites from malicious bots and abuse.

Specifically, its anti-scraping solutions pose a serious challenge through a mix of defenses: JavaScript-based challenges, IP reputation filtering, TLS fingerprinting, behavior analysis, rate limiting, and even an AI Labyrinth designed to stop AI crawlers.

If your scraper triggers any of these protections, you might receive an HTTP 403 Forbidden or 429 Too Many Requests error—or be denied access entirely.
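In practice, a scraper can flag these responses by status code before deciding whether to retry. The helper below is purely illustrative (the status list is a heuristic, not an official Cloudflare contract):

```javascript
// Illustrative helper: classify status codes commonly returned when Cloudflare
// blocks a request. 403 = forbidden, 429 = rate limited; a 503 serving a
// challenge page is also common. Treat this as a heuristic, not a guarantee.
function isLikelyCloudflareBlock(statusCode) {
    return [403, 429, 503].includes(statusCode);
}
```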

How does Cloudflare detect bots?

To identify bots, Cloudflare relies on both server-side and client-side methods. If a request comes from an untrusted IP or looks suspicious, it gets blocked immediately. Otherwise, Cloudflare serves a client-side protection page to the browser.

If the client passes the background checks performed in the browser, access is granted automatically. If not, or if verification is required, a one-click Turnstile challenge appears:

The Cloudflare Turnstile challenge on stackoverflow.com

Data about the challenge interaction is collected, analyzed locally, and sent to Cloudflare's servers to determine whether the visitor is a bot or a legitimate user.

Server-side detection techniques:

  • IP reputation and ASN checks
  • Header anomalies and malformed requests
  • TLS fingerprint mismatches and JA3/JA4 analysis

Client-side detection techniques:

  • JavaScript challenges and browser fingerprinting checks (e.g., canvas, fonts, WebGL)
  • Mouse movement and interaction tracking when clicking on the Turnstile checkbox
  • Cookie and local storage verification
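To get a feel for how these client-side signals combine, here's a toy scoring function. It's purely illustrative: real fingerprinting systems weigh hundreds of signals, and the signal names and weights below are made up for the sake of the example.

```javascript
// Purely illustrative: a toy score combining a few well-known automation
// signals. Real systems evaluate hundreds of signals with tuned weights.
function fingerprintSuspicionScore(signals) {
    let score = 0;
    if (signals.webdriver) score += 2;         // navigator.webdriver === true
    if (signals.pluginCount === 0) score += 1; // empty navigator.plugins
    if (signals.headlessUserAgent) score += 2; // e.g. "HeadlessChrome" in the UA
    return score;
}
```

A stock headless browser trips several of these at once, which is exactly the gap anti-detect browsers like Camoufox aim to close.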

How to bypass Cloudflare

In this guided section, you'll learn how to systematically bypass Cloudflare using Crawlee, Playwright, and Camoufox.

And what better site to test this setup on than Cloudflare itself? If we can bypass their anti-bot protection on their own domain, we can bypass it anywhere. Specifically, we'll target the Top Developer Discussions page from the Cloudflare Community forum:

Accessing the target page

As you'll notice, this site is protected by Cloudflare's anti-bot solution. Once we successfully bypass the bot protection, we'll also scrape some data from the page.

The entire process will involve the following steps:

  1. Prerequisites and project setup
  2. Get familiar with the "Crawlee + Playwright + Camoufox" template
  3. Residential proxy configuration
  4. Connect to the target page
  5. Implement the scraping logic
  6. Collect the scraped data
  7. Complete code
  8. Run the scraper

Let's dive in!

1. Prerequisites and project setup

The easiest way to bypass Cloudflare protections is by using the "Crawlee + Playwright + Camoufox" template available on Apify. With this approach, you'll build a cloud-based scraper that can bypass Cloudflare every time—without worrying about local setup.

Everything—from configuration to coding and deployment—will be handled entirely in the Apify online platform.

To follow this approach, all you need is an Apify account (a free one is enough).

📌
Note: If you prefer working locally, you can achieve the same result using the Apify JavaScript SDK in your own Node.js project.

Now, to initialize your Cloudflare-ready scraping project on the Apify platform:

  1. Log in
  2. Reach the Console
  3. Under the "Actors" dropdown, select "Development" and click the "Develop new" button:
Clicking the “Develop new” button

Next, select the "View all templates" option:

Clicking the “View all templates” option

In the "JavaScript" section, click on the "Crawlee + Playwright + Camoufox" card:

Selecting the “Crawlee + Playwright + Camoufox” template

Inspect the starter project code and select "Use this template" to fork it:

Forking the “Crawlee + Playwright + Camoufox” template

Wait while Apify creates a new Actor based on the selected template.

You'll then be redirected to the Apify Web IDE, where you can customize your Actor. For example, name it "Cloudflare Community Scraper." Also, you can write your scraping logic directly in the browser—no need to install libraries or set up a local environment:

The Apify Web IDE with the forked code

Under the hood, the "Crawlee + Playwright + Camoufox" template automatically sets up and integrates the following libraries for you:

  • Crawlee: A web scraping and automation library for Node.js and Python, simplifying the process of building reliable crawlers and scrapers with features like request management and task scheduling.
  • Playwright: A Node.js library developed by Microsoft for reliable cross-browser automation, enabling programmatic interaction with web pages.
  • Camoufox: An open-source anti-detect browser built for robust fingerprint injection and advanced anti-bot evasion. It's a stealthy, minimalistic, custom build of Firefox, purpose-built for web scraping, and it integrates natively with Playwright.

2. Get familiar with the "Crawlee + Playwright + Camoufox" template

Before jumping into coding, you should first get familiar with the existing code. In the Web IDE, you'll notice that the src/ folder contains two key files:

  • main.js: Handles the Crawlee initialization logic, including integration with Playwright and Camoufox
  • routes.js: Defines the Crawlee route-handling logic, where the actual scraping behavior is implemented

Here's what main.js should look like:

The main.js file in the Apify Web IDE

As you can see, the template has already configured PlaywrightCrawler to work with Camoufox for you. This means your Actor will automatically render pages using a Camoufox instance controlled by Playwright.

And this is routes.js:

The routes.js file in the Apify Web IDE
📌
Note: Some lines of code may differ slightly, as the template you used to create your project may be updated over time.

3. Residential proxy configuration

In main.js, notice how the Crawlee instance is already set up to work with Apify proxies:

const proxyConfiguration = await Actor.createProxyConfiguration();
📌
Note: If you want to use your own proxies, refer to the ProxyConfigurationOptions.proxyUrls option.
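For reference, a custom setup might look like the sketch below. The `proxyUrls` option name comes from the Apify SDK; the proxy URLs themselves are placeholders you'd replace with your own.

```javascript
// Sketch of the options object you would pass to Actor.createProxyConfiguration()
// when supplying your own proxies. The URLs below are placeholders.
const proxyConfigurationOptions = {
    proxyUrls: [
        'http://username:password@proxy-1.example.com:8000',
        'http://username:password@proxy-2.example.com:8000',
    ],
};
```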

In the Crawlee + Playwright + Camoufox setup, connecting through a residential proxy is key to bypassing Cloudflare. The reason is that, if you deploy your code on a VPS or in a data center, your scraper will trigger Cloudflare's server-side bot detection. That's because the IPs from such servers typically have a low trust score and are commonly flagged as suspicious.

Thus, Cloudflare will block your request. After all, while Camoufox provides advanced anti-bot capabilities at the browser level, it cannot compensate for a low-trust IP.

📌
Note: If you don't operate on trusted IPs, your Camoufox-based Cloudflare bypass script will likely fail with the following 403 error:

ERROR PlaywrightCrawler: Request failed and reached maximum retries. Error: Request blocked - received 403 status code.

To avoid that, your scraper must rely on reliable exit IPs, such as those offered by residential proxies. Apify provides residential proxies even on the free plan, so you do not have to pay to use them.

To view the available proxies in your Apify account, click on the "Proxy" link and switch to the "Groups" section:

Note the proxy groups

You'll see a group named "RESIDENTIAL" available by default:

The residential proxy group

You can configure your scraper to use that group like so:

const proxyConfiguration = await Actor.createProxyConfiguration({
    groups: ['RESIDENTIAL'], // connect to residential proxies
    countryCode: 'US', // optional: restrict to U.S. IPs
});

In this case, we also specified that we only want IPs from the United States.

With this setup, your scraper will now:

  • Use Camoufox to evade browser-based detection
  • Operate over trusted residential IPs, improving your chances of bypassing Cloudflare every time

4. Connect to the target page

In main.js, you'll notice that if no start URLs are defined in the Actor's input, it defaults to the Apify homepage. Change that to your actual target URL, which is the Cloudflare Community site:

const {
    startUrls = ['https://community.cloudflare.com/c/developers/39/l/top?period=yearly'],
} = await Actor.getInput() ?? {};

Next, in routes.js, remove the addHandler() method for 'detail' pages. Also, simplify the logic by replacing the callback in addDefaultHandler() with a function that logs the raw HTML content of the page. Your routes.js file should now look like this:

import { Dataset, createPlaywrightRouter } from 'crawlee';

export const router = createPlaywrightRouter();

router.addDefaultHandler(async ({ request, page, log }) => {
    // retrieve the page HTML content and log it
    const html = await page.content();
    log.info(html);
});

Click "Save & Build" to build your Actor for the first time:

Clicking the “Save & Build” button

Reach the "Input" tab and paste the Cloudflare-protected URL you want to scrape in the "Start URLs" field:

Configuring the desired target page and running the Actor

Now, press "Save & Start" to run your Cloudflare Community Scraper Actor.

If everything works as intended, your Actor should bypass Cloudflare and log the full HTML of the target page. If not, you'll see a 403 Forbidden error instead.
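Besides checking the status code, you can sniff the returned HTML for challenge-page markers. The list below is illustrative, not exhaustive ("Just a moment..." is the title Cloudflare commonly uses on its interstitial challenge pages):

```javascript
// Hypothetical helper: inspect the returned HTML for markers of a Cloudflare
// challenge page. The marker list is illustrative, not exhaustive.
function looksLikeCloudflareChallenge(html) {
    const markers = ['Just a moment...', 'cf-challenge', 'challenge-platform'];
    return markers.some((marker) => html.includes(marker));
}
```

Running a check like this on `page.content()` makes it easier to tell a real failure apart from a page that simply loaded slowly.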

Below is what you should see in the full log of your run:

Note the HTML of the target page in the full logs

As you can tell, the HTML of the Cloudflare-protected page was successfully retrieved and logged. That confirms that the setup works perfectly for bypassing Cloudflare.

5. Implement the scraping logic

Now that you've confirmed you can connect to the target page without issues, it's time to scrape some real content from it.

Specifically, we'll extract discussion topics from the Cloudflare Community target page. These are visible after scrolling a bit down the page:

The discussion table on the target page

First, open the target site in Incognito mode in your browser (to ensure a clean session), right-click on the discussion table, and choose the "Inspect" option:

The HTML of a discussion table row

If you're not familiar with how to use browser DevTools, read our guide on inspecting elements.

You'll see that each discussion thread is represented by a .topic-list-item element inside the .topic-list container.

In the addDefaultHandler() method, use a Playwright locator to select all topic rows and loop through them:

const topicElements = page.locator('.topic-list .topic-list-item');
for (const topicElement of await topicElements.all()) {
    // scraping logic...
}

Now, focus on the content inside each row of the discussion table. Start by analyzing the cells on the left:

The HTML of the left-side of a discussion table row

Next, take a look at the cells on the right:

The HTML of the right-side of a discussion table row

Note that, from each thread, you can extract:

  • The discussion title from the text of the .main-link a.raw-topic-link node
  • The relative URL to the discussion page from the href attribute of the same element
  • The number of replies from .posts
  • The number of views from the .views node
  • The last activity date (in UNIX format) from the data-time HTML attribute of the .activity span element

Implement the scraping logic with the following code:

const titleElement = topicElement.locator('.main-link a.raw-topic-link');
const title = (await titleElement.textContent())?.trim();
const url = `https://community.cloudflare.com${await titleElement.getAttribute('href')}`;

const repliesElement = topicElement.locator('.posts');
const replies = (await repliesElement.textContent())?.trim();

const viewsElement = topicElement.locator('.views');
const views = (await viewsElement.textContent())?.trim();

// convert the UNIX timestamp to ISO date
const activityElement = topicElement.locator('.activity span');
const unixTime = await activityElement.getAttribute('data-time');
const date = unixTime ? new Date(Number(unixTime)).toISOString() : null;

These few lines are enough to extract the key discussion data from the Cloudflare Community forum. Adapt this logic to match the structure of your own Cloudflare-protected target site and the specific data points you're interested in.
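One detail worth noting: Discourse-based forums often abbreviate large counts (e.g., "1.2k" views), so the replies and views you extract are strings. If you'd rather store numbers, a small normalization helper like this hypothetical one does the trick:

```javascript
// Hypothetical helper: convert abbreviated counts like "1.2k" into numbers.
// Returns null when the text is missing or doesn't look like a count.
function parseCount(text) {
    if (text == null) return null;
    const match = text.trim().toLowerCase().match(/^([\d.]+)\s*([km]?)$/);
    if (!match) return null;
    const value = parseFloat(match[1]);
    const multiplier = match[2] === 'k' ? 1_000 : match[2] === 'm' ? 1_000_000 : 1;
    return Math.round(value * multiplier);
}
```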

6. Collect the scraped data

Right now, your scraped data is stored in a JavaScript object. To save it to the Dataset of your Apify Actor, use the Dataset.pushData() method:

await Dataset.pushData(topic);

This way, the scraped data will become available via API or downloadable in several formats (JSON, CSV, Excel) from the Apify Console.
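For example, once you know the dataset ID of a run, you can fetch the items from the Apify API. This hypothetical helper just builds the download URL (the endpoint format is https://api.apify.com/v2/datasets/{datasetId}/items):

```javascript
// Hypothetical helper building the Apify API URL for downloading dataset items.
// Supported formats include json, csv, xlsx, html, xml, rss, and jsonl.
function datasetItemsUrl(datasetId, format = 'json') {
    return `https://api.apify.com/v2/datasets/${datasetId}/items?format=${format}`;
}
```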

Now, the target page might contain a large number of discussion threads. Since this is just a test to show how to bypass Cloudflare, it's a good idea to limit the number of topics you scrape:

if ((await Dataset.getData()).total >= 30) {
    return;
}

This limit isn't strictly required, but it helps keep test runs fast and manageable.
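Note that calling Dataset.getData() on every iteration issues a storage read per item. For a short test run that's fine, but a simple local counter achieves the same cap more cheaply. A minimal sketch of that idea (the names are made up):

```javascript
// Sketch: cap the number of saved items with a local counter instead of
// querying the dataset on every iteration.
const MAX_TOPICS = 30;
let savedTopics = 0;

// Simulate processing 50 rows: only the first MAX_TOPICS are saved.
let processed = 0;
for (let i = 0; i < 50; i++) {
    if (savedTopics >= MAX_TOPICS) break; // stop once the cap is reached
    savedTopics += 1;                     // stands in for Dataset.pushData()
    processed += 1;
}
```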

Great! The Cloudflare-bypass and scraping logic is complete.

7. Complete code

This is what your main.js file should contain:

import { Actor } from 'apify';
import { PlaywrightCrawler } from 'crawlee';
import { router } from './routes.js';
import { firefox } from 'playwright';
import { launchOptions as camoufoxLaunchOptions } from 'camoufox-js';

// Initialize the Apify SDK
await Actor.init();

const {
    startUrls = ['https://community.cloudflare.com/c/developers/39/l/top?period=yearly'],
} = await Actor.getInput() ?? {};


const proxyConfiguration = await Actor.createProxyConfiguration({
    groups: ['RESIDENTIAL'], // connect to residential proxies
    countryCode: 'US', // optional: restrict to U.S. IPs
});

const crawler = new PlaywrightCrawler({
    proxyConfiguration,
    requestHandler: router,
    launchContext: {
        launcher: firefox,
        launchOptions: await camoufoxLaunchOptions({
            headless: true,
            // custom Camoufox options...
        }),
    }
});

// launch the Apify crawler
await crawler.run(startUrls);

// exit successfully
await Actor.exit();

And this is what the routes.js file should hold:

import { Dataset, createPlaywrightRouter } from 'crawlee';

export const router = createPlaywrightRouter();

router.addDefaultHandler(async ({ request, page, log }) => {
    // select all topic elements in the topic table
    const topicElements = page.locator('.topic-list .topic-list-item');

    // iterate over them and apply the data parsing logic
    for (const topicElement of await topicElements.all()) {
        // scraping logic
        const titleElement = topicElement.locator('.main-link a.raw-topic-link');
        const title = (await titleElement.textContent())?.trim();
        const url = `https://community.cloudflare.com${await titleElement.getAttribute('href')}`;

        const repliesElement = topicElement.locator('.posts');
        const replies = (await repliesElement.textContent())?.trim();

        const viewsElement = topicElement.locator('.views');
        const views = (await viewsElement.textContent())?.trim();

        // convert the UNIX timestamp to ISO string
        const activityElement = topicElement.locator('.activity span');
        const unixTime = await activityElement.getAttribute('data-time');
        const date = unixTime ? new Date(Number(unixTime)).toISOString() : null;

        // populate a new topic with the scraped data
        const topic = {
            title: title,
            url: url,
            replies: replies,
            views: views,
            date,
        };
        // append the scraped data to the Apify dataset
        await Dataset.pushData(topic);

        // avoid scraping more than 30 topics, as this is just an example
        if ((await Dataset.getData()).total >= 30) {
            return;
        }
    }
});

8. Run the scraper

In the Apify Console, run your Actor by pressing the "Save, Build & Start" button:

Running the Actor

Once the run is complete, move to the "Last run" tab, and you should be able to see the results as follows:

Note the scraped data in tabular format

This contains the desired Cloudflare-protected data scraped from the Community forum.

Switch to the "Storage" tab to export the scraped data:

Exporting the scraped data

From here, you can export your scraped data in various formats—such as JSON, CSV, XML, Excel, HTML Table, RSS, and JSONL.

And that's it! You've successfully bypassed Cloudflare and scraped your target site.

Conclusion

In this tutorial, you learned how to bypass Cloudflare using a setup based on Crawlee, Playwright, and Camoufox—a relatively new open-source anti-detect browser that's quickly gaining popularity.

As shown here, deploying your Cloudflare-bypass scraper to Apify simplifies the setup process and makes it easier to integrate your script with residential proxies—which are required for consistent results. To explore more web scraping and automation capabilities, check out the available code templates.


Frequently asked questions

Can you bypass Cloudflare?

Yes, you can bypass Cloudflare by using open-source, free tools like Crawlee, Playwright, and Camoufox. This setup mimics real user behavior to evade detection by anti-bot systems. For consistent results when deploying on a VPS or in the cloud, residential proxies may be required.

Why does Cloudflare block my IP?

Cloudflare blocks your IP when it has a history of suspicious activity or a low reputation score. This often happens when you deploy scrapers in data centers or on VPSes: the IP ranges of those providers are publicly known and carry low trust scores, so Cloudflare's anti-bot defenses can easily flag their traffic as non-human.

Is it easy to bypass Cloudflare?

No, it isn't easy to bypass Cloudflare because it uses advanced anti-bot techniques. Still, with the right tools, it's definitely possible. With solutions like Camoufox and Apify's Crawlee, you can achieve your goal. Remember that Apify supports startups with 30% off the Scale plan to help them grow using web data.
