What is Playwright?

We explore the features of Playwright that make it an awesome tool for web automation, testing, and scraping 🎭

Why use Playwright?

We all know that technology moves fast, but even by modern standards, the rapid rise of Playwright is impressive.

Microsoft released Playwright in 2020 as an open-source Node library to automate Chromium, Firefox, and WebKit with a single API. Today, Playwright is one of the most popular frameworks for web automation, testing, and scraping. It provides automated control of a web browser with a few lines of code, making it particularly useful for data extraction, end-to-end testing, automating web page interaction, taking screenshots of web pages, and running automated tests for JavaScript libraries.

Playwright is similar to Puppeteer, Cypress, and Selenium, but there are some differences. Let’s find out what they are.

Is Playwright a headless browser?

Not exactly. Playwright can be run in headful or headless mode (without a graphical user interface). By default, Playwright runs in headless mode, which means you won’t see what is happening in the browser when you run your script, but it will run faster. When you write and debug your scripts, it’s advisable to disable headless mode so you can see what your script is doing:

const { chromium } = require('playwright');
const browser = await chromium.launch({ headless: false });

On the other hand, if performance is the most important thing for you, headless mode is the way to go, since a browser that doesn’t have to render a visible window generally runs faster than a headful one.
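One way to switch between the two modes without editing the script is to read the choice from an environment variable. The sketch below is only an illustration: the HEADFUL variable name and the launchOptionsFromEnv and openPage helpers are made up for this example, while chromium.launch() and its headless and slowMo options are part of Playwright.

```javascript
// Build Playwright launch options from environment variables.
// HEADFUL is a hypothetical variable name chosen for this sketch.
function launchOptionsFromEnv(env = process.env) {
  const headful = env.HEADFUL === '1';
  return {
    headless: !headful,
    // Slow each action down a little when someone is watching the browser.
    slowMo: headful ? 250 : 0,
  };
}

async function openPage(url) {
  // Loaded inside the function so Playwright is only required when a browser is launched.
  const { chromium } = require('playwright');
  const browser = await chromium.launch(launchOptionsFromEnv());
  try {
    const page = await browser.newPage();
    await page.goto(url);
    return await page.title();
  } finally {
    await browser.close();
  }
}

// Example: run with HEADFUL=1 to watch the browser, or without it for headless mode.
// openPage('https://playwright.dev').then(console.log);
```

Keeping the option logic in a small pure function makes it easy to check without starting a browser.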

What about Puppeteer and Selenium?

Speak of headless browsers, and the names Puppeteer and Selenium immediately spring to mind. So, how do these compare to their younger sibling? Puppeteer supports only JavaScript and TypeScript and works with Chromium, with experimental support for Firefox. Playwright supports Chromium, Firefox, and WebKit, the engine behind Safari. Playwright has official bindings for several programming languages, and Selenium covers the same ones plus Ruby. But Playwright’s greatest advantage over Selenium is its auto-waiting function.

What languages does Playwright support?

Playwright works with some of the most popular programming languages, including JavaScript, Python, Java, and C#. Its support of Chromium, Firefox, and WebKit provides a wide range of cross-browser automation and web testing capabilities.

What platforms does Playwright support?

Playwright is a cross-platform framework. The browser binaries for Chromium, Firefox, and WebKit work across three platforms: Windows (and WSL), macOS (10.14 or above), and Linux (though you may need to install additional dependencies, depending on your Linux distribution).

How do I get started with Playwright?

One thing that isn’t said enough about Playwright: its documentation is superb. There you will find out how to install Playwright to get started.

You can install the VS Code extension. After installation, open the command panel and type Install Playwright. Alternatively, you can use the command line interface (CLI) and install Playwright using the appropriate package manager for your language. For example, npm with Node.js:

npm init playwright@latest

That will give you the browsers and files you need to begin:

playwright.config.ts
package.json
package-lock.json
tests/ 
	example.spec.ts
tests-examples/ 
	demo-todo-app.spec.ts

The tests folder contains a basic example test to get you started, and the tests-examples folder contains a more detailed example, with tests written for a todo app.

Alternatively, you can simply add Playwright to your existing project by calling:

npm install playwright
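With the library installed, a first script might look like the sketch below, which opens a page and saves a full-page screenshot. The takeScreenshot and screenshotFileName helpers are hypothetical names for this example; launch(), newPage(), goto(), and screenshot() are Playwright calls. Depending on your Playwright version, you may also need to run npx playwright install once to download the browser binaries.

```javascript
// Derive a file name such as "playwright-dev.png" from a URL (hypothetical helper).
function screenshotFileName(url) {
  return new URL(url).hostname.replace(/\./g, '-') + '.png';
}

async function takeScreenshot(url) {
  // Loaded inside the function so Playwright is only required when a browser is launched.
  const { chromium } = require('playwright');
  const browser = await chromium.launch();
  try {
    const page = await browser.newPage();
    await page.goto(url);
    const path = screenshotFileName(url);
    // fullPage captures the whole scrollable page, not just the viewport.
    await page.screenshot({ path, fullPage: true });
    return path;
  } finally {
    // Always close the browser, even if navigation fails.
    await browser.close();
  }
}

// Example:
// takeScreenshot('https://playwright.dev').then((path) => console.log('Saved', path));
```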

Why use Playwright for web automation and testing?


1. Faster communication with the Chrome DevTools Protocol

Many automation tools, such as Selenium, talk to the browser through the WebDriver protocol. Playwright instead communicates with Chromium-based browsers over the Chrome DevTools Protocol, which is faster and more direct. Playwright isn’t limited to Chrome and Edge, though: it can also be configured to test sites in Firefox and in WebKit, the engine behind Safari.
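As a rough illustration of the single API across engines, the sketch below runs the same steps in Chromium, Firefox, and WebKit and collects the user agent each one reports. The function name and the BROWSER_NAMES constant are assumptions made for this example; the chromium, firefox, and webkit exports belong to the playwright package.

```javascript
// The three browser types exported by the playwright package.
const BROWSER_NAMES = ['chromium', 'firefox', 'webkit'];

async function userAgentInEachBrowser(url) {
  // Loaded inside the function so Playwright is only required when browsers are launched.
  const playwright = require('playwright');
  const results = {};
  for (const name of BROWSER_NAMES) {
    const browser = await playwright[name].launch();
    try {
      const page = await browser.newPage();
      await page.goto(url);
      // The same page API works regardless of the engine underneath.
      results[name] = await page.evaluate(() => navigator.userAgent);
    } finally {
      await browser.close();
    }
  }
  return results;
}

// Example:
// userAgentInEachBrowser('https://playwright.dev').then(console.log);
```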

2. The auto-waiting function

Cross-browser and cross-language support aside, auto-waiting is arguably Playwright’s greatest advantage over Puppeteer and Selenium. You don’t have to figure out when something is clickable because Playwright performs those checks for you: before await page.click() emulates a mouse click, it waits until the target element is visible, stable, and enabled. When you need to wait for something else in the browser, you can use convenient APIs like await page.waitForSelector() or await page.waitForFunction().

Automatic waiting largely eliminates the need to write custom waits or sleep statements in your test scripts. That means you can focus on writing high-quality tests instead of worrying about writing the perfect waiting logic.
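To make this concrete, here is a minimal sketch of a form submission with no sleep calls. The page structure is hypothetical: the #email, button[type="submit"], and .confirmation selectors and the subscribe helper are invented for this example, while fill(), click(), waitForSelector(), and textContent() are methods of Playwright's Page.

```javascript
// Fill in and submit a hypothetical newsletter form, then read the confirmation text.
// `page` is a Playwright Page; the selectors are assumptions about the target site.
async function subscribe(page, email) {
  // fill() and click() wait for the element to be ready before acting on it.
  await page.fill('#email', email);
  await page.click('button[type="submit"]');
  // Explicit wait for content that appears only after the request completes.
  await page.waitForSelector('.confirmation', { state: 'visible' });
  return page.textContent('.confirmation');
}
```

Because the function only depends on the page object it receives, it can be called from a plain script or from a test that already has a page.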

3. Record scripts with Codegen

Playwright includes a test generator, Codegen, and the documentation shows you how to record your scripts with it. You just need a single CLI command to kick off:

npx playwright codegen playwright.dev

This will open up an interactive browser and the Playwright Inspector. Every action in the browser will be recorded in the Inspector. You can then replay and adjust the generated script. In other words, Playwright generates test script code based on your interaction with the page. That means you can author tests out of the box without having to write the script manually.

4. Great debugging capabilities

Playwright has some excellent debugging features. You can debug scripts while you run them, which is handy during local development, and you can also analyze and debug failed tests. You can open the Playwright Inspector in debug mode with npx playwright test --debug to debug all tests, or npx playwright test example.spec.ts --debug to debug a single test file. Alternatively, you can set the PWDEBUG environment variable (for example, PWDEBUG=1) to run your scripts in debug mode.

5. Native mobile emulation

Playwright supports mobile emulation, which means you can test how your web applications behave on mobile devices without having to set up an actual device. Playwright can emulate Safari on iOS as well as Chrome on Android. It ships with numerous predefined device configurations (viewport, user agent, touch support, and so on), making it easy to test your web application on multiple devices and screen sizes to ensure that it works as expected for all users without having to manually set up each configuration.
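As a sketch of how those predefined configurations can be used from a script, the example below looks up a device in Playwright's devices registry and applies it to a new browser context. The screenshotOnDevice helper and the file name in the usage line are assumptions for illustration; devices, newContext(), and the defaultBrowserType field of a device descriptor come from Playwright.

```javascript
async function screenshotOnDevice(url, deviceName, path) {
  // Loaded inside the function so Playwright is only required when a browser is launched.
  const playwright = require('playwright');
  const device = playwright.devices[deviceName];
  if (!device) throw new Error(`Unknown device: ${deviceName}`);

  // Each descriptor names the engine that best matches the real device.
  const browser = await playwright[device.defaultBrowserType].launch();
  try {
    // The descriptor sets viewport, user agent, touch support, and scale factor.
    const context = await browser.newContext({ ...device });
    const page = await context.newPage();
    await page.goto(url);
    await page.screenshot({ path });
  } finally {
    await browser.close();
  }
}

// Example:
// screenshotOnDevice('https://playwright.dev', 'iPhone 13', 'iphone.png');
```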

6. Comprehensive reports

Playwright provides comprehensive reporting options for test results. You can:

a) Export results as a machine-readable JSON file.

This is useful if you want to integrate Playwright tests into a larger test suite or if you want to programmatically analyze the results.

b) Export the results as a stylish HTML page.

This is a great option if you want to share the test results with other members of your team or with stakeholders. The HTML report includes detailed information about the test runs, including the number of passed and failed tests, the duration of each test, and any errors that occurred during the test run.
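Both report types can be enabled at once in the Playwright configuration file. The fragment below is a sketch written as a playwright.config.js file, a JavaScript equivalent of the scaffolded playwright.config.ts; the results.json file name is an assumption chosen for this example.

```javascript
// playwright.config.js
/** @type {import('@playwright/test').PlaywrightTestConfig} */
const config = {
  testDir: './tests',
  reporter: [
    // HTML report for people; 'never' stops it from opening automatically after a run.
    ['html', { open: 'never' }],
    // JSON report for tools; the file name is an arbitrary choice.
    ['json', { outputFile: 'results.json' }],
  ],
};

module.exports = config;
```

After a test run, npx playwright show-report opens the HTML report in a browser.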

Why use Playwright for web scraping?

We’ve touched upon the brilliance of Playwright for web testing and automation, but its capabilities also come in very handy for web scraping and data mining. Here’s why:

It can be very difficult to scrape some websites with plain HTTP requests and HTML parsers. Dynamic pages and browser fingerprinting are two of the biggest challenges. Because Playwright drives a real browser, headless or not, it helps overcome these problems.

1. Loading dynamic pages

When it comes to pages loaded dynamically with AJAX or data rendered using JavaScript, you’ll need to render the page like a real user. HTML scrapers can’t do that. Headless browsers can. So, in such cases, you’ll need web scraping tools like Playwright Scraper or Puppeteer Scraper to load the page, execute its JavaScript, and scrape the required data.
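If you are writing the scraping code yourself with the Playwright library, the idea can be sketched as follows: load the page, wait until JavaScript has rendered the elements you want, then read their text. The scrapeTexts and cleanTexts helpers are hypothetical, and the selector you pass in depends entirely on the target site.

```javascript
// Collapse whitespace and drop empty strings from scraped text.
function cleanTexts(texts) {
  return texts.map((text) => text.replace(/\s+/g, ' ').trim()).filter(Boolean);
}

async function scrapeTexts(url, selector) {
  // Loaded inside the function so Playwright is only required when a browser is launched.
  const { chromium } = require('playwright');
  const browser = await chromium.launch();
  try {
    const page = await browser.newPage();
    await page.goto(url);
    // Wait until client-side rendering has produced at least one matching element.
    await page.waitForSelector(selector);
    const texts = await page.locator(selector).allTextContents();
    return cleanTexts(texts);
  } finally {
    await browser.close();
  }
}

// Example (the selector is an assumption about the page being scraped):
// scrapeTexts('https://example.com', 'h1').then(console.log);
```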

2. Combatting browser fingerprinting

Some websites now use fingerprinting to track users and block scraping bots. A scraper that uses a headless browser can emulate the fingerprint of a real device. Without a headless browser, it’s nearly impossible to pass the various anti-bot challenges that block your access to a website. This makes Puppeteer Scraper or Playwright Scraper your best bet when you keep getting blocked.

Also, you can go even further and develop your own web scraper with Crawlee, a Node.js library that helps you pass those challenges automatically using Puppeteer or Playwright.

Crawlee helps you build reliable scrapers fast. Quickly scrape data, store it, and avoid getting blocked with headless browsers, smart proxy rotation, and auto-generated human-like headers and fingerprints.
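A minimal Crawlee crawler built on Playwright might look like the sketch below. It assumes the crawlee and playwright packages are installed; the crawlTitles name and the limit of 20 requests are choices made for this example, not recommendations.

```javascript
async function crawlTitles(startUrls) {
  // Loaded inside the function so Crawlee is only required when a crawl starts.
  const { PlaywrightCrawler, Dataset } = require('crawlee');

  const crawler = new PlaywrightCrawler({
    // Keep the example small; adjust or remove for a real crawl.
    maxRequestsPerCrawl: 20,
    async requestHandler({ request, page, enqueueLinks }) {
      // `page` is a regular Playwright Page, so the usual API applies.
      await Dataset.pushData({ url: request.loadedUrl, title: await page.title() });
      // Queue links found on the page for crawling.
      await enqueueLinks();
    },
  });

  await crawler.run(startUrls);
}

// Example:
// crawlTitles(['https://crawlee.dev']);
```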

Web scraping with Playwright

If you want to find out more about Playwright and web scraping, this tutorial shows you how to build a scraper with Playwright in Node.js to extract data about GitHub topics.

Theo Vasilis
Writer, Python dabbler, and crafter of web scraping tutorials. Loves to inform, inspire, and illuminate. Interested in human and machine learning alike.
