Since the launch of Apify actors last autumn, Apify is no longer just a tool to extract data from websites. It has become a full-featured serverless computing platform that enables people to automate workflows on the web, run data processing pipelines or integrate with third-party systems.
But until now, actors have been lacking one important feature that would unlock a wide range of use cases: access to a web server running inside actors on a public hostname. For example, with this feature actors can:
- Provide an HTTP or WebSocket API to receive commands from external services
- Display an HTML dashboard to show users what’s happening inside
- Provide an API endpoint to stream real-time data from the actor run
Each actor run is now assigned a unique, hard-to-guess secret URL that proxies HTTP requests to the web server running inside the actor run’s container. The unique URL looks like this:
If you open the URL in a web browser, it displays web content provided by the running actor. Out of the box, the HTTP traffic is protected using SSL/TLS encryption. The web content is now also displayed in the Apify app in a new tab called Live view in the actor run console.
To activate the live view for Puppeteer in your actors, pass the `liveView: true` option to the `Apify.launchPuppeteer` function in the Apify SDK. Make sure to select the `apify/actor-node-chrome` base image. Here’s an example:
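A minimal sketch of such an actor could look as follows. It assumes the `apify` package is installed and that the code runs on the Apify platform. The `openPageWithLiveView` helper, the example URL, and the `APIFY_IS_AT_HOME` guard are illustrative choices, not part of the SDK's live view feature itself.

```javascript
// Options passed to Apify.launchPuppeteer. liveView is the option described above.
const launchOptions = { liveView: true };

// Hypothetical helper: launches a browser with live view enabled, opens the
// given URL and returns the page title. The Apify SDK object is passed in as
// an argument so the helper can be exercised without loading the package.
async function openPageWithLiveView(Apify, url) {
    const browser = await Apify.launchPuppeteer(launchOptions);
    try {
        const page = await browser.newPage();
        await page.goto(url);
        return await page.title();
    } finally {
        // The live view has nothing to show once the browser is closed.
        await browser.close();
    }
}

// Assumption: only start the actor when running on the Apify platform,
// where the APIFY_IS_AT_HOME environment variable is set.
if (process.env.APIFY_IS_AT_HOME) {
    const Apify = require('apify');
    Apify.main(async () => {
        const title = await openPageWithLiveView(Apify, 'https://www.example.com');
        console.log(`Page title: ${title}`);
    });
}
```

Because the browser is closed as soon as the title is read, a real actor would keep the browser open for as long as the live view should stay useful.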
Puppeteer crawler live view
A nice example of how to take advantage of live view was added to the `PuppeteerCrawler` class in the Apify SDK for Node.js. The class provides a framework that helps you easily build an automatically scaled web crawler based on headless Chrome and Puppeteer. It is easy to debug such a crawler on a local computer, since you can run Chrome in non-headless mode. However, once the crawler is deployed to the Apify cloud, it becomes difficult to see what’s happening inside it.
By passing the `liveView: true` option to the `PuppeteerCrawler` constructor, the class automatically starts a web server that lists the open tabs of the web browsers and allows you to inspect each tab, view a screenshot of the web page, and see its HTML code. This gives you insight into what’s happening in a running crawler directly from the actor run console:
Usage in your own actors
To add a publicly accessible web server to your actor, start an HTTP (or WebSocket) server on the port provided by the `APIFY_CONTAINER_PORT` environment variable:
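As a rough sketch, such a server can be built with nothing but Node.js's built-in `http` module. The `handleRequest` function and the response text are made up for illustration. Only the `APIFY_CONTAINER_PORT` variable comes from the platform.

```javascript
const http = require('http');

// Illustrative request handler: replies with a short plain-text message
// that echoes the requested path.
function handleRequest(req, res) {
    res.writeHead(200, { 'Content-Type': 'text/plain; charset=utf-8' });
    res.end(`Hello from the actor run! You requested ${req.url}\n`);
}

// The platform provides the port to listen on in APIFY_CONTAINER_PORT.
const port = process.env.APIFY_CONTAINER_PORT;

if (port) {
    const server = http.createServer(handleRequest);
    server.listen(Number(port), () => {
        console.log(`Web server is listening on port ${port}`);
    });
} else {
    // Outside the platform the variable is not set, so no server is started.
    console.log('APIFY_CONTAINER_PORT is not set, web server not started.');
}
```

Once the server is listening, requests sent to the actor run's unique URL are proxied to `handleRequest`. Keeping the handler separate from the server setup also makes it easy to call directly in tests.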
- View the documentation on the actor container web server.
- Check the step-by-step tutorial in the knowledge base on how to run a web server in an actor. The web server displays an HTML page with a form that enables the user to control the operation of the crawler.
- If you are developing crawlers using the Apify SDK, then check the knowledge base article on Debugging your actors with Live view.