Live view of running Actors on public hostname

Since the launch of Apify actors last autumn, Apify is no longer just a tool to extract data from websites. It has become a full-featured serverless computing platform that enables people to automate workflows on the web, run data processing pipelines or integrate with third-party systems.

But until now, Actors have lacked one important feature that would unlock a wide range of use cases: access to a web server running inside the actor through a public hostname. For example, with this feature actors can:

  • Provide an HTTP or WebSocket API to receive commands from external services
  • Display an HTML dashboard to show users what’s happening inside
  • Provide an API endpoint to stream real-time data from the actor run

Each actor run is now assigned a unique, secret, hard-to-guess URL that proxies HTTP requests to the web server running inside the actor run’s container. The unique URL looks like this:

https://cllpjdx8e572.runs.apify.net

If you open the URL in a web browser, it displays web content provided by the running actor. Out of the box, the HTTP traffic is protected using SSL/TLS encryption. The web content is now also displayed in the Apify app in a new tab called Live view in the actor run console.

Puppeteer live view.
Hacker News page loaded in Apify actor live view

To activate the live view for Puppeteer in your actors, simply pass the liveView: true option to the Apify.launchPuppeteer function in the Apify SDK. Make sure to select the apify/actor-node-chrome base image. Here’s an example:
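
A minimal sketch of that option in use, assuming the SDK version described in this post (the option may differ in other releases). The helper name openPageWithLiveView is hypothetical, and the SDK module is passed in as a parameter so that the snippet can run even where the apify package is not installed.

```javascript
// Hypothetical helper: `Apify` is the SDK module (normally `require('apify')`),
// passed in as a parameter so the sketch does not depend on the package being installed.
async function openPageWithLiveView(Apify, url) {
    // liveView: true is the option named in this post.
    const browser = await Apify.launchPuppeteer({ liveView: true });
    const page = await browser.newPage();
    await page.goto(url);
    return { browser, page };
}

// Typical use inside an actor (assumes the `apify` package is available):
//
// const Apify = require('apify');
// Apify.main(async () => {
//     const { browser } = await openPageWithLiveView(Apify, 'https://news.ycombinator.com/');
//     // ... work with the page, then:
//     await browser.close();
// });
```

While the actor runs, the loaded page should then be visible in the Live view tab of the actor run console.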

Puppeteer crawler live view

A nice example of how to take advantage of live view was added to the PuppeteerCrawler class in the Apify SDK for Node.js. The class provides a framework that helps you to easily build an automatically scaled web crawler based on headless Chrome and Puppeteer. It is easy to debug such a crawler on a local computer, since you can run Chrome in non-headless mode. However, once the crawler is deployed to the Apify cloud, it becomes difficult to see what’s happening inside of it.

When you pass the liveView: true option to the PuppeteerCrawler constructor, the class automatically starts a web server that lists the open tabs of the web browsers and lets you inspect each one: you can view a screenshot of the web page and see its HTML code. This gives you insight into what’s happening in a running crawler directly from the actor run console:

Apify actor beta configuration.

Usage in your own Actors

To add a publicly accessible web server to your Actor, simply start an HTTP (or WebSocket) server on the port provided by the APIFY_CONTAINER_PORT environment variable:
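
A minimal sketch of such a server, using only Node.js's built-in http module. The state object, the status page markup, and the helper names (renderStatus, handleRequest, getContainerPort) are illustrative assumptions, not part of the Apify SDK. Only the APIFY_CONTAINER_PORT variable comes from the platform.

```javascript
const http = require('http');

// Hypothetical in-memory state that the actor would update as it works.
const state = { pagesProcessed: 0, startedAt: new Date().toISOString() };

// Renders a tiny HTML status page; the markup is illustrative only.
function renderStatus(currentState) {
    return '<html><body><h1>Actor status</h1>'
        + `<p>Pages processed: ${currentState.pagesProcessed}</p>`
        + `<p>Started at: ${currentState.startedAt}</p></body></html>`;
}

// Request handler: every path returns the current status page.
function handleRequest(req, res) {
    res.writeHead(200, { 'Content-Type': 'text/html; charset=utf-8' });
    res.end(renderStatus(state));
}

// Reads the port from the given environment object;
// returns null when the variable is missing or not a positive integer.
function getContainerPort(env) {
    const port = Number.parseInt(env.APIFY_CONTAINER_PORT, 10);
    return Number.isInteger(port) && port > 0 ? port : null;
}

// Start listening only when the platform has provided a port.
const port = getContainerPort(process.env);
if (port !== null) {
    http.createServer(handleRequest).listen(port, () => {
        console.log(`Web server listening on port ${port}`);
    });
}
```

Once the actor is running, opening the run's unique URL should return this status page. For local testing, you can set APIFY_CONTAINER_PORT yourself before starting the script.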

Getting started

Marek Trunkát
CTO and one of the earliest Apifiers. Writing about challenges our development team faces when building and scaling the Apify platform, which automates millions of tasks every month.
