If you're an Apify user, you're no doubt familiar with the serverless cloud programs (or micro-apps) we call Actors.
You're also probably aware that you can integrate Actors and their tasks with your favorite web apps and cloud services. But now it's possible to integrate Actors with other Actors. That means you can reuse existing Actors instead of building your own to complete certain processes.
Say you have one Actor with a dataset containing URLs and another that takes URLs and downloads them as images. Integrating the two makes the whole process of retrieving and downloading images so much easier than it used to be.
What we want to do is grab all image URLs from the Apify Blog (because it's awesome) with Cheerio Scraper (because it's super fast) and then download them into a zip file using Download Images from Dataset. So, we're using one Actor to grab the URLs of the images, and with the other, we're downloading the images (not just the URLs) as a zip file.
Step 2. Create a task with the Actor
First, you need to create a task. Tasks are great for organizing your inputs, especially if you want to connect more than two Actors.
In this example, we'll create a task with Cheerio Scraper, which we'll call Apify Blog Image Grabber.
Next, we'll put the URL for the Apify Blog in the Start URLs field.
We have no need for Glob patterns or Link selectors, so we'll leave those blank. What we do need is code in the Page function.
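As a rough illustration, the task input described so far could be written as a plain object. This is only a sketch: the field names (startUrls, globs, linkSelector, pageFunction) are assumed to mirror the form fields in the Cheerio Scraper UI, the blog URL is assumed to be https://blog.apify.com, and buildBlogImageGrabberInput is a hypothetical helper, so check the names against the Actor's input schema before relying on them.

```javascript
// Hypothetical helper: builds the input for the "Apify Blog Image Grabber" task.
// Field names are assumptions modeled on the Cheerio Scraper form fields.
function buildBlogImageGrabberInput(pageFunctionSource) {
    return {
        startUrls: [{ url: 'https://blog.apify.com' }], // Start URLs field
        globs: [],                                      // Glob patterns left blank
        linkSelector: '',                               // Link selector left blank
        pageFunction: pageFunctionSource,               // Page function source, passed as a string
    };
}

const input = buildBlogImageGrabberInput(
    'async function pageFunction(context) { return []; }',
);
console.log(JSON.stringify(input, null, 2));
```

With no Glob patterns and no Link selector, nothing new should get enqueued, so the run would only visit the start URL.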
In the pageFunction, we iterate through all images on the page, get the URL of each image, and push the full image URL to a dataset.
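A minimal sketch of such a Page function might look like the following. It is not the exact code from the task: the imageUrl and pageUrl field names are hypothetical, and it returns the items instead of pushing them explicitly, on the assumption that Cheerio Scraper stores whatever the Page function returns in the run's default dataset.

```javascript
// Sketch of a Cheerio Scraper Page function that collects image URLs.
// `imageUrl` and `pageUrl` are hypothetical field names; use whatever
// field your downloader Actor is configured to read.
async function pageFunction(context) {
    const { $, request } = context;
    const pageUrl = request.loadedUrl || request.url;
    const seen = new Set();
    const results = [];

    $('img').each((index, element) => {
        const src = $(element).attr('src');
        if (!src) return; // skip images without a src attribute

        let imageUrl;
        try {
            // Resolve relative paths against the page URL to get the full URL.
            imageUrl = new URL(src, pageUrl).href;
        } catch (err) {
            return; // skip values that cannot be parsed as a URL
        }
        if (!/^https?:/.test(imageUrl)) return; // skip data: URIs and similar
        if (seen.has(imageUrl)) return;         // skip duplicates on the same page

        seen.add(imageUrl);
        results.push({ imageUrl, pageUrl });
    });

    // Assumption: returned objects end up as items in the default dataset.
    return results;
}
```

Resolving each src against the page URL is what turns relative paths into full image URLs that a second Actor can download.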
Step 3. Choose the Actor to integrate with
Now go to Integrations, click the Apify integration, and then choose the Actor you want to integrate it with.
The first Actors you'll see (in alphabetical order) are integration-ready Actors. You can also find these in Apify Store under the Integrations category.
We'll select Download Images from Dataset and click Connect.
Step 4. Choose a trigger
Now you can choose the Trigger that will start the Download Images from Dataset Actor. We'll stick with Run succeeded.
Because we're using an integration, the dataset ID field is already prefilled with the ID of the dataset that Cheerio Scraper will generate.
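To picture what happens when the trigger fires, here is a hedged sketch of how a downstream Actor could work out which dataset to read. Everything about the payload shape is an assumption modeled on a webhook-style payload (an eventType of ACTOR.RUN.SUCCEEDED and a resource.defaultDatasetId field), and resolveDatasetId is a hypothetical helper, not code taken from Download Images from Dataset.

```javascript
// Event type assumed to correspond to the "Run succeeded" trigger.
const RUN_SUCCEEDED = 'ACTOR.RUN.SUCCEEDED';

// Hypothetical helper: picks the dataset ID from the downstream Actor's input.
// The `payload` shape is an assumption; verify it against the platform docs.
function resolveDatasetId(input) {
    // An explicitly provided dataset ID wins.
    if (input.datasetId) return input.datasetId;

    const payload = input.payload;
    // Only react to the trigger we configured.
    if (!payload || payload.eventType !== RUN_SUCCEEDED) return null;

    // Fall back to the default dataset of the run that fired the trigger.
    return (payload.resource && payload.resource.defaultDatasetId) || null;
}

console.log(resolveDatasetId({
    payload: {
        eventType: RUN_SUCCEEDED,
        resource: { defaultDatasetId: 'EXAMPLE_DATASET_ID' }, // placeholder ID
    },
}));
```

The point of the sketch is only that the second Actor never needs a hard-coded dataset ID: it receives the ID from the run that triggered it.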
Step 5. Save and test settings
Now you can save the integration settings. Once saved, you can test the integration with the test button. There are several test options; we'll test it with the last run.
You can see the test results in the log underneath.
If you go back to the integrations tab, you can see how the Actors and tasks are connected.
So now, when we start the Apify Blog Image Grabber task and go to our runs, we can see that the image downloader was also triggered.
Since the run is finished, let’s look at the results. If we go to Storage and the key-value store, we can see our images archive.
Now we can download it to our device.
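If you would rather fetch the archive from a script than from the UI, the record's download URL can be assembled by hand. The sketch below assumes the Apify API v2 endpoint shape for key-value store records; buildRecordUrl is a hypothetical helper, and the store ID and record key are placeholders, since the actual key of the archive depends on the run.

```javascript
// Hypothetical helper: builds a download URL for one key-value store record.
// The endpoint shape is assumed from the Apify API v2.
function buildRecordUrl(storeId, recordKey) {
    const base = 'https://api.apify.com/v2/key-value-stores';
    return `${base}/${encodeURIComponent(storeId)}/records/${encodeURIComponent(recordKey)}`;
}

// Placeholder values: replace them with the store ID and key shown under Storage.
console.log(buildRecordUrl('YOUR_STORE_ID', 'YOUR_ARCHIVE_KEY'));
```

Private stores would also need an API token on the request, which is left out of this sketch.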
Bonus step: schedule tasks
If you want to automate your workflow further, you can schedule your integration-infused tasks to run at specific times.
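As a sketch of what a schedule boils down to, the object below pairs a cron expression with the task to run. The field names (cronExpression, timezone, isEnabled, actions, RUN_ACTOR_TASK) are assumptions about how a schedule definition is represented, and buildWeeklySchedule and the task ID are hypothetical.

```javascript
// Hypothetical helper: describes a schedule that runs one task every week.
// Field names are assumptions; confirm them before sending this to any API.
function buildWeeklySchedule(taskId) {
    return {
        name: 'apify-blog-image-grabber-weekly',
        cronExpression: '0 8 * * 1', // minute 0, hour 8, every Monday
        timezone: 'UTC',
        isEnabled: true,
        actions: [{ type: 'RUN_ACTOR_TASK', actorTaskId: taskId }],
    };
}

console.log(JSON.stringify(buildWeeklySchedule('YOUR_TASK_ID'), null, 2));
```

Only the Apify Blog Image Grabber task needs a schedule here, because the integration starts the image downloader whenever that task's run succeeds.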
Which Actors do you want to integrate?
So, now that you know how it works, which Actors will you choose to integrate first? There are well over a thousand pre-built web scraping and automation tools in Apify Store to choose from. Take your pick!
Don't forget, you can build your own Actors, run them locally or on Apify's cloud platform, and publish them in Apify Store to reach people who need your solution and get paid.