At Apify, we want to empower people not only to extract data from the internet, but to process it as well. That’s why we’re happy to announce that the Apify platform now supports writing actors in Python.
What can I do with it?
Python’s ecosystem of packages contains great web scraping tools like Scrapy or Beautiful Soup, and a large assortment of powerful tools for data processing and visualization like NumPy, Pandas, and Scikit-learn.
This gives you several ways to structure your data acquisition and processing pipelines.
- You can write one from scratch, using Python to both scrape and process the data.
- If you already have a scraping solution up and running using our scrapers, or your own actors using the Apify SDK or any other scraping library, you can now extend it with data processing using your favorite Python tools.
- Or perhaps you don’t need to scrape anything at all, and you just need to automate your data processing in the cloud.
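To make the processing step concrete, here is a minimal sketch that uses only the Python standard library. The item fields (category, price) and the function name average_price_by_category are hypothetical, chosen for illustration; real scraper output will have its own shape, and for larger jobs you would likely reach for Pandas instead.

```python
from collections import defaultdict
from statistics import mean


def average_price_by_category(items):
    """Group scraped items by category and average their prices."""
    prices = defaultdict(list)
    for item in items:
        # Skip records the scraper could not fully populate.
        if item.get("price") is None or not item.get("category"):
            continue
        prices[item["category"]].append(float(item["price"]))
    return {category: mean(values) for category, values in prices.items()}


if __name__ == "__main__":
    # Hypothetical sample data standing in for scraper output.
    sample = [
        {"category": "books", "price": 10.0},
        {"category": "books", "price": 20.0},
        {"category": "games", "price": 30.0},
        {"category": "games", "price": None},
    ]
    print(average_price_by_category(sample))
```

The same function works whether the items come from a Python scraper, an existing actor's dataset, or a file you uploaded yourself, which is what lets you mix and match the three approaches above.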
The possibilities are endless! Tell us what you come up with on our Discord 😊
How do I get started?
You can use the Apify API Client for Python both in actors running directly on the Apify servers and in any other Python scripts or applications, wherever you choose to run them.
To get you started, we’ve prepared two tutorials for writing actors in Python – one teaches you how to scrape data in Python using Beautiful Soup, and the other shows you how to process the scraped data using Pandas.
If you don’t feel like following tutorials and just want to start writing Python actors from scratch, head over to console.apify.com/actors, create a new actor, select the "Example: Hello world in Python" template to prefill the actor code, and you’re ready to go. The actor has the apify_client package pre-installed, so you can directly start using it in your code.
If you need to install any other Python packages into your actor, add them to the requirements.txt file, just like you normally would. For more in-depth information about Apify actor development, check out the Apify docs.
Using the API Client in your own scripts
If you want to interact with the Apify platform from the outside using the Apify API Client for Python package, install it by running pip install apify_client in your project directory (note that the apify_client package supports only Python 3.7 and newer).
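Once the package is installed, a script along these lines could start an actor and read its results. Treat it as a sketch under a few assumptions: the APIFY_TOKEN environment variable name and the fetch_items helper are ours, the apify/hello-world actor ID is just a placeholder, and because the client is in beta, method names may differ between versions, so check the client's documentation for the release you install.

```python
import os


def fetch_items(client, dataset_id):
    """Return the items of a dataset as a list of dicts."""
    return client.dataset(dataset_id).list_items().items


def main():
    # Imported here so the helper above stays usable without the package.
    from apify_client import ApifyClient

    # APIFY_TOKEN is an assumed environment variable name for this sketch.
    client = ApifyClient(os.environ["APIFY_TOKEN"])

    # Start the actor and wait for the run to finish.
    # "apify/hello-world" is a placeholder; use your own actor's ID.
    run = client.actor("apify/hello-world").call()

    # Each run gets a default dataset that holds its results.
    for item in fetch_items(client, run["defaultDatasetId"]):
        print(item)


if __name__ == "__main__":
    main()
```

Reading the token from the environment keeps it out of your source code, and passing the client into fetch_items makes the helper easy to try with a stand-in object before you call the real API.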
The Apify API Client for Python is still only in beta and is under active development, so there might be some bugs (but we hope not too many). If you encounter any issues, please report them to us in the client’s GitHub repository.
We hope that you’ll enjoy writing actors in Python, and we’re excited to see what you will create. If you have a minute, please let us know about your use cases and needs in this survey.