So why use web scraping on Twitter? What kind of information can we get from the notoriously fast-moving website and how easy is it to get that information?
Twitter started off as a simple ‘microblogging’ system for users to share short posts called tweets. That straightforward idea of expressing your thoughts in just 140 characters (and now 280 characters) has made Twitter one of the most active discussion platforms on the internet. People engage and argue, both companies and individuals market their brands, and politicians even use it as a way to reach their voters.
Twitter has more than 340 million users, and more than 500 million tweets are posted every day. As Twitter itself boasts: "Twitter is what’s happening and what people are talking about right now."
As you might imagine, that means that there’s a lot of useful data just sitting around on Twitter, waiting to be used for other purposes.
A single tweet can give you information about:
- the demographics of people who liked or retweeted the tweet
- total clicks on a profile
- how many people saw the tweet
And that’s just the tip of the proverbial iceberg.
For a Twitter user or marketer, access to data about how others engage with their tweets can be vital for developing a brand. For companies, gathering data across Twitter can provide a competitive advantage. Academic researchers and journalists can use the data to understand how people interact and to identify trends before they rise to the surface. Once you have the data, what you do with it is up to you.
The Twitter API is a useful tool for developers. It gives you broad access to the platform underlying Twitter. You can use it to compose tweets, read profiles, access data about your followers, and get information on four main Twitter data points: Tweets, Entities, Places, and Users.
But we believe that web scraping can allow you to do more with Twitter than the API allows. Apify’s Twitter Scraper has the following advantages over the official API:
- you do not need to have an account
- our scraper is not rate-limited
- you don’t need a registered app and API key
What technology does Apify use?
And if you really want to use Python, don’t forget that Apify datasets can be downloaded in JSON and other formats that can easily be fed into other tools. Why reinvent the wheel and write a Python scraper for Twitter when you can use ours? 😉
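To give a rough idea of what feeding a downloaded dataset into Python might look like, here is a minimal sketch that uses only the standard library. It assumes the export is a JSON array of objects. The file name `dataset.json` and the `username` field are placeholders for illustration, not the scraper's documented output, so check your own export for the real field names.

```python
import json
from collections import Counter
from pathlib import Path


def load_items(path):
    """Read a dataset that was exported as a JSON array of objects."""
    with Path(path).open(encoding="utf-8") as handle:
        items = json.load(handle)
    if not isinstance(items, list):
        raise ValueError("expected a JSON array of items")
    return items


def tweets_per_user(items, user_key="username"):
    """Count items per user.

    'username' is a hypothetical field name; items without it are
    grouped under 'unknown'.
    """
    return Counter(item.get(user_key, "unknown") for item in items)


if __name__ == "__main__":
    # "dataset.json" is a placeholder for the file you downloaded.
    export = Path("dataset.json")
    if export.exists():
        counts = tweets_per_user(load_items(export))
        for user, count in counts.most_common(5):
            print(user, count)
```

From here, the list of dictionaries can be passed to whatever analysis tool you prefer.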
How to scrape data from Twitter — Apify’s step-by-step guide
1. First, you need to sign in at Apify: https://apify.com/
2. You can log in or sign up with your email address, or with a Google or GitHub account.
3. Once you log in, you’ll be redirected to the Apify app. Click on the Store button. Apify Store is where you can find ready-made web scraping and automation tools.
4. When you’re on Apify Store, you can search for the Twitter Scraper actor. Apify actors are serverless cloud programs running on the Apify platform that can perform arbitrary computing jobs such as sending an email or crawling a website with millions of pages.
5. On the actor page, click Try for free and it will automatically redirect you back to the Apify platform.
6. An actor Task will have been created back on your Apify Dashboard. This will enable you to set the parameters for your Twitter scraping run.
8. Now you need to fill in the input fields for the scraper. There are lots of possible inputs, but we’ll keep it simple for this example, so you can just fill in a URL for a Twitter user, e.g. https://twitter.com/apify
But you can also change the following parameters:
- Fill in the username you want to scrape.
- Limit the maximum number of tweets to make the run faster.
- Select the types of tweet you’re interested in.
- Enter your credentials if you want to scrape a lot of information.
Once you’re ready, click on the “Save & Run” button and wait for the actor to finish its scraping run.
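If you like to prepare your inputs in code before typing them into the form, the profile URL and tweet limit from this step could be sketched as a small Python helper. This is only an illustration: the keys `handles` and `maxTweets` are hypothetical names chosen for this example, so check the scraper's readme for the input fields it really expects.

```python
from urllib.parse import urlparse


def handle_from_profile_url(url):
    """Extract the username from a URL like https://twitter.com/apify."""
    parsed = urlparse(url)
    host = parsed.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    if host != "twitter.com":
        raise ValueError(f"not a Twitter URL: {url}")
    # A profile URL has exactly one path segment: the handle.
    parts = [part for part in parsed.path.split("/") if part]
    if len(parts) != 1:
        raise ValueError(f"not a profile URL: {url}")
    return parts[0]


def build_run_input(profile_url, max_tweets=100):
    """Assemble an input dictionary; the key names are hypothetical."""
    if max_tweets < 1:
        raise ValueError("max_tweets must be at least 1")
    return {
        "handles": [handle_from_profile_url(profile_url)],
        "maxTweets": max_tweets,
    }


if __name__ == "__main__":
    print(build_run_input("https://twitter.com/apify", max_tweets=50))
```

Validating the URL up front is a small design choice that catches typos before a run starts rather than after it has finished with no results.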
9. As soon as you see that the run has “Succeeded”, you can check the results in the Dataset tab. In fact, you can even check the Dataset tab before the scraper has finished its run, if you’re curious to see how it’s doing 😁
10. The Dataset tab contains your data in lots of useful formats. You can access them by clicking on “View” or “Download”. You can share the data or use it however you want.
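If you would rather pull the results into a script than click "Download", a minimal sketch along these lines may help. It assumes the Apify API's dataset items endpoint with a `format` query parameter and an optional `token`. The dataset ID and token shown are placeholders, and the helper names are our own for this example.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"


def dataset_items_url(dataset_id, fmt="json", token=None):
    """Build the URL for a dataset's items in the requested format."""
    query = {"format": fmt}
    if token:
        # Only needed when the dataset is not publicly readable.
        query["token"] = token
    safe_id = urllib.parse.quote(dataset_id, safe="")
    return f"{API_BASE}/datasets/{safe_id}/items?{urllib.parse.urlencode(query)}"


def fetch_items(dataset_id, token=None):
    """Download the dataset as JSON and return the parsed items."""
    url = dataset_items_url(dataset_id, "json", token)
    with urllib.request.urlopen(url) as response:
        return json.load(response)


if __name__ == "__main__":
    # "YOUR_DATASET_ID" is a placeholder; copy the real ID from your run.
    print(dataset_items_url("YOUR_DATASET_ID", fmt="csv"))
```

Swapping the `fmt` argument is how you would ask for one of the other export formats instead of JSON.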
JSON format preview:
HTML format preview:
And that’s it. You can learn much more about how to scrape Twitter by studying the readme documentation on the Twitter Scraper page.
Have fun and happy scraping!
Did you know?
- 82% of B2B content marketers used Twitter for organic content marketing in the last 12 months
- 27% of B2B content marketers used Twitter ads in the last 12 months