Why use web scraping on Twitter? What kind of information can we get from the notoriously action-packed social media website, and how easy is it to get all that data?
Twitter started off as a simple ‘microblogging’ system for users to share short posts called tweets. That straightforward idea of expressing your thoughts in just 140 characters (and then later 280 characters) has made Twitter one of the most active discussion platforms on the internet. People engage and argue, both companies and individuals market their brands, and politicians even use it as a way to reach their voters.
Twitter has more than 330 million users, and more than 500 million tweets are posted every day. As Twitter itself boasts: "Twitter is what’s happening and what people are talking about right now." As you might imagine, that means there’s a lot of useful data just sitting around on Twitter, waiting to be used for other purposes.
Extracting Twitter data with web scraping - image created by Midjourney
What can Twitter data be used for?
A single tweet can tell you information about:
1. the demographics of people who liked or retweeted the tweet
2. total clicks on a profile
3. how many people saw the tweet
And that’s just some of the vast amount of data ready to be extracted by web scraping Twitter. For a Twitter user or marketer, access to data about how others engage with their tweets can be vital for developing a brand. For companies, gathering data across Twitter can help them gain a competitive advantage. Academic researchers and journalists can use the data to understand how people interact and to identify trends before they rise to the surface. Once you have the data, what you do with it is up to you.
What can I do with the Twitter API?
The Twitter API is really great for developers. It gives you a lot of access to the platform underlying Twitter. You can use it to compose tweets, read profiles, access data about your followers, and get information on four main Twitter data points: Tweets, Entities, Places, and Users.
Web scraping allows you to do more with Twitter than the API allows. Apify’s Twitter Scraper creates an unofficial Twitter API and has three main advantages over the official API:
you do not need to have an account
our scraper is not rate-limited
you don’t need a registered app and API key
Independent research has also found that web scraping has advantages over the Twitter API in terms of speed and flexibility.
"Web scraping is faster than Twitter API, and it is more flexible in terms of obtaining data."
Source: Web Scraping versus Twitter API: A Comparison for a Credibility Analysis
And in 2023, with access to the Twitter API rumored to be pretty expensive, it makes even more sense to rely on alternative methods to get data from Twitter.
Is web scraping Twitter legal?
You're not the first person to ask if Twitter scraping is legal, and you probably won't be the last. Scraping basically automates tasks that a human could do manually, so provided that you are obtaining data that is openly available, the answer is yes.
On top of that, in 2022, the US Ninth Circuit Court of Appeals confirmed this with a ruling that scraping publicly accessible data is legal. The court's decision supports the view that data posted publicly on the internet is fair game for crawling and scraping. Recently, web scraping made it into the public eye during the defamation trial of Johnny Depp v. Amber Heard as a method of investigation:
Ron Schnell, director of Berkeley Research Group, explains how he scraped Twitter to track a spike in negative sentiment towards Heard.
Just want to scrape Twitter data the easy way?
Before we get to the tutorial, maybe you'd like to start off by scraping some very specific Twitter data? Apify Store also offers a few specialized Twitter scrapers to carry out smaller scraping tasks. You only need to insert a keyword or a URL and start your run to extract your results, including Twitter followers, profile photos, usernames, tweets, images, and more.
Remember that these smaller scrapers offer a narrower range of input settings and limited results. If you're planning large-scale scraping and need more detailed results, we recommend you use Twitter Scraper, so read on for our full guide to web scraping Twitter or check out the Twitter scraping FAQ section here.
Step 1. Go to Twitter Scraper on Apify Store
Head over to the actor's page and click the Try for free button. You will be redirected to Apify Console, your workspace for running your scrapers and tasks. You can log in or sign up for free using your email address or GitHub account.
Twitter Scraper's page on Apify Store
Step 2. Choose the data you want to scrape
So let’s go on Twitter and find something to scrape. How about the Apify profile? 😁 Just insert the URL or Twitter handle into the actor's input fields: @apify or https://twitter.com/apify.
In the scraper's input fields, insert the Twitter handle...
...or the Twitter page URL
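Since the input accepts either form, a script that prepares inputs in bulk may want to reduce both to a bare handle first. The following is a minimal Python sketch under stated assumptions: `normalize_handle` is a hypothetical helper (it is not part of Twitter Scraper), it only accepts profile URLs on twitter.com, and it assumes handles are 1 to 15 letters, digits, or underscores.

```python
import re
from urllib.parse import urlparse

# Assumption: handles are 1-15 characters of letters, digits, or underscores.
HANDLE_PATTERN = re.compile(r"[A-Za-z0-9_]{1,15}")


def normalize_handle(value: str) -> str:
    """Return a bare handle from '@name', 'name', or a twitter.com profile URL."""
    value = value.strip()
    if value.startswith(("http://", "https://")):
        parsed = urlparse(value)
        host = parsed.netloc.lower()
        if host.startswith("www."):
            host = host[len("www."):]
        if host != "twitter.com":
            raise ValueError(f"not a twitter.com URL: {value!r}")
        # The first non-empty path segment of a profile URL is the handle.
        segments = [part for part in parsed.path.split("/") if part]
        if not segments:
            raise ValueError(f"no handle in URL: {value!r}")
        handle = segments[0]
    else:
        handle = value.lstrip("@")
    if not HANDLE_PATTERN.fullmatch(handle):
        raise ValueError(f"not a valid handle: {value!r}")
    return handle


print(normalize_handle("@apify"))                     # apify
print(normalize_handle("https://twitter.com/apify"))  # apify
```

Rejecting anything that does not look like a handle early is a design choice: it is cheaper to catch a typo before a run starts than to discover an empty dataset afterwards.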
But you can also change the following parameters:
Fill in the username you want to scrape.
Limit the maximum number of tweets to make the run faster.
Select the types of tweets you’re interested in.
Enter your credentials if you want to scrape a lot of information.
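If you prefer to keep these settings in a script rather than typing them into the form, they can be collected in a plain dictionary and serialized to JSON. This is only a sketch: the field names `handles`, `maxTweets`, and `tweetTypes`, the value `"tweets"`, and the helper `build_run_input` are all hypothetical, so check the actor's input form for the real names. Credentials are deliberately left out.

```python
import json


def build_run_input(handles, max_tweets=100, tweet_types=("tweets",)):
    """Assemble scraper settings as a plain dict (all field names are hypothetical)."""
    if not handles:
        raise ValueError("at least one handle is required")
    if max_tweets < 1:
        raise ValueError("max_tweets must be a positive integer")
    return {
        "handles": list(handles),         # hypothetical field name
        "maxTweets": max_tweets,          # hypothetical field name
        "tweetTypes": list(tweet_types),  # hypothetical field name and values
    }


run_input = build_run_input(["apify"], max_tweets=50)
print(json.dumps(run_input, indent=2))
```

Validating the tweet limit before the run starts mirrors the advice above: a small, explicit limit keeps test runs quick.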
Step 3. Run the scraper
Once you are all set, click the Start button. Your task's status will change to Running, so wait for the scraper's run to finish. It should take about a minute before you see the status switch to Succeeded.
Your run has succeeded! 🏆
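The same wait-until-finished pattern can be sketched in code for anyone automating runs. The example below is a hedged illustration, not the platform's client: `wait_for_run` is a hypothetical helper, `get_status` is any callable you supply that returns the run's current status, and the uppercase status names are an assumption about how the statuses shown in the Console are reported programmatically.

```python
import time

# Assumption: these are the statuses after which a run no longer changes.
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED", "ABORTED", "TIMED-OUT"}


def wait_for_run(get_status, poll_seconds=5.0, timeout_seconds=300.0, sleep=time.sleep):
    """Poll get_status() until it returns a terminal status, then return that status."""
    waited = 0.0
    while True:
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        if waited >= timeout_seconds:
            raise TimeoutError(f"run still {status!r} after {waited:.0f} seconds")
        sleep(poll_seconds)
        waited += poll_seconds


# Simulated run that finishes on the fourth check; no network is involved.
fake_statuses = iter(["READY", "RUNNING", "RUNNING", "SUCCEEDED"])
result = wait_for_run(lambda: next(fake_statuses), sleep=lambda seconds: None)
print(result)  # SUCCEEDED
```

Passing `sleep` in as a parameter keeps the loop easy to test, and the timeout ensures a stuck run does not block a script forever.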
Step 4. Download your Twitter data
As soon as you see that the run has Succeeded, you can check the results in the Dataset tab. In fact, you can even check the Dataset tab before the scraper has finished its run if you’re curious to see how it’s doing. The Dataset tab contains your scraped data in many formats, including HTML table, JSON, CSV, Excel, XML, and RSS feed 😁
Find your results in the Dataset tab
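The formats listed in the Dataset tab can also be requested by URL. The sketch below builds such a URL and does not send any request. It assumes the dataset items endpoint of the Apify API and its `format` query parameter, so check the API reference before relying on it. `dataset_items_url` is a hypothetical helper, and `abc123` is a placeholder rather than a real dataset ID.

```python
from urllib.parse import urlencode

API_BASE = "https://api.apify.com/v2"
# Assumption: format names accepted by the dataset items endpoint.
EXPORT_FORMATS = {"json", "jsonl", "csv", "xlsx", "xml", "html", "rss"}


def dataset_items_url(dataset_id: str, export_format: str = "json") -> str:
    """Build the download URL for a dataset in the requested export format."""
    if export_format not in EXPORT_FORMATS:
        raise ValueError(f"unsupported format: {export_format!r}")
    query = urlencode({"format": export_format})
    return f"{API_BASE}/datasets/{dataset_id}/items?{query}"


# "abc123" is a placeholder dataset ID used only for illustration.
print(dataset_items_url("abc123", "csv"))
# https://api.apify.com/v2/datasets/abc123/items?format=csv
```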
Preview the data by clicking the Preview button, or view it in a new tab if the dataset is too large. You can then download it to your computer for use in spreadsheets, other apps, or your own projects.
Preview your results before downloading them
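Once the data is on your computer, a few lines of Python are enough to start exploring it. The sketch below ranks tweets from a JSON export by an engagement count. The field names `text` and `likes` are assumptions about the export (inspect your own file for the real ones), and `top_tweets` is a hypothetical helper.

```python
import json


def top_tweets(json_text: str, key: str = "likes", limit: int = 3) -> list:
    """Return the `limit` items with the highest value for `key`."""
    items = json.loads(json_text)
    # Items that lack the field (or have null) are treated as zero.
    ranked = sorted(items, key=lambda item: item.get(key) or 0, reverse=True)
    return ranked[:limit]


# Made-up sample data standing in for a downloaded JSON export.
sample = json.dumps([
    {"text": "first tweet", "likes": 2},
    {"text": "second tweet", "likes": 10},
    {"text": "third tweet"},
])

for tweet in top_tweets(sample, limit=2):
    print(tweet["text"])
# second tweet
# first tweet
```

To use it on a real download, read the file's contents into a string and pass that string in place of `sample`.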
And that’s it — now that you know how to scrape Twitter, play around with the input parameters and see just how much data you can get from scraping tweets and other information from Twitter.