Web scraping case law: Van Buren v. United States

By ruling that the CFAA only applies to instances where someone breaches a technological barrier they are not authorized to pass, the Supreme Court has provided a clearer path for those accessing publicly available data online.

Van Buren v. United States (2021) is a landmark Supreme Court case that reshaped the legal landscape surrounding the Computer Fraud and Abuse Act (CFAA) and brought clarity to activities like web scraping. The decision, reinforced by the later hiQ v. LinkedIn case, signals a more restrained interpretation of "unauthorized access." It offers web scrapers a measure of confidence in their practices while underlining the importance of understanding the legal boundaries.

In Van Buren, the Supreme Court ruled that the CFAA's scope should be limited to situations where someone breaches a technological barrier that they are not authorized to pass through. This set the stage for the later hiQ Labs ruling, where courts determined that bypassing anti-bot systems (such as CAPTCHAs), which prevent specific methods of data collection but do not restrict general access to the data, should not be considered a lack of "authorization" under the CFAA.

Summary of Van Buren v. US

Van Buren v. United States (2021) involved Nathan Van Buren, a former police officer who was convicted under the CFAA for accessing a law enforcement database for an improper purpose, even though he was authorized to access that database. After a prolonged dispute over how the statute should be read, the US Supreme Court ruled 6-3 in his favor, holding that the CFAA does not criminalize the misuse of information that one is otherwise authorized to access.

The Computer Fraud and Abuse Act

The CFAA is a decades-old United States law. It took effect in 1986, when US lawmakers hastily enacted an “anti-hacking law” that forbids anyone from accessing a computer “without authorization or exceeding authorized access.”

The CFAA defines "computer" broadly enough to include most electronic devices. But the internet and the wider tech landscape have changed enormously since the law was written, which leaves the CFAA quite outdated.

Due to its broad and vague language, the CFAA could be used to prosecute relatively benign activities, such as breaches of workplace computer use policies, or even legitimate and important activities like testing systems for vulnerabilities. And as one can imagine, it has also become a boogeyman for web scrapers. 

The scope of applicability

Fortunately, in Van Buren, the Supreme Court narrowed the CFAA's scope and clarified how to interpret the phrase “without authorization or exceeding authorized access” using a fitting metaphor: the gates-up-or-down analogy.

If the "gate is up," the individual has authorization to access the information or part of the system. This means they are allowed to enter and retrieve information within the system, regardless of their purpose. If the "gate is down," the individual does not have authorization to access that information or part of the system, and doing so would violate the CFAA.

Applied to Van Buren's case, this means the CFAA's scope should be limited to situations where someone breaches a technological barrier that they are not authorized to pass through. Van Buren was authorized to access the database and the information within it, with no further technical restrictions (such as passwords) in his way. If someone has legitimate access (the gate is up), using that access for an improper purpose (e.g., for personal gain) does not constitute a violation of the CFAA.
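
The gates-up-or-down test described above can be sketched as a toy model in Python. This is only an illustration of the reasoning, not a statement of law: the `AccessAttempt` class, its fields, and both function names are hypothetical and were invented for this example. The point it tries to show is that the user's purpose is recorded but never consulted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessAttempt:
    """One attempt to reach part of a computer system (hypothetical model)."""
    area_is_gated: bool       # a technical barrier (e.g. a password) protects the area
    user_may_pass_gate: bool  # the user was authorized to pass that barrier
    purpose: str              # why the user wants the data; never consulted below


def gate_is_up(attempt: AccessAttempt) -> bool:
    """Return True when the gate is up for this user."""
    # An ungated area is open to everyone; a gated area is open only to
    # users who were authorized to pass the barrier.
    return (not attempt.area_is_gated) or attempt.user_may_pass_gate


def within_cfaa_scope(attempt: AccessAttempt) -> bool:
    """In this toy reading, only 'gate down' access falls within the CFAA."""
    return not gate_is_up(attempt)


# Assumed scenarios, simplified for illustration.
van_buren = AccessAttempt(area_is_gated=True, user_may_pass_gate=True,
                          purpose="personal gain")
outsider = AccessAttempt(area_is_gated=True, user_may_pass_gate=False,
                         purpose="curiosity")
public_page = AccessAttempt(area_is_gated=False, user_may_pass_gate=False,
                            purpose="web scraping")

for name, attempt in [("van_buren", van_buren),
                      ("outsider", outsider),
                      ("public_page", public_page)]:
    print(name, within_cfaa_scope(attempt))
```

In this sketch, only the `outsider` scenario lands within scope. Changing the `purpose` string of any scenario leaves the result untouched, which mirrors the holding that misuse of otherwise authorized access is not what the CFAA targets.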

The impact on web scraping

Even though Van Buren v. United States does not specifically address web scraping, it sheds light on the much-debated applicability of the CFAA and therefore shapes how web scraping itself is perceived. By narrowing the scope of the CFAA, the ruling provides clarity that works in favor of web scraping.

The ruling is all about the importance of having proper authorization. In the web scraping world, if you have permission—whether explicitly granted or implied through open access to the website (generally available to anyone with an internet connection)—you're on solid ground.

The latter, implied permission through open access, makes an important difference. Under Van Buren, accessing information that no technical barrier keeps you from (gates are up) falls outside the CFAA. That remains the case even if such conduct is prohibited by the website's terms of use or robots.txt. In this context, and as reaffirmed by the later hiQ Labs case, even a CAPTCHA or other anti-bot system deployed on a website is deemed a selective denial of access rather than a lack of “authorization” under the CFAA.

However, it is important to understand that this court ruling does not give web scrapers a free pass. The case underlines the significance of understanding the legal framework. The key takeaway is that with careful attention to legal boundaries, web scraping remains a legitimate and valuable practice.

An informal talk with Apify's COO and former lawyer, Ondra Urban, explaining the legal status of web scraping

Conclusion: The CFAA does not cover publicly available data

The Van Buren v. United States case is a reminder to be prudent about web scraping, but not deterred. Most importantly, it underscores that the CFAA is not a broad tool to punish web scrapers or prosecute data extraction as a criminal activity. The decision clarifies that the CFAA is primarily concerned with unauthorized access, not legitimate use of publicly available data. By understanding the implications of this ruling and being mindful of where and how you scrape data, you can continue to scrape the web with confidence.

Recent relevant cases: hiQ v. LinkedIn

Van Buren v. United States built a strong foundation for the subsequent hiQ v. LinkedIn decision (2022), where the Ninth Circuit noted that a key characteristic of public websites is that they don't restrict access. Using the earlier gate analogy, there were no gates to raise or lower in the first place. In other words, if no authorization is needed from the outset, there’s nothing to revoke later. Therefore, the CFAA’s idea of “without authorization” doesn't really apply to public websites.


We are lawyers, but we are not your lawyers. Even though we want to help as much as we can, we do not know the details of your project. For professional legal advice, please talk to a certified lawyer in your country.

Lenka Bidova
Senior Legal Counsel at Apify
