How to use cURL in Python

Hi! We're Apify, a full-stack web scraping and browser automation platform. If you're interested in using Python for web scraping, this article shows you 3 ways to use cURL with Python.

cURL is a powerful tool used for transferring data with URLs. This article will guide you through cURL in Python using three different approaches: the PycURL library, Python's subprocess module, and the Requests library. Each method has its advantages and applications, which we'll explore in detail.

What is cURL?

cURL stands for Client URL, a lightweight command line tool for making network requests from the client side. It's suitable for many use cases, like making HTTP requests and testing APIs. It provides support for using different network protocols such as HTTP, HTTPS, FTP, and many more.

Let's say we have different endpoints of an API, and we want to test those endpoints. Rather than using web-based API calls, we can just use cURL to hit those endpoints and test our APIs.

💡

Learn more: What is cURL?

Using cURL in Python

We’re going to show three different approaches to using cURL in Python:

The PycURL library
Python's subprocess module
The Requests library

These methods will allow you to make network requests in Python effectively.

Comparing different approaches to using cURL in Python

Method	Description	Use Case
PycURL	A Python interface to libcurl, the library cURL is based on.	When you need low-level control over your requests.
subprocess	Runs cURL commands as subprocesses.	When you prefer using cURL directly within Python.
Requests	A high-level library for making HTTP requests in Python.	When you need a simple and intuitive interface.

Each method has its own strengths: PycURL offers detailed control, subprocess allows direct use of cURL, and Requests provides an easy-to-use API.

How to use cURL in Python with PycURL

1. Setting up the environment for cURL requests in Python

Before making cURL requests in Python, make sure you have Python and PycURL installed.

You can easily download Python from the official Python website. Once you've downloaded the Python installer, you can install Python and check the version by entering the following command on the command line.

python3 --version

You'll see the version of Python that's installed on your system.

📔

Note: We're using the python3 command here because, on macOS and Linux, Python 3 is usually invoked as python3 rather than plain python.

2. Installing the PycURL library

Now that you have Python installed on your system, you can install pycurl using Python's package manager pip3.

pip3 install pycurl

Once the command is executed, confirm the installation by simply running the following command:

pip3 show pycurl

This should print the name of the library with other information. Once you're done with the installation, you're ready to make your requests using PycURL.

📔

Note: Be careful while installing packages through pip. Use the pip that belongs to your Python 3 installation (pip3). Otherwise, the package may be installed for a different Python version and your interpreter won't find it.

3. How to make a GET request with PycURL

Before making GET requests, let's first see how a basic GET request works. When we make a GET request, we basically ask a server to give us a specific resource. The resource could be a file, an HTML page, a JSON object, or other data. It's called a GET request because it gets a resource from the server.

💡

Learn more: How to send a GET request with cURL

You can make GET requests with the pycurl library by following a series of steps. Don't worry. We'll go through everything in detail.

Let's see the code first.

import pycurl
from io import BytesIO
# Create a Curl object
curl = pycurl.Curl()
# Create a BytesIO object to store the response
responseBuffer = BytesIO()

# Set the URL
curl.setopt(curl.URL, 'https://www.google.com')
# Set the option to write the response to the buffer
curl.setopt(curl.WRITEDATA, responseBuffer)
# Make the request
curl.perform()
# Fetch the response body
response = responseBuffer.getvalue()

# Print the response
print(response.decode('utf-8'))
# Close the Curl object
curl.close()
# Close the buffer
responseBuffer.close()

In the code above, we first create a curl object using pycurl.Curl().

Then we set the URL we want to fetch using curl.setopt(curl.URL, 'https://www.google.com').

We can also set other options using curl.setopt(), such as storing the response data in the responseBuffer object.

We then perform the request using curl.perform(). This sends the HTTP request to the URL and retrieves the response.

We can get the response body using responseBuffer.getvalue(), which returns a bytes object.

We decode this response using response.decode('utf-8') to convert it to a string.

Finally, we close the curl and buffer objects using curl.close() and responseBuffer.close() respectively.
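The decode step above assumes the page is UTF-8. As a minimal sketch, assuming you have the response's Content-Type value available (with PycURL it can typically be read after perform() via curl.getinfo(curl.CONTENT_TYPE)), the helper below picks the charset from that header and falls back to UTF-8. The name decode_body and the sample bytes are illustrative, not part of PycURL.

```python
from email.message import Message


def decode_body(body, content_type, default="utf-8"):
    """Decode raw response bytes using the charset named in Content-Type."""
    if not content_type:
        # No header available: fall back to the default encoding
        return body.decode(default, errors="replace")
    header = Message()
    header["Content-Type"] = content_type
    # get_content_charset() returns the lowercased charset, or None if absent
    charset = header.get_content_charset() or default
    try:
        return body.decode(charset, errors="replace")
    except LookupError:
        # Unknown charset name: fall back to the default encoding
        return body.decode(default, errors="replace")


# Hypothetical sample standing in for responseBuffer.getvalue()
raw = "café".encode("latin-1")
print(decode_body(raw, "text/html; charset=ISO-8859-1"))
```

Using errors="replace" keeps the script from crashing on a stray byte, at the cost of showing a replacement character in its place.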

While executing this code on macOS, you may encounter an error caused by mismatched versions of cURL and OpenSSL. You can try the following solution:

# Uninstall curl

brew uninstall curl

# Uninstall the openssl

brew uninstall openssl

# Uninstall curl-openssl

brew uninstall curl-openssl

# Install openssl

brew install openssl

# Install curl

brew install curl

echo 'export PATH="/usr/local/opt/openssl@1.1/bin:$PATH"' >> ~/.zshrc

echo 'export PATH="/usr/local/opt/curl/bin:$PATH"' >> ~/.zshrc

source ~/.zshrc

source ~/.virtualenvs/foo_venv/bin/activate # or whichever venv you're using

export PYCURL_SSL_LIBRARY=openssl

export LDFLAGS="-L/usr/local/opt/curl/lib"

export CPPFLAGS="-I/usr/local/opt/curl/include"

# The below command will give you the location where the ssl is installed

brew --prefix openssl

# For example, /usr/local/opt/openssl@3. Use the path printed on your machine

ls -la /usr/local/opt/openssl@3

# Put that location in --openssl-dir at the end of the next command (this example uses /usr/local/opt/openssl@1.1)

pip3 install --no-cache-dir --compile --ignore-installed --install-option="--with-openssl" --install-option="--openssl-dir=/usr/local/opt/openssl@1.1" pycurl

❓

You may be thinking: why is the syntax so confusing, and why are we using so many things to make just one request?

The PycURL library provides a low-level interface to cURL, giving us more control and flexibility. As we get closer to any computer system, we attain more control, but the syntax gets harder and less readable for humans.

The steps we saw above, like importing BytesIO, creating objects, and decoding responses are necessary because PycURL operates at a lower level and provides us with direct access to the raw HTTP response.

4. How to make a POST request with PycURL

A POST request is similar to a GET request. The only difference is that we attach the data we want to send, which must be URL-encoded first. Let's see how that works.

import pycurl
# Import the urllib for encoding
import urllib.parse
from io import BytesIO

# Create a Curl object
curl = pycurl.Curl()
responseBuffer = BytesIO()
# Set the URL
curl.setopt(curl.URL, 'http://httpbin.org/post')
# Set the method to POST
curl.setopt(curl.POST, True)

# Data
data = {'name': 'John', 'age': '30'}
# Encode the data
dataString = urllib.parse.urlencode(data)
# Add data as POSTFIELDS
curl.setopt(curl.POSTFIELDS, dataString)
# Set the variable that will store data
curl.setopt(curl.WRITEDATA, responseBuffer)

# Make the request
curl.perform()
# Get the responseCode
responseCode = curl.getinfo(curl.RESPONSE_CODE)
print('Response Code:', responseCode)
# Get the responseBody
responseBody = responseBuffer.getvalue()
print('Response Body:', responseBody.decode('utf-8'))
# Close the object
curl.close()
# Close the buffer
responseBuffer.close()

We first create a curl object, and then we set the URL.

We set the request method to POST using curl.setopt(curl.POST, True).

After that, we define the data as a dictionary and encode it using the urllib.parse.urlencode() function.

We set the encoded data as the request body using curl.setopt(curl.POSTFIELDS, dataString).

After performing the request, we print the response code and the response body.

Finally, we close the curl and buffer objects.
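To make the encoding step concrete, here is a small standard-library sketch of what urlencode() produces for the dictionary used above, plus one made-up value containing a space and an ampersand to show how special characters are escaped.

```python
from urllib.parse import parse_qs, urlencode

# The same dictionary as in the POST example
data = {'name': 'John', 'age': '30'}
body = urlencode(data)
print(body)  # name=John&age=30

# Illustrative value: spaces become '+', and '&' is percent-encoded
print(urlencode({'q': 'web scraping & cURL'}))  # q=web+scraping+%26+cURL

# parse_qs reverses the encoding (each value comes back as a list)
print(parse_qs(body))  # {'name': ['John'], 'age': ['30']}
```

The resulting string is what gets passed to curl.POSTFIELDS as the request body.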

We've covered some basic concepts of the PycURL library and how it works. Now, let's cover some more advanced topics, including adding custom headers, handling redirects, authenticating requests, and handling errors.

💡

Learn more: How to send HTTP headers with cURL

How to add headers in the cURL request using Python

We can send additional information about a request using headers, such as the user agent, content type, or authorization credentials. As mentioned earlier, PycURL's .setopt() method lets us set request options, so we'll use it to add headers as well. Let's see how it's done.

import pycurl
from io import BytesIO

# Create a new Curl object
curl = pycurl.Curl()
# Create a BytesIO object
responseBuffer = BytesIO()

# Set the URL to fetch
curl.setopt(curl.URL, 'https://httpbin.org/headers')

# Set custom headers
customHeaders = ['Authorization: Bearer mytoken', 'User-Agent: MyCustomAgent']
curl.setopt(curl.HTTPHEADER, customHeaders)

# Set the WRITEFUNCTION option to redirect the response body to the responseBuffer
curl.setopt(curl.WRITEFUNCTION, responseBuffer.write)

# Make the request
curl.perform()

# Get the HTTP response code
responseCode = curl.getinfo(curl.RESPONSE_CODE)
print('Response Code:', responseCode)

# Get the response body from responseBuffer
responseBody = responseBuffer.getvalue()

# Print the body of the response
print('Response Body:', responseBody.decode('utf-8'))

# Close the Curl object and responseBuffer
curl.close()
responseBuffer.close()

We set two custom headers (Authorization and User-Agent) by setting the curl.HTTPHEADER option to a list of strings containing the header names and values. We then perform the request and retrieve the response body as before.
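Headers are often easier to manage as a dictionary. The sketch below converts one into the list of 'Name: value' strings that curl.HTTPHEADER expects. The helper name to_header_list is hypothetical, and the newline check is a precaution added here, not something PycURL requires.

```python
def to_header_list(headers):
    """Turn a {'Name': 'value'} dict into ['Name: value', ...] strings."""
    lines = []
    for name, value in headers.items():
        # Reject line breaks so one value cannot smuggle in extra headers
        if any(ch in name + value for ch in "\r\n"):
            raise ValueError(f"Header {name!r} contains a line break")
        lines.append(f"{name}: {value}")
    return lines


customHeaders = to_header_list({
    'Authorization': 'Bearer mytoken',
    'User-Agent': 'MyCustomAgent',
})
print(customHeaders)
```

The returned list can be passed straight to curl.setopt(curl.HTTPHEADER, customHeaders).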

How to handle redirects in cURL requests with Python

An HTTP redirect is a way to tell a browser to request a different URL instead of the one originally requested. In simple words, when a user tries to access a specific URL, the server automatically takes the user to an alternative URL.

This happens when the owner of the website changes the URL or wants the user to see a better version. In the case of alternative URLs, the website is accessible through both the new and old URLs.

PycURL does not follow redirects automatically. We can configure it to follow them using the curl.setopt() method.

import pycurl
from io import BytesIO

# Create a new Curl object
curl = pycurl.Curl()
# Create a BytesIO object
responseBuffer = BytesIO()

# Set the URL to fetch
curl.setopt(curl.URL, 'http://httpbin.org/redirect-to?url=https%3A%2F%2Fwww.google.com')
# Follow redirects
curl.setopt(curl.FOLLOWLOCATION, True)
# Follow at most 2 redirects
curl.setopt(curl.MAXREDIRS, 2)
# Write the data in the responseBuffer
curl.setopt(curl.WRITEDATA, responseBuffer)

# Perform the request
curl.perform()
# Get the response body
response = responseBuffer.getvalue()
# Print the response
print(response.decode('utf-8'))

# Close the Curl object
curl.close()
# Close the buffer
responseBuffer.close()

We set the curl.FOLLOWLOCATION option to True to instruct cURL to follow redirects. We also set the curl.MAXREDIRS option to 2, which limits the number of redirects that it will follow.
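Following a redirect means reading the Location header of the response and requesting that URL instead. The Location value may be absolute or relative, so it is resolved against the URL that was just requested. A minimal standard-library sketch of that resolution step follows; resolve_redirect is an illustrative name and the example.com URLs are made up.

```python
from urllib.parse import urljoin


def resolve_redirect(current_url, location):
    """Resolve a Location header value against the URL that returned it."""
    return urljoin(current_url, location)


# Absolute Location: replaces the URL entirely
print(resolve_redirect(
    'http://httpbin.org/redirect-to?url=https%3A%2F%2Fwww.google.com',
    'https://www.google.com',
))  # https://www.google.com

# Relative Location: keeps the scheme and host of the current URL
print(resolve_redirect('http://example.com/old/page', '/new/page'))
# http://example.com/new/page
```

cURL does this for you when redirects are enabled. After the request, the final address should be available through curl.getinfo(curl.EFFECTIVE_URL).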

How to handle cookies in cURL requests with Python

Cookies are small pieces of text that are used to save a user's state or information about the user. PycURL provides us with an easy way to manage them. We just need to point two options, COOKIEJAR and COOKIEFILE, at a text file using the setopt() method.

COOKIEFILE tells PycURL to read cookies from a file before making a request.

COOKIEJAR tells PycURL to save the cookies received in the response to a file when the Curl object is closed.

import pycurl
from io import BytesIO
# Create a Curl object
curl = pycurl.Curl()
# Create a buffer object
responseBuffer = BytesIO()
# Set the URL
curl.setopt(curl.URL, 'https://stackoverflow.com/')
# Set the buffer to receive data
curl.setopt(curl.WRITEDATA, responseBuffer)
# Save cookies to a file
curl.setopt(curl.COOKIEJAR, 'cookies.txt')
# Load cookies from a file
curl.setopt(curl.COOKIEFILE, 'cookies.txt')

# Perform the request
curl.perform()

# Read cookies
cookies = curl.getinfo(pycurl.INFO_COOKIELIST)

# Print the cookies
print("Cookies are:")
for cookie in cookies:
    print(cookie)

# Close the Curl object
curl.close()
# Close the buffer
responseBuffer.close()

In this example, we specify the cookie file through setopt(). If the file doesn't exist, PycURL creates it and saves the cookies into it. We read the cookies by passing pycurl.INFO_COOKIELIST to curl.getinfo() and print them.
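Each entry printed above should be a single string in the Netscape cookie file format: seven tab-separated fields. As a sketch under that assumption, the parser below turns one such line into named fields. The Cookie type, the parse_cookie_line name, and the sample line are all illustrative.

```python
from typing import NamedTuple


class Cookie(NamedTuple):
    domain: str
    include_subdomains: bool
    path: str
    secure: bool
    expires: int  # Unix timestamp; 0 means a session cookie
    name: str
    value: str


def parse_cookie_line(line):
    """Parse one Netscape-format cookie line into a Cookie tuple."""
    prefix = "#HttpOnly_"
    if line.startswith(prefix):
        # HttpOnly cookies carry this marker in front of the domain
        line = line[len(prefix):]
    domain, subdomains, path, secure, expires, name, value = line.split("\t")
    return Cookie(domain, subdomains == "TRUE", path,
                  secure == "TRUE", int(expires), name, value)


# Made-up line in the same shape as the entries printed above
sample = ".example.com\tTRUE\t/\tFALSE\t0\tsession_id\t12345"
cookie = parse_cookie_line(sample)
print(cookie.name, "=", cookie.value)
```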

💡

Learn more: Basic auth in cURL

Authenticating cURL requests with Python

While working with websites and requests, we may encounter websites that require authentication before we can access their content. PycURL handles this situation through the .setopt() method as well: we URL-encode the credentials and send them in a POST request to the login page.

Let’s see the code in detail.

import pycurl
from io import BytesIO
from urllib.parse import urlencode

# Create a new Curl object
curl = pycurl.Curl()

# Create a buffer object
responseBuffer = BytesIO()

# Enable cookie handling
curl.setopt(curl.COOKIEJAR, 'cookies.txt')
curl.setopt(curl.COOKIEFILE, 'cookies.txt')

# Set the login URL
curl.setopt(curl.URL, 'https://newsapi.org/login')

# Set the request method to POST
curl.setopt(curl.POST, 1)

# Add the data 
postData = {'email': 'yourEmail', 'password': 'yourPassword'}
# Encode the data
postfields = urlencode(postData)
# Add the post fields
curl.setopt(curl.POSTFIELDS, postfields)

# Add the buffer variable
curl.setopt(curl.WRITEDATA, responseBuffer)
# Perform the login request
curl.perform()

# Get the HTTP response code
responseCode = curl.getinfo(curl.RESPONSE_CODE)
print('Response Code:', responseCode)

# Clear the response buffer
responseBuffer.truncate(0)
responseBuffer.seek(0)

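# Switch back to GET (the handle is still configured for POST)
curl.setopt(curl.HTTPGET, True)
# Set the URL to the account page
curl.setopt(curl.URL, 'https://newsapi.org/account')

# Make the request to the account page
curl.perform()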

# Fetch the response body
responseBody = responseBuffer.getvalue()

# Print the body of the response
print('Response Body:', responseBody.decode('utf-8'))

# Close the Curl object, and the responseBuffer
curl.close()
responseBuffer.close()

In the code above, we're using the urlencode function from urllib.parse to encode the credentials. We also enable cookie handling and save the cookies to a file so that the session persists between requests.

The line curl.setopt(curl.POSTFIELDS, postfields) adds the encoded credentials to the request body.

After that, we make the POST request and print the response code, which should be 200 if the login succeeds.

After clearing the responseBuffer, we make a second request with the same Curl object to fetch the account page, so the session cookies from the login are sent along.

💡

Learn more: How to follow redirects using cURL

How to handle errors in cURL requests with Python

While writing code, you might not always be sure whether a piece of code will work. For example, when making a request or reading a file, it's possible that the file isn't available. At that point, we get an error.

To handle such situations, we use try/except blocks with PycURL.

Here's an example of using a try/except block to perform a cURL request.

import pycurl
# Create a new Curl object
curl = pycurl.Curl()
# Set the URL
curl.setopt(curl.URL, 'https://www.googlecom')
try:
    # Perform the request
    curl.perform()
except pycurl.error as error:
    # Handle the error
    errorNumber, errorString = error.args
    print('Error: %s %s' % (errorNumber, errorString))
# Close the Curl object
curl.close()

In this example, we have intentionally set an invalid URL to demonstrate how to handle errors.

We've used a try/except block to catch any pycurl.error exceptions that may be raised during the request.

We then extract the error number and error string from the exception using error.args and print an error message.
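The error number is a libcurl error code, so it can be mapped to a more helpful message. The sketch below covers a handful of common codes (the numeric values come from libcurl's error code list). The describe_curl_error helper and the sample tuple are illustrative, standing in for the error.args value caught above.

```python
# A few common libcurl error codes and what they usually mean
CURL_ERROR_HINTS = {
    6: "could not resolve host; check the hostname or DNS",
    7: "could not connect; the server may be down or the port blocked",
    28: "the operation timed out",
    60: "the TLS certificate could not be verified",
}


def describe_curl_error(args):
    """Build a readable message from a (code, message) error tuple."""
    code, message = args
    hint = CURL_ERROR_HINTS.get(code, "see the libcurl error code list")
    return f"cURL error {code} ({message}): {hint}"


# Made-up tuple in the same shape as error.args
print(describe_curl_error((6, "Could not resolve host: www.googlecom")))
```

Inside the except block, you could call describe_curl_error(error.args) instead of printing the raw tuple.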

We've covered some pretty advanced stuff, but let's go a step further. In this next section, we'll cover some even more advanced topics in cURL requests with Python, including performing file uploads and working with SSL/TLS certificates.

💡

Learn more: How to download a file with cURL

How to perform file uploads in cURL requests with Python

We may need to upload a file along with our HTTP request, such as when working with file storage or API endpoints that accept file uploads. PycURL provides us with an easy way to upload files using the same method setopt().

We just need to set our request to POST and specify the path of the file with its type, and we're good to go.

Let's see how easy this process is.

import pycurl
from io import BytesIO

# Create a new Curl object
curl = pycurl.Curl()
# Create a buffer
responseBuffer = BytesIO()
# Set the URL for the file upload
curl.setopt(curl.URL, 'https://httpbin.org/post')
# Set the file to be uploaded and other options
curl.setopt(curl.HTTPPOST, [('file', (curl.FORM_FILE, '/content/cookies.txt'))])
# Specify the buffer to receive response
curl.setopt(curl.WRITEDATA, responseBuffer)
# Apply the try block
try:
    # Perform the request
    curl.perform()
except pycurl.error as error:
    # Handle the error
    errorNumber, errorString = error.args
    print('Error: %s %s' % (errorNumber, errorString))
# Close the Curl object
curl.close()
# Decode and print the response
response = responseBuffer.getvalue().decode('utf-8')
print(response)
# Close the buffer
responseBuffer.close()

Our code is mostly the same, and we've already explained the code pretty extensively in the previous examples.

The only line that requires explanation is the curl.setopt(curl.HTTPPOST, ...) call, where we switch the request to POST and specify the file to upload.

curl.HTTPPOST sets the HTTP method to POST and sends the data as a multipart form. Its value is a list that includes information about the file.

file is the name of the form field on the server side that will receive the uploaded file.

(curl.FORM_FILE, '/content/cookies.txt') is a tuple that specifies the type of the form field and the file to be uploaded.
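Before handing a path to cURL, it can help to confirm the file exists and to work out its content type. The standard-library sketch below does both. The describe_upload helper is hypothetical, and the temporary file merely stands in for '/content/cookies.txt'.

```python
import mimetypes
import tempfile
from pathlib import Path


def describe_upload(path):
    """Check that a file exists and guess its name and content type."""
    file_path = Path(path)
    if not file_path.is_file():
        raise FileNotFoundError(f"No such file to upload: {file_path}")
    content_type, _ = mimetypes.guess_type(file_path.name)
    return {
        "path": str(file_path),
        "filename": file_path.name,
        # Fall back to a generic binary type when the extension is unknown
        "content_type": content_type or "application/octet-stream",
    }


# Demo with a throwaway file standing in for the real upload
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "cookies.txt"
    sample.write_text("# Netscape HTTP Cookie File\n")
    print(describe_upload(sample))
```

PycURL's form tuple can, to my knowledge, carry these extra values through curl.FORM_FILENAME and curl.FORM_CONTENTTYPE. Check the PycURL documentation before relying on that.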

💡

Learn more: How to use cURL with a proxy

How to handle SSL/TLS certificates in PycURL

There are two common protocols for creating secure connections over the internet: Secure Sockets Layer (SSL) and its successor, Transport Layer Security (TLS).

These protocols secure the connection between the browser and the server by encrypting the data transferred between them. SSL/TLS certificates are issued to websites by an organization called a Certificate Authority (CA), which vouches that a website is owned by the organization it claims to be and is trustworthy.

When we visit a website with an SSL/TLS certificate, our web browser checks that certificate, which acts as a digital stamp of approval verifying the authenticity and trustworthiness of the website.

Because CA certificates change over time, PycURL doesn't ship a certificate bundle of its own.

We can also add certificates from our local directory, but if we don't have one, we can use the certifi Python package.

Before using this package, let’s install it with the following command.

pip3 install certifi

Now, we can use this package in our code.

import pycurl
import certifi
# Create a Curl object
curl = pycurl.Curl()
# Set the URL
curl.setopt(curl.URL, 'https://blog.apify.com/')
# Check the CA certificates through certifi
curl.setopt(curl.CAINFO, certifi.where())
# Perform the request
curl.perform()
# Retrieve the response code
responseCode = curl.getinfo(curl.RESPONSE_CODE)
print(f'Response code: {responseCode}')
# Close the Curl object
curl.close()

In the code above, we're using an additional package, certifi, which supplies a bundle of trusted CA certificates.

The curl.CAINFO option tells cURL which CA bundle to verify the website's certificate against, and certifi.where() returns the path to the bundle provided by the package.

We've covered the basics and some advanced concepts of cURL combined with the simplicity and ease of use of Python. This combination allows you to perform complex web operations with fewer lines of code and provides better control over HTTP requests.

Let’s put everything we’ve learned so far together and look at a script that covers most aspects.

Final code

In this script, we'll also use Beautiful Soup to parse and extract information from the content we get through our requests.

If you’re not familiar with it, you should read more about web scraping with Beautiful Soup.

Install it by entering the command pip3 install beautifulsoup4, then run the script below.

import pycurl
import certifi
from io import BytesIO
# Import BeautifulSoup
from bs4 import BeautifulSoup
# Make a new curl object
curl = pycurl.Curl()
# Make a new Buffer
responseBuffer = BytesIO()
# Set the URL
curl.setopt(curl.URL, "https://blog.apify.com/")
# Check the certificates
curl.setopt(curl.CAINFO, certifi.where())
# Follow redirects
curl.setopt(curl.FOLLOWLOCATION, True)
# Save the response data
curl.setopt(curl.WRITEDATA, responseBuffer)
# Make the request
curl.perform()
# Decode the response
htmlResponse = responseBuffer.getvalue().decode('utf-8')
# Add the information to the parser
soup = BeautifulSoup(htmlResponse, 'html.parser')
# Extract articles
articles = soup.find_all('div', class_='post-info-wrap')  # Use class_ instead of class
# Loop through all the articles
for article in articles:
    title = article.find('h2', class_='post-title').text.strip()  # Use class_ instead of class
    author = article.find('a', class_='post-author').text.strip()  # Use class_ instead of class

    print("Title:", title)
    print("Author:", author)
    print("-" * 25)

curl.close()
responseBuffer.close()

In this script, we're making a GET request to the Apify Blog and using setopt() to verify the SSL/TLS certificate and follow redirects.

In the end, we're using Beautiful Soup to parse the HTML response.

First, we retrieve posts on the website, and then we extract the Title and Author of each post.

How to use cURL in Python with subprocess

We've covered PycURL. Now for a much more concise explanation of the second method of using cURL in Python: Python's subprocess module.

Using the subprocess module allows you to run cURL commands directly within your Python code. This is quick and straightforward for those familiar with the cURL command line.

import subprocess

response = subprocess.run(
    ['curl', 'http://example.com'],
    capture_output=True,
    text=True
)

print(response.stdout)

This approach is useful when you want to use existing cURL commands in a Python script without translating them into another library's syntax.
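If you already have a cURL command as a single string, shlex.split() can turn it into the argument list that subprocess.run() expects, with quoted values kept intact. The sketch below uses hypothetical helper names (curl_args, run_curl) and checks that curl is on the PATH before running anything. Only the argument-splitting step is exercised at the bottom.

```python
import shlex
import shutil
import subprocess


def curl_args(command):
    """Split a cURL command string into a list of arguments."""
    args = shlex.split(command)
    if not args or args[0] != "curl":
        raise ValueError("expected a command starting with 'curl'")
    return args


def run_curl(command, timeout=30.0):
    """Run a cURL command string and return its standard output."""
    args = curl_args(command)
    if shutil.which(args[0]) is None:
        raise FileNotFoundError("curl executable not found on PATH")
    # check=True raises CalledProcessError on a non-zero exit status
    result = subprocess.run(args, capture_output=True, text=True,
                            timeout=timeout, check=True)
    return result.stdout


print(curl_args("curl -s -H 'Accept: application/json' http://example.com"))
# ['curl', '-s', '-H', 'Accept: application/json', 'http://example.com']
```

Passing a list rather than a shell string avoids shell=True, so values such as header contents are not reinterpreted by a shell.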

How to use cURL in Python with Requests

Finally, the third way to use cURL in Python: Requests.

The Requests library is a popular, high-level library for making HTTP requests in Python. Its advantage over the previous methods is that it provides a simple and intuitive interface.

Here are a few examples of performing the cURL operations covered above with the Requests library.

1. Making a GET request with Requests

import requests

response = requests.get('http://example.com')
print(response.text)

2. Making a POST request with Requests

import requests

response = requests.post('http://example.com', data={'key': 'value'})
print(response.text)

3. Adding headers to a request with Requests

import requests

headers = {'Content-Type': 'application/json'}
response = requests.get('http://example.com', headers=headers)
print(response.text)

4. Handling redirects with Requests

Requests handles redirects automatically, but you can disable this feature if needed:

import requests

response = requests.get('http://example.com', allow_redirects=False)
print(response.status_code)

5. Managing cookies with Requests

Requests makes cookie management straightforward:

import requests

response = requests.get('http://example.com')
print(response.cookies)

cookies = {'session_id': '12345'}
response = requests.get('http://example.com', cookies=cookies)
print(response.text)

6. Authenticating with Requests

Requests supports various authentication methods:

import requests
from requests.auth import HTTPBasicAuth

response = requests.get('http://example.com', auth=HTTPBasicAuth('username', 'password'))
print(response.text)
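Under the hood, HTTP Basic authentication is just an Authorization header containing the Base64-encoded "username:password" pair. The standard-library sketch below builds that header by hand for ASCII credentials. The basic_auth_header name is illustrative, and the credentials are the placeholders from the example above.

```python
import base64


def basic_auth_header(username, password):
    """Build the Authorization header line for HTTP Basic auth."""
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return f"Authorization: Basic {token}"


header = basic_auth_header('username', 'password')
print(header)
```

Because the result is a plain 'Name: value' string, it could also be added to the list passed to curl.HTTPHEADER in the PycURL examples. Base64 is an encoding, not encryption, so Basic auth should only be sent over HTTPS.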

7. Handling errors with Requests

Requests raises exceptions for various errors:

import requests

try:
    response = requests.get('http://example.com')
    response.raise_for_status()
except requests.exceptions.HTTPError as err:
    print(f'HTTP error occurred: {err}')
except Exception as err:
    print(f'Other error occurred: {err}')

8. Uploading files with Requests

Uploading files is straightforward with Requests:

import requests

# Open the file in binary mode; the with block closes it after the upload
with open('file.txt', 'rb') as f:
    response = requests.post('http://example.com', files={'file': f})
print(response.text)

9. Handling SSL/TLS certificates with Requests

Requests verifies SSL certificates by default. You can also pass the path of a specific CA bundle through the verify parameter:

import requests

response = requests.get('https://example.com', verify='/path/to/cacert.pem')
print(response.text)

💡

Learn more: How to handle timeouts with Python Requests

Frequently asked questions

What is cURL in Python?

cURL in Python refers to using Python libraries or modules to make network requests similar to the command-line tool cURL. This can be achieved using libraries such as PycURL, subprocess, or Requests, which provide various levels of control and simplicity.

Can you use cURL in Python?

Yes, you can use cURL in Python through several methods. PycURL, subprocess, and the Requests library are popular ways to execute cURL commands or perform equivalent network requests in Python, each offering different features and levels of abstraction.

What is the Python equivalent of cURL?

Python equivalents of cURL are PycURL and the Requests library. Requests offers a simple and user-friendly API for making HTTP requests, similar to cURL. For more complex or specific needs, PycURL provides a direct interface to the cURL library.

What is the difference between Python Requests and cURL?

Python Requests is a high-level library for making HTTP requests, known for its simplicity and ease of use. cURL is a command-line tool and library for transferring data with URLs, offering more low-level control and features. Requests abstracts many complexities of cURL.