How to send HTTP headers with cURL

Learn how manipulating HTTP headers can enhance the data transfer process.

Hi! We're Apify, a full-stack web scraping and browser automation platform. Following our introduction to cURL functions and how to use cURL in Python, this guide explores using cURL to send HTTP headers.

Introduction to HTTP headers and cURL

HTTP headers are vital components of web scraping as they contain crucial metadata about the request and the client making the request. Manipulating these headers allows for mimicking different clients, handling authentication, controlling caching behavior, and navigating through various parts of a website. In this guide, we'll explore the significance of HTTP headers in data transfer and show you how manipulating these headers can enhance the data transfer process.

What are HTTP headers?

HTTP headers are key-value pairs sent in the header section of HTTP request and response messages. They provide information about the client, the server, or the message body, and are often used to pass details about a request or to modify the behavior of a server or client.

The role of cURL in data transfer

When it comes to data transfer, especially in the context of HTTP, cURL's versatility and user-friendly features make it an excellent tool for crafting HTTP requests, setting headers, handling cookies, following redirects, and efficiently extracting data from websites. With support for protocols like HTTP, HTTPS, and FTP, cURL is often the preferred choice for a wide range of scraping tasks.

How HTTP headers are used

HTTP headers are essential for conveying metadata and control information about the HTTP message. They provide details like the type of content being sent, the capabilities of the server or client, and the authentication method.

How HTTP headers are structured

HTTP headers consist of key-value pairs separated by a colon (:) and a space, with each pair representing a different aspect of the request or response. Common headers include User-Agent, Content-Type, Accept, and Cache-Control, each serving specific purposes in the communication between the client and server.

Here is an example: Header-Name: Value. Multiple HTTP headers are separated by line breaks.
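To make the Name: Value structure concrete, here is a minimal shell sketch that splits one header line into its name and value using POSIX parameter expansion. The sample line and the variable names are illustrative assumptions, not something cURL itself requires.

```shell
# Hypothetical sample header line, for illustration only.
header_line='Content-Type: application/json'

# Everything before the first colon is the header name.
header_name=${header_line%%:*}

# Everything after the first colon is the value; drop one leading space if present.
header_value=${header_line#*:}
header_value=${header_value# }

printf 'name=%s\nvalue=%s\n' "$header_name" "$header_value"
```

Splitting on the first colon only matters because header values (URLs or timestamps, for example) can contain colons themselves.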

Common HTTP headers

Some common HTTP headers include:

  • User-Agent: Provides information about the user agent originating the request.
  • Content-Type: Specifies the media type of the message body being sent or received, such as text/html or application/json.
  • Accept: Communicates the media types that the client is willing to receive from the server.
  • Cache-Control: Directs how caching of the response should be handled, specifying directives like max-age and no-cache.

Using cURL to send HTTP headers

cURL provides the -H (short for --header) option to include custom headers in HTTP requests. By specifying the header name and its value, users can send requests with tailored headers to meet specific requirements.

Using the -H or --header option

To send a single HTTP header using cURL, you can use the -H option followed by the header in the format Header-Name: Value.

Sending custom headers

  • Sending the User-Agent header:
curl -H "User-Agent: Mozilla/5.0" https://api.apify.com/v2/users/apify

This command sends an HTTP request to api.apify.com with a custom User-Agent header indicating a Mozilla browser.

  • Sending the Accept header:
curl -H "Accept: application/json" https://api.apify.com/v2/users/apify

This command lets the server know that the client prefers responses in JSON.

Advanced techniques for using headers with cURL

Sending multiple headers in a single command

You can send multiple headers in a single cURL request by using the -H option for each header, like the one below:

curl -H "Content-Type: application/json" -H "Authorization: Bearer oauth_token" https://api.example.com/data

In this example, we sent both Content-Type and Authorization headers.

Viewing response headers from a server

You can use the -I (short for --head) option to view the response headers from the server, providing insights into the server's configuration and response metadata. This option sends a HEAD request and prints only the HTTP response headers, without the body:

curl -I https://api.apify.com/v2/users/apify

Sending empty headers and removing default headers

To send a header with an empty value using cURL, end the header name with a semicolon instead of a colon. cURL then sends the header as Empty-Header: with no value.

curl -H "Empty-Header;" https://api.apify.com/v2/users/apify

A header name followed by a colon and no value does something different: it tells cURL to leave that header out of the request. This is how you remove headers that cURL adds by default.

For example, here’s how you would remove the User-Agent header:

curl -H "User-Agent:" https://api.apify.com/v2/users/apify

Using verbose mode for detailed information

The -v (short for --verbose) option enables verbose mode, which prints detailed information about the request and response, including connection details, the request headers sent, and the response status code and headers, in addition to the response body.

curl -v https://apify.com/store

Saving headers to files

You can save the response headers to a file using the -D or --dump-header option. The headers are written to the specified file, while the response body is still written to standard output as usual.

curl -D headers.txt https://apify.com

By using the -D option followed by a file name, you can save the headers of an HTTP response to a file for later analysis or reference.

In case you do not need to access all the headers but only a part of the header data, a more technical approach is to use the piping feature (available in UNIX systems) with cURL. This is what we'll cover next.

Filtering headers through piping with cURL commands

Piping is a powerful feature of UNIX-based systems, which allows the output from one command to be sent as an input for another command. To save only the date header from the response header (instead of dumping all headers) to the headers.txt file, here’s how you would use piping to achieve that:

curl -I https://apify.com | grep date: >> headers.txt

To save only the content-length header from the response header to a lengths.txt file, you would do this:

curl -I https://apify.com | grep content-length: >> lengths.txt

In these examples:

  • curl -I https://apify.com sends a request to the Apify server and retrieves only the HTTP headers.
  • | sends those headers to the next command (grep).
  • grep filters the output so that it only includes lines containing the specified header name (date: or content-length:).
  • >> appends the filtered output to the specified file (headers.txt or lengths.txt).
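Header names are case-insensitive, and their capitalization in the output can differ between servers and HTTP versions, so a case-sensitive grep can miss a header. The sketch below applies the same kind of filter with grep -i to a canned block of headers. The sample values are made up and stand in for the output of curl -sI, so the snippet runs without a network connection.

```shell
# Made-up sample standing in for the output of: curl -sI https://apify.com
sample_headers='HTTP/1.1 200 OK
Date: Sun, 11 Feb 2024 00:00:00 GMT
Content-Type: text/html
Content-Length: 1234'

# -i ignores case; ^ anchors the match to the start of the line,
# so only the Date header itself matches.
date_line=$(printf '%s\n' "$sample_headers" | grep -i '^date:')

printf '%s\n' "$date_line"
```

Against a live server, the equivalent would be curl -sI https://apify.com | grep -i '^date:' >> headers.txt, where -s hides the progress meter that cURL otherwise prints when its output is piped.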

Use cases for custom headers with cURL

Changing response format (e.g. JSON, XML)

Custom headers can be used to request a specific response format from the server. By setting the Accept header to the desired media type, you can ask the server to respond in that format, such as JSON or XML, provided the server supports it.

Conditional requests using headers like If-Modified-Since

Headers like If-Modified-Since, If-Unmodified-Since, and If-None-Match can be used to make conditional requests, allowing the server to return the full resource or only a status code, depending on the conditions specified in the headers. Below are some examples.

  • If-Modified-Since
curl -H "If-Modified-Since: Sun, 11 Feb 2024 00:00:00 GMT" https://example.com/resource

This command sends a GET request to https://example.com/resource with the If-Modified-Since header, indicating that the server should only send the requested resource if it has been modified since the specified date. This helps reduce unnecessary data transfer.

If the resource has been modified since the specified date, the server returns a status code of 200 OK along with the resource. If it has not been modified, the server returns 304 Not Modified with no body, so cURL outputs no content, indicating that the cached version of the resource can be used.

  • If-None-Match
curl -H 'If-None-Match: "123456789"' https://example.com/data

This command sends a request to https://example.com/data with the If-None-Match header, indicating that the server should only send the requested resource if the provided entity tag (ETag) does not match the current entity on the server.
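In a script, one way to act on the outcome of a conditional request is to capture only the status code. The sketch below is a hedged illustration: fetch_status and describe_status are hypothetical helper names, and the URL and date in the usage note are placeholders. It relies on cURL's -s, -o, and -w '%{http_code}' options.

```shell
# Hypothetical helper: print only the HTTP status code of a conditional GET.
# $1 = URL, $2 = HTTP date for the If-Modified-Since header.
fetch_status() {
  curl -s -o /dev/null -w '%{http_code}' \
    -H "If-Modified-Since: $2" "$1"
}

# Hypothetical helper: map a status code to the action a caller might take.
describe_status() {
  case "$1" in
    200) echo "modified: use the new response body" ;;
    304) echo "not modified: reuse the cached copy" ;;
    *)   echo "unexpected status: $1" ;;
  esac
}

# Offline demonstration with a fixed status code.
describe_status 304
```

With network access, the two helpers could be combined as describe_status "$(fetch_status https://example.com/resource 'Sun, 11 Feb 2024 00:00:00 GMT')".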

Including a Referer header for source tracking

The Referer header provides information about the source of the request, allowing servers to track the origin of incoming requests, which can be useful for analytics and security purposes. It is sent as a request header containing the URL of the referring page. Browsers do not always send it, however: when a link's anchor tag in the HTML source carries the noreferrer attribute, the browser omits the Referer header, so the receiving server cannot see the referring URL. This helps improve user privacy.

curl -H "Referer: https://example.com" https://api.example.com/data

Custom authentication headers (e.g. X-Api-Key)

APIs often require authentication headers to authenticate client requests securely, ensuring access control and data confidentiality. Custom authentication headers like X-Api-Key can be used to authenticate requests to APIs or services.

curl -H "X-Api-Key: your_api_key" https://api.example.com/data
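A common way to avoid typing a key directly into the command is to read it from an environment variable. The following is a minimal sketch under assumptions: the variable name API_KEY and the fallback value your_api_key are placeholders, not anything a particular API requires.

```shell
# Read the key from the environment; fall back to a placeholder for illustration.
API_KEY=${API_KEY:-your_api_key}

# Build the header once so it can be reused across requests.
auth_header="X-Api-Key: $API_KEY"

printf '%s\n' "$auth_header"

# With network access, the header would be passed to curl like this:
# curl -H "$auth_header" https://api.example.com/data
```

Quoting "$auth_header" keeps the header as a single argument even though it contains a space.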

Troubleshooting issues with cURL and HTTP headers

Double-checking header syntax

When encountering issues with headers, it's important to double-check the syntax of the headers, ensuring they are in the correct format and separated by a colon followed by a space. Header names should always follow the syntax rules specified in the HTTP protocol to avoid errors.

Verifying header support and case sensitivity

Some servers may have specific requirements for headers. It's important to verify that the headers being used are supported by the server. Header names are case-insensitive under the HTTP specification, but header values, such as tokens and API keys, are often case-sensitive. Always double-check server documentation to ensure compatibility.

Examining server responses for error diagnosis

If you are encountering issues with headers, analyze the server's responses using cURL's verbose (-v) mode or by inspecting the response headers for error messages or inconsistencies. The server responses may include details about unsupported headers, incorrect headers, or other issues.
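In verbose output, cURL marks the request headers it sent with a leading > and the response headers it received with a leading <. The sketch below filters a canned sample to show only what was sent, which is a quick way to confirm that a custom header actually went out. The sample values are made up and stand in for real curl -v output.

```shell
# Made-up sample standing in for the verbose output of: curl -v ...
verbose_sample='* Connected to api.example.com
> GET /data HTTP/1.1
> Host: api.example.com
> X-Api-Key: your_api_key
>
< HTTP/1.1 401 Unauthorized
< Content-Type: application/json'

# Keep only lines that start with "> " (request line and headers sent by the client).
sent_headers=$(printf '%s\n' "$verbose_sample" | grep '^> ')

printf '%s\n' "$sent_headers"
```

cURL writes verbose output to standard error, so a live run needs a redirect, for example curl -sv -o /dev/null https://api.example.com/data 2>&1 | grep '^> '.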

FAQs

How to add headers in cURL?

You can add headers in cURL using the -H or --header option followed by the header name, colon, and value.

Does cURL automatically add headers?

cURL automatically adds some headers, such as Host, User-Agent, and Accept. However, you will need to manually add most custom headers. If you pass a header with -H that cURL would otherwise add automatically, your value replaces the default.

How to check HTTP headers in cURL?

You can check HTTP headers in cURL by using the -i option, which includes the response headers in the output along with the body, or the -I option, which sends a HEAD request and shows only the response headers.

How is -I different from -v?

-I is specifically for retrieving only the response headers, while -v enables verbose mode to display comprehensive information about the entire HTTP request and response exchange, including the request and response headers.

Can I send empty headers with cURL?

Yes, you can send a header with an empty value by ending the header name with a semicolon instead of a colon, for example -H "Empty-Header;".

How to remove a default header in cURL?

To remove a default header, use the -H option with the header name followed by a colon and no value, for example -H "User-Agent:". cURL then leaves that header out of the request.
