How to scrape LinkedIn jobs with Python

Set up your jobs scraping project and deploy your automated scraper on the Apify platform

The job market is constantly evolving, making it essential to stay updated on new opportunities. However, manually tracking job listings across multiple sites can be time-consuming. So why not automate it?

In this guide, you'll learn how to build a LinkedIn job scraper in Python. While we’ll focus on LinkedIn, the same techniques can be applied to other job sites, such as Indeed.

We'll guide you through every step - from setting up your project to deploying your automated scraper on the Apify platform!



Guide for scraping LinkedIn jobs with Python

Learn how to build a LinkedIn job scraper in Python with a step-by-step guide. In this tutorial, we’ll automatically retrieve job postings from the LinkedIn search page:

LinkedIn search page

This section will walk you through the process of job scraping via the following steps:

  1. Prerequisites and project setup
  2. Analyze LinkedIn jobs structure
  3. Connect to the target API endpoint
  4. Extract job data
  5. Handle pagination
  6. Save data to CSV
  7. Complete code
  8. Deploy to Apify

1. Prerequisites and project setup

To follow along with this tutorial, make sure that you meet the following prerequisites:

  • A basic understanding of how the web works
  • Familiarity with the DOM, HTML, and CSS selectors
  • Knowledge of AJAX and RESTful APIs

Since Python is the primary language for this LinkedIn scraping guide, you'll also need:

  • Python 3 installed on your machine
  • A Python IDE, such as Visual Studio Code or PyCharm

To set up your Python project, start by creating a new folder and initializing a virtual environment inside it:

mkdir linkedin-scraper
cd linkedin-scraper
python -m venv venv

To activate the virtual environment on Windows, run:

venv\Scripts\activate

Equivalently, on Linux/macOS, execute:

source venv/bin/activate

In an activated virtual environment, install the required libraries for LinkedIn jobs scraping:

pip install httpx beautifulsoup4 lxml

These dependencies include:

  • httpx: A fast, modern HTTP client for making web requests
  • beautifulsoup4: A library for parsing HTML and extracting data from HTML documents
  • lxml: The underlying HTML parsing engine used by Beautiful Soup
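
Optionally, you can also pin these dependencies in a requirements.txt file to make the project easier to reproduce (and to deploy later). A minimal version, without version pins, might look like this:

httpx
beautifulsoup4
lxml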

Now, open your project in your IDE and create a scraper.py file to implement the scraping logic.

2. Analyze LinkedIn jobs structure

If you visit the LinkedIn Jobs Search page in your browser while being logged out, you might encounter this login wall page:

LinkedIn login wall page

To bypass that, visit the LinkedIn homepage and click on the "Jobs" button:

Clicking the “Jobs” button

This time, you’ll see the correct job search page:

Reaching the target page

The difference? The first page has this URL:

https://www.linkedin.com/jobs/search

Whereas the second page has the URL below:

https://www.linkedin.com/jobs/search?trk=guest_homepage-basic_guest_nav_menu_jobs&position=1&pageNum=0

The extra query parameters signal to LinkedIn that you are a legitimate visitor, allowing access to the job search results.

Now, try searching for a specific job title like "AI Engineer" in the United States:

Searching for AI engineer job positions in the US

The URL of the page will become:

https://www.linkedin.com/jobs/search?keywords=AI%20Engineer&location=United%20States&trk=public_jobs_jobs-search-bar_search-submit&position=1&pageNum=0

You might be tempted to scrape this page directly, but there’s a smarter approach!

Right-click on the page and select the “Inspect” option. In the DevTools window, reach the “Network” tab and enable the “Fetch/XHR” filter. Now, scroll down the page to trigger the loading of more job listings.

You'll notice that LinkedIn sends an AJAX request to fetch more job postings:

The AJAX request used by LinkedIn to retrieve data

Specifically, the page calls the GET /jobs-guest/jobs/api/seeMoreJobPostings/search endpoint with some parameters:

https://www.linkedin.com/jobs-guest/jobs/api/seeMoreJobPostings/search?keywords=AI%20Engineer&location=United%20States&trk=public_jobs_jobs-search-bar_search-submit&position=1&pageNum=0&start=25

The above endpoint follows this structured format:

https://www.linkedin.com/jobs-guest/jobs/api/seeMoreJobPostings/search?keywords=<keyword>&location=<location>&trk=public_jobs_jobs-search-bar_search-submit&position=1&pageNum=0&start=<start>

Where:

  • <keyword> is the job title you are searching for (e.g., "AI Engineer").
  • <location> is the location for the job search (e.g., "United States").
  • <start> controls pagination (e.g., 0 starts from the first job, 10 skips the first 10 jobs and fetches the next set, etc.)
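
For example, here is a minimal sketch of how you might assemble that endpoint URL in Python from those placeholders (the keyword, location, and start values below are just sample inputs):

from urllib.parse import urlencode

base_url = "https://www.linkedin.com/jobs-guest/jobs/api/seeMoreJobPostings/search"
query = {
    "keywords": "AI Engineer",        # <keyword>
    "location": "United States",      # <location>
    "trk": "public_jobs_jobs-search-bar_search-submit",
    "position": "1",
    "pageNum": "0",
    "start": "0",                     # <start>
}
# Build the full URL with properly encoded query parameters
print(f"{base_url}?{urlencode(query)}")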

If you inspect the API response, you'll see it returns raw HTML with job listings:

The response of the AJAX request

As you can tell, the response consists of <li> elements containing job postings. Essentially, that API can be targeted for web scraping, a technique known as “API scraping.”

If you copy and paste the API URL into your browser, you'll see the raw HTML response:

The rendered HTML of the response from the AJAX request

Each job posting contains:

  • The job title
  • The URL to the specific LinkedIn job page
  • The job location
  • The posting date

By targeting this API instead of scraping the entire webpage, you can extract job listings more efficiently.

3. Connect to the target API endpoint

To scrape LinkedIn job listings, you first need to make an HTTP request to the target search API/page. Start by importing HTTPX in your Python script:

import httpx

Then, remember that the AJAX request made by the page in the browser contains multiple headers. If you don’t include them in your script, LinkedIn may block your request since it won’t appear to be coming from a browser. To avoid blocks, replicate the GET request headers when making the request with HTTPX:

url = "https://www.linkedin.com/jobs-guest/jobs/api/seeMoreJobPostings/search"
params = {
    "keywords": "AI Engineer",
    "location": "United States",
    "trk": "public_jobs_jobs-search-bar_search-submit",
    "start": "0"
}
headers = {
    "accept": "*/*",
    "accept-language": "en-US,en;q=0.9",
    "priority": "u=1, i",
    "sec-ch-ua": '"Chromium";v="134", "Not:A-Brand";v="24", "Google Chrome";v="134"',
    "sec-ch-ua-mobile": "?0",
    "sec-ch-ua-platform": '"Windows"',
    "sec-fetch-dest": "empty",
    "sec-fetch-mode": "cors",
    "sec-fetch-site": "same-origin"
}
async with httpx.AsyncClient() as client:
    response = await client.get(url, headers=headers, params=params)

Note that the start parameter has been set to "0" to retrieve the job posting elements from the beginning.

Instead of using the API, you could directly target the LinkedIn job search page:

url = "https://www.linkedin.com/jobs/search?keywords=AI%20Engineer&location=United%20States&trk=public_jobs_jobs-search-bar_search-submit&position=1&pageNum=0"
# same code as above...

As mentioned earlier, the API returns only the essential job listings in raw HTML, making data parsing much easier. In contrast, the full job search page contains many additional elements. Also, targeting the search webpage makes it harder to scrape job listings across multiple pages. For these reasons, the API approach is recommended.

Right now, your scraper will contain:

import asyncio
import httpx

async def main() -> None:
    url = "https://www.linkedin.com/jobs-guest/jobs/api/seeMoreJobPostings/search"
    params = {
        "keywords": "AI Engineer",
        "location": "United States",
        "trk": "public_jobs_jobs-search-bar_search-submit",
        "start": "0"
    }
    headers = {
        "accept": "*/*",
        "accept-language": "en-US,en;q=0.9",
        "priority": "u=1, i",
        "sec-ch-ua": '"Chromium";v="134", "Not:A-Brand";v="24", "Google Chrome";v="134"',
        "sec-ch-ua-mobile": "?0",
        "sec-ch-ua-platform": '"Windows"',
        "sec-fetch-dest": "empty",
        "sec-fetch-mode": "cors",
        "sec-fetch-site": "same-origin"
    }
    async with httpx.AsyncClient() as client:
        # Perform a GET HTTP request to the target API
        response = await client.get(url, headers=headers, params=params)
        # Print the HTML response
        print(response.content)

# Run the async function
asyncio.run(main())

If you execute the script, you’ll get raw HTML similar to the content below:

<li>
  <div class="base-card relative w-full hover:no-underline focus:no-underline base-card--link base-search-card base-search-card--link job-search-card"
       data-entity-urn="urn:li:jobPosting:4172359372"
       data-impression-id="jobs-search-result-0"
       data-reference-id="UwAVpEJKGozQZChiRkVAqA=="
       data-tracking-id="IqYOsCsaz16pk9P++5/Qlg=="
       data-column="1" data-row="26">
    <a class="base-card__full-link absolute top-0 right-0 bottom-0 left-0 p-0 z-[2]"
       href="https://www.linkedin.com/jobs/view/machine-learning-engineer-at-tagup-inc-4172359372?position=1&amp;pageNum=2&amp;refId=UwAVpEJKGozQZChiRkVAqA%3D%3D&amp;trackingId=IqYOsCsaz16pk9P%2B%2B5%2FQlg%3D%3D"
       data-tracking-control-name="public_jobs_jserp-result_search-card"
       data-tracking-client-ingraph
       data-tracking-will-navigate>
      <span class="sr-only">Machine Learning Engineer</span>
    </a>
    <div class="search-entity-media">
      <img class="artdeco-entity-image artdeco-entity-image--square-4"
           data-delayed-url="https://media.licdn.com/dms/image/v2/D4E0BAQGukdIwLnnfsA/company-logo_100_100/company-logo_100_100/0/1737909623270/tagup_logo?e=2147483647&amp;v=beta&amp;t=r41szu5-PE_bSt4ehtf5wHPFSpw5kF58dLqbw-WClE4"
           data-ghost-classes="artdeco-entity-image--ghost"
           data-ghost-url="https://static.licdn.com/aero-v1/sc/h/6puxblwmhnodu6fjircz4dn4h"
           alt>
    </div>

    <div class="base-search-card__info">
      <h3 class="base-search-card__title">Machine Learning Engineer</h3>

      <h4 class="base-search-card__subtitle">
        <a class="hidden-nested-link"
           data-tracking-client-ingraph
           data-tracking-control-name="public_jobs_jserp-result_job-search-card-subtitle"
           data-tracking-will-navigate
           href="https://www.linkedin.com/company/tagup?trk=public_jobs_jserp-result_job-search-card-subtitle">
          Tagup, Inc.
        </a>
      </h4>

      <div class="base-search-card__metadata">
        <span class="job-search-card__location">New York, NY</span>

        <div class="job-posting-benefits text-sm">
          <icon class="job-posting-benefits__icon"
                data-delayed-url="https://static.licdn.com/aero-v1/sc/h/8zmuwb93gzlb935fk4ao4z779"
                data-svg-class-name="job-posting-benefits__icon-svg"></icon>
          <span class="job-posting-benefits__text">Be an early applicant</span>
        </div>

        <time class="job-search-card__listdate" datetime="2025-03-03">1 week ago</time>
      </div>
    </div>
  </div>
</li>
<li>
 <!-- omitted for brevity... -->
</li>
 <!-- other <li> elements... -->

That is the raw HTML structure of a job listing returned by the LinkedIn jobs search API.

4. Extract job data

Now that you have the raw HTML, feed it to Beautiful Soup, which will parse it using lxml. First, import Beautiful Soup:

from bs4 import BeautifulSoup

Then, pass the raw HTML to the BeautifulSoup constructor:

soup = BeautifulSoup(response.content, "lxml")

To define the HTML parsing logic, you must get familiar with the structure of the HTML snippet returned by the API. Copy the API URL into your browser and visit it. Next, inspect the resulting HTML with DevTools:

The HTML of the job posting element

Here, you can see that each <li> contains:

  • The job post URL in an a[data-tracking-control-name="public_jobs_jserp-result_search-card"] element
  • The job title in an h3.base-search-card__title HTML element
  • The company name in an h4.base-search-card__subtitle node
  • The publication date in a time.job-search-card__listdate element

To scrape LinkedIn job postings, you first need a structure to store the scraped data. Since the page contains multiple job listings, an array is ideal:

job_postings = []

Then, select all <li> elements and prepare to iterate over them:

job_li_elements = soup.select("li")

for job_li_element in job_li_elements:
    # Scraping logic...

select() from Beautiful Soup returns all HTML elements that match the specified CSS selector.

Next, scrape data from each <li> element by selecting the elements identified earlier and the content of interest from them:

link_element = job_li_element.select_one('a[data-tracking-control-name="public_jobs_jserp-result_search-card"]')
link = link_element["href"] if link_element else None

title_element = job_li_element.select_one("h3.base-search-card__title")
title = title_element.text.strip() if title_element else None

company_element = job_li_element.select_one("h4.base-search-card__subtitle")
company = company_element.text.strip() if company_element else None

publication_date_element = job_li_element.select_one("time.job-search-card__listdate")
publication_date = publication_date_element["datetime"] if publication_date_element else None

select_one() works similarly to select(), but it returns only the first element that matches the specified CSS selector. The text attribute retrieves the text content within the HTML element, while square bracket syntax is used to access HTML attributes. The strip() method is used to remove any extra spaces from the text content.

If you're not familiar with the syntax above, read our guide on web scraping with Beautiful Soup.

Keep in mind that not all LinkedIn job posting HTML elements contain the same data. For this reason, some HTML elements you're looking for in the code may not be part of the <li> element. In that case, select_one() will return None. To prevent errors, you can use this syntax:

<variable> = <operation> if <html_element> else None

This ensures that <operation> is performed only if <html_element> is not None. Otherwise, it sets <variable> to None.
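
As an illustration, you could wrap this pattern in a small helper function so you don't repeat the conditional for every field. This helper is hypothetical and not part of the tutorial's code:

def extract(parent, selector, attribute=None):
    # Return the attribute value (or stripped text) of the first matching element,
    # or None if the selector matches nothing
    element = parent.select_one(selector)
    if element is None:
        return None
    return element[attribute] if attribute else element.text.strip()

# Example usage inside the loop:
# title = extract(job_li_element, "h3.base-search-card__title")
# link = extract(job_li_element, 'a[data-tracking-control-name="public_jobs_jserp-result_search-card"]', "href")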

Finally, use the scraped data to populate a new job posting object and add it to the list:

job_posting = {
    "url": link,
    "title": title,
    "company": company,
    "publication_date": publication_date
}
job_postings.append(job_posting)

At the end of this step, job_postings will contain something like:

[
  {'url': 'https://www.linkedin.com/jobs/view/machine-learning-engineer-at-netflix-4118831761?position=1&pageNum=0&refId=Kd5%2F7JLKUdxzWL%2FdaPidwA%3D%3D&trackingId=4iC4PH5kiqfIRN%2B0D9s64Q%3D%3D', 'title': 'Machine Learning Engineer', 'company': 'Netflix', 'publication_date': '2025-03-05'},
  # omitted for brevity...
  {'url': 'https://www.linkedin.com/jobs/view/machine-learning-engineer-at-tagup-inc-4172355933?position=10&pageNum=0&refId=Kd5%2F7JLKUdxzWL%2FdaPidwA%3D%3D&trackingId=lff%2BDfyg%2B0C3z9cQAFDxIg%3D%3D', 'title': 'Machine Learning Engineer', 'company': 'Tagup, Inc.', 'publication_date': '2025-03-03'}
]

5. Handle pagination

Don't forget that the LinkedIn job search endpoint has a start parameter that allows you to handle pagination. By default, the API returns 10 job postings at a time. So, you can access the second page by setting start to 10, the third page by setting it to 20, and so on.

To implement pagination, you can write a simple for loop as shown below:

# url = ...
# params = ...
# headers = ...

# The number of pagination pages to scrape
pages = 3
# Iterate over each pagination page
for page in range(pages):
    async with httpx.AsyncClient() as client:
        # Set the right pagination argument
        params["start"] = str(page * 10)
        response = await client.get(url, headers=headers, params=params)

The pages variable is set to 3, meaning the script will scrape 3 pages of job postings.

Note that to make the data storing logic work, you must move job_postings outside of the for loop:

job_postings = []
# for loop...

This way, job_postings will store data across all pages, rather than being reset for each page.

6. Save data to CSV

You now have the scraped LinkedIn job postings in a Python array. To make the data easier to share and analyze, export it to a CSV file. Note that you don’t need any new dependencies for this task, as the Python Standard Library provides everything you need.

First, import csv from the Python Standard Library:

import csv

Then, use it to export job_postings to a file called job_postings.csv as follows:

with open("job_postings.csv", mode="w", newline="", encoding="utf-8") as file:
    # Initialize the CSV writer
    writer = csv.DictWriter(file, fieldnames=["url", "title", "company", "publication_date"])
    # Write the CSV header
    writer.writeheader()
    # Populate the CSV with the data in the dictionary array
    writer.writerows(job_postings)

writerows() from csv.DictWriter will populate the output file with your scraped data.
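
If you prefer JSON over CSV, the standard library's json module works just as well. This optional alternative is only a sketch, not part of the main flow:

import json

# Optional: dump the same list of dictionaries to a JSON file
with open("job_postings.json", mode="w", encoding="utf-8") as file:
    json.dump(job_postings, file, indent=2)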

7. Complete code

This is the final code of your Python LinkedIn jobs scraper:

import asyncio
import httpx
from bs4 import BeautifulSoup
import csv

async def main() -> None:
    url = "https://www.linkedin.com/jobs-guest/jobs/api/seeMoreJobPostings/search"
    params = {
        "keywords": "AI Engineer",
        "location": "United States",
        "trk": "public_jobs_jobs-search-bar_search-submit",
        "start": "0"
    }
    headers = {
        "accept": "*/*",
        "accept-language": "en-US,en;q=0.9",
        "priority": "u=1, i",
        "sec-ch-ua": '"Chromium";v="134", "Not:A-Brand";v="24", "Google Chrome";v="134"',
        "sec-ch-ua-mobile": "?0",
        "sec-ch-ua-platform": '"Windows"',
        "sec-fetch-dest": "empty",
        "sec-fetch-mode": "cors",
        "sec-fetch-site": "same-origin",
    }

    # Where to store the scraped data
    job_postings = []

    # The number of pagination pages to scrape
    pages = 3
    # Iterate over each pagination page
    for page in range(pages):
        async with httpx.AsyncClient() as client:
            # Set the right pagination argument
            params["start"] = str(page * 10)

            # Perform a GET HTTP request to the target API
            response = await client.get(url, headers=headers, params=params)

        # Parse the HTML content returned by API
        soup = BeautifulSoup(response.content, "lxml")

        # Select all <li> job posting elements
        job_li_elements = soup.select("li")

        # Iterate over them and scrape data from each of them
        for job_li_element in job_li_elements:
            # Scraping logic
            link_element = job_li_element.select_one('a[data-tracking-control-name="public_jobs_jserp-result_search-card"]')
            link = link_element["href"] if link_element else None

            title_element = job_li_element.select_one("h3.base-search-card__title")
            title = title_element.text.strip() if title_element else None

            company_element = job_li_element.select_one("h4.base-search-card__subtitle")
            company = company_element.text.strip() if company_element else None

            publication_date_element = job_li_element.select_one("time.job-search-card__listdate")
            publication_date = publication_date_element["datetime"] if publication_date_element else None

            # Populate a new job posting with the scraped data
            job_posting = {
                "url": link,
                "title": title,
                "company": company,
                "publication_date": publication_date
            }
            # Append it to the list
            job_postings.append(job_posting)

    # Export the scraped data to CSV
    with open("job_postings.csv", mode="w", newline="", encoding="utf-8") as file:
        # Initialize the CSV writer
        writer = csv.DictWriter(file, fieldnames=["url", "title", "company", "publication_date"])
        # Write the CSV header
        writer.writeheader()
        # Populate the CSV with the data in the dictionary array
        writer.writerows(job_postings)

# Run the async function
asyncio.run(main())

Execute the script with the following command:

python scraper.py

At the end of the execution, a job_postings.csv file will appear in your project directory. Open it, and you will see something like this:

The output CSV file

Great! Your LinkedIn scraping script is working like a charm.

8. Deploy to Apify

Suppose you want to deploy your LinkedIn job scraper to Apify to execute it in the cloud. The prerequisites for using Apify are:

  • A free Apify account

To initialize a new LinkedIn web scraping project on Apify:

  1. Log in
  2. Reach the Console

Under the "Actors" dropdown, select "Development," and press the “Develop new” button:

Pressing the “Develop new” button

Next, select one of the many Apify templates. In this case, choose the "Start with Python" template, which sets up a Python Actor using HTTPX and Beautiful Soup:

Selecting the “Start with Python” template

Review the starter project code and click "Use this template" to fork it:

Forking the template

You will be redirected to an online IDE:

The code of the template

Here you can customize your Actor, writing your LinkedIn scraping logic directly in the cloud.

Now, instead of hardcoding the LinkedIn job search API parameters directly in the code, it's better to configure your code so it can read them from the Apify input configuration. This way, you can programmatically adapt your scraping script to work with different job searches.

To make your Apify Actor configurable, open input_schema.json in the web IDE and write this JSON content:

{
    "title": "Scrape data from the LinkedIn job search pages",
    "type": "object",
    "schemaVersion": 1,
    "properties": {
        "keyword": {
            "title": "Job search keyword",
            "type": "string",
            "description": "The LinkedIn job search keyword argument",
            "editor": "textfield",
            "prefill": "AI Engineer"
        },
        "location": {
            "title": "Job search keyword",
            "type": "string",
            "description": "The LinkedIn job search location argument",
            "editor": "textfield",
            "prefill": "United States"
        },
        "pages": {
            "title": "Pagination pages",
            "type": "integer",
            "description": "Number of pagination pages to scraped",
            "editor": "number",
            "prefill": 3,
            "default": 1
        }
    },
    "required": ["keyword", "location"]
}

This defines the following three arguments:

  • keyword: The job search keyword (e.g., "AI Engineer").
  • location: The job search location (e.g., "United States").
  • pages: The number of pagination pages to scrape (e.g., 3).

Make sure to enable the autosave feature:

Enabling the autosave feature

In main.py, you can read those arguments to populate the params object and the pages variable as shown below:

actor_input = await Actor.get_input()

# ...
params = {
    "keywords": actor_input.get("keyword"),
    "location": actor_input.get("location"),
    "trk": "public_jobs_jobs-search-bar_search-submit",
    "start": "0"
}

# ...

pages = actor_input.get("pages")

Actor.get_input() loads the Actor input, whose fields you can then access by name with actor_input.get().

Put it all together and you’ll get the following Apify Actor code:

from apify import Actor
import httpx
from bs4 import BeautifulSoup

async def main() -> None:
    async with Actor:
        # Access the Apify input data
        actor_input = await Actor.get_input()
        url = "https://www.linkedin.com/jobs-guest/jobs/api/seeMoreJobPostings/search"
        params = {
            "keywords": actor_input.get("keyword"),
            "location": actor_input.get("location"),
            "trk": "public_jobs_jobs-search-bar_search-submit",
            "start": "0"
        }
        headers = {
            "accept": "*/*",
            "accept-language": "en-US,en;q=0.9",
            "priority": "u=1, i",
            "sec-ch-ua": '"Chromium";v="134", "Not:A-Brand";v="24", "Google Chrome";v="134"',
            "sec-ch-ua-mobile": "?0",
            "sec-ch-ua-platform": '"Windows"',
            "sec-fetch-dest": "empty",
            "sec-fetch-mode": "cors",
            "sec-fetch-site": "same-origin",
        }

        # The number of pagination pages to scrape
        pages = actor_input.get("pages")

        # Iterate over each pagination page
        for page in range(pages):
            async with httpx.AsyncClient() as client:
                # Set the right pagination argument
                params["start"] = str(page * 10)

                # Perform a GET HTTP request to the target API
                response = await client.get(url, headers=headers, params=params)

            # Parse the HTML content returned by API
            soup = BeautifulSoup(response.content, "lxml")

            # Select all <li> job posting elements
            job_li_elements = soup.select("li")

            # Iterate over them and scrape data from each of them
            for job_li_element in job_li_elements:
                # Scraping logic
                link_element = job_li_element.select_one('a[data-tracking-control-name="public_jobs_jserp-result_search-card"]')
                link = link_element["href"] if link_element else None
                title_element = job_li_element.select_one("h3.base-search-card__title")
                title = title_element.text.strip() if title_element else None
                company_element = job_li_element.select_one("h4.base-search-card__subtitle")
                company = company_element.text.strip() if company_element else None
                publication_date_element = job_li_element.select_one("time.job-search-card__listdate")
                publication_date = publication_date_element["datetime"] if publication_date_element else None

                # Populate a new job posting with the scraped data
                job_posting = {
                    "url": link,
                    "title": title,
                    "company": company,
                    "publication_date": publication_date
                }

                # Register the scraped data to Apify
                await Actor.push_data(job_posting)

Note that the CSV export logic is no longer needed, because storing the results is now handled by the push_data() method:

await Actor.push_data(job_posting)

This allows you to retrieve the scraped data via the API or export it in multiple formats supported by the Apify dashboard.
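
For example, once a run finishes, you could download the dataset items programmatically via the Apify API. This is a minimal sketch; the dataset ID and API token below are placeholders you'd replace with your own values:

import httpx

DATASET_ID = "<YOUR_DATASET_ID>"
API_TOKEN = "<YOUR_APIFY_API_TOKEN>"

# Fetch the dataset items in CSV format and save them locally
response = httpx.get(
    f"https://api.apify.com/v2/datasets/{DATASET_ID}/items",
    params={"format": "csv", "token": API_TOKEN},
)
with open("job_postings.csv", "wb") as file:
    file.write(response.content)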

Now, click the “Save & Build” button:

Pressing the “Save & Build” button

Visit the “Input” tab and fill out the input manually as shown below:

Filling out the Actor input

Press “Save & Start” to launch the LinkedIn scraper. The result will look like this:

The data returned by the Apify Actor

Move to the “Storage” card:

Exporting the data to a file in one of the supported formats

Here, you can export the data in multiple formats, including JSON, CSV, XML, Excel, HTML Table, RSS, and JSONL.

Et voilà! You’ve successfully performed LinkedIn jobs web scraping on Apify.

Next steps

This tutorial has covered the basics of web scraping on LinkedIn. To elevate your script and make it more powerful, consider implementing these advanced techniques:

  • Automated interaction: Use Python web browser automation to mimic real user behavior, reducing the likelihood of your script getting blocked.
  • Specific job data scraping: Navigate to individual job posting pages using their URLs to extract more detailed data beyond what's available on the main job listing page.
  • Proxy management: Integrate proxies into your Actor to avoid IP bans and blocks (see the sketch after this list). Discover more in the official documentation.
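
Regarding proxy management, here is a minimal sketch of routing HTTPX requests through a proxy server. The proxy URL is a placeholder, and depending on your httpx version the keyword argument is proxy (newer releases) or proxies (older ones):

import asyncio
import httpx

# Placeholder proxy URL: replace it with your own proxy credentials
PROXY_URL = "http://username:password@proxy.example.com:8000"

async def fetch_with_proxy(url: str) -> str:
    # Route the request through the proxy (use proxies=PROXY_URL on older httpx versions)
    async with httpx.AsyncClient(proxy=PROXY_URL) as client:
        response = await client.get(url)
        return response.text

# Example: asyncio.run(fetch_with_proxy("https://www.linkedin.com/jobs-guest/jobs/api/seeMoreJobPostings/search?keywords=AI%20Engineer&location=United%20States&start=0"))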

Use Apify’s ready-made LinkedIn Jobs Scraper

Scraping jobs from LinkedIn isn’t always as simple as we’ve shown in this article. As long as you stick to basic, small-scale extraction like in this tutorial, everything is a breeze. However, if you aim to retrieve data at scale, you'll have to deal with anti-scraping measures like IP bans, browser fingerprinting, CAPTCHAs, and more.

The easiest way to overcome these obstacles is by using a pre-built LinkedIn scraper that handles everything for you. Some benefits of this approach include:

  • No coding required: Start scraping instantly.
  • Block bypass: Avoid IP bans and CAPTCHAs automatically.
  • API access: Easily integrate scraped data into your applications.
  • Scalability: Handle large volumes of job listings effortlessly.
  • Regular updates: Stay compliant with LinkedIn’s latest changes.
  • Reliable data extraction: Minimize errors and inconsistencies.

Apify offers over 4,000 Actors for various websites, including nearly 200 specifically for LinkedIn.

If you're interested in scraping LinkedIn job postings without building a scraper yourself, simply visit Apify Store and search for the "linkedin" keyword:

Selecting the “LinkedIn Jobs Scraper” Actor

Select the "🔥 LinkedIn Jobs Scraper" Actor, then click "Try for free" on its public page:

Try LinkedIn Jobs Scraper for free

The Actor will be added to your personal Apify dashboard. Configure it as needed, then click "Start" to rent the Actor:

Launching the Actor

Press “Start” again, wait for the Actor to finish, and enjoy your LinkedIn job data:

The data returned by the Actor

And that’s it! You’ve successfully scraped job data from LinkedIn with just a few clicks.

Why scrape LinkedIn jobs?

Having access to fresh LinkedIn job data through web scraping is valuable for multiple use cases:

  • Job market analysis: Identify industry trends and hiring patterns.
  • Salary trend monitoring: Compare salaries across roles and locations.
  • Competitor analysis: Track hiring activity and job openings from rival companies.
  • Skill demand insights: Analyze which skills are most sought after in various industries.
  • Geographic workforce trends: Monitor job availability and demand across different regions.

In particular, extracted LinkedIn data points include job titles, company names, locations, salary estimates, posting dates, and more. This information benefits recruiters, job seekers, HR professionals, market analysts, business strategists, and many other professionals. For example, LinkedIn data supports data-driven decisions in hiring and workforce planning.

In short, by automating LinkedIn job data collection, you can uncover hiring trends, compare salaries across industries, and enhance job recommendation systems.

Conclusion

In this tutorial, you used Beautiful Soup and HTTPX to build a LinkedIn web scraper to automate the retrieval of job postings. In particular, you extracted job data from LinkedIn and deployed the scraper on Apify.

This project showed how Apify enables efficient, scalable job scraping while reducing development time. You can explore other templates and SDKs to expand your web scraping and automation capabilities.

As demonstrated in the blog post, using a pre-made LinkedIn Actor is the recommended approach to streamline job data retrieval.

Frequently asked questions

Can you scrape LinkedIn jobs?

Yes, you can scrape LinkedIn jobs using a simple Python scraping script by leveraging LinkedIn's API or parsing its HTML pages. For ethical scraping and avoiding bans, make sure to comply with LinkedIn’s terms of service and respect its robots.txt file.

Is it legal to scrape LinkedIn jobs?

Yes, it is legal to scrape LinkedIn jobs as long as you do not scrape sensitive data behind login walls. To avoid legal issues and potential violations of LinkedIn's terms, it's recommended to perform scraping without logging into your account.

How to scrape LinkedIn jobs?

To scrape LinkedIn jobs, you can use the GET /jobs-guest/jobs/api/seeMoreJobPostings/search API to fetch job listings. From the returned HTML, you can then extract job results.
