Fortunately, as more and more mobile apps supplement or even replace traditional websites, a new, highly efficient way to collect data is emerging: tapping directly into mobile app APIs, also known as mobile API scraping. With this technique, you can get a list of participants from your favorite meetup app, automate a food delivery order, or extract a list of hotels and their prices from a hotel booking app.
The great thing about mobile APIs is that they are concise and efficient, and they typically employ far fewer anti-scraping protections than websites. Many mobile apps do not require a login to fetch and show data, and rely only on IP address rate limiting to block bots, which can be circumvented using proxies. In other words, mobile APIs are often a very efficient way to get at an app's data.
In the following sections, you’ll learn how to get started scraping mobile APIs. All the steps are demonstrated on an Apple iPhone with iOS 12.3 and a MacBook Pro with macOS 10.14.1. If you are using Windows or Android, or have a different OS version, you might need slightly different tooling, but the principles are the same.
1) Set up an HTTP proxy server on a computer
To intercept requests that are going out of the phone app to an external backend API, we’ll need to set up an HTTP proxy on a computer to which the phone will connect. Most mobile apps use HTTPS encryption to communicate with their backend APIs, so we’ll need to effectively perform a man-in-the-middle attack on your own phone to be able to intercept the traffic. No worries, this is very safe 😎 There are many tools that can do this job, but here we’ll use mitmproxy. It can be easily installed using Homebrew by running the following command in the macOS terminal:
$ brew install mitmproxy
You can check whether mitmproxy was correctly installed by running:
If everything goes well, you should see a window like the following. Don't worry if you don't see any requests yet.
2) Connect your phone to the HTTP proxy
Now it’s time to set up the phone. First of all, ensure your phone is connected to the same Wi-Fi network as your computer so that they can see each other. Then on your iOS device, go to Settings → Wi-Fi and click on the current Wi-Fi network:
Then click Configure Proxy at the very bottom:
Select the Manual option so that you can enter the Server and Port properties:
The Server property is the internal IP address of your computer, where the HTTP proxy is running. You can obtain it by running the following command in your macOS terminal:
Look for the en0 network adapter, which is usually assigned an internal IP address.
The Port property indicates the TCP port number where mitmproxy is running; by default it is 8080.
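If you would rather look up the Server value from a script, the following Python sketch asks the operating system which local address it would use for outbound traffic. The function names and the probe address 8.8.8.8 are choices made for this example, not part of mitmproxy. On a machine with a VPN or several network adapters, the result may not be the Wi-Fi address, so treat it as a hint and compare it with what your network settings show.

```python
import ipaddress
import socket


def local_ip(probe_host: str = "8.8.8.8", probe_port: int = 80) -> str:
    """Return the IPv4 address this machine would use for outbound traffic.

    Connecting a UDP socket sends no packets; it only asks the OS which
    local interface it would route through. Falls back to 127.0.0.1 when
    no route is available (for example, when the computer is offline).
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.connect((probe_host, probe_port))
        return sock.getsockname()[0]
    except OSError:
        return "127.0.0.1"
    finally:
        sock.close()


def is_private(address: str) -> bool:
    """Return True if the address belongs to a private (internal) range."""
    return ipaddress.ip_address(address).is_private


if __name__ == "__main__":
    ip = local_ip()
    print("Server:", ip, "(private)" if is_private(ip) else "(not private)")
    print("Port: 8080")  # mitmproxy's default port, as noted above
```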
Once you enter the Server and Port settings, you should start seeing requests going through the proxy in the mitmproxy app on your computer. If you don’t see any requests, make sure your phone can access your computer, e.g. that there is no firewall blocking the access.
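When no requests show up, a first check is whether anything is listening on the proxy port at all. The sketch below is a generic TCP connection test written for this tutorial; `port_is_open` is a name invented here. Run on the computer itself, it only confirms that mitmproxy is listening. It does not prove that the phone can get through a firewall, so for that you would run an equivalent check from another device on the same Wi-Fi network.

```python
import socket


def port_is_open(host: str, port: int = 8080, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # Replace 127.0.0.1 with your computer's internal IP address
    # when running this from a different machine.
    print("proxy reachable:", port_is_open("127.0.0.1", 8080))
```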
While you can see the requests flowing through the proxy, most likely they use the HTTPS protocol with SSL/TLS encryption, so you won’t be able to see the content of the requests. We will fix that in the next step.
3) Install a self-signed root SSL/TLS certificate on the phone
Now, on your phone, open http://mitm.it/ in a browser. You should see a list of operating systems. Just click on Apple:
The certificate should download to your phone.
To install the certificate, open Settings on your iPhone. Below your account name, there should be a row labeled Profile Downloaded:
Click on it and follow the installation process. You can find more details about how to install the downloaded profile here.
In the next step, you’ll need to enable it. On your iOS device, go to Settings → General → About → Certificate Trust Settings:
WORD OF CAUTION: Once you complete the following step, all traffic from your phone can be intercepted and monitored from the computer running mitmproxy, including login credentials to all your apps. Only do this if you trust the computer, and close all the mobile apps that you don’t want to be monitored. Once you’re done, make sure to disable the certificate again. Never share the logs from mitmproxy with untrusted parties!
Now the only thing you have to do is to enable the mitmproxy certificate:
4) Scrape the mobile app API
Now it’s time for the exciting hacking part! First, install and open the Swiggy app on your phone. When you open the app, you should see its requests, now decrypted, flowing through the mitmproxy tool:
From this initial screen, you can already get a good idea of what API endpoints are used by the mobile app. You can probably see a few REST endpoints. Just select one and view its details. You can now see all the necessary details to replicate that request.
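Browsing the list by hand works fine, but it can help to have the proxy keep a running log of API calls. Below is a minimal sketch of a mitmproxy addon that records every response whose content type mentions JSON. The class name `ApiLogger` and the JSON-only filter are assumptions made for this example; an app whose API uses another format would need a different filter.

```python
"""Hypothetical mitmproxy addon: log JSON API responses seen by the proxy."""


class ApiLogger:
    def __init__(self):
        self.seen = []  # (method, url, status) tuples, in arrival order

    def response(self, flow):
        # mitmproxy calls this hook once a full response has been received.
        content_type = flow.response.headers.get("content-type", "")
        if "json" not in content_type.lower():
            return  # skip images, HTML, fonts and other non-API traffic
        entry = (
            flow.request.method,
            flow.request.pretty_url,
            flow.response.status_code,
        )
        self.seen.append(entry)
        print("%s %s -> %d" % entry)


# mitmproxy looks for a module-level list named `addons`.
addons = [ApiLogger()]
```

Saved as, say, api_logger.py, the script can be loaded with `mitmdump -s api_logger.py`, which runs the same proxy without the interactive window.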
Now comes a bit of trial and error where you’ll be reverse-engineering the mobile API. You have to figure out what specific API endpoints do, which query parameters and HTTP headers are required by them and which can be omitted. In case of this one all you need to do is to pass the correct
And that's it: you can now get all restaurants for a particular location!
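The trial-and-error step can be partly automated. The sketch below drops one query parameter (or header) at a time and records which removals break the request. Everything in it is illustrative: `required_keys`, `build_url` and the `still_works` callback are names invented here, and you would supply your own `still_works` function that replays the captured request and decides whether the response still looks right. Because it removes only one key at a time, it will not detect keys that can stand in for each other.

```python
from typing import Callable, Dict, List
from urllib.parse import urlencode


def build_url(base_url: str, params: Dict[str, str]) -> str:
    """Append URL-encoded query parameters to a base URL."""
    if not params:
        return base_url
    return base_url + "?" + urlencode(params)


def required_keys(
    full: Dict[str, str],
    still_works: Callable[[Dict[str, str]], bool],
) -> List[str]:
    """Return the keys whose removal makes the request fail.

    `full` holds the parameters (or headers) captured in mitmproxy.
    `still_works` is your own function: it sends the request with the
    given subset and returns True if the response is still usable.
    """
    required = []
    for key in full:
        trimmed = {k: v for k, v in full.items() if k != key}
        if not still_works(trimmed):
            required.append(key)
    return required
```

A `still_works` implementation would typically call `build_url` with the trimmed parameters, fetch the result, and check both the status code and that the body still contains the data you expect. Keep a pause between attempts so that the experiment itself does not trip the API's rate limiting.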
If you get stuck, find something unclear, or run into any other problem, please let us know so that we can improve this tutorial. Any feedback would be greatly appreciated!
You can reach us at https://twitter.com/apify