September updates from Apify

Check out our new piece on the legality of web scraping, our tech post on MongoDB performance, 2 use cases of scraping real estate for research, and a couple of small how-to tutorials on scraping Google and Instagram.


Hello there, web scraping enthusiast! Our September edition of updates offers you a great read on the legality of web scraping, 2 excellent use cases of web scraping for real estate research, a couple of handy guides on how to scrape Google and Instagram and set up a watchdog, as well as the first of many upcoming technical blog posts.

There are many nuances to the legal side of web scraping and we're here to clear them up. Read our blog post about the legality of web scraping if you've ever wondered whether it's on the dodgy side of web activities (spoiler: it's not). You can also expect a video on this topic next month. Stay tuned and keep it legal!

legality of web scraping

How to tune MongoDB performance

For Apify, MongoDB is a crucial element that can affect both UX and our platform's performance. In early 2021, our users started reporting degraded performance of our UI. The cause? Over-utilized drives in our MongoDB cluster. It was time to take action and improve our overall usage of MongoDB.

Read about some of the techniques and MongoDB Cloud features we've used to debug performance issues and expose sub-optimal queries.

Improving MongoDB performance

Scraping real estate for research

Looking at real estate prices may hurt, but not if you're armed with real data. Check out how one American student used our Zillow Scraper to analyze the real estate market in white picket fence areas of Boston, MA. Or see for yourself how easy it is to compare prices of thousands of cottages all over Czechia, just like these ladies from Czechitas did. This is some pretty impressive research done via data extraction - try it yourself!

real estate research with web scraping

How to scrape Instagram

You can stop scrolling now - with our tutorial and a newly polished Instagram scraper, you can extract data from thousands of posts, comments, and stories, provided you're logged in.

Apify's Instagram page

How to scrape Google

That's right, you can scrape the whole of Google now. If you're feeling lucky today, here are two tutorials: one on how to scrape Google SERPs, with a small trip down memory lane, and one on scraping Google Trending Searches to stay ahead of the curve in whatever your web-related goals are. There's also a collection of SEO-oriented actors that can replace many SEO tools. Enjoy!

old Google UI

How to set up a content change watchdog for any website

Wouldn't you want to get a notification when your favorite item goes on sale? Or when a concert of your most loved jazz band is planned in your area? Read about how, with just some basic JavaScript, you can set up a watchdog for events or items going on sale in just 5 minutes.
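To give a rough idea of what such a watchdog does under the hood, here is a minimal sketch of the change-detection step. It is written in Python for illustration (the tutorial itself uses JavaScript), and the function names `fingerprint` and `check_for_change` are hypothetical, not part of any Apify API. It assumes you already have the monitored text from the page; fetching the page and sending the notification are left out.

```python
import hashlib
from typing import Optional, Tuple


def fingerprint(text: str) -> str:
    """Return a stable hash of the text, ignoring whitespace-only differences."""
    normalized = " ".join(text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def check_for_change(current_text: str,
                     previous_fingerprint: Optional[str]) -> Tuple[bool, str]:
    """Compare the current content with the fingerprint stored on the last run.

    Returns (changed, new_fingerprint). The very first run, which has no
    stored fingerprint yet, is never reported as a change.
    """
    current = fingerprint(current_text)
    changed = previous_fingerprint is not None and current != previous_fingerprint
    return changed, current


# Hypothetical usage: the stored fingerprint would normally be persisted
# between scheduled runs (for example in a file or a key-value store).
changed, stored = check_for_change("Price: 100 USD", None)
if changed:
    print("Content changed, time to send a notification")
```

Hashing a normalized copy of the text keeps the stored state small and avoids false alarms from whitespace-only changes; a real watchdog would run this check on a schedule and notify you only when `changed` is true.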

How to set up a watchdog

Actors updated and/or running at top speed

  • Make good use of the IMDb scraper for anything cinema-related
  • Use our GIF Scroll Animation Actor for testing UI or showcasing your work
  • Check out our new Shopify scraper for keeping an eye on the most precious items you'd like to add to your collection

As you can see, we've started a series of dev-oriented blog posts. Keep an eye on our blog, since we're planning to publish articles on the following subjects:

... and many more! We'll keep you posted

Minor and major UI improvements

Last but not least, we also rolled out some tweaks to the app UI:

  • The new Actor UI version has been perfected with the help of your feedback and is ready to be relaunched
  • List inputs now support large amounts of data (by switching to the JSON editor)
  • You can now access your own Actors and tasks by referring to them as ~resource-name
  • Enabled validation for token editing
  • You can now see a Python example in the API tab on public Actor pages
  • We've added API endpoints for Actor environment variables
UI change on actors: Python tab

Are you our next amazing teammate? You might be exactly who we're looking for:

We kicked off this month by making #HackerCamp happen. If you haven't heard of it yet, perhaps you will next time: this September was its first year. You can find some snapshots of this epic getaway on our LinkedIn, but be warned: they don't reflect the whole vibe. Do join us in 2022!

The only thing running faster than our scrapers this month was our amazing #VltavaRun team, who spent 32 hours running in total and took 71st place out of 272 teams. Look at them go!

Apify team of runners
Natasha Lekh
Crafting content that charms both readers and Google's algorithms: readmes, blogs, and SEO secrets.
