The accidental click that changed everything: the Apify origin story

10 years ago, on October 20, 2015, Apify launched on Hacker News. This is chapter 1 from our new book chronicling Apify's first decade - from side project to processing billions of web pages.

Apify was conceived in a place where necessity bumped up against imagination.

The year is 2014. Apple unveils the bigger, sleeker iPhone 6, Tesla announces self-driving cars, and VR glasses come into view with Google releasing its Cardboard thingy alongside its much more expensive cousins from Samsung and Oculus Rift. Meanwhile, under the hood, dynamic JavaScript is truly coming into its own, powering the interactive experiences that make us fall in love with the web all over again.

It's all very pretty.

But beneath that polish hides a problem. These same dynamic interfaces, the menus that load on demand, the content that updates without refreshing, and the interactive elements that make the web more accessible and intuitive for humans to navigate, are the very things that make the web inscrutable to any program trying to dive deeper.

Human-first design has come at a cost: Computers can't easily access the deeper data hidden behind these web pages.

Here in Prague, two young developers have been thinking deeply about this problem, and are about to embark on a project that will unlock the web itself.

Before the beginning: Two students with a dream

Apify founders: Jakub Balada and Jan Čurn

There's another beginning to this story, one that stretches back many years to the Faculty of Mathematics and Physics at Charles University in Prague, where two Information Technology students, Jan Čurn and Jakub Balada, first cross paths.

It is here that the seeds of what would eventually become Apify begin to sprout. Jan, working with other colleagues (though not yet with Jakub) on a school project, builds an app designed to extract data from used car websites and list the results from multiple portals in one place (aptly called najednommiste.cz). The project goes beyond a simple aggregator. Jan and his team have built what he calls a "proto AI system" – long before LLMs and the AI revolution. It's ambitious for its time: a semantic web analyzer that can theoretically parse information from any used car listing across dozens of different sites.

They launch it. Defend it academically. Even create a live website to display the offering. For a moment, it seems like they've cracked the code.

But the system is "fairly clunky," in Jan's words. It works on some websites and fails on others. The accuracy is poor. Most critically, people just aren't using it. Despite the technical achievement, they can't get any real traction. The stars are not yet aligned for this piece of technology, and the project is relegated to a drawer in the corner of someone's office.

Semweb, the early semantic web analyzer for used car listings that foreshadowed the data extraction technology that would later become Apify.

Fast forward a few years. Jan and Jakub join forces to form a company called Dev tank, an entity that allows them to market their respective consulting talents under a unified name in the hopes of giving their endeavors more weight. "We felt that having a brand, a company name, would give us more credibility," Jan explains. "It was just a little weird that we were behaving like solo contractors working for these large corporations, so we decided to group ourselves under a company to be more organized and more memorable."

Dev tank becomes a training ground, a place to practice the technical expertise and client-handling skills that will later prove crucial for Apify's success. Life goes on. But as is often the case with these things, both founders are becoming restless.

"We had the same feeling, me and Jan, that we didn't want to operate like a consultancy. We wanted to try to build a product," Jakub recalls. "This was the main reason we started thinking about how we could start some other type of business, something more along the lines of a tech startup than a regular agency."

This growing dissatisfaction with selling their time rather than building something lasting will become the catalyst for their next chapter.

The Dev tank website. What started as a consultancy would soon pivot to product development, setting the stage for Apify's future.

The puzzle that unlocked everything

Back in the world of mundane things like paying bills, Jan Čurn and Jakub Balada are busy on a consulting project. Curiously, the client for this project had seen a mention of the dormant Semantic Web Analyzer project on their Dev tank site and requested something along similar lines: they want an app that will extract large amounts of data from various real estate websites regularly and reliably. Seems simple enough. But there's a hurdle. None of the tools available at the time is suitable for the job.

All the existing options are fundamentally flawed: software tools that look impressive in demos but break the moment a website adds a discount banner, or solutions that treat every website like a static HTML document, unable to process the dynamic JavaScript rendering that modern sites rely on. These tools work reliably only for a dwindling number of websites.

Absurd, the pair thinks. Here they are, web developers who live and breathe JavaScript every day, who can write jQuery selectors to manipulate any element on any website. Why are they forced to use tools that can't handle the dynamic, JavaScript-driven web they actually work with?

It is at this point that they make the decision that sets them on the journey about to unfold in the pages of this book: they will simply have to write their own tool.

“... what we actually needed was something that could crawl websites with arbitrarily complex or irregular structure, could handle dynamic websites (understand JavaScript), and be simple for developers to use.”
👻
PhantomJS: The little ghost that helped unlock the dynamic web

PhantomJS became a key component in Jan and Jakub's emerging solution. While traditional scraping tools could only read static HTML like a text document, PhantomJS emerged as the first headless browser on the market. It could execute JavaScript, wait for dynamic content to load, and interact with websites exactly as a human user would. It meant they could write jQuery selectors to extract data from any website, no matter how complex its JavaScript interactions were.
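
To make the gap concrete, here is a minimal Python sketch of what a static tool sees. It is not Apifier's code: the page snippet, `ListingCounter`, and `count_static_listings` are all made up for illustration. The sketch parses raw HTML without running any JavaScript and counts the listing items it finds.

```python
from html.parser import HTMLParser

# Hypothetical page: the single listing is injected by JavaScript at load time,
# so it never exists as a real <li> tag in the HTML the server sends.
PAGE = """
<html><body>
  <ul id="listings"></ul>
  <script>
    document.getElementById('listings').innerHTML = '<li>Flat in Prague</li>';
  </script>
</body></html>
"""


class ListingCounter(HTMLParser):
    """Counts <li> elements present as real tags in the raw HTML."""

    def __init__(self):
        super().__init__()
        self.li_count = 0

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            self.li_count += 1


def count_static_listings(html: str) -> int:
    """Parse the HTML as plain text, the way a static scraper would."""
    parser = ListingCounter()
    parser.feed(html)
    return parser.li_count


if __name__ == "__main__":
    # The script body is treated as opaque text, so the injected listing is missed.
    print(count_static_listings(PAGE))
```

The static parser reports no listings for this page, because the only listing lives inside a script it never executes. A headless browser runs that script first, so the same selector would find the item in the rendered page.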

An accidental application

By the summer of 2015, our protagonists are making headway. They have a working command-line crawler based on PhantomJS, some early traction, a list of possible domain names for what might become a cloud service, and a growing sense that they might be onto something. Then, in a fortuitous twist of fate, they get wind of Silicon Valley's latest experiment in startup evangelism: the newly instantiated Y Combinator Fellowship program.

Sam Altman, taking over from Paul Graham, wants to experiment with scaling Y Combinator beyond its traditional scope. The Fellowship is one of several experiments being run by the accelerator (YC Research, which spawned OpenAI, being another).

The timing is perfect, albeit compressed. As Jakub recalls, "They announced it, and it was like, okay, the deadline for applications is one week from now."

The duo set about preparing their submission, and in what will later become a legendary tale among their friends, Jan accidentally clicks 'submit' instead of 'save draft' on their application, sending it in a full three days before the deadline.

With that, the die is cast.

Says Jan of that fateful click, "I think it actually helped because if you get 6,000 applications on the last day, but only five in the middle of the week, you can more easily read the early ones but you might run out of time to evaluate the later ones properly."

Reading their submission years later, you can see how the seeds of Apify's mission and strategy were already formed.

Even then their vision jumped off the page:

  • Apifier will be a cloud service for developers to turn any website into an API… quick and simple.
  • There are millions of websites whose content can only be consumed by humans but is unusable by apps. We see a huge opportunity there.
  • None of [the competitors] is able to crawl websites with irregular structure, or work on responsive sites with JavaScript.

6,500 applications. 30 final spots. A very small chance of success. Sam Altman's tweet at the time perfectly expresses just how competitive and special it is to get selected.

10,000 kilometers for a 10-minute interview

The good news arrives a week later. They are among the 60 projects selected from more than 6,000 applicants. But as Jakub explains, there is a catch: only 30 of the 60 will go forward, so to maximize their chances they must fly to San Francisco to attend the interview in person.

With nothing to lose (except the cost of the flights) and everything to gain, Jan and Jakub head west, like so many pioneers before them, in search of their own treasure – validation from the world's most prestigious startup accelerator.

Prague to San Francisco for a 10-minute interview! "If you asked us that day how it went we couldn't tell you whether we did well or not," Jan writes of the interview. "The time flew by as quickly as people said it would. The interview was over before we had a chance to mention most of the points that we planned."

With the interview over, Jan and Jakub spend the evening in San Francisco, before hitting scenic Highway 1 toward Los Angeles the next morning. Large parts of the highway have no cell coverage, making it impossible to check for the follow-up email.

The uncertainty is torture. No signal. No email. No way to know if their 10,000-kilometer gamble has paid off.

Eventually, they get the news they are hoping for. They're in. And now the hard part begins.

At Y Combinator

The legendary hacker house

The pair move into a so-called Hacker House in Mountain View, run by two local heroes, Lily and David, a couple who have sheltered startup dreamers for years.

The pair are told the house has a good spirit and that all the companies that once lived there eventually did very well, so they needn't worry.

What follows is a kind of beautiful madness informed by YC's famous intensity. "For us, for two guys from Prague, it was something completely different," Jakub remembers. "They told us we should do nothing else but work on the product. No meetings, no conferences, no meetups, no anything. And forget about having a social life."

Soylent, YC's 'ultimate productivity hack' that fueled the 'beautiful madness' of startup life in Mountain View.

To drive the point home, YC gives them the ultimate productivity hack: "The next day we just got a huge pack of Soylent," Jakub continues, "which was a kind of bottled food that you can just drink and not need to eat anything else."

Days blur into nights, screens glow at all hours, and empty Soylent bottles stack up around their workspace.

Wake. Code. Exercise. Consume Soylent. Code. Sleep. Repeat.

"We basically worked like that day and night, seven days a week, from morning to late night."

It's pretty amazing how much work can be done if you work 12–14 hours a day, 7 days a week.

Hello world: October 20, 2015

After what seems like an eternity of long hours and endless Soylent-filled days, Apifier (the original name before Apify) is ready for the world.

📰
Hacker News: Where startups get discovered.

Hacker News is Y Combinator's online community and news aggregator. Think of it as the internet's startup watercooler, where developers, entrepreneurs, and tech nerds gather to argue about the latest innovations and discover what's actually worth paying attention to.

For new startups, posting on the site is basically the modern equivalent of standing on a soapbox in Silicon Valley and hoping people listen. Get it right, and thousands of potential users will check out what you've built. Get it wrong, and… well, at least you tried. Either way, it's where you go when you're ready to find out if your idea actually makes sense to anyone besides you.

On October 20, 2015, they release it into the wild with a post on Hacker News. They get 2,200 visitors, 101 upvotes, and most importantly, 120 people who actually sign up to try what they have built.

The post itself captures their philosophy perfectly.

"Today, we're bringing you what we built for ourselves. With Apifier, you can define your own crawler or web scraping tool in just a few minutes. There is no need to setup any servers, proxies, cron jobs, databases and it is easily programmable using simple JavaScript. We hope you'll like it!"

They sign off simply: Jan & Jakub

And with that, Apifier (later Apify) was born. What had started as a solution to their own frustrations had become a tool that anyone could use to extract data from the web without the traditional headaches.

The journey from frustrated developers to Silicon Valley Fellows had taken about a year, but those 120 early users were just the beginning.

📝
What's in a name? How Apifier became Apify
The vision for Apifier was clear from the beginning: to build a tool that would let anyone easily turn any website into an API. Hence the word Apify (as in 'to API-fy'). The only problem was that apify.com was already taken.

So the founders went hunting for alternatives and discovered that apifier.com was available. An 'Apifier,' they reasoned, was simply someone who API-fies things. Close enough. They bought the domain and got back to building.

But 'Apify' never really left their minds. It was the name that captured exactly what they were trying to do, and they quietly harbored hopes that someday they'd find a way to claim it.

Spoiler alert: it all worked out in the end.

In just two months, Jan and Jakub transformed a PhantomJS-based tool into a web application for anyone to use, giving birth to Apifier.
