If all you have is a hammer, everything looks like a nail. At the moment, the hammer everyone wants to wield is an AI agent promising to nail everything - from communicating with external systems to planning and executing tasks effectively. But not every problem needs an agent.
In this post, we’ll highlight the difference between an AI agent and a workflow. To do that, we’ll examine two distinct approaches. One employs a fixed workflow that executes SQL queries predictably for well-defined tasks. The other uses an agent capable of interpreting natural language queries and adapting to more nuanced scenarios.
We'll show you how to build an AI agent for querying Apify datasets and highlight when the added complexity of advanced reasoning truly pays off.
The problem
We’ve scraped data and saved it in an Apify dataset, and now we need to work with it.
The traditional approach is to export the dataset to an external tool for analysis and processing.
Another option is to use the data directly in an agentic or LLM workflow. However, this is often limited by the dataset size exceeding the LLM’s context window, which again leads to data exports and processing.
To eliminate these extra steps, we can create a dataset query engine that automatically loads the dataset into an SQL database. This allows us to retrieve and process data using natural language queries, with SQL queries and processing steps generated automatically.
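To make the loading step concrete, here is a minimal sketch that uses Python's built-in `sqlite3` module. It assumes the dataset items have already been fetched as a list of flat dictionaries. The sample items, the `load_items_to_sqlite` helper, and the table name `dataset` are illustrative assumptions, not the engine's actual implementation.

```python
import sqlite3

def load_items_to_sqlite(items, table_name="dataset"):
    """Load a list of flat dicts into an in-memory SQLite table."""
    if not items:
        raise ValueError("Cannot infer columns from an empty dataset")
    # Use the union of keys so items with missing fields still fit (as NULL).
    columns = sorted({key for item in items for key in item})
    conn = sqlite3.connect(":memory:")
    column_defs = ", ".join(f'"{col}"' for col in columns)
    conn.execute(f'CREATE TABLE "{table_name}" ({column_defs})')
    placeholders = ", ".join("?" for _ in columns)
    rows = [tuple(item.get(col) for col in columns) for item in items]
    conn.executemany(f'INSERT INTO "{table_name}" VALUES ({placeholders})', rows)
    conn.commit()
    return conn

# Hypothetical sample items standing in for scraped dataset records.
items = [
    {"title": "Cafe Alpha", "rating": 4.8, "phone": "+1 555 0100"},
    {"title": "Bistro Beta", "rating": 4.2, "phone": "+1 555 0101"},
    {"title": "Diner Gamma", "rating": 4.9},
]

conn = load_items_to_sqlite(items)
top = conn.execute(
    "SELECT title, rating, phone FROM dataset ORDER BY rating DESC LIMIT 2"
).fetchall()
print(top)
```

Once the data sits in a table, the dataset size no longer has to fit into the LLM's context window. Only the schema and the query results do.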
Why not just write SQL queries?
To write an SQL query, you first need to:
- Know the dataset structure.
- Identify relevant fields.
- Be proficient in SQL or skilled in LLM prompting.
The dataset query engine handles this complexity for you. You only need to provide a dataset ID and ask a question in natural language, such as:
Provide a list of top restaurants with the best reviews, along with their phone numbers.
The query engine then processes the question and returns the relevant results.
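One way to picture what happens behind the question is the prompt an LLM might receive. The sketch below only assembles that prompt from a table schema; the wording, the `build_nl_to_sql_prompt` helper, and the example schema are assumptions made for illustration.

```python
def build_nl_to_sql_prompt(question, table_name, schema):
    """Assemble a prompt asking an LLM to translate a question into SQL."""
    columns = "\n".join(f"- {name} ({col_type})" for name, col_type in schema.items())
    return (
        "You translate questions into a single SQLite SELECT statement.\n"
        f"Table: {table_name}\n"
        f"Columns:\n{columns}\n"
        f"Question: {question}\n"
        "Return only the SQL."
    )

# Hypothetical schema for a restaurant dataset.
schema = {"title": "TEXT", "rating": "REAL", "phone": "TEXT"}
prompt = build_nl_to_sql_prompt(
    "Provide a list of top restaurants with the best reviews, "
    "along with their phone numbers.",
    "dataset",
    schema,
)
print(prompt)
```

Because the schema travels with the question, the user never has to look up field names or write the SQL by hand.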
Solutions: Automated workflow or AI agent?
Solution #1: Automated workflow
A workflow is a deterministic sequence of steps designed to achieve a specific goal. The user provides either a natural language query or an SQL query. The system detects whether the input is SQL and processes it accordingly:
- If the query is already in SQL, it’s executed directly.
- If the query is in natural language, it is converted to SQL using an LLM and then executed.
This approach is straightforward but has limitations - it strictly follows predefined steps and cannot handle complex or dynamic queries requiring reasoning.
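Stripped of any framework, the fixed path described above can be sketched in a few lines. This is a simplified illustration: `looks_like_sql` is a stand-in for the SQL check, and `fake_nl_to_sql` is a stub that replaces the LLM call. Both are hypothetical names.

```python
import sqlite3

def looks_like_sql(query):
    # Simplified stand-in for the SQL check: input starting with SELECT is SQL.
    return query.strip().upper().startswith("SELECT")

def run_workflow(query, conn, nl_to_sql):
    """Fixed path: detect SQL, convert if needed, then execute."""
    sql = query if looks_like_sql(query) else nl_to_sql(query)
    rows = conn.execute(sql).fetchall()
    return {"sql": sql, "rows": rows}

def fake_nl_to_sql(question):
    # Stub standing in for an LLM call; it always returns the same query.
    return "SELECT title FROM dataset ORDER BY rating DESC LIMIT 1"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dataset (title TEXT, rating REAL)")
conn.executemany(
    "INSERT INTO dataset VALUES (?, ?)",
    [("Cafe Alpha", 4.8), ("Diner Gamma", 4.9)],
)

direct = run_workflow("SELECT COUNT(*) FROM dataset", conn, fake_nl_to_sql)
converted = run_workflow("Which restaurant has the best rating?", conn, fake_nl_to_sql)
print(direct["rows"], converted["rows"])
```

The order of steps never changes, which is what makes the workflow predictable and also what keeps it from adapting to unusual requests.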
Workflow-based query engine
The workflow-based approach follows a structured execution path. The example below is a simplified snippet, not working code. For the full example, see the project's GitHub repo.
# Fixed workflow approach with predefined steps
class QueryWorkflow(Workflow):
    @step
    async def analyze_query(self, ctx, event):
        # Check if input is already SQL
        if is_query_sql(event.query):
            sql = event.query.replace('dataset', event.table_name)
            results = execute_sql(sql)
            return SynthesizeEvent(sql=sql, schema=event.schema, results=results)
        return NLtoSQLEvent(query=event.query, schema=event.schema)

    @step
    async def convert_to_sql(self, ctx, event):
        # Convert natural language to SQL
        sql = await nl_to_sql(event.query, event.table_name, event.schema)
        results = execute_sql(sql)
        return SynthesizeEvent(sql=sql, schema=event.schema, results=results)

    @step
    async def format_response(self, ctx, event):
        # Create human-readable response
        response = format_results(event.query, event.sql, event.results, event.schema)
        return CompletedEvent(response=response)


def is_query_sql(query):
    # Detect if query is SQL
    sql_keywords = ["SELECT", "FROM", "WHERE", "GROUP BY", "ORDER BY", "JOIN"]
    query_upper = query.upper()
    return any(keyword in query_upper for keyword in sql_keywords)
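One caveat about the keyword check: matching keywords anywhere in the input also flags ordinary sentences that happen to contain words such as "from" or "where". The sketch below reproduces that behavior and contrasts it with a stricter heuristic. The stricter pattern is our own assumption, not the repo's implementation. It is still only a heuristic, and it rejects valid statements without a `FROM` clause.

```python
import re

SQL_KEYWORDS = ["SELECT", "FROM", "WHERE", "GROUP BY", "ORDER BY", "JOIN"]

def is_query_sql_substring(query):
    # Same idea as the snippet above: any keyword anywhere counts as SQL.
    query_upper = query.upper()
    return any(keyword in query_upper for keyword in SQL_KEYWORDS)

# Stricter variant (an assumption): the input must start with SELECT or WITH
# and contain FROM as a whole word somewhere after it.
SQL_PATTERN = re.compile(r"^\s*(SELECT|WITH)\b.*\bFROM\b", re.IGNORECASE | re.DOTALL)

def is_query_sql_strict(query):
    return bool(SQL_PATTERN.match(query))

question = "Show restaurants from Prague where rating is above 4"
sql = "SELECT title FROM dataset WHERE rating > 4"

print(is_query_sql_substring(question), is_query_sql_strict(question))
print(is_query_sql_substring(sql), is_query_sql_strict(sql))
```

In a fixed workflow, a misclassified question is sent straight to the database and fails, so the detection step deserves more care than its size suggests.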
Solution #2: AI agent
An AI agent is more flexible than a workflow. It dynamically reasons about the task, selects tools and determines the best approach for execution. The AI agent follows the ReAct (reasoning and acting) framework, where it:
- Interprets the user query
- Decides if it needs to convert it to SQL
- Executes the SQL query if applicable
- Synthesizes the results into a human-readable response
This allows the AI agent to adapt to different types of queries and handle more complex scenarios. For example, you can skip the summarization step by requesting raw data or provide a partial SQL query and let the LLM complete it. Additionally, you can integrate more tools for data conversion and processing, giving the LLM the flexibility to decide whether to use them.
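The reason-then-act cycle can be illustrated without any LLM at all. In this sketch, `scripted_decide` is a hard-coded stand-in for the model's reasoning step, and both tools return canned values. All names here are hypothetical and exist only to show the shape of the loop.

```python
def react_loop(query, tools, decide, max_steps=5):
    """Alternate between choosing an action and observing its result."""
    history = []
    for _ in range(max_steps):
        action, argument = decide(query, history)
        if action == "finish":
            return argument, history
        observation = tools[action](argument)
        history.append((action, argument, observation))
    raise RuntimeError("Agent did not finish within max_steps")

# Hypothetical tools; real ones would call an LLM and a database.
tools = {
    "to_sql": lambda q: "SELECT title FROM dataset ORDER BY rating DESC LIMIT 1",
    "execute_sql": lambda sql: [("Diner Gamma",)],
}

def scripted_decide(query, history):
    # Stand-in for the LLM's reasoning: pick the next action from the history.
    if not history:
        return "to_sql", query
    last_action, _, observation = history[-1]
    if last_action == "to_sql":
        return "execute_sql", observation
    return "finish", f"Top result: {observation[0][0]}"

answer, trace = react_loop("Which restaurant is rated best?", tools, scripted_decide)
print(answer)
```

In a real agent, the decision function is the LLM itself, so the sequence of tool calls can differ from one query to the next. That is the source of both the flexibility and the reduced predictability.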
AI agent-based query engine
The AI agent dynamically decides how to process the query. Below is a simplified example. For the full working code, see the GitHub repo.
# Agent-based approach using ReActAgent with tools
async def run_agent(query, table_name, table_schema, llm):
    # Register tools for the agent
    tools = [
        FunctionTool.from_defaults(fn=is_query_sql),
        FunctionTool.from_defaults(fn=user_query_to_sql),
        FunctionTool.from_defaults(fn=execute_sql),
        FunctionTool.from_defaults(fn=synthesize_results)
    ]

    # Create context with table information
    context = f'Table name: {table_name}. Table schema: {table_schema}'

    # Initialize ReAct agent with tools and context
    agent = ReActAgent.from_tools(
        tools,
        llm=llm,
        allow_parallel_tool_calls=False,
        context=context
    )

    # Process the query through the agent
    response = await agent.achat(query)
    return response
Key takeaways
| Feature | Workflow | AI agent |
|---|---|---|
| Predictability | High | Medium |
| Flexibility | Low | High |
| Adaptability | Low | High |
| Handling complex queries | Limited | Strong |
| Reliability | High | Medium |
A workflow is ideal for structured, repeatable tasks where predictability is key. An AI agent, on the other hand, is better for dynamic, reasoning-based tasks.
Choosing the right AI agent/workflow strategy
When designing a system to query structured datasets, both workflows and AI agents have their place. The workflow-based approach favors stability and predictability, while the AI agent approach provides flexibility and reasoning power. Depending on the use case, one may be preferable over the other - or a hybrid approach can be used.
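One possible shape for a hybrid is to try the fixed workflow first and hand the query to the agent only when the workflow fails. The sketch below uses stub functions in place of the real workflow and agent. `run_hybrid` and both stubs are hypothetical names, and this is just one of several reasonable routing strategies.

```python
def run_hybrid(query, run_workflow, run_agent):
    """Try the predictable path first; escalate to the agent on failure."""
    try:
        return "workflow", run_workflow(query)
    except Exception as error:
        # Pass the failure along so the agent can take it into account.
        return "agent", run_agent(f"{query}\n(The fixed workflow failed with: {error})")

def stub_workflow(query):
    # Stand-in for the workflow: it only copes with plain SELECT statements.
    if not query.upper().startswith("SELECT"):
        raise ValueError("could not build SQL")
    return f"rows for: {query}"

def stub_agent(query):
    # Stand-in for the agent: it echoes the first line of what it was asked.
    return f"agent answer for: {query.splitlines()[0]}"

simple = run_hybrid("SELECT 1", stub_workflow, stub_agent)
complex_case = run_hybrid("Compare cuisines", stub_workflow, stub_agent)
print(simple)
print(complex_case)
```

Routine queries then stay on the cheap, predictable path, and the agent's extra cost is paid only when it is actually needed.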
With the dataset query engine, users can now extract insights from datasets using simple natural language queries without needing deep SQL knowledge or external data exports.
