Complete Beginner's Guide To Genspark
Introduction to the Genspark Super Agent and its many AI tools.
Note: A prior version of this article first appeared as a guest post for AI Supremacy.
Since the article first ran, Genspark has launched many new features, including AI Drive, AI Docs, AI Secretary, AI Browser, and AI Pods.
This primer will help you understand how the above tools fit into Genspark’s overall ecosystem and the agentic future the company is building.
Since mid-2024, Genspark’s been quietly building a powerful one-stop platform for AI-powered research and agentic tasks.
It’s time to start paying attention.
So let’s look at Genspark, the tools it offers, and how to get the most out of its many features.
What is Genspark?
Genspark launched in June 2024 as a sort of Perplexity-Wikipedia hybrid.
At the time, Genspark used AI to research your topic and create a custom Sparkpage about it. Anyone could then explore a Sparkpage like a Wikipedia article and talk to an AI chatbot about its content.
Over time, Genspark evolved to offer new specialized AI tools and agents, even launching what might have been the world’s first “Deep Research”-style product (before the official Google version).
In April 2025, Genspark raised $100 million in a Series A funding round.
One month later, Genspark made a seemingly radical decision to kill off its search product and pivot towards a new agentic positioning.
Why?
From AI search to AI agents
This strategic pivot is key to understanding what Genspark is today.
Genspark co-founder Kay Zhu explains the rationale in an article called “Why I Killed Our AI Search Product With 5 Million Users.”
In short, Genspark felt traditional AI search was too constrained by predefined workflows.
They wanted to build an agentic solution that could handle “truly adaptive, context-rich problem-solving.”
The goal was to move from simple information retrieval to something that could independently reason through problems and—crucially—take action on the user’s behalf.
The answer was the Genspark Super Agent.
Notably, the Super Agent doesn’t do away with Genspark’s built-in search capabilities. Instead, it integrates search into a system of agents that can perform complex tasks and support a broader range of possible use cases.
But what is the Super Agent, exactly?
Genspark Super Agent: What is it good for?
The best way to understand the power of the Super Agent is to see it in action.
The Super Agent can be seen as an executive-level AI with an army of sub-agents at its disposal.
When you hand it a task, the Super Agent:
Proactively thinks through the problem
Breaks it down into sub-tasks
Selects the most relevant specialist agents for each sub-task
Executes the sub-tasks, monitoring results and adjusting as needed
Integrates the individual results into a coherent final output, such as a landing page
This approach makes Genspark remarkably flexible.
The same interface can handle everything from “find a smartphone for my needs” to “research my company and design a landing page with brand images, videos, a product table, and interactive components.” [1]
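For readers who think in code, the five steps listed above can be sketched as a small plan-and-dispatch loop. This only illustrates the general pattern and is not Genspark's actual implementation: the `SPECIALISTS` registry, the keyword-based `plan` function, and every agent name are hypothetical stand-ins.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical specialist agents: each takes a sub-task and returns a result string.
SPECIALISTS: Dict[str, Callable[[str], str]] = {
    "research": lambda task: f"[research notes for: {task}]",
    "image": lambda task: f"[image for: {task}]",
    "code": lambda task: f"[HTML for: {task}]",
}


def plan(request: str) -> List[Tuple[str, str]]:
    """Break a request into (specialist, sub-task) pairs.

    A real planner would reason with an LLM; this stand-in uses fixed keyword rules.
    """
    steps = [("research", f"background on '{request}'")]
    if "image" in request or "landing page" in request:
        steps.append(("image", f"brand visuals for '{request}'"))
    if "landing page" in request:
        steps.append(("code", f"page layout for '{request}'"))
    return steps


def run_super_agent(request: str, max_retries: int = 1) -> str:
    """Plan, dispatch each sub-task, retry empty results, then merge everything."""
    results = []
    for specialist, sub_task in plan(request):
        agent = SPECIALISTS[specialist]
        output = agent(sub_task)
        retries = 0
        # "Monitoring and adjusting": retry when a specialist returns nothing.
        while not output and retries < max_retries:
            output = agent(sub_task)
            retries += 1
        results.append(output)
    # Integrate the individual results into one final output.
    return "\n".join(results)


if __name__ == "__main__":
    print(run_super_agent("design a landing page for my company"))
```

A simple request such as "find a smartphone" only triggers the research step here, while the landing-page request fans out to all three hypothetical specialists.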
So, how does Genspark’s Super Agent compare to classic AI search products like Perplexity?
Genspark Super Agent vs. Perplexity
AI search tools like Perplexity specialize in exactly that: search.
And while Genspark Super Agent can also perform research on your behalf, it’s built on a fundamentally different philosophy. Here are the key differences:
1. Core function
Perplexity is an AI-powered search engine. It parses your request, searches the web, and returns cited answers. You can view Perplexity as a research assistant that provides information.
Genspark Super Agent is designed to perform tasks. While that might involve finding information, Super Agent’s real strength lies in taking that information and doing something with it: creating a website, coding an app, or even making a phone call.
For search-focused tasks, Perplexity might be sufficient. But for anything that requires going beyond pure research, Super Agent is the better option.
2. Output format
Perplexity’s core output is text. It may occasionally include tables and charts, but it's primarily built for text-based summaries and reports.
Genspark Super Agent can output a range of formats. It can code entire pages or apps from scratch, create slide decks and spreadsheets, and generate AI images and videos.
Use Perplexity if you’re primarily after a well-researched report about your topic. Use Genspark Super Agent if you need to create rich media or interactive elements.
3. Cost considerations
Perplexity offers a solid free tier with unlimited basic searches and three daily Pro searches (which dig deeper and find more sources). The $20-per-month Pro plan unlocks unlimited Pro searches and the Deep Research feature, which can reason through its steps and proactively seek out information.
Genspark uses a slightly less straightforward credit-based system. How many credits are used depends on the task, with expensive video models and advanced agents consuming them much faster.
As such, Perplexity is the more cost-efficient option if you make dozens of daily searches and reports. Genspark is best used sparingly for high-complexity tasks, but I share some tips for optimizing your credits later in the article.
To help me explain the differences, I asked Genspark Super Agent to make an interactive site and supplement it with additional research. The Super Agent built this page from scratch.
The page definitely needs further work to clean up AI hallucinations and fine-tune the visuals, but this gives you a good idea of what the Super Agent is capable of.
Genspark’s clever three-tier structure
While the Super Agent is the most powerful Genspark feature, it’s far from the only tool on the platform.
In fact, what makes the Super Agent possible in the first place is the dozen or so underlying sub-agents.
I find it helpful to think of Genspark as having a sort of three-tier agent structure:
Genspark Super Agent (Tier #1): The Super Agent sits at the top of the pyramid and autonomously calls on Tier #2 and Tier #3 agents to accomplish your request.
Advanced agents (Tier #2): These are designed for complex, multi-step workflows.
Basic agents (Tier #3): These specialized tools perform simple, well-defined tasks.
The Super Agent is a good starting point for most of your needs, but you can also access each sub-agent independently to perform a one-off task. For basic requests, that’s often faster and more cost-effective.
Let’s look at the Tier #2 and Tier #3 agents and why you might want to use their standalone capabilities. [2]
Advanced Agents (Tier #2)
Advanced Agents can handle complex, multi-step workflows. They tend to consult different sources and output formatted, structured results. The following Advanced Agents are available [3]:
Agentic Fact Check
Agentic Data Table
Agentic Deep Research
AI Slides
AI Sheets
Call For Me
1. Agentic Fact Check
As the name suggests, this agent is designed to verify the accuracy of a claim.
Simply input a fact or statement, and the agent gets to work. It performs dozens of parallel searches for related keywords, identifies reputable sources, and cross-references claims to reach a well-reasoned conclusion.
2. Agentic Data Table
This agent organizes information into neat tables to provide an at-a-glance view of a given topic. Simply input your data or search request, and the agent will structure its findings accordingly.
3. Agentic Deep Research
This works a lot like similar “Deep Research” offerings by OpenAI and Google.
The agent performs a comprehensive dive into your topic and creates an entire Wikipedia-like page with its findings. The page (formerly known as Sparkpage) comes with an AI chatbot that can answer your follow-up questions about the topic.
4. AI Slides
This agent turns any research into a polished, interactive slide deck. Behind the scenes, it performs something similar to the Deep Research agent but with the goal of preparing an engaging visual presentation.
5. AI Sheets
Just like AI Slides, this agent performs research behind the scenes and creates a detailed spreadsheet with the resulting information. It can even bulk populate spreadsheet cells with images and other visuals.
Real case examples by Genspark:
List 20 key concepts from AP assignment chapter and assess understanding
Find 20 Genspark YouTube videos and analyze metrics and comments
6. Call For Me
At the time of writing, this is the only Genspark tool that can directly interact with the physical world. It uses a voice agent to make phone calls on your behalf to handle reservations, gather information, and so on. For now, this only works with phone numbers in Japan and the US.
Real case examples by Genspark:
Book a table at Lechon for next Wednesday's birthday celebration
Check if BusterPro Tennis has Yonex 2025 EZONE tennis racket in stock
Basic Agents (Tier #3)
If I’m honest, “agents” might be too strong a word for most tools in this tier. At their core are regular LLMs, text-to-image, and text-to-video models from third-party providers.
At the same time, these tools do come with a “Mixture-of-Agents” system that runs several models in parallel and outputs a robust, synthesized outcome.
Let’s look at all the options on this tier:
AI Chat
Image Studio
Video Generation
Translation
1. AI Chat
This interface will be familiar to anyone who’s ever talked to a chatbot like ChatGPT.
But what sets Genspark apart is that you can freely pick from almost a dozen leading models on the fly.
Right now, you can access:
GPT-4.1
o3
o4-mini-high
Claude Sonnet 4
Claude Opus 4
Gemini 2.5 Flash
Gemini 2.5 Pro
Grok 4
The “Mixture-of-Agents” option runs your query through at least three separate models, evaluates their answers, and outputs a more reliable synthesized response.
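To make the idea concrete, here is a generic sketch of the fan-out-and-synthesize pattern. It is not how Genspark implements the feature, which this article doesn't cover: the three `model_*` functions are hypothetical stubs, and a simple majority vote stands in for the "evaluate and synthesize" step.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


# Hypothetical stand-ins for three different models answering the same query.
def model_a(query: str) -> str:
    return "Paris"


def model_b(query: str) -> str:
    return "Paris"


def model_c(query: str) -> str:
    return "Lyon"


def mixture_of_agents(query: str, models: List[Callable[[str], str]]) -> str:
    """Query every model in parallel and return the most common answer."""
    with ThreadPoolExecutor(max_workers=len(models)) as pool:
        answers = list(pool.map(lambda model: model(query), models))
    # Majority vote: a simple placeholder for evaluating and merging answers.
    winner, _count = Counter(answers).most_common(1)[0]
    return winner


if __name__ == "__main__":
    models = [model_a, model_b, model_c]
    print(mixture_of_agents("What is the capital of France?", models))
```

A production system would more likely ask another model to judge or merge the candidate answers, since free-form text rarely matches word for word. The vote keeps the sketch short.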
Why use “AI Chat”?
Compare different LLMs and their responses in one place
Use web search to supplement models that don’t otherwise offer it
Fact-check queries and reduce hallucinations via the “Mixture-of-Agents” feature
2. Image Studio
This works a lot like AI Chat but for text-to-image models. It gives you access to:
FLUX.1 Kontext Pro
FLUX.1 Kontext Max
FLUX.1 [dev]
FLUX.1 [schnell]
FLUX 1.1 [pro] Ultra
Gemini Imagen 4 Preview
GPT-4o Image Generation
Ideogram 2.0
Ideogram 2a (New)
Recraft V3
DALL·E 3
Gemini Imagen 3
Bytedance SeedDream v3
There are a few additional options:
Remix lets you reimagine an uploaded image with one of the above models.
Auto Prompt can expand and enhance your short prompt to create better images.
Style and Size dropdowns give you direct control over those two aspects.
Image Studio also has a “Mixture-of-Agents” option that runs your prompt through four different models in parallel.
I discussed the benefits of this feature in an earlier article.
Additionally, Image Studio offers several handy image editing tools, which are largely self-explanatory.
Why use “Image Studio”?
Compare different text-to-image models in one place.
Enable specific features (e.g. Remix) for models that normally don’t have them.
Make edits to images using dedicated AI tools.
3. Video Generation
This agent gives you access to over a dozen AI video models:
HunyuanVideo
Kling V1.6 Pro
Kling V2.0 Master
Kling V2.1 Master
MiniMax Hailuo‑02 Standard
MiniMax Subject Reference
PixVerse V4.5 Turbo
Runway
Seedance Lite
Seedance Pro
Veo 2
Veo 3
Veo 3 Fast
Vidu
Wan V2.1
Auto Prompt can help flesh out your simple input for better results. Image to Video lets you use an image as the starting frame of your generated video.
Finally, the “Mixture-of-Agents” option runs your prompt through four separate video models at once. Note that this can quickly consume a lot of your monthly credits, so use it sparingly.
In my test, Veo 2 returned a failure, so this tool is also helpful for testing the censorship limits of different models. (My guess is that “school children” might have tripped Google up in this case.)
Why use “Video Generation”?
Test different video models in one place.
Enable specific features (e.g. Image to Video) for models that normally don’t have them.
Use Auto Prompt to improve your video results.
4. Translation
The last basic agent specializes in translation, powered by one of the following:
Claude Sonnet 4
Google Translate
DeepL
GPT-4.1
Here, “Mixture-of-Agents” also runs the translation through multiple models to arrive at a consensus answer.
Why use “Translation”?
Compare different translation models in one place.
Identify the best model for a given type of translation.
Use “Mixture-of-Agents” for robust, consensus translations of complex text.
Genspark tips & tricks
Here are my best practices for working with the different Genspark tools.
1. Personalize Genspark to your needs
Update July 16, 2025: As of today’s date, this feature doesn’t seem to be available. I’m not sure if it will return, but keep an eye out.
The feature is somewhat hidden, but it’s extremely helpful for providing better context to all Genspark agents.
From the dashboard, click the Settings wheel in the bottom-left corner, then select Personalization.
This lets you provide your background and tell Genspark how you want it to act. Paid users can even set up “Custom Instructions” for Genspark to follow, similar to those in ChatGPT.
The best part?
You can get Genspark to fill this out on your behalf by clicking the Auto Research option at the top and pasting links to your social media profiles.
Genspark will use the research tool to learn more about you and personalize your profile.
The more context you give Genspark, the more relevant its future responses will be.
2. Pick the right tool for the job
With so many available options, it’s easy to get overwhelmed.
But here’s a good rule of thumb for when to select each Genspark agent or tool:
Basic agents: Ideal for straightforward, one-off tasks like creating an image, translating a string of text, or simple AI chat.
Advanced agents: Best for more complex tasks with a well-defined output type: detailed research report, slide deck, spreadsheet, etc.
Super Agent: Use this one for bigger, vaguely defined tasks that require reasoning and pulling together outputs from different tools and models. It’s surprisingly robust at working through open-ended problems while intelligently calling on the underlying Genspark capabilities.
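If it helps, the rule of thumb above can be written as a tiny decision function. This is just my own encoding of the guidance in this section: the `Task` fields and the `pick_tier` helper are hypothetical and don't correspond to anything in Genspark's product.

```python
from dataclasses import dataclass


@dataclass
class Task:
    description: str
    well_defined_output: bool  # e.g. "a slide deck" or "a spreadsheet"
    multi_step: bool           # needs research or several stages of work


def pick_tier(task: Task) -> str:
    """Map a task to a Genspark tier using the rule of thumb above."""
    if not task.multi_step:
        # Straightforward one-off jobs: an image, a translation, a quick chat.
        return "Basic agent"
    if task.well_defined_output:
        # Complex work with a known output type: report, deck, spreadsheet.
        return "Advanced agent"
    # Bigger, vaguely defined work that combines several tools.
    return "Super Agent"


if __name__ == "__main__":
    tasks = [
        Task("translate a sentence into Japanese", True, False),
        Task("slide deck summarizing a market report", True, True),
        Task("research my company and build a landing page", False, True),
    ]
    for task in tasks:
        print(f"{task.description} -> {pick_tier(task)}")
```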
3. Keep an eye on your credits
Basic agents consume only a few credits.
But expensive video models like Veo 3 and advanced agents can burn through credits quickly.
Familiarize yourself with the pricing here: www.genspark.ai/pricing (click “Show All Features” to see the cost breakdown for each agent and model).
You can always check how many credits you have left for the month in the bottom-left corner.
Here are a few tips for getting more out of your credits.
4. Optimize Genspark credit usage
Normally, I recommend working iteratively with chatbots: start with a minimum viable prompt, then go back and forth to refine the output in collaboration with the model.
But with Genspark’s credit system, it often makes sense to take the opposite approach.
That’s why I suggest the following process:
Frontload the context: Provide advanced agents or the Super Agent with as much background detail about your task as possible. This sets them up for success and reduces the chance of you burning more credits to refine mediocre initial outputs.
Reduce the back-and-forth: When using the Super Agent for tasks like coding an app or building a webpage, don’t waste credits asking it to troubleshoot minor errors or make cosmetic tweaks. Instead, treat its initial output as your “80%” draft, then switch to an external tool.
Finalize in third-party tools: Give your WIP draft to a third-party model or tool to refine it. For instance, if building an app or a webpage, I tend to copy Super Agent’s initial code into ChatGPT and use the o3 model to fix errors and make adjustments. This Super Agent + o3 combo works extremely well and helps me reduce credit usage.
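To see why this process saves credits, here is a back-of-the-envelope comparison. Every number is a made-up placeholder rather than Genspark's real pricing, and the constant and function names are hypothetical. Plug in the actual costs from the pricing page before drawing conclusions.

```python
# All credit costs below are illustrative placeholders, not real Genspark prices.
SUPER_AGENT_RUN = 150  # hypothetical cost of one full Super Agent run
FOLLOW_UP_TURN = 60    # hypothetical cost of one follow-up fix inside Genspark


def iterative_cost(follow_ups: int) -> int:
    """Credits spent when every refinement happens inside Genspark."""
    return SUPER_AGENT_RUN + follow_ups * FOLLOW_UP_TURN


def frontloaded_cost() -> int:
    """Credits spent on one well-briefed run, with refinements done elsewhere."""
    return SUPER_AGENT_RUN


if __name__ == "__main__":
    for turns in (0, 2, 5):
        saved = iterative_cost(turns) - frontloaded_cost()
        print(f"{turns} follow-ups: {iterative_cost(turns)} credits "
              f"({saved} saved by frontloading)")
```

The gap grows linearly with each follow-up, which is why moving the final polish to an external tool pays off quickly.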
5. Hide your images and videos if needed
By default, any creative assets you generate with Genspark’s basic agents are public.
But you can make your creations private by selecting the dropdown at the top and switching from “Public Creative” to “Private Creative.”
So if you don’t want your generations to be shared with the wider community, remember to toggle this on!
Wrap-up
In less than a year, Genspark has evolved from a traditional AI search tool into a powerful agentic platform that can help you:
Automate complex tasks using the Super Agent
Compare different GenAI models
Generate videos, images, and voice clips
Perform deep research and analysis
Create slide decks, spreadsheets, and fully interactive pages
Signing up is free, and you get 200 daily credits, which is enough for a single advanced agent task.
So you can take Genspark for a spin and see if it’s right for you.
🫵 Over to you…
Have you worked with Genspark before? Do you have preferred agents or tools? If you have any tips and tricks to share, I’d love to hear them.
Leave a comment or drop me a line at whytryai@substack.com.
Thanks for reading!
If you enjoy my writing, here’s how you can help:
❤️ Like this post if it resonates with you.
🔄 Share it to help others discover this newsletter.
🗣️ Comment below; I love hearing your opinions.
Why Try AI is a passion project, and I’m grateful to those who help keep it going. If you’d like to support my work and unlock cool perks, consider a paid subscription.
[1] To appreciate the range of possible use cases, I invite you to explore the following real examples from Genspark:
[2] The interface has changed slightly since this article and the accompanying screenshots were first published. But the core principles remain the same.
[3] Update: As I wrote in the intro, the following tools have been added to the list since I wrote the article: AI Drive, AI Docs, AI Secretary, AI Browser, and AI Pods.