Case Study

A Comparative Study Between Maze and Vectorial

When you need directional insight quickly, how much can you trust simulated users? When you run a real study, what disappears when answers collapse into percentages alone?

01

Introduction

This study compares structured responses from real participants (via Maze) with structured responses from simulated audience groups (via Vectorial) on the same research questions.

Can simulated users produce insights that meaningfully reflect real human behavior?
  • Maze captures responses from recruited participants through surveys and tasks; results are often summarized as distributions and top-line themes.
  • Vectorial generates responses from defined audience groups, including qualitative reasoning aligned to each persona.

We compared parallel scenarios focused on trust in system-generated insights and on how people evaluate new software quickly.

The goal is not to declare a winner, but to show what each method makes easy to see — and what it leaves implicit.

02

Background & Motivation

Traditional user research is powerful but constrained by recruitment, scheduling, and sample size. Teams often see what users chose without a clear picture of why, or who drove which pattern.

Simulated users promise speed and iteration: you can explore hypotheses without waiting for a full panel. The open question is whether those outputs align with real responses in ways that support decisions — not just directionally, but interpretably.

We designed this comparison to situate both tools side by side on the same prompts. For more on recruiting-based research workflows, see Maze.

03

Study Design

3.1 Research Objective

The goal of this study was to evaluate whether Vectorial can generate insights that:

  • align with real user responses
  • capture meaningful variation across users
  • provide actionable direction for decision-making

3.2 Methodology

We designed parallel studies and ran them across both systems.

Step 1 — Real User Study (Maze)

Participants were recruited via Maze and shown product-related questions and prompts. Structured responses were collected, including both quantitative ratings and qualitative feedback.
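To make the data shape concrete, here is a minimal sketch of the kind of structured record such a study yields: a quantitative rating paired with qualitative feedback. The field names and values are hypothetical, chosen for illustration, and are not Maze's actual export format.

    from dataclasses import dataclass

    # Hypothetical shape for one participant's answer. This illustrates
    # "quantitative rating + qualitative feedback"; it is not Maze's
    # actual export schema.
    @dataclass
    class SurveyResponse:
        participant_id: str
        question: str
        rating: int    # 1-5 Likert scale, e.g. 5 = "extremely important"
        comment: str   # free-text qualitative feedback

    responses = [
        SurveyResponse("p01", "demonstrated_accuracy", 5, "I want proof it works."),
        SurveyResponse("p02", "demonstrated_accuracy", 4, "Show me real results."),
    ]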

Step 2 — Simulated Study (Vectorial)

The same study was replicated in Vectorial. Instead of recruiting participants, we defined audience groups using structured attributes. Each group generated responses reflecting a different user perspective.
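As an illustration, an audience group can be thought of as a small bundle of structured attributes. The attribute names and values below are hypothetical, not Vectorial's actual configuration format.

    # Hypothetical audience-group definitions. Attribute names and values
    # are illustrative only, not Vectorial's actual configuration format.
    audience_groups = [
        {"role": "Product Manager",
         "priorities": ["actionable insights", "speed of evaluation"]},
        {"role": "Product Designer",
         "priorities": ["output quality", "hands-on exploration"]},
        {"role": "Product Marketing Manager",
         "priorities": ["real use cases", "customer proof"]},
    ]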

Step 3 — Comparison

For each study, we compared:

  • top insights (what users identified as important)
  • patterns of agreement (where responses aligned)
  • divergences (where and why opinions differed)
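A toy sketch of this comparison step, using set operations over each method's top insights. The factor names are placeholders, not the study's full coding scheme.

    # Toy comparison of top insights from the two methods. The factor
    # names are placeholders for illustration.
    maze_top      = {"demonstrated accuracy", "transparent reasoning", "clear insights"}
    vectorial_top = {"demonstrated accuracy", "transparent reasoning", "real use cases"}

    agreement      = maze_top & vectorial_top   # where responses aligned
    maze_only      = maze_top - vectorial_top   # surfaced only by real users
    vectorial_only = vectorial_top - maze_top   # surfaced only by simulated users

    print(sorted(agreement))        # shared signals
    print(sorted(maze_only))        # divergences to investigate
    print(sorted(vectorial_only))
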
04

Study Results

Example 1

What question did we ask?

We gave both real users and simulated users a list of factors and asked:

How important are the following factors in building your trust in system-generated insights?

Patterns of Agreement

There is strong alignment across both systems on what builds trust.

In Maze, demonstrated accuracy is rated extremely important by 60% and very important by 40%, indicating that users rely heavily on proof and validation.
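These percentages are simple response shares. A toy computation with fabricated ratings reproduces the same split:

    from collections import Counter

    # Toy computation of the response shares behind a chart like the one
    # below. The ratings list is fabricated for illustration (5 participants).
    ratings = ["extremely", "extremely", "extremely", "very", "very"]

    counts = Counter(ratings)
    for level in ("extremely", "very", "moderately"):
        share = 100 * counts[level] / len(ratings)
        print(f"{level:>10}: {share:.0f}%")   # extremely: 60%, very: 40%, moderately: 0%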

[Chart: share of respondents (0–100%) rating each trust factor "extremely important", "very important", or "moderately important": transparent reasoning, real user data evidence, demonstrated accuracy, peer endorsements, and clear insights.]

This aligns closely with what we see in Vectorial. Multiple simulated users emphasize the need for concrete evidence rather than abstract claims:

Product Marketing Manager: "Examples of real use cases hit me first — I want to see how other companies like mine are actually using it and what results they're getting."

Product Designer: "Seeing actual outputs tells me everything about quality and capability... nothing beats seeing what the tool actually produces."

Product Manager: "I need to know what actionable insights I'll actually get, not just what features exist."

One Product Manager response (flagged positive) shows the fuller reasoning behind these selections:

"As a product manager, I'd select examples of real use cases, seeing the output or results, and a short demo video. I need to assess tools quickly — seeing how other teams structure dashboards and what outputs look like matters more than a generic feature list. Practical application beats 'all-in-one' language when I'm deciding what to trust."

Together, these responses show that what Maze captures as 'accuracy' translates in practice to seeing real use cases, outputs, and tangible results.

From Results to Understanding

One key difference lies in what we can observe beyond the final answer.

In Maze

We know that 60% of users rated demonstrated accuracy as extremely important and 40% as very important, but we do not know:

  • which types of users selected which options
  • how they interpreted 'accuracy' in this context
  • or how they arrived at their decision

In Vectorial

We can directly observe both preference and reasoning behind those same signals. For example, while all simulated users emphasize accuracy, they interpret and prioritize it differently based on their role:

  • The Product Marketing Manager focuses on real-world validation.
  • The Product Designer evaluates accuracy through output quality.
  • The Product Manager prioritizes actionable outcomes.

These differences are not random — they reflect how each role approaches evaluation from a distinct perspective.

Beyond that, each user explains their reasoning. This turns a single aggregated signal ("accuracy is important") into a set of interpretable decision processes across different types of users.

Example 2

What question did we ask?

We gave both real users and simulated users the same open-ended prompt:

When evaluating a new software tool, what helps you understand what it does the fastest?

As in Example 1, the key difference lies in what we can observe beyond the final answer.

In Maze, we can see that users value demos, speed, and clarity, but we do not know:

  • how different types of users prioritize these factors
  • what "demos" or "clarity" specifically mean to them
  • or how they translate these preferences into decision-making

In Vectorial, we can directly observe both preference and reasoning behind those same signals. While all simulated users emphasize interactive understanding, they interpret it differently based on their role:

  • The Product Designer focuses on usability and intuition through hands-on exploration ("click around... see how it actually behaves").
  • The Product Manager prioritizes mapping features to real workflows and use cases ("map capabilities to my specific use cases in real time").
  • The Growth Marketer emphasizes speed and outcome-driven clarity ("see the tool solving a real problem within the first 30 seconds").

These differences are not random — they reflect how each role approaches evaluating a new tool from their specific perspective.

At the same time, Vectorial's responses are more consistent and idealized, often converging on best practices like interactive demos and rapid time-to-value. While this makes the insights highly actionable, it also means they may underrepresent the variability and less structured thinking seen in real user responses in Maze.

05

Key Findings

Maze vs Vectorial — key tradeoffs

How the two systems compare across five research dimensions

Depth of insight
  • Maze: short, surface-level responses — what users think, rarely why
  • Vectorial: structured reasoning — what users think and how they decide

User variation
  • Maze: implicit — hidden inside aggregate percentages
  • Vectorial: explicit — role-based perspectives surfaced directly

Realism
  • Maze: messy and variable — captures real confusion and edge cases
  • Vectorial: polished and idealized — may miss ambiguity and outliers

Speed
  • Maze: days — recruiting, running, and reporting takes time
  • Vectorial: minutes — full response set generated on demand

Best used for
  • Maze: validating patterns and uncovering gaps in real behavior
  • Vectorial: early directional research and hypothesis exploration

The strongest research approach combines both — use simulated insights to identify patterns early, then validate and fill gaps with real users.

5.1 Core Insights Are Largely Replicated

Across studies, Vectorial consistently reproduces the main signals identified in real user research, including the importance of transparency, accuracy, and clear outputs. This suggests that simulated users can reliably capture what matters at a high level.

5.2 Differences Reflect Structured Variation — Not Sampling Noise

A key source of difference comes from how users are represented.

User recruitment limitation (traditional research):

Traditional research is constrained by who is recruited. The results reflect a specific sample, and variation is often implicit in aggregate percentages.

Structured perspectives (Vectorial):

Vectorial makes this variation explicit. Instead of a single aggregated answer, it reveals how different types of users—product managers, designers, marketers—prioritize different signals and evaluate products through different lenses.

This makes differences interpretable, rather than hidden within averages.

5.3 Reasoning and Decision Processes Are Observable

When reviewing Maze responses, we found that real users rarely provide detailed reasoning in open-ended answers: most are short and high-level, often limited to a single sentence.

In contrast, Vectorial consistently provides detailed explanations of how users think:

  • what they look at first
  • how they evaluate value
  • what triggers engagement or drop-off

For example, multiple simulated users describe abandoning a product if value is not clear within seconds, and relying on outputs rather than descriptions to form judgments.

This adds a critical layer: not just what users think, but how they decide.
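In data terms, each simulated answer can be read as a small structured record that pairs a preference with the reasoning behind it. The field names below are hypothetical, chosen only to illustrate the idea:

    # Hypothetical shape of one simulated response: a preference plus the
    # decision process behind it. Field names are illustrative only.
    simulated_response = {
        "role": "Product Manager",
        "choice": "demonstrated accuracy",
        "first_look": "example dashboards and outputs",
        "value_test": "can I map capabilities to my own use cases?",
        "drop_off_trigger": "value not clear within the first seconds",
    }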

5.4 Speed and Structure

Vectorial produces results significantly faster. A full set of responses can be generated in minutes, while even with a platform like Maze—which is optimized for recruiting real users—publishing a study, collecting responses, and generating a report typically takes several days. In more traditional setups (e.g., manual survey distribution), this process can take even longer.

At the same time, simulated users are available 24/7. You can ask follow-up questions, test edge cases, or refine an idea the moment a new question comes up, without waiting to recruit or schedule another study. It feels like having users in the room with you as you think.

This makes it possible to iterate continuously. Whether you're shaping an early concept or refining a prototype, you can run dozens of variations in minutes, quickly seeing how different types of users might respond.
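A sketch of that iteration loop, with a stubbed-out simulate function standing in for whatever call actually returns simulated responses. It is a placeholder for illustration, not Vectorial's API.

    # Sketch of continuous iteration over prompt variants and audience
    # groups. `simulate` is a stub, not Vectorial's actual API.
    def simulate(prompt: str, group: dict) -> str:
        return f"[{group['role']}] response to: {prompt!r}"

    groups = [{"role": "Product Manager"}, {"role": "Product Designer"}]
    prompt_variants = [
        "What helps you understand a new tool the fastest?",
        "What would make you abandon a new tool in the first minute?",
    ]

    for prompt in prompt_variants:
        for group in groups:
            print(simulate(prompt, group))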

5.5 Clear Patterns vs Real-World Variability

Across all examples, Vectorial consistently captures the dominant signal—quickly identifying what matters most, such as interactive demos, real workflows, and strong visual hierarchy. This makes it highly effective for spotting core patterns and guiding product direction. In contrast, Maze reveals greater variability, including partial understanding, misinterpretation, and surface-level responses. While less consistent, this variability reflects how real users actually think, highlighting gaps in comprehension and areas where the product may not communicate clearly.

5.6 Reasoning vs Surface-Level Responses

A key difference lies in depth of insight. Maze responses tend to be short and outcome-focused, showing what users think but not why. Vectorial, however, provides structured reasoning, explaining how users interpret interfaces, evaluate value, and make decisions. It also surfaces role-based perspectives, showing how different personas (e.g., designers vs. managers) prioritize different aspects of the experience. This makes simulated insights more actionable, as teams can trace feedback back to specific decision processes.

5.7 Actionability vs Realism Tradeoff

Vectorial produces clear, polished, and actionable insights, often aligned with best practices and expert thinking. However, this also makes the responses more idealized, sometimes missing the ambiguity, inconsistency, and edge cases seen in real users. Maze, on the other hand, captures the messiness of real behavior, including confusion and unexpected priorities. Together, this highlights a core tradeoff: simulated users provide clarity and speed, while real users provide realism and nuance. The strongest research approach combines both—using simulated insights to identify patterns and real users to validate and uncover gaps.

06

What This Means

This study suggests that simulated user technology has reached a level of maturity where it can meaningfully reflect real user behavior.

Simulation can be used to:

  • explore hypotheses before running real studies
  • understand how different audience segments might respond
  • generate early directional insights when time or access is limited

However, this does not mean replacing traditional user research. Instead, it points to a complementary workflow—especially for teams that do not have the time or resources to run formal studies.

More importantly, this introduces a different way of working with user understanding.

Not just as aggregated data, and not as a single persona, but as structured, explainable representations of how different users think and decide.
