If no real person answered a question, can it still be called research?
That question sits at the heart of the market research industry’s latest ethical dilemma. Synthetic respondents – AI-generated personas trained to mimic real human behaviour – are being promoted as the next big step in survey efficiency. They promise faster fieldwork and lower costs. But when machines speak instead of people, what happens to public trust?
At TKW, we believe innovation only works when it protects credibility. The point of research has never been data for data’s sake. It is to capture authentic voices, with all their complexity and contradiction.
TL;DR
- Synthetic respondents can speed up testing, but they can’t replace real human insight.
- Transparency and disclosure are essential to maintain research credibility.
- ISO 20252:2019 and ESOMAR’s frameworks already set clear boundaries for ethical AI use in data collection.
What Are Synthetic Respondents?
Synthetic respondents are AI-generated profiles designed to mimic the behaviour of human participants in a survey. They are trained on census, behavioural, and historical response data to predict how real people might answer.
There are three common types:
- Pure synthetics – entirely generated from datasets without real-world input.
- Digital twins – virtual versions of actual people that simulate their opinions or actions.
- Scenario synthetics – used to model hypothetical situations, such as how consumers might react to future market events.
These models can answer survey questions instantly and at scale. Some claim to replicate population-level diversity, while others can simulate entire focus groups. Yet the data they generate is ultimately a reflection of past patterns, not present human experience. It tells you what a model predicts, not what people feel.
The Appeal – and the Illusion of Efficiency
It’s easy to see why some agencies are tempted. Synthetic respondents can eliminate scheduling, reminders, and dropout management. They provide fast, clean datasets that seem ready for analysis within minutes.
Used appropriately, these systems can help with concept screening, ad testing, or low-incidence sampling. They can model edge cases or supplement hard-to-reach audiences where recruitment is difficult.
However, the gain in convenience can hide a deeper loss. Real people introduce nuance – uncertainty, hesitation, humour, even inconsistency – all the things that make insights real. Synthetic agents generate patterns, not meaning. The MRS Delphi Report makes this clear, warning that when we stop asking questions of real people we lose access to diverse perspectives – in other words, a poll without people is really just a simulation.
That distinction matters. Research should reveal human truth, not algorithmic probability.
The Trust Gap
The biggest risk is not technological but reputational. If the public learns that “polls” are being run with artificial voices, confidence in the entire research process will erode.
A 2024 AP–NORC/USAFacts study found that two-thirds of U.S. adults do not trust AI-generated information, especially around sensitive topics such as politics or elections.
When respondents are synthetic, we lose the very foundation of representativeness. Decisions based on fabricated consensus, whether for marketing budgets or public policy, are inherently flawed.
This is not just a communications issue; it’s a quality one. ISO 20252:2019 requires clear documentation of respondent sourcing and sampling. That means organisations must show who was surveyed, how they were recruited, and what data was captured. Synthetic personas cannot satisfy those requirements unless they are clearly declared as simulations.
Where the Line Must Be Drawn
Leading industry bodies have begun to set those boundaries.
ESOMAR, the Insights Association, and CINT all agree on the same fundamentals:
- Synthetic data must never be presented as genuine public opinion.
- Any use of AI-generated responses must be clearly disclosed to clients and end users.
- Validation protocols should be documented and reproducible.
Collaborations such as Ipsos–Stanford’s research into synthetic data quality are helping to develop benchmarks and validation frameworks.
These projects test how closely synthetic answers align with real human data, and under what conditions the gap widens.
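To make that kind of validation concrete, here is a minimal sketch of what such a check might look like for a single categorical question. It is illustrative only: the response data, the choice of total variation distance, and the 0.05 tolerance are our assumptions, not a published Ipsos–Stanford benchmark.

```python
# Illustrative only: compare how closely synthetic answers track real ones
# on one categorical question. Data and threshold are hypothetical.
import pandas as pd

real = pd.Series(["agree"] * 410 + ["neutral"] * 290 + ["disagree"] * 300)
synthetic = pd.Series(["agree"] * 460 + ["neutral"] * 310 + ["disagree"] * 230)

# Convert each sample to a proportion per answer option.
real_p = real.value_counts(normalize=True)
synth_p = synthetic.value_counts(normalize=True).reindex(real_p.index, fill_value=0)

# Total variation distance: half the sum of absolute differences in proportions.
# 0 means identical distributions; 1 means completely disjoint.
tvd = 0.5 * (real_p - synth_p).abs().sum()

print(f"Total variation distance: {tvd:.3f}")
if tvd > 0.05:  # arbitrary illustrative tolerance
    print("Synthetic sample drifts from the human benchmark on this question.")
```

Real validation frameworks go much further – across many questions, subgroups, and conditions – but the principle is the same: measure the gap, and report it.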
For agencies adopting these tools, transparency must come first. That means:
- Adding an “AI participation” statement in methodology notes.
- Separating synthetic and real data within weighting structures (a simple illustration of this step follows the list).
- Explaining the intended purpose of synthetic augmentation to clients.
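As a rough illustration of the second safeguard, the sketch below keeps verified human records and synthetic records on separate tracks: only the human sample is weighted to population targets, while synthetic records are held out and labelled. The column names, targets, and the simple cell-weighting approach are assumptions for the example, not a prescribed workflow.

```python
# Illustrative only: keep synthetic and human records separable before weighting.
# Column names ("source", "weight") and the population targets are hypothetical.
import pandas as pd

responses = pd.DataFrame({
    "respondent_id": [101, 102, 103, 201, 202],
    "source":        ["human", "human", "human", "synthetic", "synthetic"],
    "age_band":      ["18-34", "35-54", "55+", "18-34", "55+"],
    "answer":        ["yes", "no", "yes", "yes", "no"],
})

# 1. Weight only the verified human sample against population targets.
human = responses[responses["source"] == "human"].copy()
targets = {"18-34": 0.30, "35-54": 0.35, "55+": 0.35}   # assumed population shares
observed = human["age_band"].value_counts(normalize=True)
human["weight"] = human["age_band"].map(lambda band: targets[band] / observed[band])

# 2. Synthetic records are never weighted into the headline figures;
#    they are reported separately and labelled as simulated.
synthetic = responses[responses["source"] == "synthetic"]

print(human[["respondent_id", "age_band", "weight"]])
print(f"{len(synthetic)} synthetic records held out and disclosed separately.")
```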
Without these safeguards, the line between insight and invention becomes dangerously blurred.
The False Comfort of Clean Data
Synthetic respondents create data that looks perfect. Too perfect.
Because they are built on mathematical averages, their responses lack the contradictions and cultural texture of genuine human input. This can create a false sense of reliability, as smooth curves often mask bias rather than reveal it.
The risk of bias amplification is well documented. Models inherit the limitations of their training data, which often underrepresent the voices of marginalised individuals. Instead of fixing representativeness, synthetic sampling can quietly reinforce exclusion.
Even real online panels struggle with this issue. Pew Research found that 4–7 per cent of respondents in opt-in polls are bogus – often bots or inattentive humans – and their inclusion can distort “right direction/wrong track” results by up to four points. Synthetic data doesn’t remove this problem; it just replaces it with new uncertainties about authenticity.
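A quick back-of-the-envelope calculation shows how that mechanism works. The figures below are invented for illustration – only the 4–7 per cent contamination range comes from Pew – but they show how a small, skewed minority can move a topline by several points.

```python
# Back-of-the-envelope illustration: a small share of bogus respondents who
# answer more positively than genuine ones shifts the topline. Numbers are
# made up apart from the contamination rate, taken from the top of Pew's range.
genuine_share, bogus_share = 0.93, 0.07   # 7% bogus respondents
genuine_right_direction = 0.30            # genuine respondents saying "right direction"
bogus_right_direction = 0.85              # bogus respondents skew heavily positive

observed = (genuine_share * genuine_right_direction
            + bogus_share * bogus_right_direction)

print(f"True figure:     {genuine_right_direction:.1%}")
print(f"Observed figure: {observed:.1%}")
print(f"Distortion:      {observed - genuine_right_direction:+.1%}")
```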
TKW’s position is straightforward: efficiency should never take precedence over veracity. Authentic voices remain irreplaceable because representativeness is the foundation of credible research.
Responsible Innovation – What “Good AI” Looks Like
Not all AI use in fieldwork is risky. The challenge is governing it.
At TKW, our approach aligns AI tools with ISO 20252:2019 and ESOMAR’s “20 Questions to Help Buyers of AI-Based Services”. That means using automation where it adds value – but never in place of human accountability.
Our workflow includes:
- Human-verified recruitment to ensure genuine participant identities.
- AI-assisted sample optimisation, not replacement, to reduce non-response bias.
- Transparent documentation of all automated steps for auditability.
AI has a powerful role to play in research design, analysis, and quality control – from flagging duplicate respondents to checking question logic. But it should always be a supporting technology, not a surrogate respondent.
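As one small example of AI-assisted quality control rather than replacement, the sketch below flags duplicate answer patterns and straight-lining in a response file. The column names and rules are hypothetical; production checks would also consider completion time, device signals, and open-ended answers.

```python
# Illustrative only: a simple quality-control pass that flags near-duplicate
# respondents and straight-liners. Column names and rules are hypothetical.
import pandas as pd

answers = pd.DataFrame({
    "respondent_id": [1, 2, 3, 4],
    "q1": [5, 3, 5, 2],
    "q2": [4, 2, 4, 2],
    "q3": [5, 1, 5, 2],
})

question_cols = ["q1", "q2", "q3"]

# Flag every row whose full answer pattern appears more than once.
answers["duplicate_pattern"] = answers.duplicated(subset=question_cols, keep=False)

# Flag straight-lining: the same answer given to every question.
answers["straight_liner"] = answers[question_cols].nunique(axis=1) == 1

print(answers[["respondent_id", "duplicate_pattern", "straight_liner"]])
```

Flags like these go to a human analyst for review – the tool surfaces suspect records, people decide what to do with them.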
Reclaiming Trust in the Age of Simulation
The future of research depends on how we manage this balance. The more data collection moves towards automation, the more critical it becomes to defend authenticity.
If we allow simulations to replace participation, we risk hollowing out the very idea of public opinion. The industry must act before regulation catches up, by setting its own red lines and enforcing transparency from within.
Emerging frameworks such as the EU AI Act are likely to formalise these disclosure standards globally.
Synthetic data can be useful for modelling and scenario planning – but only when clearly labelled as such. As a sector, we must remember that technology should amplify human voices, not fabricate them.
Key Takeaways
- Synthetic respondents can support predictive modelling and scenario testing, but they cannot replace genuine human data.
- Transparency and disclosure are non-negotiable to preserve credibility.
- ESOMAR and ISO 20252:2019 already provide frameworks for responsible AI use.
- Trust is the industry’s most valuable currency – and once lost, it’s hard to rebuild.
Want to explore how to modernise your data collection without losing the human touch?
Download our AI in Market Research White Paper, or talk to our fieldwork specialists on +61 (0)4 89 934 300.