In A Nutshell
- Across six studies, a spokesperson whose voice was more similar in timbre to a listener’s was trusted more and was more persuasive, with the effect running specifically through trust in the speaker’s competence.
- Evidence spanned 7,002 “Shark Tank” pitch pairings, 2,091 Kickstarter campaigns, and four lab experiments covering radio ads, restaurants, headphones, and a sparkling drink.
- The pull weakened when an outside credibility signal (a Kickstarter “staff pick” badge or third-party endorsement) was present, and when listeners were already familiar with the product.
A voice that resembles a listener’s own may be working on that listener’s wallet. New research finds that when a spokesperson, pitchman, or recommender sounds vocally similar to the person hearing them, that person trusts them more and is more likely to be persuaded. The similarity in question is not accent or pitch, but timbre, the particular texture and color of a voice that makes one person’s “hello” sound different from another’s.
Researchers at the University of Cincinnati, Georgia Institute of Technology, and the University of Michigan ran six studies measuring this effect, and the pattern held up across pitch contests, crowdfunding campaigns, radio ads, restaurant picks, headphones, and seltzer. Across the studies, people leaned toward the voice that sounded more like their own. They believed that voice was more competent, and they followed its advice more often.
Timing gives the work a sharper edge. Voice-cloning tools can now copy a person’s voice from under a minute of audio, and AI assistants such as Siri and Alexa are everywhere. A finding about whose voice people trust lands differently in a world where a company can build a voice to order.
Measuring vocal similarity by the numbers
Past studies of “similarity” usually leaned on human judges, who would listen to two voices and rate how alike they sounded. That approach is slow and subjective, so the team built an objective yardstick instead, using a tool from audio engineering called mel-frequency cepstral coefficients, or MFCCs. In plain terms, MFCCs break a voice into a set of numbers that capture its timbre: qualities that tend to stay fairly stable for a given speaker even as the words change. A smaller mathematical distance between two voices means they are more similar. As the authors put it, their method “utilizes machine learning methods to capture similarity in a manner beyond human hearing.”
That measure became the engine for everything that followed. Two voices could be compared without anyone having to guess.
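For readers who want to see the idea in concrete form, here is a minimal sketch of a distance-based similarity score. It assumes each voice has already been reduced to a short vector of averaged MFCC values; the function names, the sample numbers, and the choice of Euclidean distance are all illustrative assumptions, since the article does not specify the authors' exact distance formula.

```python
import math

def timbre_distance(voice_a, voice_b):
    """Euclidean distance between two MFCC summary vectors.

    Assumption: each voice is a list of averaged MFCC values of equal length.
    A smaller distance means the two voices are more similar in timbre.
    """
    if len(voice_a) != len(voice_b):
        raise ValueError("MFCC vectors must have the same length")
    return math.dist(voice_a, voice_b)

def most_similar_voice(listener, candidates):
    """Return the name of the candidate voice closest to the listener's."""
    return min(candidates, key=lambda name: timbre_distance(listener, candidates[name]))

# Made-up three-coefficient vectors, for illustration only.
listener = [1.0, 2.0, 3.0]
spokespeople = {
    "voice_a": [4.0, 6.0, 3.0],
    "voice_b": [1.5, 2.0, 3.5],
    "voice_c": [0.0, 0.0, 0.0],
}

for name, vector in spokespeople.items():
    print(name, round(timbre_distance(listener, vector), 3))
print("closest:", most_similar_voice(listener, spokespeople))
```

In practice the MFCC vectors would come from an audio library rather than being typed in by hand, and real analyses typically use more coefficients than the three shown here.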

Vocal similarity on the biggest stages
To test the idea where real money was on the line, the researchers started with “Shark Tank.” They analyzed 7,002 entrepreneur-investor pairings from the show and checked whether vocal similarity between a pitcher and a shark predicted a deal. It did. Greater similarity in timbre between an entrepreneur and an investor was linked to a higher chance of that pair closing a deal. Similarity in vocal pitch, by contrast, did not predict success, and neither did a gender match between the two. Timbre was doing the work.
Crowdfunding came next. Across 2,091 Kickstarter campaigns archived over more than a decade, the team asked a slightly different question: what happens when a spokesperson is talking not to one person but to a crowd? Here they compared each campaign narrator’s voice to the average voice of the whole sample, a stand-in for the “average” listener. Campaigns whose narrator sounded closer to that average voice raised more money and were more likely to hit their goal. A voice near the middle of the pack reaches more people, because more people hear something familiar in it.
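The "average voice" comparison can be sketched the same way. The example below assumes each narrator's voice is already summarized as a vector of averaged MFCC values, takes the coefficient-by-coefficient mean across the sample as the average voice, and measures each narrator's distance from it. The names, numbers, and use of Euclidean distance are hypothetical and are not taken from the paper.

```python
import math

def average_voice(voices):
    """Coefficient-wise mean of a list of equal-length MFCC summary vectors."""
    count = len(voices)
    return [sum(column) / count for column in zip(*voices)]

def distances_to_average(named_voices):
    """Map each narrator name to its distance from the sample's average voice."""
    center = average_voice(list(named_voices.values()))
    return {name: math.dist(vector, center) for name, vector in named_voices.items()}

# Made-up two-coefficient vectors, for illustration only.
narrators = {
    "campaign_1": [2.0, 0.0],
    "campaign_2": [0.0, 2.0],
    "campaign_3": [1.0, 1.0],
}

distances = distances_to_average(narrators)
for name, value in sorted(distances.items(), key=lambda item: item[1]):
    print(name, round(value, 3))
```

Under this setup, a narrator with a smaller distance sits closer to the middle of the sample, which is the property the Kickstarter analysis linked to stronger fundraising.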
One detail complicates the neat story, in a useful way. When a campaign carried an outside stamp of credibility, a Kickstarter “staff pick” badge, the voice mattered less. Given a strong outside signal, listeners leaned on that instead of on how the narrator sounded. Vocal similarity, then, is one cue among many, a thumb on the scale rather than the whole scale, and it loses weight when better information shows up.
Trust earned through a familiar-sounding voice
Field data can show a pattern but cannot establish cause and effect, so the team ran four controlled experiments. In one, 155 participants listened to five radio ads, then recorded their own voices so the researchers could compute similarity for each person individually. Listeners rated the more vocally similar spokesperson as more persuasive and reported being more likely to buy. The result held even after accounting for the participants’ own vocal pitch.
In another study, 185 people chose among three pizza restaurants recommended by three different voices, matched to each listener’s gender. People tended to pick the restaurant pushed by the voice that sounded most like theirs. Later experiments raised the stakes with real prizes on offer for headphones and a sparkling drink, and added a twist: participants rated how similar each voice felt to their own. Subjective similarity, the sense that a voice resembles one’s own, produced the same pull as the objective measure. What people felt about a voice tracked closely with what the math said about it.
A consistent thread ran through the experiments. A similar voice raised trust specifically in the speaker’s competence, the sense that this person knows what they are talking about, and the studies point to that trust as a key pathway through which persuasion rose. Familiarity dulled the effect slightly. With a product people already knew, a similar voice still made the recommender seem more persuasive, but it had less sway over whether someone wanted to try the product.
Why a familiar voice wins people over
The researchers offer two tentative explanations, both labeled as possibilities rather than settled fact. One is cognitive fluency: a voice that resembles a listener’s own is easier for the brain to process, and ease tends to feel like trustworthiness. The other draws on an old idea in psychology called balance theory, which holds that people use cues around them to make sense of the world. A familiar-sounding voice may signal that the speaker shares a listener’s traits, and therefore understands what that person wants. Both ideas need more testing, and the authors say so plainly.
Practical reach is where the finding gets uncomfortable. For now, a company can only aim for the broad middle, choosing voices close to an average listener. But the authors sketch a nearer future in clear terms: a call center could capture a customer’s voice in real time and adjust an AI agent’s voice to sound more like that customer’s, one person at a time. With cloning tools that need less than a minute of audio, the technical barrier is already low.
That possibility puts a familiar marketing trick on new footing. Brands have long picked spokespeople for their voices. A system that reshapes a voice to match each individual listener is a different thing, and the researchers do not pretend otherwise. They flag it as an open question for both science and policy. “It is an important question for future research to study in which contexts this would be considered helpful for the consumer and ethical for a firm to do,” they write. Trusting a voice that sounds like one’s own is human. Knowing a machine built that voice on purpose is the part worth thinking about.
Paper Notes
Limitations
The authors are candid about what their work does not yet settle. They write that more research is needed into why vocal similarity boosts persuasion, offering two untested explanations: cognitive fluency (a similar voice is easier to process) and balance theory (a similar voice signals shared traits). They also note that the mediators they explored, including trust in competence, integrity, and benevolence, along with warmth, competence, and liking, are exploratory and need further study. The two moderators they identify, an external credibility signal and product familiarity, could be tested again, and additional ones explored. While the effect held across varied contexts (fundraising, radio, restaurants, headphones, and seltzer) and across male and female spokespersons, the authors call for more work in other contexts and on individual differences such as self-esteem, need for uniqueness, and narcissism. The Kickstarter sample narrowed from 2,851 archived campaigns to the 2,091 that included both a video and a voiced speech, and the field analyses are associational rather than causal.
Funding and Disclosures
The author-accepted manuscript reviewed here does not contain a stated funding source or competing-interests declaration. Data access is documented: the data supporting Studies 1 and 2 are available on request through the Journal of Marketing Research Dataverse (https://doi.org/10.7910/DVN/HNU2EW), and the data for Studies 3 through 6 and the web appendix study are posted on the Open Science Framework (https://osf.io/gsx32/). Studies 5 and 6 were preregistered. Any formal funding acknowledgment or disclosure statement would appear in the final published version of record.
Publication Details
Authors: Na Kyong (Kimberly) Hyun, University of Cincinnati, Carl H. Lindner College of Business; Michael L. Lowe, Georgia Institute of Technology, Scheller College of Business; Aradhna Krishna, University of Michigan, Ross School of Business. Title: “Vocal Similarity, Timbre, and Persuasion in Consumer-Spokesperson Interactions.” Journal: Journal of Marketing Research (author-accepted manuscript; manuscript ID JMR-22-0449.R5). DOI: 10.1177/00222437261440557.







