Does Text-to-Speech Actually Help Struggling Readers? What the Research Says

Teachers ask us this question more than any other: does text-to-speech actually help, or is it just a crutch that prevents students from learning to read on their own?

It's a fair question. If a student can press a button and have every text read aloud to them, why would they ever develop independent reading skills? The concern is intuitive — but the research tells a more nuanced story. TTS doesn't replace reading instruction, and it isn't appropriate for every student in every context. But for students with reading difficulties, the evidence consistently shows that TTS improves comprehension, supports engagement, and — counterintuitively — can actually reinforce reading skills rather than undermine them.

Here's what the research says.

TTS Improves Reading Comprehension for Students with Disabilities

The most direct question is whether students understand more when they use TTS. Multiple studies have addressed this.

Balajthy (2005) studied the use of TTS with struggling readers in elementary school and found that students using TTS demonstrated improved reading comprehension compared to reading print alone. Critically, the benefit was most pronounced for students whose decoding skills were significantly below their listening comprehension — students who understood content when they heard it but couldn't access it through print. For these students, TTS acted as a bridge: it gave them access to the meaning of the text while they continued developing decoding skills separately.

Elkind, Black, and Murray (1996) found that adults with dyslexia who used TTS read faster, with better comprehension, and with less fatigue than when reading print alone. The study used a synchronized highlighting model where words were highlighted as they were spoken, essentially the same approach used by modern TTS tools like ReadingVox. Readers weren't just passively listening; they were reading along, with the audio providing scaffolding.

Stodden, Roberts, Takahashi, Park, and Stodden (2012) examined TTS use among struggling readers in high school and found that TTS significantly improved both reading rate and comprehension scores on standardized measures. This is important because it demonstrates that the benefits extend beyond elementary school. Students with reading disabilities don't outgrow the need for TTS accommodations; many continue to benefit through high school and into postsecondary education.

The Dual-Coding Theory: Why Hearing and Seeing Together Works

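The theoretical explanation for why TTS works comes largely from Allan Paivio's dual-coding theory (1986), which proposes that information encoded in more than one form is remembered better than information encoded in only one. Paivio framed this in terms of verbal and nonverbal (imagery) codes; applied to TTS, the argument is that pairing the printed word with its spoken form gives the reader two channels, visual and auditory, into the same content.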

When a student reads silently, they process text through a single visual channel. When they listen to audio without text, they process through a single auditory channel. But when they see text on screen while simultaneously hearing it spoken aloud — especially with word-level highlighting that synchronizes the two channels — they process through both channels simultaneously. The visual representation of each word is paired with its auditory representation at the exact moment of processing.

This isn't just theoretical. Mayer's cognitive theory of multimedia learning (2009) builds on dual-coding theory and is supported by a large body of experimental work: people learn better from words and pictures together than from words alone, and they learn better when narration is presented at the same time as the corresponding visual material rather than before or after it.

Word-level highlighting is the key mechanism that makes TTS more than just "listening to a book." Without highlighting, a student might listen while their eyes wander — effectively converting to single-channel auditory processing. With synchronized highlighting, the student's visual attention is anchored to the word being spoken, maintaining dual-channel input.

This is why the quality of highlighting matters. Approximate highlighting, where a whole sentence or phrase lights up, is less effective than word-level highlighting in which each individual word is highlighted as it is spoken. On this account, the tighter the synchronization between the highlighted word and the audio, the stronger the dual-coding effect should be.
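
To make word-level synchronization concrete, here is a minimal sketch of how a player might decide which word to highlight at a given moment. It assumes the speech engine returns per-word timing metadata alongside the audio. Amazon Polly can return this as "speech marks": one JSON object per line, with a time offset in milliseconds, the word itself, and its start and end offsets in the input text. The sample data and the helper names (parse_word_marks, word_at) are invented for illustration; this is not a description of how ReadingVox or any other product is implemented.

```python
import bisect
import json

# Hypothetical speech-mark output for the text "Plants make food."
# One JSON object per line; "time" is milliseconds from the start of the audio.
SAMPLE_MARKS = """\
{"time": 0, "type": "sentence", "start": 0, "end": 17, "value": "Plants make food."}
{"time": 6, "type": "word", "start": 0, "end": 6, "value": "Plants"}
{"time": 410, "type": "word", "start": 7, "end": 11, "value": "make"}
{"time": 650, "type": "word", "start": 12, "end": 16, "value": "food"}
"""


def parse_word_marks(raw):
    """Parse line-delimited JSON and keep only the word-level marks."""
    marks = [json.loads(line) for line in raw.splitlines() if line.strip()]
    return [m for m in marks if m["type"] == "word"]


def word_at(word_marks, playback_ms):
    """Return the mark for the word being spoken at playback_ms, or None
    if playback has not yet reached the first word."""
    times = [m["time"] for m in word_marks]
    # Index of the last word whose start time is <= the playback position.
    i = bisect.bisect_right(times, playback_ms) - 1
    return word_marks[i] if i >= 0 else None


words = parse_word_marks(SAMPLE_MARKS)
current = word_at(words, 500)
# The start/end offsets tell the display which span of text to highlight.
print(current["value"], current["start"], current["end"])
```

A player would call word_at on each animation frame with the audio element's current position. Sentence-level highlighting would use the same lookup against the sentence marks instead, which is exactly the coarser anchoring described above.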

TTS in Content Areas: Evidence from STEM

A common concern is whether TTS is "legitimate" outside of language arts. If we're measuring science knowledge, should students be allowed to use TTS on a science test?

Meyer and Bouck (2014) studied TTS use in STEM contexts with students with learning disabilities. They found that students who used TTS on science and math word problems performed significantly better than when reading the same problems in print. The TTS didn't help them solve the math — it helped them understand what the math problem was asking. For a student who can do the computation but can't decode the word problem, TTS removes the irrelevant barrier.

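This finding aligns with the distinction between accommodations and modifications. An accommodation changes how a student accesses the material; a modification changes what the student is expected to learn or demonstrate. If we're measuring science knowledge, not reading ability, then giving a student auditory access to the question is an accommodation, not a shortcut.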

Wood, Moxley, Tighe, and Wagner (2018) conducted a meta-analysis of TTS research in educational settings and found a moderate positive effect size for comprehension across content areas. The effect was strongest for students with identified reading disabilities and weaker (but still positive) for students without diagnosed disabilities who were reading below grade level.
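
For readers unfamiliar with the term, an effect size expresses the difference between two conditions in standard-deviation units, which is what lets a meta-analysis combine studies that used different tests. The sketch below computes one common measure, Cohen's d, for two independent groups. The scores are made up purely for illustration and do not come from any study cited here; individual studies and meta-analyses may use a different variant of the statistic.

```python
import math
import statistics


def cohens_d(treatment, control):
    """Standardized mean difference between two independent groups,
    using the pooled sample standard deviation."""
    n1, n2 = len(treatment), len(control)
    v1 = statistics.variance(treatment)  # sample variance (n - 1 denominator)
    v2 = statistics.variance(control)
    pooled_sd = math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
    return (statistics.mean(treatment) - statistics.mean(control)) / pooled_sd


# Hypothetical comprehension scores, invented for this example.
with_tts = [10, 14, 18, 22, 26]
print_only = [8, 12, 16, 20, 24]

d = cohens_d(with_tts, print_only)
print(f"d = {d:.2f}")  # prints "d = 0.32"
```

By Cohen's conventional benchmarks, d around 0.2 is considered small, 0.5 medium, and 0.8 large. In this invented example the groups differ by two points on average, but because scores within each group vary widely, the standardized difference is modest.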

Does TTS Become a Crutch? What the Longitudinal Evidence Shows

This is the concern that keeps teachers up at night: if students always have TTS, will they ever learn to read without it?

The research is reassuring, though with caveats. Raskind and Higgins (1999) conducted one of the few longitudinal studies on assistive technology use with students with learning disabilities and found that students who used compensatory technology (including TTS) in school were more likely to be employed and living independently as adults than similar students who didn't use assistive technology. The technology didn't make them dependent; it kept them engaged in education long enough to develop the skills they needed.

Several mechanisms explain why TTS doesn't typically become a crutch:

Vocabulary exposure. Students who use TTS encounter more text than students who avoid reading due to frustration. More text exposure means more vocabulary acquisition, more background knowledge, and more exposure to complex sentence structures — all of which support reading development.
Motivation and self-efficacy. Students who can successfully complete reading assignments — even with TTS support — develop a sense of academic competence that students who repeatedly fail at reading do not. This motivation keeps them engaged in literacy activities.
Incidental decoding practice. When a student follows along with word-level highlighting, they're getting repeated exposure to print-to-sound correspondence. They see the word "photosynthesis" while hearing it spoken, reinforcing the connection. This isn't a substitute for explicit phonics instruction, but it's additional exposure that accumulates over time.
Student self-regulation. In practice, most students with reading difficulties use TTS selectively — for difficult or unfamiliar text, for long reading assignments, for content areas where they want to focus on the content rather than the decoding. They don't use it for everything. Given choice, students naturally calibrate their use to their needs.

The important caveat: TTS is not a reading intervention. It's an accommodation. A student with dyslexia still needs structured literacy instruction — explicit, systematic phonics and decoding instruction — to build reading skills. TTS ensures they can access grade-level content while that instruction is happening. Schools that use TTS as a substitute for reading intervention are misusing the tool. Schools that use TTS alongside reading intervention are using it as intended.

What Makes TTS More or Less Effective?

Not all TTS implementations are equal. Research points to several factors that influence effectiveness:

Voice quality matters. Earlier studies using robotic, monotone TTS voices showed smaller effect sizes than studies using more natural-sounding voices. As synthesis technology has improved, from concatenative voices to neural TTS, the listening experience has become less fatiguing and more comprehensible. Neural voices like those from Amazon Polly (the engine behind ReadingVox) produce speech that's noticeably more natural than the ChromeOS built-in TTS, and this quality difference translates to better sustained attention.

Synchronized highlighting matters more than audio alone. Studies consistently show that TTS with word-level highlighting produces better outcomes than audio alone. Montali and Lewandowski (1996) found that the combined presentation (visual text + synchronized audio) was superior to either modality alone, particularly for less skilled readers. The highlighting provides the visual anchor that enables dual-coding.

Student control matters. TTS is more effective when students can control the speed, pause and resume, and replay sections. This isn't just a UX preference — it reflects the reading comprehension strategy of re-reading, which is critical for learning from text. A student who can pause TTS and replay a confusing paragraph is engaging in the same metacognitive strategy as a proficient reader who re-reads a paragraph — they're just doing it through a different modality.
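
As a rough sketch of what replay and speed control involve on the software side, the two helpers below show the core logic, assuming the player already knows the start time of each sentence in the audio. The function names, the sentence timings, and the 0.5x to 2.0x speed range are all hypothetical choices made for this example.

```python
import bisect


def replay_start(sentence_starts_ms, position_ms):
    """Return the start time of the sentence that contains position_ms,
    so a "replay" button can jump back to the beginning of that sentence."""
    i = bisect.bisect_right(sentence_starts_ms, position_ms) - 1
    return sentence_starts_ms[max(i, 0)]


def clamp_rate(requested, lowest=0.5, highest=2.0):
    """Keep a student-selected playback speed within an assumed usable range."""
    return min(max(requested, lowest), highest)


# Hypothetical sentence start times (milliseconds) for a three-sentence paragraph.
starts = [0, 3200, 7100]

print(replay_start(starts, 5000))  # jumps back to the second sentence
print(clamp_rate(3.0))             # capped at the assumed maximum speed
```

The same lookup works at the paragraph level if paragraph start times are used instead, which is the digital equivalent of a reader's eyes jumping back to the top of a passage.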

Context matters. TTS is most beneficial when the student's decoding ability is significantly below their listening comprehension. For students whose decoding and comprehension are both low, TTS helps with access but doesn't solve the comprehension problem. These students need both TTS (for access) and comprehension instruction (for understanding).

A Balanced View

We sell a TTS tool, so you should take our enthusiasm with appropriate skepticism. Here's our honest assessment:

Text-to-speech is not magic. It doesn't teach students to read. It doesn't replace skilled reading instruction, and it doesn't work equally well for all students. The research effect sizes are moderate, not overwhelming — we're talking about meaningful improvement, not transformation.

But the research is consistent, spans decades, and points clearly in one direction: for students with reading difficulties, TTS with synchronized word highlighting improves comprehension, supports engagement, and provides access to content that would otherwise be inaccessible. When used alongside — not instead of — explicit reading instruction, it makes school workable for students who would otherwise spend their days staring at text they can't decode, falling further behind in every subject.

The question isn't whether TTS helps struggling readers. The evidence says it does. The question is whether your school is providing it effectively: with high-quality voices, precise word highlighting, student control over the experience, and a culture where using it isn't stigmatized.

References

Balajthy, E. (2005). Text-to-speech software for helping struggling readers. Reading Online, 8(4).
Elkind, J., Black, M. S., & Murray, C. (1996). Computer-based compensation of adult reading disabilities. Annals of Dyslexia, 46, 159-186.
Mayer, R. E. (2009). Multimedia Learning (2nd ed.). Cambridge University Press.
Meyer, N. K., & Bouck, E. C. (2014). The impact of text-to-speech on expository reading for adolescents with LD. Journal of Special Education Technology, 29(1), 21-33.
Montali, J., & Lewandowski, L. (1996). Bimodal reading: Benefits of a talking computer for average and less skilled readers. Journal of Learning Disabilities, 29(3), 271-279.
Paivio, A. (1986). Mental Representations: A Dual Coding Approach. Oxford University Press.
Raskind, M. H., & Higgins, E. L. (1999). Speaking to read: The effects of speech recognition technology on the reading and spelling performance of children with learning disabilities. Annals of Dyslexia, 49, 251-281.
Stodden, R. A., Roberts, K. D., Takahashi, K., Park, H. J., & Stodden, N. J. (2012). Use of text-to-speech software to improve reading skills of high school struggling readers. Procedia Computer Science, 14, 359-362.
Wood, S. G., Moxley, J. H., Tighe, E. L., & Wagner, R. K. (2018). Does use of text-to-speech and related read-aloud tools improve reading comprehension for students with reading disabilities? A meta-analysis. Journal of Learning Disabilities, 51(1), 73-84.