
Adding Text to Speech to JavaScript

Learn how browser speech works, its trade-offs, and how to integrate TTS2Go for high-quality AI text-to-speech in any JavaScript project.

Anthony Morris·March 30, 2026

You do not need a framework to add text-to-speech (TTS) to your website. Whether you are working with plain HTML and JavaScript, a static site generator, or a server-rendered page, adding voice to your content is straightforward.

This guide explains how browser speech works, the trade-offs of different approaches, and how to integrate TTS2Go into any JavaScript project.

How Browser Speech Synthesis Works

Modern browsers ship with the Web Speech API built in. With a few lines of vanilla JavaScript you can make the browser read text aloud:

```js
const utterance = new SpeechSynthesisUtterance("Hello world");
window.speechSynthesis.speak(utterance);
```

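No npm packages, no API keys, and no server are required. It is free, and it works offline whenever the selected voice is installed on the device (some browsers also offer voices that need a network connection).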
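
In practice you will usually want to check that the API is available and set a few properties on the utterance before speaking. The following is a minimal sketch: the `speakText` helper and its option names are illustrative, while `lang`, `rate`, and `pitch` are standard `SpeechSynthesisUtterance` properties.

```js
// Hypothetical helper: speaks `text` with the Web Speech API if it is available.
// `env` defaults to the global object so the browser APIs can be swapped out in tests.
function speakText(text, options = {}, env = globalThis) {
  const synth = env.speechSynthesis;
  const Utterance = env.SpeechSynthesisUtterance;

  // Feature detection: non-browser or very old environments lack these globals.
  if (!synth || typeof Utterance !== "function") {
    return false;
  }

  const utterance = new Utterance(text);
  utterance.lang = options.lang ?? "en-US"; // BCP 47 language tag
  utterance.rate = options.rate ?? 1;       // 0.1 to 10, where 1 is normal speed
  utterance.pitch = options.pitch ?? 1;     // 0 to 2, where 1 is normal pitch

  synth.cancel();         // drop anything still queued or speaking
  synth.speak(utterance); // queue the new utterance
  return true;
}
```

Calling `speakText("Hello world", { rate: 0.9 })` from a click handler is the safest pattern, since some browsers refuse to speak until the user has interacted with the page.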

The downside is quality and consistency:

Voices often sound robotic.
Voice quality and selection vary between platforms: Chrome on Windows sounds different from Safari on macOS.
You cannot guarantee a consistent listening experience for all users.
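
You can reduce, though not remove, this variation by choosing a voice explicitly instead of accepting the browser default. The sketch below uses the standard `getVoices()` method and `voiceschanged` event; the `pickVoice` and `loadVoices` helpers and the scoring rule are assumptions made for illustration.

```js
// Hypothetical helper: pick the best available voice for a language tag.
// `voices` is the array returned by speechSynthesis.getVoices().
function pickVoice(voices, lang) {
  const wanted = lang.toLowerCase();
  const base = wanted.split("-")[0];

  const score = (voice) => {
    // Normalize tags that use an underscore, such as "en_US".
    const voiceLang = voice.lang.toLowerCase().replace("_", "-");
    let points;
    if (voiceLang === wanted) points = 4;                   // exact match, e.g. en-GB
    else if (voiceLang.split("-")[0] === base) points = 2;  // same language, other region
    else return -1;                                         // different language
    if (voice.localService) points += 1;                    // installed on the device
    return points;
  };

  let best = null;
  let bestScore = -1;
  for (const voice of voices) {
    const s = score(voice);
    if (s > bestScore) {
      best = voice;
      bestScore = s;
    }
  }
  return best; // null when no voice matches the language
}

// Browser-only: the voice list may load asynchronously, so wait for `voiceschanged`.
// This never resolves if the browser has no voices; add a timeout in production.
function loadVoices(synth) {
  return new Promise((resolve) => {
    const voices = synth.getVoices();
    if (voices.length > 0) {
      resolve(voices);
      return;
    }
    synth.addEventListener("voiceschanged", () => resolve(synth.getVoices()), { once: true });
  });
}
```

In a browser you would assign the result to `utterance.voice` before calling `speak()`. Users on different platforms will still hear different voices, because the list itself differs per device.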

The Rise of AI-Generated Speech

AI text-to-speech has transformed what is possible with voice on the web. Neural voice models produce audio that sounds genuinely human, with natural pauses, emphasis, and rhythm.

Pre-Generating Audio

One option is to pre-generate audio for all of your content and host the files on a CDN.

Pros

Instant playback for users
Identical quality and voice everywhere

Cons

You pay to generate audio for every piece of content up front, whether it gets played or not
Content must be known in advance
Updates require regenerating files
Storage costs grow with your content library

Pre-generation works well for small, stable catalogs, but becomes expensive and inflexible at scale.

Lazy Generation on Demand

A more efficient approach is to generate audio only when someone actually clicks play.

Benefits

You pay only for audio that users request
Dynamic and user-generated content is handled naturally
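
A rough calculation shows why this matters. Every number below is hypothetical, including the price, and the sketch ignores storage costs and assumes each played piece is generated once and then cached.

```js
// All inputs are hypothetical; substitute your provider's real pricing.
function estimateGenerationCost({ articles, avgChars, pricePerMillionChars, playedFraction }) {
  const totalChars = articles * avgChars;
  // Pre-generation pays for every character in the catalog.
  const preGenerated = (totalChars / 1_000_000) * pricePerMillionChars;
  // On-demand generation pays only for the share of content that gets played.
  const onDemand = preGenerated * playedFraction;
  return { preGenerated, onDemand, saved: preGenerated - onDemand };
}

const estimate = estimateGenerationCost({
  articles: 2000,
  avgChars: 5000,
  pricePerMillionChars: 20, // hypothetical price per million characters
  playedFraction: 0.25,     // assume a quarter of the catalog is ever played
});

console.log(estimate); // { preGenerated: 200, onDemand: 50, saved: 150 }
```

The smaller the share of your catalog that people actually listen to, the larger the gap between the two approaches.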

Challenge

Without any gating, anyone visiting your site could trigger expensive generations. You need a way to decide what gets generated and when, so you can control your budget while still offering TTS everywhere.

How TTS2Go Solves This

TTS2Go takes a hybrid approach that balances quality, cost, and simplicity.

Safe client-side key

You add the SDK to your site with a frontend API key. Requests made with the key are restricted by domain and rate limited, so it can safely live in your client-side code. The key identifies your project and enforces rate limits; it is not a secret.
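
To make the idea concrete, here is a sketch of how a server could combine a domain allowlist with a rate limit for a public key. This illustrates the general technique only and is not TTS2Go's implementation; `createKeyGuard` and its parameters are hypothetical.

```js
// Illustrative only: a generic guard for a public, client-side API key.
function createKeyGuard({ allowedOrigins, maxRequests, windowMs }) {
  const windows = new Map(); // client id -> { start, count }

  return function isAllowed(origin, clientId, now = Date.now()) {
    // Domain restriction: reject requests whose Origin is not on the allowlist.
    if (!allowedOrigins.includes(origin)) return false;

    // Fixed-window rate limit per client (for example, per IP address).
    const current = windows.get(clientId);
    if (!current || now - current.start >= windowMs) {
      windows.set(clientId, { start: now, count: 1 });
      return true;
    }
    if (current.count >= maxRequests) return false;
    current.count += 1;
    return true;
  };
}
```

The two checks complement each other: browsers set the `Origin` header themselves, but a script running outside a browser can send any value, so the rate limit caps the damage a copied key can do.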

First visitor: instant browser TTS + background generation

When the first person clicks a TTS button on a piece of content, they hear browser speech synthesis immediately. At the same time, a generation request is sent to TTS2Go in the background.

Approval and budget control

In your TTS2Go dashboard you can:

Approve generation requests manually, or
Configure the AI approval system to auto-approve requests that meet your criteria.

This gives you full control over your generation budget.
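
The kind of criteria involved can be pictured as a simple rule. The function below is a plain rule-based stand-in, not the AI approval system itself, and every field name, threshold, and price in it is an assumption.

```js
// Hypothetical approval rule; the real dashboard options may differ.
function shouldAutoApprove(request, policy, spentThisMonth) {
  // Reject very long texts outright.
  if (request.text.length > policy.maxChars) return false;
  // Only generate audio for content that listeners have asked for often enough.
  if (request.playRequests < policy.minPlayRequests) return false;
  // Approve only if the estimated cost still fits the monthly budget.
  const estimatedCost = (request.text.length / 1_000_000) * policy.pricePerMillionChars;
  return spentThisMonth + estimatedCost <= policy.monthlyBudget;
}

const policy = {
  maxChars: 20000,
  minPlayRequests: 3,
  pricePerMillionChars: 20, // hypothetical price
  monthlyBudget: 10,
};

const request = { text: "a".repeat(1000), playRequests: 5 };
console.log(shouldAutoApprove(request, policy, 9.5)); // true
console.log(shouldAutoApprove(request, policy, 10));  // false
```

Requests that fail such a rule would wait for manual approval rather than being generated automatically.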

Subsequent visitors: premium AI audio from CDN

Once a request is approved and generated, every subsequent user who clicks TTS on that same content gets high-quality AI audio in your chosen voice, served instantly from TTS2Go's CDN.

In practice: the first visitor gets browser TTS, costs are controlled through approval, and everyone after gets premium AI audio.
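
The flow can be sketched in a few lines. The SDK handles this for you, so the code below is only an illustration of the logic: `playContent` and the four injected functions are hypothetical names, not part of the TTS2Go API.

```js
// Illustrative flow only. All dependencies are injected so the logic
// can be read (and run) without a browser or a network connection.
async function playContent(text, { lookupAudioUrl, playUrl, speakInBrowser, requestGeneration }) {
  const url = await lookupAudioUrl(text);
  if (url) {
    // Later visitors: approved audio already exists, so play it from the CDN.
    await playUrl(url);
    return "cdn";
  }

  // First visitor: speak immediately with the browser's built-in voice.
  speakInBrowser(text);

  // Fire and forget: ask for generation in the background and ignore failures,
  // because the listener is already hearing browser speech.
  void Promise.resolve()
    .then(() => requestGeneration(text))
    .catch(() => {});

  return "browser";
}
```

The important design choice is that playback never waits on generation: the first listener hears something right away, and the request for AI audio cannot break their experience if it fails.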

Step 1: Install the SDK

Install the TTS2Go vanilla JavaScript package using npm:


```sh
npm install @tts2go/vanilla
```

If you are not using a bundler, you can load it from a CDN with a <script> tag and access it as window.TTS2Go.

Step 2: Create the Client

Initialize a new TTS2Go instance with your project credentials:


```js
import { TTS2Go } from "@tts2go/vanilla";

// Replace the placeholders with the values from your TTS2Go project.
const tts = new TTS2Go({
  apiKey: "your-api-key",
  projectId: "your-project-id",
});
```

Step 3: Add Text to Speech

Call tts.create() with your text and a voice ID to get a TTS instance with full playback controls:


```js
const instance = tts.create(
  "Hello from TTS2Go in plain JavaScript!",
  // add your voice ID here
);
```

From here you can explore the full TTS2Go JavaScript API, including:

tts.getVoices() – list available voices for your project
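
As a small sketch, the helper below lists the voices for a client. Only the `tts.getVoices()` call comes from the SDK; the `logVoices` name, whether the call returns a promise, and the shape of each voice object are assumptions, so check the API reference before relying on them.

```js
// `tts` is the client created in Step 2, passed in as a parameter.
async function logVoices(tts) {
  // `await` works whether getVoices() returns a promise or a plain value.
  const voices = await tts.getVoices();
  console.log(voices); // inspect the result to see which fields are available
  return voices;
}
```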
