
Adding Text to Speech to JavaScript

Learn how browser speech works, its trade-offs, and how to integrate TTS2Go for high-quality AI text-to-speech in any JavaScript project.

Anthony Morris

You do not need a framework to add text-to-speech (TTS) to your website. Whether you are working with plain HTML and JavaScript, a static site generator, or a server-rendered page, adding voice to your content is straightforward.

This guide explains how browser speech works, the trade-offs of different approaches, and how to integrate TTS2Go into any JavaScript project.

How Browser Speech Synthesis Works

Modern browsers ship with the Web Speech API built in. A few lines of vanilla JavaScript are enough to make a page speak:

```js
// Create an utterance and hand it to the browser's speech queue.
const utterance = new SpeechSynthesisUtterance("Hello world");
window.speechSynthesis.speak(utterance);
```

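No npm packages, no API keys, and no server are required. It is free, and it works offline whenever the selected voice is installed locally on the device (some browser voices are streamed from a remote service and need a connection).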
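Because the API is absent in some environments (server-side rendering, older browsers), it can help to check for it before speaking. The following is a minimal sketch, not a definitive implementation: `speakText` is a hypothetical helper name, and its `scope` parameter exists only so the function can be exercised outside a browser.

```js
// Speak `text` with the Web Speech API when the environment supports it.
// Returns true if the utterance was queued, false if the API is unavailable.
// `speakText` and `scope` are illustrative names, not part of any library.
function speakText(text, options = {}, scope = globalThis) {
  const synth = scope.speechSynthesis;
  const Utterance = scope.SpeechSynthesisUtterance;
  if (!synth || typeof Utterance !== "function") {
    return false; // no Web Speech API here (for example Node.js)
  }

  const utterance = new Utterance(text);
  utterance.lang = options.lang ?? "en-US";
  utterance.rate = options.rate ?? 1; // 1 is normal speed
  utterance.pitch = options.pitch ?? 1; // 1 is normal pitch

  synth.speak(utterance);
  return true;
}
```

In a browser, `speakText("Hello world", { rate: 1.2 })` would typically be called from a click handler, since some browsers only allow speech after a user gesture.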

The downside is quality and consistency:

  • Voices often sound robotic.
  • Voice quality and selection vary between platforms.
  • Chrome on Windows sounds different from Safari on macOS.
  • You cannot guarantee a consistent listening experience for all users.
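The variation between platforms becomes visible as soon as you list the installed voices. The sketch below assumes a hypothetical helper, `pickVoice`, that chooses the closest match for a language from whatever the browser reports; the voice objects follow the shape of `SpeechSynthesisVoice` (`name`, `lang`, `default`).

```js
// Pick the best available voice for a language from a list of voices.
// `pickVoice` is an illustrative helper, not part of the Web Speech API.
function pickVoice(voices, lang = "en-US") {
  const wanted = lang.toLowerCase();
  const base = wanted.split("-")[0];
  // Some platforms report tags with an underscore, such as "en_US".
  const norm = (v) => v.lang.replace("_", "-").toLowerCase();

  const exact = voices.filter((v) => norm(v) === wanted);
  const sameLanguage = voices.filter((v) => norm(v).split("-")[0] === base);
  const candidates = exact.length > 0 ? exact : sameLanguage;

  // Prefer the platform default, otherwise take the first match.
  return candidates.find((v) => v.default) ?? candidates[0] ?? null;
}

// Browser-only usage: the voice list may load asynchronously.
if (typeof window !== "undefined" && "speechSynthesis" in window) {
  const synth = window.speechSynthesis;
  const logVoice = () => {
    const voice = pickVoice(synth.getVoices(), "en-US");
    console.log(voice ? `Using ${voice.name}` : "No matching voice installed");
  };
  logVoice();
  if (typeof synth.addEventListener === "function") {
    synth.addEventListener("voiceschanged", logVoice);
  }
}
```

Running this on two different devices will usually print two different voice names, which is the inconsistency described above.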

The Rise of AI-Generated Speech

AI text-to-speech has changed what is possible with voice on the web. Neural voice models produce audio that sounds far closer to a human speaker, with natural pauses, emphasis, and rhythm.

Pre-Generating Audio

One option is to pre-generate audio for all of your content and host the files on a CDN.

Pros

  • Instant playback for users
  • Identical quality and voice everywhere

Cons

  • You pay to generate audio for every piece of content up front, whether it gets played or not
  • Content must be known in advance
  • Updates require regenerating files
  • Storage costs grow with your content library

Pre-generation works well for small, stable catalogs, but becomes expensive and inflexible at scale.
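To make the "updates require regenerating files" cost concrete, here is a rough sketch of how a build step might detect which audio files are out of date. Everything in it is an assumption for illustration: the manifest shape, the helper names `fnv1a` and `findStaleAudio`, and the choice of a small FNV-1a-style hash (a real pipeline would more likely use SHA-256).

```js
// FNV-1a-style 32-bit hash over UTF-16 code units, as 8 hex characters.
// Used here only to notice that text or voice changed since the last build.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, "0");
}

// Return the articles whose audio is missing or out of date.
// `articles` is [{ id, text }]; `manifest` maps id -> { hash, url }
// and is assumed to have been written by the previous build.
function findStaleAudio(articles, manifest, voiceId) {
  return articles.filter(({ id, text }) => {
    const entry = manifest[id];
    return !entry || entry.hash !== fnv1a(`${voiceId}:${text}`);
  });
}
```

Every item returned by `findStaleAudio` is another paid generation and another file to store, whether or not anyone ever plays it. Changing the voice invalidates the whole catalog at once.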

Lazy Generation on Demand

A more efficient approach is to generate audio only when someone actually clicks play.

Benefits

  • You pay only for audio that users request
  • Dynamic and user-generated content is handled naturally

Challenge

Without any gating, anyone visiting your site could trigger expensive generations. You need a way to decide what gets generated and when, so you can control your budget while still offering TTS everywhere.
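One way to picture such gating is a small server-side check that every generation request must pass. The sketch below is a generic illustration under assumed rules (an origin allow-list, a per-request size cap, and a character budget); the class name `GenerationGate` and its options are hypothetical and do not describe any particular service.

```js
// Decide whether a generation request may go ahead.
// All names and thresholds are illustrative assumptions.
class GenerationGate {
  constructor({ monthlyBudgetChars, maxCharsPerRequest, allowedOrigins }) {
    this.remainingChars = monthlyBudgetChars;
    this.maxCharsPerRequest = maxCharsPerRequest;
    this.allowedOrigins = new Set(allowedOrigins);
  }

  // Returns { approved, reason } and deducts from the budget on approval.
  review({ origin, text }) {
    if (!this.allowedOrigins.has(origin)) {
      return { approved: false, reason: "origin not allowed" };
    }
    if (text.length > this.maxCharsPerRequest) {
      return { approved: false, reason: "text too long" };
    }
    if (text.length > this.remainingChars) {
      return { approved: false, reason: "budget exhausted" };
    }
    this.remainingChars -= text.length;
    return { approved: true, reason: "ok" };
  }
}
```

A rejected request does not have to mean silence for the user: the page can still fall back to browser speech while the request waits for review.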

How TTS2Go Solves This

TTS2Go takes a hybrid approach that balances quality, cost, and simplicity.

  1. Safe client-side key

You add the SDK to your site with a frontend API key. The key is protected by request-domain restrictions and rate limiting, so it can safely live in your client-side code. It identifies your project and enforces rate limits; it is not a secret.

  2. First visitor: instant browser TTS + background generation

When the first person clicks a TTS button on a piece of content, they hear browser speech synthesis immediately. At the same time, a generation request is sent to TTS2Go in the background.

  3. Approval and budget control

In your TTS2Go dashboard you can:

  • Approve generation requests manually, or
  • Configure the AI approval system to auto-approve requests that meet your criteria.

This gives you full control over your generation budget.

  4. Subsequent visitors: premium AI audio from CDN

Once a request is approved and generated, every subsequent user who clicks TTS on that same content gets high-quality AI audio in your chosen voice, served instantly from TTS2Go's CDN.

In practice: the first visitor gets browser TTS, costs are controlled through approval, and everyone after gets premium AI audio.
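The flow can be sketched in a few lines of plain JavaScript. This is a conceptual illustration only, not the TTS2Go SDK: `playWithFallback` and the four injected functions (`lookupAudio`, `playUrl`, `speakFallback`, `requestGeneration`) are hypothetical stand-ins for whatever the real client does internally.

```js
// Conceptual sketch of the hybrid flow with injected dependencies.
// None of these names come from the TTS2Go SDK; they are assumptions.
async function playWithFallback(text, deps) {
  const { lookupAudio, playUrl, speakFallback, requestGeneration } = deps;

  const url = await lookupAudio(text);
  if (url) {
    // Already approved and generated: play the hosted audio.
    await playUrl(url);
    return "ai-audio";
  }

  // First visitor: speak immediately with the browser voice.
  speakFallback(text);

  // Ask for generation in the background without blocking playback.
  // Errors are swallowed here so they never interrupt the listener.
  Promise.resolve()
    .then(() => requestGeneration(text))
    .catch(() => {});

  return "browser-tts";
}
```

Passing the dependencies in keeps the decision logic separate from the browser and network code, which also makes it easy to test with fakes.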

Step 1: Install the SDK

Install the TTS2Go vanilla JavaScript package using npm:

[Code example not rendered in the source.]

If you are not using a bundler, you can load it from a CDN with a <script> tag and access it as window.TTS2Go.

Step 2: Create the Client

Initialize a new TTS2Go instance with your project credentials:

[Code example not rendered in the source.]

Step 3: Add Text to Speech

Call tts.create() with your text and a voice ID to get a TTS instance with full playback controls:

[Code example not rendered in the source.]

From here you can explore the full TTS2Go JavaScript API, including:

  • tts.getVoices() – list available voices for your project