programming

Text to Speech in Javascript

Text to speech (also known as tts or speech synthesis) is an offshoot of the Web Speech API which provides distinct areas of functionality, including speech recognition.

Speech Synthesis

Speech synthesis (aka text-to-speech, or tts) has to do with receiving synthesising text contained within an app to speech, and playing it out of a device’s speaker or audio output connection.

The Web Speech API has a main controller interface for this — SpeechSynthesis — plus a number of closely-related interfaces for representing text to be synthesized (known as utterances), voices to be used for the utterance, etc. Again, most OSes have some kind of speech synthesis system, which will be used by the API for this task as available.

In this tutorial, we shall build a simple webpage that uses the Web Speech API to implement the text to speech.

Prerequisites

To get through with this tutorial, you should have:

  • A basic understanding of HTML & CSS
  • A basic understanding of Javascript
  • Code editor of your choice.

Let’s begin …

First, HTML

<!DOCTYPE>
<html lang="en">
  <head>
    <link href="https://cdn.jsdelivr.net/npm/bootstrap@5.0.0-beta1/dist/css/bootstrap.min.css" rel="stylesheet" />
    <link rel="stylesheet" href="index.css" />
    <title>Text to Speech</title>
  </head>
  <body class="container mt-5 bg-secondary">
    <h1 class="text-light">Codeflare Text to Speech</h1>
    <p class="lead text-light mt-4">Select Voice</p>

    <!-- Select Menu for Voice -->
    <select id="voices" class="form-select bg-secondary text-light"></select>

    <!-- Range Slliders for Volume, Rate & Pitch -->
    <div class="d-flex mt-4 text-light">
      <div>
        <p class="lead">Volume</p>
        <input type="range" min="0" max="1" value="1" step="0.1" id="volume" />
        <span id="volume-label" class="ms-2">1</span>
      </div>
      <div class="mx-5">
        <p class="lead">Rate</p>
        <input type="range" min="0.1" max="10" value="1" id="rate" step="0.1" />
        <span id="rate-label" class="ms-2">1</span>
      </div>
      <div>
        <p class="lead">Pitch</p>
        <input type="range" min="0" max="2" value="1" step="0.1" id="pitch" />
        <span id="pitch-label" class="ms-2">1</span>
      </div>
    </div>

    <!-- Text Area  for the User to Type -->
    <textarea class="form-control bg-dark text-light mt-5" cols="30" rows="10" placeholder="Type here..."></textarea>

    <!-- Control Buttons -->
    <div class="mb-5">
      <button id="start" class="btn btn-success mt-5 me-3">Start</button>
      <button id="pause" class="btn btn-warning mt-5 me-3">Pause</button>
      <button id="resume" class="btn btn-info mt-5 me-3">Resume</button>
      <button id="cancel" class="btn btn-danger mt-5 me-3">Stop</button>
    </div>
  </body>
  <script src="./main.js"></script>
</html>

Here, we are using:

  • Bootstrap to style the web page
  • A textarea where what we want to be read is typed
  • Control buttons
  • A select option where we can select available voices
  • Range sliders for volume, pitch as well as rate.
  • We are also setting our language to English because we want the text to read in English.
Text to speech in Javascript

Now, Javascript …

We first need to create an instance of the SpeechSynthesisUtterance class. This instance will be configured with various properties, such as language, volume, text.

// Initialize new SpeechSynthesisUtterance object
let speech = new SpeechSynthesisUtterance();

// Set Speech Language
speech.lang = "en";

let voices = []; // global array of available voices

window.speechSynthesis.onvoiceschanged = () => {
  // Get List of Voices
  voices = window.speechSynthesis.getVoices();

  // Initially set the First Voice in the Array.
  speech.voice = voices[0];

  // Set the Voice Select List. (Set the Index as the value, which we'll use later when the user updates the Voice using the Select Menu.)
  let voiceSelect = document.querySelector("#voices");
  voices.forEach((voice, i) => (voiceSelect.options[i] = new Option(voice.name, i)));
};

document.querySelector("#rate").addEventListener("input", () => {
  // Get rate Value from the input
  const rate = document.querySelector("#rate").value;

  // Set rate property of the SpeechSynthesisUtterance instance
  speech.rate = rate;

  // Update the rate label
  document.querySelector("#rate-label").innerHTML = rate;
});

document.querySelector("#volume").addEventListener("input", () => {
  // Get volume Value from the input
  const volume = document.querySelector("#volume").value;

  // Set volume property of the SpeechSynthesisUtterance instance
  speech.volume = volume;

  // Update the volume label
  document.querySelector("#volume-label").innerHTML = volume;
});

document.querySelector("#pitch").addEventListener("input", () => {
  // Get pitch Value from the input
  const pitch = document.querySelector("#pitch").value;

  // Set pitch property of the SpeechSynthesisUtterance instance
  speech.pitch = pitch;

  // Update the pitch label
  document.querySelector("#pitch-label").innerHTML = pitch;
});

document.querySelector("#voices").addEventListener("change", () => {
  // On Voice change, use the value of the select menu (which is the index of the voice in the global voice array)
  speech.voice = voices[document.querySelector("#voices").value];
});

document.querySelector("#start").addEventListener("click", () => {
  // Set the text property with the value of the textarea
  speech.text = document.querySelector("textarea").value;

  // Start Speaking
  window.speechSynthesis.speak(speech);
});

document.querySelector("#pause").addEventListener("click", () => {
  // Pause the speechSynthesis instance
  window.speechSynthesis.pause();
});

document.querySelector("#resume").addEventListener("click", () => {
  // Resume the paused speechSynthesis instance
  window.speechSynthesis.resume();
});

document.querySelector("#cancel").addEventListener("click", () => {
  // Cancel the speechSynthesis instance
  window.speechSynthesis.cancel();
});

Now, we have our text-to-speech logic fully functioning.

Go ahead and check the Github Repo.

See also …

7 Javascript Shortcut you should use in your next project

Recent Posts

ReferenceError vs. TypeError: What’s the Difference?

When debugging JavaScript, you’ll often encounter ReferenceError and TypeError. While both indicate something went wrong,…

10 hours ago

document.querySelector() vs. getElementById(): Which is Faster?

When selecting DOM elements in JavaScript, two common methods are document.querySelector() and document.getElementById(). But which…

10 hours ago

npm vs. Yarn: Which Package Manager Should You Use in 2025?

When starting a JavaScript project, one of the first decisions you’ll face is: Should I…

3 days ago

Why Learn Software Development? (And Where to Start)

Software development is one of the most valuable skills you can learn. From building websites…

6 days ago

JavaScript Multidimensional Arrays

In JavaScript, arrays are used to store multiple values in a single variable. While JavaScript…

2 weeks ago

What is Containerization

Containerization is a lightweight form of virtualization that packages an application and its dependencies into…

2 weeks ago