programming

Text to Speech in Javascript

Text to speech (also known as tts or speech synthesis) is an offshoot of the Web Speech API which provides distinct areas of functionality, including speech recognition.

Speech Synthesis

Speech synthesis (aka text-to-speech, or tts) has to do with receiving synthesising text contained within an app to speech, and playing it out of a device’s speaker or audio output connection.

The Web Speech API has a main controller interface for this — SpeechSynthesis — plus a number of closely-related interfaces for representing text to be synthesized (known as utterances), voices to be used for the utterance, etc. Again, most OSes have some kind of speech synthesis system, which will be used by the API for this task as available.

In this tutorial, we shall build a simple webpage that uses the Web Speech API to implement the text to speech.

Prerequisites

To get through with this tutorial, you should have:

  • A basic understanding of HTML & CSS
  • A basic understanding of Javascript
  • Code editor of your choice.

Let’s begin …

First, HTML

<!DOCTYPE>
<html lang="en">
  <head>
    <link href="https://cdn.jsdelivr.net/npm/bootstrap@5.0.0-beta1/dist/css/bootstrap.min.css" rel="stylesheet" />
    <link rel="stylesheet" href="index.css" />
    <title>Text to Speech</title>
  </head>
  <body class="container mt-5 bg-secondary">
    <h1 class="text-light">Codeflare Text to Speech</h1>
    <p class="lead text-light mt-4">Select Voice</p>

    <!-- Select Menu for Voice -->
    <select id="voices" class="form-select bg-secondary text-light"></select>

    <!-- Range Slliders for Volume, Rate & Pitch -->
    <div class="d-flex mt-4 text-light">
      <div>
        <p class="lead">Volume</p>
        <input type="range" min="0" max="1" value="1" step="0.1" id="volume" />
        <span id="volume-label" class="ms-2">1</span>
      </div>
      <div class="mx-5">
        <p class="lead">Rate</p>
        <input type="range" min="0.1" max="10" value="1" id="rate" step="0.1" />
        <span id="rate-label" class="ms-2">1</span>
      </div>
      <div>
        <p class="lead">Pitch</p>
        <input type="range" min="0" max="2" value="1" step="0.1" id="pitch" />
        <span id="pitch-label" class="ms-2">1</span>
      </div>
    </div>

    <!-- Text Area  for the User to Type -->
    <textarea class="form-control bg-dark text-light mt-5" cols="30" rows="10" placeholder="Type here..."></textarea>

    <!-- Control Buttons -->
    <div class="mb-5">
      <button id="start" class="btn btn-success mt-5 me-3">Start</button>
      <button id="pause" class="btn btn-warning mt-5 me-3">Pause</button>
      <button id="resume" class="btn btn-info mt-5 me-3">Resume</button>
      <button id="cancel" class="btn btn-danger mt-5 me-3">Stop</button>
    </div>
  </body>
  <script src="./main.js"></script>
</html>

Here, we are using:

  • Bootstrap to style the web page
  • A textarea where what we want to be read is typed
  • Control buttons
  • A select option where we can select available voices
  • Range sliders for volume, pitch as well as rate.
  • We are also setting our language to English because we want the text to read in English.
Text to speech in Javascript

Now, Javascript …

We first need to create an instance of the SpeechSynthesisUtterance class. This instance will be configured with various properties, such as language, volume, text.

// Initialize new SpeechSynthesisUtterance object
let speech = new SpeechSynthesisUtterance();

// Set Speech Language
speech.lang = "en";

let voices = []; // global array of available voices

window.speechSynthesis.onvoiceschanged = () => {
  // Get List of Voices
  voices = window.speechSynthesis.getVoices();

  // Initially set the First Voice in the Array.
  speech.voice = voices[0];

  // Set the Voice Select List. (Set the Index as the value, which we'll use later when the user updates the Voice using the Select Menu.)
  let voiceSelect = document.querySelector("#voices");
  voices.forEach((voice, i) => (voiceSelect.options[i] = new Option(voice.name, i)));
};

document.querySelector("#rate").addEventListener("input", () => {
  // Get rate Value from the input
  const rate = document.querySelector("#rate").value;

  // Set rate property of the SpeechSynthesisUtterance instance
  speech.rate = rate;

  // Update the rate label
  document.querySelector("#rate-label").innerHTML = rate;
});

document.querySelector("#volume").addEventListener("input", () => {
  // Get volume Value from the input
  const volume = document.querySelector("#volume").value;

  // Set volume property of the SpeechSynthesisUtterance instance
  speech.volume = volume;

  // Update the volume label
  document.querySelector("#volume-label").innerHTML = volume;
});

document.querySelector("#pitch").addEventListener("input", () => {
  // Get pitch Value from the input
  const pitch = document.querySelector("#pitch").value;

  // Set pitch property of the SpeechSynthesisUtterance instance
  speech.pitch = pitch;

  // Update the pitch label
  document.querySelector("#pitch-label").innerHTML = pitch;
});

document.querySelector("#voices").addEventListener("change", () => {
  // On Voice change, use the value of the select menu (which is the index of the voice in the global voice array)
  speech.voice = voices[document.querySelector("#voices").value];
});

document.querySelector("#start").addEventListener("click", () => {
  // Set the text property with the value of the textarea
  speech.text = document.querySelector("textarea").value;

  // Start Speaking
  window.speechSynthesis.speak(speech);
});

document.querySelector("#pause").addEventListener("click", () => {
  // Pause the speechSynthesis instance
  window.speechSynthesis.pause();
});

document.querySelector("#resume").addEventListener("click", () => {
  // Resume the paused speechSynthesis instance
  window.speechSynthesis.resume();
});

document.querySelector("#cancel").addEventListener("click", () => {
  // Cancel the speechSynthesis instance
  window.speechSynthesis.cancel();
});

Now, we have our text-to-speech logic fully functioning.

Go ahead and check the Github Repo.

See also …

7 Javascript Shortcut you should use in your next project

Share
Published by
codeflare

Recent Posts

The Golden Ratio (φ)

1. What Is the Golden Ratio? The Golden Ratio, represented by the Greek letter φ (phi), is…

1 day ago

CSS Combinators

In CSS, combinators define relationships between selectors. Instead of selecting elements individually, combinators allow you to target elements based…

4 days ago

Boolean Algebra

Below is a comprehensive, beginner-friendly, yet deeply detailed guide to Boolean Algebra, complete with definitions, laws,…

5 days ago

Why It’s Difficult to Debug Other People’s Code (And what Can be Done About it)

Debugging your own code is hard enough — debugging someone else’s code is a whole…

6 days ago

Complete Git Commands

Git is a free, open-source distributed version control system created by Linus Torvalds.It helps developers: Learn how to…

1 week ago

Bubble Sort Algorithm

Bubble Sort is one of the simplest sorting algorithms in computer science. Although it’s not…

1 week ago