programming

Text to Speech in Javascript

Text to speech (also known as tts or speech synthesis) is an offshoot of the Web Speech API which provides distinct areas of functionality, including speech recognition.

Speech Synthesis

Speech synthesis (aka text-to-speech, or tts) has to do with receiving synthesising text contained within an app to speech, and playing it out of a device’s speaker or audio output connection.

The Web Speech API has a main controller interface for this — SpeechSynthesis — plus a number of closely-related interfaces for representing text to be synthesized (known as utterances), voices to be used for the utterance, etc. Again, most OSes have some kind of speech synthesis system, which will be used by the API for this task as available.

In this tutorial, we shall build a simple webpage that uses the Web Speech API to implement the text to speech.

Prerequisites

To get through with this tutorial, you should have:

  • A basic understanding of HTML & CSS
  • A basic understanding of Javascript
  • Code editor of your choice.

Let’s begin …

First, HTML

<!DOCTYPE>
<html lang="en">
  <head>
    <link href="https://cdn.jsdelivr.net/npm/bootstrap@5.0.0-beta1/dist/css/bootstrap.min.css" rel="stylesheet" />
    <link rel="stylesheet" href="index.css" />
    <title>Text to Speech</title>
  </head>
  <body class="container mt-5 bg-secondary">
    <h1 class="text-light">Codeflare Text to Speech</h1>
    <p class="lead text-light mt-4">Select Voice</p>

    <!-- Select Menu for Voice -->
    <select id="voices" class="form-select bg-secondary text-light"></select>

    <!-- Range Slliders for Volume, Rate & Pitch -->
    <div class="d-flex mt-4 text-light">
      <div>
        <p class="lead">Volume</p>
        <input type="range" min="0" max="1" value="1" step="0.1" id="volume" />
        <span id="volume-label" class="ms-2">1</span>
      </div>
      <div class="mx-5">
        <p class="lead">Rate</p>
        <input type="range" min="0.1" max="10" value="1" id="rate" step="0.1" />
        <span id="rate-label" class="ms-2">1</span>
      </div>
      <div>
        <p class="lead">Pitch</p>
        <input type="range" min="0" max="2" value="1" step="0.1" id="pitch" />
        <span id="pitch-label" class="ms-2">1</span>
      </div>
    </div>

    <!-- Text Area  for the User to Type -->
    <textarea class="form-control bg-dark text-light mt-5" cols="30" rows="10" placeholder="Type here..."></textarea>

    <!-- Control Buttons -->
    <div class="mb-5">
      <button id="start" class="btn btn-success mt-5 me-3">Start</button>
      <button id="pause" class="btn btn-warning mt-5 me-3">Pause</button>
      <button id="resume" class="btn btn-info mt-5 me-3">Resume</button>
      <button id="cancel" class="btn btn-danger mt-5 me-3">Stop</button>
    </div>
  </body>
  <script src="./main.js"></script>
</html>

Here, we are using:

  • Bootstrap to style the web page
  • A textarea where what we want to be read is typed
  • Control buttons
  • A select option where we can select available voices
  • Range sliders for volume, pitch as well as rate.
  • We are also setting our language to English because we want the text to read in English.
Text to speech in Javascript

Now, Javascript …

We first need to create an instance of the SpeechSynthesisUtterance class. This instance will be configured with various properties, such as language, volume, text.

// Initialize new SpeechSynthesisUtterance object
let speech = new SpeechSynthesisUtterance();

// Set Speech Language
speech.lang = "en";

let voices = []; // global array of available voices

window.speechSynthesis.onvoiceschanged = () => {
  // Get List of Voices
  voices = window.speechSynthesis.getVoices();

  // Initially set the First Voice in the Array.
  speech.voice = voices[0];

  // Set the Voice Select List. (Set the Index as the value, which we'll use later when the user updates the Voice using the Select Menu.)
  let voiceSelect = document.querySelector("#voices");
  voices.forEach((voice, i) => (voiceSelect.options[i] = new Option(voice.name, i)));
};

document.querySelector("#rate").addEventListener("input", () => {
  // Get rate Value from the input
  const rate = document.querySelector("#rate").value;

  // Set rate property of the SpeechSynthesisUtterance instance
  speech.rate = rate;

  // Update the rate label
  document.querySelector("#rate-label").innerHTML = rate;
});

document.querySelector("#volume").addEventListener("input", () => {
  // Get volume Value from the input
  const volume = document.querySelector("#volume").value;

  // Set volume property of the SpeechSynthesisUtterance instance
  speech.volume = volume;

  // Update the volume label
  document.querySelector("#volume-label").innerHTML = volume;
});

document.querySelector("#pitch").addEventListener("input", () => {
  // Get pitch Value from the input
  const pitch = document.querySelector("#pitch").value;

  // Set pitch property of the SpeechSynthesisUtterance instance
  speech.pitch = pitch;

  // Update the pitch label
  document.querySelector("#pitch-label").innerHTML = pitch;
});

document.querySelector("#voices").addEventListener("change", () => {
  // On Voice change, use the value of the select menu (which is the index of the voice in the global voice array)
  speech.voice = voices[document.querySelector("#voices").value];
});

document.querySelector("#start").addEventListener("click", () => {
  // Set the text property with the value of the textarea
  speech.text = document.querySelector("textarea").value;

  // Start Speaking
  window.speechSynthesis.speak(speech);
});

document.querySelector("#pause").addEventListener("click", () => {
  // Pause the speechSynthesis instance
  window.speechSynthesis.pause();
});

document.querySelector("#resume").addEventListener("click", () => {
  // Resume the paused speechSynthesis instance
  window.speechSynthesis.resume();
});

document.querySelector("#cancel").addEventListener("click", () => {
  // Cancel the speechSynthesis instance
  window.speechSynthesis.cancel();
});

Now, we have our text-to-speech logic fully functioning.

Go ahead and check the Github Repo.

See also …

7 Javascript Shortcut you should use in your next project

Recent Posts

South Korea bans downloads of DeepSeek AI until further notice

The South Korean government announced on Monday that it had temporarily halted new downloads of…

5 days ago

Which programming language is best for software development?

As a software developer, choosing the right programming language for software development can be a…

1 week ago

What is a server farm?

A server farm, also known as a server cluster or data center, is a collection…

1 week ago

Elon Musk’s Starlink satellite internet service is awaiting authorization to begin operations in Pakistan.

Pakistan's mobile and broadband internet speeds rank in the bottom 10 percent globally, according to…

2 weeks ago

React Native Styling Guide

React Native is a popular framework for building cross-platform mobile applications using JavaScript and React.…

2 weeks ago

You can now customize your Android home screen shortcut with any widget of your choice

Google is not only passionate about developing innovative apps and services but also about finding…

2 weeks ago