Speech synthesis with native JavaScript

02-09-2023 - Andrés Cruz

Voice assistants such as Siri, Google Now and Cortana have been around for some time now. Voice recognition is one of the functions they provide, while other applications, such as Google Translate, include a speech synthesis system for different languages.

Thanks to HTML5, browsers provide an API that lets developers work with speech recognition and speech synthesis in a simple way, and its use is similar to that of other existing JavaScript APIs.

We already have half the task done

In a previous entry we talked a little about the Web Speech API and its speech recognition side (speechRecognition()), which gives our applications the ability to recognize speech, in the configured language, through the microphone of the PC or mobile device:

The speech recognition API in JavaScript: SpeechRecognition()

The SpeechSynthesisUtterance class takes a text and converts it to audio.

What remains is to explain how to do the opposite: given a text, play it as audio in the configured language using the JavaScript speech synthesis API, which is also part of the Web Speech API.

As indicated at the beginning, it is really easy to make the browser "speak" to us through the speech synthesis API; the minimum code needed would be something like this:

var speechSynthesisUtterance = new SpeechSynthesisUtterance('Hola');
window.speechSynthesis.speak(speechSynthesisUtterance);

As we can see, first we create an instance of the SpeechSynthesisUtterance class, passing as a parameter the text that will be "spoken" by the browser; then we pass this object to the speak() method of the speechSynthesis interface, which is what finally makes the browser "speak".

Properties of the SpeechSynthesisUtterance class

The SpeechSynthesisUtterance class contains a series of properties and event handlers that allow you to set how the browser will "talk"; one of them is text, which sets or gets the text that will be read aloud when the browser speaks to us through the speech synthesis API.

Like all APIs, the SpeechSynthesisUtterance class has a series of properties with which we can configure other aspects apart from the text; among the most important are the language, volume, voice, pitch and rate.

SpeechSynthesisUtterance.lang

This is one of the most important properties; it allows you to set or get the language in which the text will be spoken by the speech synthesis API.
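The language is written as a tag such as 'es-ES' or 'en-US'. As a minimal sketch, and assuming a hypothetical helper named pickVoice (it is not part of the API), the language can be paired with one of the voices returned by speechSynthesis.getVoices(); keep in mind that in some browsers that list is empty until the voiceschanged event has fired:

```javascript
// Some platforms report tags with an underscore ('es_ES'), so normalize first.
function normalize(tag) {
  return tag.replace('_', '-').toLowerCase();
}

// Hypothetical helper: returns the first voice whose language matches the
// requested tag exactly, then falls back to a voice with the same base
// language (for example 'es-MX' when 'es-ES' is not installed), or null.
function pickVoice(voices, lang) {
  const wanted = normalize(lang);
  const base = wanted.split('-')[0];
  const exact = voices.find(v => normalize(v.lang) === wanted);
  if (exact) return exact;
  const sameBase = voices.find(v => normalize(v.lang).split('-')[0] === base);
  return sameBase || null;
}

// Browser-only usage; skipped where the API is unavailable.
if (typeof window !== 'undefined' && 'speechSynthesis' in window) {
  const utterance = new SpeechSynthesisUtterance('Hola');
  utterance.lang = 'es-ES';
  const voice = pickVoice(window.speechSynthesis.getVoices(), utterance.lang);
  if (voice) utterance.voice = voice; // otherwise the browser picks a default
  window.speechSynthesis.speak(utterance);
}
```

If no voice matches, the sketch leaves the voice unset and lets the browser decide based on lang alone.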

SpeechSynthesisUtterance.pitch

Allows you to set or get the pitch of the voice; it is a real (float) value between zero (the lowest) and two (the highest), with one as the default.

SpeechSynthesisUtterance.rate

Allows you to set or get the speed at which the browser will "talk"; it is a real (float) value between zero point one (the slowest) and ten (the fastest).

SpeechSynthesisUtterance.text

This is the most important property of all; it allows us to set or get the text that we want the browser to speak.

SpeechSynthesisUtterance.volume

Sets or gets the volume of the voice, represented by a real (float) value between zero (the quietest) and one (the loudest).
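The ranges described above can be enforced before the values are assigned. The following is only a sketch: the helper name applyVoiceSettings and the LIMITS table are assumptions made for illustration, and a value of 1 is used as the fallback for pitch, rate and volume:

```javascript
// Ranges described above: pitch 0..2, rate 0.1..10, volume 0..1.
const LIMITS = {
  pitch: { min: 0, max: 2, fallback: 1 },
  rate: { min: 0.1, max: 10, fallback: 1 },
  volume: { min: 0, max: 1, fallback: 1 },
};

// Keeps a number inside [min, max]; anything that is not a finite
// number is replaced by the fallback.
function clamp(value, { min, max, fallback }) {
  if (typeof value !== 'number' || !Number.isFinite(value)) return fallback;
  return Math.min(max, Math.max(min, value));
}

// Hypothetical helper: copies the clamped settings onto an utterance
// (or any object with the same property names) and returns it.
function applyVoiceSettings(utterance, settings = {}) {
  for (const key of Object.keys(LIMITS)) {
    utterance[key] = clamp(settings[key], LIMITS[key]);
  }
  return utterance;
}
```

In a page it could be used as applyVoiceSettings(new SpeechSynthesisUtterance('Hola'), { pitch: 1.5, rate: 0.8 }) before handing the result to window.speechSynthesis.speak().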

Event handlers

You can see all the events of the speech synthesis API in the official documentation; among those that can be considered the most important or most used we have:

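// speechMessage is the utterance that will be spoken
var speechMessage = new SpeechSynthesisUtterance('Hola');

// Fired when the browser starts speaking
speechMessage.onstart = function(e) {
  console.log('Speaking...');
};

// Fired when the browser has finished speaking
speechMessage.onend = function(e) {
  console.log('Finished.');
};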
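Besides onstart and onend, an utterance also exposes handlers such as onerror, onpause and onresume. As a rough sketch, and assuming a hypothetical helper named watchUtterance (not part of the API), all of them can be registered in one pass and reported through a single callback:

```javascript
// Hypothetical helper: registers one handler per event name and forwards
// the event name (plus the error code, when there is one) to `report`.
function watchUtterance(utterance, report) {
  const names = ['start', 'end', 'error', 'pause', 'resume'];
  for (const name of names) {
    utterance['on' + name] = function (e) {
      // Error events carry the reason in `e.error`; other events do not.
      report(name, e && e.error ? e.error : null);
    };
  }
  return utterance;
}

// Browser-only usage; skipped where the API is unavailable.
if (typeof window !== 'undefined' && 'speechSynthesis' in window) {
  const watched = watchUtterance(
    new SpeechSynthesisUtterance('Hola'),
    (name, error) => console.log(name, error || '')
  );
  window.speechSynthesis.speak(watched);
}
```

Routing every event through one callback keeps the logging in a single place, at the cost of being less explicit than one function per handler.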

Browser Support

To verify browser support, simply use the following code:

if ('speechSynthesis' in window) {
  // the speech synthesis API is supported
}
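The same check can be wrapped in a function so that it is reusable and also safe to call where there is no window object at all. This is only a sketch; the name hasSpeechSynthesis is an assumption for illustration:

```javascript
// Hypothetical helper: checks both the speechSynthesis controller and the
// SpeechSynthesisUtterance constructor on the given global object.
function hasSpeechSynthesis(globalObject) {
  return Boolean(globalObject) &&
    'speechSynthesis' in globalObject &&
    typeof globalObject.SpeechSynthesisUtterance === 'function';
}

// Pass window in a browser, or null where window does not exist.
const supported = hasSpeechSynthesis(typeof window !== 'undefined' ? window : null);
console.log(supported
  ? 'Speech synthesis is available.'
  : 'Speech synthesis is not supported here.');
```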

Putting it all together. Example of speech synthesis

As is almost always the case, a simple example helps to better understand each of the properties and event handlers of the speech synthesis API seen above.
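What follows is a minimal sketch that combines the support check, the properties and the event handlers; it is not the code of the demo. The function name speak, its options object, the 'es-ES' default language and the env parameter are all assumptions made for illustration:

```javascript
// Hypothetical function: `env` defaults to the browser globals and is a
// parameter only so the sketch can be exercised outside a browser.
function speak(text, options = {}, env = globalThis) {
  if (!env.speechSynthesis || typeof env.SpeechSynthesisUtterance !== 'function') {
    console.log('Speech synthesis is not supported.');
    return null;
  }

  const utterance = new env.SpeechSynthesisUtterance(text);
  utterance.lang = options.lang || 'es-ES'; // assumed default language
  utterance.pitch = options.pitch ?? 1;     // 0 to 2
  utterance.rate = options.rate ?? 1;       // 0.1 to 10
  utterance.volume = options.volume ?? 1;   // 0 to 1

  utterance.onstart = () => console.log('Speaking...');
  utterance.onend = () => console.log('Finished.');

  env.speechSynthesis.cancel(); // drop anything that is still queued
  env.speechSynthesis.speak(utterance);
  return utterance;
}

// In a page: speak('Hola, ¿cómo estás?', { lang: 'es-ES', rate: 0.9 });
```

Calling cancel() before speak() is a design choice of this sketch: it makes each call replace the previous one instead of queuing behind it.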

Demo
