Integrate Azure Cognitive Services Text to Speech into Your React.JS App


This blog post will show you how to easily integrate the text-to-speech feature of Azure Cognitive Services into your React.JS application.

Jul. 31, 2022
Mark Deanil Vicente - dnilvincent.net
Introduction

Text-to-speech, or TTS for short, is a type of assistive technology that reads digital text aloud. It brings a lot of benefits to different industries, whether in teaching, announcements, broadcasting, or entertainment, and even to you when you're bored reading a long text. Why not just listen to it instead?

This technology not only makes most of us more efficient, it is also beneficial to people with literacy difficulties, learning disabilities, or reduced vision, and to those learning a language.

The top cloud providers offer text-to-speech APIs, such as Microsoft Azure Text to Speech, Google Cloud Text-to-Speech, and Amazon Polly. Other cloud services offer this as well, so which one you choose is up to you.

In this blog, though, we'll be using Microsoft Azure Cognitive Services Text to Speech. Without further ado, let's start coding!

FTF (First Things First)
Prerequisites

1. A machine with your text editor/IDE

2. A Microsoft Azure account (try it for free)

3. A React.JS application

1. Create a Cognitive Services Resource in the Azure Portal

1.1 Create a resource in the Azure Portal. (Make sure you already have an Azure subscription, whether free or paid.)


Below is a sample. Click "Create", then once the deployment is done, click the "Go to resource" button.


1.2 Click "Click here to manage keys" to navigate to the keys section.


1.3 Save the keys and the region, because we are going to need them in our React.JS configuration.

2. Install the Package in Your React.JS App

2.1 Run npm i microsoft-cognitiveservices-speech-sdk in your project directory.

https://www.npmjs.com/package/microsoft-cognitiveservices-speech-sdk
3. Configure the React.JS App

For my sample, I'm using the React.JS TypeScript project template. For this demo, I'm overwriting the App.tsx file.

3.1 Import the package below.

    const sdk = require("microsoft-cognitiveservices-speech-sdk");
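
If your project is set up for ES modules (the default for a Create React App TypeScript template), the equivalent import below should also work; which one you use is just a matter of module style:

    // Alternative to require(): import the SDK as an ES module namespace.
    import * as sdk from "microsoft-cognitiveservices-speech-sdk";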

3.2 Inside your component function, configure the Speech SDK.

    const key = "YOUR_KEY_FROM_YOUR_COGNITIVE_SERVICE";
    const region = "westus2";
    const speechConfig = sdk.SpeechConfig.fromSubscription(key, region);
    // The language of the voice that speaks.
    speechConfig.speechSynthesisVoiceName = "en-US-JennyNeural";

    // Create the speech synthesizer.
    let synthesizer = new sdk.SpeechSynthesizer(speechConfig);
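
Optionally, you can pass an audio configuration as a second argument to control where the audio is played. A small variation of the last two lines above, which, as far as I know from the JavaScript SDK, routes playback explicitly to the browser's default speaker:

    // Optional variation: explicitly route synthesized audio to the default speaker.
    const audioConfig = sdk.AudioConfig.fromDefaultSpeakerOutput();
    let synthesizer = new sdk.SpeechSynthesizer(speechConfig, audioConfig);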

3.3 Create a function that invokes speakTextAsync on the SDK's synthesizer.

    const test = () => {
      synthesizer.speakTextAsync(
        "Enter your text here.",
        function (result: any) {
          if (result.reason === sdk.ResultReason.SynthesizingAudioCompleted) {
            console.log("synthesis finished.");
          } else {
            console.error(`Speech synthesis canceled: ${result.errorDetails} \nDid you set the speech resource key and region values?`);
          }
          synthesizer.close();
          synthesizer = null;
        },
        function (err: any) {
          console.trace(`Error: ${err}`);
          synthesizer.close();
          synthesizer = null;
        }
      );
    };
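
If you need finer control over the output (pauses, speaking rate, pitch), the synthesizer also exposes speakSsmlAsync, which takes SSML markup instead of plain text. Below is a minimal sketch reusing the same synthesizer and voice configured above; the prosody rate value is just an example:

    const ssmlTest = () => {
      // SSML wraps the text and lets you adjust prosody; -10% slows the voice slightly.
      const ssml = `<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">
        <voice name="en-US-JennyNeural">
          <prosody rate="-10%">Enter your text here.</prosody>
        </voice>
      </speak>`;
      synthesizer.speakSsmlAsync(
        ssml,
        function (result: any) {
          if (result.reason === sdk.ResultReason.SynthesizingAudioCompleted) {
            console.log("SSML synthesis finished.");
          }
          synthesizer.close();
        },
        function (err: any) {
          console.trace(`Error: ${err}`);
          synthesizer.close();
        }
      );
    };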

3.4 Call the function inside a useEffect, or create a UI with a button and an input that triggers the function.

Below is sample code that you can paste and play around with on your machine.

    import { useState } from "react";
    const sdk = require("microsoft-cognitiveservices-speech-sdk");

    function App() {
      const key = "YOUR_KEY_FROM_YOUR_COGNITIVE_SERVICE";
      const region = "westus2";
      const speechConfig = sdk.SpeechConfig.fromSubscription(key, region);
      // The language of the voice that speaks.
      speechConfig.speechSynthesisVoiceName = "en-US-JennyNeural";

      // Create the speech synthesizer.
      let synthesizer = new sdk.SpeechSynthesizer(speechConfig);

      const [text, setText] = useState("");
      const [loading, setLoading] = useState(false);

      const test = () => {
        setLoading(true);
        synthesizer.speakTextAsync(
          text,
          function (result: any) {
            if (result.reason === sdk.ResultReason.SynthesizingAudioCompleted) {
              console.log("synthesis finished.");
            } else {
              console.error(`Speech synthesis canceled: ${result.errorDetails} \nDid you set the speech resource key and region values?`);
            }
            synthesizer.close();
            synthesizer = null;
            setLoading(false);
          },
          function (err: any) {
            console.trace(`Error: ${err}`);
            synthesizer.close();
            synthesizer = null;
            // Reset the loading state so the button comes back after an error.
            setLoading(false);
          }
        );
      };

      return (
        <div className="App">
          <input onChange={(e) => setText(e.target.value)} />
          {loading ? "Loading..." : <button onClick={test}>Run Text to Speech</button>}
        </div>
      );
    }

    export default App;
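
One last note on the key: hard-coding it in App.tsx is fine for a quick demo, but you'll want to keep it out of source control for anything you share. A minimal sketch, assuming a Create React App setup where environment variables prefixed with REACT_APP_ are inlined at build time (the variable names below are my own, not part of the SDK):

    // Read the key and region from a local .env file that is not committed.
    // REACT_APP_SPEECH_KEY and REACT_APP_SPEECH_REGION are hypothetical names.
    const key = process.env.REACT_APP_SPEECH_KEY as string;
    const region = process.env.REACT_APP_SPEECH_REGION as string;
    const speechConfig = sdk.SpeechConfig.fromSubscription(key, region);

Keep in mind that the key still ends up in the client-side bundle either way, so for a production app you'd typically issue short-lived tokens from your own backend instead.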
Additional Info

You can change the speechSynthesisVoiceName, or speech voice, to any entry from this complete list.
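
For example (en-GB-SoniaNeural is just one voice name picked from that list):

    // Any voice name from the linked list works here; this one is a British English neural voice.
    speechConfig.speechSynthesisVoiceName = "en-GB-SoniaNeural";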

You can read more about this API in this documentation.

If you have any questions or comments, please drop them below 👇 :)

Buy Me A Tea