IBM Text to Speech

Public API Docs: IBM Watson Text-to-Speech

The IBM Watson Text-to-Speech API converts written text into spoken audio. Developers can use it to add speech output to a variety of applications. In this article, we’ll look at how to call the API from JavaScript.

Getting Started

Before you can start using the IBM Watson Text-to-Speech API, you’ll need to sign up for a free IBM Cloud account. Once you’ve done that, you can create a new Text-to-Speech service instance by following these steps:

  1. Log in to the IBM Cloud Console.
  2. Open the Catalog and select the Text-to-Speech service.
  3. Configure your service instance by giving it a name and selecting your pricing plan.
  4. Click “Create” to create your service instance.

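Once your service instance is created, open its “Manage” page and look for the “Credentials” section, which shows the API key and the service URL. You’ll need both: the API key to authenticate your requests, and the URL (which includes the region and your instance ID) as the base of every endpoint.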
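The requests later in this article send the API key with HTTP Basic authentication, using the literal username apikey. As a minimal sketch, the header value can be built in one place. The helper name basicAuthHeader is hypothetical and not part of any IBM library, and the Buffer fallback is only an assumption for runtimes that lack btoa().

```javascript
// Hypothetical helper (not part of any IBM SDK): builds the value of the
// Authorization header for HTTP Basic auth with the username "apikey".
function basicAuthHeader(apiKey) {
  const credentials = 'apikey:' + apiKey;
  // btoa() exists in browsers and recent Node.js versions;
  // fall back to Buffer where it is missing.
  const encoded = typeof btoa === 'function'
    ? btoa(credentials)
    : Buffer.from(credentials, 'utf8').toString('base64');
  return 'Basic ' + encoded;
}

// Usage: pass the result as the 'Authorization' header of a request.
const exampleHeaders = { 'Authorization': basicAuthHeader('{api_key}') };
console.log(Object.keys(exampleHeaders));
```

Keeping this in a single function means the key is referenced in one place if it ever needs to be rotated.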

Using the API in JavaScript

To call the IBM Watson Text-to-Speech API from JavaScript, you can use the fetch() function to make HTTP requests. Here’s an example of how to get the list of available voices:

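fetch('https://api.us-south.text-to-speech.watson.cloud.ibm.com/instances/{instance_id}/v1/voices', {
  headers: {
    // HTTP Basic auth with the literal username "apikey"
    'Authorization': 'Basic ' + btoa('apikey:' + '{api_key}'),
    // A GET request has no body, so only the expected response type is declared
    'Accept': 'application/json'
  }
})
.then(response => {
  // fetch() does not reject on HTTP errors such as 401, so check explicitly
  if (!response.ok) throw new Error('Request failed with status ' + response.status);
  return response.json();
})
.then(data => console.log(data))
.catch(error => console.error(error));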

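In this code, you’ll need to replace {instance_id} with your service instance ID and {api_key} with your API key. If your instance is in a region other than us-south, replace the host as well; the URL shown with your credentials already contains the correct one. This code sends a GET request to the API’s voices endpoint to get the list of available voices. The response is returned as JSON, which is then logged to the console.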
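The JSON returned by the voices endpoint holds a voices array whose entries include fields such as name and language. As a rough sketch, the list can be narrowed to one language before picking a voice. The function name voiceNamesForLanguage is hypothetical, and the sample object below is illustrative data shaped like the response, not real API output.

```javascript
// Illustrative data shaped like the voices response (not real API output).
const sampleResponse = {
  voices: [
    { name: 'en-US_LisaV3Voice', language: 'en-US', gender: 'female' },
    { name: 'de-DE_DieterV3Voice', language: 'de-DE', gender: 'male' },
    { name: 'en-US_MichaelV3Voice', language: 'en-US', gender: 'male' }
  ]
};

// Hypothetical helper: returns the names of all voices for one language.
function voiceNamesForLanguage(data, language) {
  // Tolerate a missing or malformed "voices" field.
  const voices = Array.isArray(data && data.voices) ? data.voices : [];
  return voices
    .filter(voice => voice.language === language)
    .map(voice => voice.name);
}

console.log(voiceNamesForLanguage(sampleResponse, 'en-US'));
```

In a real page, the data argument would be the parsed JSON from the voices request rather than the sample object.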

Here’s another example of how to use the Text-to-Speech API to synthesize speech from text:

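fetch('https://api.us-south.text-to-speech.watson.cloud.ibm.com/instances/{instance_id}/v1/synthesize?voice=en-US_LisaV3Voice', {
  method: 'POST',
  headers: {
    'Authorization': 'Basic ' + btoa('apikey:' + '{api_key}'),
    'Content-Type': 'application/json',
    // The audio format is requested through the Accept header
    'Accept': 'audio/mp3'
  },
  // The request body carries only the text to synthesize
  body: JSON.stringify({ text: 'Hello, world!' })
})
.then(response => {
  // fetch() does not reject on HTTP errors, so check explicitly
  if (!response.ok) throw new Error('Request failed with status ' + response.status);
  return response.blob();
})
.then(data => {
  // Wrap the audio bytes in an object URL that an Audio element can play
  const audio = new Audio(URL.createObjectURL(data));
  audio.play();
})
.catch(error => console.error(error));

As before, replace {instance_id} with your service instance ID and {api_key} with your API key. This code sends a POST request to the API’s synthesize endpoint. The text goes in the JSON body, the voice is passed as the voice query parameter, and the audio format is requested through the Accept header. The response is read as a binary blob, which is wrapped in an object URL and played through an Audio element. Note that browsers may block playback that is not triggered by a user action such as a click.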
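To make the shape of a synthesize request easier to see, here is a small sketch that assembles the URL and the fetch() options without sending anything. It assumes, as in the API reference, that the voice travels as a query parameter, the audio format as the Accept header, and only the text in the JSON body. The names buildSynthesizeRequest and serviceUrl are hypothetical, and the instance ID in the usage lines is a placeholder.

```javascript
// Hypothetical helper: assembles the URL and fetch() options for a
// synthesize call. It does not send the request.
function buildSynthesizeRequest({ serviceUrl, apiKey, text, voice, accept }) {
  // Drop any trailing slash so the path is joined cleanly.
  const url = new URL(serviceUrl.replace(/\/+$/, '') + '/v1/synthesize');
  url.searchParams.set('voice', voice);

  const credentials = 'apikey:' + apiKey;
  const encoded = typeof btoa === 'function'
    ? btoa(credentials)
    : Buffer.from(credentials, 'utf8').toString('base64');

  return {
    url: url.toString(),
    options: {
      method: 'POST',
      headers: {
        'Authorization': 'Basic ' + encoded,
        'Content-Type': 'application/json',
        'Accept': accept
      },
      body: JSON.stringify({ text })
    }
  };
}

// Usage with placeholder values; pass the result to fetch(url, options).
const request = buildSynthesizeRequest({
  serviceUrl: 'https://api.us-south.text-to-speech.watson.cloud.ibm.com/instances/my-instance-id',
  apiKey: '{api_key}',
  text: 'Hello, world!',
  voice: 'en-US_LisaV3Voice',
  accept: 'audio/mp3'
});
console.log(request.url);
```

Separating request construction from sending also makes it straightforward to swap the voice or format without touching the playback code.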

Conclusion

The IBM Watson Text-to-Speech API lets you add natural-sounding speech to your applications. With a few lines of JavaScript, you can list the available voices and synthesize audio from text in your own projects.