Configure the Google Cloud Text-to-Speech Converter

Text-to-Speech AI is a Google Cloud service that converts text into natural-sounding speech through an API built on Google’s AI technologies.

To configure Google Cloud Text-to-Speech AI as the text-to-speech converter used by the Real Voice plugin, proceed as follows:

  1. Visit the Real Voice -> Options menu
  2. Proceed to the Text-to-Speech tab
  3. In the General section, set the Text-to-speech Converter option to Google Text-to-Speech API (Cloud service)
  4. Click Save Settings

To use this text-to-speech converter, you must also configure the credentials used to identify your API requests. You can configure the credentials with the following procedure:

  1. Visit the Real Voice -> Options menu
  2. Proceed to the Text-to-Speech tab
  3. Enter your key in the Google Cloud Secret Access Key option available in the Google Cloud Text-to-speech section
  4. Click Save Settings
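
If you want to check a key outside the plugin, one option is to call the voice-listing endpoint of the Google Cloud Text-to-Speech REST API. The sketch below only builds the request URL. It assumes the key entered in the plugin is a standard Google Cloud API key passed in the `key` query parameter, which this page does not confirm. The helper name `build_voices_url` is hypothetical and is not part of the plugin.

```python
from urllib.parse import urlencode

# Public REST endpoint that lists the voices available to a project.
VOICES_ENDPOINT = "https://texttospeech.googleapis.com/v1/voices"


def build_voices_url(api_key, language_code=None):
    """Return the URL of a voices.list request authenticated with an API key.

    Assumption: the key is a plain Google Cloud API key. The optional
    language_code (a BCP-47 tag such as "en-US") narrows the returned list.
    """
    if not api_key or not api_key.strip():
        raise ValueError("API key must not be empty")
    params = {"key": api_key.strip()}
    if language_code:
        params["languageCode"] = language_code
    return f"{VOICES_ENDPOINT}?{urlencode(params)}"
```

Opening the resulting URL in a browser or with any HTTP client should return a JSON list of voices when the key is valid and the API is enabled for the project. The names in that list are the values expected by the Voice Name option.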

The plugin provides additional options to configure your use of the Google Cloud Text-to-Speech service. These options are available in the Google Cloud Text-to-speech section of the Text-to-Speech tab:

  • Audio Encoding – Select one of the audio encodings supported by the service.
  • Speaking Rate – Speaking rate/speed, in the range [0.25, 4.0]. 1.0 is the normal native speed supported by the specific voice. 2.0 is twice as fast, and 0.5 is half as fast.
  • Speaking Pitch – Speaking pitch, in the range [-20.0, 20.0]. 20 raises the pitch by 20 semitones from the original pitch. -20 lowers it by 20 semitones.
  • Volume Gain db – Volume gain (in dB) relative to the normal native volume supported by the specific voice, in the range [-96.0, 16.0].
  • Sample Rate – The synthesis sample rate (in hertz) for this audio. Note that this value affects both the audio quality and the size of the generated audio files.
  • Effects Profile ID – Optionally select one or more audio profiles. Effects are applied on top of each other in the order they are given.
  • Language Code – The language of the voice as a BCP-47 language tag. Note that this value should match the language code of the selected voice name.
  • Voice Name – Enter the name of the voice that will be used to speak the utterance.
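
To show how the options above relate to one another, here is a minimal sketch that assembles the JSON body of a `text:synthesize` request for the Google Cloud Text-to-Speech REST API and enforces the ranges listed above. How the plugin builds its own requests is not documented here, so treat the mapping from plugin options to API fields as an assumption. The function names are hypothetical, and the voice and profile values in the usage example are only sample values.

```python
import json


def _check_range(name, value, low, high):
    """Return value unchanged, or raise if it falls outside [low, high]."""
    if not low <= value <= high:
        raise ValueError(f"{name} must be in [{low}, {high}], got {value}")
    return value


def build_synthesize_request(text, language_code, voice_name,
                             audio_encoding="MP3", speaking_rate=1.0,
                             pitch=0.0, volume_gain_db=0.0,
                             sample_rate_hertz=None, effects_profile_ids=None):
    """Build the request body as a dict; no network call is made."""
    audio_config = {
        "audioEncoding": audio_encoding,
        "speakingRate": _check_range("speaking_rate", speaking_rate, 0.25, 4.0),
        "pitch": _check_range("pitch", pitch, -20.0, 20.0),
        "volumeGainDb": _check_range("volume_gain_db", volume_gain_db, -96.0, 16.0),
    }
    # Optional fields are left out so the service applies its defaults.
    if sample_rate_hertz is not None:
        audio_config["sampleRateHertz"] = int(sample_rate_hertz)
    if effects_profile_ids:
        # Order matters: effects are applied in the order given.
        audio_config["effectsProfileId"] = list(effects_profile_ids)
    return {
        "input": {"text": text},
        "voice": {"languageCode": language_code, "name": voice_name},
        "audioConfig": audio_config,
    }


if __name__ == "__main__":
    # Sample values only; pick a voice name that exists in your project.
    body = build_synthesize_request(
        "Hello, world.", "en-US", "en-US-Wavenet-D",
        speaking_rate=1.25, effects_profile_ids=["headphone-class-device"],
    )
    print(json.dumps(body, indent=2))
```

Checking the ranges before sending a request surfaces an out-of-range Speaking Rate, Speaking Pitch, or Volume Gain db value locally, which is usually easier to diagnose than an error returned by the service.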