API

The API is a set of functions you can use to integrate this package into your apps. While reading these API docs, you can toggle the Outline menu (top right) on GitHub to navigate more easily.

This package is written in TypeScript, so you don't have to read all the docs here: the package now supports VS Code IntelliSense. What is that? Simply put, when you hover your mouse over a variable or function, VS Code shows a popup (a small tutorial) explaining what the function does, with examples, parameters, etc.

Show Video
intellisense.mp4

See API_VANILLA.md for the vanilla JS version.


There are many functions, LLM engines, and constants you can import from this package; here are just a few of them. Once you have bought the package, you can open the index.ts file and see every function and constant. The package has a lot of features, so of course it has a lot of APIs.

Show How to import something from the package
// v5.3.6 API
import {
  // Main
  markTheWords,
  useTextToSpeech,

  // Utilities function for precision and add more capabilities
  pronunciationCorrection,
  getLangForThisText,
  getTheVoices,
  noAbbreviation,
  speak,
  convertTextIntoClearTranscriptText,

  // Package Data and Cache Integration
  // Your app can read the data used by this package, like:
  PKG,
  PREFERRED_VOICE, // Set global config for the preferred voice
  PKG_STATUS_OPT, // Package status option
  PKG_DEFAULT_LANG, // Package default lang
  LANG_CACHE_KEY, // Package lang sessionStorage key
  OPENAI_CHAT_COMPLETION_API_ENDPOINT,
  getVoiceBasedOnVoiceURI,
  getCachedVoiceInfo,
  getCachedVoiceURI,
  setCachedVoiceInfo,
  getCachedVoiceName,
} from "react-speech-highlight";

// Type data for typescript
import type {
  ControlHLType,
  StatusHLType,
  PrepareHLType,
  SpokenHLType,
  UseTextToSpeechReturnType,
  ActivateGestureProps,
  GetVoicesProps,
  VoiceInfo,
  markTheWordsFuncType,
  ConfigTTS,
  getAudioType,
  getAudioReturnType,
  VisemeMap,
  SentenceInfo,
} from "react-speech-highlight";

Main

1. TTS Marker markTheWords()

The markTheWords() function processes the text string and adds a marker to every word and sentence that the system will read.

Show Code

Important: this example uses React's useMemo() to avoid unnecessary re-renders. It will only re-execute when the text changes, similar to how a useEffect() dependency array works.


function abbreviationFunction(str) {
  // Write your custom abbreviation expansion here, e.g.
  // input (string):  "LMK"
  // output (string): "Let me know"
  const abbreviations = { LMK: "Let me know" };

  return abbreviations[str] ?? str;
}

const textHL = useMemo(() => markTheWords(text, abbreviationFunction), [text]);
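The returned textHL is an HTML string; you render it the same way the pronunciationCorrection() example below does, via dangerouslySetInnerHTML:

// Render the marked-up HTML produced by markTheWords()
return (
  <div ref={textEl}>
    <p dangerouslySetInnerHTML={{ __html: textHL }}></p>
  </div>
);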

2. TTS React Hook useTextToSpeech()

2.A. CONFIG

There are two config placements: initialConfig and actionConfig.

Show Code
const initialConfig = {
  autoHL: true,
  disableSentenceHL: false,
  disableWordHL: false,
  classSentences: "highlight-sentence",
  classWord: "highlight-spoken",

  lang: "id-ID",
  pitch: 1,
  rate: 0.9,
  volume: 1,
  autoScroll: false,
  clear: true,

  // For viseme mapping,
  visemeMap: {},

  // Prefer or fallback to audio file
  preferAudio: null,
  fallbackAudio: null,

  batchSize: 200,

  timestampDetectionMode: "auto",
};

const { controlHL, statusHL, prepareHL, spokenHL } =
  useTextToSpeech(initialConfig);
const actionConfig = {
  autoHL: true,
  disableSentenceHL: false,
  disableWordHL: false,
  classSentences: "highlight-sentence",
  classWord: "highlight-spoken",

  lang: "id-ID",
  pitch: 1,
  rate: 0.9,
  volume: 1,
  autoScroll: false,
  clear: true,

  // For viseme mapping,
  visemeMap: {},

  // Prefer or fallback to audio file
  preferAudio: "example.com/some_file.mp3",
  fallbackAudio: "example.com/some_file.mp3",

  batchSize: null, // or 200

  timestampDetectionMode: "auto", // or rule, ml
};

void controlHL.play({
  textEl: textEl.current,
  onEnded: () => {
    console.log("Callback when tts done");
  },
  actionConfig,
});
Show config details
  • autoHL

    If the voice does not support the onboundary event, this package prefers to disable word highlighting instead of trying to mimic the onboundary event.

  • disableSentenceHL

    Disable sentence highlight

  • disableWordHL

    Disable word highlight

  • classSentences

    The CSS class name applied to the highlighted sentence, so you can style it.

  • classWord

    The CSS class name applied to the highlighted word, so you can style it.

  • lang

    The value used for SpeechSynthesisUtterance.lang.

  • pitch

    The value used for SpeechSynthesisUtterance.pitch.

  • volume

    The value used for SpeechSynthesisUtterance.volume.

  • autoScroll

    Smooth auto scroll, so the user can always see the highlighted sentence.

  • clear

    If true, a newly played TTS overrides the one currently playing. If false, the new TTS is queued behind the TTS that is still playing.

  • visemeMap

    The data for this parameter is provided in the demo website source code.

  • preferAudio

    Pass a string or an async function that returns an audio URL (like example.com/some_file.mp3) as the preferred audio.

    The package will use this audio instead of the built-in web speech synthesis.

  • fallbackAudio

    Pass a string or an async function that returns an audio URL (like example.com/some_file.mp3) as the fallback audio.

    When the built-in web speech synthesis errors or the user doesn't have any voice, the fallback audio file will be used.

    async function getAudioForThisText(text) {
      const res = await getAudioFromTTSAPI("https://yourbackend.com/api/elevenlabs....", text);
      // convert to an audio file, then convert again to an audio URL

      return res;
    }

    const config = {
      preferAudio: getAudioForThisText, // only called if needed (when the user wants to play), so you can save cost
      fallbackAudio: getAudioForThisText, // only called if needed (when web speech synthesis fails), so you can save cost
    };

    const { controlHL, statusHL, prepareHL, spokenHL } = useTextToSpeech(config);
  • batchSize

    The batch size for the audio file.

    When batchSize is null, the whole text is sent in one request. When set to 200, the package chunks the text into 200-character batches.

    Example: with 200, the package sends 200 characters per request to the TTS API.

    Read more about the batch system in this package.

  • timestampDetectionMode

    Detection mode for the timestamp engine. See the private docs.

2.B. INTERFACE

controlHL

controlHL.play();
controlHL.pause();
controlHL.resume();
controlHL.stop();
controlHL.seekSentenceBackward();
controlHL.seekSentenceForward();
controlHL.seekParagraphBackward();
controlHL.seekParagraphForward();
controlHL.changeConfig();
controlHL.activateGesture();
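A minimal sketch of wiring these controls to buttons. textEl and actionConfig follow the play() example in section 2.A; the button wiring is illustrative, not part of the package, and activateGesture()'s exact props (ActivateGestureProps) are best checked via IntelliSense:

// Illustrative wiring only; the play() options follow the example in 2.A
<>
  <button onClick={() => void controlHL.play({ textEl: textEl.current, actionConfig })}>
    Play
  </button>
  <button onClick={() => controlHL.pause()}>Pause</button>
  <button onClick={() => controlHL.resume()}>Resume</button>
  <button onClick={() => controlHL.stop()}>Stop</button>
  <button onClick={() => controlHL.seekSentenceForward()}>Next sentence</button>
</>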

statusHL

A react state that reports the status of the program. The value can be idle|play|calibration|pause|loading. You can compare against fixed values by accessing the PKG_STATUS_OPT constant.

| Name | Description |
| --- | --- |
| idle | The initial state |
| calibration | The system is still processing the text, so that TTS playback performs accurately |
| play | The system is playing TTS |
| pause | TTS is paused |
| loading | The system is still working out the best available voices. The status changes to this value when we call prepareHL.getVoices() |
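For instance, a minimal sketch that derives UI state from statusHL; the status strings are the documented values above, and the button itself is illustrative:

// Branch on the documented status strings
const isBusy = statusHL === "loading" || statusHL === "calibration";
const isPlaying = statusHL === "play";

return <button disabled={isBusy}>{isPlaying ? "Pause" : "Play"}</button>;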

prepareHL

Contains state and functions for preparing the TTS. From all the voices available via SpeechSynthesis.getVoices(), this package tests each voice and returns only the 5 best voices for the specified language.

| Name | Description |
| --- | --- |
| prepareHL.getVoices() | Function to tell this package to find the best voices |
| prepareHL.voices | React state storing the result of prepareHL.getVoices() |
| prepareHL.loadingProgress | React state for tracking the voice-testing progress |
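A sketch of preparing voices before playback. The exact argument shape of getVoices() is an assumption here (a GetVoicesProps type is exported, so hover it in VS Code to confirm); the lang value mirrors the config above:

// Assumed call shape (verify via GetVoicesProps / IntelliSense)
useEffect(() => {
  void prepareHL.getVoices({ lang: "id-ID" });
}, []);

// While testing runs, prepareHL.loadingProgress updates;
// when done, prepareHL.voices holds the best voices found.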

spokenHL

Contains react state for reporting while TTS is playing.

| Name | Description |
| --- | --- |
| spokenHL.sentence | React state: the sentence currently being read |
| spokenHL.word | React state: the word currently being read |
| spokenHL.viseme | React state: the current viseme |
| spokenHL.precentageWord | Reading percentage between 0-100, based on words |
| spokenHL.precentageSentence | Reading percentage between 0-100, based on sentences |
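For example, a minimal progress readout driven by these states, inside a component that already has spokenHL from useTextToSpeech() (property names as documented above):

// Show the word being spoken and the overall sentence progress
return (
  <div>
    <p>Now reading: {spokenHL.word}</p>
    <progress max={100} value={spokenHL.precentageSentence} />
  </div>
);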

Utilities

Utility functions for precision and extra capabilities.

1. pronunciationCorrection()

The common problem is that the text displayed to the user differs from its spoken form: math symbols, equations, terms, etc. Read more about the pronunciation problem.

How to build this package with open ai api integration

Show Code
const inputText = `
<ul>
  <li>1000</li>
  <li>4090</li>
  <li>1.000.000</li>
  <li>1,2</li>
  <li>9.001</li>
  <li>30,1</li>
</ul>
`;

const textEl = useRef();

const pronounciation = async (): Promise<void> => {
  if (textEl.current) {
    await pronunciationCorrection(textEl.current, (progress) => {
      console.log(progress);
    });
  }
};

useEffect(() => {
  if (textEl.current) {
    console.log("pronounciation");
    void pronounciation();
  }
  // eslint-disable-next-line
}, []);

const textHL = useMemo(() => markTheWords(inputText), [inputText]);

return (
  <div ref={textEl}>
    <p
      dangerouslySetInnerHTML={{
        __html: textHL,
      }}
    ></p>
  </div>
);

2. getLangForThisText()

For example you want to implement this package into blog website with multi language, it's hard to know the exact language for each post / article.

Then i use chat gpt api to detect what language from some text. see How to build this package with open ai api integration

Show Code
var timeout = null;

const inputText = `
Hallo, das ist ein deutscher Beispieltext
`;

async function getLang() {
  // For the German sample text above, this will return "de"
  var predictedLang = await getLangForThisText(textEl.current);

  if (predictedLang) {
    setLang(predictedLang);
  }
}

useEffect(() => {
  if (textEl.current) {
    if (inputText != "") {
      // The timeout is for use case: text change frequently.
      // if the text doesn't change just call getLang();
      if (timeout) {
        clearTimeout(timeout);
      }

      timeout = setTimeout(() => {
        getLang();
      }, 2000);
    }
  }
}, [inputText]);

3. convertTextIntoClearTranscriptText()

Function to convert your input string (plain text or an HTML string) into a clear Speech Synthesis Markup Language (SSML) format that this package can understand when building transcript timestamps.

You must use this function when making the audio file.

var convertInto = "ssml"; // or "plain_text"
var clear_transcript = convertTextIntoClearTranscriptText(
  "your string here",
  convertInto
);
// with clear_transcript you can make an audio file with the help of other speech synthesis platforms, like ElevenLabs, etc.

Package Data and Cache Integration

The data or cache (storage) that this package uses can be accessed from outside the package. This is the same mechanism used by React GPT Web Guide.

Show
import {
  // ...other API

  // Your app can read the data / cache used by this package, like:
  PREFERRED_VOICE, // Set global config for the preferred voice
  PKG_STATUS_OPT, // Package status option
  PKG_DEFAULT_LANG, // Package default lang
  LANG_CACHE_KEY, // Package lang sessionStorage key
  OPENAI_CHAT_COMPLETION_API_ENDPOINT, // Key to set open ai chat completion api
  getVoiceBasedOnVoiceURI,
  getCachedVoiceInfo,
  getCachedVoiceURI,
  setCachedVoiceInfo,
  getCachedVoiceName,
} from "react-speech-highlight";

Usage example:

Set custom constant values for this package

import { useEffect } from "react";
import { setupKey, storage } from "@/app/react-speech-highlight";

// set global preferred voice
useEffect(() => {
  const yourDefinedPreferredVoice = {
    // important! Define language code (en-us) with lowercase letter
    "de-de": ["Helena", "Anna"],
  };

  storage.setItem(
    "global",
    setupKey.PREFERRED_VOICE,
    yourDefinedPreferredVoice
  );

  // Set open ai chat completion api
  // example in demo website (next js using environment variable) src/Components/ClientProvider.tsx
  if (process.env.NEXT_PUBLIC_OPENAI_CHAT_COMPLETION_API_ENDPOINT) {
    storage.setItem(
      "global",
      setupKey.OPENAI_CHAT_COMPLETION_API_ENDPOINT,
      process.env.NEXT_PUBLIC_OPENAI_CHAT_COMPLETION_API_ENDPOINT
    );
  }

  // or
  storage.setItem(
    "global",
    OPENAI_CHAT_COMPLETION_API_ENDPOINT,
    "http://localhost:8000/api/v1/public/chat"
  );

  // You can set the headers for the fetch API request with this key in sessionStorage
  const headers = {
    Authorization: `Bearer xxx_YOUR_PLATFORM_AUTH_TOKEN_HERE_xxx`,
  };

  // Tips: Hover your mouse over the REQUEST_HEADERS variable to see the example and docs
  storage.setItem("global", setupKey.REQUEST_HEADERS, headers);

  // Speech to Text API endpoint
  if (process.env.NEXT_PUBLIC_OPENAI_STT_API_ENDPOINT) {
    storage.setItem(
      "global",
      setupKey.OPENAI_SPEECH_TO_TEXT_API_ENDPOINT,
      process.env.NEXT_PUBLIC_OPENAI_STT_API_ENDPOINT
    );
  }
}, []);
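The voice-cache getters above are easiest to learn through IntelliSense. As a rough sketch (the getter signatures below are assumptions; only LANG_CACHE_KEY as a sessionStorage key is documented above), reading the cached values might look like this:

// LANG_CACHE_KEY is the package's lang sessionStorage key (documented above)
const cachedLang = sessionStorage.getItem(LANG_CACHE_KEY);

// Assumed: the getters are keyed by a lowercase language code; confirm via IntelliSense
const voiceInfo = getCachedVoiceInfo("de-de");
const voiceName = getCachedVoiceName("de-de");
const voiceURI = getCachedVoiceURI("de-de");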