Open AI - Whisper Commands

Whisper "Speech 2 Text"



MiniRobotLanguage (MRL)

 


 

Whisper AI - "Speech to Text" 

 

 Whisper API for SPR: Speech-to-Text Made Easy

 

## Introduction

 

Whisper is an advanced Speech-to-Text (STT) system developed by OpenAI.

It's designed to convert spoken language into written text with high accuracy and minimal latency.
 

Whisper is particularly useful for Smart Package Robot users who want to integrate voice commands or transcriptions into their projects.

In this section, we'll explore how Whisper works and why it's a valuable addition to your scripts.

 

## API-Key needed

To use Whisper you will need an OpenAI API key.

This is the same key that you use for the other OpenAI services, such as ChatGPT and DALL·E 2 image generation.

This manual includes a chapter on how to obtain the key and how to use it.

Of all these services, Whisper is the cheapest, currently priced at just $0.006 per minute (rounded to the nearest second).

That works out to about 6 cents per 10 minutes of transcription.
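
The pricing arithmetic can be checked with a few lines of Python. This is only a rough sketch: the rate is the one quoted in this chapter and may have changed, and the function name `transcription_cost` is made up for this illustration.

```python
# Assumed rate, taken from the text of this chapter; check OpenAI's current pricing.
PRICE_PER_MINUTE_USD = 0.006


def transcription_cost(duration_seconds: float) -> float:
    """Estimate the cost in USD for transcribing audio of the given length."""
    # Billing is described as rounded to the nearest second.
    billed_seconds = round(duration_seconds)
    return billed_seconds * PRICE_PER_MINUTE_USD / 60


if __name__ == "__main__":
    for minutes in (1, 10, 60):
        cost = transcription_cost(minutes * 60)
        print(f"{minutes:>3} min of audio -> about ${cost:.3f}")
```
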

 

## Combining AIs with the SPR

Imagine a world where your voice can be transformed into text and then spoken back to you in a completely different voice,

all within seconds. Welcome to the incredible synergy of ElevenLabs' Speech Synthesis and OpenAI's Whisper Speech-to-Text! 🎙️🤖

 

How Does It Work? 🤔

You Speak: Simply say something out loud.

Whisper Listens: OpenAI's Whisper technology converts your spoken words into text.

ElevenLabs Speaks: This text is then sent to ElevenLabs, which synthesizes it into speech using a different voice.

 

Prompts and Quality Improvement

You can use a prompt to improve the quality of the transcripts generated by the Whisper API. The model tries to match the style of the prompt, so it is more likely to use capitalization and punctuation if the prompt does. However, Whisper's prompting is more limited than that of other language models and gives only limited control over the generated transcript. Here are some ways prompting can help:

Specific Words/Acronyms: Prompts can correct specific words or acronyms that the model often misrecognizes. For instance, the prompt "The transcript is about OpenAI which makes technology like DALL·E, GPT-3, and ChatGPT..." can improve the transcription of words like DALL·E and GPT-3.

Preserving Context: To maintain the context of a segmented file, prompt the model with the transcript of the preceding segment. This makes the transcript more accurate, as the model will use the relevant information from the previous audio.

Punctuation: The model might sometimes skip punctuation. This can be rectified by using a prompt that includes punctuation, e.g., "Hello, welcome to my lecture."

Filler Words: The model may omit common filler words. To retain these in your transcript, use a prompt containing them, e.g., "Umm, let me think like, hmm... Okay, here's what I'm, like, thinking."

Writing Styles: Some languages, like Chinese, have different writing styles (simplified or traditional). Use a prompt in your preferred style to guide the model.
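
The "Preserving Context" tip above can be sketched in Python as a loop that feeds each segment's transcript into the next request as its prompt. This is a minimal illustration and not how SPR implements it: `transcribe_segments` and the `transcribe` callable are hypothetical names, and the callable stands in for whatever actually sends the audio and prompt to Whisper.

```python
from typing import Callable, List

# Hypothetical signature: (audio_path, prompt) -> transcript text.
Transcriber = Callable[[str, str], str]


def transcribe_segments(
    paths: List[str],
    transcribe: Transcriber,
    initial_prompt: str = "",
) -> List[str]:
    """Transcribe audio segments in order, carrying context forward."""
    transcripts: List[str] = []
    prompt = initial_prompt
    for path in paths:
        text = transcribe(path, prompt)
        transcripts.append(text)
        # The next segment is prompted with the transcript of this one.
        prompt = text
    return transcripts


if __name__ == "__main__":
    # Stand-in transcriber so the sketch runs without any network access.
    def fake_transcribe(path: str, prompt: str) -> str:
        print(f"{path}: prompted with {prompt!r}")
        return f"<transcript of {path}>"

    transcribe_segments(
        ["part1.mp3", "part2.mp3"],
        fake_transcribe,
        initial_prompt="Hello, welcome to my lecture.",
    )
```

The initial prompt is a natural place for a style hint (punctuation, filler words, or the preferred writing style), since only the first segment has no preceding transcript.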

 

Here is a sample prompt that tells Whisper how several product names and acronyms are spelled:

 

' A prompt like this helps the AI transcribe unusual names and acronyms properly.

$$PRO=ZyntriQix, Digique Plus, CynapseFive, VortiQore V8, EchoNix Array, 

$$PRO+OrbitalLink Seven, DigiFractal Matrix, PULSE, RAPT, B.R.I.C.K., Q.U.A.R.T.Z., F.L.I.N.T.

AIC.Set Whisper Default|$$PRO
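
For comparison, here is the same glossary idea sketched in Python, for readers who want to see how such a prompt string is assembled. The helper name `build_glossary_prompt` is invented for this illustration. The resulting string is what would be supplied as the `prompt` parameter of OpenAI's transcription request; how SPR passes it on internally is not shown here.

```python
from typing import Iterable


def build_glossary_prompt(terms: Iterable[str]) -> str:
    """Join spelling hints into one comma-separated prompt string."""
    # Strip stray whitespace and skip empty entries.
    cleaned = [term.strip() for term in terms if term.strip()]
    return ", ".join(cleaned)


# The terms from the script above, kept in their exact spelling.
TERMS = [
    "ZyntriQix", "Digique Plus", "CynapseFive", "VortiQore V8",
    "EchoNix Array", "OrbitalLink Seven", "DigiFractal Matrix",
    "PULSE", "RAPT", "B.R.I.C.K.", "Q.U.A.R.T.Z.", "F.L.I.N.T.",
]

if __name__ == "__main__":
    print(build_glossary_prompt(TERMS))
```
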

 

The Best Part? 🎉

You can achieve this magical experience with just five lines of code (see below)!

Yes, you read that right. Five lines are all it takes to create this voice transformation loop. 🚀

 

AIC.Set Key|file

 

AIC.Set Whisper default

AIC.Dictate Text|$$RET

 

' Send Whisper Output to Elevenlabs to speak

AIS.Say Text|$$RES

ENR.

 

 

And what if you add just one more line that calls ChatGPT with the result from Whisper?

 

AIS.Set Key|file

AIC.Set Key|file

 

VAF.$$FIL=?exeloc\Test.mp3

 

AIC.crb

AIC.rsb.|$$FIL

AIC.dtx|$$FIL|$$RES

 

$$INS=Please take the following text and convert it into a rhyme of Goethe, return only the rhyme and nothing else.

$$INS+$crlf+$$RES

AIC.Ask_Chat|$$INS|$$OUT

 

DBP.$$OUT

' Now we speak the Text

AIS.Say Text|$$OUT

ENR.
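
The data flow of the script above (transcribe, ask ChatGPT, speak) can also be written as a small Python sketch. It is an illustration under stated assumptions, not SPR's implementation: `voice_loop` is a hypothetical name, and the three stages are passed in as plain callables that stand in for Whisper, ChatGPT, and ElevenLabs.

```python
from typing import Callable

INSTRUCTION = (
    "Please take the following text and convert it into a rhyme of Goethe, "
    "return only the rhyme and nothing else."
)


def voice_loop(
    audio_path: str,
    transcribe: Callable[[str], str],  # stands in for Whisper
    ask_chat: Callable[[str], str],    # stands in for ChatGPT
    speak: Callable[[str], None],      # stands in for ElevenLabs
) -> str:
    """Run the three stages in order and return the spoken text."""
    text = transcribe(audio_path)
    # Instruction and transcript are joined with a CR/LF, as in the script.
    reply = ask_chat(INSTRUCTION + "\r\n" + text)
    speak(reply)
    return reply


if __name__ == "__main__":
    # Stub stages so the sketch runs offline.
    voice_loop(
        "Test.mp3",
        transcribe=lambda path: "The weather is nice today.",
        ask_chat=lambda prompt: "(rhyme would appear here)",
        speak=print,
    )
```

Passing the stages in as functions keeps the flow easy to test with stubs and makes it clear that each service can be swapped independently.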

 

 

 

See also: