MiniRobotLanguage (MRL)
## Introduction
Whisper is an advanced Speech-to-Text (STT) system developed by OpenAI.
It's designed to convert spoken language into written text with high accuracy and minimal latency.
Whisper is particularly useful for Smart Package Robot users who want to integrate voice commands or transcriptions into their projects.
In this section, we'll explore how Whisper works and why it's a valuable addition to your Scripts.
## API-Key needed
To use Whisper you will need an OpenAI API key.
This is the same key that you use for the other OpenAI services, such as ChatGPT and DALL·E 2 image generation.
This manual contains a chapter that explains how to obtain the key and how to use it.
Of all these services, Whisper is the cheapest, currently priced at just $0.006 per minute (rounded to the nearest second).
That equals about 6 cents per 10 minutes of transcription.
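The arithmetic behind this price can be sketched in a few lines of Python. This is only an illustration: the function name `whisper_cost` is made up for this example, the price is the figure quoted above and may change, and the half-up rounding to whole seconds is an assumption about how "rounded to the nearest second" is applied.

```python
from decimal import Decimal, ROUND_HALF_UP

# Price quoted in this chapter (USD per minute); check the current OpenAI price list.
PRICE_PER_MINUTE = Decimal("0.006")


def whisper_cost(duration_seconds):
    """Estimate the transcription cost in USD for a clip of the given length."""
    # Assumption: the billed duration is rounded half-up to whole seconds.
    billed_seconds = Decimal(str(duration_seconds)).quantize(
        Decimal("1"), rounding=ROUND_HALF_UP
    )
    return billed_seconds * PRICE_PER_MINUTE / Decimal(60)


if __name__ == "__main__":
    # Ten minutes of audio, as in the text above.
    print(whisper_cost(600))
```

For ten minutes (600 seconds) the estimate is 600 × 0.006 / 60 = 0.06 USD, which matches the 6 cents mentioned above.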
## Combining AIs with the SPR
Imagine a world where your voice can be transformed into text and then spoken back to you in a completely different voice,
all within seconds. Welcome to the incredible synergy of ElevenLabs' Speech Synthesis and OpenAI's Whisper Speech-to-Text! 🎙️🤖
How Does It Work? 🤔
•You Speak: Simply say something out loud.
•Whisper Listens: OpenAI's Whisper technology converts your spoken words into text.
•ElevenLabs Speaks: This text is then sent to ElevenLabs, which synthesizes it into speech using a different voice.
You can also use a prompt to improve the quality of the transcripts generated by the Whisper API. The model tries to match the style of the prompt, so it is more likely to use capitalization and punctuation if the prompt does. However, the current prompting system is more limited than that of other language models and provides only limited control over the generated transcript. Here are some ways prompting can help:
•Specific Words/Acronyms: Prompts can correct specific words or acronyms that the model often misrecognizes. For instance, the prompt "The transcript is about OpenAI which makes technology like DALL·E, GPT-3, and ChatGPT..." can improve the transcription of words like DALL·E and GPT-3.
•Preserving Context: To maintain the context of a segmented file, prompt the model with the transcript of the preceding segment. This makes the transcript more accurate, as the model will use the relevant information from the previous audio.
•Punctuation: The model might sometimes skip punctuation. This can be rectified by using a prompt that includes punctuation, e.g., "Hello, welcome to my lecture."
•Filler Words: The model may omit common filler words. To retain these in your transcript, use a prompt containing them, e.g., "Umm, let me think like, hmm... Okay, here's what I'm, like, thinking."
•Writing Styles: Some languages, like Chinese, have different writing styles (simplified or traditional). Use a prompt in your preferred style to guide the model.
Here is a sample prompt that tells Whisper how several product names and acronyms are spelled:
' Passing the prompt as a parameter could look like this; it helps the AI to transcribe these terms properly.
$$PRO=ZyntriQix, Digique Plus, CynapseFive, VortiQore V8, EchoNix Array,
$$PRO+OrbitalLink Seven, DigiFractal Matrix, PULSE, RAPT, B.R.I.C.K., Q.U.A.R.T.Z., F.L.I.N.T.
AIC.Set Whisper Default|$$PRO
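The prompting tips above (a list of special terms plus the transcript of the preceding segment) can also be combined into one prompt string. The following Python sketch is not part of the SPR; the function name `build_whisper_prompt` and the `max_words` limit are hypothetical. OpenAI's documentation notes that Whisper only considers the final 224 tokens of a prompt, so the word budget used here is just a rough stand-in for a real token count.

```python
def build_whisper_prompt(glossary, previous_transcript="", max_words=150):
    """Build a prompt from special terms and the tail of the previous transcript.

    max_words is a rough, assumed stand-in for the model's token limit.
    """
    # Clean up the terms and turn them into a single sentence.
    terms = [t.strip() for t in glossary if t.strip()]
    glossary_line = ", ".join(terms) + "." if terms else ""

    # Fill the remaining budget with the most recent words of the previous
    # segment, so the new segment continues in the same context and style.
    budget = max_words - len(glossary_line.split())
    words = previous_transcript.split()
    tail = " ".join(words[-budget:]) if budget > 0 else ""

    return " ".join(part for part in (glossary_line, tail) if part)


if __name__ == "__main__":
    print(build_whisper_prompt(
        ["ZyntriQix", "Digique Plus", "B.R.I.C.K."],
        "Hello, welcome to my lecture.",
    ))
```

The resulting string could then be passed wherever a Whisper prompt is accepted, for example as the `$$PRO` parameter shown above.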
You can build the voice transformation loop described above with just six lines of code (see below)!
Yes, you read that right. Six lines are all it takes to create this voice transformation loop. 🚀
AIC.Set Key|file
AIC.Set Whisper default
AIC.Dictate Text|$$RES
' Send Whisper Output to Elevenlabs to speak
AIS.Say Text|$$RES
ENR.
And what if you add a call to ChatGPT that uses the result from Whisper?
AIS.Set Key|file
AIC.Set Key|file
VAF.$$FIL=?exeloc\Test.mp3
AIC.crb
AIC.rsb.|$$FIL
AIC.dtx|$$FIL|$$RES
$$INS=Please take the following text and convert it into a rhyme in the style of Goethe, return only the rhyme and nothing else.
$$INS+$crlf+$$RES
AIC.Ask_Chat|$$INS|$$OUT
DBP.$$OUT
' Now we speak the Text
AIS.Say Text|$$OUT
ENR.
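To make the data flow of this script easier to follow, here is the same chain (transcribe, ask the chat model, speak) as a small Python sketch. It does not call any real service: `voice_loop` and its three callable parameters are hypothetical stand-ins for Whisper, ChatGPT and ElevenLabs, used only to show how the result of one step feeds the next.

```python
def voice_loop(audio_path, instruction, transcribe, ask_chat, speak):
    """Run the transcribe -> chat -> speak chain and return the spoken text."""
    # Step 1: speech to text (the role Whisper plays in the script above).
    transcript = transcribe(audio_path)

    # Step 2: append the transcript to the instruction on a new line,
    # like "$$INS+$crlf+$$RES" in the script, and ask the chat model.
    reply = ask_chat(instruction + "\r\n" + transcript)

    # Step 3: text to speech (the role ElevenLabs plays in the script above).
    speak(reply)
    return reply


if __name__ == "__main__":
    # Stand-in functions so the sketch runs without any API key.
    voice_loop(
        "Test.mp3",
        "Convert the following text into a rhyme.",
        transcribe=lambda path: "the weather is nice today",
        ask_chat=lambda prompt: "(rhyme for) " + prompt.splitlines()[-1],
        speak=print,
    )
```

Passing the three steps in as functions keeps the sketch independent of any particular API, so each stand-in can be swapped for a real call later.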