<< Click to Display Table of Contents >> Navigation: 3. Script Language > AI - Artificial Intelligence Commands > AIC. - Artificial Intelligence Command > Open AI - Whisper Commands |
MiniRobotLanguage (MRL)
What is Whisper and how does Whisper Work?
Whisper is the leading Speech to Text Technology from Open AI.
Whisper uses machine learning algorithms trained on a vast dataset of audio and corresponding text,
to implement the leading Speech to Text (STT-) Technology.
When you send an audio clip to the Whisper Cloud, it processes the audio and returns the transcribed text in real-time.
Here's a simplified workflow:
1. **Capture Audio**: Record the audio you want to transcribe.
2. **AIW.Ask_Whisper**: This command will send the audio file to the Whisper Cloud.
3. **Get Response**: You will receive the transcribed / translated text directly in a SPR-Variable.
4. **Display or Use Text**: Use the transcribed text in your project as needed.
5. **Process with GPT-3.5 or GPT-4**: Use the transcribed text in your project and have it automatically corrected and processed, using GPT-3.5 or GPT-4 as needed.
To make it easier for SPR users, we have integrated specific commands that interact with the Whisper API.
### No Training Required
Unlike some STT systems that require you to "train" the model to understand your voice, Whisper works "out of the box." It's designed to understand a wide range of accents and dialects.
### Multilingual Support
Whisper can transcribe nearly 100 languages, making it incredibly versatile for international projects.
### High Accuracy
Whisper's advanced algorithms ensure that the transcriptions are highly accurate, even in noisy environments.
### Real-Time Transcription
The API is designed for low-latency, real-time transcription, which is crucial for interactive applications.
### Cloud-Based or Local
Whisper offers both cloud-based and local solutions. The cloud-based API is cost-effective and doesn't require any installation, while the local version is ideal for those who need to keep their data in-house.
### Easy Integration with GPT-3.5
Whisper can be easily combined with OpenAI's GPT-3.5 to not just transcribe the text but also to understand and act upon it, making your SPR projects smarter and more interactive.
## Conclusion
Whisper's advanced Speech-to-Text capabilities make it an invaluable tool for Smart Package Robot users. Its ease of use, high accuracy, and real-time processing capabilities set it apart from other STT technologies.
By integrating Whisper into your projects, you open up a whole new realm of possibilities for voice-activated automation and data collection.
Schema of the Whisper AI from Open AI, see.
https://raw.githubusercontent.com/openai/whisper/main/approach.png
from:
https://github.com/openai/whisper#available-models-and-languages
or the Whisper Whitepaper