Elevenlabs Speech Synthesis


MiniRobotLanguage (MRL)

 

AIS.Text to MP3

Converts text to speech and saves it as an MP3 file.

 

 

Intention

 

The AIS.Text to MP3 command is designed to convert text into an MP3 audio file using cloud services from Elevenlabs.io.

This command requires an API key, which must be set using the AIS.Set Key command.

 

Hint: The model automatically detects the language of the text and generates speech in that language using the current settings.

 

In general, 28 languages are supported:
English, Japanese, Chinese, German, Hindi, French, Korean, Portuguese, Italian, Spanish, Indonesian, Dutch, Turkish, Filipino, Polish, Swedish, Bulgarian, Romanian, Arabic, Czech, Greek, Finnish, Croatian, Malay, Slovak, Danish, Tamil, Ukrainian

 

To find out whether a language is supported by a particular voice model, use the AIS.Get Models command.

 

The Caching System

There is a Caching System built into this command. Here's how it works and why it's beneficial:

 

In everyday conversation, certain words and phrases recur frequently.

Because use of the Elevenlabs Cloud incurs a cost for each spoken word, it makes sense to cache what has already been generated.

When the Smart Package Robot detects a word or phrase it has already generated, it retrieves the MP3 from the cache instead of sending another request to the Elevenlabs Cloud.

Caching is optional: specifying a filename (P2) bypasses it entirely.

 
The cache also makes speech available faster than waiting for the MP3 to be delivered from the Elevenlabs Cloud.

There is a minor drawback, however. The Elevenlabs AI is designed never to speak the same sentence in exactly the same way twice.

Bypassing the cache can therefore make a conversation feel more authentic, whereas with the cache the same sentence sounds identical every time, which may become noticeable.

Whether to use the cache is up to you.

 

 

You can also have the Smart Package Robot delete the cached recording and generate a fresh one via the Cloud. To do so, pass "-" as the second parameter:

AIS.Say Text|$$TXT|-

The script then regenerates the saved MP3 file and stores the new version in the cache.

 

If P2 is omitted, the system automatically creates a cache folder and stores each MP3 file there along with a checksum. Reusing the same MP3 file for repeated phrases reduces costs and saves resources.

Note: The cache is case-sensitive. The same phrase written with different casing produces a different checksum, so keep casing consistent when using the command.

This is intentional, to stay compatible with special commands that may depend on upper- and lowercase characters.
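
The effect of case sensitivity on the cache can be illustrated with a short sketch. It uses only the commands described on this page and assumes the default cache folder; the comments describe the behavior expected from the description above, not verified output.

```
' Sketch: both calls omit P2, so both go through the cache
AIS.Set Key|<YourAPIKeyHere>

' First call: no cache entry yet, the MP3 is requested from the Cloud
AIS.Text to MP3|Hello World

' Different casing gives a different checksum,
' so this is treated as a new phrase and requested again
AIS.Text to MP3|hello world
```

Keeping the casing of recurring phrases consistent avoids such duplicate cache entries.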

 

The built-in caching system serves several purposes, primarily optimizing resource usage and reducing costs:

Efficiency and Speed: When the same text is converted to MP3 multiple times, the system retrieves the already generated MP3 file from the cache instead of making a new API call to Elevenlabs.io. This speeds up the process significantly.

Cost-Effectiveness: API calls usually come with a cost. By utilizing a caching system, you minimize the number of API calls made to Elevenlabs.io, thereby saving money.

Resource Optimization: Generating an MP3 from text consumes computational resources. Caching allows the system to avoid redundant operations, thus saving CPU cycles and memory usage.

Customization and Control: The commands AIS.Set Folder and AIS.Get Folder allow you to specify the directory where the cached MP3 files and their checksums are stored. This gives you control over the organization of these files, making it easier to manage them.

Case Sensitivity: The system intentionally does not alter the case of the text when generating the checksum. This means that the same text with different casing will be treated as different phrases, each with its own cached MP3. This feature allows for precise control but also means you should be mindful of text casing to maximize the benefits of caching.

In summary, the built-in caching system is designed to make the command more efficient, cost-effective, and user-friendly.

 

The default cache folder is: "?exeloc\AIS_Folder\"

 

 

Syntax

 

 

AIS.Text to MP3|P1[|P2]

AIS.tmp|P1[|P2]

 

 

 

Parameter Explanation

 

P1: The text you want to convert into an MP3 file.

P2: Optional. The path where the generated MP3 file will be saved. The caching system will automatically be used if P2 is omitted or empty.

 


The cache does not take the selected voice into account: it knows only the text, not the voice used to generate it.

To cache different voices separately, change the cache folder with the AIS.Set Folder command before switching voices.
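
A per-voice cache could look roughly like the sketch below. The parameter form of AIS.Set Folder (a single folder path) and the folder names VoiceA and VoiceB are assumptions for illustration; the command that selects a voice is not covered on this page and is only marked by a comment.

```
' Sketch only: AIS.Set Folder parameter form and folder names are assumed
AIS.Set Key|<YourAPIKeyHere>

' Select the first voice here (command not covered on this page)
AIS.Set Folder|?exeloc\AIS_Folder\VoiceA\
AIS.Text to MP3|Hello World

' Select the second voice here
AIS.Set Folder|?exeloc\AIS_Folder\VoiceB\
AIS.Text to MP3|Hello World
```

With one folder per voice, the same text is cached once for each voice instead of the first recording being reused for all of them.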

 

 

Example

 

'***********************************

' AIS.-Sample

'***********************************

AIS.Set Key|<YourAPIKeyHere>

AIS.Text to MP3|Hello World|?path/to/save.mp3

 

In this example, the text "Hello World" is converted into an MP3 file and saved to the specified path.
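
For comparison, a minimal sketch of the same call with P2 omitted, so that the caching system is used. It relies only on the syntax shown above, including the short form AIS.tmp; the comments describe the expected behavior rather than verified output.

```
'***********************************
' AIS.-Sample using the cache (sketch)
'***********************************
AIS.Set Key|<YourAPIKeyHere>

' P2 omitted: the MP3 is stored in the cache folder
AIS.Text to MP3|Hello World

' Same text and same casing, short form: served from the cache
AIS.tmp|Hello World
```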

 

 

 

Remarks

 

The API key must be set using the AIS.Set Key command before using this command.

The system will use a cache to save resources only if P2 is omitted or empty.

 

 

Limitations:

 

-

 

 

See also: