AIC.Ask Multi Vision

MiniRobotLanguage (MRL)

Send multiple pictures to the OpenAI Vision endpoint and receive a description of these pictures.

Intention

Using the AIC.Ask Multi Vision command, you can send several pictures to the OpenAI Vision endpoint and receive a text description of these pictures.

You can also send a prompt that tells the Vision AI exactly what you want to know about the pictures, and in this way influence the result you get.

The AIC.Ask Multi Vision command is developed for complex image processing tasks involving multiple images.

It is particularly useful in scenarios that require batch analysis of images under the guidance of a single prompt.

The current model is best at answering general questions about what is present in the images. While it does understand the relationship between objects in images, it is not yet optimized to answer detailed questions about the location of certain objects in an image. For example, you can ask it what color a car is, or what some ideas for dinner might be based on what is in your fridge, but if you show it an image of a room and ask it where the chair is, it may not answer the question correctly.

Latency can also be reduced by downsizing your images ahead of time so that they are no larger than the maximum expected size. For low-res mode, a 512px x 512px image is expected. For high-res mode, the short side of the image should be less than 768px and the long side should be less than 2,000px. You can use the AIC.EnsureFormatResize command to automatically generate resized versions of pictures.
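The high-res size limits can be turned into a target size with a little arithmetic. The following Python sketch is not part of the SPR and does not show how AIC.EnsureFormatResize works internally; it only illustrates the calculation. The function name `fit_for_high_res` and its parameters are hypothetical, and the sketch assumes that the 768px and 2,000px values may be treated as inclusive upper bounds.

```python
from fractions import Fraction


def fit_for_high_res(width, height, max_short=768, max_long=2000):
    """Return (new_width, new_height) that fits the limits, keeping the aspect ratio.

    Hypothetical helper for illustration only. Images that already fit
    are returned unchanged (the function never enlarges).
    """
    short_side, long_side = min(width, height), max(width, height)
    # Exact scale factor: the strictest of "no enlargement", the short-side
    # limit and the long-side limit. Fractions avoid float rounding surprises.
    scale = min(Fraction(1),
                Fraction(max_short, short_side),
                Fraction(max_long, long_side))
    # Truncate so that no side exceeds its limit; keep at least 1 pixel.
    return max(1, int(width * scale)), max(1, int(height * scale))


print(fit_for_high_res(4000, 3000))  # (1024, 768)
print(fit_for_high_res(640, 480))    # (640, 480), already small enough
```

Using a single scale factor for both sides keeps the proportions of the picture, so only its resolution changes.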

Currently, the OpenAI Vision API will only accept ".jpg" files.

Therefore, it is best to provide ".jpg" files.

However, if you provide other file types, the SPR internally generates jpg versions of these files in the "?temp\" folder (the Windows temp folder) and uses these instead.

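' Set the API key
AIC.Set Key|File
$$PAT=F:\Testfolder
' Store the picture file names in array 0
ARR.Set Array|0|0|$$PAT\Testpic_XL.jpg
ARR.Set Array|0|1|$$PAT\Testpic_XA.png
$$PRO=Tell me the difference between the two pictures
' Send the pictures from array 0 together with the prompt; the answer is placed in $$RET
AIC.Ask Multi Vision|0|$$PRO|$$RET
AIC.Show Error
MBX.$$RET
ENR.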

Using the same pictures and prompts, you may get very different results.

Syntax

AIC.Ask Multi Vision|P1|P2[|P3]

Parameter Explanation

P1 - Array number: an integer between 0 and 32 indicating the SPR array that contains the file names.

P2 - Prompt: tells the AI what you want to know about the pictures.

P3 - (optional) Variable that will receive the result. If omitted, the result is placed on the TOS (top of stack).

Example

'***********************************
' AIC.Ask Multi Vision
'***********************************
AIC.Set Key|File
$$PAT=F:\00_MR\MR_Komponents\U1-DLL
ARR.Set Array|0|0|$$PAT\Testpic_XL.jpg
ARR.Set Array|0|1|$$PAT\X513.jpg
ARR.Set Array|0|2|$$PAT\C.jpg
$$PRO=Tell me the difference between the 3 pictures
AIC.Ask Multi Vision|0|$$PRO|$$RET
AIC.Show Error
MBX.$$RET
ENR.

Remarks

It's crucial to ensure the array number (P1) is within the valid range (0-32). Providing an out-of-range number will result in an error.

Limitations:

Is there a maximum number of images you can send?

On the SPR side, there is no limitation other than memory and bandwidth limits.