AIN.AskV

MiniRobotLanguage (MRL)

Perform a Vision Chat Completion with an Image

Intention

AskV Command: Vision-Based Chat Completion

The AIN.AskV command performs vision-based chat completions using a local Ollama model (e.g., llava). It accepts a text prompt and an image input, which can be a URL, local filepath, or base64-encoded string.

Because local files and embedded base64 data can be passed directly, images do not need to be hosted anywhere, and processing stays on-premises through AnythingLLM.

It’s part of the AIN - AnythingLLM AI suite.

What is the AskV Command?

The AIN.AskV command sends a POST request to the AnythingLLM API (default: http://localhost:3001/api/chat), using a local Ollama model like llava to analyze an image and respond to a prompt.

The JSON payload includes message (prompt), either imageUrl (for URLs) or imageData (for base64 from files or strings), mode ("chat"), maxTokens (default 4096), and temperature (default 0.7).

Ollama itself listens on http://localhost:11434; AnythingLLM forwards the request to it, so all processing stays local.
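The payload described above can be sketched in Python as follows. This is only an illustration: the field names (message, imageUrl, imageData, mode, maxTokens, temperature) and the defaults are taken from this page, while the helper name build_askv_payload is hypothetical and the exact request that AIN.AskV sends is not confirmed here.

```python
import json

def build_askv_payload(prompt, image_url=None, image_data=None,
                       max_tokens=4096, temperature=0.7):
    """Assemble a JSON body with the fields this page lists (hypothetical helper)."""
    # Exactly one image source is expected: a URL or base64 data.
    if (image_url is None) == (image_data is None):
        raise ValueError("provide exactly one of image_url or image_data")
    payload = {
        "message": prompt,          # the text prompt (P1)
        "mode": "chat",
        "maxTokens": max_tokens,    # default 4096 per this page
        "temperature": temperature, # default 0.7 per this page
    }
    if image_url is not None:
        payload["imageUrl"] = image_url
    else:
        payload["imageData"] = image_data
    return payload

# Serialize a URL-based request body.
body = json.dumps(build_askv_payload(
    "What's on this screen?",
    image_url="http://localhost:8080/screen.jpg",
))
```

Keeping imageUrl and imageData mutually exclusive mirrors the "either ... or" wording of the payload description.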

Why Do You Need It?

Key use cases include:

• Flexible Input: Supports URLs, local files, or base64 strings for vision tasks.

• Local Automation: Directly process filesystem images without hosting.

• Privacy: Keeps all data on-premises with Ollama.

How to Use the AskV Command?

Provide a prompt and an image input (URL, filepath, or base64 string). Optionally, specify a response variable and clipboard flag.

AskV requires Ollama running at http://localhost:11434 and connected to AnythingLLM. Before calling it, set the API key with AIN.SetKey and select a vision model such as llava with AIN.SetModel.

Example Usage

' Using a local filepath

AIN.SetKey|your_api_key_here

AIN.SetModel|llava

AIN.AskV|What’s on this screen?|C:\Screenshots\screen.jpg|$$RES|1

DBP.Screen Content: $$RES

Analyzes a local screenshot, converting it to base64 internally. The response is stored in $$RES and, because P4 is "1", also copied to the clipboard.

' Using a base64 string

AIN.AskV|Describe this image|data:image/jpeg;base64,/9j/4AAQSkZJRg...|$$OUT

DBP.Image Description: $$OUT

Processes a base64-encoded image directly.

Syntax

AIN.AskV|P1|P2[|P3][|P4]

Parameter Explanation

P1 - Text prompt (required), e.g., "What’s on this screen?"

P2 - Image input (required): URL (e.g., "http://localhost:8080/screen.jpg"), filepath (e.g., "C:\Screenshots\screen.jpg"), or base64 string (e.g., "data:image/jpeg;base64,...").

P3 - (Optional) Variable that receives the response, e.g., "$$RES". If omitted, retrieve the response with AIN.GetRaw.

P4 - (Optional) "1" copies the response to the clipboard; omit it or pass "0" to skip copying.
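How AskV tells the three P2 forms apart is not documented on this page. The Python sketch below shows one plausible heuristic, purely as an assumption: the function name classify_image_input and the rules it applies are illustrative and are not the command's actual logic.

```python
import re

def classify_image_input(value):
    """Guess which kind of image input a P2 string is (hypothetical heuristic)."""
    if re.match(r"^https?://", value, re.IGNORECASE):
        return "url"
    if value.startswith("data:image/"):
        return "data_uri"
    # Windows drive path (C:\...) or UNC path (\\server\share\...).
    # Relative paths are not handled in this sketch.
    if re.match(r"^[A-Za-z]:[\\/]", value) or value.startswith("\\\\"):
        return "file"
    # Anything else is treated as a raw base64 string.
    return "base64"

print(classify_image_input("http://localhost:8080/screen.jpg"))
print(classify_image_input(r"C:\Screenshots\screen.jpg"))
print(classify_image_input("data:image/jpeg;base64,/9j/4AAQSkZJRg"))
```

Note that raw base64 for a JPEG begins with "/9j/", so a check for a leading slash would misread it as a path; the sketch therefore only recognizes Windows-style paths.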

Example

AIN.SetKey|your_api_key_here

AIN.SetModel|llava

AIN.AskV|What’s in this photo?|C:\Photos\photo.jpg|$$OUT

DBP.Photo Description: $$OUT

ENR.

Describes a local photo, converting it to base64 internally.

Remarks

- Requires a vision-capable Ollama model (e.g., llava), installed via ollama pull llava.

- Local filepaths are converted to base64; ensure files are readable.

- Base64 inputs can be raw strings or data URIs (e.g., data:image/jpeg;base64,...).

- AnythingLLM must support imageData for base64 inputs.
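The filepath and base64 handling in the remarks above can be approximated in Python as shown below. This is a sketch under assumptions: the helper names are hypothetical, and whether AnythingLLM expects bare base64 or a full data URI in imageData is not stated on this page, so stripping the prefix is only one possible normalization.

```python
import base64

def file_to_base64(path):
    """Read an image file and return its contents as base64 text."""
    with open(path, "rb") as fh:
        return base64.b64encode(fh.read()).decode("ascii")

def strip_data_uri(value):
    """Return the bare base64 part of a data URI; raw strings pass through unchanged."""
    if value.startswith("data:") and "," in value:
        return value.split(",", 1)[1]
    return value

# A data URI and a raw string normalize to the same base64 text.
print(strip_data_uri("data:image/jpeg;base64,/9j/4AAQ"))
print(strip_data_uri("/9j/4AAQ"))
```

If the file cannot be opened, open() raises an OSError, which corresponds to the requirement that files be readable.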

Limitations

- Base64 input, including local files that are converted to base64, depends on AnythingLLM accepting the imageData field; this was untested as of March 20, 2025.

- Large files or base64 strings may increase processing time or exceed payload limits.
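To judge whether an image is likely to run into a payload limit, it helps to know that base64 encoding grows the data by about one third. The Python sketch below computes the encoded length; the actual limit of AnythingLLM is not given on this page, so limit_bytes is a hypothetical value that you would need to determine yourself.

```python
def base64_length(num_bytes):
    """Length of padded base64 text for num_bytes of binary input."""
    # Every 3 input bytes (rounded up) become 4 output characters.
    return 4 * ((num_bytes + 2) // 3)

def fits_limit(num_bytes, limit_bytes):
    """Check the encoded image size against an assumed payload limit."""
    return base64_length(num_bytes) <= limit_bytes

# A 3,000,000-byte image needs 4,000,000 base64 characters.
print(base64_length(3_000_000))
```

The check covers only the image data; the prompt and the other JSON fields add a small amount on top.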

See also:

• AIN.Ask

• AIN.SetModel

• AIN.SetKey