MiniRobotLanguage (MRL)
AIG.ComputerUse
Autonomous UI Element Finding via AI Vision
Intention
To find the exact screen coordinates of a UI element (button, icon, text field) based solely on a natural language description. This allows the robot to "see" the screen and click items even if they don't have standard Windows Controls or IDs.
This is a high-level "Agent" command that performs a complex workflow automatically:
1. Captures: Takes a screenshot of the specified window (or the active window).
2. Optimizes: Resizes and converts the image according to AIG.SetCuOptions (default: 1500px width, JPG) to keep requests fast and inexpensive.
3. Analyzes: Sends the image to Google Gemini together with your description.
4. Calculates: Translates the AI's response back into real-world screen coordinates.
5. Returns: Stores the X and Y coordinates in the variables you provide.
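The arithmetic behind steps 2 and 4 can be sketched as follows. This is an illustration in Python, not the command's actual implementation: the function names `scaled_size` and `map_point` are hypothetical, and it is an assumption that the model reports a point in the pixel space of the downscaled image (or on a fixed normalized grid), which then has to be scaled back to the size of the original window.

```python
def scaled_size(win_w, win_h, max_width=1500):
    """Size of the screenshot after downscaling to at most max_width pixels wide.

    The aspect ratio is preserved; smaller windows are left unchanged.
    """
    if win_w <= max_width:
        return win_w, win_h
    scale = max_width / win_w
    return max_width, round(win_h * scale)


def map_point(x, y, src_w, src_h, dst_w, dst_h):
    """Map a point from a source coordinate space to a destination space.

    src_* is the space the model answered in (assumed: the resized image,
    or a normalized grid such as 1000 x 1000); dst_* is the real window size.
    """
    return round(x * dst_w / src_w), round(y * dst_h / src_h)


# A 3000 x 2000 window is sent to the model as a 1500 x 1000 image.
img_w, img_h = scaled_size(3000, 2000)

# A point reported at (750, 500) in that image maps back to the window center.
window_xy = map_point(750, 500, img_w, img_h, 3000, 2000)
print(img_w, img_h, window_xy)
```

Because only the ratio between the two coordinate spaces matters, the same mapping applies whether the model answers in image pixels or on a normalized grid.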
- Robustness: Finds elements even if they move, change color, or the screen resolution changes.
- Ease of Use: No need to capture "Image Search" bitmaps or record coordinate offsets.
- Semantic Finding: You can ask for "The red delete icon" or "The third checkbox in the list".
Example
' 1. Select the target window (e.g. Calculator)
STW.t|Calculator
HTV.$$HND
' 2. Ask the AI to find the equals button
AIG.ComputerUse|$$HND|the equals sign button|$$X|$$Y
' 3. Check results (0,0 means not found)
IVV.$$X>0
' 4. Click it (using Window Relative coordinates)
STW.h|$$HND
MLC.t|$$X,$$Y
ELS.
DBP.Element not found!
EIF.
Syntax
AIG.ComputerUse|P1|P2|P3|P4
AIG.cpu|P1|P2|P3|P4
Parameter Explanation
P1 - (String/Int) Window Handle.
   - Provide a handle variable (e.g., `$$HND`) to capture a specific window.
   - Pass `0` or an empty string to capture the currently active window.
P2 - (String) Description of the element.
   - e.g., "The blue Login button", "The gear icon in the top right".
P3 - (Variable) Variable to receive the X coordinate (Integer).
P4 - (Variable) Variable to receive the Y coordinate (Integer).
Returns
- Sets P3 and P4 to the pixel coordinates of the element, normally relative to the client area of the captured window.
- If the element is not found or an error occurs, P3 and P4 are set to 0.
See also:
- AIG.SetCuOptions (Configure resolution/quality)
- AIG.AskVision (Manual vision requests)