String Operations




MiniRobotLanguage (MRL)

 

STR.RegExpr

Searches a target string for a pattern defined by a regular expression and returns the position and length of the match.

 

 

Intention

 

The STR.RegExpr command scans a target string (P1) for text that matches the regular expression mask (P2) and returns the position and length of the match.

If a match is found, MatchPositionVar (P4) and MatchLengthVar (P5) can be used for further string operations.

If no match is found, both MatchPositionVar and MatchLengthVar are set to zero.

The search is case-insensitive by default.

The ^ and $ operators match at the start and end of the whole string as well as at line delimiters.

To maximize performance, avoid overuse of the *, +, and ? metacharacters.
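The no-match rule above can be checked with a short script. This is a minimal sketch modeled on the sample in the Example section; the variable names $$TXT, $$MSK, $$POS and $$LEN are placeholders chosen for illustration.

```mrl
' The target string contains no digit, so the mask cannot match
$$TXT=no digits here
$$MSK=[0-9]+
STR.RegExpr|$$TXT|$$MSK|1|$$POS|$$LEN
' According to the rule above, both variables should now be zero
MBX.$$POS--$$LEN
ENR.
```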

 

 

 

 

Syntax

 

 

STR.RegExpr|P1|P2[|P3|P4|P5]

 

 

 

Parameter Explanation

 

P1: TargetString: The string in which to search for the pattern.

P2: RegExMask: The regular expression pattern to search for.

P3: StartPosition (Optional): The position in the target string at which the search starts.

P4: MatchPositionVar (Optional): Variable that receives the starting position of the match. If omitted, the result is placed on the Top of Stack (TOS).

P5: MatchLengthVar (Optional): Variable that receives the length of the match. If omitted, the result is placed on the Top of Stack (TOS).
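A minimal sketch of how the optional parameters might be combined, modeled on the sample in the Example section. The variable names are placeholders, and the start position 2 is an arbitrary value chosen so that the first occurrence of the word is skipped. Whether the reported position is counted from the start of the string or from P3 is not stated here, so no result is shown.

```mrl
' The target string contains "cat" twice, at characters 1 and 9
$$TXT=cat and cat
$$MSK=cat
' P3=2 starts the search at the second character,
' so the second "cat" should be the one that is found
STR.RegExpr|$$TXT|$$MSK|2|$$POS|$$LEN
MBX.$$POS--$$LEN
ENR.
```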

 

Metacharacters Table for STR.RegExpr

Char

Definition

.

(period) Matches any character, except the end-of-line.

^

(caret) Matches the actual beginning-of-line position or the preceding line-delimiter character pair (CHR$(13,10) or $CRLF).

$

(dollar) Matches the end-of-line position or the first line-delimiter character pair (CHR$(13,10) or $CRLF) that is encountered.

|

(vertical bar) Specifies alternation (the "or" operator): the expression on either side can match.

?

(question mark) Specifies that zero or one match of the preceding sub-pattern is allowed.

+

(plus) Specifies that one or more matches of the preceding sub-pattern are allowed.

*

(asterisk) Specifies that zero or more matches of the preceding sub-pattern are allowed.

[ ]

(square brackets) Identifies a user-defined class of characters.

[-]

(hyphen) Identifies a range of characters to match.

[^]

(caret) Identifies a complemented class of characters, which will not match.

 

Character Classes Table for STR.RegExpr

Symbol

Definition

[ ]

Square brackets identify a user-defined class of characters, any of which will match. For example, [abc] will match a, b, or c.

\\, \-, \], \e, \f, \n, \q, \r, \t, \v, \x##

These are the only special metacharacters recognized within a class definition. Any other use of a backslash yields an undefined operation.

[-]

The hyphen identifies a range of characters to match. For example, [a-f] will match a, b, c, d, e, or f.

 

Characters in an individual range must occur in the natural order. For example, [f-a] will match nothing.

 

Multiple ranges in a class are valid. For example, [a-d2-5] matches a, b, c, d, 2, 3, 4, or 5.

 

When the hyphen is escaped, it is treated as a literal. For example, [a\-c] matches a, -, or c.

[^]

When the caret appears as the first item in a class definition, it identifies a complemented class of characters, which will not match. For example, [^abc] matches any character except a, b, or c.

 

A range can also be specified for the complemented class. For example, [^a-z] matches any character except a through z.
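A minimal sketch of a range class in use, modeled on the sample in the Example section. The variable names are placeholders. Counting by hand, the digits start at character 7 and are 4 characters long, so those are the values the rules above would be expected to produce.

```mrl
$$TXT=Order 4711 shipped
' [0-9]+ matches one or more characters from the range 0 to 9
$$MSK=[0-9]+
STR.RegExpr|$$TXT|$$MSK|1|$$POS|$$LEN
' Expected by hand count: position 7, length 4
MBX.$$POS--$$LEN
ENR.
```
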

Tags/Sub-patterns Table for STR.RegExpr

Symbol

Definition

( )

Parentheses are used to match a Tag, or sub-pattern, within the full search pattern. The matched sub-pattern can be retrieved later in the mask or in a replace operation with \01 through \99.

 

Parentheses can also be used to force precedence of evaluation with the alternation operator |.
For example, "(Begin)|(End)File" would match either "BeginFile" or "EndFile", but without the Tag designations, "Begin|EndFile" would only match either "BeginndFile" or "BegiEndFile", because the alternation then applies only to the adjacent characters n and E.

 

Note: Parentheses may not be used with ?, +, * as any match repetition could cause the tag value to be ambiguous. To match repeated expressions, use parentheses followed by \01*.
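The Note above can be tried with a short script. This is a sketch only, modeled on the sample in the Example section and not verified against the engine; the variable names are placeholders. In the target string the run of "ab" starts at character 3 and is 6 characters long, so a longest match would be expected to cover all of it.

```mrl
$$TXT=xxababab
' (ab) defines tag 01, and \01* allows zero or more repetitions of that tag
$$MSK=(ab)\01*
STR.RegExpr|$$TXT|$$MSK|1|$$POS|$$LEN
MBX.$$POS--$$LEN
ENR.
```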

 

Escaped Characters Table for STR.RegExpr

Escape Sequence

Definition

\\

Backslash, treated as a literal value.

\b

A word boundary. Matches the start or end of a word.

\c

Enables case-sensitive search.

\e

Escape character: CHR(27) or "ESC".

\f

Formfeed character: CHR(12) or "FF".

\n

Linefeed (or newline) character: CHR(10) or "LF".

\q

Double-quote mark ("): CHR(34) or "DQ".

\r

Carriage-return character: CHR(13) or "CR".

\s

Shortest match character.

\t

Horizontal tab character: CHR(9) or "TAB".

\v

Vertical tab character: CHR(11) or "VT".

\x##

Hex character code.

\##

Tag number. Evaluated as the characters matched by tag number ##.
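A minimal sketch contrasting the default case-insensitive search with \c, modeled on the sample in the Example section. The variable names are placeholders, and placing \c at the very start of the mask is an assumption, since the table does not say where it has to appear.

```mrl
$$TXT=Error and error
' Default search is case-insensitive, so "Error" at character 1 should match
$$MSK=error
STR.RegExpr|$$TXT|$$MSK|1|$$POS|$$LEN
MBX.$$POS--$$LEN
' With \c the search is case-sensitive, so only the lowercase "error" should match
$$MSK=\cerror
STR.RegExpr|$$TXT|$$MSK|1|$$POS|$$LEN
MBX.$$POS--$$LEN
ENR.
```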

 

Restrictions

 

Example

 

'***********************************

' STR.-Sample

'***********************************
' Using STR.RegExpr to find the mail address

$$STA=please mail to support@smart-package.com in all cases.

$$MSK=([a-z0-9._/+-]+)(@[a-z0-9.-]+)

STR.RegExpr|$$STA|$$MSK|1|$$REA|$$REB

MBX.$$REA--$$REB

' Returns: 16 - 25 (match position 16, match length 25)

ENR.

 

 

 

 

 

Remarks

 

-

 

 

Limitations:

  To maximize performance, avoid overuse of the *, +, and ? metacharacters.

 

 

See also: