FIF. - Find in File

MiniRobotLanguage (MRL)

FIF. Command

Find in File

Intention

This command can be used to find Text or Patterns in files.

You can search in binary files or in text-files.

Search in Text-Files is line-based.

In all files, Wildcard-Search tries to deliver the optimum result.

Wildcards use:

? Any single character

* Zero or more characters

# Any single digit (0-9)

[charlist] Any single character in 'charlist'

[!charlist] Any single character not in 'charlist'

Example:

$$TXT=Peter*Müller

FIF.tw|$$FIL|$$TXT|3

This searches a text file for the 3rd occurrence of a name matching "Peter (anything) Müller".

It will find

"Peter Ralf Müller"

as well as

"Peter Klaus Müller"

The command returns

- the found name and

- the position where it was found (for text searches: the line number plus the start and end position within that line).

It will try to find the smallest fitting match.

You can add "word" to the prototype to enable an additional tracer that extends the found text to the end of the word.

You can add "nocase" to any text search to make it case-insensitive.

Usage is simple:

' Specify a filename

$$FIL=?path\Test.txt

' Set a Pattern to search for

$$SRC=11.?7.*

' start the desired search

FIF.tsn|$$FIL|$$SRC

' See the results on TOS

STS.DUMP

' This is just to show we are ready

MBX.!

ENR.

Syntax

FIF.P1|P2|P3[|P4]

Parameter Explanation

P1 - Prototype can be one of these:

bw - binary wildcard

bs - binary string

tw - text wildcard

ts - text string

By adding "word" you get 2 more options:

tww - text wildcard word

tsw - text string word

By adding "nocase" you get 4 more options:

twn - text wildcard nocase

tsn - text string nocase

twwn - text wildcard word nocase

tswn - text string word nocase

Generally all "binary" searches are "byte based" and not "Line based".

They will search any file "Byte by Byte".

This is slower than searching in text files, but in return you can search files that are not text files.

The result will show the start and end-byte-position in the binary file.

There is no "word"-option for binary files.

All searches with "text" are line-based. For this, the file must have a line delimiter,

preferably $crlf$. However, $cr$ or $lf$ should also work as line delimiter.

In text based searches, the result on TOS will show the 1-based Linenumber,

and the start- and end-byte-positions in the line.

In addition, the last stack position will contain the found line delimiter as a string.

This is done so you can easily reconstruct the file with the same type of delimiter,

in case you want to make a changed copy.

Adding the term "word" to the prototype will call an additional task, that will

trace the found end-position to a word-boundary.

Currently word boundaries are:

ASCII(10,13,160,32) and $TAB$ and ".,;"

The "string" search modes do not evaluate wildcard patterns. They just do an "Instring" comparison.

All comparisons are case-sensitive unless one of the "nocase" prototypes is used.

Here is an example of how the TOS looks after a text-based wildcard search.

TOS

000 - Number 1 - signals that something was found.

In case the search fails, the TOS just contains a "0".

001 - Linenumber where the search found the text

002 - Starting Byte-position of the found text (in the line)

003 - Ending Byte-position of the found text (in the line)

004 - Found text as string

005 - Line delimiter that was found in the textfile as string

Using the "word" addition we can easily get the complete date!

TOS - same layout as before, but the found string has been traced to the word end.

000 - Number 1 - signals that something was found.

In case the search fails, the TOS just contains a "0".

001 - Linenumber where the search found the text

002 - Starting Byte-position of the found text (in the line)

003 - Ending Byte-position of the found text (in the line)

004 - Found text as string

005 - Line delimiter that was found in the textfile as string

In binary searches the TOS looks a bit different:

TOS

000 - Number 1 - signals that something was found.

In case the search fails, the TOS just contains a "0".

001 - Found text as string

002 - Starting Byte-position of the found text (in the file)

003 - Ending Byte-position of the found text (in the file)
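
The binary result layout can be read from the stack in the same way as in the text samples on this page. The following is only a minimal sketch: it assumes that each use of $$000 takes the next value from the TOS (as the samples on this page do), and the file name Test.bin, the search text and the variable names are made up for illustration.

```
' Hypothetical binary file, name chosen for illustration only
$$FIL=?path\Test.bin
' Plain text to look for, no wildcards ("bs" = binary string)
$$SRC=PNG
FIF.bs|$$FIL|$$SRC
' A 0 on the TOS means nothing was found
JIZ.$$000|Lab_nf
' Binary layout: found text first, then the two byte positions
$$STR=$$000
$$STA=$$000
$$END=$$000
PRT.Found: $$STR
PRT.From Byte $$STA to $$END
:Lab_nf
MBX.!
ENR.
```

Note that, unlike the text modes, no line number and no line delimiter are expected on the stack here.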

P2 - Filename of the File to search.

P3 - Text or Wildcard-Pattern to search for

P4 - (optional) number of found result (1st .. 2nd .. 3rd etc.)

Wildcard-Pattern (in Wildcard-Mode only):

? Any single character

* Zero or more characters

# Any single digit (0-9)

[charlist] Any single character in 'charlist'

[!charlist] Any single character not in 'charlist'
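
As a sketch of how the wildcards and P4 can be combined: the pattern below is meant to match the text "Rack", then an "A" or "B", then one digit, and P4 asks for the second such match in the file. The file content, the pattern and the variable names are assumptions for illustration. Reading the results through $$000 follows the samples on this page.

```
$$FIL=?path\Test.txt
' [AB] = either "A" or "B", # = one digit (0-9)
$$SRC=Rack[AB]#
' P4 = 2: deliver the 2nd match instead of the 1st
FIF.tw|$$FIL|$$SRC|2
' A 0 on the TOS means there is no 2nd match
JIZ.$$000|Lab_nf
$$LIN=$$000
$$STA=$$000
$$END=$$000
$$STR=$$000
PRT.Second match: $$STR Found in Line: $$LIN
PRT.From Character $$STA to $$END
:Lab_nf
MBX.!
ENR.
```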

Example

'********************************************

' FIF. - Sample

'********************************************

$$FIL=?path\Test.txt

$$SRC=11.?7.*

FIF.tww|$$FIL|$$SRC

JIZ.$$000|Lab_nf

$$LIN=$$000

$$STA=$$000

$$END=$$000

$$STR=$$000

PRT.String: $$STR Found in Line: $$LIN

PRT.From Character $$STA to $$END

:Lab_nf

MBX.!

ENR.

END.

'***********************************

' FIF.-Sample

' Replace in File

' Here we use RPL to replace Text in

' a line-based Textfile

' We search for a Date like 11.?7.*

' using Wildcards ...

' and replace it with 12.08.2015

'***********************************

$$FIL=?path\Test.txt

$$TAR=?path\Target.txt

' Delete it if it existed before

DEL.$$TAR

' We search for the line with the date

$$SRC=11.?7.*

' New date - we replace the old one with this

$$NEW=12.08.2015

FIF.tww|$$FIL|$$SRC

JIZ.$$000|Lab_nf

$$LIN=$$000

$$STA=$$000

$$END=$$000

$$STR=$$000

$$TRE=$$000

PRT.String: $$STR Found in Line: $$LIN

PRT.From Character $$STA to $$END

GLC.$$FIL|$$REA

' Line where String is minus 1

CAL.$$TOL=$$LIN-1

' Copy all lines until the wanted Line

FOR.$$COP|1|$$TOL

LFF.$$FIL|$$COP|$$RES

' We use the delimiter that was found in the source file

$$TXT=$$RES$$TRE

ATF.$$TAR|$$TXT

NEX.

' Here is the Line with the date

LFF.$$FIL|$$LIN|$$TXA

RPL.$$TXA|$$STR|$$NEW

$$TXB=$$TXA$$TRE

ATF.$$TAR|$$TXB

' Now copy Rest of file

' Line where String is plus 1

CAL.$$TOL=$$LIN+1

FOR.$$COP|$$TOL|$$REA

LFF.$$FIL|$$COP|$$RES

' We use the delimiter that was found in the source file

$$TXT=$$RES$$TRE

ATF.$$TAR|$$TXT

NEX.

:Lab_nf

MBX.!

ENR.

Remarks

Internally, a brute-force algorithm is used for the wildcard search, so wildcard searches in large binary files can be quite slow.

This is related to the maximum pattern size that is checked when the pattern contains a * wildcard.

Because a * wildcard can theoretically match any length, up to the full length of the file, there is an internal limit on the maximum length of a found string.

Currently that limit is 256 by default.

You can change the maximum pattern size with the OPT.FIF|(number) command. The larger the maximum pattern size, the longer a wildcard search may take, but the larger the results you can find using one or more * wildcards.
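
A possible way to apply this is sketched below. The value 1024, the file name and the pattern are illustrative assumptions. Whether a larger limit is worth the extra search time depends on the file being searched.

```
' Raise the maximum pattern size from the default 256 to 1024
OPT.FIF|1024
$$FIL=?path\Test.txt
' The * wildcard may now cover a longer stretch between the two words
$$SRC=Begin*End
FIF.tw|$$FIL|$$SRC
' See the results on TOS
STS.DUMP
MBX.!
ENR.
```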

Limitations: