Smart Package Robot's XML-Features

Smart Package Robot's XML-Commands


What is XML good for?

 

XML stands for Extensible Markup Language. It was developed at the World Wide Web Consortium (W3C) by a group of people who wanted to improve on HTML and SGML, and it was published as a formal specification in February 1998.

 

You may have XML data files that were produced by other software.

You may want to generate or analyze XML data from any source.



The question is: how can you act on these data files?



The answer is implemented in the XML. commands.

These commands make it easy to work through such XML files.

 

What can you do with these XML-Commands?

 

    Parse the XML-File and search for Tags, Attributes or data in these files.

    Generate new XML Files and copy parts of the old files to these new files.

 

Here is some more information about XML.



A simple XML document could look like this:

 

<?xml version="1.0" encoding="ISO-8859-1"?>

<note>

 <to>Peter</to>

 <from>Ralph</from>

 <heading>Reminder</heading>

 <body>Don't forget making your script!</body>

</note>

 

 

 

Clean and UnClean your Data

 

As you can see, the data is enclosed and described using "tags".

We have "Start-Tags" like <note>

and we have "End-Tags" like </note>.

Between these we have data, which can itself be XML or other data.

Note that some characters should not appear in this data, because they are reserved for the markup itself. These characters are:

 

& - should be replaced with &amp;

< - should be replaced with &lt;

> - should be replaced with &gt;

" - should be replaced with &quot;

' - should be replaced with &apos;

 

For this purpose the XML. command has the "Clean" option.

It prepares your data so that it is valid as XML data.

If you want your data back in its original state, just use the "UnClean" option.

 

Here is an example:

 

'***********************************

' XML.-Clean/UnClean Sample

'***********************************

'

$$XMF=My current <XML\> is &new

' Now we clean it and then undo the cleaning

XML.Clean|$$XMF|$$XMC

XML.UnClean|$$XMC|$$XMU

PRT.Original:    $$XMF

PRT.After Clean: $$XMC

PRT.--------------

PRT.UnCleaned:   $$XMU

MBX.!

ENR.
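
For readers who want to see the replacement rule spelled out, here is a minimal sketch of the same idea in Python. It is not SPR code and does not claim to show how XML.Clean works internally. The helper names clean and unclean are made up for this illustration, and the sketch only applies the five replacements listed above.

```python
from xml.sax.saxutils import escape, unescape

# Hypothetical helper names; they only mirror the idea of XML.Clean / XML.UnClean.
def clean(text):
    """Replace the five reserved characters with their XML entities."""
    # escape() handles &, < and > itself (the ampersand first);
    # the two quote characters are passed as extra replacements.
    return escape(text, {'"': "&quot;", "'": "&apos;"})

def unclean(text):
    """Turn the five XML entities back into plain characters."""
    # unescape() handles &lt;, &gt; and &amp; itself (the ampersand last).
    return unescape(text, {"&quot;": '"', "&apos;": "'"})

original = 'Tom & "Jerry" <b>'
cleaned = clean(original)
print(cleaned)            # Tom &amp; &quot;Jerry&quot; &lt;b&gt;
print(unclean(cleaned))   # Tom & "Jerry" <b>
```

The order matters: the ampersand has to be replaced first when cleaning and last when uncleaning, otherwise the "&" inside an already generated entity would be converted a second time.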

 


 

 

 

 

Self-terminated Tags



Some XML tags do not have a closing tag because they are "self-terminated".

They look like this:



' This is a "self-terminated" tag.

<Name/>

 

The first tag in the sample above is the XML declaration. Because it is the first "<" in the file, the parser treats it as the "Root-Tag". It is also a system tag, because it starts with "<?".

System tags are always self-terminated. They do not have an explicit closing tag.



' This is a system tag; it is also "self-terminated".

<?xml version="1.0" encoding="ISO-8859-1"?>



Comment tags are also self-terminated:

 

<!-- this is a comment tag -->

 

 

If you want to learn more about XML, see the web resources listed at the end of this page.

 

Here are a few important notes:

 

1.  XML is not HTML; it has stricter format rules.



2.  XML tags are case sensitive.



3.  All XML elements must have a closing tag unless they are "self-terminated".



4.  You must nest your tags correctly. This is *not* allowed:

      <B><I>Something</B></I>



5.  Tags *must* be nested correctly, like this:

      <B><I>Something</I></B>



6.  XML documents should have a root element.

    The built-in parser, however, will take the first "<" it finds as the root element.

 

7.  Currently the XML parser is not an XML validator. Instead of complaining when you have not

    followed all the rules of XML, it tries its best to understand what YOU want. You may, however,

    get error messages during processing if you do not obey the rules. If problems appear, the

    XML commands always set the "Timeout-Flag" and store the error. You can retrieve the stored

    error description like this:

 

XML.load|$$MYF

' Test the timeout-flag for errors

' and jump to a label if there were problems

JIT.Lab_Error

' No error: end here so the script does not run into the error handler

ENR.



:Lab_Error

' Get the error message into a variable

XML.get error|$$ERT

DBP.$$ERT

END.

 

You can also retrieve an error number, which is easier to act on in a script:

 

XML.get error|$$ERT|$$ERN

 

8. White space and carriage returns are preserved

 

9.  Attribute values of tags must be enclosed in either single or double quotes.

   If you use single quotes, then double quotes are allowed in the value, and vice versa.

 

10. Non-tag data is expected to be free of <, >, and &. The same goes for attribute values.

    The parser may still parse such data properly, but you should replace these characters as

    described above, using the supplied "clean" and "unclean" options.
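
The format rules above can be tried out with any strict XML parser. The following is a minimal Python sketch that uses the Python standard library, not SPR's parser (which, as note 7 says, is more forgiving). The helper name is_well_formed is made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical helper: True if a strict parser accepts the text.
def is_well_formed(xml_text):
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True

samples = [
    "<B><I>Something</I></B>",        # correctly nested
    "<B><I>Something</B></I>",        # wrongly nested (note 4)
    "<note><to>Peter</to></Note>",    # tag names differ in case (note 2)
    "<Name/>",                        # self-terminated tag (note 3)
    "<a x='say \"hi\"'/>",            # double quotes inside single quotes (note 9)
]
for sample in samples:
    print(is_well_formed(sample), sample)
```

A strict parser rejects the second and third sample. The SPR parser may still load such data, but it is safer not to rely on that.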

 

Using the CVF. (Convert File) command, you can convert XML files into more readable text files that can be displayed in any text editor. In most cases it is enough to replace every "><" with ">$crlf<".
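
The effect of that replacement can be sketched in a few lines of Python. This is only an illustration of the text substitution, not of the CVF. command itself; the function name break_between_tags is made up, and "\r\n" stands in for $crlf.

```python
# Hypothetical helper: put every tag that directly follows another tag on its own line.
def break_between_tags(xml_text, newline="\r\n"):
    return xml_text.replace("><", ">" + newline + "<")

one_liner = "<note><to>Peter</to><from>Ralph</from></note>"
print(break_between_tags(one_liner))
```

Tags that are separated by data, such as <to>Peter</to>, stay on one line, because only a ">" that is directly followed by a "<" is split.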

 

In general you could also work with XML files using only the available String-Commands.

Using the XML-Commands, however, makes things much easier.



The following tutorials show how this works, and why it is much easier this way.

 

 

 

How does it work generally?

 

There are a few commands that can always be used, even if no XML file has been parsed.

These are:



    Parse        - parse a file that is inside a variable

    Load         - load a file from the file system

    Clear        - clear / delete the previously loaded XML file and all flags

    Clean Text   - prepare a text for use as XML data

    Unclean Text - undo the "clean" command

    Get Error    - retrieve the XML error status

    Clear Error  - clear the XML error status and all error flags and variables

 

There are also some IML. commands that can be used on any text.

Currently these are the ones listed under "no parse":

[Screenshot: the "no parse" commands]

 

All the other XML/IML.-Commands require an XML-File or XML-Data to be loaded first.

 

While loading/parsing, the XML parser engine attaches internal flags to every loaded character. Loading an XML file is as simple as:

 

$$MYF=C:\filename.xml

XML.load|$$MYF

 

The load command automatically calls the parse command.

If you already have the XML data in a variable, you can use the parse command directly:

 

$$XML=3k<body attrib="value">Hallo nothing="fake"</body>

XML.parse|$$XML

 

After this, the file is in the internal XML memory and has been tagged with "Flags" that make it much easier to work with. You can read more about these "Flags" below.

Using these flags you can later navigate inside the XML document more easily.

You can see the flags using the



XML.dump



command. Let's look at an example that uses the built-in "dump" option.



[Screenshot: output of XML.dump for the script below]

 

 

' This is a XML-"One-Liner"

$$XML=3k<body attrib="value">Hallo nothing="fake"</body>

' First we need to parse it

XML.parse|$$XML

' Here we check if there was an error during the parsing

JIT.Lab_Err

' Now we display the internal tables

XML.dump

' This is just to wait until you want to close the script

MBX.Wait

ENR.

 

' This is the part that is called if an error occurs

' during the parsing of the XML

:Lab_Err

XML.get error|$$ERT

DBP.$$ERT

ENR.

 

The screenshot above shows the output of this script.



On the far left of the output we see the "address" of each byte of XML code.



Addresses start at 1 for the first character and count up to the length of the file.



Note that several XML commands will never move to a position before the root address, that is, before the first "<" in the file. In our case the root address is 3, which is the address of the very first start tag.



None of the XML commands will go past the end address, the last byte in the XML file.

Doing so would lead to a GPF (Windows General Protection Fault) and is therefore prevented.

 

The second number from the left shows the "Level" ("L:") of the character.

Each start tag increases the level by one, while each end tag decreases it by one.



Then come "A:", the ASC value (character code) of the letter, and "C:", the character itself.

Note that characters that are not printable are replaced by a "-".
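
To make the address and level columns more tangible, here is a rough Python sketch that builds a similar table for the one-liner used above. It is an illustration only and does not reproduce the real XML.dump output or the flag list. The function names are made up, and it is an assumption of this sketch that the level goes up at the "<" of a start tag and goes down again after the ">" of an end tag.

```python
# Hypothetical helper: one row per character, with a 1-based address.
def make_row(index, level, ch):
    shown = ch if ch.isprintable() else "-"   # non-printable characters become "-"
    return (index + 1, level, ord(ch), shown)

def dump_table(xml_text):
    """Return (address, level, asc, char) rows for every character."""
    rows = []
    level = 0
    i = 0
    while i < len(xml_text):
        if xml_text[i] == "<":
            # Simplification: everything up to the next ">" counts as one tag.
            end = xml_text.find(">", i)
            if end == -1:
                end = len(xml_text) - 1
            tag = xml_text[i:end + 1]
            is_end_tag = tag.startswith("</")
            no_close_needed = tag.endswith("/>") or tag.startswith(("<?", "<!"))
            if not is_end_tag and not no_close_needed:
                level += 1                    # a start tag opens a new level
            for j in range(i, end + 1):
                rows.append(make_row(j, level, xml_text[j]))
            if is_end_tag:
                level -= 1                    # an end tag closes the level again
            i = end + 1
        else:
            rows.append(make_row(i, level, xml_text[i]))
            i += 1
    return rows

sample = '3k<body attrib="value">Hallo nothing="fake"</body>'
for address, level, asc, char in dump_table(sample)[:5]:
    print(f"{address:4d} L:{level} A:{asc:3d} C:{char}")
```

In this sketch the two leading characters "3k" are on level 0 and the first "<" gets address 3, which matches the root address mentioned above.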

 

After that, on the right you will see a comma separated list with flags that are being assigned to each byte in the XML-code. You can work with these flags using the commands:

 

XML.search inside down|$$FLG

XML.search inside up|$$FLG

 

These two commands search "inside", that is, between the "<" and the ">" brackets.



If you want to search for anything that is outside (including the "<" and ">" themselves), you need to use these commands:

 

XML.search all down|$$FLG

XML.search all up|$$FLG

 

To search something you simply define $$FLG as one of the SPR XML-Flags below.

For example:

 

$$FLG=open bracket

XML.search all down|$$FLG

 

will search for an "open bracket" in the "down" direction. Note that if the current position is already on an "open bracket", then this one will be found!

Therefore you may want to use the



XML.increment position



command before you start the search. The position that is found is placed on the TOS (Top of Stack) unless you specify a return variable, like this:

 

XML.increment position

$$FLG=open bracket

XML.search all down|$$FLG|$$POS

 

Even if you do not use a return variable, every search remembers the "actual position" internally. Just as there is always a currently located window, the XML parser always has an "actual position".

Immediately after parsing the XML file, the actual position is set to the "Root-Position",

that is, the first "<" in the file.



A search can either find something or not. If nothing is found, the actual position is not assigned!

The search command returns zero in this case.

If you want to find out the actual position, you can use the

 

XML.get position|$$POS

 

command. In the same way you can set the actual position to another place in the XML-File using the

 

$$POS=79

XML.set position|$$POS

 

commands. This way you can jump up and down in the XML file as you like.
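
The behaviour of the "actual position" described above can be modelled with a small Python class. This is a toy model, not the SPR implementation: the class and method names are made up, it searches for a plain character instead of an SPR XML-Flag, and it simply encodes the rules stated in the text (positions are 1-based, the start position is the first "<", a search also finds a match at the current position, and a failed search returns 0 and leaves the position alone).

```python
class XmlCursor:
    """Toy model of the parser's "actual position" (hypothetical, 1-based)."""

    def __init__(self, text):
        first = text.find("<")
        if first == -1:
            raise ValueError("no '<' found, so there is no root position")
        self.text = text
        self.root = first + 1          # address of the first "<"
        self.position = self.root      # the position right after parsing

    def increment(self):
        # Never step past the last character of the text.
        if self.position < len(self.text):
            self.position += 1

    def search_down(self, ch):
        # The search starts AT the current position, so a match there is found.
        index = self.text.find(ch, self.position - 1)
        if index == -1:
            return 0                   # not found: position stays unchanged
        self.position = index + 1
        return self.position

cursor = XmlCursor("3k<a><b/></a>")
print(cursor.search_down("<"))   # 3, the bracket the cursor already stands on
cursor.increment()
print(cursor.search_down("<"))   # 6, the next bracket
print(cursor.search_down("?"))   # 0, nothing found
print(cursor.position)           # 6, unchanged by the failed search
```

The first two searches show why an increment is needed before repeating a search: without it, the same bracket would be found again and again.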

 

Note that these are just the basics. There are several more commands that make it easier for you to find specific places or data inside the XML file.

 

To understand the internals of the XML-Parser better you can read the Chapter:

 

SPR - XML-Parser Flags

 

 

 

 

 

 

 

 

We will not explain XML in more detail here, as this information is easy to find on the World Wide Web, for example here:

 

 

1. XML-Basics

2. XML-Schools

3. XML-Validator