XML Format Descriptor

The format descriptor template looks like this:

<XMLLOADFORMAT NODE="">
  <TABLELIST NODE="">
    <TABLE NODE="" NAME="">
      <ROW NODE="" KEY="">
        <COLUMNLIST TYPE="">
          <COLUMN NODE="" ATTR="" NAME=""/>
        </COLUMNLIST>
        <TABLELIST NODE="">
          …
        </TABLELIST>
      </ROW>
    </TABLE>
  </TABLELIST>
</XMLLOADFORMAT>

 

The parts of the XML descriptor are:

XMLLOADFORMAT — This is the root node of the format descriptor. The NODE attribute can be set to a path of nodes in the target XML, e.g. "ROOT/NODE".

TABLELIST — Contains a list of TABLE nodes. A TABLELIST can be a child of the XMLLOADFORMAT node or of a ROW node. The latter makes it possible to specify parent-child relations between tables. The NODE attribute can be set to a path of nodes in the target XML.

TABLE — Defines a resulting table. Two attributes can be set: NODE and NAME. NODE is a path of nodes in the target XML. NAME specifies the name of the table, and there are three options: a regular name, a name starting with a '@' character, or no NAME attribute at all. A regular name is used as the table name as-is. When a '@' precedes the name, the table name is taken from the value of the target node's attribute with that name. When the attribute is omitted, the name of the node in the target XML becomes the table name.
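The three NAME options can be sketched in a few lines of Python. This is only an illustration of the rule as described above, not the loader's actual code: the function name resolve_table_name is hypothetical, and any prefix the loader adds to the final table name (such as the JOB_34_ prefix in the example output further down) is not modeled.

```python
import xml.etree.ElementTree as ET

def resolve_table_name(name_attr, node):
    """Return the table name for a target node, following the three NAME options."""
    if name_attr is None:
        # NAME omitted: fall back to the node name in the target XML.
        return node.tag
    if name_attr.startswith("@"):
        # '@' prefix: read the table name from that attribute of the target node.
        return node.get(name_attr[1:])
    # Regular name: used as-is.
    return name_attr

parameter = ET.fromstring('<PARAMETER NAME="ARTICLES"/>')
print(resolve_table_name("MYTABLE", parameter))  # MYTABLE
print(resolve_table_name("@NAME", parameter))    # ARTICLES
print(resolve_table_name(None, parameter))       # PARAMETER
```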

ROW — Defines a row in the table. Two attributes can be set: NODE and KEY. NODE is a path of nodes in the target XML. KEY is the name of the column that holds the key value for a row. If the KEY attribute is omitted, a key is generated automatically.

COLUMNLIST — A list of COLUMN nodes that make up a row. The TYPE attribute can have three values:

NODE — only nodes of the target XML are used as columns

ATTR — only attributes of the target XML are used as columns

BOTH — both nodes and attributes are used as columns

COLUMN — A COLUMN node has three possible attributes:

NODE — A path of nodes in the target XML. When the ATTR attribute is not specified, the content of the node is used as the value of the column.

ATTR — If specified, the value of the node attribute with the given name is used as the content of the column.

NAME — Specifies the name of the column. If this attribute is omitted, the column takes the name of the node or, when ATTR is used, a concatenation of the node name and the attribute name (IMAGE_HEIGHT in the example below).
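As a rough sketch of how the three COLUMN attributes interact, the Python function below resolves one COLUMN entry to a column name and value. It is an illustration under assumptions, not the loader's implementation: the function name resolve_column is hypothetical, the '_' separator in the default name is inferred from IMAGE_HEIGHT in the example output, a COLUMN without NODE is assumed to read from the row node itself, and the NODE path is assumed to exist in the row.

```python
import xml.etree.ElementTree as ET

def resolve_column(row, node=None, attr=None, name=None):
    """Return (column_name, value) for one COLUMN entry applied to a row element."""
    # NODE is read as a path relative to the row; without it the row element is used.
    target = row.find(node) if node else row
    if attr:
        value = target.get(attr)
        # Assumed default name: node name and attribute name joined with '_'.
        default_name = f"{target.tag}_{attr}" if node else attr
    else:
        value = target.text
        default_name = target.tag
    # An explicit NAME always wins over the default.
    return (name or default_name, value)

article = ET.fromstring(
    '<ARTICLE ID="1422">'
    '<IMAGE WIDTH="300" HEIGHT="228">http://..</IMAGE>'
    '<LANG>FR</LANG>'
    '</ARTICLE>'
)
print(resolve_column(article, attr="ID"))                                      # ('ID', '1422')
print(resolve_column(article, node="IMAGE", attr="WIDTH", name="IMAGEWIDTH"))  # ('IMAGEWIDTH', '300')
print(resolve_column(article, node="IMAGE", attr="HEIGHT"))                    # ('IMAGE_HEIGHT', '228')
print(resolve_column(article, node="LANG", name="LANGUAGE"))                   # ('LANGUAGE', 'FR')
```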

 

Example:

- Format descriptor:
<XMLLOADFORMAT NODE="XML">
  <TABLELIST NODE="PARAMETERS">
    <TABLE NODE="PARAMETER" NAME="@NAME">
      <ROW NODE="ARTICLE" KEY="ID">
        <COLUMNLIST TYPE="BOTH">
          <COLUMN ATTR="ID"/>
          <COLUMN NODE="IMAGE" ATTR="WIDTH" NAME="IMAGEWIDTH"/>
          <COLUMN NODE="IMAGE" ATTR="HEIGHT"/>
          <COLUMN NODE="LANG" NAME="LANGUAGE"/>
        </COLUMNLIST>
        <TABLELIST>
          <TABLE NODE="OTHERARTICLEURLS">
            <ROW NODE="OTHERARTICLEURL">
              <COLUMNLIST TYPE="NODE"/>
            </ROW>
          </TABLE>
        </TABLELIST>
      </ROW>
    </TABLE>
  </TABLELIST>
</XMLLOADFORMAT>

- XML:
<XML>
  <PARAMETERS>
    <PARAMETER NAME="ARTICLES">
      <ARTICLE ID="1422">
        <URL>http://citysecrets.lavenir.net/bruxelles/n/1422</URL>
        <TITLE>Comment survivre au Blue Monday à Bruxelles ?</TITLE>
        <DESCRIPTION>Notre conseil ..</DESCRIPTION>
        <INTRO>Blue Monday .. </INTRO>
        <POSTCODE>1000</POSTCODE>
        <IMAGE WIDTH="300" HEIGHT="228">http://.. </IMAGE>
        <OTHERARTICLEURLS>
          <OTHERARTICLEURL>http://..</OTHERARTICLEURL>
          <OTHERARTICLEURL>http://..</OTHERARTICLEURL>
        </OTHERARTICLEURLS>
        <LANG>FR</LANG>
        <MAINLOCATION>1000</MAINLOCATION>
        <AREA>1000,1020</AREA>
        <ITEM_CREATED_DT>2012-01-16 14:25:11</ITEM_CREATED_DT>
      </ARTICLE>
    </PARAMETER>
  </PARAMETERS>
</XML>

The format descriptor and XML above are parsed and loaded as the following tables:
JOB_34_ARTICLES
ID: 1422
IMAGEWIDTH: 300
IMAGE_HEIGHT: 228
LANGUAGE: FR

JOB_34_ARTICLES_OTHERARTICLEURLS
OTHERARTICLEURL: http://...
ID: 1
PARENT_ID: 1422
OTHERARTICLEURL: http://...
ID: 2
PARENT_ID: 1422
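
The parent-child link in the second table can be sketched as follows. This is a minimal Python illustration of the behavior shown above, not the loader's code: the function name child_rows is hypothetical, generated keys are assumed to count up from 1 per parent row (as the ID values 1 and 2 suggest), and the parent key is read directly from the parent node's attribute.

```python
import xml.etree.ElementTree as ET

def child_rows(parent, parent_key, table_node, row_node):
    """Build child-table rows with a generated ID and a PARENT_ID back-reference."""
    rows = []
    children = parent.findall(f"{table_node}/{row_node}")
    # Generated keys are assumed to start at 1, as in the example output.
    for generated_id, child in enumerate(children, start=1):
        rows.append({
            child.tag: child.text,                # node content becomes the column value
            "ID": generated_id,                   # no KEY on the ROW, so a key is generated
            "PARENT_ID": parent.get(parent_key),  # key value of the enclosing row
        })
    return rows

article = ET.fromstring(
    '<ARTICLE ID="1422">'
    '<OTHERARTICLEURLS>'
    '<OTHERARTICLEURL>http://..</OTHERARTICLEURL>'
    '<OTHERARTICLEURL>http://..</OTHERARTICLEURL>'
    '</OTHERARTICLEURLS>'
    '</ARTICLE>'
)
for row in child_rows(article, "ID", "OTHERARTICLEURLS", "OTHERARTICLEURL"):
    print(row)
```

Each printed row carries the generated ID together with the PARENT_ID 1422 taken from the enclosing ARTICLE row, which is how a child row can be joined back to its parent.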