Version 16 of A tDOM Tutorial

Updated 2003-05-29 19:52:56

MC 27 May 2003: Inspired by a question on the Tcl'ers Chat, I decided to start some tDOM XML Tutorials.


First we need to load the package.

 package require tdom

Then, let's start with a small XML document. Note that XML allows attribute values to be surrounded by either single or double quotes. The advantage of using single quotes is that they don't need to be escaped inside a double-quoted Tcl string.

 set XML "
    <order number='1'>
        <customer>John Doe</customer>
        <phone>555-4321</phone>
        <email>jdoe@example.com</email>
        <website/>
        <parts>
            <widget sku='XYZ123' />
            <widget sku='ABC789' />
        </parts>
    </order>"

Now let's parse it:

 set doc  [dom parse $XML]

and get the root node so we can start working with the DOM tree:

 set root [$doc documentElement]

Now, suppose we want to print the text that the <phone> element contains. If we wanted to be really verbose, we could observe that the text node is the only child of the <phone> node, which is the second child of the <order> node (our $root node).

 set node [$root firstChild]  ;#  <customer/>
 set node [$node nextSibling] ;# <phone/>
 set node [$node firstChild]  ;# <phone>'s text node

We could write the above on one line by nesting the calls:

 set node [[[$root firstChild] nextSibling] firstChild]

Or we could use the selectNodes method with an XPath expression to specify the node we want:

 set node [$root selectNodes /order/phone/text()]

Personally, I prefer the latter. Now, to print out the phone number we can use either of:

 puts [$node data]
 puts [$node nodeValue]

Now let's suppose we want to change the phone number (maybe to include an area code?). We can specify a new value for the text node:

 $node nodeValue "(999) 555-4321"

Now, let's look at attributes a bit. We can easily get an attribute from a node:

 set order_num [$root getAttribute number]

It's an error to attempt to get the value of an attribute that doesn't exist unless we provide a default:

 set bogus_attrib [$root getAttribute foobar "this is a default value"]
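
To see both behaviours side by side, here is a minimal sketch. It assumes a throwaway one-element document (not the order document above) and uses Tcl's catch to trap the error; the variable names are made up for illustration.

```tcl
package require tdom

# Throwaway document used only for this illustration.
set doc  [dom parse {<order number='1'/>}]
set root [$doc documentElement]

# Without a default, asking for a missing attribute raises an error;
# catch turns that error into a non-zero return code.
set failed [catch {$root getAttribute foobar} msg]

# With a default, the default value is returned instead of an error.
set value [$root getAttribute foobar "n/a"]

puts "failed=$failed value=$value"
```

Wrapping the call in catch is only needed when no sensible default exists; otherwise the two-argument form is shorter.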

And we can set an attribute, which will either replace any current attribute (if already present) or create a new attribute:

 $root setAttribute status "Shipped to Customer"

We can also easily test for the presence of an attribute and easily remove an attribute we no longer need/want:

 if {[$root hasAttribute foobar]} {
     $root removeAttribute foobar
 }

Now let's suppose we want to add some additional widgets to this customer's order. There are several ways to do this. The first is the appendXML method:

 set node [$root selectNodes /order/parts]
 set sku NEW456
 $node appendXML "<widget sku='$sku'/>"

A second way is to use the appendFromList method. For this we need a 3-element list:

  1. tag name
  2. any attributes as key/value pairs (the same format returned by array get)
  3. a list of child nodes (each another list of this form), or the empty list {} if this element has no children

 $node appendFromList [list widget {sku OLD999} {}]
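
The third list element is easier to understand with children in it. The following is a small sketch, assuming a throwaway document and a hypothetical <bundle> element (neither is part of the order document above), that appends an element with two nested children in one call:

```tcl
package require tdom

# Throwaway document used only for this illustration.
set doc  [dom parse {<order><parts/></order>}]
set root [$doc documentElement]

# <bundle> is a hypothetical element name. Its third list element is a
# list of children, each of which is itself a {tag attributes children} list.
$root appendFromList {bundle {name starter} {
    {widget {sku AAA111} {}}
    {widget {sku BBB222} {}}
}}

# Count the widgets that ended up inside <bundle>.
set count [llength [$root selectNodes bundle/widget]]
puts "bundle contains $count widgets"
puts [$root asXML]
```

Writing the list with braces avoids nested [list ...] calls, at the cost of not being able to substitute variables inside it.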

Another way is to create new nodes and then append or insert them into the DOM tree:

 set comment [$doc createComment "this is a comment"]
 $root appendChild $comment

 set node [$doc createElement widget]
 $node setAttribute sku FOO333

 [$root selectNodes /order/parts] appendChild $node

An easy way to add a text node is to use the appendFromList method. Since a text node doesn't have any child-nodes we omit the 3rd list element. We use the special tag name #text:

 set node [$root selectNodes /order/website]
 # check and make sure there isn't already a child text() node
 if {[$node selectNodes text()] == ""} {
     $node appendFromList [list #text http://somewhere.example.com]
 }

 # another equivalent
 if {[$node selectNodes text()] == ""} {
     $node appendChild [$doc createTextNode http://somewhere.example.com]
 }

Let's delete the text node we just added:

 [$root selectNodes /order/website/text()] delete

Next, for no real good reason, let's move the <widget/> whose sku is ABC789 to the top of the list of widgets (i.e., the first child of <parts>).

 set node [$root selectNodes /order/parts]

 # Two different (equivalent) approaches for selecting the node to move.
 # The relative path is evaluated from <parts>, the parent of the widgets.
 set move [$node selectNodes {widget[@sku='ABC789']}]
 set move [$root find sku ABC789]

 # remove the child
 $node removeChild $move

 # and insert it before the current first child
 $node insertBefore $move [$node firstChild]

Let's loop over our DOM tree and print out the name and type of each node, as well as a list of attributes (if any) for that node:

 proc explore {parent} {
     set type [$parent nodeType]
     set name [$parent nodeName]

     puts "$parent is a $type node named $name"

     if {[llength [$parent attributes]]} {
         puts "attributes: [join [$parent attributes] ", "]"
     }

     foreach child [$parent childNodes] {
         explore $child
     }
 }

 explore $root

We can serialize our DOM tree back to XML using:

 set XML [$root asXML]
 puts $XML

Now let's see how to handle a document that has several elements of the same name at the same level. Notice that there is more than one <order> element below. A well-formed XML document has exactly one root element, so the orders are wrapped in an <orders> element here:

 set XML "
    <orders>
        <order>
            <customer>John Doe</customer>
            <phone>555-4321</phone>
            <email>jdoe@example.com</email>
            <website/>
            <parts>
                <widget sku='XYZ123' />
                <widget sku='ABC789' />
            </parts>
        </order>
        <order>
            <customer>Jane Doe</customer>
            <phone>555-4321</phone>
            <email>jdoe@example.com</email>
            <website/>
            <parts>
                <widget sku='XYZ123' />
                <widget sku='ABC789' />
            </parts>
        </order>
    </orders>"

Let's parse the document and select the customer name of each order:

 set doc [dom parse $XML]
 set root [$doc documentElement]

 # Since more than one node matches, selectNodes returns a Tcl list of nodes.
 set nodeList [$root selectNodes /orders/order/customer/text()]

 # Extract the first node from the returned list.
 set node1 [lindex $nodeList 0]

 # Extract the second node from the returned list.
 set node2 [lindex $nodeList 1]

 # Display their values
 puts [$node1 nodeValue]
 puts [$node2 nodeValue]

Further reading:

XPath:

There isn't really anything Tcl-specific about XPath, other than that square brackets [ ] have special meaning to both Tcl and the XPath engine (so be sure to quote or escape them properly).
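
As a rough illustration of the quoting issue, the sketch below (using a throwaway document and made-up variable names) selects the same node twice: once with the expression in braces, where Tcl performs no substitution, and once in double quotes, where the brackets must be backslash-escaped so that Tcl does not treat them as command substitution.

```tcl
package require tdom

# Throwaway document used only for this illustration.
set doc  [dom parse {<parts><widget sku='XYZ123'/><widget sku='ABC789'/></parts>}]
set root [$doc documentElement]

# Braces: Tcl passes the expression to the XPath engine untouched.
set a [$root selectNodes {widget[@sku='ABC789']}]

# Double quotes: needed when a Tcl variable is substituted into the
# expression, so the XPath brackets have to be escaped.
set sku ABC789
set b [$root selectNodes "widget\[@sku='$sku'\]"]

puts "same node selected: [expr {$a eq $b}]"
```

Braces are the simpler choice whenever the expression is constant; the escaped form is only needed when part of the expression comes from a variable.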

  • add links to good xpath tutorials/references here

XSLT:

tDOM's XSLT support is very complete (and well-tested). The only Tcl-specific aspects are any (optional) additional XPath functions used in your stylesheet that are defined in Tcl. The xslt.tcl script in the apps directory of the tDOM distribution is a good example to study.

Wiki pages that demonstrate the use of tDOM (for further study):


[ Category XML | Category Tutorial | Category Example ]