Version 1 of Natively accessing XML

Updated 2002-09-27 05:15:07

if 0 {

Brian Theado - I read the article at [L1 ] about an extension to the ECMAScript language (a.k.a. JavaScript) that allows XML to be manipulated "natively" and thought that it would be easy to do something similar in Tcl using the tDOM extension. The basic idea is that values from an xml document can be read and modified using XPath queries. One use for this code is for Complex data structures in Tcl.

This really just amounts to a simpler interface to the domNode object than what tDOM provides.

Creation command:

  • xmlstruct::create xml - Returns an extended domNode object command

The following methods are supported in addition to those the tDOM domNode object already provides:

  • $node set xpathQuery ?value? - retrieves or modifies portions of the xml document that match the given xpathQuery
  • $node unset xpathQuery - deletes the portions of the xml document that match the given xpathQuery

Here are some examples from the above article translated to this Tcl version:

 set y [xmlstruct::create {
        <employees>
          <employee>
            <name>Joe</name>
            <age>28</age>
            <hobbies>
                 <favorite>
                 <name>Sailing</name>
               </favorite>
              <others>
                 <name>Reading</name>
                 <name>Running</name>
                 </others>
            </hobbies>
            <department>
              <name>Engineering</name>
            </department>
           </employee>
        </employees>
 }]
 # Reading values
 % $y set /employees/employee/name
 Joe
 % $y set /employees/employee/age
 28
 % $y set //name
 Joe Sailing Reading Running Engineering
 % $y set //hobbies//name
 Sailing Reading Running
 % $y set {/employees/employee[1]//favorite}
 <name>Sailing</name>

 % $y set /   ;# Returns whole document - output not included here

 # Modifying values
 % $y set {/employees/employee[name='Joe']/age} 27  ;# Braces needed to escape XPath square bracket
 % $y set {/employees/employee[name='Joe']/age}
 27
 % $y set {//employee[1]/name} Bill
 % $y set {//employee/name}
 Bill

 # Deleting
 % $y unset //department
 % $y unset //favorite
 % $y unset //hobbies
 % $y set /
 <employees>
    <employee>
        <name>Bill</name>
        <age>27</age>
    </employee>
 </employees>

}

 #
 # TODO: return the value when setting a value and return a list of values when multiple values are set
 #

package require tdom

 # By placing these procs in the ::dom::domNode namespace, they automatically 
 # become add-on domNode methods
 proc ::dom::domNode::unset {node query} {
        ::set resultNodes [$node selectNodes $query type]
        switch $type {
                attrnodes {xmlstruct::unsetattrs $node $query}
        nodes {xmlstruct::unsetnodes $resultNodes}
        empty {error "No results found for '$query'"}
                default {error "$type is an unsupported query result type"}
    }
 }
 proc ::dom::domNode::set {node query args} {
        switch [llength $args] {
                0 {return [xmlstruct::getvalue $node $query]}
                1 {return [xmlstruct::setvalue $node $query [lindex $args 0]]}
                default {error "wrong # args: should be \"set xpathQuery ?newValue?\""}
    }
 }
 namespace eval xmlstruct {}

 # Convenience function for creating an xml doc and returning the root
 proc xmlstruct::create {xml} {
        ::set doc [dom parse $xml]
        return [$doc documentElement] 
 }

 # For '$node set query' calls
 proc xmlstruct::getvalue {node query} {
        ::set resultNodes [$node selectNodes $query type]
        switch $type {
                attrnodes {
                        ::set retVal {}
                        foreach attrVal $resultNodes {
                                lappend retVal [lindex $attrVal 1]
            }
                        return $retVal
        }
                nodes {
                        ::set retVal {}
                        foreach node $resultNodes {
                ::set xml ""
                foreach child [$node childNodes] {
                    append xml [$child asXML]
                }
                                lappend retVal $xml 
            }
            # This is so the curly braces are not there due to the above lappend
            if {[llength $resultNodes] == 1} {::set retVal [lindex $retVal 0]}
                        return $retVal
        }
        empty {return ""}
                default {error "$type is an unsupported query result type"}
    }
 }

 # For '$node set query value' calls
 proc xmlstruct::setvalue {node query value} {
        ::set targetNodes [$node selectNodes $query type]
        switch $type {
                nodes {xmlstruct::setnodes $targetNodes $query $value}
                attrnodes {xmlstruct::setattrs $node $query $value}
                empty {error "Creating new elements/attributes not supported yet"}
                default {error "$type is an unsupported query result type"}
    }
 }

 # For i.e. '$node unset {/employees/employee[1]/@age}' calls
 proc xmlstruct::unsetattrs {node query} {
    ::set nodeQuery [join [lrange [split $query /] 0 end-1] /]
    ::set attribute [string range [lindex [split $query /] end] 1 end]
    foreach matchingNode [$node selectNodes $nodeQuery] {
        $matchingNode removeAttribute $attribute
    }
 }

 # For i.e. '$node set {/employees/employee[1]/@age} 25' calls
 proc xmlstruct::setattrs {node query value} {
    ::set nodeQuery [join [lrange [split $query /] 0 end-1] /]
    ::set attribute [string range [lindex [split $query /] end] 1 end]
    foreach matchingNode [$node selectNodes $nodeQuery] {
        $matchingNode setAttribute $attribute $value
    }
    return $value
 }
 # For i.e. '$node unset {/employees/employee[1]}' calls
 proc xmlstruct::unsetnodes {nodes} {
    # This probably breaks if some nodes are descendents of each other and
    # they don't get deleted in the right order
    foreach node $nodes {
        $node delete
    }
 }

 # Determines if the given string is intended to be valid xml
 proc xmlstruct::isXml {string} {
        ::set string [string trim $string]
        if {([string index $string 0] == "<") && [string index $string end] == ">"} {
                return 1
    } else {
                return 0
    }
 }

 # For i.e. '$node set {/employees/employee[1]} value' calls
 proc xmlstruct::setnodes {targetNodes query value} {
        if {[xmlstruct::isXml $value]} {
                foreach target $targetNodes {xmlstruct::setxml $target $value}
        } else {
                foreach target $targetNodes {xmlstruct::settext $target $value} 
    }
 }
 # TODO: don't allow this to be called for the documentElement node
 # (i.e. $obj set / "some text"  should not be allowed)
 # For i.e. '$node set {/employees/employee/name} Bill' calls
 proc xmlstruct::settext {node text} {
    ::set doc [$node ownerDocument]
        ::set textNode [$doc createTextNode $text]
        foreach child [$node childNodes] {$child delete}
        $node appendChild $textNode
    return $text
 }
 # For i.e. '$node set {/employees/employee} <name>Bill</name>' calls
 proc xmlstruct::setxml {node xml} {
        foreach child [$node childNodes] {$child delete}
        $node appendXML $xml
    return $xml
 }

Jochen Loewer Excellent Brian! I think this XPath based updating of XML/DOM structures makes it even easier to generate/manipulate nested structures. I also about a similar approach somewhere else. Do you mind if I add this to the next standard tDOM distribution? May be also code in C?