tDOM

a [Tcl] extension for parsing [XML].



** Critique **


[Rolf Ade] [chat%|%has said] that the most valuable thing in tDOM is
[Jochen Loewer%|%Jochen Loewer's] implementation of [xpath], which is written in [C].

[PT]:  I'll second that. tDOM's XPath processing is '''extremely''' competent.



** Features **

   Based on the [Expat] parser:   

   Allows access to the [DOM] trees as Tcl DOM objects:   
   
   Includes an [HTML] reader that reads HTML and generates a DOM tree.:   

   Includes novel mechanisms such as `appendFromScript` for generating elements.:   
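
As a rough illustration of `appendFromScript`, here is a minimal sketch. It assumes node commands created with `dom createNodeCmd`; the command names `item` and `t` and the sample data are arbitrary choices made for this example, not anything prescribed by tDOM.

======
package require tdom

# Node-creating commands; the names "item" and "t" are arbitrary choices
dom createNodeCmd elementNode item
dom createNodeCmd textNode t

set doc [dom createDocument list]
set root [$doc documentElement]

# Each command invoked inside the script appends a node to the current parent
$root appendFromScript {
    item id 1 {t "first"}
    item id 2 {t "second"}
}

# Read a value back with XPath, then serialize the generated tree
set second [$root selectNodes {string(item[@id='2'])}]
set xml [$root asXML]
puts $xml
$doc delete
======

The attribute name/value pairs come first; the trailing script argument builds the children of the new element.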



** See Also **

   [Interfacing with XML]:   

   [starDOM]:   which combines tDOM and [BWidget] - [Playing SAX].

   [Natively accessing XML] and [Tcllib tree style interface to tDOM]:   which modify/extend the tDOM API

   [XML Graph to canvas]:   uses [expat] and [Tcldot] to render directed graphs

   [XML/tDOM encoding issues with the http package]:   

   [tDOM and Coroutines]:   using [coroutine]s with tDOM and [Expat] to parse a document in SAX mode



** Documentation **

   [http://tdom.org/index.html/doc/trunk/doc/index.html%|%Overview%|%]:   

   [http://tdom.org/index.html/doc/trunk/doc/dom.html%|%dom%|%]:   Create an in-memory DOM tree from XML

   [http://tdom.org/index.html/doc/trunk/doc/domDoc.html%|%domDoc%|%]:   Manipulates an instance of a DOM document object

   [http://tdom.org/index.html/doc/trunk/doc/domNode.html%|%domNode%|%]:   Manipulates an instance of a DOM node object

   [http://tdom.org/index.html/doc/trunk/doc/expat.html%|%expat%|%]:   Creates an instance of an expat parser object

   [http://tdom.org/index.html/doc/trunk/doc/expatapi.html%|%expatapi%|%]:   Functions to create, install and remove expat parser object extensions.

   [http://tdom.org/index.html/doc/trunk/doc/tdomcmd.html%|%tdom%|%]:   tdom is an expat parser object extension to create an in-memory DOM tree from the input while parsing.

   [http://tdom.org/index.html/doc/trunk/doc/tnc.html%|%tnc%|%]:   tnc is an expat parser object extension, that validates the XML stream against the document DTD while parsing.



** Tutorials and articles **

   [A tDOM tutorial]:   available here on the Wiki.

   [http://tdom.github.com/documents/tDOM3.pdf%|%tDOM – A fast XML/DOM/XPath package for Tcl written in C%|%] ([http://web.archive.org/web/20000823141958/http://www.tu-harburg.de/skf/tcltk/papers2000/tDOM3.pdf%|%alternate]):   by Jochen Loewer, a valuable description of tDOM's origins and uses presented at the first European Tcl Conference.

   [ftp://ftp.tcl.tk/pub/tcl/all/t/tdom/tclum2001-tdom.pdf%|%tclum2001-tdom.pdf](alternates (compressed) [https://github.com/tDOM/tdom.github.com/raw/master/documents/tclum2001-tdom.zip%|%1] [http://web.archive.org/web/20090815145215/http://www.tdom.org/documents/tclum2001-tdom.zip%|%2]), [Jochen Loewer], [Second European Tcl/Tk Users Meeting], 2001:   design, API, usage, eBay [web scraping] example

   [http://www.linux-magazine.com/issue/20/tDOM.pdf%|%Processing XML documents with Tcl and tDOM]:   [Carsten Zerbst], 2003, illustrating how practical and easy tDOM is.  The same article is available in German: [http://www.linux-magazin.de/Artikel/ausgabe/2002/04/feder/feder.html%|%XML-Dokumente mit Tcl und tDOM bearbeiten%|%]

   [http://www-106.ibm.com/developerworks/xml/library/x-xmi/index.html?dwzone=xml%|%XMI and UML combine to drive product development], [Cameron Laird], 2001:   profiles a company ([Ideogramic]) that uses tDOM for XMI processing in its product.

   [http://web.archive.org/web/20040417113005/http://www-106.ibm.com/developerworks/xml/library/x-tdom.html%|%Using tDOM and tDOM XSLT], [Cameron Laird], 2002-02-01:   highlights tDOM's performance, while also supplying a recipe for your own start with the package.



** Releases **

tDOM-0.9.1, 26. Jul. 2018: http://tdom.org/downloads




** Obtaining **

   [https://tdom.org/index.html%|%official repository], maintained by [de%|%Rolf Ade]:   To clone with [fossil]:

======none
fossil clone http://tdom.org/index.html
======

   [https://core.tcl.tk/tdom/%|%mirror]:   

   [http://irrational-numbers.googlecode.com/files/tdom-0.8.3.zip%|%irrational-numbers project%|%]:   A collection of binaries for Win/Linux/Mac in a single package.  At runtime the right binary library is selected.


   [ftp://ftp.revier.com/pub/tcl/libs/tdom_0.7.4-1_i386.deb%|%tdom_0.7.4-1_i386.deb]:   [Jochem Huhmann]: A binary package of tDOM 0.7.4 for Debian GNU/Linux 3.0 (Woody)

   [http://groups.yahoo.com/group/tdom/%|%the Yahoo tdom group]:   Various combinations of mirrors, backups, and experimental work are often announced here. This is particularly important, as sdf.lonestar.org seems relatively erratic in reliability.  The authors of tdom announce their new releases here.


   [http://phaseit.net/binaries/tDOM-0.63.tar.gz%|%tDOM-0.63.tar.gz%|%]:   provided by [Phaseit] 



** Resources **

   [http://tdom.org/index.html/ticket%|%issue tracker]:   

   [http://groups.yahoo.com/neo/groups/tdom/info%|%yahoo tdom group]:   



** Licensing **

[RS] 2011-02-28:

Since my employer applies stricter license checks now, I was told that I cannot
use tdom in software for customers any more, because of a license conflict:

   * The library itself is under MPL (Mozilla Public License)

   * Two of the source files are under LGPL: ''generic/DOMhtml.c'' and ''generic/XMLsimple.c''

According to the FSF, these two licenses conflict. Can this be helped?

[schlenk] 2011-02-28:  Those two LGPL headers seem to be from [drh], probably
taken from tkhtml2? If yes, it should be easy enough to fix, as DRH has released
the tkhtml2 code into the public domain and tkhtml3 is BSD licensed (was LGPL
before). Probably just needs asking drh and [Rolf Ade] to fix it.



** Examples **

   [Downloading your utility usage from Pacific Gas and Electric using TCL]:   

   [Scraping timeentry.kforce.com]:   


A tDOM tutorial could include these points (proposed by [Rolf Ade]):
   * how to use the SAX interface (setting up event handler scripts, stacking event handler scripts, error handling etc)
   * how to parse (parse a string, or read the XML data out of a channel, with notes about the encoding problems)
   * how to get a DOM tree representation of XML data
   * the XPointer, DOM 1 and DOM 2 and XPath commands to find or to navigate to some nodes of interest
   * how to use tDOM's XSLT engine (to build an XSLT processor or in server application)
   * how to serialize DOM trees (as XML, HTML or text)
   * how to validate the XML data while parsing 
   * how to create DOM trees from scratch
   * how to create additional, tcl scripted DOM methods
   * how to create additional, tcl scripted XPath functions 



*** Example: An XPath query ***

[PS] 2004-11-04:

A quick way to get values from XML documents:

======none
set xml {
<agents>
    <agent id="007">
        <name type="first">James</name>
        <name type="last">Bond</name>
        <age>Still attractive</age>
        <sex>Male</sex>
    </agent>
    <agent id="013">
        <name type="first">Austin</name>
        <name type="last">Powers</name>
        <age>Depends on your timeline</age>
        <sex>Yes, please</sex>
    </agent>
</agents>
}


set dom [dom parse $xml]
set doc [$dom documentElement]
puts "Agent: [$doc selectNodes {string(/agents/agent[@id='013']/@id)}]"
puts "First Name: [$doc selectNodes {string(/agents/agent[@id='013']/name[@type='first'])}]"
puts "Last Name: [$doc selectNodes {string(/agents/agent[@id='013']/name[@type='last'])}]"
puts "Age: [$doc selectNodes {string(/agents/agent[@id='013']/age)}]"
======

Will output:

======
Agent: 013
First Name: Austin
Last Name: Powers
Age: Depends on your timeline
======
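
The same approach extends to looping over several nodes. The following sketch, which uses a shortened copy of the agents document and variable names chosen for this example, evaluates XPath expressions relative to each `agent` element:

======
package require tdom

set xml {<agents>
    <agent id="007"><name type="first">James</name><name type="last">Bond</name></agent>
    <agent id="013"><name type="first">Austin</name><name type="last">Powers</name></agent>
</agents>}

set doc [dom parse $xml]
set root [$doc documentElement]

set names {}
foreach agent [$root selectNodes /agents/agent] {
    # These expressions are relative, so they are evaluated against $agent
    set id    [$agent getAttribute id]
    set first [$agent selectNodes {string(name[@type='first'])}]
    set last  [$agent selectNodes {string(name[@type='last'])}]
    lappend names "$id: $first $last"
}
puts [join $names \n]
$doc delete
======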

*** Example: innerXML as extension to the DOM node methods ***

[heinrichmartin]: 2014-06-05
   * initial version
[heinrichmartin]: 2015-03-26
   * re-use parameter checks from asXML (allows -indentAttrs from trunk)
   * fixed indentation of first line

proc comment in [TclDoc] format:

======
% package require tdom
0.8.3
%    # returns the inner XML of a node
   # @param node a DOM node
   # @param -indent the optional indentation (see tdom's <code>asXML -indent</code>). Defaults to use tdom's default
   # @param -indentAttrs the optional attribute indentation (see tdom's <code>asXML -indentAttrs</code>). Defaults to use tdom's default
   # @return the inner XML string
   proc ::dom::domNode::innerXML {node args} {
      # re-use parameter checks from asXML
      set result [$node asXML {*}$args]
      if {[$node hasChildNodes]} {
         # strip element tag
         set result [string replace $result 0 [string first ">" $result]]
         set result [string replace $result [string last "<" $result] end]
         # strip whitespace, but not the indentation
         set indent ""
         regexp {^\s*?\n??([ \t]*?)\S} $result -> indent
         set result $indent[string trim $result]
         return $result
      } else {
         return [$node text]
      }
      # UNREACHABLE
   }
% % % % % %
% dom parse {<A specialchar="&lt;"><B><C x="y"/></B></A>} doc
domDoc0xf6aa10
% $doc documentElement root
domNode0x1011838
% [$root selectNodes //C] innerXML
% [$root selectNodes //B] innerXML
    <C x="y"/>
% [$root selectNodes //A] innerXML
    <B>
        <C x="y"/>
    </B>
% [$root selectNodes //A] innerXML -indent none
<B><C x="y"/></B>
% [$root selectNodes //A] innerXML -indent 0
<B>
<C x="y"/>
</B>
======


** XML Namespaces **


'''[PYK] 2019-08-16:'''

Commands such as `createElement` and `appendFromList` are not
XML-namespace-aware.  They can be used to generate documents with elements that
have namespaced tags and attributes, but those tags and attributes aren't bound
to any namespaces.  This is sufficient if the goal is simply to generate a
well-formed XML document containing
`[https://www.w3.org/TR/xml-names/%|%expanded names]`.
In order to use such documents in a namespace-aware manner with commands like
`selectNodes`, one can produce the XML representation of the document and
create a new document by parsing that representation.  In the new document all
elements are bound to their proper namespaces and `selectNodes` works as
documented.
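
A minimal sketch of that serialize-and-reparse round trip, assuming a made-up namespace URI `urn:example:md` and the prefixes `md` and `m` chosen only for this example:

======
package require tdom

# Build a document with prefixed names; nothing is bound to a namespace yet
set doc [dom createDocument md:root]
set root [$doc documentElement]
$root setAttribute xmlns:md urn:example:md
$root appendChild [$doc createElement md:child]

# Serialize, then parse again so the parser binds the prefixes
set doc2 [dom parse [$root asXML]]
set root2 [$doc2 documentElement]

# Namespace-aware query against the reparsed document
set found [$root2 selectNodes -namespaces {m urn:example:md} //m:child]
puts "nodes found after reparsing: [llength $found]"

$doc delete
======

The prefix used in the XPath expression does not have to match the one in the document; only the URI matters.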

`appendChild` is XML-namespace-aware, and it is possible to append a node from
one document into another document.  This feature, which is a tdom extension to
the DOM specification, is used in the following XML-namespace-aware
implementation of `appendFromList`:


======
proc appendFromList {node list} {
    dom createDocumentNS urn:oasis:names:tc:SAML:2.0:metadata md:EntityDescriptor doc
    $doc documentElement root
    $root appendFromList $list
    set xml [$root asXML]
    dom parse $xml doc2
    $doc2 documentElement root2
    $root2 firstChild child
    while {$child ne {}} {
        $node appendChild $child
        $child nextSibling child
    }
    return
}
======


Another alternative is
[https://chiselapp.com/user/pooryorick/repository/tdomutil/index%|%tdomutil],
which provides a pure-Tcl implementation of `appendFromList` that uses the
intended parent node and its ancestors to resolve the namespace for a given
prefix.



** An XPath query with a namespace **


Trying to retrieve elements from a namespace was confusing me until I found
[http://groups.google.com/forum/#!msg/comp.lang.tcl/oPpwmXWdr-Y/c65hfaqLWWwJ%|%Making
XPATH queries against a document w/ namespaces], [comp.lang.tcl], 2006-04-27,
which demonstrates the use of `-namespaces` with `selectNodes`.

skm 2006-12-10

======
% set fh [open small.gpx]
file13bbfa8
% set xmldata [read $fh]
<?xml version="1.0"?>
<gpx
version="1.0"
creator="ExpertGPS 1.1 - http://www.topografix.com"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns="http://www.topografix.com/GPX/1/0"
xsi:schemaLocation="http://www.topografix.com/GPX/1/0 http://www.topografix.com/GPX/1/0/gpx.xsd">
<time>2002-02-27T17:18:33Z</time>
<bounds minlat="42.401051" minlon="-71.126602" maxlat="42.468655" maxlon="-71.102973"/>
<wpt lat="42.438878" lon="-71.119277">
<ele>44.586548</ele>
<time>2001-11-28T21:05:28Z</time>
<name>5066</name>
<desc><![CDATA[5066]]></desc>
<sym>Crossing</sym>
<type><![CDATA[Crossing]]></type>
</wpt>
<wpt lat="42.439227" lon="-71.119689">
<ele>57.607200</ele>
<time>2001-06-02T03:26:55Z</time>
<name>5067</name>
<desc><![CDATA[5067]]></desc>
<sym>Dot</sym>
<type><![CDATA[Intersection]]></type>
</wpt>
</gpx>

% close $fh
======

This is edited down from http://www.topografix.com/fells_loop.gpx for brevity's
sake.

Get the doc and the root.

======none
% set doc [dom parse $xmldata]
domDoc013CAE80
% set root [$doc documentElement]
domNode013B1FDC
======

Try to get a list of
[http://www.topografix.com/gpx/1/1/#type_wptType%|%waypoints%|%]

Define a namespace and use it with selectNodes

======none
% set ns {gpxns http://www.topografix.com/GPX/1/0}
gpxns http://www.topografix.com/GPX/1/0
======

Sample XPath queries with and without the namespace

======none
% $root selectNodes -namespaces $ns //wpt
% $root selectNodes -namespaces $ns //gpxns:wpt
domNode013B2060 domNode013B2194
% $root selectNodes -namespaces $ns {//gpxns:wpt[1]}
domNode013B2060
======


[kpv] Another approach is to set the namespaces to be 
searched once in the beginning.

======
% $doc selectNodesNamespaces $ns
% $root selectNodes gpxns:wpt
======
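
Both approaches can be compared side by side in a self-contained sketch. The shortened GPX document and the prefix `g` are assumptions made for this example:

======
package require tdom

set gpx {<gpx xmlns="http://www.topografix.com/GPX/1/0" version="1.0">
<wpt lat="42.438878" lon="-71.119277"><name>5066</name></wpt>
<wpt lat="42.439227" lon="-71.119689"><name>5067</name></wpt>
</gpx>}

set doc [dom parse $gpx]
set root [$doc documentElement]
set ns {g http://www.topografix.com/GPX/1/0}

# Variant 1: pass the prefix mapping with each query
set perCall [$root selectNodes -namespaces $ns //g:wpt]

# Variant 2: register the mapping once for the whole document
$doc selectNodesNamespaces $ns
set names {}
foreach wpt [$root selectNodes g:wpt] {
    lappend names [$wpt selectNodes {string(g:name)}]
}

puts "waypoints: [llength $perCall], names: $names"
$doc delete
======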

** Example: Parse HTML **

Using tdom to parse some html:

======
package require tdom
package require http

# html - the source of the html page
proc pullOutTheURLs {html} {

# Parse your HTML document into a DOM tree structure
set doc [dom parse -html $html]

# root will be the root element of your HTML document,
# ie. the HTML element
set root [$doc documentElement]

# The following finds all anchor links <a>. It isn't clear to me
# whether you are also interested in the URLs of <area>, <link> and <base>
# elements.
set nodeList [$root selectNodes {descendant::a}]

# init the result list
set urlList {}

# Pull out the Values of the href attributes
foreach node $nodeList {
    set attList [$node attributes *]
    foreach attribute $attList {
        if {[string tolower $attribute] == "href"} {
            lappend urlList [$node getAttribute $attribute]
            break
        }   
    }
}

# Get rid of the DOM representation of your HTML document
$doc delete

# finished
return $urlList
}


# Test it
set urlList [pullOutTheURLs [http::data [http::geturl [lindex $argv 0]]]]

foreach url $urlList {
    puts $url
}
======

MHN 2012-11-08: I have not tested the performance, but the nested foreach looks
like an overhead (compared to byte-code tdom)... How about the following?

======
# pull out the values of the href attributes
foreach attr [$root selectNodes {descendant::a/@href}] {
    lappend urlList [lindex $attr 1]
}

# if case insensitive matching is required, use {descendant::a/@*['href' = translate(name(),'HREF','href')]}

# if no URL matches "href" then it can be achieved in one line
set urlList [lsearch -all -inline -exact -not [concat {*}[$root selectNodes descendant::a/@href]] href]
======
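
For experimenting without a network connection, here is a self-contained variant that parses an HTML string directly. The sample markup is made up for this example; the predicate `a[@href]` skips anchors that have no `href` attribute.

======
package require tdom

set html {<html><body>
<p><a href="https://example.com/one">One</a></p>
<p><a name="anchor-only">No link</a> <a href="/two">Two</a></p>
</body></html>}

set doc [dom parse -html $html]
set root [$doc documentElement]

# Select only those anchors that carry an href attribute
set urlList {}
foreach a [$root selectNodes {descendant::a[@href]}] {
    lappend urlList [$a getAttribute href]
}
$doc delete

foreach url $urlList {
    puts $url
}
======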



** XSLT Example **

I am an XSLT newbie. I didn't find an XSLT example for tDOM on the wiki, so I
thought I would provide one. It is almost too easy to warrant one, but I think
it helps anyway.

I took the sample XML and simple XSL from
[http://www-106.ibm.com/developerworks/xml/library/x-xslt/?article=xr%|%What
kind of language is XSLT?], Michael Kay, 2005-04-20. For paste here, I will use
a shorter version of the xml file.

======none
% package require tdom
080
% set gamedata {<results group="A">
<match>
    <date>10-Jun-1998</date>
    <team score="2">Brazil</team>
    <team score="1">Scotland</team>
</match>
<match>
    <date>23-Jun-1998</date>
    <team score="0">Scotland</team>
    <team score="3">Morocco</team>
</match>
</results>}
% set gamedoc [dom parse $gamedata]
domDoc012E5690
======

''xsldata'' was set to the first xsl example from the ''What kind of language
is XSLT?'' essay.  That is:

======none
% set xsldata {
         <xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
            <xsl:template match="results">
            <html>
                    <head>
                    <title>
                        Results of Group <xsl:value-of select="@group"/>
                    </title>
                    </head>
                    <body>
                            <h1>
                                Results of Group <xsl:value-of select="@group"/>
                            </h1>
                            <xsl:apply-templates/>
                    </body>
            </html>
            </xsl:template>
            <xsl:template match="match">
                    <h2>
                        <xsl:value-of select="team[1]"/> versus <xsl:value-of select="team[2]"/>
                    </h2>
                    <p>Played on <xsl:value-of select="date"/></p>
                    <p>Result: 
                            <xsl:value-of select="team[1] "/> 
                            <xsl:value-of select="team[1]/@score"/>,
                            <xsl:value-of select="team[2] "/> 
                            <xsl:value-of select="team[2]/@score"/>
                    </p>
            </xsl:template>
    </xsl:transform>
}

% set soccerstyle [dom parse $xsldata]
domDoc012DB570
% $gamedoc xslt $soccerstyle gamehtml
domDoc012E6CC0
% $gamehtml asXML
<html>
    <head>
        <title>
        Results of Group A</title>
    </head>
    <body>
        <h1>
        Results of Group A</h1>
        <h2>Brazil versus Scotland</h2>
        <p>Played on 10-Jun-1998</p>
        <p>Result: 
            Brazil2,
            Scotland1</p>
        <h2>Scotland versus Morocco</h2>
        <p>Played on 23-Jun-1998</p>
        <p>Result: 
            Scotland0,
            Morocco3</p>
    </body>
</html>
======
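
A stylesheet does not have to produce markup. As a further sketch, the following made-up stylesheet uses the `text` output method on the same game data and serializes the result with `asText`; the variable names are chosen for this example.

======
package require tdom

set gamedata {<results group="A">
<match><date>10-Jun-1998</date><team score="2">Brazil</team><team score="1">Scotland</team></match>
<match><date>23-Jun-1998</date><team score="0">Scotland</team><team score="3">Morocco</team></match>
</results>}

# One line of plain text per match, built with concat()
set xsldata {<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:template match="/results">
    <xsl:for-each select="match">
      <xsl:value-of select="concat(team[1], ' ', team[1]/@score, ' - ',
                                   team[2]/@score, ' ', team[2], '&#10;')"/>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>}

set gamedoc [dom parse $gamedata]
set style   [dom parse $xsldata]
set result  [$gamedoc xslt $style]
set text    [$result asText]
puts -nonewline $text

$result delete
$style delete
$gamedoc delete
======

Building each line with `concat()` avoids whitespace-only `xsl:text` elements, which would be dropped when the stylesheet is parsed without `-keepEmpties`.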



** Loading XML from file with the right encoding settings **

tDOM provides some helpers for this in the tdom.tcl library distributed with
tDOM. Cited from [Rolf Ade]'s answer in
[http://groups.google.com/forum/#!topic/comp.lang.tcl/ilbpBNs5KDU%|%tdom again
- BOM and/or encoding problems & how to create <?xml ...?> node?], [comp.lang.tcl], 2008-05-22

    :   tDOM::xmlOpenFile expects a filename and returns a file channel handle, which
    :   is readily fconfigure'd and seek'ed to get feeded into a dom parse `-channel` ...
    :   Please note, that the proc opens a channel and returns that. That channel will
    :   not magically go away, if you're done with it. It's your responsibility to
    :   close that channel, if you don't need them anymore. So, a typical use pattern
    :   (sure, not the only) is

======
set xmlfd [tDOM::xmlOpenFile $filename]
set doc [dom parse -channel $xmlfd]
close $xmlfd
======

`tDOM::xmlReadFile` is just a wrapper around `tDOM::xmlOpenFile`. The
pattern is

======
set doc [dom parse [tDOM::xmlReadFile $filename]]
======

and you're done. No leaking file channels, filename in, DOM tree out.
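
If parsing can fail, the channel returned by `tDOM::xmlOpenFile` is still open afterwards. A minimal sketch that always closes it, assuming Tcl 8.6 for `try` and `file tempfile`, with a throwaway sample file created only for this example:

======
package require tdom

# Create a small sample file to parse
set fh [file tempfile path]
fconfigure $fh -encoding utf-8
puts $fh {<?xml version="1.0" encoding="UTF-8"?><greeting>hello</greeting>}
close $fh

set xmlfd [tDOM::xmlOpenFile $path]
try {
    set doc [dom parse -channel $xmlfd]
} finally {
    # Runs whether or not parsing raised an error
    close $xmlfd
}

set text [[$doc documentElement] text]
puts $text

$doc delete
file delete $path
======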



** Preserving Whitespace **

[PYK] 2014-07-26: I was looking for a way to parse HTML while preserving
newline characters in `<pre>` elements.  In my case,  `-keepEmpties` did the
trick.  See Also:

   [https://stackoverflow.com/questions/7908627/how-to-apply-an-xslt-transformation-that-includes-spaces-to-an-xml-doc-using-tdo/7908668#7908668%|%How to apply an XSLT transformation that includes spaces to an XML doc using tDOM?], 2011-10-26.:   

   [https://groups.google.com/d/msg/comp.lang.tcl/ulu5tbGfrvs/O-iNWFvVc_EJ%|%asymmetry in tDOM XML parsing and rendering], [comp.lang.tcl], 2008-04-11:   
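
The effect of `-keepEmpties` is easy to see by counting child nodes. A small sketch with a made-up document:

======
package require tdom

set xml "<a>\n  <b/>\n</a>"

# By default, whitespace-only text nodes are dropped while parsing
set d1 [dom parse $xml]
set n1 [llength [[$d1 documentElement] childNodes]]

# With -keepEmpties they are kept as text nodes
set d2 [dom parse -keepEmpties $xml]
set n2 [llength [[$d2 documentElement] childNodes]]

puts "default: $n1 child node(s); -keepEmpties: $n2 child node(s)"
$d1 delete
$d2 delete
======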



** Discussion **


Uhm, this "full compliant" XSLT support is a little bit too much said - Jochen
would have done better if he had used my "almost compliant". 

Don't get this wrong. It is true that tDOM's XSLT support was greatly improved
over the last releases. I'm pretty sure you could use every 'real life' XSLT
stylesheet with it, with correct results. I won't confuse you with outlying
nifty difty XSLT details, therefore I omit the list of things that are not quite
right.

There are (prominent) tools out there that, according to my extensive testing,
don't do it better than tDOM's XSLT engine, but have nevertheless claimed 100%
XSLT compliance for a couple of months, which is simply not completely true,
and nobody bothers.

So just use tDOM's XSLT and you will be happy with it. It's only, that I know
my business and "full compliant" is not 100% true. Even the missing outlying
details will be added, in the next months, for sure. I wonder, what Jochen then
will write, in the announcement ;-).

The XPath support is now indeed really very compliant and complete. de.

----

[skm] 2005-03-11 Many of the links here in [http://www.tdom.org/index.html#SECTid80ac968%|%Further Readings%|%] are stale. Does anyone have up-to-date locations for these papers? thanks.

----

[LV]: Note that even though the tdom.org web site shows no release since 2004, email on the tdom mail list encourages people to use the CVS to fetch the latest version of code, which developers assure continues to be updated. Note that 64 bit users need to get the cvs version of tdom to get it working properly, according to the developer.

Also note that http://www.tdom.org/files/ appears to have the tdom ''releases'' that are being performed.

----

[LV] 2007-09-11:

Just a note - if you build tdom from source, do a "make test" and see a crash, try rebuilding tdom with the '''--disable-tdomalloc''' flag; on my SPARC Solaris 9 system, this resolved the crash.

----

'''Getting the Current Namespace Mapping for a Node'''

[DKF]: It came up in [comp.lang.tcl] recently that [Gerald Lester] needed access to the current namespace mapping for an arbitrary DOM node (parsing [XML Schema] and [WSDL] requires this sort of thing). [Rolf Ade] told us how:


======none
Given that the variable node2 holds the node command, for which you
want to know all prefix-URI mappings in scope, do
======

======
$node2 selectNodes namespace::*
======

This returns a list of two-element lists (i.e. pairs). Of each pair, the first element is either the literal "xmlns" (stating that this is the description of the namespace of unqualified elements) or "xmlns:" followed by a string local namespace name (stating that this is the description of a named namespace and that elements and attributes in that namespace will be using qualified names). The second element of the pair is the URI that characterises the namespace (which need not resolve to anything).

Note (for XML neophytes) that unqualified attributes are always in ''no'' namespace at all (unlike unqualified elements).
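
A minimal sketch of that query, using a made-up document with one default and one prefixed namespace declaration:

======
package require tdom

set doc [dom parse {<root xmlns="urn:example:default" xmlns:p="urn:example:p"><p:child/></root>}]
set child [[$doc documentElement] firstChild]

# Collect every prefix-URI mapping that is in scope for the child element
set uris {}
foreach pair [$child selectNodes namespace::*] {
    lassign $pair name uri
    puts "$name -> $uri"
    lappend uris $uri
}
$doc delete
======

The declarations are made on the root element, but they are in scope for the child as well, so they show up in the child's namespace axis.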

----

[LV] 2007-10-09:
Anyone have an example of using tDOM to validate XML using a DTD?

See the page [tnc] for that.

----

[LV]: from comp.lang.tcl, we read:

======none
>  2. I evidently don't understand the domNode man page. For the 
>     "getAttribute" method it says: 

>        getAttribute attributeName ?defaultValue? 
>           Returns the value of the attribute attributeName. If 
>           attribute is not available defaultValue is returned. 


>      It also doesn't give any example(s). 


> Can someone point me to some sample code? 
======



If I have a DOM document `$doc` which has an element somewhere:

======none
<Foo Bar="baz">Stuff</Foo> 
======

======
set node [$doc getElementsByTagName "Foo"]
puts [$node getAttribute "Bar"]
======

will print: 

======none
baz 
======
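
The optional `defaultValue` argument quoted from the man page can be sketched like this. The attribute name `Missing` and the fallback `n/a` are made up for this example:

======
package require tdom

set doc [dom parse {<Root><Foo Bar="baz">Stuff</Foo></Root>}]

# getElementsByTagName returns a list, so pick the first match explicitly
set node [lindex [$doc getElementsByTagName Foo] 0]

set bar     [$node getAttribute Bar]
# Without the second argument, asking for an absent attribute raises an error
set missing [$node getAttribute Missing "n/a"]
set present [$node hasAttribute Missing]

puts "Bar=$bar Missing=$missing hasAttribute=$present"
$doc delete
======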

----

[male] 2010-02-24 - [Sorting nodes in tdom]

----

[CMcC] 2010-04-13 21:05:27:

I was having some confusion with [[$node attributes]] and xml namespaces.  [evilotto] helped me decode what it returns, and I record the findings for posterity.

attributes may return a singleton.  In that case, the attribute name is just that.

attributes may return a three-element list.  In that case it may be approximated as [[lassign $a name namespace uri]] ... however:  the uri may be empty and the name and namespace equal.  In that case, the attribute appears to be a definition of the uri for the namespace given by $name, although the uri thus defined is not returned in the uri field, the uri-defining attribute is named as if it were $ns:$ns.  Finally, the {xmlns {} {}} form appears to be special, and to indicate that the xmlns namespace's uri is being defined.

There.  Clear as mud.  No wonder XML is so popular (?)

----

[aricb] 2010-06-17:

An '''XML declaration''' is a processing instruction along the lines of `<?xml version="1.0" encoding="UTF-8" standalone="no" ?>` at the beginning of an XML document.  As nearly as I can determine, the recommended way to put this line in a tDOM-generated XML file is something along the lines of:

======
puts $xmlfile "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"no\" ?>"
======

In other words, tDOM doesn't provide any facilities for outputting XML declarations (see [http://groups.google.com/group/comp.lang.tcl/browse_frm/thread/8a56e904db392835?tvc=1] and [http://groups.google.com/group/comp.lang.tcl/browse_frm/thread/865beb80e5837e2f/650d922399d811a6]).

However, when parsing an XML file with tDOM, you can capture the value of the XML declaration's encoding attribute using:

======
dom parse [tDOM::xmlReadFile $filename encStr]
======

the value will be stored in `$encStr` (see [http://groups.google.com/group/comp.lang.tcl/browse_frm/thread/8a56e904db392835?tvc=1%|%this google thread%|%]).

[DKF]: Note that the XML declaration is not formally a processing instruction, though it uses the same basic syntax.
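
Putting the two halves together, a minimal round-trip sketch: write the declaration by hand, then read the file back with `tDOM::xmlReadFile`. The temporary file (via `file tempfile`, Tcl 8.6) and the sample content are assumptions made for this example.

======
package require tdom

set doc [dom parse {<note>caf&#233;</note>}]

# Write the declaration manually, followed by the serialized tree
set fh [file tempfile path]
fconfigure $fh -encoding utf-8
puts $fh {<?xml version="1.0" encoding="UTF-8"?>}
puts -nonewline $fh [$doc asXML]
close $fh
$doc delete

# Read it back; encStr receives the encoding detected for the file
set doc2 [dom parse [tDOM::xmlReadFile $path encStr]]
set text [[$doc2 documentElement] text]
puts "encoding: $encStr, text: $text"

$doc2 delete
file delete $path
======

The channel encoding set with `fconfigure` has to agree with the encoding named in the declaration; otherwise the file will be misread by other parsers.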



** Historical **

www.tdom.org was terminated around 2010-11

[bovine] set up the github project with a mirror of the old website and a fork from the
repository.

[makr] kept a mirror of the old CVS repository.

----

[snichols]: 

I recently compiled tdom 0.8.0 on Windows XP successfully with threads enabled,
but when doing a package require from within Tcl 8.4.7 I get the following
error, "too many nested evaluations (infinite loop?)" when doing the package
require tdom command.  Any ideas?  Thanks in advance.

[RS]: Oh yes, a silly bug - in the file tdom.tcl, comment out the line

======
package require tdom
======

a file can't require a package during providing it :)

[snichols]: Thank you very much. That fixed the issue.

[AK] 2006-08-02: When was '''recently''' ? According to Rolf Ade this problem was fixed '''Sep 29, 2004'''. Both the package index generated by the TEA Makefile, and the one found in the 'win' directory load the shared library (i.e. DLL) first, then source tdom.tcl. As the DLL runs the C equivalent of 'package provide tdom' the 'package require tdom' executed by 'tdom.tcl' is satisfied and will not loop.



** Bugs **

[escargo] 2008-02-38: The "official home" does not provide any way to
submit bug reports.  I did report a problem to the Yahoo group (now defunct),
since that seemed to be the only resort available.

[HaO] 2014-09-16: Rolf Ade said in a private conversation on the 2014 European TCL User meeting that he maintains tdom actively (thank you) and asked if there is something missing...
The fossil repository [https://46.163.78.80/cgi-bin/repros/tdom/timeline] offers a ticket system which might be on his radar.

** Description **

This tightly-coded extension emphasizes speed and memory economy.  In contrast
to the "Pure Tcl" [TclDOM], it bests leading [Java] [DOM] implementations by an
order of magnitude (!) in both processing speed and memory demand. Jochen also
exploits tDOM's expressivity to offer a nice [HTML] reader, XML
validator--called [tnc]--and [XSLT] engine. tDOM is in production at several
commercial installations.


0.8.3 doesn't compile for Tcl version 8.6.  Use trunk or a later version instead.

[MSW]:  now understands why tDOM crashed for him ... obviously you shouldn't be
generating variables more than once (like two consecutive [[<domDoc>
documentElement root]] in different functions make tDOM trip). He's still
intrigued by appended nodes not being addressable (no localName, no URI, not
reachable via [[selectNodes]]).

Hmm?

======
set doc [dom createDocument foo]
set root [$doc documentElement]
# append a node
set node [$root appendChild [$doc createElement bar]]
# all of the following return a value == $node
$root firstChild
lindex [$root selectNodes /foo/bar] 0
lindex [$root selectNodes //bar] 0
lindex [$root selectNodes /*/*] 0
lindex [$root getElementsByTagName bar] 0

# do it again
set node1 [$root appendChild [$doc createElement bar]]
# all of the following return a value == $node1
[$root firstChild] nextSibling
lindex [$root selectNodes /foo/bar] 1
lindex [$root selectNodes {/foo/bar[2]}] 0
lindex [$root selectNodes //bar] 1
lindex [$root selectNodes /*/*] 1
lindex [$root getElementsByTagName bar] 1
======


----

**return of "$node attributes" (and xmlns:xxx attributes)**

[oehhar] 2018-05-14: Rolf wrote on the chat of the return value of the [http://tdom.github.com/domNode.html%|% $node attributes ?attributeNamePattern? %|%] command:

In case of an "ordinary" namespaced attribute, the sublist elements are `{<localname> <prefix> <namespace_uri>}`.

In the special case of an xml namespace declaration it is `{<the prefix defined> <localname> <namespace_uri>}`.
Of course, the prefix defined and the localname are equal.
And some of the (somewhat ... special) recommendations say, that namespace declaration attributes are not in any namespace.


***My example for xml namespace declarations***

The issue arose within [https://core.tcl.tk/tclws/info/584bfb772724c1a9%|%tcl ws bug 584bfb77%|%].

Example document (the real example is in the ticket linked above):

======
<?xml version="1.0" encoding="utf-8"?>
<ps1:t1 xmlns:ps1="uri1" xmlns:ps2="uri2">
  <ps1:t2 type="ps2:type"></ps1:t2>
</ps1:t1>
======

How do I get the attributes "xmlns:*" from the node "ps1:t1" ?

Here is the example script:
======
set xml {<?xml version="1.0" encoding="utf-8"?>
<ps1:t1 xmlns:ps1="uri1" xmlns:ps2="uri2">
  <ps1:t2 type="ps2:type"></ps1:t2>
</ps1:t1>
}
dom parse $xml doc
$doc documentElement top
foreach attributelist [$top  attributes] {
    lassign $attributelist localname prefix uri

    # ignore attributes in namespaces and with different local names
    if {$uri ne "" || $prefix ne $localname} {continue}

    # Attribute name of a namespace prefix
    set attributename "xmlns:[lindex $attributelist 0]"

    # This will filter out any non-namespace declaration attributes
    if {![$top hasAttribute $attributename]} {continue}

    puts "$attributename = [$top getAttribute $attributename]"
}
======

with the output:
======
xmlns:ps1 = uri1
xmlns:ps2 = uri2
======

Here is the implementation in client.tcl of tclws:
[https://core.tcl.tk/tclws/artifact/d072926db77b9f6c?ln=2187,2203%|%client.tcl line 2187 - 2203%|%]

<<categories>> Package | Mailing List | XML | Parsing