Playing XPath with JSON

The tcllib JSON parser returns nested dictionaries, which you are free to traverse and query using the regular dict commands. However, these can be cumbersome when you want to access particular parts of the "tree" from within the logic of your application. The code below plays XPath with the dictionaries returned by the tcllib implementation of the JSON parser. Since the parser's output (and thus the dictionaries) carries no type information, part of the code does some magic to tell (JSON) arrays apart from true dictionaries. It is not perfect, but it works for most use cases.
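As a small illustration of why the guessing is needed, the sketch below parses a JSON array and a JSON object whose Tcl representations end up identical. It assumes tcllib's json package is installed; the variable names are only for illustration.

```tcl
package require json

# A JSON array and a JSON object can yield the same Tcl value,
# because json2dict returns plain lists and dicts without type tags.
set fromArray  [::json::json2dict {[1, 2]}]
set fromObject [::json::json2dict {{"1": 2}}]

# Both values are the two-element list "1 2", so this should print 1.
puts [expr {$fromArray eq $fromObject}]
```

Once parsed, nothing in the value itself says which of the two it came from, hence the heuristics in ::json:object? below.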

Queries use parentheses to access arrays (and not square brackets as in XPath, so that matching is easier with string match). Arrays start at index 0 (XPath uses 1 as the first index). You can use "glob-style" patterns compatible with string match in the query.
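The following lines sketch how such queries behave when handed to string match, using paths shaped like the ones the code below produces. The paths are made up for illustration; only the core string match command is used.

```tcl
# Parentheses are ordinary characters for [string match], so an array
# index can be globbed directly.
puts [string match {(*)/L*} {(0)/Latitude}]   ;# 1
puts [string match {(*)/L*} {(1)/Zip}]        ;# 0

# Square brackets would be read as a character set instead of literally.
puts [string match {[0]/Zip} {[0]/Zip}]       ;# 0
puts [string match {[0]/Zip} {0/Zip}]         ;# 1

# Unlike a path step in XPath, * also matches across separators.
puts [string match {/Image/*} {/Image/Thumbnail/Url}]   ;# 1
```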

package require json


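# Return 1 if every element of the list dta satisfies [string is $class]
# (e.g. integer or double), 0 otherwise. An empty list trivially qualifies.
proc ::json:listof? { dta class } {
    foreach i $dta {
        if { ![string is $class -strict $i] } {
            return 0
        }
    }
    return 1
}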


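# Heuristic: guess whether dta is a JSON object (a dictionary) rather than
# an array. Even-length lists made only of numbers are taken to be arrays;
# otherwise every key must consist of word characters only.
proc ::json:object? { dta } {
    if { [llength $dta]%2 == 0 } {
        if { [::json:listof? $dta integer] || [::json:listof? $dta double] } {
            return 0
        }

        foreach {k v} $dta {
            if { ![string is wordchar $k] } {
                return 0
            }
        }
        return 1
    }
    return 0
}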

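# Return a flat list of path/value pairs for all nodes of dta whose path
# matches the glob pattern xpr. lead is the path accumulated so far and is
# only meant to be passed by the recursive calls.
proc ::json:select { dta xpr { separator "/" } {lead ""} } {
    set selection {}

    # Treat dta as an object: descend into each key.
    if { [::json:object? $dta] } {
        foreach { k v } $dta {
            set fv $lead$separator$k
            set selection [concat $selection [::json:select $v $xpr $separator $fv]]
            if { [string match $xpr $fv] } {
                set selection [concat [list $fv $v] $selection]
            }
        }
    }

    # Nothing selected so far: try dta as an array, with elements
    # addressed as (0), (1), ...
    if { [llength $selection] == 0 } {
        set len [llength $dta]
        if { $len > 1 } {
            for {set i 0} {$i < $len} {incr i} {
                set fv $lead\($i\)
                set v [lindex $dta $i]
                set selection [concat $selection [::json:select $v $xpr $separator $fv]]
                if { [string match $xpr $fv] } {
                    set selection [concat [list $fv $v] $selection]
                }
            }
        }
    }
    return $selection
}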

So, given data that would contain the following tree (example comes from the JSON RFC):

      [
        {
           "precision": "zip",
           "Latitude":  37.7668,
           "Longitude": -122.3959,
           "Address":   "",
           "City":      "SAN FRANCISCO",
           "State":     "CA",
           "Zip":       "94107",
           "Country":   "US"
        },
        {
           "precision": "zip",
           "Latitude":  37.371991,
           "Longitude": -122.026020,
           "Address":   "",
           "City":      "SUNNYVALE",
           "State":     "CA",
           "Zip":       "94085",
           "Country":   "US"
        }
      ]

The code below returns the paths and values of L* (which matches both Latitude and Longitude) for all objects of the JSON array.

::json:select [::json::json2dict $dta] (*)/L*

In other words, you would receive the following list of path/value pairs:

(0)/Longitude -122.3959 (0)/Latitude 37.7668 (1)/Longitude -122.026020 (1)/Latitude 37.371991

Furthermore, you can look for specific values in the tree using calls similar to the one below, which looks for an object in the array that has the value of Zip set to 94107.

::json:match [::json::json2dict $dta] (*)/Zip == 94107

In other words, you would receive the following list:

(0)/Zip
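The definition of ::json:match is not shown on this page. The following is only a guess at what it might look like, assuming it builds on ::json:select as defined above and that the operator is one of a handful of expr comparison operators. The argument names and the operator whitelist are assumptions, not the original code.

```tcl
# Hypothetical sketch of ::json:match (the original is not shown here).
# Relies on ::json:select from above being loaded. Returns the paths
# matching xpr whose value satisfies "<value-at-path> op value".
proc ::json:match { dta xpr op value { separator "/" } } {
    # Only accept known comparison operators before building the expression.
    if { $op ni {== != < <= > >= eq ne} } {
        error "unsupported operator: $op"
    }
    set paths {}
    foreach { path v } [::json:select $dta $xpr $separator] {
        # expr itself substitutes $v and $value; only $op is spliced in.
        if { [expr "\$v $op \$value"] } {
            lappend paths $path
        }
    }
    return $paths
}
```

With == the comparison is numeric when both sides look like numbers, so the string "94107" from the JSON data matches the query value 94107.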

To show the magic that tries to differentiate between lists and dictionaries at work, consider the following example (again, taken from the JSON RFC).

      {
        "Image": {
            "Width":  800,
            "Height": 600,
            "Title":  "View from 15th Floor",
            "Thumbnail": {
                "Url":    "http://www.example.com/image/481989943",
                "Height": 125,
                "Width":  100
            },
            "Animated" : false,
            "IDs": [116, 943, 234, 38793]
          }
      }

To get the value of the third "ID" from the Image object, you would do:

::json:select [::json::json2dict $dta] /Image/IDs(2)

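However, you will not receive any value for the following query, because IDs is detected as an array of integers and not as a dictionary with a key 116: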

::json:select [::json::json2dict $dta] /Image/IDs/116

SRB: 2014-11-14 - what you are describing is not XPath. XPath uses square brackets for predicates, not parentheses. XPath also does not allow globbing of "element" names. At best this is an XPath-like language for navigating an information set.

EF I never intended to implement XPath, hence the title and the description. I mentioned above that I was not using square brackets, so as to be able to use them for globbing, which XPath does not have. I am sorry if I have misled the reader.

Jochen Loewer 2014-11-20: I extended tDOM to have an embedded JSON parser (dom parse -json ...) and serializer (asJSON), which allow us to make use of the powerful XPATH constructs to operate on JSON data (given certain rules are met to overcome differences in expressiveness between JSON and DOM/XML). If there is interest, I can push the feature to the tDOM repo on github.

EF I think that this would be a wonderful idea!

MSH I second that idea if it is still possible!

EF Otherwise, jq (see http://stedolan.github.io/jq/ ) can be a wonderful toy to perform similar things.


ak - 2015-01-20 21:59:46

Note also treeql, a tree query and editing language based on struct::tree.