Version 6 of http::formatQuery

Updated 2010-03-06 13:09:11 by MHo

QUESTION:

Does anyone know whether both uppercase and lowercase hex digits are allowed in the percent-encoding of CGI parameters? E.g., are the following two results equivalent?

 % http::formatQuery äöü
 %c3%a4%c3%b6%c3%bc
 %

 % http::formatQuery2 äöü
 %C3%A4%C3%B6%C3%BC
 %

I read a few lines of RFC 3875 but have not fully understood it yet...

MJ - The URI escaping is described in RFC 2396 (which is referenced by the CGI RFC 3875). There it states:

   An escaped octet is encoded as a character triplet, consisting of the
   percent character "%" followed by the two hexadecimal digits
   representing the octet code. For example, "%20" is the escaped
   encoding for the US-ASCII space character.

      escaped     = "%" hex hex
      hex         = digit | "A" | "B" | "C" | "D" | "E" | "F" |
                            "a" | "b" | "c" | "d" | "e" | "f"

So yes, they are equivalent.
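
The equivalence can be checked by decoding both spellings. RFC 3986, which obsoletes RFC 2396, also treats uppercase and lowercase hex digits as equivalent (section 2.1) and recommends that producers emit uppercase. The following is only a minimal sketch: pctDecodeUtf8 is a hypothetical helper, not part of the http package, and it assumes the escaped octets are UTF-8.

```tcl
# Hypothetical helper: decode %xx escapes, accepting hex digits in either case,
# and interpret the resulting octets as UTF-8 (an assumption).
proc pctDecodeUtf8 {s} {
    set bytes ""
    set len [string length $s]
    for {set i 0} {$i < $len} {incr i} {
        set c [string index $s $i]
        set hex [string range $s [expr {$i + 1}] [expr {$i + 2}]]
        if {$c eq "%" && [regexp {^[0-9A-Fa-f]{2}$} $hex]} {
            # %x parses "c3" and "C3" to the same octet value
            scan $hex %x octet
            append bytes [binary format c $octet]
            incr i 2
        } else {
            append bytes $c
        }
    }
    return [encoding convertfrom utf-8 $bytes]
}

# Both spellings decode to the same string, so this prints 1
puts [expr {[pctDecodeUtf8 %c3%a4%c3%b6%c3%bc] eq [pctDecodeUtf8 %C3%A4%C3%B6%C3%BC]}]
```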

Mat (2009-10-27) Though they should be equivalent, many implementations accept only one of the two variants.

As it turns out, even Amazon's Product Advertising API requires the character triplets in query strings to use uppercase hex digits. This proc converts them:

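 proc formatQuery2 {val} {
     # Wrap each %xx escape in [string toupper ...], then let subst evaluate it.
     # Only command substitution is left enabled for subst.
     set script [regsub -all {%[0-9A-Fa-f]{2}} [::http::formatQuery $val] {[string toupper "&"]}]
     return [subst -nobackslashes -novariables $script]
 }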
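
The same conversion can also be done without subst. This is a sketch only: upperEscapes is a hypothetical name, and it assumes Tcl 8.5 or later (for lassign) and an input that has already been percent-encoded, e.g. by ::http::formatQuery.

```tcl
# Hypothetical alternative: uppercase only the %xx escapes, leaving keys,
# values and separators untouched. No script evaluation is involved.
proc upperEscapes {encoded} {
    set out $encoded
    # Each match is reported as a {first last} index pair
    foreach range [regexp -all -inline -indices {%[0-9A-Fa-f]{2}} $encoded] {
        lassign $range first last
        # Uppercasing ASCII hex digits keeps the length, so indices stay valid
        set out [string toupper $out $first $last]
    }
    return $out
}

puts [upperEscapes "umlaut1=%c3%a4&umlaut2=%c3%b6"]  ;# umlaut1=%C3%A4&umlaut2=%C3%B6
```

It would be called as upperEscapes [::http::formatQuery $val].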

MHo 2010-03-06: Confusion:

 % info pa
 8.5.8
 % package require http
 2.7.5
 % http::formatQuery umlaut1 ä umlaut2 ö umlaut3 Ü
 umlaut1=%c3%a4&umlaut2=%c3%b6&umlaut3=%c3%9c
 % package require ncgi
 1.3.2
 % ncgi::encode äöÜ
 %E4%F6%DC

Does http::formatQuery produce the right encoding for German umlauts here?
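
The two outputs above are percent-encodings of different byte sequences: %c3%a4 is ä in UTF-8, while %E4 is ä in ISO 8859-1. Which one is right presumably depends on the charset the receiving server expects. The sketch below reproduces both results with a hypothetical helper, pctEncodeAs, which is not part of http or ncgi and, for simplicity, escapes every byte, including ASCII.

```tcl
# Hypothetical helper: convert str to the given encoding, then percent-encode
# every resulting byte (uppercase hex). Requires Tcl 8.5+ for "cu".
proc pctEncodeAs {encodingName str} {
    binary scan [encoding convertto $encodingName $str] cu* codes
    set out ""
    foreach code $codes {
        append out [format %%%02X $code]
    }
    return $out
}

# \u escapes are used so the example does not depend on the script's file encoding
puts [pctEncodeAs utf-8     "\u00e4\u00f6\u00dc"]  ;# %C3%A4%C3%B6%C3%9C
puts [pctEncodeAs iso8859-1 "\u00e4\u00f6\u00dc"]  ;# %E4%F6%DC
```

The first line matches the bytes produced by http::formatQuery above (apart from letter case), the second matches ncgi::encode.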


Category Command that is part of the http package of Tcl