Version 5 of How To Read Large Strings without Newlines

Updated 2011-06-02 02:32:11 by RLE

XAN: I've run into a kind of surprising limitation(?) and it has me stumped. Maybe someone else here has done something similar?

The Problem

I am attempting to use TclCurl to have a "conversation" with an ASP-driven web page. A little background: these are corporate in-house web pages served over a VPN, and there is little to no hope of getting them changed. They also have a major design flaw: the designer decided to use the __VIEWSTATE field to save state for everything... including the kitchen sink! The result...

The stumbling block is a __VIEWSTATE field that is approximately 500K in size! To communicate with the server, I have to capture this awful chunk of data and send it back in the POST request. I have code that does this, and I know that it works because it is heavily tested on pages with a sane-sized __VIEWSTATE. Using the same code on a page with a large __VIEWSTATE, however, just causes Tcl to hang.

Possible Cause

I have no problem reading a file several megabytes in size into a variable, so something else is going on here. I looked for "offensive" characters that might be choking the interpreter, but none seem to exist. The striking detail, it seems to me, is the absence of newlines. The ~500K is all on one line... could it be that Tcl cannot handle this scenario, or cannot handle it well?

Things I've Tried and The Outcome

  1. I've taken a chunk of about 10K and tried building a 500K string with no newlines using append. The interpreter gets slower and slower and eventually grinds to a halt. When doing this with newlines, no slowdown occurs.
  2. I've tried every possible method of reading this data into a variable...
  • different encodings and translations on fconfigure -> read into variable
  • setting the variable directly using copy/paste into the console
  • breaking the data up into chunks and rebuilding these chunks into one variable
  3. I've looked for a way to have TclCurl pull the postfields from a file. No such method seems to exist - it must come from a variable.
  4. I've tried using external options like CMD.exe type filename. Same problem. Interestingly, CMD.exe has no problem typing this file out in a CMD.exe instance.
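
For anyone who wants to reproduce the append test from the first item, here is a minimal sketch. It assumes a 10K chunk of placeholder "x" characters and 50 repetitions (roughly 500K); the proc name build is made up for illustration and the real __VIEWSTATE data is not involved. It only times string construction, so if both variants finish quickly, string handling alone is probably not where the time goes.

```tcl
# Placeholder data: a 10K chunk with no newlines in it.
set chunk [string repeat x 10240]

# Hypothetical helper: append $chunk $count times, with $sep after each chunk.
proc build {chunk count sep} {
    set result ""
    for {set i 0} {$i < $count} {incr i} {
        append result $chunk $sep
    }
    return $result
}

# Build each variant once and keep the result for inspection.
set noNewlines   [build $chunk 50 ""]
set withNewlines [build $chunk 50 "\n"]

# Time each variant, averaged over 10 runs.
set tNone [time {build $chunk 50 ""} 10]
set tNl   [time {build $chunk 50 "\n"} 10]

puts "no newlines:   [string length $noNewlines] chars, $tNone"
puts "with newlines: [string length $withNewlines] chars, $tNl"
```

Only the lengths are printed here, not the strings themselves, so that displaying a 500K line does not get mixed into the measurement.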

Questions

  1. Why can't Tcl handle this data?
  2. What can I do? Anyone have any suggestions or similar experience?

AMG: I wasn't aware of any problems with long strings, even when they lack newlines. The computer I'm on right now takes about thirty milliseconds to assemble a one-megabyte string consisting only of the letter "x":

time {string repeat x 1048576} 100

I suspect the problem isn't with Tcl or the way it handles long strings or newlines, but rather with the way data is exchanged with the TclCurl library. Profile your program with [time] and/or [clock microseconds] to see which command is taking the longest to complete, then split up that command and see which piece is taking too long, and keep repeating until you've isolated the culprit. Let us know what you find.
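
The profiling approach described above could be sketched roughly as follows. The proc name timed is hypothetical, and the two steps are stand-ins, not actual TclCurl calls; substitute the real read, parse, and transfer commands. [clock microseconds] needs Tcl 8.5 or later.

```tcl
# Hypothetical helper: run a script in the caller's scope, report the
# elapsed wall-clock time on stderr, and return the script's result.
proc timed {label script} {
    set start [clock microseconds]
    set result [uplevel 1 $script]
    set elapsed [expr {[clock microseconds] - $start}]
    puts stderr [format "%-20s %10d us" $label $elapsed]
    return $result
}

# Stand-in steps; replace each script with one stage of the real program.
set data   [timed "build string" {string repeat x 512000}]
set length [timed "measure string" {string length $data}]
```

Wrapping each stage separately shows which one dominates; the slow stage can then be split into smaller timed pieces until the culprit is isolated.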

There doesn't appear to be any documentation for TclCurl on this Wiki. Maybe write a sample program or two and post it so we can see how it works.