Version 44 of exec quotes problem

Updated 2014-08-30 17:54:54 by AMG

The exec quotes problem and the exec ampersand problem are due to the mismatch between the command execution paradigms of exec and Microsoft Windows. On that platform, a program is executed by passing the entire command line as a single value to the system. Before main() is called, the C runtime startup code calls _setargv() to parse the command line into the standard C argc and argv[] arguments that are passed to the main() function of the program.
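
The parsing step can be sketched in Tcl. This is only a minimal sketch, assuming the commonly documented default rules: whitespace separates arguments, double quotes group, and backslashes are special only when they precede a double quote. The name parseCommandLine is hypothetical, and corner cases such as doubled quotes inside a quoted string are ignored.

```tcl
# Hypothetical helper: split a Windows command line into arguments,
# roughly the way the default _setargv() is documented to do it.
proc parseCommandLine {cmdline} {
    set argv {}
    set cur {}
    set inArg 0      ;# true once the current argument has started
    set inQuotes 0   ;# true while inside a double-quoted region
    set pending 0    ;# backslashes seen but not yet emitted
    foreach ch [split $cmdline {}] {
        if {$ch eq "\\"} { incr pending; set inArg 1; continue }
        if {$ch eq "\""} {
            # 2n backslashes + quote   -> n backslashes, quote toggles grouping
            # 2n+1 backslashes + quote -> n backslashes, literal quote
            append cur [string repeat "\\" [expr {$pending / 2}]]
            if {$pending % 2} {
                append cur $ch
            } else {
                set inQuotes [expr {!$inQuotes}]
            }
            set pending 0
            set inArg 1
            continue
        }
        # Backslashes not followed by a quote are literal.
        append cur [string repeat "\\" $pending]
        set pending 0
        if {!$inQuotes && [string is space $ch]} {
            if {$inArg} { lappend argv $cur }
            set cur {}
            set inArg 0
        } else {
            append cur $ch
            set inArg 1
        }
    }
    append cur [string repeat "\\" $pending]
    if {$inArg} { lappend argv $cur }
    return $argv
}
```

Under these assumptions, parseCommandLine {FIND "some string" c:/myfile.txt} yields three arguments, the second being some string without its quotes.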

The program can provide its own _setargv(), giving it complete control over what syntax to impose on the command line. This creates an impedance mismatch with exec, which provides an API compatible with that of the Unix execve() function, which accepts a sequence of values rather than a single command line.

For Windows, exec must convert its arguments into a single command line. Since the syntax of the command line for any given program is a moving target, and individual programs can get wild and woolly about parsing the command line, the only sane thing for exec to do is to convert according to the rules of the default _setargv() function, which are documented in Parsing C Command-Line Arguments , Visual Studio 2013.
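
The conversion can be illustrated with a short Tcl sketch. It is not Tcl's actual implementation; it merely assumes the default rules (wrap an argument in double quotes when it is empty or contains whitespace, escape each double quote with a backslash, and double any backslashes that precede a double quote). The names quoteArg and buildCommandLine are hypothetical.

```tcl
# Hypothetical helper: quote one argument so that the default
# _setargv() rules would parse it back to the original value.
proc quoteArg {arg} {
    # Surrounding quotes are needed only for empty arguments or whitespace.
    set wrap [expr {$arg eq {} || [regexp {[ \t]} $arg]}]
    set out {}
    set pending 0   ;# backslashes seen but not yet emitted
    foreach ch [split $arg {}] {
        if {$ch eq "\\"} { incr pending; continue }
        if {$ch eq "\""} {
            # Double the pending backslashes, then escape the quote itself.
            append out [string repeat "\\" [expr {2 * $pending + 1}]] $ch
        } else {
            append out [string repeat "\\" $pending] $ch
        }
        set pending 0
    }
    if {$wrap} {
        # Trailing backslashes must not escape the closing quote.
        append out [string repeat "\\" [expr {2 * $pending}]]
        return "\"$out\""
    }
    append out [string repeat "\\" $pending]
    return $out
}

# Hypothetical helper: join the quoted arguments into one command line.
proc buildCommandLine {args} {
    set parts {}
    foreach arg $args {
        lappend parts [quoteArg $arg]
    }
    return [join $parts " "]
}
```

With this sketch, buildCommandLine FIND {"somestring"} c:/myfile.txt returns FIND \"somestring\" c:/myfile.txt, which is correct for programs that follow the default rules and wrong for those that do not.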

Another _setargv() that is a part of the standard Windows development environment expands wildcard characters, and is described in Expanding Wildcard Arguments , Microsoft Visual Studio 2013. It can be selected via the appropriate command-line switch at compile time.

Still other programs roll their own _setargv(). Since there is no way to tell exec to skip the step of translating its arguments according to the default _setargv(), other workarounds must be improvised. For example, the Windows FIND command has its own peculiar syntax:

FIND [/V] [/C] [/N] [/I] [/OFF[LINE]] "string" [[drive:][path]filename[ ...]]

At first blush the syntax looks fairly normal, but in contrast to other programs, which don't see the double quotes because the default _setargv() stripped them off prior to invoking the program, FIND does see them, and uses them to determine the string to search for. Although its documentation doesn't say so, "string" can be the second argument instead of the first, as long as the first argument doesn't contain any double quotes:

FIND c:/myfile.txt "somestring"

In fact, the arguments don't even have to be delimited by whitespace. "string" is simply the first two double quotes in the command line and whatever is between them, although if the second double quote is not followed by whitespace, FIND returns an error. According to FIND, the following syntax is perfectly legal, causing it to search for somestring in c:/myfile.txt:

FIND c:/myfile.txt"somestring"

In the following example, FIND searches in c:/myfile.txt and c:/myotherfile.txt for somestring:

FIND c:/myfile.txt"somestring" c:/myotherfile.txt
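
The behaviour described above can be mimicked in Tcl to make the difference from the default rules concrete. This is only a sketch of the observed behaviour, not FIND's actual implementation: the name findStyleParse is hypothetical, and switches such as /V are not treated specially.

```tcl
# Hypothetical emulation of the FIND behaviour described above: the search
# string is whatever lies between the first two double quotes, and
# everything else on the command line is taken as file names.
proc findStyleParse {cmdline} {
    if {![regexp -indices {"([^"]*)"} $cmdline whole inner]} {
        error "no quoted search string found"
    }
    lassign $whole first last
    set after [string range $cmdline [expr {$last + 1}] end]
    # The closing quote must be followed by whitespace (or end of line).
    if {$after ne {} && ![string is space [string index $after 0]]} {
        error "closing quote must be followed by whitespace"
    }
    set pattern [string range $cmdline {*}$inner]
    set before [string range $cmdline 0 [expr {$first - 1}]]
    # Whatever surrounds the quoted string is split on whitespace.
    set files [regexp -all -inline {\S+} "$before $after"]
    return [dict create pattern $pattern files $files]
}

set result [findStyleParse {c:/myfile.txt"somestring" c:/myotherfile.txt}]
puts [dict get $result pattern]
puts [dict get $result files]
```

The point of the sketch is that the double quotes are part of the syntax the program itself looks for, so they must survive all the way to the program's own parser.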

Using exec to execute FIND can be tricky. The following example won't work because Tcl strips the double quotes even before exec is called.

exec FIND "somestring" c:/myfile.txt
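
The stripping can be observed without running any external program. In this small sketch, showArgs is a hypothetical stand-in for exec that simply returns the arguments it receives:

```tcl
# Hypothetical stand-in for exec: return the arguments exactly as received.
proc showArgs {args} {
    return $args
}

# Double quotes are Tcl grouping syntax, so they never reach the command.
puts [lindex [showArgs "somestring"] 0]     ;# prints: somestring

# Braces group without interpreting the quotes, so they are passed along.
puts [lindex [showArgs {"somestring"}] 0]   ;# prints: "somestring"
```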

The following example won't work either:

exec FIND {"somestring"} c:/myfile.txt
Access denied - \

exec targets the default _setargv() rules, adding a backslash character to each double quote, resulting in a command line that looks like this:

FIND \"somestring\" c:/myfile.txt

FIND.exe, of course, doesn't follow the default _setargv() rules, and it interprets that command line to mean that it should search in the file \ and in the file c:/myfile.txt for somestring. Hence the Access denied error.

Since exec is always going to translate its arguments into a single command line, the workaround is to have exec call CMD.exe , which does use the default _setargv(). CMD.exe can then interpret and execute the script that is passed to it as an argument:

exec $::env(ComSpec) /c find {"somestring"} c:/myfile.txt

Another option is to pipe the script into the standard input of CMD. The trailing newline is necessary:

exec $::env(ComSpec) << {find "somestring" c:/myfile.txt;
}

The cmd << variant behaves more like an interactive command shell, displaying an initial message, and requiring a newline to terminate the last command, so it may be considered cleaner to use cmd /c instead.

These two variants of the technique effectively undo the quoting that exec performs. They also shift the burden of producing the correct command line syntax back to the author of the Tcl script. There's really no alternative. Not in Tcl and not in any other language. It's just the Windows way.

The script that is passed to CMD can of course use environment variables and CMD syntax:

set ::env(QT) \"; exec $::env(ComSpec) /c find %QT%somestring%QT% *.txt 

FIND.exe interacts with the console directly, and its output will not be captured by exec. It also won't work in Wish on Windows for the same reason.

Proposal: -noquoting

MJ:

A -noquoting option to exec could tell it to skip the step where it attempts to quote the command line according to the rules of the default _setargv():

exec -noquoting {test.bat "a,b"}

This will also make use of auto_execok without eval possible (although in 8.5 {*} solves this in a different way):

exec -noquoting "[auto_execok start] http://www.tcl.tk" ; # 8.4.13
exec {*}[auto_execok start] "http://www.tcl.tk" ; # 8.5

This new flag will only lead to problems with older scripts if a command named -noquoting exists, which seems unlikely.

RS: Another point that exec -noquoting should handle is to take < and > as literal characters, not as redirections, e.g.

exec echo "<StartTag>"

MJ: Ideally, -noquoting would refrain from interpreting any special characters such as " < << 2> etc. So the tag case should be handled correctly, e.g.:

% exec echo "<starttag>"
couldn't read file "starttag>": no such file or directory
% exec -noquoting echo "<starttag>"
<starttag>
%

MJ: You can download a patch on 8.5a3 from [L1 ] that adds a -noquote option to exec. The command must still be expanded so:

exec [auto_execok start]

will still not work. All other quoting of " | & < etc. will be disabled. Note however that | & and others will still have a meaning for the shell.

To apply the patch go to the root of the Tcl source dir and type:

patch -p1 < exec.patch

After running all the test cases for exec, only one test case fails (understandably), so the patch shouldn't break any existing functionality:

==== exec-14.3 unknown switch FAILED
==== Contents of test case:

   list [catch {exec -gorp} msg] $msg

---- Result was:
1 {bad switch "-gorp": must be -keepnewline, -noquote, or --}
---- Result should have been (exact matching):
1 {bad switch "-gorp": must be -keepnewline or --}
==== exec-14.3 FAILED

Examples

% exec cmd /c echo \"test\"
\"test\"
% exec -noquote cmd /c echo \"test\"
"test"
  
% # example of disabling interpretation of <
% exec  cmd /c echo <> ; # error from Tcl
couldn't read file ">": no such file or directory
% exec  cmd /c echo \"<>\" ; # quoting doesn't help
\"<>\"
% exec  -noquote cmd /c echo <> ; # error from shell
> was unexpected at this time.
% exec  -noquote cmd /c echo \"<>\" ; # quoting helps
"<>"

MJ: After using the exec -noquote version for a while I noticed that TWAPI already offers much of the requested functionality in twapi::create_process:

package require twapi
interp alias {} twexec {} ::twapi::create_process {} -showwindow hidden -cmdline
twexec "echo <>"

I don't see a clear way to get the output of the command into Tcl yet, so it might not be usable in cases where you need that output.

LV 2007-06-07: to move forward it would be worthwhile to submit a TIP if you haven't already, detailing the -noquote option and pointing to your patch. That way, the TCT can debate what you've proposed, and perhaps help tweak things so that it works even better.

PYK 2014-08-29: +1 for -noquote.

AMG: Perhaps -raw would be a better name, but yes, this seems to be the only way forward short of beating some sense into Microsoft.

Batch Files

MJ: One example I have encountered is when passing parameters to a batch file that contain commas:

test.bat a,b

will call test.bat with two parameters, a and b. This can be very useful, but leads to problems when you want to send a,b as one parameter. The solution is to quote the a,b:

test.bat "a,b"

If we want to call this batch file from Tcl, the first idea would be to use

exec test.bat {"a,b"} 

For the reasons described above, exec will transform this to:

test.bat \"a,b\"

One of the techniques described above can be used instead:

exec cmd /c "test.bat \"a,b\"\n"

Misc

HaO: Example: opening a folder in Windows Explorer with a file selected. Possible invocations at a Windows command prompt:

C:\Windows\explorer.exe /select,C:\Program Files\tcl8.5\doc\tcl85.hlp

or

C:\Windows\explorer.exe /select,"C:\Program Files\tcl8.5\doc\tcl85.hlp"

but not

C:\Windows\explorer.exe "/select,C:\Program Files\tcl8.5\doc\tcl85.hlp"

which is issued by the exec quoting rules when using:

set filename [file nativename {C:/Program Files/tcl8.5/doc/tcl85.hlp}]
eval exec [auto_execok explorer] [list /select,$filename]

Found solutions:

eval exec [auto_execok explorer] [string map {\\ \\\\} /select,$filename]
eval exec [auto_execok cmd] [list << "explorer.exe /select,$filename\n"]
eval exec [auto_execok cmd] [list << "explorer.exe /select,\"$filename\"\n"]

The first passes the path as multiple arguments and thus avoids exec adding quotes. The second and third use the previously mentioned cmd.exe technique to produce the first and second command lines shown above, respectively. I was not able to get controlled quotes using solution 1.
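
Why the first solution ends up as multiple arguments can be checked in plain Tcl. In this sketch, showArgs is a hypothetical stand-in for exec that returns the arguments it receives, and the file name is the one from the example above:

```tcl
# Hypothetical stand-in for exec: return the arguments exactly as received.
proc showArgs {args} {
    return $args
}

set filename {C:\Program Files\tcl8.5\doc\tcl85.hlp}

# [list] protects the space, so eval hands over one argument, which exec
# would then wrap in double quotes because it contains whitespace.
set one [eval showArgs [list /select,$filename]]
puts [llength $one]    ;# prints: 1

# Doubling the backslashes protects them from eval, but the space is left
# unprotected, so eval splits the value into two arguments. Neither part
# contains whitespace, so neither would be quoted when joined again.
set many [eval showArgs [string map {\\ \\\\} /select,$filename]]
puts [llength $many]   ;# prints: 2
puts [join $many " "]
```

Joining the two arguments with a single space reproduces the original unquoted text, which is why the trick works only as long as the path contains no runs of multiple spaces.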

PYK 2014-08-29: explorer.exe is another example of a program which does not use the standard _setargv() command-line parsing. It parses the command line on its own, and allows but does not require quoting of whitespace in the filename portion of the /select,c:\path to some/file flag. Because cmd /c results in a non-interactive CMD.exe, it's a little cleaner in operation than cmd <<, suggesting the following syntax:

exec $env(ComSpec) /c "[auto_execok explorer] /select,$nativename"

Reference

Parsing C Command-Line Arguments , Microsoft Visual Studio 2013
describes the default _setargv() behaviour.
CreateProcess function , Microsoft Visual Studio 2013
Customizing C++ Command-Line Processing , Microsoft Visual Studio 2013
Expanding Wildcard Arguments , Microsoft Visual Studio 2013
describes an alternate _setargv()
Using batch parameters , Microsoft Windows XP