Tcl_ParseCommand - function in Tcl's public C API for parsing a Tcl command from a string.

Official docs

http://www.purl.org/tcl/home/man/tcl8.4/TclLib/ParseCmd.htm

  int
  Tcl_ParseCommand(
      Tcl_Interp *interp,
      const char *string,
      int         numBytes,
      int         nested,
      Tcl_Parse  *parsePtr)

Several bugs deep in the Tcl core, all involving the parsing of commands, suggest that the existing docs for Tcl_ParseCommand don't point out one tricky case well enough. See Tcl Bug 681641.

The end of a Tcl command is determined by the presence of a command terminator character. This might be a newline ("\n") or a semicolon (";"). When Tcl_ParseCommand is asked to parse a command in a command substitution context (by passing a non-zero value for the nested argument), the close-bracket character ("]") is also a command terminator.

Tcl_ParseCommand returns its parsing results in a Tcl_Parse structure pointed to by parsePtr. The two fields commandStart and commandSize in the Tcl_Parse struct indicate the substring that was parsed as a valid Tcl command. The substring begins with the byte pointed to by parsePtr->commandStart and includes parsePtr->commandSize bytes. This substring includes the command terminator!

So, for example, if string originally points to "foo;bar" and numBytes is 7 (requesting the whole string be parsed), then after Tcl_ParseCommand returns, commandStart will point to the "f" and commandSize will be 4, indicating the substring "foo;" was successfully parsed as a Tcl command.

This interface has pros and cons. The main advantage is that it is easy to create a loop that will parse many commands from a script:

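  while (numBytes > 0) {
    /* Parse the next command; give up on a parse error. */
    if (TCL_OK != Tcl_ParseCommand(interp,script,numBytes,0,&parse)) {
      return TCL_ERROR;
    }
    /* Step past the parsed command, including any terminator. */
    end = script + numBytes;
    script = parse.commandStart + parse.commandSize;
    numBytes = end - script;
    Tcl_FreeParse(&parse);
  }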

The parser takes care of advancing the pointer past the terminator character for us.

A disadvantage is that the caller of Tcl_ParseCommand() may really be interested in the string that is the actual command, and not interested in the terminator character. In that case, the caller is burdened with having to strip off the command terminator character.

Finally, we come to the really tricky problem. The substring marked off by the commandStart and commandSize fields of the Tcl_Parse struct only includes a command terminator character when it exists. A Tcl command can be terminated without a command terminator character in one special case: when the command is terminated by the end of the string (when numBytes drops to 0).

So, if we pass in a string of "foo;bar" and a numBytes value of 3, then the substring marked off in the Tcl_Parse struct is "foo". Unlike the previous example, there is no ";" included in the substring.
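The two examples can be reproduced with a toy model. The sketch below is not Tcl's parser: ToyParse and ToyParseCommand are hypothetical names, and the scanner ignores quoting, braces, backslashes, leading white space and comments. It models only the rule under discussion, namely that the marked substring includes the terminator when one is found and stops at the end of the string otherwise.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the two Tcl_Parse fields discussed here. */
typedef struct {
    const char *commandStart;
    size_t      commandSize;
} ToyParse;

/*
 * Toy model only: looks for the first ';' or '\n' (and ']' when nested
 * is non-zero) within the first numBytes bytes of string.
 */
static void
ToyParseCommand(const char *string, size_t numBytes, int nested,
        ToyParse *parsePtr)
{
    size_t i;

    assert(string != NULL && parsePtr != NULL);
    parsePtr->commandStart = string;
    for (i = 0; i < numBytes; i++) {
        char c = string[i];
        if (c == ';' || c == '\n' || (nested && c == ']')) {
            /* Terminator found: it is counted in the marked substring. */
            parsePtr->commandSize = i + 1;
            return;
        }
    }
    /* Ran out of bytes: the command ends with no terminator included. */
    parsePtr->commandSize = numBytes;
}
```

Under this model, "foo;bar" with a length of 7 marks 4 bytes ("foo;"), while the same string with a length of 3 marks 3 bytes ("foo").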

So, if a caller is interested in stripping command terminator characters, it has a more complex task of having to discover when Tcl_ParseCommand left them in the substring, and when it did not. And the sad truth is that the public Tcl_ParseCommand interface does not provide a simple way to make that discovery.

If the marked substring does not consume all numBytes bytes of the original string argument, then we do know that the last character of the marked substring is the command terminator character that terminates the parsed command.
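That rule can be written down as a small helper. This is only a sketch: CommandSizeWithoutTerminator is a hypothetical name, not part of Tcl's API, and it takes plain pointers and lengths rather than a Tcl_Parse so that it stands alone. It assumes the caller has already resolved numBytes to an actual byte count. When the marked substring consumes every byte, the helper reports that it cannot tell whether a terminator is present. Looking at the last character would not settle it either, since a backslash-quoted ";" can legitimately end a command.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper. string/numBytes are the arguments that were passed
 * to the parser; commandStart/commandSize are the values it reported.
 * Returns the command length with the terminator stripped when one is
 * known to be present. *certainPtr is set to 1 in that case, and to 0
 * when the parse consumed all bytes and the answer is unknown.
 */
static size_t
CommandSizeWithoutTerminator(const char *string, size_t numBytes,
        const char *commandStart, size_t commandSize, int *certainPtr)
{
    const char *end = string + numBytes;
    const char *parsedEnd = commandStart + commandSize;

    assert(commandStart >= string && parsedEnd <= end);
    if (commandSize > 0 && parsedEnd < end) {
        /* Bytes remain, so the last marked byte is the terminator. */
        *certainPtr = 1;
        return commandSize - 1;
    }
    /* All bytes consumed: a terminator may or may not be included. */
    *certainPtr = 0;
    return commandSize;
}
```

With real parse results, the last two arguments before certainPtr would be parsePtr->commandStart and parsePtr->commandSize.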