UPL: Tcl, Perl, Python, C, Etc

Peter Newman 8 January 2005

-----------------------

Unified Programming Language

Multiple Languages

One-eyed Tcl'ers will note the sacrilege at the top of the example UPL: The Bootstrap File. What are Perl and Python (etc) interpreters doing there?

Simple; this is the "Unified Programming Language". It's for idiots like me who think that there's no such thing as the perfect programming language (although every programmer probably has his or her preference(s)).

But UPL is built on the assumption that every programming language has its advantages and disadvantages. And we want the script level programmer to be free to:-

  1. Take data types, commands/functions, and even syntax and quoting rules etc, from other languages, as they see fit;
  2. Script a single application in whatever combination of languages/syntaxes they want; and
  3. Invent new quoting rules and syntaxes, and new ways of expressing the solution to programming problems etc, as they see fit.

So not only can the script-level programmer select the data types and commands/functions from their 'traditional' programming language of preference (Tcl, Perl, Python, C, etc), they can also:-

  • Select data types and commands/functions from other programming languages, and/or;
  • Load a parser/interpreter that will parse the UPL version of any combination of those other languages too.

jcw - Interesting idea. For syntax differences, I see no roadblocks ahead. But how will you deal with differences in the data & execution model? Tcl's copy on write vs. Python's mutable/immutable/references? That a string can be treated as an int or a list in Tcl but not in Python? Garbage collection versus plain C? The fact that Python has a "None" datatype? It'd be fascinating if this can be resolved somehow. Also, IMO, the main benefits will come when one can adopt library packages from other languages, so as not to have to re-invent all wheels - can UPL end up close enough to any current language to run the reams of existing code there is today (perhaps with minor adjustments)?

Peter Newman Thanks jcw. There are many issues to be resolved. But I'd like to end up with something as powerful and flexible as possible. But who knows how far we can go, and what might be achieved.


RS A parser that grows by loaded languages is a formidable idea indeed - however, there are ambiguous cases. Say we loaded both Tcl and Python parsers, what would they do with

 set = 0

This is an assignment in both languages, but Tcl would assign 0 to =, while Python would assign to the variable set...
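
The ambiguity can be made concrete with two toy readers. This is only a sketch in Python: parse_as_tcl and parse_as_python are hypothetical names invented for illustration, and each handles nothing beyond this one kind of line.

```python
def parse_as_tcl(line):
    """Toy Tcl reading: the first word is the command, the rest are its arguments."""
    command, *args = line.split()
    if command == "set" and len(args) == 2:
        name, value = args
        return {name: value}  # Tcl values are strings
    raise ValueError("not a two-argument set command")


def parse_as_python(line):
    """Toy Python reading: NAME = VALUE assigns VALUE to the variable NAME."""
    name, sep, value = line.partition("=")
    if not sep:
        raise ValueError("not an assignment")
    return {name.strip(): int(value)}


line = "set = 0"
print(parse_as_tcl(line))     # {'=': '0'}
print(parse_as_python(line))  # {'set': 0}
```

Both readers accept the line without complaint, so a combined parser cannot decide from the text alone which language was meant.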

Peter Newman Thanks RS. As I see it there are two possibilities:-

  1. The code for different languages is bracketed with (for example):-
 <LANGUAGE Tcl>
    ...some Tcl here...
 <END-LANGUAGE><LANGUAGE Perl>
    ...some Perl here...
 <END-LANGUAGE>

This would be the normal way - used when you have complete routines/code fragments that you wanted to write in some, perhaps more appropriate, language.

  2. An embedded call to a single function in another language could perhaps be done with an extension to the namespace syntax, where we prepend a language specifier to the front. Eg:-
       set dirHandle [ Perl:/opendir xxx xxx ]
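
A minimal sketch of how such a language-qualified command name might be resolved, written in Python. Everything here is an assumption made for illustration: the ":/" separator is taken from the example above, while split_qualified, REGISTRY, call and the stand-in opendir function are hypothetical and not part of any UPL design.

```python
def split_qualified(name, default_language="Tcl"):
    """Split 'Perl:/opendir' into ('Perl', 'opendir').

    Names without the ':/' separator are assumed to belong to the default language.
    """
    language, sep, command = name.partition(":/")
    if not sep:
        return default_language, name
    return language, command


# Hypothetical registry: language name -> {command name: callable}.
# The Perl entry is a stand-in that only formats a string; it opens nothing.
REGISTRY = {
    "Perl": {"opendir": lambda path: f"perl-dirhandle({path})"},
}


def call(qualified_name, *args):
    """Look up the command in the language it is qualified with, then call it."""
    language, command = split_qualified(qualified_name)
    return REGISTRY[language][command](*args)


print(split_qualified("Perl:/opendir"))  # ('Perl', 'opendir')
print(call("Perl:/opendir", "/tmp"))     # perl-dirhandle(/tmp)
```

The lookup itself is the easy part; what the returned value looks like to the calling language is the open question.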

IL isn't this similar to something .NET provides? i think what they had the languages do was adopt a common specification to be scripting compliant. i'm guessing that's how it resolved the data type issues... dunno though, i tend to stay away from learning ms technologies...

Peter Newman I think they're completely different animals. But who knows; I avoid Microsoft as much as possible too. Had a look at .NET though; seemed to be more about SOAP and XML. I doubt that .NET has the modularity that to me is the key feature of UPL.


LV So basically you are saying that in the end, I might have a program that read like this?

   # First some JCL
  //step10 dd dsn=/home/lvirden/mystuff.txt,
  //              alloc=((100,100), 1000)

  # Now some ksh/bash
  lst=$(ls ~/*.txt)

  # Now some Tcl
  foreach i $lst {
        # Now a pipeline which mixes tcl, ksh notation
        j=`ls -l $i | lindex - 1`
        puts stdout $j
  }

Peter Newman 11 January 2005: Almost but not quite. Each language's code is bracketed with (for example) <LANGUAGE languageID> ...script here... <END-LANGUAGE>. So each language has to watch out for that end sequence (as well as the normal End Of The File You Started Parsing In end sequence), at which point it hands control back to either the parser that called it, or the bootstrap interpreter that calls the first parser.

In other words:-

   # First some JCL
 <LANGUAGE JCL>
  //step10 dd dsn=/home/lvirden/mystuff.txt,
  //              alloc=((100,100), 1000)
 <END-LANGUAGE JCL>

 <LANGUAGE kshBash>
  # Now some ksh/bash
  lst=$(ls ~/*.txt)
 <END-LANGUAGE>

 <LANGUAGE Tcl/2>
  # Now some Tcl
  foreach i $lst {
        # Now a pipeline which mixes tcl, ksh notation
        j=`ls -l $i | lindex - 1`
        puts stdout $j
  }
 <END-LANGUAGE>
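
The bracketing could be handled by something like the following Python sketch of the bootstrap step. It assumes each marker sits on a line of its own and that blocks are not nested; split_blocks, START, END and SAMPLE are hypothetical names, and a real bootstrap interpreter would hand each body to the named parser rather than just collect it.

```python
import re

# Assumed marker shapes, taken from the example above.
START = re.compile(r"^\s*<LANGUAGE\s+([^>\s]+)>\s*$")
END = re.compile(r"^\s*<END-LANGUAGE(?:\s+[^>\s]+)?>\s*$")


def split_blocks(source):
    """Return (language, body) pairs in file order."""
    blocks = []
    language, body = None, []
    for line in source.splitlines():
        if language is None:
            start = START.match(line)
            if start:
                language, body = start.group(1), []
            # Lines outside any block are ignored in this sketch.
        elif END.match(line):
            blocks.append((language, "\n".join(body)))
            language = None
        else:
            body.append(line)
    if language is not None:
        # End of file also ends the current block.
        blocks.append((language, "\n".join(body)))
    return blocks


SAMPLE = """\
<LANGUAGE JCL>
//step10 dd dsn=/home/lvirden/mystuff.txt,
<END-LANGUAGE JCL>

<LANGUAGE kshBash>
lst=$(ls ~/*.txt)
<END-LANGUAGE>
"""

for language, body in split_blocks(SAMPLE):
    print(language, "->", body)
```

Note that the splitter, not the language parser, watches for the end sequence here; the text above has each parser doing that itself, which is harder because the marker must then be legal (or at least harmless) in every language.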

The <LANGUAGE Xxx>...<END-LANGUAGE> terminology is just a suggestion. Obviously, we have to invent a syntax which is friendly to as many languages as possible. Or different languages may have a slightly different syntax. Eg:-

   # First some JCL
 <LANGUAGE JCL>
               ^-- JCL receives control here. So it doesn't care about the "Transfer to the 'JCL' parser" sequence that precedes it.
  //step10 dd dsn=/home/lvirden/mystuff.txt,
  //              alloc=((100,100), 1000)
 <END-LANGUAGE JCL> <-- JCL has to recognise and process this. So it must be JCL friendly.

 <LANGUAGE kshBash>
                   ^-- Alternatively, maybe the <END-LANGUAGE> sequence could be discarded. And JCL could look for and action the "Switch to ''Whatever''" sequence instead. Then that would have to be JCL friendly.
  # Now some ksh/bash
  lst=$(ls ~/*.txt)
 <END-LANGUAGE>
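
The alternative, where the end sequence is discarded and a block simply runs until the next switch sequence, might look like this in Python. Again only a sketch under assumptions: markers are on their own lines, and split_on_switch and SWITCH are hypothetical names.

```python
import re

# Assumed shape of the "switch to another language" sequence.
SWITCH = re.compile(r"^\s*<LANGUAGE\s+([^>\s]+)>\s*$")


def split_on_switch(source):
    """Return (language, body) pairs; each block ends at the next switch or at end of file."""
    blocks = []
    language, body = None, []
    for line in source.splitlines():
        switch = SWITCH.match(line)
        if switch:
            if language is not None:
                blocks.append((language, "\n".join(body)))
            language, body = switch.group(1), []
        elif language is not None:
            body.append(line)
    if language is not None:
        blocks.append((language, "\n".join(body)))
    return blocks


source = "<LANGUAGE JCL>\n//step10\n<LANGUAGE kshBash>\nlst=1\n"
print(split_on_switch(source))  # [('JCL', '//step10'), ('kshBash', 'lst=1')]
```

With this variant there is only one sequence for every language to tolerate, at the cost of losing any way to return to the calling parser.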

IL Maybe I'm just lazy and need to read more (definitely possible), but how does this resolve issues like how primitives are handled? Wouldn't they have to function at the lowest common denominator (aka our wonderful "everything is a string" philosophy)?
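
To illustrate what a lowest common denominator could mean in practice, here is a Python sketch in which only strings cross the language boundary and each side reads them as it needs. as_list and as_int are hypothetical helpers, and splitting on whitespace is a deliberate simplification of Tcl list syntax (no braces or quoting).

```python
def as_list(value):
    """Read a shared string as a whitespace-separated list (simplified Tcl-style list)."""
    return value.split()


def as_int(value):
    """Read a shared string as an integer; raises ValueError if it is not one."""
    return int(value)


shared = "10 20 30"  # the only thing that crosses the language boundary
items = as_list(shared)  # one reader sees three elements
total = sum(as_int(item) for item in items)  # another reader sees numbers
print(items, total)  # ['10', '20', '30'] 60
```

This works for values that have a natural string form; it says nothing about references, mutable objects or Python's None, which is where the real difficulty lies.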

DKF: I'm reminded somewhat of some efforts in the past to get Tcl going on Parrot, which we (the TCT) were in on at an early stage. We dropped it though as their value model was significantly different from the Tcl one, and their code was deeply ugly too (it might have been neat for Perl/Python implementations, but it was so massively short of Tcl Engineering Manual requirements that we decided that the effort it would take to just persuade people of the fixes we required would be more than we wished to spend, given that we could work on the core instead.)

Swapping design notes and cool tricks with other languages works great. Swapping implementations often charges off into the long grass, sucking massive amounts of development effort with it. And sometimes things won't carry across because they depend on fundamental trade-offs in the language.

... which, if you think about it, explains a lot. One reason multiple languages coexist and thrive is precisely because they offer a mix of trade-offs which works particularly well for specific uses and mindsets. The other reason multiple languages are here to stay is the investment people have in mastering them (which takes years). -jcw