Generating a generic platform name

This page was prompted by a discussion started by Bob Techentin and Joe English on comp.lang.tcl [L1 ]. -jcw

The problem: how to generate a simple name that can be used as a directory name when storing multiple shared library builds for different platforms next to each other. This has uses in Critcl and in Starkits.

The problem is hard, because it's not well specified. One could do "[join [array get tcl_platform] { }]", and end up with a name that is definitely unique, but huge. The trouble is that for shared libraries, this approach is in fact wrong - there are many details in tcl_platform that differ between machines, yet are irrelevant w.r.t. binary compatibility. An example from Joe English's post: IRIX has numerous variations with intricate compatibility issues, while Linux does not really care - all shared libs will load anywhere regardless of kernel version.

So, let's try to come up with some logic, and extend / perfect it along the way. Here's a start:

    proc platform {} {
        global tcl_platform
        set plat [lindex $tcl_platform(os) 0]
        set mach $tcl_platform(machine)
        switch -glob -- $mach {
            sun4* { set mach sparc }
            intel -
            i*86* { set mach x86 }
            "Power Macintosh" { set mach ppc }
        }
        set mach [regsub -all {[ /]} $mach "-"]
        return "$plat-$mach"
    }

Some output:

   Linux-x86
   Darwin-ppc   (Mac OSX)
   Windows-x86
   SunOS-sparc

The goal is to come up with a unique identifier that can be used as a directory name (no slashes or colons, please), for each context that requires a different binary. For contexts that are compatible, the goal is to end up with the same identifier each time.
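
To make the "usable as a directory name" requirement concrete, here is a minimal sketch of how such an identifier might be turned into a path. The proc name libDir, the root argument and the "lib" sub-directory are assumptions made up for this illustration; they are not part of Critcl or Starkits.

```tcl
# Hypothetical helper (not from Critcl or Starkits): map a platform
# identifier to the directory holding that platform's binaries.
# The <root>/lib/<id> layout is an assumption for illustration only.
proc libDir {root id} {
    # Refuse identifiers that cannot serve as a single directory name:
    # empty strings, or anything containing a slash, backslash or colon.
    if {$id eq "" || [regexp {[/\\:]} $id]} {
        error "unsafe platform identifier: \"$id\""
    }
    return [file join $root lib $id]
}
```

With the [platform] command from this page, a caller could then write something like [libDir $root [platform]] and look for the shared library inside that directory.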

Failures are not necessarily show-stoppers. If a specific shared library needs more differentiation than the identifier provides, one can always add ad-hoc logic for it (example: a high-performance binary that is built differently for Pentium vs. Athlon CPUs). If it turns out that two different identifiers are produced for what is determined to be compatible after all (IRIX comes to mind), then one can either store copies, or post-process the output from the [platform] command.
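
One possible way to post-process the output is a small alias table. This is only a sketch: the names platformAlias and canonicalPlatform are invented here, and the table entry is a made-up placeholder, not a statement about which real platforms are compatible.

```tcl
# Hypothetical alias table. The entry below is a made-up placeholder:
# the identifier on the left is stored under the one on the right.
array set platformAlias {
    ExampleOS64-cpu ExampleOS-cpu
}

# Return the identifier under which binaries for 'id' are stored.
# Identifiers without an alias are returned unchanged.
proc canonicalPlatform {id} {
    global platformAlias
    if {[info exists platformAlias($id)]} {
        return $platformAlias($id)
    }
    return $id
}
```

The table keeps the compatibility decisions in one place, so [platform] itself does not have to change when two identifiers turn out to be interchangeable.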


JE It may be a better idea for the outer switch to dispatch on $tcl_platform(os) instead of $tcl_platform(machine); the meaning of the other fields depends more heavily on the former (which comes from uname -s) than the latter.

We should also check $tcl_platform(platform) (unix vs. windows vs. macos vs. ...). Also need to consider threaded vs. non-threaded, 32-bit vs. 64-bit, and (at least on Windows) debug vs. non-debug. Basically any build variant that affects binary compatibility.
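
As a rough sketch of how the build variants could be folded into the name, the proc below derives a suffix from a tcl_platform-style array. The name variantSuffix and the suffix format are invented for this example. It assumes the wordSize element gives the word size in bytes, and that the threaded and debug elements are only present in threaded and debug builds respectively, which is how the Tcl 8.x documentation describes them. The fields are passed in "array get" form so the proc can be tried with made-up values.

```tcl
# Sketch only: build a variant suffix from tcl_platform-style fields,
# given as a key/value list (the format produced by "array get").
proc variantSuffix {fields} {
    array set p $fields
    set parts {}
    # wordSize is in bytes: 4 -> 32bit, 8 -> 64bit
    if {[info exists p(wordSize)]} {
        lappend parts "[expr {$p(wordSize) * 8}]bit"
    }
    # Assumption: 'threaded' exists (and is true) only in threaded builds
    if {[info exists p(threaded)] && $p(threaded)} {
        lappend parts threaded
    }
    # Assumption: 'debug' exists (and is true) only in debug builds
    if {[info exists p(debug)] && $p(debug)} {
        lappend parts debug
    }
    return [join $parts -]
}
```

On a live interpreter this would be called as [variantSuffix [array get tcl_platform]]; whether the result belongs in the directory name depends on whether the variant really affects binary compatibility for the library in question.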

Anyway, this is what I suggest for IRIX (following LV's suggestion below, it returns a GNU-style "canonical system"):

    switch $tcl_platform(os) {
        ...
        IRIX -
        IRIX64 {  return "mips-sgi-irix$tcl_platform(osVersion)" }
        ...
    }

(IRIX64 only indicates that the machine is 'capable' of 64-bit. Since by default Tcl uses the n32 ABI even on IRIX64 machines, IRIX64 and IRIX are handled the same way. This is what GNU config.guess does.)
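
Putting the two ideas together, an outer switch on the os field with the IRIX branch above might look roughly like this. It is a sketch, not an agreed design: the proc name platformByOs is invented, the fields are passed in "array get" form so it can be tried without access to the machine in question, and the default branch simply reuses the machine mapping from the top of this page.

```tcl
# Sketch: dispatch on the os field first, fall back to the
# machine-based naming from this page for everything else.
proc platformByOs {fields} {
    array set p $fields
    switch -- $p(os) {
        IRIX -
        IRIX64 {
            # GNU-style name; IRIX and IRIX64 are treated alike
            return "mips-sgi-irix$p(osVersion)"
        }
        default {
            set mach $p(machine)
            switch -glob -- $mach {
                sun4* { set mach sparc }
                intel -
                i*86* { set mach x86 }
                "Power Macintosh" { set mach ppc }
            }
            # Make the machine part safe for use in a directory name
            set mach [regsub -all {[ /]} $mach "-"]
            return "[lindex $p(os) 0]-$mach"
        }
    }
}
```

On a live interpreter the call would be [platformByOs [array get tcl_platform]]. Note that this mixes two naming styles (GNU-style for IRIX, os-machine for the rest), which a real implementation would probably want to settle one way or the other.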


Anyway, the idea for this page is to tweak the above [platform] command, so we can use it as "easy shared-lib discriminator" on more and more platforms. Here's a modified version with Bob's HPUX changes added:

    proc platform {} {
        global tcl_platform
        set plat [lindex $tcl_platform(os) 0]
        set mach $tcl_platform(machine)
        switch -glob -- $mach {
            sun4* { set mach sparc }
            intel -
            i*86* { set mach x86 }
            "Power Macintosh" { set mach ppc }
            9000* { set mach 9000 }
        }
        set mach [regsub -all {[ /]} $mach "-"]
        return "$plat-$mach"
    }

Please feel free to alter the above (carefully, so it doesn't break). The history of this page (warning: it only gets updated once per day) can be used to see all previous versions if you need to go back and check things out:

    http://mini.net/tclhist/8522~   (annotated per line)
    http://mini.net/tclhist/8522*   (list of all versions)

LV Any reason to not pattern the names after autoconf's config.guess ? For instance, platform, above, says on my sparc solaris 2.6 machine:

 SunOS-sparc

while config.guess says

 sparc-sun-solaris2.6

Without the OS version, you run into basic runtime incompatibilities (between major releases of Linux and of Solaris at least - maybe other platforms as well). I'd suggest following config.guess when possible.

stevel Yes - there's a reason - this was done to facilitate loading platform specific shared libraries when, in most instances, the OS version is irrelevant (and can be obtained if you need it).

JE But on platforms where the OS version 'is' relevant (SunOS and IRIX come to mind), it should be included in the identifier.


Another reason for generating generic platform names is to make tcl_platform(machine) consistent across platforms. For example, Power PC should be "ppc" on both AIX and MacOSX. At the moment on MacOSX it is "Power Macintosh" whereas on Linux it is "ppc". Note that CriTcl tries to do this as far as is possible (e.g. ppc, x86, sparc, etc) - stevel


Another issue we need to address is that certain platforms encode the unique machine ID in tcl_platform(machine) - and this needs to be extracted if the value of tcl_platform(machine) is going to be useful.

For example, on Linux/x86 you get something like

    tcl_platform(byteOrder) = littleEndian
    tcl_platform(machine)   = i686
    tcl_platform(os)        = Linux
    tcl_platform(osVersion) = 2.4.19-64GB-SMP
    tcl_platform(platform)  = unix
    tcl_platform(user)      = steve
    tcl_platform(wordSize)  = 4

But on AIX, for example, I get

    tcl_platform(byteOrder) = bigEndian
    tcl_platform(machine)   = 0042065B4C00
    tcl_platform(os)        = AIX
    tcl_platform(osVersion) = 5.1
    tcl_platform(platform)  = unix
    tcl_platform(user)      = steve
    tcl_platform(wordSize)  = 4

In particular, note the machine field - this is the same as is returned by "uname -m". The uname man page says

"The machine ID number contains 12 characters in the following digit format: xxyyyyyymmss. The xx positions indicate the system and is always 00. The yyyyyy positions contain the unique ID number for the entire system. The mm position represents the model ID. The ss position is the submodel number and is always 00. The model ID describes the ID of the CPU Planar, not the model of the System as a whole."

Note the "unique ID number for the entire system" :-(
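
To illustrate the layout the man page describes, here is a small sketch that splits such an ID into its fields. The proc name parseAixMachineId and the field names in the result are invented for this example, and the check for 12 alphanumeric characters is an assumption based on the one sample value above.

```tcl
# Sketch: split an AIX "uname -m" value (format xxyyyyyymmss) into
# its fields. Returns a key/value list; field names are made up here.
proc parseAixMachineId {id} {
    # Assumption: the ID is exactly 12 letters or digits
    if {![regexp {^[0-9A-Za-z]{12}$} $id]} {
        error "not a 12-character machine ID: \"$id\""
    }
    return [list \
        system   [string range $id 0 1] \
        unit     [string range $id 2 7] \
        model    [string range $id 8 9] \
        submodel [string range $id 10 11]]
}
```

For the value shown above, 0042065B4C00, this yields system 00, unit 42065B, model 4C and submodel 00. None of the fields names a CPU architecture, which is why the value is of no use for building a platform name.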

I suspect what we need in the tcl_platform(machine) entry is the equivalent of the "uname -p", which returns "powerpc" on the machine I have access to.

According to the man page uname -p "Displays the architecture of the system processor" for both powerpc and itanium, so I guess it is safe to use the equivalent if it is available through the utsname.h interface.

LV: Hopefully Tcl doesn't actually use uname -p though - turns out that ksh's interpretation of uname -p is broken on SPARC Solaris x86 and Linux x86 machines - there's a builtin interpretation of either uname or /bin/uname to ksh values compiled in. On Linux, this means that ksh says that /bin/uname -p is i386 while /bin/sh for instance says that the same command is i686!!!

Anyway, when building cross-platform packages I use some code to recognise and deal with this. It's not ideal, but appears to be sufficient ... just add it near the end of platform(), before the return:

    if {$plat eq "AIX"} { set mach ppc }
    if {$plat eq "HP-UX"} { set mach hppa }

Note the last one conflicts with Bob's suggestion - so I haven't added it to platform()

stevel


See also PkgIndexUtil.tcl - determining a platform name


PWQ 13 Feb 06, The obvious question that no one has asked is: why is the above such a mess, and who determines what string is returned for what platform?

The big problem with all the above is that you cannot test your code unless you have the platform in question, as you won't know what string to test for or, more importantly, what is different about this platform that requires a test.

All the above is to make up for poor decision making that has not been corrected.


LV In Tcl 8.5, a new package called platform provides several commands, some of which (such as platform::generic and platform::identify) return a string in a specific format that identifies a platform.
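
A minimal sketch of how that package can be used, assuming it is installed alongside the interpreter (it ships with Tcl 8.5 and later). The variable names are just for illustration, and the exact strings returned depend on the machine the script runs on.

```tcl
package require platform

# Generic identifier for the running system
set generic [platform::generic]

# More specific identifier for the running system
set exact [platform::identify]

# Identifiers of platforms whose binaries should be usable here
set compatible [platform::patterns $exact]

puts "generic:    $generic"
puts "exact:      $exact"
puts "compatible: $compatible"
```

The list from platform::patterns is what a package loader would walk through when looking for a directory of binaries that can be loaded on the current machine.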