Roman numerals

See math::roman for a package which handles Roman numerals. It is a part of the math module of Tcllib.

This page used to be called "Roman numbers". Larry Smith Sorry to interject a pedantic nit here, but these are properly called roman numerals not roman "numbers". A "number" is a pure mathematical concept representing a unique value and is independent of its representation. A "numeral" is a glyph intended to communicate a number. Hence roman "numerals" are a way of expressing numbers. Other ways are base 8 arabic, ascii, or even cuneiform. I have edited the text below to correct this, but the page name should be corrected as well.

Roman numerals are an additive (and partially subtractive) system with the following letter values:

 I=1 V=5 X=10 L=50 C=100 D=500 M=1000; MCMXCIX = 1999

Here are some Tcl routines for dealing with Roman numerals - enjoy!

Sorting roman numerals

I,V,X already come in the right order; for the others we have to introduce temporary collation transformations, which we'll undo right after sorting:

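 proc roman:sort list {
    # Temporarily rewrite the numerals so that plain ASCII order matches numeric order
    set map {IX VIIII L Y XC YXXXX C Z D {\^} ZM {\^ZZZZ} M _}
    foreach {from to} $map {
        regsub -all $from $list $to list
    }
    set list [lsort $list]
    # Undo the rewrites in reverse order (lreverse needs Tcl 8.5 or later)
    foreach {from to} [lreverse $map] {
        regsub -all $from $list $to list
    }
    set list
 }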
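
The same collation idea can also be sketched without regsub: compute a sort key per numeral with string map, sort on the key, and return the untouched originals. The names roman:sortkey and roman:sortByKey are made up for this illustration, and the key map has only been checked against the sample list below, so treat it as a sketch rather than a drop-in replacement.

```tcl
# Hypothetical helper: turn one numeral into a key whose ASCII order
# follows numeric order (Y stands in for L, Z for C, ^ for D, _ for M).
proc roman:sortkey {numeral} {
    string map {IX VIIII XC YXXXX CM ^ZZZZ L Y C Z D ^ M _} $numeral
}

# Hypothetical helper: decorate each numeral with its key, sort, undecorate.
proc roman:sortByKey {numerals} {
    set decorated {}
    foreach n $numerals {
        lappend decorated [list [roman:sortkey $n] $n]
    }
    set result {}
    foreach pair [lsort -index 0 $decorated] {
        lappend result [lindex $pair 1]
    }
    return $result
}

puts [roman:sortByKey {XC IX L MCM D V XL CD}]
# prints: V IX XL L XC CD D MCM
```

Because the originals are carried along next to their keys, nothing has to be translated back after sorting.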

Roman numerals from integer

 proc roman:numeral {i} {
        set res ""
        # Greedily take the largest value that still fits
        foreach {value roman} {
        1000 M 900 CM 500 D 400 CD 100 C 90 XC 50 L 40 XL 10 X 9 IX 5 V 4 IV 1 I} {
                while {$i>=$value} {
                        append res $roman
                        incr i -$value
                }
        }
        set res
 }

Roman numerals parsed into integer

 proc roman:get {s} {
        array set r_v {M 1000 D 500 C 100 L 50 X 10 V 5 I 1}
        set last 99999; set res 0
        foreach i [split [string toupper $s] ""] {
                if {[catch {set val $r_v($i)}]} {error "un-Roman digit $i in $s"}
                incr res $val
                # A larger letter after a smaller one: the smaller one was subtractive.
                # It has already been added once, so take it off twice.
                if {$val>$last} {incr res [expr {-2*$last}]}
                set last $val
        }
        set res
 } ;#RS

Roman expressions

With the two above, it's easy to write

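 proc roman:expr args {
        # Pad every non-Roman character with spaces so operators become list elements
        regsub -all {[^IVXLCDM]} $args { & } args
        foreach i $args {
                # Operators are not valid numerals; leave them unchanged
                catch {set i [roman:get $i]}
                lappend res $i
        }
        roman:numeral [expr $res]
 } ;#RS
 % roman:expr XXIII*VI
 CXXXVIII
 % roman:expr XXIII+VI
 XXIX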

Validate Roman numeral

 proc roman:validate {s} {
    set re {(?i)^\s*M{0,4}(?:CM|CD|D?C{0,3})(?:XC|XL|L?X{0,3})(?:IX|IV|V?I{0,3})\s*$}
    return [regexp $re $s]
 } ;# KPV
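
A quick way to see what this pattern accepts is to run it over a few strings. This is only a sketch: the regular expression is copied from roman:validate above, and the sample inputs are arbitrary.

```tcl
# Same pattern as in roman:validate, repeated so the snippet runs on its own.
set re {(?i)^\s*M{0,4}(?:CM|CD|D?C{0,3})(?:XC|XL|L?X{0,3})(?:IX|IV|V?I{0,3})\s*$}

foreach s {MCMXCIX mcmxcix MIM IIII {}} {
    puts "[list $s] -> [regexp $re $s]"
}
# prints:
# MCMXCIX -> 1
# mcmxcix -> 1
# MIM -> 0
# IIII -> 0
# {} -> 1
```

Every part of the pattern is optional, so the empty string also passes; callers who care may want to reject it separately.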

Note that, as entertaining as roman:expr is, real Romans would have required an explanation. In fact, even among moderns, our conventional symbols for arithmetic operations were only invented at the end of the fifteenth century [link?]. Johann Widmann published a book on mercantile arithmetic in 1489 which used '+' and '-' to indicate surplus and deficit. In 1514, Vander Hoecke first used the symbols for operations in algebraic expressions. Some time around 1500, then, the burgeoning European market economy and Renaissance "scientific" culture had yielded general recognition of formulae such as "3 + 6 = 9". Before this time, propositions were expressed in terms of human-language words.

If an operation was implied by writing things next to each other, then that through most of history was obviously addition. (This makes sense not only for the Roman numerals, but also for e.g. Greek and Egyptian numerals.) However in modern mathematical notation (which in this respect was canonized mostly by Descartes), the implied operation is instead multiplication.

PT 13-May-2003: I just noticed that Roman numerals have a slot in the Unicode table - beginning at \u2160 for Roman Numeral One and going up to \u2182 Roman Numeral Ten Thousand. Includes some that I didn't know. :)

  # $numeral is a placeholder for the Roman numeral string to convert
  string map {
    M \u216f C \u216d D \u216e L \u216c
    VIII \u2167 VII \u2166 VI \u2165 IV \u2163 V \u2164
    XII  \u216b XI  \u216a IX \u2168 X  \u2169
    III  \u2162 II  \u2161 I  \u2160
  } $numeral

escargo 21 May 2004 - Anybody know why MIM isn't 1999? RS: The rule appears to be that subtraction only applies inside the same "order of magnitude", so 1999 must be MCM(1900)XC(90)IX = MCMXCIX. BR: As I learned it, MIM is a valid representation of 1999, and I have actually seen it, too.
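
The "same order of magnitude" rule that RS describes can be sketched as a per-digit table lookup: each decimal digit is translated on its own, so a subtraction never crosses from units into thousands. The name roman:perDigit and the 1..3999 range limit are assumptions made for this illustration; it is not part of math::roman.

```tcl
# Hypothetical helper: convert each decimal digit separately.
proc roman:perDigit {n} {
    if {$n < 1 || $n > 3999} {
        error "roman:perDigit only handles 1..3999"
    }
    # One row per decimal order: units, tens, hundreds, thousands.
    set table {
        {"" I II III IV V VI VII VIII IX}
        {"" X XX XXX XL L LX LXX LXXX XC}
        {"" C CC CCC CD D DC DCC DCCC CM}
        {"" M MM MMM}
    }
    set res ""
    set order 0
    while {$n > 0} {
        set digit [expr {$n % 10}]
        # Prepend, because digits are consumed from the units upward.
        set res [lindex $table $order $digit]$res
        set n [expr {$n / 10}]
        incr order
    }
    return $res
}

puts [roman:perDigit 1999]   ;# prints MCMXCIX
```

With this scheme MIM can never be produced, because the units row only ever contributes I, V and X.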

KPV According to Conway and Guy's The Book of Numbers, the subtraction rule (IV for 4, etc.) was rarely used until medieval times. If that was true, then there were probably some very long Roman numerals: MDCCCCLXXXXVIIII.
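
For comparison, a purely additive conversion only needs the seven letter values and no subtractive pairs. This is a minimal sketch of that notation; the name roman:additive is invented here.

```tcl
# Hypothetical helper: additive-only numerals, no IV, IX, XL, XC, CD or CM.
proc roman:additive {n} {
    set res ""
    foreach {value letter} {1000 M 500 D 100 C 50 L 10 X 5 V 1 I} {
        while {$n >= $value} {
            append res $letter
            incr n -$value
        }
    }
    return $res
}

puts [roman:additive 1999]   ;# prints MDCCCCLXXXXVIIII
```

The result for 1999 has 16 letters, against 7 for MCMXCIX.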

AM A practical occurrence of this fact is that the faces of clocks that use Roman numerals often have IIII instead of IV ...

CLN As I've heard it, "IV" was avoided because it was close enough to the prefix of the god Jupiter's name (JU was rendered IV) that it was considered blasphemous and/or unlucky so IIII was used instead. If the subtractive rule was to be avoided, 9 would have been VIIII but it's on clocks as IX.

KBK It's also attributed to IVLIVS CÆSAR rather than IVPITER - but in practice, it's simply that a clock (or sundial) looks unbalanced with IV opposite VIII - the other pairs of numbers are not as different in length (I vs XI, II vs X, III vs IX, V vs VII). IIII vs VIII looks more pleasant. (Were it simply that the subtraction rule were not used, then clock faces would have VIIII instead of IX.)

DAG There are two other explanations. Since IV could be mistaken for VI by ignorant people working in the fields, it was preferred to use IIII. But the one I prefer is simply that numerals on clock faces were written with molded lead. A single template with one X, one V and five I could be used to produce all numerals, if used four times: VIIIIIX. The last two were reversed to obtain XI and XII.

Shortlist of well-known programs whose name is a Roman numeral: cc, cl, xv, vi, dc, xli, ...

DKF: There are also MX records in DNS. :^)