Version 7 of Erasure Tolerant Codes

Updated 2008-02-11 21:06:20 by jaf

During a Discussion at a LUG meeting a few had the idea to use the free data repositories that Google, Yahoo,gmx etc, provide to store a kind of off site backup.

In our opinion the data had to be somewhat securely stored, that meant that one should be able to access the data even when one or more of the servers were down.

A bit of 'Googling' showed that there are such things like erasure tolerant codes that don't need 100% replication of the data, actually like 'RAID', but on a file by file basis.

The idea used for erasure tolerant codes is quite simple, an example:

  • There are 4 bytes to store
  • The storage can be made on 6 different locations
  • This means 2 erasures are possible without compromising the data integrity
  • Make a vector out of the 4 bytes, multiply a 4(columns)X6(rows) matrix with the vector
  • The result is a vector with 6 elements
  • The matrix is chosen so, that any 4 of its rows result in an invertible matrix
  • Upon loosing 2 of the result bytes, choose the rows that generated the remaining bytes
  • Build a matrix out of the rows and invert it
  • Multiply the inverted matrix with vector made out of the remaining data
  • Et voila your original vector is back.

A big BUT remains there: But you can only get you data back if the multiplication can be made without loss of precision, e.g. no floats.

That's why one uses Galois fields for all the operations needed to encode or to decode the data. In a Galois field the inverse of a number is also from the field, if you start with an integer field, the result of any operation will still be in the field.


Here is my implementation of the above, actually a proof of concept, that it can be done in Tcl and then you get the GUI for free.

In conjunction with fuse I think it would even be possible to implement a Web-Based-Raid


Galois Field Part:

  • Matrix and vector operations
  • Operations using lookup tables
  • Creation of look up tables
  • Basic functions needed until look up tables are initialised
  • Helpers
 ################################################################################
 # vector and matrix operations                                                 #
 ################################################################################
 proc mat_invert_dec {mat} {
 ###############################
 # inverts a matrix in GF(2^n) #
 ###############################
     set rows [llength $mat]
     # make sure each row has same length, insuring we got a square matrix
     foreach row $mat {
         if {[llength $row] != $rows} {return}
       }
     # initialize the inverse with E, inversion like at school:
     # transform mat into E, apply needed operations to inv too, at the 
     # end inv will contain the inverse of mat.
     set inv [mat_unity_dec $rows]
     for {set row 0} {$row < $rows} {incr row} {
         # if the pivot is zero, then seek other row to add until it is nonzero
         if {[lindex [lindex $mat $row] $row] == 0} {
             for {set other $row} {$other < $rows} {incr other} {
                 # don't add this row onto itself, would set the row to zero
                 if {$other == $row} {continue}
                 # get the appropriate element
                 if {[lindex [lindex $mat $other] $row] != 0} {break}
               }
             # at this point either we have a row we can add or no other row
             # is possible. 
             # The latter case makes the matrice non invertible
             if {$other == $rows} {return}
             # a possible row was found
             set new_row [vect_vect_add_dec [lindex $mat $row]\
                                            [lindex $mat $other]]
             set mat [lreplace $mat $row $row $new_row]
             # reflect to E
             set new_inv [vect_vect_add_dec [lindex $inv $row]\
                                            [lindex $inv $other]]
             set inv [lreplace $inv $row $row $new_inv]
           }; # if pivot == 0
         # set the pivot to 1
         set piv_inv [inv_dec [lindex [lindex $mat $row] $row]]
         set new_row [scal_vect_mult_dec $piv_inv [lindex $mat $row]]
         set mat [lreplace $mat $row $row $new_row]
         # reflect to E
         set new_inv [scal_vect_mult_dec $piv_inv [lindex $inv $row]]
         set inv [lreplace $inv $row $row $new_inv]
         # now pivot  is 1
         for {set other 0} {$other < $rows} {incr other} {
             # don't apply to self
             if {$other == $row} {continue}
             # make the entries in the column where the pivot resides zero
             set this_elem [lindex [lindex $mat $other] $row]
             set new_mat [scal_vect_mult_dec $this_elem [lindex $mat $row]]
             set new_mat [vect_vect_add_dec $new_mat [lindex $mat $other]]
             set mat [lreplace $mat $other $other $new_mat]
             # reflect to E
             set new_inv [scal_vect_mult_dec $this_elem [lindex $inv $row]]
             set new_inv [vect_vect_add_dec $new_inv [lindex $inv $other]]
             set inv [lreplace $inv $other $other $new_inv]
           }
       }; # end foreach row of the matrix
     # check that row is not only zeros, matrice would have been not invertible
     foreach row $mat {
         set sorted [lsort -unique $row]
         if {([llength $sorted] == 1) && ([lindex $sorted 0] == 0)} { return }
       }
     return $inv
   }
 proc scal_vect_mult_dec {scal vect} {
 #####################################
 # multiplies a vector with a scalar #
 #####################################
     set result {}
     foreach elem $vect {lappend result [mult_dec $scal $elem] }
     return $result
   }
 proc vect_vect_add_dec {vecta vectb} {
 ###################
 # Adds two vectors#
 ###################
     if {[llength $vecta] != [llength $vectb]} {return}
     set result {}
     foreach elema $vecta elemb $vectb {lappend result [add_dec $elema $elemb]}
     return $result
   }
 proc mat_unity_dec {ord} {
 ####################################
 # makes an (ord x ord) unit matrix #
 ####################################
     for {set row 0} {$row < $ord} {incr row} {
         set row_str [string replace [string repeat 0 $ord] $row $row 1]
         lappend unity [split $row_str ""]
       }
     return $unity
   }
 proc mat_vect_mult_dec {mat vect} {
 #########################
 # matrix vector product #
 #########################
     set retvect {}
     foreach row $mat {
         set prod [vect_vect_dot_dec $row $vect]
         if {$prod eq ""} {return}
         lappend retvect $prod
       }
     return $retvect
   }
 proc vect_vect_dot_dec {vecta vectb} {
 ######################################
 # compute the dot product of vectors #
 ######################################
     if {[llength $vecta] != [llength $vectb]} {return}

     set result 0
     foreach elema $vecta elemb $vectb {
         set prod [mult_dec $elema $elemb]
         if {$prod == ""} {return}
         set result [add_dec $result $prod]
       }
     return $result
   }
 ################################################################################
 # operations using the gal_state look up tables will work after initialisation #
 # with an irreducible polynomial and after choice of a generator element.      #
 # All operations use decimals                                                  #
 ################################################################################
 proc add_dec {a b} {
 #####################
 # performs addition #
 #####################
     return [gal_add_dec $a $b]
   }
 proc mult_dec {a b} {
 #######################################################
 # multiply using the logarithm and exp look up tables #
 #######################################################
     global gal_state

     if {($a == 0)||($b == 0)} {return 0}
     set loga [log $a]
     set logb [log $b]
     if {($loga == "")||($logb == "")} {return}
     if {$a == 1} {return $b}
     if {$b == 1} {return $a}
     set logres [expr ($loga+$logb)%$gal_state(max_elem)]
     return [exp $logres]
   }
 proc inv_dec {num} {
 ###############################
 # returns  the inverse of num #
 ###############################
     global gal_state

     if {![info exists gal_state]} {return}
     return [lindex $gal_state(inv)   $num]
   }
 proc exp {num} {
 ###########################
 # returns $primitive^$num #
 ###########################
     global gal_state

     if {![info exists gal_state]} {return}
     return [lindex $gal_state(exp) $num]
   }
 proc log {num} {
 ################################
 # returns log($num)|$primitive #
 ################################
     global gal_state

     if {![info exists gal_state]} {return}
     if {$num == 0} {return}
     return [lindex $gal_state(log)   $num]
   }



 ################################################################################
 # Creation of the look up tables to speed up operations                        #
 ################################################################################
 proc init_gal_dec {poly primitive} {
 #########################################################################
 # create logarithm table in GF(2^ord), ord is the order of the polynom  #
 # create a exponetial table in GF(2^ord)                                #
 # records the polynomial used for the field                             #
 # records the primitive element                                         #
 # records the maximum of the field                                      #
 #########################################################################
     global gal_state
     catch {unset gal_state}

     if {![gal_is_primitive_elem_dec $primitive $poly]} {return 0}

     set num_ent [get_two_power_dec $poly]
     set gal_state(max_elem) [expr {$num_ent-1}]

     set num_exps [expr $num_ent - 1]

     set gal_state(poly) $poly
     set gal_state(primitive) $primitive
     # make exponential table
     set gal_state(exp) [list 1 $primitive]
     incr num_exps -1
     while {$num_exps} {
         lappend gal_state(exp) \
           [gal_mult_dec $poly [lindex $gal_state(exp) end] $primitive]
           incr num_exps -1
       }
     # make logarithm table
     lappend gal_state(log) {}
     set num_logs [expr $num_ent - 1]
     for {set num 1} {$num <= $num_logs} {incr num} {
         lappend gal_state(log) [lsearch -exact $gal_state(exp) $num]
       }
     # make inverse table  
     set log1 [log 1]
     set gal_state(inv)  [list {} "1"]
     for {set num 2} {$num <= $num_logs} {incr num} {
         set logn [log $num]
         lappend gal_state(inv) [exp [expr {$gal_state(max_elem) - $logn}]]
       }
     return 1
   }
 ################################################################################
 # Basic operations in GF(2^n), these are to be used in order to create the     #
 # look up tables for exp,log & inv. The look up tables are supposed to speed   #
 # up calculations                                                              #
 ################################################################################
 proc gal_add_dec {a b} {
 ##################################
 # performs addition in GF(2^n)   #
 # Addition is done bitwise mod 2 #
 ##################################
     if {![string is integer $a]} {return ""}
     if {![string is integer $b]} {return ""}
     return [expr {$a^$b}]
   }
 proc gal_poly_mult_dec {a b} {
 ##############################################
 # performs polynom multiplication in GF(2^n) #
 # Amounts to long multiplication with the    #
 # addition being done bitwise mod 2          #
 ##############################################
     set multiplicator $a
     set multiplicand  $b
     if {$a < $b} {
         set multiplicator $b
         set multiplicand  $a
       }
     # peasants multiplication
     set result 0
     while {$multiplicator != 0} {
         if {[expr {$multiplicator&1}]} {
             set result [gal_add_dec $result $multiplicand]
           }
         set multiplicand  [expr {$multiplicand  << 1}]
         set multiplicator [expr {$multiplicator >> 1}]
       } 
     set result
   }
 proc gal_remainder_dec {a poly} {
 #######################
 # performs a mod poly #
 #######################
     set poly_power [get_two_power_dec $poly]
     set a_power    [get_two_power_dec $a]
     while {$a_power >= $poly_power} {
         set pow_two [get_two_power_dec $a]
         set trial $poly
         while {$trial < $pow_two} {set trial [expr {$trial << 1}]}
         set a [gal_add_dec $a $trial]
         set a_power [get_two_power_dec $a]
       }
     return $a
   }
 proc gal_mult_dec {poly a b} {
 #####################################################
 # performs multiplication of two numbers in gf(2^n) #
 # a x b = polymult(a,b) mod poly                    #
 #####################################################
     return [gal_remainder_dec [gal_poly_mult_dec $a $b] $poly]
   }
 proc gal_find_gen_poly_dec {ord} {
 ##########################################################
 # finds generator polys in GF(2^n), generators are prime #
 # polynoms e.g. irreducible in GF(2^n)                   #
 ##########################################################
     set max_find   [expr int(pow(2,$ord+1))]
     set start_find [expr int(pow(2,$ord))]
     set poly_list {}
     set ret_list {}
     # init with  3
     lappend poly_list 3
     set counter 4
     while {$counter < $max_find} {
         # only odd numbers
         if {[expr {$counter&1}]} {
             set prime 1
             foreach elem $poly_list {
                 if {[gal_remainder_dec $counter $elem] == 0} {
                     set prime 0
                     break
                   }
               }
             if {$prime} {
                 lappend poly_list $counter
                 if {$counter >= $start_find} {
                     lappend ret_list $counter
                   }
               }
           }
         incr counter  
       }
     set ret_list  
   }
 proc gal_is_primitive_elem_dec {num poly} {
 ############################################################
 # checks whether num is a primitive element in the GF(2^n) #
 # defined by poly. Aprimitive element generates all field  #
 # values with its powers (excluding 0)                     #
 ############################################################
     set max_num [get_two_power_dec $poly]
     if {$num >= $max_num} {return 0}
     if {$num == 1} {return 0}
     incr max_num -1
     if {$max_num <= 0} {return 0}
     set power_list {}
     set power $num
     lappend power_list $power
     while {$power != 1} {   
         set power [gal_mult_dec $poly $power $num]
         lappend power_list $power
       }
     if {[llength [lsort -integer -unique $power_list]] == $max_num} {return 1}
     return 0

   }
 proc gal_is_primitive_poly_dec {poly} {
 ######################################################################
 # a polynom is primitive in GF(2^n)|poly if 2 is a primitive element #
 ######################################################################
     return [gal_is_primitive_elem_dec 2 $poly]
   }
 proc gal_get_generators_dec {poly} {
 #####################################################
 # finds all generating elements of the GF(2^n)|poly #
 #####################################################
     set max_num [get_two_power_dec $poly]
     incr max_num -1
     if {$max_num <= 0} {return}
     set gen_list {}
     for {set num 2} {$num < $max_num} {incr num} {
         if {[gal_is_primitive_elem_dec $num $poly]} {
             lappend gen_list $num
           }
       }
     return $gen_list  
   }
 ################################################################################
 # HELPER                                                                       #
 ################################################################################
 proc get_two_power_dec {num} {
 ##################################
 # find power of 2 lower than num #
 ##################################
     set result $num
     set shifted $num
     while {$shifted} {
         set shifted [expr {$shifted >> 1}]
         set result [expr {$result | $shifted}]
       }
     return [expr {$result ^ ($result >>1)}]
   }



enter categories here