'''[http://www.skyfree.org/linux/references/ELF_Format.pdf%|%ELF]''', or '''Executable and Linkable Format''', widely used on [UNIX] systems, is a platform-independent binary [data format%|%format] for object files, libraries and executables.

** See Also **

   [Object Dive]:   Extracts symbols from objects, producing a graph of relationships.

[http://repos.modelrealization.com/cgi-bin/fossil/mrtools/wiki?name=ELF+decode%|%elfdecode%|%] is a package to read an ELF file and query its components. It supports 32-bit and 64-bit ELF files.

[https://github.com/jbroll/riscv-asm/tree/main/elf%|%ELF reader in pure Tcl%|%] is a rewrite of [GAM]'s elfdecode.

** Reading an ELF file **

[AMG]: Here's code to dump the contents of an ELF file, by section. At present, it only supports 32-bit little-endian ELF, since I don't have any other kind of ELF file on hand to check it against. It uses code adapted from [Dump a file in hex and ASCII]. There are many section attributes and other headers it could print, but doesn't; it's easy to add support for what you need. It might not work on stripped binaries.

Reference: [http://www.muppetlabs.com/~breadbox/software/ELF.txt]

======
#!/usr/bin/env tclsh

if {[llength $argv] == 0 || [llength $argv] > 2} {
    puts stderr "Usage: [file tail $argv0] FILENAME ?PATTERN?"
    puts stderr "FILENAME: Name of a 32-bit LE ELF file to dump"
    puts stderr "PATTERN: Glob-style section name match pattern"
    puts stderr "All section names are printed if PATTERN is omitted"
    exit 1
}

proc hex {data} {
    set result ""
    for {set i 0} {$i < [string length $data]} {incr i 16} {
        set row [string range $data $i [expr {$i + 15}]]
        binary scan $row H* hex
        set hex [regsub -all {(.{4})} [format %-32s $hex] {\1 }]
        set row [regsub -all {[^[:print:]]} $row .]
        append result [format "%08x: %s %-16s\n" $i $hex $row]
    }
    string range $result 0 end-1
}

proc unsigned {bits args} {
    foreach varname $args {
        upvar 1 $varname var
        set var [expr {$var & ((1 << $bits) - 1)}]
    }
}

proc sections {chan} {
    seek $chan 32
    binary scan [read $chan 4] i shoff
    unsigned 32 shoff

    seek $chan 46
    binary scan [read $chan 12] sss shentsize shnum shstrndx
    unsigned 16 shentsize shnum shstrndx

    seek $chan [expr {$shoff + 16 + $shstrndx * $shentsize}]
    binary scan [read $chan 8] ii strtaboff strtabsize
    unsigned 32 strtaboff strtabsize

    seek $chan $strtaboff
    set strtab [read $chan $strtabsize]

    seek $chan $shoff
    set result {}
    for {set i 0} {$i < $shnum} {incr i} {
        binary scan [read $chan $shentsize] ix12ii name offset size
        unsigned 32 name offset size
        if {[string index $strtab $name] ne "\0"} {
            set end [expr {[string first \0 $strtab $name] - 1}]
            lappend result [string range $strtab $name $end] $offset $size
        }
    }
    return $result
}

proc dumpsections {filename {pattern ""}} {
    set chan [open $filename rb]
    if {[read $chan 7] ne "\177ELF\1\1\1"} {
        error "unsupported format"
    }
    foreach {name offset size} [sections $chan] {
        if {$pattern eq ""} {
            puts [format "%-32s %08x %08x" $name $offset $size]
        } elseif {[string match $pattern $name]} {
            seek $chan $offset
            puts [format "%-32s %08x %08x" $name $offset $size]
            puts [hex [read $chan $size]]
        }
    }
    close $chan
}

dumpsections [lindex $argv 0] [lindex $argv 1]
======

Example:

======none
[andy@toaster|~/dwarf]$ ./dumpsections.tcl test.o
.text 00000034 0000001c
.rel.text 00001688 00000018
.data 00000050 00000000
.bss 00000050 00000000
.debug_abbrev 00000050 000000d9
.debug_info 00000129 00000160
.rel.debug_info 000016a0 000000a8
.debug_line 00000289 0000003a
.rel.debug_line 00001748 00000008
.debug_macinfo 000002c3 00000cd7
.debug_loc 00000f9a 0000002c
.debug_pubnames 00000fc6 0000002b
.rel.debug_pubnames 00001750 00000008
.debug_aranges 00000ff1 00000020
.rel.debug_aranges 00001758 00000010
.debug_str 00001011 0000005c
.comment 0000106d 00000012
.note.GNU-stack 0000107f 00000000
.debug_frame 00001080 0000002c
.rel.debug_frame 00001768 00000010
.shstrtab 000010ac 000000d4
.symtab 00001540 00000130
.strtab 00001670 00000015
[andy@toaster|~/dwarf]$ ./dumpsections.tcl test.o .debug_str
.debug_str 00001011 0000005c
00000000: 756e 7369 676e 6564 2069 6e74 006c 6f6e unsigned int.lon
00000010: 676e 616d 6500 6c6f 6e67 2069 6e74 0061 gname.long int.a
00000020: 7267 7600 6172 6763 006d 6169 6e00 6368 rgv.argc.main.ch
00000030: 6172 006e 6578 7400 2f68 6f6d 652f 616e ar.next./home/an
00000040: 6479 2f64 7761 7266 0074 6573 742e 6300 dy/dwarf.test.c.
00000050: 474e 5520 4320 342e 342e 3400           GNU C 4.4.4.
======

----

[AMG]: Here is code to read and modify the symbol table of a 32-bit little-endian ELF file. The modification performed is to find every common symbol (whose name is listed in an input file) and make it be undefined instead. This is done to work around some assorted strangeness with [Fortran] and the dynamic linker. You may find the code useful as an example of reading and writing a symbol table.

======
#!/usr/bin/env tclsh

package require Tcl 8.4

# Read an 8-bit unsigned value.
proc read8 {chan} {
    binary scan [read $chan 1] c result
    expr {$result & 0xff}
}

# Read a 16-bit unsigned value.
proc read16 {chan} {
    binary scan [read $chan 2] s result
    expr {$result & 0xffff}
}

# Read a 32-bit unsigned value.
proc read32 {chan} {
    binary scan [read $chan 4] i result
    expr {$result & 0xffffffff}
}

# Edit $objfile to change SHN_COMMON symbols listed in $commonfile to SHN_UNDEF.
proc common_to_undef {objfile commonfile} {
    # Read the common symbol list file.
    set chan [open $commonfile]
    set commons [lsort -unique [split [read $chan] \n]]
    if {[lindex $commons 0] eq ""} {
        set commons [lrange $commons 1 end]
    }
    close $chan

    # Open the ELF object file read/write.
    set chan [open $objfile r+]
    fconfigure $chan -translation binary -buffering none

    # Read initial header and confirm type is supported.
    # Bytes 0-3: EI_MAG0-3  "\177ELF"
    # Byte 4:    EI_CLASS   "\1" 32-bit object
    # Byte 5:    EI_DATA    "\1" ELFDATA2LSB two's complement little endian
    # Byte 6:    EI_VERSION "\1" ELF 1
    if {[read $chan 7] ne "\177ELF\1\1\1"} {
        error "unsupported format; must be 32-bit little-endian ELF"
    }

    # Read section header table location, entry size, and entry count.  Also
    # read index of section name string table.
    seek $chan 32
    set shoff [read32 $chan]
    seek $chan 46
    set shentsize [read16 $chan]
    set shnum [read16 $chan]
    set shstrndx [read16 $chan]
    if {$shoff == 0} {
        error "no section header table"
    } elseif {$shentsize != 40} {
        error "bad section header table entry size $shentsize: must be 40"
    } elseif {$shstrndx == 0} {
        error "no section name string table"
    }

    # Read section name string table location and size.
    seek $chan [expr {$shoff + 16 + $shstrndx * 40}]
    set strtaboff [read32 $chan]
    set strtabsize [read32 $chan]

    # Read section name string table.
    seek $chan $strtaboff
    set shstrtab [read $chan $strtabsize]

    # Read section header table.  Search for the SHT_SYMTAB section named
    # ".symtab" and the SHT_STRTAB section named ".strtab".
    seek $chan $shoff
    for {set i 0} {$i < $shnum} {incr i} {
        # Process the section according to the name and type.
        set name [read32 $chan]
        set type [read32 $chan]
        set end [expr {[string first \0 $shstrtab $name] - 1}]
        set name [string range $shstrtab $name $end]
        if {$type == 2 && $name eq ".symtab"} {
            # Found SHT_SYMTAB named ".symtab".
            seek $chan 8 current
            set symtab_offset [read32 $chan]
            set symtab_size [read32 $chan]
            seek $chan 12 current
            set symtab_entsize [read32 $chan]
        } elseif {$type == 3 && $name eq ".strtab"} {
            # Found SHT_STRTAB named ".strtab".
            seek $chan 8 current
            set strtab_offset [read32 $chan]
            set strtab_size [read32 $chan]
            seek $chan 16 current
        } else {
            # Ignore all other sections.
            seek $chan 32 current
        }
    }
    if {![info exists symtab_offset]} {
        error "no symbol table"
    } elseif {![info exists strtab_offset]} {
        error "no symbol string table"
    } elseif {$symtab_entsize != 16} {
        error "bad symbol table entry size $symtab_entsize: must be 16"
    }

    # Read symbol string table.
    seek $chan $strtab_offset
    set strtab [read $chan $strtab_size]

    # Read symbol table.  Search for STB_GLOBAL symbols in SHN_COMMON whose
    # names match those in the common file.  Modify these symbols to have zero
    # value and be in SHN_UNDEF.
    seek $chan $symtab_offset
    for {set i 0} {$i < $symtab_size} {incr i 16} {
        # Get symbol information.
        set name [read32 $chan]
        seek $chan 4 current
        set size [read32 $chan]
        set info [read8 $chan]
        seek $chan 1 current
        set shndx [read16 $chan]
        set end [expr {[string first \0 $strtab $name] - 1}]
        set name [string range $strtab $name $end]

        # Modify selected symbols to be undefined.
        if {$info >> 4 == 1 && $shndx == 0xfff2
         && [lsearch -sorted $commons $name] != -1} {
            seek $chan -12 current
            puts -nonewline $chan \0\0\0\0
            seek $chan 6 current
            puts -nonewline $chan \0\0
        }
    }

    # Close the ELF file.
    close $chan
}

# First two arguments must be object file and common file names.
common_to_undef [lindex $argv 0] [lindex $argv 1]

# vim: set sts=4 sw=4 tw=80 et ft=tcl:
======

The "`-buffering none`" is to work around a bug in Tcl 8.4. Without it, I get this error:

======none
andy@slack:~/elf$ ./common_to_undef.tcl test.o commons
error during seek on "file5": bad address in system call argument
    while executing
"seek $chan 4 current"
    (procedure "common_to_undef" line 94)
    invoked from within
"common_to_undef [lindex $argv 0] [lindex $argv 1]"
    (file "./common_to_undef.tcl" line 140)
======

Tcl 8.6 does not have this bug.

Testing...

======none
andy@slack:~/elf$ cat test.c
#define EXIT_SUCCESS 0

struct foo {
    unsigned x, y[5], *z[2][3][4];
    struct foo *next;
} foo;

union bar {
    struct foo foo;
    long a;
    int b;
} bar;

extern int zzz;
int data = 22;

int main(int argc, const char *const *argv)
{
    foo.x = argc;
    bar.a = foo.x;
    zzz = 999;
    return EXIT_SUCCESS;
}
andy@slack:~/elf$ objdump -t test.o

test.o: file format elf32-i386

SYMBOL TABLE:
00000000 l df *ABS* 00000000 test.c
00000000 l d .text 00000000 .text
00000000 l d .data 00000000 .data
00000000 l d .bss 00000000 .bss
00000000 l d .note.GNU-stack 00000000 .note.GNU-stack
00000000 l d .eh_frame 00000000 .eh_frame
00000000 l d .comment 00000000 .comment
0000007c O *COM* 00000020 foo
0000007c O *COM* 00000020 bar
00000000 g O .data 00000004 data
00000000 g F .text 00000026 main
00000000 *UND* 00000000 zzz

andy@slack:~/elf$ cat commons
foo
andy@slack:~/elf$ ./common_to_undef.tcl test.o commons
andy@slack:~/elf$ objdump -t test.o

test.o: file format elf32-i386

SYMBOL TABLE:
00000000 l df *ABS* 00000000 test.c
00000000 l d .text 00000000 .text
00000000 l d .data 00000000 .data
00000000 l d .bss 00000000 .bss
00000000 l d .note.GNU-stack 00000000 .note.GNU-stack
00000000 l d .eh_frame 00000000 .eh_frame
00000000 l d .comment 00000000 .comment
00000000 O *UND* 0000007c foo
0000007c O *COM* 00000020 bar
00000000 g O .data 00000004 data
00000000 g F .text 00000026 main
00000000 *UND* 00000000 zzz
======

As shown by the second objdump, `foo` has become undefined. Common and undefined symbols work almost the same way; the linker will replace either with a defined symbol if it finds one. But if it doesn't, the behaviors differ when making a shared object (dynamic library). Undefined symbols remain undefined, but common symbols become BSS symbols.

For my application, I require undefined symbols, but Fortran does not have "extern" variables, only commons, so I add this feature using the above script.

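
The distinction between common and undefined symbols comes down to two fields of each 16-byte symbol table entry: the binding in the high nibble of `st_info` and the section index in `st_shndx`. The following is only a sketch of how one such entry could be decoded on its own, assuming the same 32-bit little-endian layout as the script above. The proc name `decode_sym32` and the key names in its result are made up for illustration and are not part of the script.

======
# Hypothetical helper, not part of common_to_undef.tcl: decode one 16-byte
# Elf32_Sym entry (little-endian) into a flat key/value list.
proc decode_sym32 {entry} {
    # Layout: st_name (32), st_value (32), st_size (32),
    #         st_info (8), st_other (8), st_shndx (16).
    if {[binary scan $entry iiiccs name value size info other shndx] != 6} {
        error "symbol entry must be 16 bytes"
    }

    # binary scan yields signed values; mask them to get unsigned ones.
    set name  [expr {$name  & 0xffffffff}]
    set value [expr {$value & 0xffffffff}]
    set size  [expr {$size  & 0xffffffff}]
    set info  [expr {$info  & 0xff}]
    set shndx [expr {$shndx & 0xffff}]

    # High nibble of st_info is the binding, low nibble is the type.
    # Values outside these short tables are reported numerically.
    set binding [lindex {LOCAL GLOBAL WEAK} [expr {$info >> 4}]]
    if {$binding eq ""} {set binding [expr {$info >> 4}]}
    set type [lindex {NOTYPE OBJECT FUNC SECTION FILE} [expr {$info & 0xf}]]
    if {$type eq ""} {set type [expr {$info & 0xf}]}

    # Reserved section indexes: SHN_UNDEF, SHN_ABS, SHN_COMMON.
    if {$shndx == 0} {
        set section UNDEF
    } elseif {$shndx == 0xfff1} {
        set section ABS
    } elseif {$shndx == 0xfff2} {
        set section COMMON
    } else {
        set section $shndx
    }

    list name $name value $value size $size \
        binding $binding type $type section $section
}

# A made-up common symbol: name offset 1, st_value 0x20 (the alignment, for
# a common symbol), size 0x7c, STB_GLOBAL + STT_OBJECT, SHN_COMMON.
puts [decode_sym32 [binary format iiiccs 1 0x20 0x7c 0x11 0 0xfff2]]
# prints: name 1 value 32 size 124 binding GLOBAL type OBJECT section COMMON
======

The result can be loaded with `array set`. The `name` value is still an offset into the symbol string table, so it has to be looked up there the same way the script does.
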
If anyone reading this says I shouldn't want Fortran code to communicate via commons across shared objects, you are right, I shouldn't want this, and I don't. But that doesn't matter; I am constrained by circumstances I can't change.

<<categories>> Glossary | Operating System