loading from memory

Interesting discussions on comp.lang.tcl recorded here, prompted by a list of "helpful projects" for Tcl; see project wish list

> Jeffrey Hobbs <[email protected]> wrote:
>
> >OS internals:
> > * loading DLLs from memory
>
> What's that? It sounds interesting.

It would mean that a shared library could be loaded directly from an in-memory VFS instead of having to write the library to an actual file and invoke the loader on that.

The problem is that, at least on Solaris, libdl provides no facilities at all for doing this. You'd have to code it yourself. Which makes this project a bit involved...

>Hey, I didn't say all the projects were *easy*. :)

My bag of tricks is usually pretty good, but I'll tip my hat to someone who can do this on Windows. LoadLibrary() needs a filename. We can cheat and use global object names, like "\\.\pipe\somepipe". Now how do we present a block of process memory as a shared object? We can do the reverse with MapViewOfFile() ... whoa, this is tricky. Whatever happened to RAM disks anyway?

You'd of course have to do the loading yourself, rather than with LoadLibrary(), but it looks possible - albeit difficult. The last couple of issues of MSDN Magazine have had the details of the file format, and a discussion of how a DLL image gets mapped. An initial implementation wouldn't need to worry about sharing code segments; who cares about swap space anyway?
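As a rough illustration of the first step of "doing the loading yourself", here is a minimal sketch that reads the section table of a PE image held entirely in memory. It relies only on the published PE/COFF layout (the `MZ` header, the `e_lfanew` field at offset 0x3C, the 20-byte COFF header and the 40-byte section headers). The names `Section` and `pe_sections` are made up for this example, and a real loader would still have to map the sections, apply relocations and resolve imports, none of which is attempted here.

```python
import struct
from typing import List, NamedTuple


class Section(NamedTuple):
    """One entry of the PE section table (hypothetical helper type)."""
    name: str
    virtual_size: int
    virtual_address: int  # RVA at which a loader would place the section
    raw_size: int
    raw_offset: int       # offset of the section's bytes within the image


def pe_sections(image: bytes) -> List[Section]:
    """List the sections of a PE image that lives in a bytes object."""
    if image[:2] != b"MZ":
        raise ValueError("missing MZ (DOS) header")
    # e_lfanew, at offset 0x3C of the DOS header, points at the PE signature.
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # The 20-byte COFF file header follows the 4-byte signature.
    (_machine, n_sections, _stamp, _symtab, _nsyms,
     opt_size, _flags) = struct.unpack_from("<HHIIIHH", image, pe_offset + 4)
    # The section table starts right after the optional header.
    table = pe_offset + 4 + 20 + opt_size
    sections = []
    for i in range(n_sections):
        # Only the first 24 bytes of each 40-byte section header are needed.
        name, vsize, vaddr, rsize, roff = struct.unpack_from(
            "<8sIIII", image, table + 40 * i)
        sections.append(Section(name.rstrip(b"\0").decode("ascii", "replace"),
                                vsize, vaddr, rsize, roff))
    return sections
```

Given the bytes of a DLL read out of a VFS, `pe_sections(data)` would tell a hand-written loader which byte ranges to copy to which relative addresses.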

Text of the articles is at:

    http://msdn.microsoft.com/msdnmag/issues/02/02/PE/PE.asp
    http://msdn.microsoft.com/msdnmag/issues/02/03/PE2/PE2.asp

Sounds very interesting. Loading DLLs from FTP servers (or, if we ever get to it, from the Great Unified Tcl/Tk Extension Repository: GUTTER, what a disgusting acronym, but you would find real gems in it ;-)) would be really cool.

This is actually more something for wrapped applications. Sort of pseudostatic linking, but actually dynamic linking.

A merge of this with your idea provides extensibility at runtime ... I.e. package require foo, load the DLL from the network, save it into the wrapped application, use that in the future. ... The security risks are high. IOW, I will not do that without proper authentication. A wrapped Tcl application as the new worm of the century surfing the net, extending itself for any machine architecture it encounters ... Especially if there are services which compile a package on demand for some architecture ...


JCW - for a hint on how the UPX compressor loads a compressed executable, see the Linux strace output below. The key trick seems to be to exec from a file descriptor, using /proc/[pid]/fd/[n] - that in itself might also be applicable for loading shared libs where /proc is supported. For Windows, UPX also may be a source of inspiration - given that it can load both compressed EXEs and compressed DLLs (but I don't know how it does it).

Here's the strace log output showing what happens when an executable uncompresses itself and then launches itself, at least partially memory-based:

  $ strace tclkitsh-linux-x86.upx.bin
  execve("./tclkitsh-linux-x86.upx.bin", ["tclkitsh-linux-x86.upx.bin"], [/* 68 vars */]) = 0
  getpid()                                = 8162
  open("/proc/8162/exe", O_RDONLY)        = 3
  lseek(3, 1608, SEEK_SET)                = 1608
  read(3, "\226k\265\275\2609\25\0\0\0\10\0", 12) = 12
  gettimeofday({1021908344, 154617}, NULL) = 0
  unlink("/tmp/upxCSOH1XOAH5C")           = -1 ENOENT (No such file or directory)
  open("/tmp/upxCSOH1XOAH5C", O_WRONLY|O_CREAT|O_EXCL, 0700) = 4
  ftruncate(4, 1391024)                   = 0
  old_mmap(NULL, 528384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40000000
  read(3, "\0\0\10\0s_\3\0", 8)           = 8
  read(3, "\177?d\371\177ELF\1\0\2\0\3\0\32\340\200\4\10}\233g\267"..., 221043) = 221043
  write(4, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\2\0\3\0\1\0\0\0\340\200"..., 524288) = 524288
  read(3, "\0\0\10\0\212y\3\0", 8)        = 8
  read(3, "]\376\377\377E\374P\215E\370PW\350c\224\377\377\203\304"..., 227722) = 227722
  write(4, "E\374P\215E\370PW\350c\224\377\377\203\304\20\205\300t"..., 524288) = 524288
  read(3, "\2609\5\0{v\1\0", 8)           = 8
  read(3, "\357\366\277\377\213\225\220\356\377\377\213\215\\\n\4"..., 95867) = 95867
  write(4, "\213\225\220\356\377\377\213\215\\\356\377\377\213\4\221"..., 342448) = 342448
  read(3, "\0\0\0\0UPX!", 8)              = 8
  munmap(0x40000000, 528384)              = 0
  close(4)                                = 0
  close(3)                                = 0
  open("/tmp/upxCSOH1XOAH5C", O_RDONLY)   = 3
  access("/proc/8162/fd/3", R_OK|X_OK)    = 0
  unlink("/tmp/upxCSOH1XOAH5C")           = 0
  fcntl(3, F_SETFD, FD_CLOEXEC)           = 0
  execve("/proc/8162/fd/3", ["tclkitsh-linux-x86.upx.bin"], [/* 68 vars */]) = 0
  [...]

DKF: Looking at that, what it appears to be doing is very straightforward. It decompresses its payload to a temporary file (opened in /tmp with permissions that keep other users from touching it), reopens that temporary file for reading, deletes the temporary file (relying on classic Unix filesystem reference-counting semantics) and executes the file descriptor (passing the input argv onwards, presumably).

The only really interesting thing is that you're allowed to execve() a file descriptor. The rest of it is fairly straightforward (the /tmp directory should have the sticky bit set, of course). It has to be done this way, since there's no alternative to execve() for executing a program; the system call itself takes a filename.
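The sequence in the trace (create a private temporary file, write the payload, reopen it read-only, unlink it, then exec through /proc) can be sketched in a few lines. This is only an illustration of the steps, not UPX's actual stub: `stage_unlinked` and `exec_payload` are names invented here, the payload is assumed to be an already-decompressed native executable, and a Linux-style /proc is assumed for the exec step.

```python
import os
import tempfile


def stage_unlinked(payload: bytes, mode: int = 0o700) -> int:
    """Write payload to a private temp file, reopen it read-only,
    unlink the name, and return the surviving descriptor."""
    fd, path = tempfile.mkstemp(prefix="upx")  # created with O_CREAT|O_EXCL
    try:
        os.fchmod(fd, mode)                    # owner-only, executable
        with os.fdopen(fd, "wb") as out:       # closes the writable fd
            out.write(payload)
        # Reopen read-only: a file still open for writing cannot be exec'ed.
        ro_fd = os.open(path, os.O_RDONLY)
    finally:
        os.unlink(path)                        # the open fd keeps the data alive
    return ro_fd


def exec_payload(payload: bytes, argv: list) -> None:
    """Replace the current process with the program held in payload."""
    fd = stage_unlinked(payload)
    # Python opens descriptors close-on-exec by default, which matches the
    # fcntl(3, F_SETFD, FD_CLOEXEC) call visible in the trace.
    os.execv(f"/proc/self/fd/{fd}", argv)      # does not return on success
```

Note that the data still touches the filesystem briefly, so this is "memory-based" only in the sense that no named file remains once the program is running.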

The problem is that this tells us nothing about how to handle loading a library. That's a different kettle of fish altogether.
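Whether the /proc trick suggested earlier carries over to shared libraries is easy enough to try. The sketch below stages the library image in an unlinked temporary file and hands the loader a /proc/self/fd path. It assumes a Linux-style /proc, a dynamic loader that is happy to open such a path, and a temporary directory that is not mounted noexec; `load_library_from_bytes` is a name made up for this example. As with the UPX case, the bytes do pass through the filesystem, so it is a workaround rather than true loading from memory.

```python
import ctypes
import os
import tempfile


def load_library_from_bytes(image: bytes) -> ctypes.CDLL:
    """Try to dlopen() a shared library whose contents are given as bytes."""
    fd, path = tempfile.mkstemp(suffix=".so")
    try:
        # Write through a duplicate so that fd itself stays open afterwards.
        with os.fdopen(os.dup(fd), "wb") as out:
            out.write(image)
    finally:
        os.unlink(path)  # no name remains; fd keeps the data reachable
    try:
        # ctypes.CDLL calls dlopen() on the given path and raises OSError
        # if the loader rejects it.
        return ctypes.CDLL(f"/proc/self/fd/{fd}")
    finally:
        os.close(fd)     # a successful mapping survives closing the fd
```

Usage would look like `lib = load_library_from_bytes(data)` followed by ordinary ctypes calls on `lib`; invalid input surfaces as an `OSError` from the loader.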


Looking at:

http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/upx/cvsroot-upx/src/p_w32pe.cpp?rev=1.40&content-type=text/vnd.viewcvs-markup

(the upx sourceforge project), someone knowledgeable on windoze ought to be able to work out how it loads .dlls. Starting here:

 unsigned PackW32Pe::processImports() // pass 1
 {
    static const upx_byte kernel32dll[] = "KERNEL32.DLL";
    static const char llgpa[] = "\x0\x0""LoadLibraryA\x0\x0""GetProcAddress\x0\x0";
    static const char exitp[] = "ExitProcess\x0\x0\x0";

might be a good start...


APN For Windows, [L1 ] seems to do what's desired. Of course, this capability would need to be in the core.

Code for the above link is in Github at: [L2 ]. Last commit 2019.

chw This looks very promising. However, is it a licensing tar pit? The MemoryModule code is MPL licensed and would need to be statically linked into the Tcl core library to be useful.