loading from memory

Interesting discussions on comp.lang.tcl recorded here, prompted by a
list of "helpful projects" for Tcl; see [project wish list]

    :   > Jeffrey Hobbs wrote:
    :   > 
    :   > >OS internals:
    :   > >        * loading DLLs from memory
    :   > 
    :   > What's that?  It sounds interesting.

It would mean that a shared library could be loaded directly from an
in-memory VFS instead of having to write the library to an actual file
and invoking the loader on that.

The problem is that, at least on [Solaris], libdl provides no facilities at all
for doing this.  You'd have to code it yourself.  Which makes this project a bit
involved...

    :   >Hey, I didn't say all the projects were *easy*.  :)

My bag of tricks is usually pretty good, but I'll tip my hat to someone who can
do this on Windows.  LoadLibrary() needs a filename.  We can cheat and use
global object names, like "\\.\pipe\somepipe".  Now how do we create a block of
process memory as a shared object?  We can do the reverse with MapViewOfFile()
...  whoa, this is tricky.  Whatever happened to ram disks anyways?

You'd of course have to do the loading yourself, rather than with
LoadLibrary(), but it looks possible - albeit difficult.  The last couple
of issues of ''MSDN Magazine'' have had the details of the file format,
and a discussion of how a DLL image gets mapped.  An initial implementation
wouldn't need to worry about sharing code segments; who cares about swap
space anyway?

Text of the articles is at:

    http://msdn.microsoft.com/msdnmag/issues/02/02/PE/PE.asp
    http://msdn.microsoft.com/msdnmag/issues/02/03/PE2/PE2.asp
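
Before any mapping can be done by hand, a loader has to find the headers inside the in-memory image.  The following is only a minimal sketch of that first step, not a loader: the function name `parse_pe_header` and the dictionary keys are made up for illustration, and it relies only on the documented layout (the `MZ` stub, the `e_lfanew` pointer at offset 0x3C, the `PE\0\0` signature and the 20-byte COFF file header).  Section mapping, relocations and import resolution are left out entirely.

```python
import struct

IMAGE_FILE_DLL = 0x2000  # COFF "Characteristics" flag: the image is a DLL


def parse_pe_header(image: bytes) -> dict:
    """Decode the COFF file header of a PE image held in memory.

    Hypothetical helper for illustration; it validates the two signatures
    and reports where the section table starts, nothing more.
    """
    if len(image) < 0x40 or image[:2] != b"MZ":
        raise ValueError("not an MZ/PE image")

    # e_lfanew: 32-bit little-endian offset of the PE signature, stored at 0x3C.
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")

    # The 20-byte COFF file header follows the 4-byte signature.
    (machine, n_sections, _timestamp, _symtab, _n_syms,
     opt_header_size, characteristics) = struct.unpack_from(
        "<HHIIIHH", image, pe_offset + 4)

    return {
        "machine": machine,
        "sections": n_sections,
        "is_dll": bool(characteristics & IMAGE_FILE_DLL),
        # The section table begins right after the optional header.
        "section_table_offset": pe_offset + 4 + 20 + opt_header_size,
    }
```

A real in-memory loader would continue from `section_table_offset`, copying each section to its virtual address and then fixing up relocations and imports, which is where the difficulty mentioned above lies.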

    
Sounds very interesting. Loading DLLs from ftp servers (or, if we ever get to
it, the great unified tcl tk extension repository ([GUTTER], what a disgusting
acronym, but you would find real gems in it ;-)). That would be really cool.

This is actually more something for wrapped applications. Sort of pseudostatic linking,
but actually dynamic linking.

A merge of this with your idea provides extensibility at runtime ...
I.e. package require foo, load DLL from the network, save it into the
wrapped application, use that in the future. ... The security risks are
high. IOW, I will not do that without proper authentication. A wrapped tcl
application as the new worm of the century surfing the net, extending itself
for any machine architecture it encounters ... Especially if there are
services which compile a package on demand for some architecture ...
----
[JCW] - for a hint on how the UPX compressor loads a compressed executable, see the Linux ''strace'' output below.  The key trick seems to be to exec from a file descriptor, using /proc/[[pid]]/fd/$fd - that in itself might also be applicable for loading shared libs where /proc is supported.  For Windows, UPX may also be a source of inspiration, given that it can load both compressed EXEs and compressed DLLs (but I don't know how it does it).

Here's the strace log output showing what happens when an executable uncompresses itself and then launches itself, at least partially memory-based:
  $ strace tclkitsh-linux-x86.upx.bin
  execve("./tclkitsh-linux-x86.upx.bin", ["tclkitsh-linux-x86.upx.bin"], [/* 68 vars */]) = 0
  getpid()                                = 8162
  open("/proc/8162/exe", O_RDONLY)        = 3
  lseek(3, 1608, SEEK_SET)                = 1608
  read(3, "\226k\265\275\2609\25\0\0\0\10\0", 12) = 12
  gettimeofday({1021908344, 154617}, NULL) = 0
  unlink("/tmp/upxCSOH1XOAH5C")           = -1 ENOENT (No such file or directory)
  open("/tmp/upxCSOH1XOAH5C", O_WRONLY|O_CREAT|O_EXCL, 0700) = 4
  ftruncate(4, 1391024)                   = 0
  old_mmap(NULL, 528384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40000000
  read(3, "\0\0\10\0s_\3\0", 8)           = 8
  read(3, "\177?d\371\177ELF\1\0\2\0\3\0\32\340\200\4\10}\233g\267"..., 221043) = 221043
  write(4, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\2\0\3\0\1\0\0\0\340\200"..., 524288) = 524288
  read(3, "\0\0\10\0\212y\3\0", 8)        = 8
  read(3, "]\376\377\377E\374P\215E\370PW\350c\224\377\377\203\304"..., 227722) = 227722
  write(4, "E\374P\215E\370PW\350c\224\377\377\203\304\20\205\300t"..., 524288) = 524288
  read(3, "\2609\5\0{v\1\0", 8)           = 8
  read(3, "\357\366\277\377\213\225\220\356\377\377\213\215\\\n\4"..., 95867) = 95867
  write(4, "\213\225\220\356\377\377\213\215\\\356\377\377\213\4\221"..., 342448) = 342448
  read(3, "\0\0\0\0UPX!", 8)              = 8
  munmap(0x40000000, 528384)              = 0
  close(4)                                = 0
  close(3)                                = 0
  open("/tmp/upxCSOH1XOAH5C", O_RDONLY)   = 3
  access("/proc/8162/fd/3", R_OK|X_OK)    = 0
  unlink("/tmp/upxCSOH1XOAH5C")           = 0
  fcntl(3, F_SETFD, FD_CLOEXEC)           = 0
  execve("/proc/8162/fd/3", ["tclkitsh-linux-x86.upx.bin"], [/* 68 vars */]) = 0
  [...]

[DKF]: Looking at that, what it appears to be doing is very straightforward. It decompresses its payload to a temporary file (created in /tmp with permissions that keep other users from touching it), reopens that temporary file for reading, deletes the temporary file (relying on classic Unix filesystem reference-counting semantics) and executes the file descriptor (passing the input argv onwards, presumably).

The only really interesting thing is that you're allowed to execve() a file descriptor, by way of its /proc path. The rest of it is fairly straightforward (the /tmp directory should have the sticky bit set, of course). It has to be done this way, since there's no alternative to execve() for executing a program; the system call itself takes a filename.

The problem is that this tells us nothing about how to handle loading a library. That's a different kettle of fish altogether.
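
For reference, the sequence visible in the trace (write the payload to a private temporary file, close the writer, reopen read-only, unlink, then exec through /proc) can be sketched as follows.  This is a minimal illustration assuming Linux with /proc mounted; the function names `stage_unlinked` and `exec_fd` are hypothetical, and it uses /proc/self rather than the explicit pid seen in the trace.  It is not UPX's actual code.

```python
import os
import tempfile


def stage_unlinked(payload: bytes) -> int:
    """Write payload to a private temp file and return a read-only
    descriptor for it after the file name has been removed.

    Hypothetical helper; mirrors the open/write/close/open/unlink
    sequence in the strace output.
    """
    # mkstemp creates the file with O_EXCL and owner-only permissions.
    wfd, path = tempfile.mkstemp(prefix="payload-")
    try:
        try:
            os.fchmod(wfd, 0o700)  # owner-only and executable, as in the trace
            view = memoryview(payload)
            while view:  # os.write may write fewer bytes than requested
                view = view[os.write(wfd, view):]
        finally:
            # The writer must be closed before exec, or execve() fails
            # with ETXTBSY.
            os.close(wfd)
        return os.open(path, os.O_RDONLY)
    finally:
        # The name goes away; the data lives on while a descriptor is open.
        os.unlink(path)


def exec_fd(fd: int, argv: list) -> None:
    """Replace the current process with the program behind fd.

    Does not return on success.  Python opens descriptors close-on-exec,
    matching the FD_CLOEXEC call in the trace; that works for ELF
    binaries but not for '#!' scripts, whose interpreter would need the
    descriptor to stay open.
    """
    os.execv(f"/proc/self/fd/{fd}", argv)
```

A caller would do something like `exec_fd(stage_unlinked(image_bytes), ["myprog"])`, where `image_bytes` holds a complete ELF executable.  Note that the payload still touches the filesystem briefly, so this is only "partially memory-based", as noted above.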
----
Looking at:

http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/upx/cvsroot-upx/src/p_w32pe.cpp?rev=1.40&content-type=text/vnd.viewcvs-markup

(the upx sourceforge project), someone knowledgeable on windoze ought to be able to work out how it loads .dlls.  Starting here:

 unsigned PackW32Pe::processImports() // pass 1
 {
    static const upx_byte kernel32dll[] = "KERNEL32.DLL";
    static const char llgpa[] = "\x0\x0""LoadLibraryA\x0\x0""GetProcAddress\x0\x0";
    static const char exitp[] = "ExitProcess\x0\x0\x0";

might be a good start...

----

[APN] For Windows, [http://www.joachim-bauch.de/tutorials/loading-a-dll-from-memory/] seems to do what's desired. Of course, this capability would need to be in the core.
Code for the above link is on GitHub at: [https://github.com/fancycode/MemoryModule]. Last commit 2019.
[chw] This looks very promising. However, is it a licensing tar pit? The MemoryModule code is MPL licensed and would need to be statically linked to the Tcl core library to be useful.
<<categories>> Concept