A different take on remote execution

Usually we experience two types of remote execution in practice:

  • You log in to some remote machine and run the application there (ssh), and
  • You get the remote application copied locally and run it here (mounting remote filesystems, JavaScript download, etc.)

There are other ideas too, of course, but they are less popular in practice. For example, you can execute a remote application and forward its graphics back to the local machine using the X protocol - but that only moves the interface.

When does location matter?

So when does it matter which computer the application executes on? Mostly when it actually interacts with attached devices (virtual or real). It doesn’t really matter whether the remote or the local side calculates “1+2”: apart from the speed, the result will be the same. But if you want to open a file and read from it, it really matters which machine executes that code.

The same thing applies to network connections, character and block devices, various (pseudo-)terminals, local queues, and other system bits. It also matters when the application’s memory is shared with other threads or programs.

Thankfully, most of those operations happen via libc calls and then syscalls, which form a definite border between application execution and OS execution. They separate side effects from calculations.

So how to execute something remotely?

Unless you move to Plan 9, you cannot easily access something from another machine. But there’s a way to forward local syscalls, and libremotec is a toy project which shows it’s partially possible… just very, very hard to apply in reality.

It works by using the LD_PRELOAD environment variable to inject itself into a process and override some standard library functions. Currently these are open, read, close, fstat, lstat, lseek, faccessat and getxattr. Not a lot, but it works for some basic programs, and it’s not supposed to do everything.

To start a program with libremotec, you need a server somewhere (it can be the same machine for testing), which will listen on port 12345. Run the server part there. Then start your own application with the right LD_PRELOAD and other configuration options. For example:

LD_PRELOAD=./libremotec.so LOCAL_PATHS=/dev:/usr/lib \
    cat /etc/resolv.conf

This will override the functions mentioned before and execute the cat command. The output looks like this:

open() on remote file /etc/resolv.conf
opened /etc/resolv.conf, local 2048, remote 5
fstat() on remote fd 2048
read() 131072 bytes on remote fd 2048
# Generated by resolvconf
read() 131072 bytes on remote fd 2048

This output shows that the /etc/resolv.conf was not in the LOCAL_PATHS and was actually opened on the remote side. Then all the operations on it have been forwarded too: fstat() and read().
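The local/remote decision for path-based calls can be sketched as a prefix check against LOCAL_PATHS. This is only a minimal sketch: is_local_path is a hypothetical helper, and I’m assuming the variable is a colon-separated list of directory prefixes matched on whole path components. The real library may decide differently.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not taken from libremotec.
 * Returns true if `path` equals, or lies underneath, one of the
 * colon-separated prefixes in `local_paths` (e.g. "/dev:/usr/lib"). */
static bool is_local_path(const char *path, const char *local_paths) {
    if (path == NULL || local_paths == NULL) {
        return false;
    }
    const char *entry = local_paths;
    while (*entry != '\0') {
        size_t len = strcspn(entry, ":");   /* length of this prefix */
        if (len > 0 && strncmp(path, entry, len) == 0) {
            /* Match only on a component boundary, so that "/dev"
             * covers "/dev/null" but not "/device". */
            char next = path[len];
            if (next == '\0' || next == '/' || entry[len - 1] == '/') {
                return true;
            }
        }
        entry += len;
        if (*entry == ':') {
            entry++;                        /* skip the separator */
        }
    }
    return false;
}
```

With LOCAL_PATHS=/dev:/usr/lib this check keeps /dev/null local and sends /etc/resolv.conf to the remote side, which matches the output above.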

What won’t work easily

First, even though it’s possible to rewrite most of the requests to have logical equivalents on the remote side, it’s going to be hard if someone uses (for example) a 32-bit little-endian system on one side and a 64-bit big-endian one on the other, or when the two systems use different errno values. In general this hack is safest between two (almost) identical systems in the early stages. At least the architecture should match… Then, once most things work, compatibility can be added.
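One common way around the word size and endianness part of the problem is to give every request a fixed layout on the wire, independent of either host. The sketch below encodes a forwarded read() as a 32-bit fd followed by a 64-bit byte count, both big-endian. The layout and all names here (READ_REQ_SIZE, encode_read_req, decode_read_req) are made up for illustration and are not libremotec’s actual protocol.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wire layout for a forwarded read(), not libremotec's format:
 * bytes 0..3  remote fd   (32-bit, big-endian)
 * bytes 4..11 byte count  (64-bit, big-endian) */
enum { READ_REQ_SIZE = 4 + 8 };

static void put_u32(uint8_t *out, uint32_t v) {
    for (int i = 0; i < 4; i++) {
        out[i] = (uint8_t)(v >> (24 - 8 * i));   /* most significant first */
    }
}

static void put_u64(uint8_t *out, uint64_t v) {
    for (int i = 0; i < 8; i++) {
        out[i] = (uint8_t)(v >> (56 - 8 * i));
    }
}

static uint32_t get_u32(const uint8_t *in) {
    uint32_t v = 0;
    for (int i = 0; i < 4; i++) {
        v = (v << 8) | in[i];
    }
    return v;
}

static uint64_t get_u64(const uint8_t *in) {
    uint64_t v = 0;
    for (int i = 0; i < 8; i++) {
        v = (v << 8) | in[i];
    }
    return v;
}

/* Sender side: the result is byte-for-byte identical on any host. */
static void encode_read_req(uint8_t out[READ_REQ_SIZE],
                            int32_t remote_fd, uint64_t count) {
    put_u32(out, (uint32_t)remote_fd);
    put_u64(out + 4, count);
}

/* Receiver side: rebuilds the values without caring about host byte order. */
static void decode_read_req(const uint8_t in[READ_REQ_SIZE],
                            int32_t *remote_fd, uint64_t *count) {
    *remote_fd = (int32_t)get_u32(in);
    *count = get_u64(in + 4);
}
```

This only covers the integer representation. Differing errno numbers would need a similar translation table on top, which the sketch does not attempt.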

Second, there are some libc functions which just can’t be supported without big hacks. For example, various uses of mmap() are going to be very tricky to implement properly. If we need to have some memory pointing to a file which happens to be on the remote side, how would that be handled? Memory reads and writes do not cause the read()/write() functions to be called. For this reason, the file utility cannot easily be used this way.

The next interesting bit would be sockets on both the remote and the local side. They couldn’t easily be passed to select() or poll() together. But if they’re only on one side at a time, they should work just fine.

How does the forwarding work in the current project?

Each call has some extra logic. First, the originals get saved in the library’s constructor:

static void __attribute__((constructor)) initialise(void) {
    orig_close = dlsym(RTLD_NEXT, "close");  // the real libc close()
    // ... the other wrapped functions are saved the same way
}

Then each wrapper function is defined on its own, like this (the application itself uses the new functions by default, because the symbols match):

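int close(int fildes) {
    if (fildes < REMOTE_FD_SHIFT) {
        // local part: hand over to the real close()
        return orig_close(fildes);
    } else {
        // remote part: translate the local alias back to the remote fd
        int remote_fd = remote_fds[fildes - REMOTE_FD_SHIFT];
        int ret = call_remote_close(remote_fd);
        if (ret == -1) {
            // report the remote side's error as if it happened locally
            errno = remote_errno;
        } else {
            // closed successfully: release the slot
            remote_fds[fildes - REMOTE_FD_SHIFT] = 0;
        }
        return ret;
    }
}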

Each time one of those functions is called, libremotec decides which side should handle the request. This may be done based on either the path (open, stat), or fd (read, seek). Some are a bit more complicated - for example faccessat() should ideally first get the absolute path from the fd, then append and normalise the string, then check against known local paths again. fcntl(F_DUPFD, ...) is likely to have interesting edge cases too. Some calls should actually be replicated on both sides - for example prctl(SET_...) operations.
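The "append and normalise" step for calls like faccessat() can be sketched as a purely lexical clean-up of an absolute path. This is an illustration only: normalise_path is a hypothetical helper, it does not resolve symlinks (so ".." after a symlinked directory may give a different answer than the kernel would), and getting the directory path from the fd in the first place is left out (on Linux, readlink() on /proc/self/fd/N is one option).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not taken from libremotec.
 * Lexically normalises the absolute path `in` into `out` (capacity `cap`):
 * collapses repeated slashes and resolves "." and ".." components.
 * Returns 0 on success, -1 if `in` is not absolute or `out` is too small. */
static int normalise_path(const char *in, char *out, size_t cap) {
    if (in == NULL || out == NULL || in[0] != '/' || cap < 2) {
        return -1;
    }
    size_t len = 0;                       /* bytes written to out so far */
    const char *p = in;
    while (*p != '\0') {
        while (*p == '/') {
            p++;                          /* skip repeated slashes */
        }
        size_t n = strcspn(p, "/");       /* length of the next component */
        if (n == 0) {
            break;                        /* only trailing slashes were left */
        }
        if (n == 1 && p[0] == '.') {
            /* "." means "stay here": nothing to do */
        } else if (n == 2 && p[0] == '.' && p[1] == '.') {
            /* "..": drop the last component, never going above "/" */
            while (len > 0 && out[--len] != '/') {
            }
        } else {
            if (len + 1 + n + 1 > cap) {
                return -1;                /* no room for "/", name and NUL */
            }
            out[len++] = '/';
            memcpy(out + len, p, n);
            len += n;
        }
        p += n;
    }
    if (len == 0) {
        out[len++] = '/';                 /* everything cancelled out */
    }
    out[len] = '\0';
    return 0;
}
```

The normalised result can then go through the same local-path check as a plain open() path.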

Any file descriptor which is opened on the remote side is assigned a local fd, which is high enough that it hopefully doesn’t collide with local ones. The open() wrapper should actually make sure that there’s no possibility of collision. In the example above:

opened /etc/resolv.conf, local 2048, remote 5

The file got fd 5 on the remote side, but the wrapper will handle and forward all calls to fd 2048 as if they were local. 2048 is the first remote fd reserved and is the REMOTE_FD_SHIFT in the code.
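The bookkeeping behind those local aliases could look roughly like the table below. It is a sketch under my own assumptions: the names alias_alloc, alias_lookup and alias_free, the table size, and the separate in-use flag are all hypothetical (the close() excerpt above uses a plain remote_fds array instead), and it does nothing yet to stop real local descriptors from growing into the reserved range.

```c
#include <stddef.h>

#define REMOTE_FD_SHIFT 2048   /* first local alias, as in the example above */
#define MAX_REMOTE_FDS  256    /* assumed table size, purely illustrative */

/* Hypothetical table: slot i backs the local alias REMOTE_FD_SHIFT + i. */
static int alias_remote_fd[MAX_REMOTE_FDS];
static unsigned char alias_in_use[MAX_REMOTE_FDS];

/* Reserves the lowest free alias for `remote_fd`.
 * Returns the local alias, or -1 if the table is full. */
static int alias_alloc(int remote_fd) {
    for (int i = 0; i < MAX_REMOTE_FDS; i++) {
        if (!alias_in_use[i]) {
            alias_in_use[i] = 1;
            alias_remote_fd[i] = remote_fd;
            return REMOTE_FD_SHIFT + i;
        }
    }
    return -1;
}

/* Translates a local alias back to the remote fd, or -1 if it is not one. */
static int alias_lookup(int local_fd) {
    int i = local_fd - REMOTE_FD_SHIFT;
    if (i < 0 || i >= MAX_REMOTE_FDS || !alias_in_use[i]) {
        return -1;
    }
    return alias_remote_fd[i];
}

/* Releases an alias after a successful remote close(). */
static void alias_free(int local_fd) {
    int i = local_fd - REMOTE_FD_SHIFT;
    if (i >= 0 && i < MAX_REMOTE_FDS) {
        alias_in_use[i] = 0;
    }
}
```

Keeping a separate in-use flag means remote fd 0 can be stored without being mistaken for an empty slot.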

Why is that interesting?

There may be many reasons to do something like this:

  • To offload computation itself and just transfer the data
  • To operate on local and remote resources at the same time
  • To run a local application against a system where it cannot be installed

My ideal use case would be a real remote shell where my local home directory and tools are transparently available on the remote machine. That means a shell which reads my local .zshrc, runs my local htop even though it’s not installed on the remote side, and gives me the display I expect locally. Unfortunately, I found out it may take a long time before this becomes reality. The best I can do for now is FUSE and static binaries. But I have high hopes for things like 9P in the future.

This is not a new idea in grid computing either, but there it’s done at a slightly different level and with different intentions. There are a few papers and books that discuss I/O system call redirection in grids.
