Usually we experience two types of remote execution in practice:
- You log in to some remote machine and run the application there (ssh), and
- You get the remote application copied locally and run it (mounting remote filesystems, javascript download, etc.)
There are other ideas too, of course, but they are less popular in practice. For example, you can execute a remote application and forward its graphics back to the local machine using the X protocol, but that only moves the interface.
When does location matter?
So when does it matter which computer the application executes on? Mostly when it actually interacts with attached devices (virtual or real). It doesn’t really matter whether the remote or the local side calculates “1+2”: apart from the speed, the result will be the same. But if you want to open a file and read from it, it really matters which machine executes that code.
The same thing applies to network connections, character and block devices, various (pseudo-)terminals, local queues, and other system bits. It also matters when the application’s memory is shared with other threads or programs.
Thankfully, most of those operations happen via libc calls and then syscalls which are a definite border between the app execution and the OS execution. They separate side effects from calculations.
So how to execute something remotely?
Unless you move to Plan 9, you cannot easily access resources on another machine. But there is a way to forward local syscalls, and libremotec is a toy project which shows it’s partially possible… just very, very hard to apply in reality.
The way it works is by using the LD_PRELOAD variable to inject itself into a process and override some standard library functions. Currently these are open, read, close, fstat, lstat, lseek, faccessat, getxattr. Not a lot, but it works for some basic programs; it’s not supposed to actually do everything.
When starting a program with libremotec, you need a server somewhere (it can be the same machine for testing), which will listen on port 12345. Run the server part there. Then start your own application with the right LD_PRELOAD and other configuration options. For example like this:
LD_PRELOAD=./libremotec.so LOCAL_PATHS=/dev:/usr/lib \
REMOTE_SERVER=remote.host.ip LIBREMOTEC_DEBUG=1 \
cat /etc/resolv.conf
This will override the functions mentioned before and execute the cat command. The output looks like this:
open() on remote file /etc/resolv.conf
opened /etc/resolv.conf, local 2048, remote 5
fstat() on remote fd 2048
read() 131072 bytes on remote fd 2048
# Generated by resolvconf
nameserver 192.168.0.1
read() 131072 bytes on remote fd 2048
This output shows that /etc/resolv.conf was not in LOCAL_PATHS and was actually opened on the remote side. Then all the operations on it have been forwarded too: fstat() and read().
What won’t work easily
First, even though it’s possible to rewrite most of the requests to have logical equivalents on the remote side, it’s going to be hard if someone uses (for example) a 32-bit little-endian system on one side and a 64-bit big-endian one on the other, or when the two systems use different errno values. In general, this hack is safest between two (almost) identical systems in the early stages. At least the architecture should match… Then, once most things work, it can be improved to add compatibility.
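One way to sidestep the word-size and byte-order mismatch is to never send native C types over the wire. What follows is a minimal sketch, assuming a made-up request layout for lseek; the names encode_lseek_req, decode_lseek_req and LSEEK_REQ_SIZE are hypothetical and this is not libremotec’s actual protocol:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wire layout: 4-byte fd, 8-byte offset, 4-byte whence,
 * all big-endian and fixed-width regardless of the host. */
enum { LSEEK_REQ_SIZE = 4 + 8 + 4 };

static void put_u32(uint8_t *out, uint32_t v) {
    out[0] = (uint8_t)(v >> 24);
    out[1] = (uint8_t)(v >> 16);
    out[2] = (uint8_t)(v >> 8);
    out[3] = (uint8_t)v;
}

static void put_u64(uint8_t *out, uint64_t v) {
    put_u32(out, (uint32_t)(v >> 32));
    put_u32(out + 4, (uint32_t)v);
}

static uint32_t get_u32(const uint8_t *in) {
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8) | (uint32_t)in[3];
}

static uint64_t get_u64(const uint8_t *in) {
    return ((uint64_t)get_u32(in) << 32) | get_u32(in + 4);
}

/* Serialise the arguments byte by byte, so the buffer contents do not
 * depend on the sender's endianness or the size of its int/off_t. */
void encode_lseek_req(uint8_t *buf, int32_t fd, int64_t offset, int32_t whence) {
    put_u32(buf, (uint32_t)fd);
    put_u64(buf + 4, (uint64_t)offset);
    put_u32(buf + 12, (uint32_t)whence);
}

/* Reverse of encode_lseek_req; works the same on either side. */
void decode_lseek_req(const uint8_t *buf, int32_t *fd, int64_t *offset, int32_t *whence) {
    *fd = (int32_t)get_u32(buf);
    *offset = (int64_t)get_u64(buf + 4);
    *whence = (int32_t)get_u32(buf + 12);
}
```

This only covers the representation of integers. Constants such as the whence values and the errno numbers would still need an explicit translation table between different systems.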
Finally, there are some libc functions which just can’t be forwarded without big hacks. For example, the various uses of mmap() are going to be very tricky to implement properly. If we need some memory to point to a file which happens to be on the remote side, how would that be handled? Memory reads and writes do not cause the read()/write() functions to be called. For this reason, the file utility cannot easily be used this way.
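The mmap() problem can be seen in a small sketch. The function name first_byte_via_mmap is made up for illustration, and the code assumes a writable /tmp; the point is that the byte is fetched with a plain memory load, which no preloaded wrapper can see:

```c
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Writes content to a temp file, maps it, and returns its first byte
 * as read through the mapping. Returns -1 on any failure. */
int first_byte_via_mmap(const char *content) {
    char path[] = "/tmp/mmap_demo_XXXXXX";
    int fd = mkstemp(path);
    if (fd == -1) return -1;
    unlink(path); /* the file stays alive until the fd and mapping go away */

    size_t len = strlen(content);
    if (len == 0 || write(fd, content, len) != (ssize_t)len) {
        close(fd);
        return -1;
    }

    char *map = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd); /* the mapping remains valid after close() */
    if (map == MAP_FAILED) return -1;

    /* A plain load from memory: the kernel pages the data in directly,
     * so an overridden read() is never called for this access. */
    int byte = (unsigned char)map[0];

    munmap(map, len);
    return byte;
}
```

The mkstemp(), write(), mmap() and close() calls could all be intercepted, but the access to map[0] cannot, so a remote-backed mapping would need some other mechanism entirely.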
The next interesting bit would be sockets on both the remote and the local side. They couldn’t easily be passed to select() or poll() together. But if they’re only on one side at a time, they should work just fine.
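A poll() wrapper would first have to find out whether a given set is mixed. Here is a rough sketch of that check, assuming remote descriptors are recognised by being at or above 2048 (the REMOTE_FD_SHIFT value that appears in the example output); split_pollfds and pollfds_single_sided are hypothetical helpers, not part of libremotec:

```c
#include <poll.h>
#include <stddef.h>

#define REMOTE_FD_SHIFT 2048 /* first local number used for remote fds */

/* Copies each entry of fds[] into local[] or remote[] depending on its
 * fd range. Both output arrays must have room for n entries. */
void split_pollfds(const struct pollfd *fds, size_t n,
                   struct pollfd *local, size_t *n_local,
                   struct pollfd *remote, size_t *n_remote) {
    *n_local = 0;
    *n_remote = 0;
    for (size_t i = 0; i < n; i++) {
        if (fds[i].fd < REMOTE_FD_SHIFT)
            local[(*n_local)++] = fds[i];
        else
            remote[(*n_remote)++] = fds[i];
    }
}

/* Returns 1 if all descriptors live on one side, so a single poll()
 * (local or forwarded) is enough; returns 0 for a mixed set. */
int pollfds_single_sided(const struct pollfd *fds, size_t n) {
    size_t remote_count = 0;
    for (size_t i = 0; i < n; i++) {
        if (fds[i].fd >= REMOTE_FD_SHIFT)
            remote_count++;
    }
    return remote_count == 0 || remote_count == n;
}
```

The single-sided case can be forwarded as one call. The mixed case is the hard one, because something would have to wait on both sides at once and merge the results.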
How does the forwarding work in the current project?
Each call has some extra logic. First, the originals get saved in the library’s constructor:
static void __attribute__((constructor)) initialise() {
    ...
    orig_close = dlsym(RTLD_NEXT, "close");
    ...
}
Then each wrapper function is defined on its own, like this (the application itself uses the new functions by default, because of the matching symbols):
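int close(int fildes) {
    if (fildes < REMOTE_FD_SHIFT) {
        // local part: hand over to the real libc close()
        return orig_close(fildes);
    } else {
        // remote part: translate the local number to the remote fd
        int remote_fd = remote_fds[fildes - REMOTE_FD_SHIFT];
        int ret = call_remote_close(remote_fd);
        if (ret == -1) {
            // report the error as seen on the remote side
            errno = remote_errno;
        } else {
            // clear the slot in the mapping table
            remote_fds[fildes - REMOTE_FD_SHIFT] = 0;
        }
        return ret;
    }
}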
Each time one of those functions is called, libremotec decides which side should handle the request. This may be done based on either the path (open, stat) or the fd (read, seek). Some are a bit more complicated: for example, faccessat() should ideally first get the absolute path from the fd, then append and normalise the string, then check against the known local paths again. fcntl(F_DUPFD, ...) is likely to have interesting edge cases too. Some calls should actually be replicated on both sides, for example prctl(PR_SET_...) operations.
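The path-based decision could look roughly like the sketch below. It assumes the path is already absolute and normalised, and that LOCAL_PATHS is a colon-separated list of directory prefixes as in the earlier command line; is_local_path is a hypothetical helper and may not match how libremotec actually compares paths:

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if path equals, or lies under, one of the colon-separated
 * prefixes in local_paths (e.g. "/dev:/usr/lib"), otherwise 0. */
int is_local_path(const char *path, const char *local_paths) {
    const char *p = local_paths;
    while (p && *p) {
        const char *end = strchr(p, ':');
        size_t len = end ? (size_t)(end - p) : strlen(p);

        /* Ignore trailing slashes so "/dev/" and "/dev" behave the same. */
        while (len > 1 && p[len - 1] == '/')
            len--;

        if (len > 0 && strncmp(path, p, len) == 0) {
            char next = path[len];
            /* Match only on a component boundary, so that "/dev" does
             * not claim "/device". A bare "/" prefix matches everything. */
            if (next == '\0' || next == '/' || (len == 1 && p[0] == '/'))
                return 1;
        }
        p = end ? end + 1 : NULL;
    }
    return 0;
}
```

With LOCAL_PATHS=/dev:/usr/lib, this treats /dev/null and /usr/lib/libc.so as local and /etc/resolv.conf as remote, which is consistent with the debug output shown earlier.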
Any file descriptor which is opened on the remote side is assigned a local fd, which is high enough that it hopefully doesn’t collide with local ones. The open() wrapper should actually make sure that there’s no possibility of collision. In the example above:
opened /etc/resolv.conf, local 2048, remote 5
The file got fd 5 on the remote side, but the wrapper will handle and forward all calls to fd 2048 as if they were local. 2048 is the first remote fd reserved and is the REMOTE_FD_SHIFT in the code.
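The mapping between the two numbers can be sketched as a small table. This is an illustration under a few assumptions: the table size MAX_REMOTE_FDS and the helper names are invented, and free slots are marked with -1 rather than the 0 used in the close() wrapper above, so that a remote fd 0 could also be stored:

```c
#include <errno.h>

#define REMOTE_FD_SHIFT 2048 /* first local number reserved for remote fds */
#define MAX_REMOTE_FDS  256  /* table size: an assumption for this sketch */
#define SLOT_FREE       (-1) /* never a valid remote fd */

static int remote_fds[MAX_REMOTE_FDS];
static int table_ready;

static void table_init(void) {
    if (table_ready) return;
    for (int i = 0; i < MAX_REMOTE_FDS; i++)
        remote_fds[i] = SLOT_FREE;
    table_ready = 1;
}

/* Stores a remote fd and returns the local number that stands for it,
 * or -1 with errno set to EMFILE when the table is full. */
int remote_fd_register(int remote_fd) {
    table_init();
    for (int i = 0; i < MAX_REMOTE_FDS; i++) {
        if (remote_fds[i] == SLOT_FREE) {
            remote_fds[i] = remote_fd;
            return REMOTE_FD_SHIFT + i;
        }
    }
    errno = EMFILE;
    return -1;
}

/* Translates a local number back to the remote fd, or returns -1 with
 * errno set to EBADF if it does not refer to an open remote file. */
int remote_fd_lookup(int local_fd) {
    table_init();
    int slot = local_fd - REMOTE_FD_SHIFT;
    if (slot < 0 || slot >= MAX_REMOTE_FDS || remote_fds[slot] == SLOT_FREE) {
        errno = EBADF;
        return -1;
    }
    return remote_fds[slot];
}

/* Frees the slot so the local number can be handed out again. */
void remote_fd_release(int local_fd) {
    table_init();
    int slot = local_fd - REMOTE_FD_SHIFT;
    if (slot >= 0 && slot < MAX_REMOTE_FDS)
        remote_fds[slot] = SLOT_FREE;
}
```

Registering remote fd 5 in an empty table returns 2048, matching the debug line. The sketch is not thread-safe, and it does nothing to stop a genuinely local fd from reaching 2048; a real open() wrapper would still have to check for that.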
Why is that interesting?
There may be many reasons to do something like this:
- To offload computation itself and just transfer the data
- To operate on local and remote resources at the same time
- To run local application against a system where it cannot be installed
My ideal use case would be to run a real remote shell where my local home directory and tools are transparently available on the remote machine. That means a shell which reads my local .zshrc, runs my local htop even though it’s not available on the remote side, and gives me the display I expect locally. Unfortunately, I found out it may take a long time before this becomes reality. The best I can do for now is FUSE and static binaries. But I have high hopes for things like 9P in the future.
This is not a new idea in grid computing either, but there it’s done at a slightly different level and with different intentions. There are a few papers and books about I/O system call redirection in grids.