Function Hooking Part I: Hooking Shared Library Function Calls in Linux
When assessing an application for weaknesses in a linux environment, we won’t always have the luxury of freely available source code or documentation. As a result, these situations require more of a black box approach where much of the information about the application will be revealed by attempting to monitor things such as network communications, calls to cryptographic functions, and file I/O.
One method of monitoring applications to extract information is to attach a debugger, such as GDB, to the process and to dump register or stack values as breakpoints are hit for the desired function calls. While this has the advantage of giving fine grained control over things such as code flow and register contents, it is also a cumbersome process compared to hooking the function calls of interest to modify their behavior.
Function call hooking refers to a range of techniques used to intercept calls to pre-existing functions and wrap around them to modify the function’s behavior at runtime. In this article we’ll be focusing on function hooking in linux using the dynamic loader API, which allows us to dynamically load and execute calls from shared libraries on the system at runtime, and allows us to wrap around existing functions by making use of the LD_PRELOAD environment variable.
The LD_PRELOAD environment variable is used to specify a shared library that is to be loaded first by the loader. Loading our shared library first enables us to intercept function calls and using the dynamic loader API we can bind the originally intended function to a function pointer and pass the original arguments through it, effectively wrapping the function call.
Let’s use the ubiquitous “hello world” demonstration as an example. In this example we’ll intercept the puts function and change the output. Here’s our helloworld.c file:
#include <stdio.h> #include <unistd.h> int main() { puts("Hello world!n"); return 0; }
Here’s our libexample.c file:
#include <stdio.h> #include <unistd.h> #include <dlfcn.h> int puts(const char *message) { int (*new_puts)(const char *message); int result; new_puts = dlsym(RTLD_NEXT, "puts"); if(strcmp(message, "Hello world!n") == 0) { result = new_puts("Goodbye, cruel world!n"); } else { result = new_puts(message); } return result; }
Let’s take a moment to examine what’s going on here in our libexample.c file:
- Line 5 contains our puts function declaration. To intercept the original puts we define a function with the exact same name and function signature as the original libc puts function.
- Line 7 declares the function pointer new_puts that will point to the originally intended puts function. As before with the intercepting function declaration this pointer’s function signature must match the function signature of puts.
- Line 10 initializes our function pointer using the dlsym() function. The RTLD_NEXT enum tells the dynamic loader API that we want to return the next instance of the function associated with the second argument (in this case puts) in the load order.
- We compare the argument passed to our puts hook against “Hello world!n” on line 12 and if it matches, we replace it with “Goodbye, cruel world!n”. If the two strings do not match we simply pass the original message on to puts on line 14.
Now let’s build everything and test it out:
sigma@ubuntu:~/code$ gcc helloworld.c -o helloworld sigma@ubuntu:~/code$ gcc libexample.c -o libexample.so -fPIC -shared -ldl -D_GNU_SOURCE sigma@ubuntu:~/code$
First we compile helloworld.c as one normally would. Next we compile libexample.c into a shared library by specifying the -shared and -fPIC compile flags and link against libdl using the -ldl flag. The -D_GNU_SOURCE flag is specified to satisfy #ifdef conditions that allow us to use the RTLD_NEXT enum. Optionally this flag can be replaced by adding “#define _GNU_SOURCE” somewhere near the top of our libexample.c file. After compiling our source files, we set the LD_PRELOAD environment variable to point to the location of our newly created shared library.
sigma@ubuntu:~/code$ export LD_PRELOAD="/home/sigma/code/libexample.so"
After setting LD_PRELOAD we’re ready to run our helloworld binary. Executing the binary produces the following output:
sigma@ubuntu:~/code$ ./helloworld Goodbye, cruel world! sigma@ubuntu:~/code$
As expected, when our helloworld binary is executed the puts function is intercepted and “Goodbye, cruel world!” rather than the original “Hello world!” string is displayed. Now that we’re familiar with the process of hooking function calls let’s apply it towards a bit more practical example. Let’s pretend for a moment that we have an application that we are assessing and that this application uses OpenSSL to encrypt communications of sensitive data. Let’s also assume that attempts to man-in-the-middle these communications at the network level have been fruitless. To get at this sensitive data we will intercept calls to SSL_write, the function responsible for encrypting then sending data over a socket. Intercepting SSL_write will allow us to log the string sent to the function and pass the original parameters along, effectively bypassing the encryption protections while allowing the application to run normally. To get started let’s take a look at the SSL_write function definition:
int SSL_write(SSL *ssl, const void *buf, int num);
Here’s the code I’ve written to intercept SSL_write in hook.c:
#include <stdio.h> #include <unistd.h> #include <dlfcn.h> #include <openssl/ssl.h> int SSL_write(SSL *context, const void *buffer, int bytes) { int (*new_ssl_write)(SSL *context, const void *buffer, int bytes); new_ssl_write = dlsym(RTLD_NEXT, "SSL_write"); FILE *logfile = fopen("logfile", "a+"); fprintf(logfile, "Process %d:nn%snnn", getpid(), (char *)buffer); fclose(logfile); return new_ssl_write(context, buffer, bytes); }
As we can see our function definition needs to return an integer and take three arguments: a pointer to an SSL context, a pointer to a buffer containing the string to encrypt, and the number of bytes to write. In addition to our intercepting function definition we define a matching function pointer that will point to the originally intended SSL_write function and initialize it with the dlsym function. After pointing our pointer to the original function, we log the process ID of the process calling SSL_write, and the string sent to it. Next we compile our source to a shared library:
sigma@ubuntu:~/code$ gcc hook.c -o libhook.so -fPIC -shared -lssl -D_GNU_SOURCE sigma@ubuntu:~/code$
The only difference between this compilation and last is the -lssl flag, which we specify in order to link our code against the OpenSSL library. Now let’s go ahead and set LD_PRELOAD to point to our newly created libhook library:
sigma@ubuntu:~/code$ export LD_PRELOAD="/home/sigma/code/libhook.so" sigma@ubuntu:~/code$
Now that LD_PRELOAD is set we’re ready to start intercepting calls to SSL_write on processes executed from here onward. To test this let’s go ahead and use the curl utility over HTTPS and intercept the HTTPS request.
sigma@ubuntu:~/code$ curl https://www.netspi.com > /dev/null % Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 100 19086 0 19086 0 0 37437 0 --:--:-- --:--:-- --:--:-- 60590 sigma@ubuntu:~/code$
After successful completion of the command there should be a log file that we can examine:
sigma@ubuntu:~/code$ cat logfile Process 11423: GET / HTTP/1.1 User-Agent: curl/7.22.0 (i686-pc-linux-gnu) libcurl/7.22.0 OpenSSL/1.0.1 zlib/1.2.3.4 libidn/1.23 librtmp/2.3 Host: www.netspi.com Accept: */* sigma@ubuntu:~/code$
As we can see the request has been logged in plaintext, while the application was allowed to function normally. Had this been a scenario where data integrity relied heavily upon SSL encryption and the assumption that man-in-the-middle attacks would be occurring only at the network level, any such integrity would have been compromised. These are really just a few examples of what’s possible using the dynamic loader API and LD_PRELOAD. Since the shared library we create will be loaded into the running process’ memory space we could do things like dump the memory of the process to examine the memory at runtime or tamper with runtime variables. Other uses for this method of function call hooking and loading generally fall under the use case of user-land rootkits and malware, which will be the focus on the next article in this series.
Explore more blog posts
Clarifying CAASM vs EASM and Related Security Solutions
Unscramble common cybersecurity acronyms with our guide to CAASM vs EASM and more to enhance attack surface visibility and risk prioritization.
Filling up the DagBag: Privilege Escalation in Google Cloud Composer
Learn how attackers can escalate privileges in Cloud Composer by exploiting the dedicated Cloud Storage Bucket and the risks of default configurations.
Bytes, Books, and Blockbusters: The NetSPI Agents’ Top Cybersecurity Fiction Picks
Craving a cybersecurity movie marathon? Get recommendations from The NetSPI Agents on their favorite media to get inspired for ethical hacking.