System Programming: Hijack Linux System Calls: Part III. System Call Table

Thursday, October 13, 2011

Hijack Linux System Calls: Part III. System Call Table

This is the last part of the Hijack Linux System Calls series. By now, we have created a simple loadable kernel module that registers a miscellaneous character device. This means that we have almost everything we need in order to patch the system call table. We still have to fill in the our_ioctl function and add a couple of declarations to our source file. By the end of this article, we will be able to intercept any system call on our system, should there be a need for that.

System Call Table

The system call table is simply an area in kernel memory that contains the addresses of the system call handlers. A system call number is an index into that table. This means that when we call write (to be more precise, when libc invokes sys_write on our behalf) on a 32 bit system, the number 4 is placed in the EAX register before int 0x80. That simply tells the kernel to go to the system call table, take entry number 4, and call the function that entry points to. On a 64 bit system, it is number 1 in RAX (and syscall instead of int 0x80). System call numbers are defined in arch/x86/include/asm/unistd_32.h and arch/x86/include/asm/unistd_64.h for 32 and 64 bit platforms respectively. In this article, we are going to deal with the sys_open system call, which is number 5 on 32 bit systems and number 2 on 64 bit systems.
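To see that the number alone selects the handler, we can bypass the libc wrapper from user space. The sketch below is not part of the module. raw_write is a hypothetical helper name, and the code assumes glibc on Linux, which provides the syscall() function and the SYS_write constant.

```c
#define _GNU_SOURCE
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Write msg to fd by invoking the write system call through its number.
 * SYS_write expands to 4 on 32 bit x86 and to 1 on x86-64. */
long raw_write(int fd, const char *msg)
{
    return syscall(SYS_write, fd, msg, strlen(msg));
}
```

Calling raw_write(1, "hello\n") writes the string to standard output and returns the number of bytes written, exactly as the write() wrapper would.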

Because modern kernels no longer export the sys_call_table symbol, we have to find its location in memory ourselves. There are some "hackish" ways of finding sys_call_table programmatically, but they are unreliable: they may work on one kernel and fail on another, especially the way they are usually written. Therefore, we are going to use the simplest and safest way: read its location from the /boot/System.map file. For simplicity, we will just use grep and hardcode the address. On my computer, the command grep "sys_call_table" /boot/System.map (check the file name on your system; on mine it is /boot/System.map-2.6.38-11-generic) gives the output "ffffffff816002e0 R sys_call_table". Add a global variable unsigned long *sys_call_table = (unsigned long*)0xYour_Address_Of_Sys_call_table.
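If you would rather not copy the address by hand, the lookup can be automated in user space. Below is a minimal sketch, assuming the usual "address type name" layout of System.map lines. parse_map_line and find_symbol are hypothetical helper names, not kernel or libc functions. The sketch compares the whole symbol name, because a plain grep for sys_call_table may also match other names that merely contain it, such as ia32_sys_call_table.

```c
#include <stdio.h>
#include <string.h>

/* Parse one System.map line of the form "<hex address> <type> <symbol>".
 * Returns 1 and stores the address in *addr when the symbol equals `wanted`,
 * otherwise returns 0 and leaves *addr untouched. */
int parse_map_line(const char *line, const char *wanted,
                   unsigned long long *addr)
{
    unsigned long long value;
    char type;
    char symbol[128];

    if (sscanf(line, "%llx %c %127s", &value, &type, symbol) != 3)
        return 0;
    if (strcmp(symbol, wanted) != 0)
        return 0;
    *addr = value;
    return 1;
}

/* Scan a whole map file. Returns 1 on success, 0 if the file cannot be
 * opened or the symbol is not listed in it. */
int find_symbol(const char *path, const char *wanted,
                unsigned long long *addr)
{
    char line[256];
    int found = 0;
    FILE *f = fopen(path, "r");

    if (f == NULL)
        return 0;
    while (!found && fgets(line, sizeof line, f) != NULL)
        found = parse_map_line(line, wanted, addr);
    fclose(f);
    return found;
}
```

The resulting address still has to reach the module, for example by pasting it into the source as described above.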

Preparations

We will start, as usual, by adding new includes to our code. This time, those include files are:

#include <linux/highmem.h>

#include <asm/unistd.h>

The first one is needed because the system call table is located in a read-only memory area in modern kernels, so we have to modify the protection attributes of the memory page containing the address of the system call that we want to intercept. The second one provides the system call numbers: we are not going to use hardcoded values for system calls; instead, we will use the __NR_ constants defined in the unistd.h header.

Now we define two values, which will be used as the cmd argument to the our_ioctl function. One tells us to patch the table; the other tells us to fix it by restoring the original value.

/* IOCTL commands */

#define IOCTL_PATCH_TABLE 0x00000001

#define IOCTL_FIX_TABLE 0x00000004

Add one more global variable, int is_set=0, which will be used as a flag telling whether the real (0) or the custom (1) system call is in use.

It is important to save the address of the original sys_open, as we are not going to fully implement our own. Instead, our function will log information about the call arguments and then perform the actual (original) call. Therefore, we define a function pointer (for the original call) and a function (for the custom call):

/* Pointer to the original sys_open */
asmlinkage int (*real_open)(const char* __user, int, int);

/* Our replacement */
asmlinkage int custom_open(const char* __user file_name, int flags, int mode)
{
    printk("interceptor: open(\"%s\", %X, %X)\n", file_name, flags, mode);
    return real_open(file_name, flags, mode);
}

You have noticed the "asmlinkage" attribute. It is actually a macro that expands to a compiler attribute. We will not go that deep this time; I will just say that it tells the compiler how arguments are passed to the function, given that it is being called from assembly code. The "__user" macro signifies that the argument points into user space, and the function must perform certain operations to copy the data to kernel space when needed. We do not need that here, so we may ignore it for now.
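The save-and-forward idea behind real_open and custom_open can be tried safely in user space first. The following is only a simulation under stated assumptions: fake_sys_open, table_slot, install_hook and the _sim names are all hypothetical stand-ins, and no kernel structure is touched.

```c
#include <stdio.h>

typedef int (*open_fn)(const char *, int, int);

/* Stand-in for the kernel's handler; always returns a fake descriptor. */
int fake_sys_open(const char *file_name, int flags, int mode)
{
    (void)file_name;
    (void)flags;
    (void)mode;
    return 3;
}

open_fn real_open_sim;                 /* saved original handler */
open_fn table_slot = fake_sys_open;    /* one-entry "system call table" */
int intercepted_calls;                 /* how many calls went through the hook */

/* Replacement: log the arguments, then forward to the saved original. */
int custom_open_sim(const char *file_name, int flags, int mode)
{
    intercepted_calls++;
    printf("interceptor: open(\"%s\", %X, %X)\n",
           file_name, (unsigned)flags, (unsigned)mode);
    return real_open_sim(file_name, flags, mode);
}

/* Save the current entry, then point the slot at the replacement. */
void install_hook(void)
{
    real_open_sim = table_slot;
    table_slot = custom_open_sim;
}
```

After install_hook(), a call through table_slot prints the log line and still returns whatever the original handler returns, which is the behaviour we want from the kernel version.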

Another crucial pair of functions is the one that allows us to modify the memory page protection attributes directly. One may say that this is risky, but, in my opinion, it is less risky than actually patching the system call table: first of all, it is architecture dependent, and we know that architectures do not change drastically; second, we use kernel functions for that.

/* Make the page writable */
int make_rw(unsigned long address)
{
    unsigned int level;
    pte_t *pte = lookup_address(address, &level);

    /* Set the RW bit only if it is not already set */
    if (!(pte->pte & _PAGE_RW))
        pte->pte |= _PAGE_RW;
    return 0;
}

/* Make the page write protected */
int make_ro(unsigned long address)
{
    unsigned int level;
    pte_t *pte = lookup_address(address, &level);

    /* Clear the RW bit */
    pte->pte = pte->pte & ~_PAGE_RW;
    return 0;
}

pte_t stands for typedef struct { unsigned long pte; } pte_t and represents a page table entry. Although it is simply an unsigned long, it is declared as a struct in order to avoid type misuse.

pte_t *lookup_address(unsigned long address, unsigned int *level) is provided by the kernel. It performs all the dirty work for us and returns a pointer to the page table entry that describes the page containing the address. This function accepts the following arguments:

address - an address in virtual memory;

level - pointer to an unsigned integer that receives the level of the mapping.
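The bit manipulation itself is easy to check outside the kernel. The sketch below simulates a page table entry in user space; sim_pte_t, SIM_PAGE_RW and the sim_ functions are hypothetical names, and the value 0x2 assumes the x86 layout, where the read/write flag is bit 1 of the entry.

```c
/* Simulated page table entry; mirrors the shape of the kernel's pte_t. */
typedef struct { unsigned long pte; } sim_pte_t;

/* Assumption: bit 1 is the read/write flag, as on x86. */
#define SIM_PAGE_RW 0x2UL

/* Non-zero when the entry allows writes. */
int sim_is_writable(const sim_pte_t *p)
{
    return (p->pte & SIM_PAGE_RW) != 0;
}

/* Set the RW flag, leaving every other bit untouched. */
void sim_make_rw(sim_pte_t *p)
{
    if (!sim_is_writable(p))
        p->pte |= SIM_PAGE_RW;
}

/* Clear the RW flag, leaving every other bit untouched. */
void sim_make_ro(sim_pte_t *p)
{
    p->pte &= ~SIM_PAGE_RW;
}
```

Starting from an entry value of 0x161 (RW clear), sim_make_rw yields 0x163 and sim_make_ro brings it back to 0x161, so only the one flag ever changes.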

Let's Get to Business

We are almost there. The only thing left is the actual implementation of the our_ioctl function. Add the following lines:

switch(cmd)
{
    case IOCTL_PATCH_TABLE:
        make_rw((unsigned long)sys_call_table);
        real_open = (void*)*(sys_call_table + __NR_open);
        *(sys_call_table + __NR_open) = (unsigned long)custom_open;
        make_ro((unsigned long)sys_call_table);
        is_set=1;
        break;
    case IOCTL_FIX_TABLE:
        make_rw((unsigned long)sys_call_table);
        *(sys_call_table + __NR_open) = (unsigned long)real_open;
        make_ro((unsigned long)sys_call_table);
        is_set=0;
        break;
    default:
        printk("Ooops....\n");
        break;
}

And these lines to the cleanup_module function:

if(is_set)
{
    make_rw((unsigned long)sys_call_table);
    *(sys_call_table + __NR_open) = (unsigned long)real_open;
    make_ro((unsigned long)sys_call_table);
}

Our interceptor module is ready. Well, almost ready as we need to compile it. Do that as usual - make.
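Before loading anything into the kernel, the patch and restore logic can be exercised in user space. This is a simulation only: sim_table, sim_ioctl, the SIM_ constants and both handlers are hypothetical. It also adds one thing the switch above does not have, a guard on the flag, so that a second PATCH command cannot overwrite the saved original with the address of the replacement.

```c
#define SIM_PATCH   0x00000001
#define SIM_FIX     0x00000004
#define SIM_NR_OPEN 2              /* table index used by this simulation */

typedef long (*handler_t)(void);

long original_handler(void) { return 100; }
long custom_handler(void)   { return 200; }

handler_t sim_table[8] = { [SIM_NR_OPEN] = original_handler };
handler_t saved_handler;           /* plays the role of real_open */
int sim_is_set;                    /* plays the role of is_set */

/* Returns 0 on success and -1 for an unknown command. */
int sim_ioctl(unsigned int cmd)
{
    switch (cmd) {
    case SIM_PATCH:
        if (!sim_is_set) {         /* patch only once */
            saved_handler = sim_table[SIM_NR_OPEN];
            sim_table[SIM_NR_OPEN] = custom_handler;
            sim_is_set = 1;
        }
        return 0;
    case SIM_FIX:
        if (sim_is_set) {          /* restore only if patched */
            sim_table[SIM_NR_OPEN] = saved_handler;
            sim_is_set = 0;
        }
        return 0;
    default:
        return -1;
    }
}
```

Without the guard, sending PATCH twice would store the replacement's own address as the "original", and the replacement would then call itself forever.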

Test

Finally, we have our module set and ready to use, but we have to create a "client" application, the code that will "talk" to our module and tell it what to do. Fortunately, this is much simpler than the rest of the work that we have done here. Create a new source file and enter the following lines:

#include <stdio.h>
#include <unistd.h>     /* sleep(), close() */
#include <sys/ioctl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

/* Define ioctl commands */
#define IOCTL_PATCH_TABLE 0x00000001
#define IOCTL_FIX_TABLE 0x00000004

int main(void)
{
    int device = open("/dev/interceptor", O_RDWR);
    ioctl(device, IOCTL_PATCH_TABLE);   /* install our handler */
    sleep(5);                           /* let some open() calls happen */
    ioctl(device, IOCTL_FIX_TABLE);     /* restore the original */
    close(device);
    return 0;
}

Save it as manager.c and compile it with gcc -o manager manager.c.
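The client above ignores errors, which makes a missing device node or insufficient permissions hard to spot. A slightly more defensive variant could look like the sketch below; send_command is a hypothetical helper name, and the only calls used are the standard open(), ioctl(), close() and perror().

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Open the device at `path`, send one ioctl command, and close it.
 * Returns 0 on success and -1 if either open() or ioctl() fails. */
int send_command(const char *path, unsigned long cmd)
{
    int rc;
    int device = open(path, O_RDWR);

    if (device < 0) {
        perror("open");
        return -1;
    }
    rc = ioctl(device, cmd);
    if (rc < 0)
        perror("ioctl");
    close(device);
    return rc < 0 ? -1 : 0;
}
```

With it, main() reduces to send_command("/dev/interceptor", IOCTL_PATCH_TABLE), a sleep, and send_command("/dev/interceptor", IOCTL_FIX_TABLE), stopping early if the first call returns -1.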

Load the module, run ./manager, and then unload the module when manager exits. Now issue the dmesg | tail command. If you see lines containing "interceptor: open(blah blah blah)", then you know that those lines were produced by our handler.

Now we are able to intercept system calls in modern kernels despite the fact that sys_call_table is no longer exported. Although we deal with low level structures, which are normally used only by the kernel, this is still a relatively safe method as long as your module is compiled against the running kernel and the hardcoded address comes from that kernel's System.map.

Hope this post was helpful. See you at the next one!

17 comments:

Anonymous, April 4, 2012 at 12:03 PM
Thanks Alexey- great job you do!

Then follows some questions:
sys_call_table address is not any more exported with new kernels. I use 3.2.0. You can get address for ex. as "sudo grep sys_call_table /boot/System.map-3.2.0" But what about the future? Map is changed?
Brute force method works with __NR_... seeking whole memory. But what then when __NR_... is not exported any more?
Anonymous, April 4, 2012 at 12:15 PM
To be more specific: to seek whole memory as "ptr[__NR_close] == (unsigned long) sys_close".
If sys_close is not exported any more?
Rui Han, June 14, 2012 at 7:20 AM
Thanks for sharing. I went through all three tutorials in this series, but my source code won't compile based on those articles; multiple errors appeared. I am a newbie in Linux kernel module programming. I just want to know if you can provide a single working source file that can intercept sys_open() on kernels above 2.6.38, for example Ubuntu 12.04 with kernel 3.2.0. Thank you very much.
Fredrik Persson, January 18, 2013 at 2:45 PM
Thank you for a great tutorial on this.

I'm trying to build this for MIPS architecture. It seems that the symbol lookup_address is not available on that arch. I get:

error: implicit declaration of function 'lookup_address'

... when I try to build. On x86 it works fine. Do you have any idea how to fix this?
Unknown, September 19, 2013 at 6:13 PM
Thanks Alexey for publishing this article.

I would like to know how a system call is interfaced with an LKM. AFAIK, when we write to a device file, say /dev/fpga, using the write() call in user space with 3 arguments, it is linked to sys_write in kernel space, which is further linked to the LKM. How is this link between sys_write and the LKM maintained?

Unknown, November 3, 2013 at 4:48 AM
Hi Alexey,
Thanks to your article I was able to do something. Sorry for my ignorance beforehand but I want to ask a question.
I don't want to rebuild the kernel all again so is it possible to implement a new system call with your method? If not at least I am planning to add a new system call which literally does nothing and then intercept it with custom modules. That would still require building the kernel once but it is more convenient for debugging. What do you think, would it work?