Patching a running Linux Kernel
written by John Newbigin
jn@it.swin.edu.au

Although the Linux kernel was not designed to be patched on the fly, there are a number of things which you can manipulate using loadable kernel modules.  I have long thought that this could provide a way of applying security patches to running kernels.  This means that you can apply the official fix, which requires a reboot, at a time convenient to you.  This is not to prolong uptime but to allow some flexibility in the scheduling of downtime.

The bug which this concept demonstration will close was sent to BugTraq on 18 Oct 2001.  You can read the email in the BugTraq archive (the URL is given in the comment at the top of the module source).  I am dealing with the "II. Root compromise by ptrace(3)" problem.

Not being a kernel programmer, I needed some documentation to get me started.  The "Linux Kernel Module Programming Guide" is a bit out of date but was sufficient for my purposes.

The plan is simple: replace the execve syscall with a custom syscall which does extra sanity checking.  If the checks are successful, call the original syscall.

This is described in the LKMPG so I am not going to cover it here.
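
For readers who do not have the LKMPG to hand, the general idea can be sketched in ordinary user-space C.  This is only an analogy under stated assumptions: call_table, real_exec, looks_safe and the slot number NR_EXEC are made up for illustration and are not kernel names.  The module further down swaps an entry in sys_call_table in much the same way.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a system call: takes a path and
 * returns 0 on success or a negative value on failure. */
typedef int (*call_fn)(const char *path);

enum { NR_EXEC = 0, NR_CALLS = 1 };   /* made-up slot numbers */

static int real_exec(const char *path)
{
    printf("real_exec: running %s\n", path);
    return 0;
}

/* The "system call table": an array of function pointers. */
static call_fn call_table[NR_CALLS] = { real_exec };

/* Saved so the wrapper can call it and the hook can be removed. */
static call_fn original_call;

/* Hypothetical sanity check: refuse names starting with "bad". */
static int looks_safe(const char *path)
{
    return strncmp(path, "bad", 3) != 0;
}

/* The replacement: check first, then hand over to the original. */
static int checked_exec(const char *path)
{
    if (!looks_safe(path)) {
        printf("checked_exec: %s denied\n", path);
        return -1;
    }
    return original_call(path);
}

static void install_hook(void)
{
    original_call = call_table[NR_EXEC];
    call_table[NR_EXEC] = checked_exec;
}

static void remove_hook(void)
{
    call_table[NR_EXEC] = original_call;
}
```

After install_hook(), every caller that goes through call_table[NR_EXEC] gets the checked version without knowing anything has changed; remove_hook() puts the saved pointer back.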

To perform the checks, I had to get the inode of the file to be executed.  This required the duplication of some of the code deep inside the execve syscall (fs/exec.c).  There was some locking to be performed so rather than add extra locks and unlocks, I decided to duplicate the sys_execve procedure in its entirety. (All 12 lines of it).

This done, I had to create a new procedure to perform the extra checking.  The code out of arch/i386/kernel/process.c was good for the picking and well commented so I did not have any trouble in writing my safe_exec procedure.
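
The mode test at the heart of safe_exec can be tried out in user space.  The sketch below applies the same set-uid / set-gid bit test to a mode taken from stat(2).  The names mode_is_setid, exec_allowed and path_is_setid are hypothetical helpers, and the traced argument merely stands in for the kernel's PF_PTRACED flag, which this user-space code does not read.

```c
#include <sys/types.h>
#include <sys/stat.h>

/* Returns 1 if the mode has the set-uid bit, or the set-gid bit
 * together with group-execute; returns 0 otherwise.  Set-gid
 * without group-execute is not treated as a set-gid executable. */
static int mode_is_setid(mode_t mode)
{
    if (mode & S_ISUID)
        return 1;
    if ((mode & (S_ISGID | S_IXGRP)) == (S_ISGID | S_IXGRP))
        return 1;
    return 0;
}

/* Hypothetical decision helper: 'traced' stands in for
 * (current->flags & PF_PTRACED) in the kernel module.
 * Returns 1 if the exec should go ahead, 0 if it should be denied. */
static int exec_allowed(mode_t mode, int traced)
{
    return !(mode_is_setid(mode) && traced);
}

/* Looks the file up with stat(2).  Returns 1 or 0 as for
 * mode_is_setid, or -1 if the file cannot be examined. */
static int path_is_setid(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;
    return mode_is_setid(st.st_mode);
}
```

Only the combination of a set-id file and a traced process is refused; a traced process may still exec an ordinary binary, and an untraced process may still exec a set-uid one.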

NOTE: This is a patch for Linux 2.2.19.  Newer kernels include an integrated fix for this problem.

Here then is the module

-- syscall.c --
/* syscall.c
 *
 * System call "stealing" sample
 */


/* Copyright (C) 1998-99 by Ori Pomerantz */
/* adapted by John Newbigin <jn@it.swin.edu.au> */

/* This module is designed to close the ptrace issue described here */
/* http://www.securityfocus.com/cgi-bin/archive.pl?id=1&mid=221337&start=2001-10-15&end=2001-10-21 */


/* The necessary header files */

/* Standard in kernel modules */
#include <linux/kernel.h>   /* We're doing kernel work */
#include <linux/module.h>   /* Specifically, a module */

/* Deal with CONFIG_MODVERSIONS */
#if CONFIG_MODVERSIONS==1
#define MODVERSIONS
#include <linux/modversions.h>
#endif

#include <sys/syscall.h>  /* The list of system calls */
#include <linux/smp_lock.h>
#include <linux/mm.h>


/* In 2.2.3 /usr/include/linux/version.h includes a
 * macro for this, but 2.0.35 doesn't - so I add it
 * here if necessary. */
#ifndef KERNEL_VERSION
#define KERNEL_VERSION(a,b,c) ((a)*65536+(b)*256+(c))
#endif



#if LINUX_VERSION_CODE >= KERNEL_VERSION(2,2,0)
#include <asm/uaccess.h>
#endif



/* The system call table (a table of functions). We
 * just define this as external, and the kernel will
 * fill it up for us when we are insmod'ed
 */
extern void *sys_call_table[];


/* A pointer to the original system call, saved so that
 * cleanup_module() can put it back. our_sys_exec() does
 * not call through this pointer; it calls do_execve()
 * directly. Note that restoring it is not 100% safe:
 * if another module replaced the execve entry before
 * us, then this points at the function in that module -
 * and it might be removed before we are. */
asmlinkage int (*original_call)(struct pt_regs regs);


// this only returns 0 if we successfully find that the file is suid or sgid and the process is being ptraced
int safe_exec(char * filename)
{
        struct dentry * dentry;
        struct inode * inode;
        int mode;
        int sid = 0;

        dentry = open_namei(filename, 0, 0);
        if (IS_ERR(dentry))
        {
                return 1;
        }
        inode = dentry->d_inode;
        mode = inode->i_mode;

        if (mode & S_ISUID) {
                // this is suid...
                sid = 1;
        }
        if ((mode & (S_ISGID | S_IXGRP)) == (S_ISGID | S_IXGRP)) {
                // this is sgid...
                sid = 1;
        }

        dput(dentry);

        if(sid == 1)
        {
                // check ptrace
                if(current->flags & PF_PTRACED)
                {
                        // not safe
                        return 0;
                }
        }

        return 1;
}

/*
        This is a custom version of sys_execve which comes from
        arch/i386/kernel/process.c

        It is the same except for the extra call to safe_exec
 */

asmlinkage int our_sys_exec(struct pt_regs regs)
{
        int error;
        char * filename;

        lock_kernel();
        filename = getname((char *) regs.ebx);
        error = PTR_ERR(filename);
        if (IS_ERR(filename))
                goto out;
        if(safe_exec(filename) == 1)
        {
                error = do_execve(filename, (char **) regs.ecx, (char **) regs.edx, &regs);
        }
        else
        {
                printk("exec of %s denied because of suid & ptrace\n", filename);
                error = -EPERM;
        }
        if (error == 0)
                current->flags &= ~PF_DTRACE;
        putname(filename);
out:
        unlock_kernel();
        return error;

}


/* Initialize the module - replace the system call */
int init_module(void)
{
  original_call = sys_call_table[__NR_execve];
  sys_call_table[__NR_execve] = our_sys_exec;

  MOD_INC_USE_COUNT;
  printk("exec module installed\n");

  return 0;
}


/* Cleanup - This should never get called because the module use count will always be 1 */
void cleanup_module(void)
{
  /* Return the system call back to normal */
  if (sys_call_table[__NR_execve] != our_sys_exec) {
    printk("Somebody else also played with the ");
    printk("execve system call\n");
    printk("The system may be left in ");
    printk("an unstable state.\n");
  }

  sys_call_table[__NR_execve] = original_call;
}

      
and the makefile (not pretty but works for me)
CC=gcc
MODCFLAGS := -Wall -DMODULE -D__KERNEL__ -DLINUX -I/usr/src/linux-2.2.19/include -Wstrict-prototypes -O2 -fomit-frame-pointer -fno-strict-aliasing -pipe -fno-strength-reduce -m486 -malign-loops=2 -malign-jumps=2 -malign-functions=2 -DCPU=586

syscall.o:      syscall.c
	$(CC) $(MODCFLAGS) -c syscall.c

      
Future Direction

If this patch works out to be stable, then I will apply it to the unpatched machines I am currently in charge of.  My code above is not very high quality but my hope is that vendors like RedHat will start making patches like this which clients can apply so they can better manage their downtime.


Last modified 20031008
Maintained by John Newbigin http://uranus.it.swin.edu.au/~jn