CVE-2021-20226: A Reference-Counting Bug in the Linux Kernel io_uring Subsystem
April 22, 2021 | Lucas Leong
In June 2020, we received a Linux kernel submission detailing a reference-counting bug in the recently introduced io_uring subsystem. The bug leads to a use-after-free on any file structure, which can be leveraged for privilege escalation in the kernel. This bug was submitted by Ryota Shiga (@Ga_ryo_) of Flatt Security.
We believe that the vulnerability affected the Linux kernel from version 5.6 to 5.7 inclusive. The vulnerability has been assigned identifiers ZDI-21-001 and CVE-2021-20226.
The Vulnerability
Linux kernel 5.1 introduced a new asynchronous I/O feature called io_uring. This subsystem operates by batching I/O operation system calls, so that multiple I/O operations can be performed in one system call.
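To make the batching idea concrete, here is a minimal userspace model in C. It is only a sketch and not the real io_uring interface: the names toy_ring, toy_sqe, toy_queue() and toy_enter() are hypothetical, and the "system call" is simulated by a counter. It shows several requests being queued in memory and then consumed by a single kernel entry.

```c
#include <stddef.h>

#define TOY_RING_SIZE 8

/* One queued request; `opcode` and `fd` are illustrative fields only. */
struct toy_sqe {
    int opcode;
    int fd;
};

/* Hypothetical stand-in for a submission ring shared with the kernel. */
struct toy_ring {
    struct toy_sqe entries[TOY_RING_SIZE];
    size_t queued;    /* requests waiting to be submitted */
    size_t syscalls;  /* number of simulated kernel entries */
    size_t completed; /* requests processed so far */
};

/* Queue a request in memory; no kernel transition is simulated here. */
static int toy_queue(struct toy_ring *ring, int opcode, int fd)
{
    if (ring->queued == TOY_RING_SIZE)
        return -1; /* ring is full */
    ring->entries[ring->queued].opcode = opcode;
    ring->entries[ring->queued].fd = fd;
    ring->queued++;
    return 0;
}

/* Simulate one "enter" call that consumes every queued request. */
static size_t toy_enter(struct toy_ring *ring)
{
    size_t n = ring->queued;

    ring->syscalls++;
    ring->completed += n;
    ring->queued = 0;
    return n;
}
```

Queuing three requests with toy_queue() and then calling toy_enter() once processes all three while the simulated system call counter advances by only one.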
Linux kernel 5.6 has a flawed implementation of the IORING_OP_CLOSE operation. When a system call passes a files_struct to a kernel thread, io_grab_files() doesn’t increment the reference counter at (1). This can lead to a later access of the freed file structure.
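The pattern behind this class of bug can be illustrated with a small userspace model in C. This is a sketch under stated assumptions, not the kernel source: toy_files, toy_get(), toy_put(), toy_grab_buggy() and toy_grab_counted() are hypothetical names, and the object sets a flag instead of really being freed so that its state stays observable. It does not claim to show how the issue was fixed upstream.

```c
#include <stdbool.h>

/* Toy stand-in for a reference-counted kernel object. */
struct toy_files {
    int  refcount;
    bool freed; /* set instead of freeing, so the state stays observable */
};

static void toy_get(struct toy_files *f)
{
    f->refcount++;
}

static void toy_put(struct toy_files *f)
{
    if (--f->refcount == 0)
        f->freed = true;
}

/* Uncounted hand-off: the worker keeps the pointer but takes no reference. */
static struct toy_files *toy_grab_buggy(struct toy_files *f)
{
    return f;
}

/* Counted hand-off: the worker owns a reference of its own. */
static struct toy_files *toy_grab_counted(struct toy_files *f)
{
    toy_get(f);
    return f;
}
```

With toy_grab_buggy(), a single toy_put() by the original owner marks the object as freed while the worker still holds the pointer. With toy_grab_counted(), the object survives until the worker also calls toy_put().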
Exploitation
The map_lookup_elem() and map_update_elem() functions are good candidates for use in exploiting this bug.
The fdget() at (2) is an optimized function that doesn't increase the reference count if the current task is single-threaded. The returned file structure, f, can therefore be freed by a later IORING_OP_CLOSE. The __bpf_copy_key() call at (3) is actually a wrapper for copy_from_user(). This provides an opportunity to win the race: userfaultfd can stall the copy from user memory while the vulnerability is triggered. At this point, the file structure f and its corresponding map are freed. The memory of the map can be reallocated with fake data at (4) and (5). Finally, we can read arbitrary memory at (6) and disclose it to user mode.
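The fdget() optimization described above can be modeled in a few lines of userspace C. This is a simplified sketch, not the kernel implementation: toy_file, toy_fd, toy_fdget() and toy_fdput() are hypothetical names, and the table_users parameter is an assumption standing in for the number of tasks sharing the descriptor table.

```c
#include <stdbool.h>

/* Toy stand-in for a reference-counted file object. */
struct toy_file {
    int refcount;
};

/* Lookup result: the file plus whether a reference must be dropped later. */
struct toy_fd {
    struct toy_file *file;
    bool need_put;
};

/*
 * `table_users` models how many tasks share the descriptor table.
 * With a single user the lookup borrows the file without counting it.
 */
static struct toy_fd toy_fdget(struct toy_file *file, int table_users)
{
    struct toy_fd fd = { file, false };

    if (table_users > 1) {
        file->refcount++;
        fd.need_put = true;
    }
    return fd;
}

/* Drop the reference only if the lookup actually took one. */
static void toy_fdput(struct toy_fd fd)
{
    if (fd.need_put)
        fd.file->refcount--;
}
```

In the single-user case the caller holds a borrowed pointer whose lifetime depends entirely on nobody else closing the file in the meantime, which is the assumption the bug breaks.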
Here is an overview of the exploit timeline:
The recvmsg() function is used for timing control. The freed bpf_map can be faked by spraying with setxattr(). An arbitrary write can be achieved with map_update_elem(). This exploit method is restricted to a single-core environment due to the condition of fdget().
Conclusion
New features mean new attack surfaces, and new attack surfaces often lead to new bugs being discovered. It will be interesting to see if any other vulnerabilities are found in this subsystem. Regardless, it was a great find by Ryota, and we appreciate his submission. If that name sounds familiar at all, Ryota also competed in the most recent Pwn2Own and won $30,000 demonstrating a different privilege escalation bug on Ubuntu. We look forward to seeing more from him in the future.
You can find me on Twitter @_wmliang_, and follow the team for the latest in exploit techniques and security patches.