A Brief Exploration of CVE-2018-10938

A recent post to the OSS Security mailing list brought up a potential DoS fixed in Linux about a year ago. This got a decent amount of attention on Twitter, and so I decided to see if I could create a proof-of-concept for this relatively simple bug.

Affected Systems

Before we get started, it’s worth pointing out that while this is a single-packet remote DoS on affected boxes, very few systems are likely to be affected. Several conditions must all hold for it to trigger:

Going into detail on each of these requirements, the only common distros with vulnerable kernels are (as far as I can tell with some quick googling):

Out of these, only RHEL/CentOS use SELinux by default (the others use AppArmor). Because of this, we’re already down to a single distro running a non-stock kernel, and we haven’t even gotten to the “weird” requirement yet:

CIPSO (Commercial IP Security Option) is an IP option that is supposed to allow for machines to communicate the “security” of net flows to other servers (e.g. DoD classification level). This IP option was never actually standardized and only remains as an expired IETF draft today, yet it is still implemented in Linux.

Given that this option was never fully standardized, I highly doubt many are using it in production networks.

First Attempt

The commit referenced in the mailing list post seems to explain exactly how to reproduce the bug:

Test: receive a packet which the ip length > 20 and the first byte of ip option is 0, produce this issue

Comparing this to what was changed in the commit doesn’t quite add up though. Based on the changes, it’s obvious that we enter a spinloop (DoS) condition when the length of an option is 0, but the length is stored in the second byte of the option. Setting an option length to 0 still seems like the way to go though, so I put together a small Scapy script with a “Stream Identifier” option (type 136), but forcefully set the length byte to 0:

...
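# Raw option bytes; bytes literals work under both Python 2 and Python 3
opt_type = b'\x88'       # 136: Stream Identifier
opt_len = b'\x00'        # forced to 0 (a valid Stream ID option has length 4)
stream_id = b'\x00\x01'
ip.options = [IPOption(opt_type + opt_len + stream_id)]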
...

This didn’t work immediately, so I needed to dig a bit more. I started by using the kprobe tool from perf-tools to check if the function was being hit at all. And it wasn’t.

At this point I figured I needed to somehow activate this subsystem since I wasn’t seeing any hits at all. I worked my way up the call tree from the vulnerable function (cipso_v4_optptr) using elixir and found that it was called by a few functions in net/netlabel/netlabel_kapi.c, the first of which (netlbl_skbuff_getattr) seemed to be the most interesting (based on the name). This was in turn called by selinux_netlbl_skbuff_getsid, which calls netlbl_enabled before doing anything, presumably so that the kernel doesn’t unnecessarily burn cycles looking for CIPSO headers when it wouldn’t do any processing based on them anyways.

Looking into netlbl_enabled, I realized I probably needed to add a CIPSO rule for this entire subsystem to activate.

Sure enough, after installing and fumbling around with netlabelctl to add the rule, I started seeing kprobe hits on cipso_v4_optptr with other network traffic, so I knew I was headed in the right direction. But the exploit still didn’t work :(

And curiously, I didn’t see any kprobe hits when my PoC was firing…

Second Attempt

Given everything I had seen (and not seen) at this point, I figured that something upstream of netlbl_skbuff_getattr was sanity-checking the IP option lengths to make sure they were valid. A bit more testing with other IP options seemed to confirm this, so I began hunting for the place where this validation happens and found some code in net/ipv4/ip_options.c that looks exactly like the validation code I was expecting to find (see ip_options_compile).

While I don’t think (due to the name of the function) that this is the function actually doing the opt len checking on each packet, this brought something to my attention: not only is IPOPT_END a special case, but IPOPT_NOOP is as well! Both of these are checked for before length checking is done, and reviewing the original Internet Protocol RFC confirms that these are the 2 IP options that do not have a length field:

The option field is variable in length.  There may be zero or more
options.  There are two cases for the format of an option:

  Case 1:  A single octet of option-type.

  Case 2:  An option-type octet, an option-length octet, and the
           actual option-data octets.

...

The following internet options are defined:

  CLASS NUMBER LENGTH DESCRIPTION
  ----- ------ ------ -----------
    0     0      -    End of Option list.  This option occupies only
                      1 octet; it has no length octet.
    0     1      -    No Operation.  This option occupies only 1
                      octet; it has no length octet.
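
To make the two option formats concrete, here is a minimal Python sketch of the kind of length validation described above. It is an illustration written from the RFC text, not a port of ip_options_compile; the function name options_are_valid is hypothetical.

```python
IPOPT_END = 0   # End of Option list: a single octet, no length octet
IPOPT_NOOP = 1  # No Operation: a single octet, no length octet


def options_are_valid(opts: bytes) -> bool:
    """Return True if every option in `opts` is well-formed per RFC 791."""
    i = 0
    while i < len(opts):
        opt_type = opts[i]
        if opt_type == IPOPT_END:
            return True      # nothing after END is interpreted
        if opt_type == IPOPT_NOOP:
            i += 1           # Case 1: option-type octet only
            continue
        # Case 2: type, length, data. The length counts the type and
        # length octets themselves, so anything below 2 is malformed.
        if i + 1 >= len(opts):
            return False     # no room left for a length octet
        opt_len = opts[i + 1]
        if opt_len < 2 or i + opt_len > len(opts):
            return False
        i += opt_len
    return True
```

Under this sketch, the first attempt's bytes (type 136 with a length of 0) are rejected, while a NO-OP followed by END padding is accepted.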

But of course cipso_v4_optptr doesn’t take the NO-OP special case into account.

At this point I realized that both NO-OP and END are a single octet, and that END is type 0. So if a packet carries a NO-OP option followed by an END option, the properly implemented option checking upstream will accept it, but the CIPSO code will read the END octet (which, as type 0, is just a NULL byte) as the length of the NO-OP option, giving us a spinloop!
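
The mismatch can be modelled in a few lines of Python. This is a simplified sketch based on the behaviour described in this post, not the kernel source: the function names, the max_steps cap, and the SPIN sentinel are all assumptions added so the model terminates instead of hanging.

```python
IPOPT_END = 0
IPOPT_NOOP = 1
IPOPT_CIPSO = 134  # IP option type used by CIPSO
SPIN = "spin"      # hypothetical sentinel: the walk stopped making progress


def naive_find_cipso(opts: bytes, max_steps: int = 1000):
    """Walk the options assuming every option has a length octet."""
    pos = 0
    for _ in range(max_steps):
        if pos >= len(opts):
            return None
        if opts[pos] == IPOPT_CIPSO:
            return pos
        if pos + 1 >= len(opts):
            return None      # model only: avoid indexing past the buffer
        pos += opts[pos + 1]  # a "length" of 0 means no progress at all
    return SPIN


def careful_find_cipso(opts: bytes):
    """Same walk, but with END/NO-OP handled and lengths checked."""
    pos = 0
    while pos < len(opts):
        opt_type = opts[pos]
        if opt_type == IPOPT_END:
            return None
        if opt_type == IPOPT_NOOP:
            pos += 1
            continue
        if pos + 1 >= len(opts):
            return None
        length = opts[pos + 1]
        if length < 2 or pos + length > len(opts):
            return None
        if opt_type == IPOPT_CIPSO:
            return pos
        pos += length
    return None
```

Calling naive_find_cipso(bytes([1, 0, 0, 0])) returns the SPIN sentinel, which stands in for the stall, while careful_find_cipso on the same bytes returns None.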

Modifying the first attempt’s code to test this is trivial (the END opt is automatically added by Scapy):

ip.options = [IPOption(b'\x01')]  # NO-OP; bytes literal works under Python 2 and 3
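
For reference, here is a standard-library sketch of what the options area should look like on the wire once the single NO-OP is padded out. It assumes the padding is done with zero (END) octets up to a 32-bit boundary, as RFC 791 describes; pad_options and header_length_words are hypothetical helper names, not Scapy APIs.

```python
IPOPT_END = 0x00
IPOPT_NOOP = 0x01


def pad_options(opts: bytes) -> bytes:
    """Pad an option list with END octets up to a 4-byte boundary."""
    return opts + bytes([IPOPT_END]) * (-len(opts) % 4)


def header_length_words(opts: bytes) -> int:
    """IHL value: 5 words of fixed header plus the padded options."""
    return 5 + len(pad_options(opts)) // 4


wire = pad_options(bytes([IPOPT_NOOP]))
print(wire.hex(), header_length_words(bytes([IPOPT_NOOP])))  # 01000000 6
```

The byte right after the NO-OP is 0x00, which is exactly what a walker expecting a length octet would pick up.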

Bingo! A stall.

To clean things up for the Twitter PoC, I searched around a bit and found that Scapy has some helper classes for IP options, one of which (IPOption_NOP) does everything we need. The final exploit, small enough to fit in a tweet, looks like this:

from scapy.all import *
import sys
ip = IP(dst=sys.argv[1], options=[IPOption_NOP()])
udp = UDP(sport=12345, dport=12345)
pkt = ip/udp/"TEST"
sr1(pkt, verbose=0)

This successfully hangs a CentOS 7 box that’s fully upgraded with ELRepo kernel-lt (as of 28 Aug 2018), a single CIPSO rule added, and a netcat instance listening on UDP 12345 with the necessary iptables/firewall-cmd rules added.

Conclusion

While this bug won’t affect many boxes due to the obscure setup required to be vulnerable, it was nevertheless a fun way to spend a night triaging, debugging, reproducing, and eventually exploiting the bug. It also gave me some interesting ideas about ways to automatically detect these types of simple DoS bugs which I hope to cover in the future (hopefully after writing some tooling).