Posted by Natalie Silvanovich, Project Zero
This is a three-part series on exploiting messenger applications using vulnerabilities in WebRTC. This series highlights what can go wrong when applications don't apply WebRTC patches and when the communication and notification of security issues break down. Part 2 is scheduled for August 5 and Part 3 is scheduled for August 6.
Part 1: First Attempts
WebRTC is an open source video conferencing solution used by a variety of software including browsers, messaging clients and streaming services. While Project Zero has reported several vulnerabilities in WebRTC in the past, it was not clear whether these bugs were exploitable, especially outside of browsers. I investigated whether two recent bugs are exploitable in popular Android messaging applications.
The Bugs
Both of these vulnerabilities are in WebRTC’s Real-time Transport Protocol (RTP) processing. RTP is the protocol WebRTC uses to transport audio and video content from peer to peer. RTP supports extensions, which are extra pieces of data that can be included in each packet to tell the destination peer how to display or process the data. For example, there is an extension that contains information about the screen orientation of the sending device, and one that contains the volume level. Both of these vulnerabilities occurred in extensions that had been implemented in WebRTC in 2023.
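To make the idea of an RTP extension concrete, here is a minimal sketch of walking the one-byte-header extension format described in RFC 8285, where each element starts with a byte holding a 4-bit ID and a 4-bit length-minus-one. It is an illustration only, not WebRTC’s parser: ExtensionElement and ParseOneByteExtensions are hypothetical names, and the input is assumed to be the extension block that follows the 0xBEDE marker and length word.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One parsed extension element: a 4-bit ID and a view of its payload.
// (Hypothetical type, for illustration only.)
struct ExtensionElement {
  uint8_t id;
  const uint8_t* data;
  size_t length;
};

// Walks a one-byte-header extension block and returns the elements found.
// Parsing stops at the first element that would run past the block.
std::vector<ExtensionElement> ParseOneByteExtensions(const uint8_t* block,
                                                     size_t size) {
  assert(block != nullptr || size == 0);
  std::vector<ExtensionElement> out;
  size_t pos = 0;
  while (pos < size) {
    const uint8_t header = block[pos++];
    const uint8_t id = header >> 4;
    if (id == 0) continue;  // ID 0 marks a padding byte: skip it
    if (id == 15) break;    // ID 15 is reserved: stop parsing
    const size_t length = (header & 0x0F) + 1u;  // field stores length - 1
    if (length > size - pos) break;  // payload would exceed the block
    out.push_back({id, block + pos, length});
    pos += length;
  }
  return out;
}
```

The length check before each element is the kind of validation whose absence leads to the bugs discussed in this post: the length and ID come straight from the remote peer.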
CVE-2023-6389 occurred in the frame marking extension, which contains information on how video content is split into frames. The bug is in how it processes layer information: WebRTC only supports five layers, but the layer number is a three-bit field in the extension, which means it can go as high as seven. This leads to an out-of-bounds write in the following code. temporal_idx is set from the layer number in the extension.
The final line of code is where the out-of-bounds write occurs, as the array only contains five elements. This bug also has some limitations that are not obvious from the above code. To start, there is a check before the write that tests whether the current value of the memory, cast to a 16-bit unsigned integer, is more than the current sequence number. The write only occurs if this is true. In practice, this wasn’t much of a limitation: when I tested it, a crash usually occurred after two or three attempts. A more serious limitation is that the layer_info_it->second field has a 64-bit integer type, but frame->id.picture_id is a 16-bit integer. This means that while this bug allows an attacker to write up to three 64-bit integers outside of a fixed-size heap buffer, the values that can be written are very limited, and are too small to represent pointers.
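The arithmetic behind this bug can be sketched as follows. This is not the WebRTC source or the upstream patch: the names are hypothetical, the position of the layer bits within the byte is an assumption, and the store shown is a bounds-checked version of the pattern described above.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr size_t kSupportedLayers = 5;    // per the text: five layers
constexpr unsigned kLayerFieldBits = 3;   // per the text: three-bit field

// Extracts a three-bit layer number from the low bits of a byte.
// (Assumption: the real extension may place these bits elsewhere.)
uint8_t LayerFromField(uint8_t raw) {
  return raw & ((1u << kLayerFieldBits) - 1u);
}

// Number of layer values the field can express that have no array slot.
constexpr size_t OutOfRangeLayerCount() {
  return (size_t{1} << kLayerFieldBits) - kSupportedLayers;
}

// Bounds-checked store: refuses layer numbers the array cannot hold.
// The 16-bit picture ID is widened to 64 bits, so the upper 48 bits of
// the stored value are always zero.
bool StorePictureId(std::array<int64_t, kSupportedLayers>& last_ids,
                    uint8_t layer, uint16_t picture_id) {
  if (layer >= last_ids.size()) return false;
  last_ids[layer] = picture_id;
  return true;
}
```

OutOfRangeLayerCount() evaluates to 3, which matches the "up to three 64-bit integers" above, and the widening store shows why the written values can never look like a 64-bit pointer.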
CVE-2023-6387 is a bug in how the video timing extension is processed by Forward Error Correction (FEC). FEC copies incoming RTP packets, and then clears certain extensions when attempting to correct errors. This vulnerability occurs because extensions of the video timing type are not verified to be of the expected length before they are cleared. The code causing this bug is as follows:
The value of VideoSendTiming::kPacerExitDeltaOffset is 7, so this code writes six zeros from offset 7 to offset 13 from the start of the extension in the packet. However, there is no check that the extension data is more than 13 bytes long, or even that the packet has this number of bytes left. The result of this bug is that an attacker can write up to six zeros to the heap at an offset of up to seven bytes from a variable sized heap buffer. This bug is better than CVE-2023-6389 in some ways and worse in others. It is better in that the heap buffer that can be overflowed is variable size, which gives a lot more options of what can be overwritten by this bug on the heap. The offset also offers some flexibility on where the zeros are written, and the write does not have to be aligned, whereas CVE-2023-6389 requires 64-bit alignment. This bug is worse in that the value written has to be zero, and the size of the area that can be written is smaller (six bytes versus 24).
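As a rough model of the missing check, the sketch below computes how many of the six zeroed bytes fall outside an extension of a given length, and shows a bounds-checked clear. The offset of 7 and the six-byte span are taken from the text; the function names are hypothetical, and this is not the actual WebRTC code or its fix.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

constexpr size_t kPacerExitDeltaOffset = 7;  // value given in the text
constexpr size_t kZeroedBytes = 6;           // "six zeros" per the text

// How many of the zeroed bytes land beyond an extension that is only
// `extension_length` bytes long.
size_t BytesZeroedPastEnd(size_t extension_length) {
  const size_t end = kPacerExitDeltaOffset + kZeroedBytes;  // one past last
  if (extension_length >= end) return 0;
  const size_t first_outside =
      std::max(extension_length, kPacerExitDeltaOffset);
  return end - first_outside;
}

// Bounds-checked clear: only zeroes the fields if the extension is long
// enough to contain all of them.
bool ClearTimingFields(uint8_t* extension, size_t extension_length) {
  assert(extension != nullptr);
  if (extension_length < kPacerExitDeltaOffset + kZeroedBytes) return false;
  std::memset(extension + kPacerExitDeltaOffset, 0, kZeroedBytes);
  return true;
}
```

For a one-byte extension, BytesZeroedPastEnd returns 6: all six zeros land past the end of the data the packet actually supplied.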
Moving the Instruction Pointer
I started off by seeing if it was possible to use either of these bugs to move the instruction pointer. Modern Android uses jemalloc, a slab allocator which doesn’t use inline heap headers, so corrupting heap metadata was not an option. Instead, I compiled WebRTC for Android with symbols, and loaded it in IDA. I then went through the available object types to see if there was anything that could obviously be used to move the instruction pointer or improve the capabilities of the bug. I didn’t find anything.
I thought maybe I could use CVE-2023-6389 to overwrite a length and cause a larger overflow, but this had some problems. To start, the bug writes a 64-bit integer, while a lot of length fields are 32-bit integers. This means the write also overwrites something else, and can only write a non-zero value if the length is 64-bit aligned. The location of the bug in processing is also problematic: the overwrite happens near the end of processing of the incoming packet, so many objects are not accessed again after this point, and any overwritten memory in them would never be used. CVE-2023-6389 also overflows a heap buffer of fixed size 80, which limits the object types that can be affected by this bug. I didn’t think CVE-2023-6387 would be viable for this purpose either, as it can only write zeros, which can only make a length smaller.
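The alignment problem can be illustrated with a small model of a 64-bit store landing on two adjacent 32-bit fields. The struct and function are hypothetical, and the result shown assumes a little-endian machine, as on typical Android devices.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Two adjacent 32-bit fields, as might appear in some heap object.
// (Hypothetical layout, for illustration only.)
struct TwoFields {
  uint32_t length;
  uint32_t neighbor;
};
static_assert(sizeof(TwoFields) == 8, "fields are expected to be packed");

// Models a 64-bit store of a 16-bit value over the start of the struct.
// On a little-endian machine the low half lands in `length` and the
// zeroed high half lands in `neighbor`.
TwoFields AfterWideStore(TwoFields before, uint16_t value) {
  const uint64_t wide = value;
  std::memcpy(&before, &wide, sizeof(wide));
  return before;
}
```

If the length field sat in the second half of the 8-byte slot instead, it would receive only the zeroed upper bits, which is why a non-zero value can only be written to a 64-bit-aligned length.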
I wasn’t sure where to go at this point, so I triggered CVE-2023-6389 a few dozen times on Android to see if there were any crashes at an address wider than 16 bits, hoping they might give me ideas of ways that this bug could influence the behavior of the code other than overwriting a pointer with an invalid 16-bit value. To my surprise, about one in 20 times it crashed with the instruction pointer set to a value that had clearly been read off the heap.
Analyzing the crash, it turned out that a StunMessage object was being allocated after the overflowed region. The members of the StunMessage class are as follows.
So after the vtable, the first member is a vector. How are vectors laid out in memory? It turns out its first two members are as follows.
These pointers point to the beginning and the end of the vector’s contents in memory. During the crash, the __end_ member was overwritten with a small 16-bit integer. Vector iteration works by starting at the __begin_ pointer and incrementing until the __end_ pointer is reached, so this change means that the next time the vector is iterated over, usually in the destructor, it will go out of bounds. Since this vector contains virtual objects of type StunAttribute, it will perform a virtual call to each element, to call its destructor. This virtual call on out-of-bounds memory was what was moving the instruction pointer.
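The effect of a corrupted end pointer can be modeled with plain integers, following the description of iteration above. This is a simplified sketch rather than the standard library’s implementation: VectorHeader and StepsBeforeStop are hypothetical names, addresses are represented as 64-bit integers so nothing is dereferenced, and 8-byte elements (64-bit pointers) are assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr uint64_t kElementSize = 8;  // assumption: 64-bit pointers

// Integer model of the first two vector members described above.
struct VectorHeader {
  uint64_t begin;
  uint64_t end;
};

// Counts how many elements a begin-to-end walk would visit, giving up
// after `max_steps` so that a corrupted header cannot loop forever.
size_t StepsBeforeStop(VectorHeader v, size_t max_steps) {
  assert(max_steps > 0);
  size_t steps = 0;
  for (uint64_t p = v.begin; p != v.end && steps < max_steps;
       p += kElementSize) {
    ++steps;  // a real loop would make a virtual call on *p here
  }
  return steps;
}
```

With an intact header the walk stops after the real element count. Once the end value is replaced by a small 16-bit number, the walk never reaches it, so it keeps treating whatever follows the buffer as element pointers.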
This seemed like a reasonable way to control the instruction pointer, except for one problem: in a typical configuration, it is not possible for an attacker at one end of a WebRTC connection to send STUN to the user at the other. Instead, each peer communicates with its own STUN server. I asked Philipp Hancke of webrtchacks if he knew of a way. He suggested this method, which involves specifying a TCP server controlled by the attacker as a potential routable path between two peers, called an ICE candidate. Both the attacker and target device will then communicate through this server, including STUN messages.
This allowed me to send STUN messages with an unusually large number of attributes. This was necessary because in order to control the instruction pointer, I would need to be able to control what showed up in memory after the STUN attribute vector. jemalloc groups allocations of similar size into predefined size classes, and serves each size class from contiguous memory runs. The less used a size class is, the more likely it is that two objects of the same size class will be allocated one after the other.
Typically, STUN messages have a small number of attributes, which translates to a vector buffer size of 32 or 64 bytes, which are both very frequently used size classes. Instead, I sent STUN messages with 128 attributes, which translated to a vector buffer size of 1024 bytes, which happens to be an infrequently used size class in WebRTC. By sending many STUN messages with this number of attributes, while at the same time sending RTP packets of size 1024 containing the desired pointer value, interspersed with packets containing the bug, I was able to get a virtual call on that pointer value about one in five times. This was good enough for use in an exploit, and I decided to move on to breaking ASLR.
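The sizes in this section can be checked with a short calculation. The sketch below assumes 8-byte element pointers and uses an approximation of jemalloc’s documented size-class spacing (16-byte steps up to 128 bytes, then four classes per doubling); the exact classes depend on the allocator version and build, and both function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

constexpr size_t kPointerSize = 8;  // assumption: 64-bit target

// Bytes needed for a vector buffer holding `attribute_count` pointers.
size_t VectorBufferBytes(size_t attribute_count) {
  return attribute_count * kPointerSize;
}

// Approximate size class for a request of `n` bytes. This mirrors the
// commonly documented spacing and is not read from any real allocator.
size_t ApproxSizeClass(size_t n) {
  assert(n > 0);
  if (n <= 8) return 8;
  if (n <= 128) return (n + 15) / 16 * 16;
  size_t group_top = 256;  // top of the doubling group that contains n
  while (group_top < n) group_top *= 2;
  const size_t spacing = group_top / 8;  // four classes per group
  return (n + spacing - 1) / spacing * spacing;
}
```

Under these assumptions, 128 attributes give a 1024-byte buffer that falls exactly in the 1024 class, while vectors of 4 or 8 attributes fall in the 32-byte and 64-byte classes mentioned above.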
Breaking ASLR
There were two possible approaches for breaking ASLR in this exploit. One was to use one of the above bugs to read memory and send it back to the attacker device or TCP server somehow, the other was to use some sort of crash oracle to determine the memory layout.
I started off by seeing whether it was possible to use one of the bugs to read memory remotely from the target device. Mark Brand suggested that it might be possible to use CVE-2023-6387 to accomplish this by setting the low bytes of a pointer to outgoing data to zero, causing out-of-bounds data to be sent instead of the actual data. This seemed like a promising approach, so I used IDA to look for potential objects.
It turned out there were quite a few, and they all had problems. I spent some time on SendPacketMessageData and DataReceivedMessageData. These objects are used to store pointers to outgoing RTP data while it is queued. They contain a CopyOnWriteBuffer object, and its first member is a ref-counted pointer to an rtc::Buffer object. It was possible to set the bottom bytes of this pointer to be zero using CVE-2023-6387. Unfortunately, the structure of rtc::Buffer made revealing memory this way challenging.
I was hoping that it would be possible to make the clipped pointer to this structure point to some other object on the heap that had a pointer in the location of the data_ pointer, so that its data would get sent instead. However, it turned out that in the process of sending data, all four members of the object above get accessed and need to be reasonably valid. I went through all the available objects in the same size class as the rtc::Buffer class, but couldn’t find one with these exact properties.
I then considered that instead of using a different object, I could use an rtc::Buffer object that had already been freed, with a specific backing buffer size that could be replaced with an object containing pointers using heap manipulation. This didn’t work out either. This was largely an issue of reliability. To start off, an rtc::Buffer object is 36 bytes, which translates to size class 48 in jemalloc, meaning 48 bytes get allocated. Imagining some contiguous allocations of this type, the addresses would be as follows.
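For illustration, the slot addresses and the effect of zeroing a pointer’s lowest byte can be modeled as below. This is a sketch under stated assumptions: the run base passed in is hypothetical and is assumed to end in a zero byte, the 48-byte slot size comes from the text, and all function names are made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr uint64_t kSlotSize = 48;  // size class from the text

// Addresses of `count` contiguous slots starting at `run_base`.
std::vector<uint64_t> SlotAddresses(uint64_t run_base, size_t count) {
  std::vector<uint64_t> addresses;
  for (size_t i = 0; i < count; ++i) {
    addresses.push_back(run_base + kSlotSize * i);
  }
  return addresses;
}

// Effect of zeroing the least significant byte of a pointer.
uint64_t ClearLowByte(uint64_t address) { return address & ~uint64_t{0xFF}; }

// True if the clipped pointer still lands on the start of some slot.
bool LandsOnSlotStart(uint64_t run_base, uint64_t address) {
  assert(address >= run_base);
  const uint64_t clipped = ClearLowByte(address);
  return clipped >= run_base && (clipped - run_base) % kSlotSize == 0;
}

// Number of slots whose clipped pointer is still a valid slot start.
size_t CountValidAfterClip(uint64_t run_base, size_t count) {
  size_t valid = 0;
  for (uint64_t address : SlotAddresses(run_base, count)) {
    if (LandsOnSlotStart(run_base, address)) ++valid;
  }
  return valid;
}
```

Over the first 16 slots (768 bytes, the point at which the 48-byte and 256-byte patterns realign), only slots 0 through 5 clip to a valid slot start, since 256 and 512 are not multiples of 48.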
If the lowest byte of a pointer to any of buffers 0 through 5 is set to zero by the vulnerability, it will land on a valid buffer, but if the lowest byte of a pointer to buffer 6 is set to zero, it will not, because 256 is not a multiple of 48. The end result is that every time the bug hits the SendPacketMessageData object, there is only roughly a one in three chance it will end up pointing to a valid rtc::Buffer. Hitting the object in the first place is also unreliable, because there are many other allocations of a similar size being made by WebRTC. It’s possible to increase the number of these objects on the heap, and the amount of time before they are sent, by using the TCP server to make the connection very slow, but even then I could only hit the structure less than 10% of the time. Having to manipulate the heap so that there are many freed rtc::Buffer objects in a row in the first place, with the backing buffer replaced by something containing pointers, added even more unreliability.
I eventually abandoned this approach because I didn’t think I could make it reliable enough to use in an exploit with a reasonable amount of effort, though I think it’s probably possible. The crash behavior of the application being attacked also matters a lot. This would probably work on an application that respawns immediately in the case of a crash, but would be a lot less practical on an application that stops respawning unless there is a certain delay, which is common on Android.
I also looked a lot at how outgoing packets are generated by WebRTC, especially RTP Control Protocol (RTCP) packets, which a peer always sends, even if it is just receiving audio or video. However, most outgoing packets are generated on the stack, so it is not possible to alter them using heap corruption bugs.
I also considered using a crash oracle to break ASLR, but I felt it was unlikely to succeed with these specific bugs. To start, hitting a heap allocation with them is unreliable, so it would be difficult to tell whether a crash had occurred due to a specific condition, or just because the bug had failed. I was also unsure whether it would even be possible to create detectable conditions considering the limited capabilities of these bugs.
I also thought about using CVE-2023-6387 to alter a vtable or a function pointer in order to read memory, cause behavior detectable by a crash oracle or perform offset-based exploitation that doesn’t require ASLR to be broken. I decided not to pursue this path, because the end result would depend on which functions and vtables are loaded at locations ending in zero, which varies greatly between builds. An exploit written using this method would require a large amount of modification to work on even slightly different versions of WebRTC, and there is no guarantee it would work at all.
I decided at this point that I needed to look for new bugs that could break ASLR, as neither of the ones I’d found recently could do it easily.
Stay tuned for Part 2: A Better Bug, which is scheduled for Wednesday, August 5.