I wrote a little C program that is vulnerable to a buffer overflow. Everything is working as expected, but I have now run into a small problem:
I want to call a function that lies at address 0x00007ffff7a79450, and since I am passing the arguments for the buffer overflow through the bash terminal (like this:
./a "$(python -c 'print "aaaaaaaaaaaaaaaaaaaaaa\x50\x94\xA7\xF7\xFF\x7F\x00\x00"')" )
I get a warning that bash is ignoring the null bytes:
/bin/bash: warning: command substitution: ignored null byte in input
As a result I end up with the wrong address in memory (0x7ffff7a79450 instead of 0x00007ffff7a79450).
Now my question is: How can I produce the leading 0's and give them as an argument to my program?
I'll go out on a limb and assert that what you want to do is not possible in a POSIX environment, because of the way arguments are passed.
Programs are run using the execve system call.
int execve(const char *filename, char *const argv[], char *const envp[]);
There are a few other functions, but all of them ultimately wrap execve or an extended system call with the following property:
Program arguments are passed using an array of NUL-terminated strings.
That means that when the kernel takes your arguments and sets them aside for the new program to use, it reads each one only up to the first NUL character and discards anything that follows.
So there is no way to make your example work if it has to include NUL characters. This is why I suggested reading from stdin instead, which has no such limitation:
char buf[256];
/* deliberately reads more than sizeof(buf) so input from stdin can overflow buf */
read(STDIN_FILENO, buf, 2*sizeof(buf));
You would normally need to check the return value of read, but for a toy problem this should be enough to trigger your exploit. Just pipe your malicious input into your program.
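To see the difference concretely, here is a small Python sketch (./a is the binary from the question; the payload bytes are placeholders): argv entries become NUL-terminated C strings and so reject embedded null bytes, while stdin is a plain byte stream.
import subprocess

payload = b"a" * 22 + b"\x50\x94\xa7\xf7\xff\x7f\x00\x00"

# argv entries are turned into NUL-terminated C strings, so the embedded
# null bytes are rejected before the program even starts
try:
    subprocess.run([b"./a", payload])
except ValueError as exc:
    print("argv rejected the payload:", exc)   # "embedded null byte"

# stdin has no terminator semantics, so the same bytes go through untouched
subprocess.run([b"./a"], input=payload)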
Related
I have a simple C program that copies an argument (argv) with strcpy into a buffer (a char array), and I am not allowed to modify it!
I want to write a Python script to input a list of bytes (let's say \x00\x01\x04\x70) as one of the executable's arguments so it gets copied into the buffer.
The way I'm doing it right now is just calling the system method and starting the program with the proper argument.
os.system('./program ' + '\xaa\xaa\xaa\xaa')
My problem is that I want the bytes to start with a null byte, which Python complains about, and I cannot find a way to get past this problem.
In the end, after the Python script runs with the following bytes: (00, 01, 04, 70),
the buffer should look like this: [\x00, \x01, \x04, \x70]
Edit:
Or is there any way to create some sort of pipeline to inject null bytes into the arguments?
Edit:
[Workaround] for my specific task
I found out that most of the data had a null byte as the first character. Since the system is little-endian, I added another script to the workflow that reverses the bytes and drops the trailing \x00, as the memory locations in the buffer can be considered 0 by default.
I will still not mark the question as answered, as maybe someone will find another workaround for this problem :)
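A minimal Python sketch of that workaround (the ./program name and the example bytes are hypothetical):
import subprocess

# reverse the bytes for a little-endian target and drop the trailing null;
# the untouched buffer positions already hold 0 by default
payload = bytes([0x00, 0x01, 0x04, 0x70])
argument = payload[::-1].rstrip(b"\x00")   # b'\x70\x04\x01'

subprocess.run([b"./program", argument])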
Story would be: I was using a piece of hardware that can be controlled automatically by an Objective-C framework. It is already used by many colleagues, so I treat it as a "fixed" library. I would like to use it from Python, and with PyObjC I can already connect to the device, but I have failed to send data to it.
The Objective-C method in the header looks like this:
(BOOL) executeabcCommand:(NSString*)commandabc
withArgs:(uint32_t)args
withData:(uint8_t*)data
writeLength:(NSUInteger)writeLength
readLength:(NSUInteger)readLength
timeoutMilliseconds:(NSUInteger)timeoutMilliseconds
error:(NSError **) error;
and from my Python code, data is an argument that can contain 256 bytes of data such as 0x00, 0x01, 0xFF. My Python code looks like this:
senddata=Device.alloc().initWithCommunicationInterface_(tcpInterface)
command = 'ABCw'
args= 0x00
writelength = 0x100
readlength = 0x100
data = '\x50\x40'
timeout = 500
success, error = senddata.executeabcCommand_withArgs_withData_writeLength_readLength_timeoutMilliseconds_error_(command, args, data, writelength, readlength, timeout, None)
Whatever I send into it, it always shows:
ValueError: depythonifying 'char', got 'str'
I tried to dig in a little bit, but failed to find anything about converting a string or a list to char with PyObjC.
Objective-C follows the rules that apply to C.
So in Objective-C, as in C, a uint8_t* is in memory exactly the same as a char*. A string differs only by the convention that the last character is \0, marking where the char* block that we call a string ends. So char* blocks end with \0 because, well, it's a string.
What do we do in C to find out the length of a character block?
We iterate over the whole block until we find \0, usually with a while loop, breaking out when we find it; the counter inside the loop then tells us the length if it was not given to us some other way.
It is up to you to interpret the data in the desired format.
Which is why it is sometimes easier to cast from void*, or to take a char* block that is then cast to and declared as uint8_t data inside the function that makes use of it. That's the nice part of C: you get to define that as you wish, so use the power that was given to you.
So to make your life easier, you could define a length parameter like so
-withData:(uint8_t*)data andLength:(uint64_t)len; to avoid parsing the character stream again, since you already know it is (or should be) 256 characters long. The only thing you want to avoid at all costs in C is reading at indices that are out of bounds, which throws a BAD_ACCESS exception.
But this basic information should enable you to find a way to declare your char* block of uint8_t data, addressed by the pointer to its first element, either with a specific length or running up to the first appearance of \0.
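A tiny Python illustration of that length question (purely illustrative, not part of the original answer): a bytes object carries its length explicitly, whereas a C-string-style scan stops at the first \0.
data = b'\x50\x40\x00\xff'
print(len(data))             # 4 -- Python stores the length with the object
print(data.index(b'\x00'))   # 2 -- where a strlen()-style scan would stop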
Sidenote:
objective-c #"someNSString" == pythons u"pythonstring"
PS: from your question it is not clear who throws that error message.
Python? Because it could not interpret the data when receiving?
Pyobjc? Because it is python syntax hell when you mix with objc?
The objc runtime? Because it follows the strict rules of C as well?
Python has always been very forgiving about shoe-horning one type into another, but Python 3 uses Unicode strings by default, which need to be converted into byte strings before being plugged into PyObjC methods.
Try specifying the strings as bytes objects, e.g. b'this'.
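A minimal sketch against the question's method (the method, argument order, and values come from the question above; only the data buffer is changed to bytes):
command = 'ABCw'             # NSString* maps fine from a Python str
data = b'\x50\x40'           # uint8_t* buffer passed as bytes; embedded \x00 would be fine too
success, error = senddata.executeabcCommand_withArgs_withData_writeLength_readLength_timeoutMilliseconds_error_(
    command, 0x00, data, 0x100, 0x100, 500, None)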
I was hitting the same error trying to use IOKit:
import objc
from Foundation import NSBundle
IOKit = NSBundle.bundleWithIdentifier_('com.apple.framework.IOKit')
functions = [("IOServiceGetMatchingService", b"II#"), ("IOServiceMatching", b"#*"),]
objc.loadBundleFunctions(IOKit, globals(), functions)
The problem arose when I tried to call the function like so:
IOServiceMatching('AppleSmartBattery')
Receiving
Traceback (most recent call last):
File "<pyshell#53>", line 1, in <module>
IOServiceMatching('AppleSmartBattery')
ValueError: depythonifying 'charptr', got 'str'
While as a byte object I get:
IOServiceMatching(b'AppleSmartBattery')
{
IOProviderClass = AppleSmartBattery;
}
I have a python program like this:
raw_data = sys.stdin.buffer.read(nbytes) # Read from standard input stream
# Do something with raw_data to get output_data HERE...
output_mask = output_data.tostring() # Convert to bytes
sys.stdout.buffer.write(b'results'+output_mask) # Write to standard output stream
Then I build my_py.exe from this Python program using PyInstaller. I tested my_py.exe using subprocess.run() in Python, and it works fine.
However, I need to call this my_py.exe from IDL. IDL has this tutorial on how to use its SPAWN command with pipes. So my IDL program that calls my_py.exe looks like this:
SPAWN['my_py.exe', arg], COUNT=COUNT , UNIT=UNIT
WRITEU, UNIT, nbytes, data_to_stream
READU, UNIT, output_from_exe
Unfortunately, the IDL program above hangs at READU. Does anyone know what the issue is here? Is the problem in my Python read and write?
You are missing a comma in the SPAWN command, although I imagine if that typo was in your code, IDL would issue a syntax error before you ever got to READU. But, if for some reason IDL is quietly continuing execution with an erroneous SPAWN call, maybe READU is hanging because it's trying to read some nonsense logical unit. Anyway, it should read:
SPAWN,['my_py.exe', arg], UNIT=UNIT
Here's the full syntax for reference:
SPAWN [, Command [, Result] [, ErrResult] ]
Keywords (all platforms): [, COUNT=variable] [, EXIT_STATUS=variable] [ ,/NOSHELL] [, /NULL_STDIN] [, PID=variable] [, /STDERR] [, UNIT=variable {Command required, Result and ErrResult not allowed}]
UNIX-Only Keywords: [, /NOTTYRESET] [, /SH]
Windows-Only Keywords: [, /HIDE] [, /LOG_OUTPUT] [, /NOWAIT]
I've eliminated the COUNT keyword, because, according to the documentation, COUNT contains the number of lines in Result, if Result is present, which it is not. In fact, Result is not even allowed here, since you're using the UNIT keyword. I doubt that passing the COUNT keyword is causing READU to hang, but it's unnecessary.
Also, check this note from the documentation
to make sure that the array you are passing as a command is correct:
If Command is present, it must be specified as follows:
On UNIX, Command is expected to be scalar unless used in conjunction with the NOSHELL keyword, in which case Command is expected to be a string array where each element is passed to the child process as a separate argument.
On Windows, Command can be a scalar string or string array. If it is a string array, SPAWN glues together each element of the string array, with each element separated by whitespace.
I don't know the details of your code, but here's some further wild speculation:
You might try setting the NOSHELL keyword, just as a shot in the dark.
I have occasionally had problems with IDL not seeming to finish writing to disk when I haven't closed the file unit, so make sure that you are using FREE_LUN, UNIT after READU. I know you said it hangs at READU, but my thinking here is that maybe it's only appearing to hang, and just can't continue until the file unit is closed.
Finally, here's something that could actually be the problem, and is worth looking into (from the tutorial you linked to):
A pipe is simply a buffer maintained by the operating system with an interface that makes it appear as a file to the programs using it. It has a fixed length and can therefore become completely filled. When this happens, the operating system puts the process that is filling the pipe to sleep until the process at the other end consumes the buffered data. The use of a bidirectional pipe can lead to deadlock situations in which both processes are waiting for the other. This can happen if the parent and child processes do not synchronize their reading and writing activities.
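One thing worth checking on the Python side (my own suggestion, not something that can be confirmed from the question alone) is that the child flushes its output, so the parent's READU is not left waiting on bytes stuck in the child's stdio buffer. A minimal sketch of that pattern, with process() standing in for the real work and nbytes assumed to be agreed with the IDL side:
import sys

def process(raw):              # placeholder for the real processing in the question
    return raw

nbytes = 1024                  # assumed message size agreed with the IDL side
raw_data = sys.stdin.buffer.read(nbytes)
sys.stdout.buffer.write(b'results' + process(raw_data))
sys.stdout.buffer.flush()      # flush so READU on the IDL side sees the reply immediately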
I've found that
input('some\x00 text')
will prompt for some instead of some text.
From the sources, I've figured out that this function uses the C function PyOS_Readline, which ignores everything in the prompt after a NUL byte.
From PyOS_StdioReadline(FILE *sys_stdin, FILE *sys_stdout, const char *prompt):
fprintf(stderr, "%s", prompt);
https://github.com/python/cpython/blob/3.6/Python/bltinmodule.c#L1989
https://github.com/python/cpython/blob/3.6/Parser/myreadline.c#L251
Is this a bug or there is a reason for that?
Issue: http://bugs.python.org/issue30431
The function signature pretty much requires a NUL-terminated C string, PyOS_StdioReadline(FILE *sys_stdin, FILE *sys_stdout, const char *prompt), so there isn't much that can be done about this without changing the API and breaking interoperability with GNU readline.
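If the NUL really needs to reach the terminal, one workaround (my own suggestion, not from the issue discussion) is to write the prompt yourself and call input() without an argument, bypassing the C-string prompt entirely:
import sys

sys.stdout.write('some\x00 text')   # emit the full prompt, NUL included
sys.stdout.flush()
reply = input()                     # read the line with no C-level prompt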
I am trying to understand two different behaviors of an overflow from a C program (call it vulnerable_prog) in Linux that asks for input, in order to allow you to overflow a buffer. I understand that the compiler lays out the stack frame in particular ways, causing a bit of unpredictability sometimes. What I can't understand is the difference in the way the memory is handled when I overflow the buffer using a Python script to feed 20 characters to the program, as opposed to running vulnerable_prog manually and inputting the 20 characters by hand.
The example program declares an array of "char name[20]", and the goal is to overflow it and write a specific value into the other variable that will be overwritten. (This is from a classic wargaming site).
I understand that the processor (64-bit) reads 8 bytes at a time, so arrays whose sizes are not multiples of 8 get padded to keep memory organized. Therefore my char[20] actually occupies 24 bytes of memory, accessible to the processor as 8-byte words.
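One way to see that kind of padding from Python is with ctypes (a hypothetical layout mirroring the description above; the real stack layout is entirely up to the compiler):
import ctypes

class Frame(ctypes.Structure):
    _fields_ = [("name", ctypes.c_char * 20),   # the char name[20] buffer
                ("value", ctypes.c_uint64)]     # the variable placed after it

print(Frame.value.offset)    # 24: name is padded out to an 8-byte boundary
print(ctypes.sizeof(Frame))  # 32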
The unexpected behavior is this:
When using a python script, the overflow behaves as follows:
$python -c'print "A"*20 + "\xre\xhe\xyt\xhe"' | /path/vulnerable_prog
The 20 characters overflow the buffer, and the expected value is written into the correct spot in memory.
HOWEVER, when you try to overflow the buffer by running the program from the command prompt and inputting 20 characters manually, followed by the required hex string to be written to memory, you must use one additional hex character in order to have your value end up in the correct place that you want it:
$ echo $'AAAAAAAAAAAAAAAAAAAA\xre\xhe\xyt\xhe\xaf'
(output of the 'echo' is then copied and pasted into the prompt that vulnerable_prog offers when run from the command line)
Where does this difference in the padding of the character array between the script and the command line exploitation come into play?
I have been doing a lot of research of C Structure padding and reading in the ISO/IEC 9899:201x, but cannot find anything that would explain this nuance.
(This is my first question on Stack Overflow so I apologize if I did not quite ask this correctly.)
Your Python script, when piped, actually sends 25 characters into /path/vulnerable_prog. The print statement adds a newline character. Here is your Python program plus a small Python script that counts the characters written to its standard input:
python -c'print "A"*20 + "\xre\xhe\xyt\xhe"' | python -c "import sys; print(len(sys.stdin.read()))"
I'm guessing you're not pasting the newline character that comes from echo into the program's prompt. Unfortunately, I don't think I have enough information to explain why you need 25, not 24, characters to achieve what you're attempting.
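If the trailing newline turns out to be the culprit, one way (my addition, not part of the original answer) to emit exactly 24 bytes from Python 2 is to use sys.stdout.write instead of print; the address bytes below are placeholders:
python -c 'import sys; sys.stdout.write("A"*20 + "\xde\xad\xbe\xef")' | /path/vulnerable_prog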
P.S. Welcome to Stack Overflow!