Python bug: null byte in input prompt


I've found that
input('some\x00 text')
will prompt for some instead of some text.
From the source, I've figured out that this function uses the C function PyOS_Readline, which ignores everything in the prompt after the NUL byte.
From PyOS_StdioReadline(FILE *sys_stdin, FILE *sys_stdout, const char *prompt):
fprintf(stderr, "%s", prompt);
https://github.com/python/cpython/blob/3.6/Python/bltinmodule.c#L1989
https://github.com/python/cpython/blob/3.6/Parser/myreadline.c#L251
Is this a bug, or is there a reason for it?
Issue: http://bugs.python.org/issue30431

The function signature pretty much requires a NUL-terminated C string, PyOS_StdioReadline(FILE *sys_stdin, FILE *sys_stdout, const char *prompt), so there isn't much that can be done about this without changing the API and breaking interoperability with GNU readline.
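To see what a const char * consumer makes of such a prompt, here is a minimal sketch using ctypes. It only illustrates NUL termination and is not how input() is implemented; the names buf and safe_prompt are made up for the illustration, and stripping the NUL characters before calling input() is just one possible workaround.

```python
import ctypes

prompt = "some\x00 text"

# A C buffer holding the encoded prompt; ctypes appends a trailing NUL.
buf = ctypes.create_string_buffer(prompt.encode("utf-8"))

# .value reads the buffer the way C string functions do: up to the first NUL.
as_c_string = buf.value
# .raw returns every byte that is really in the buffer.
full = buf.raw

# Possible workaround (an assumption, not part of the original answer):
# drop NUL characters before handing the prompt to input().
safe_prompt = prompt.replace("\x00", "")

print(as_c_string)
print(full)
print(safe_prompt)
```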

Related

depythonifying 'char', got 'str' for pyobjc

The story: I am using a piece of hardware that can be controlled automatically through an Objective-C framework. Many colleagues already use it, so I treat it as a "fixed" library. I would like to use it from Python; with pyobjc I can already connect to the device, but I fail to send data to it.
The Objective-C method in the header looks like this:
(BOOL) executeabcCommand:(NSString*)commandabc
withArgs:(uint32_t)args
withData:(uint8_t*)data
writeLength:(NSUInteger)writeLength
readLength:(NSUInteger)readLength
timeoutMilliseconds:(NSUInteger)timeoutMilliseconds
error:(NSError **) error;
In my Python code, data is an argument that can contain 256 bytes of data such as 0x00, 0x01, 0xFF. My Python code looks like this:
senddata=Device.alloc().initWithCommunicationInterface_(tcpInterface)
command = 'ABCw'
args= 0x00
writelength = 0x100
readlength = 0x100
data = '\x50\x40'
timeout = 500
success, error = senddata.executeabcCommand_withArgs_withData_writeLength_readLength_timeoutMilliseconds_error_(command, args, data, writelength, readlength, timeout, None)
Whatever I send, it always raises:
ValueError: depythonifying 'char', got 'str'
I tried to dig in a little, but could not find anything about converting a string or list to char with pyobjc.

Objective-C follows the rules that apply to C.
So in Objective-C, as in C, a uint8_t* is the very same thing in memory as a char*. A string differs only by convention: its last character is \0, which marks where the char* block that we call a string ends.
What do we do in C to find out the length of a character block?
We iterate over the block until we find \0, usually with a while loop that breaks when it reaches the terminator. The loop counter then gives you the length, if it was not passed to you some other way.
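As a rough illustration of that loop, here is a sketch in Python over a bytes object. The function name c_strlen is made up for this example. It also shows why binary data that may contain 0x00 cannot rely on a terminator and needs an explicit length.

```python
def c_strlen(block: bytes) -> int:
    """Count bytes up to (not including) the first 0x00, like C's strlen."""
    length = 0
    while length < len(block) and block[length] != 0:
        length += 1
    return length


text = b"ABCw\x00"
binary = b"\x50\x40\x00\xff"

print(c_strlen(text))    # the whole command is seen
print(c_strlen(binary))  # stops at the embedded 0x00, the 0xff is never reached
print(len(binary))       # the real payload length has to be passed separately
```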
It is up to you to interpret the data in the desired format.
Which is why it is sometimes easier to cast from void*, or to take a char* block that is then cast to and declared as uint8_t data inside the function that makes use of it. That is the nice part of C: you can define this as you wish, so use the power that was given to you.
So to make your life easier, you could define a length parameter like so
-withData:(uint8_t*)data andLength:(uint64_t)len; to avoid parsing the character stream again, since you already know it is (or should be) 256 characters long. The one thing you want to avoid at all costs in C is reading at indices that are out of bounds, which causes an EXC_BAD_ACCESS crash.
But this basic information should enable you to find a way to declare your char* block containing uint8_t data, addressed by the pointer (*) to its first uint8_t element, as a str with a specific length or up to the first appearance of \0.
Sidenote:
Objective-C @"someNSString" == Python's u"pythonstring"
PS: it is not clear from your question what throws that error message.
Python? Because it could not interpret the data when receiving?
Pyobjc? Because it is python syntax hell when you mix with objc?
The objc runtime? Because it follows the strict rules of C as well?

Python has always been very forgiving about shoe-horning one type into another, but Python 3 uses Unicode strings by default, which need to be converted into byte strings before being passed to pyobjc methods.

Try specifying the strings as bytes objects, e.g. b'this'.
I was hitting the same error trying to use IOKit:
import objc
from Foundation import NSBundle
IOKit = NSBundle.bundleWithIdentifier_('com.apple.framework.IOKit')
functions = [("IOServiceGetMatchingService", b"II@"), ("IOServiceMatching", b"@*"),]
objc.loadBundleFunctions(IOKit, globals(), functions)
The problem arose when I tried to call the function like so:
IOServiceMatching('AppleSmartBattery')
Receiving
Traceback (most recent call last):
File "<pyshell#53>", line 1, in <module>
IOServiceMatching('AppleSmartBattery')
ValueError: depythonifying 'charptr', got 'str'
While as a byte object I get:
IOServiceMatching(b'AppleSmartBattery')
{
IOProviderClass = AppleSmartBattery;
}
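Applied to the question's call, the same idea would mean building the data argument as bytes rather than str. The sketch below only constructs the payload; it does not import pyobjc or call the device, and whether the bridge accepts this buffer for the uint8_t* parameter is an assumption that would need checking against the framework's metadata. Padding to writelength with zero bytes is also an assumption made for the illustration.

```python
writelength = 0x100

# Raw bytes instead of the str '\x50\x40' used in the question.
data = bytes([0x50, 0x40])

# Assumption: the device expects exactly writelength bytes, so pad with 0x00.
payload = data.ljust(writelength, b"\x00")

# The command stays a str, since that parameter is an NSString*.
command = "ABCw"

print(type(payload).__name__, len(payload))
print(payload[:4])
```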

Python SQLite: enforce UTF-8 encoding

I'm developing a cross-platform Python (3.7+) application, and I need to rely on the sort order of TEXT columns in SQLite, meaning the comparison algorithm for TEXT values must be based on UTF-8 bytes, even if the system encoding (sys.getdefaultencoding()) is not utf-8.
But in the documentation of the sqlite3 module I can't find an encoding option for sqlite3.connect.
And I read that using sys.setdefaultencoding("utf-8") is an ugly hack and highly discouraged (that's why we need to reload(sys) before calling it).
So what's the solution?

Looking at Python's _sqlite/connection.c code, either sqlite3_open_v2 or sqlite3_open is called (depending on a compile flag). According to the SQLite documentation, both of them use UTF-8 as the default database encoding. I'm still not sure about the meaning of the word "default", since it doesn't mention any way to override it! But it doesn't look like Python can open a database with another encoding.
#ifdef SQLITE_OPEN_URI
    Py_BEGIN_ALLOW_THREADS
    rc = sqlite3_open_v2(database, &self->db,
                         SQLITE_OPEN_READWRITE | SQLITE_OPEN_CREATE |
                         (uri ? SQLITE_OPEN_URI : 0), NULL);
#else
    if (uri) {
        PyErr_SetString(pysqlite_NotSupportedError, "URIs not supported");
        return -1;
    }
    Py_BEGIN_ALLOW_THREADS
    rc = sqlite3_open(database, &self->db);
#endif
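One way to check this from Python is to ask SQLite for the encoding of a freshly created database and compare its sort order with a byte-wise sort of the UTF-8 encodings. This is a minimal sketch on an in-memory database; the table name t and the sample words are made up for the illustration, and it relies on the default BINARY collation, which compares TEXT values with memcmp.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Encoding SQLite chose for this new database.
encoding = conn.execute("PRAGMA encoding").fetchone()[0]

words = ["b", "a", "é", "Z", "€", "\U0001F600"]
conn.execute("CREATE TABLE t (s TEXT)")
conn.executemany("INSERT INTO t VALUES (?)", [(w,) for w in words])

# Order produced by SQLite with the default (BINARY) collation.
db_order = [row[0] for row in conn.execute("SELECT s FROM t ORDER BY s")]

# Order produced by comparing the UTF-8 bytes directly in Python.
byte_order = sorted(words, key=lambda w: w.encode("utf-8"))

conn.close()

print(encoding)
print(db_order == byte_order)
```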

Why do I have to press CTRL-D three times to quit a program during an input? [duplicate]

So basically I want to copy everything I write to stdin (including the newline character) into a string for hashing purposes. I managed to accomplish that and wrote a small program to show my problem.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#define BUFFERSIZE 10000
int main()
{
    char *myStr = calloc(1,1);
    char buffer[BUFFERSIZE];
    while( fgets(buffer, BUFFERSIZE , stdin) != NULL ){
        myStr = realloc(myStr, strlen(myStr)+1+strlen(buffer) );
        strcat( myStr, buffer );
    }
    printf("\n%s\n",myStr);
}
Everything works when I enter some text, press ENTER and then signal EOF.
But when I start the program, enter "a" and then try to signal EOF (using Ctrl Z + ⏎ in the Windows cmd prompt, Ctrl D on Linux), I have to do it three times for the program to actually break the loop. I was expecting a maximum of two times.
Can someone explain how EOF, stdin and fgets work together? Or should I use something else (for example getline)? I am sorry if I am not clear about my problem; just ask for anything you need.
Thank you.

First of all, ^Z or ^D are control characters that mean something to the terminal you are using, and sometimes that means for the terminal to signal end-of-file condition.
Anyway, your three keypresses are processed by the terminal to take the following actions, after entering text:
1. Flush the input (i.e. send the characters that have been input so far from the terminal to the program - by default this doesn't happen as the terminal uses line buffering)
2. Set end-of-file condition
3. Set end-of-file condition again
Inside your program that corresponds to:
1. Nothing happens: even though a is received, fgets keeps reading until end-of-file or newline
2. fgets completes because of end-of-file. However, it does not return NULL because characters were read, "a" to be specific.
3. A bug in old versions of glibc causes fgets to try to read again, even though it previously reached end-of-file. fgets completes because of end-of-file, and returns NULL because there were no characters read.
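The distinction between "a read that returned data" and "a read that returned nothing" can be sketched in Python with a pipe standing in for the terminal. This is only an analogy under that assumption: it does not reproduce terminal line buffering or the glibc behaviour, it just shows that end-of-file is reported by a separate, empty read after the last data has been delivered.

```python
import os

read_fd, write_fd = os.pipe()

# The "user" types a single character without a newline, then the input ends.
os.write(write_fd, b"a")
os.close(write_fd)

# First read: data is available, so this is not end-of-file yet.
first = os.read(read_fd, 100)
# Second read: nothing left and the writer is gone, so a zero-length
# result signals end-of-file.
second = os.read(read_fd, 100)
os.close(read_fd)

print(first)
print(second)
```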

Using NULL bytes in bash (for buffer overflow)

I wrote a little C program that is vulnerable to a buffer overflow. Everything is working as expected, but I have now come across a small problem:
I want to call a function located at address 0x00007ffff7a79450, and I am passing the arguments for the buffer overflow through the bash terminal, like this:
./a "$(python -c 'print "aaaaaaaaaaaaaaaaaaaaaa\x50\x94\xA7\xF7\xFF\x7F\x00\x00"')"
I get a warning that bash is ignoring the null bytes:
/bin/bash: warning: command substitution: ignored null byte in input
As a result I end up with the wrong address in memory (0x7ffff7a79450 instead of 0x00007ffff7a79450).
Now my question is: how can I produce the leading zeros and pass them as an argument to my program?

I'll take a bold move and assert what you want to do is not possible in a POSIX environment, because of the way arguments are passed.
Programs are run using the execve system call.
int execve(const char *filename, char *const argv[], char *const envp[]);
There are a few other functions but all of them wrap execve in the end or use an extended system call with the properties that follow:
Program arguments are passed using an array of NUL-terminated strings.
That means that when the kernel takes your arguments and puts them aside for the new program to use, it only reads them up to the first NUL character and discards anything that follows.
So there is no way to make your example work if it has to include NUL characters. This is why I suggested reading from stdin instead, which has no such limitation:
char buf[256];
read(STDIN_FILENO, buf, 2*sizeof(buf));
You would normally need to check the returned value of read. For a toy problem it should be enough for you to trigger your exploit. Just pipe your malicious input into your program.
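The difference between the two channels can be sketched from Python. This is a minimal illustration, not the original exploit: the one-line child script is a hypothetical stand-in for the vulnerable program and simply echoes back, in hex, what it received on stdin. The payload mirrors the bytes from the question.

```python
import struct
import subprocess
import sys

address = 0x00007ffff7a79450
# 22 filler bytes followed by the address as 8 little-endian bytes,
# which ends in two 0x00 bytes.
payload = b"a" * 22 + struct.pack("<Q", address)

# Hypothetical stand-in for the target program: hex-dump stdin to stdout.
child = "import sys; sys.stdout.write(sys.stdin.buffer.read().hex())"

# Passing the payload as an argument: Python refuses to build an argv
# entry that contains a NUL byte.
try:
    subprocess.run(
        [sys.executable, "-c", child, payload.decode("latin-1")],
        stdin=subprocess.DEVNULL,
        check=True,
    )
    argv_rejected = False
except ValueError:
    argv_rejected = True

# Passing the payload on stdin: every byte arrives, including the NULs.
result = subprocess.run(
    [sys.executable, "-c", child],
    input=payload,
    capture_output=True,
    check=True,
)
received = bytes.fromhex(result.stdout.decode("ascii"))

print(argv_rejected)
print(received == payload)
```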

Python extension for Upskirt: garbage at end of string

I've been trying to make a Python extension for Upskirt. I thought it would not be too hard for a first C project since there are examples (the example program in the Upskirt code and the Ruby extension).
The extension works, it converts the Markdown I throw at it, but sometimes the output has some garbage at the end of the string. And I don't know what causes it.
Here's some output:
python test.py
<module 'pantyshot' from '/home/frank/Code/pantyshot/virtenv/lib/python2.7/site-packages/pantyshot.so'>
<built-in function render>
'<p>This <strong>is</strong> <em>a</em> <code>test</code>. Test.</p>\n\x7f'
<p>This <strong>is</strong> <em>a</em> <code>test</code>. Test.</p>
--------------------------------------------------------------------------------
'<p>This <strong>is</strong> <em>a</em> <code>test</code>. Test.</p>\n\x7f'
<p>This <strong>is</strong> <em>a</em> <code>test</code>. Test.</p>
--------------------------------------------------------------------------------
My code can be found in my Github repo. I called it pantyshot, because I thought of that when I heard upskirt. Strange name, I know.
I hope someone can help me.

You are doing a strdup in pantyshot_render:
output_text = strdup(ob->data); /* ob is a "struct buf *" */
But I don't think ob->data is a nul-terminated C string. You'll find this inside upskirt/buffer.c:
/* bufnullterm • NUL-termination of the string array (making a C-string) */
void
bufnullterm(struct buf *buf) {
    if (!buf || !buf->unit) return;
    if (buf->size < buf->asize && buf->data[buf->size] == 0) return;
    if (bufgrow(buf, buf->size + 1))
        buf->data[buf->size] = 0;
}
So, you're probably running off the end of the buffer and getting lucky by hitting a '\0' before doing any damage. I think you're supposed to call bufnullterm(ob) before copying ob->data as a C string; or you could look at ob->size, use malloc and strncpy to copy it, and take care of the NUL terminator by hand (but make sure you allocate ob->size + 1 bytes for your copied string).
And if you want to get rid of the newline (i.e. the trailing \n), then you'll probably have to do some whitespace stripping by hand somewhere.
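The effect can be imitated from Python with ctypes. This is a sketch under made-up conditions: the buffer named memory plays the role of ob->data, with one stray byte after the real content and a NUL somewhere beyond it. Reading until the NUL picks up the garbage, while reading exactly size bytes does not.

```python
import ctypes

html = b"<p>Test.</p>\n"
size = len(html)  # plays the role of ob->size

# Real content, then a stray 0x7f byte; ctypes adds a NUL after it.
memory = ctypes.create_string_buffer(html + b"\x7f")

# Like strdup: keep reading until a NUL happens to turn up.
until_nul = ctypes.string_at(memory)
# Like copying with an explicit length: stop after exactly size bytes.
exact = ctypes.string_at(memory, size)

print(until_nul)
print(exact)
```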
