How to access Python standard stream in IDL Spawn command?

I have a python program like this:
import sys
raw_data = sys.stdin.buffer.read(nbytes)  # Read nbytes from the standard input stream
# Do something with raw_data to get output_data HERE...
output_mask = output_data.tobytes()  # Convert to bytes (tostring() is deprecated in favor of tobytes())
sys.stdout.buffer.write(b'results' + output_mask)  # Write to the standard output stream
I then built my_py.exe from this Python program using PyInstaller. I tested my_py.exe using subprocess.run() in Python, and it works fine.
However, I need to call this my_py.exe in IDL. IDL has this tutorial on how to use its SPAWN command with pipes. So my IDL program which calls the my_py.exe is like this:
SPAWN['my_py.exe', arg], COUNT=COUNT , UNIT=UNIT
WRITEU, UNIT, nbytes, data_to_stream
READU, UNIT, output_from_exe
Unfortunately, the IDL program above hangs at READU. Does anyone know what the issue is? Is the problem in my Python read and write?

You are missing a comma in the SPAWN command, although I imagine if that typo was in your code, IDL would issue a syntax error before you ever got to READU. But, if for some reason IDL is quietly continuing execution with an erroneous SPAWN call, maybe READU is hanging because it's trying to read some nonsense logical unit. Anyway, it should read:
SPAWN,['my_py.exe', arg], UNIT=UNIT
Here's the full syntax for reference:
SPAWN [, Command [, Result] [, ErrResult] ]
Keywords (all platforms): [, COUNT=variable] [, EXIT_STATUS=variable] [ ,/NOSHELL] [, /NULL_STDIN] [, PID=variable] [, /STDERR] [, UNIT=variable {Command required, Result and ErrResult not allowed}]
UNIX-Only Keywords: [, /NOTTYRESET] [, /SH]
Windows-Only Keywords: [, /HIDE] [, /LOG_OUTPUT] [, /NOWAIT]
I've eliminated the COUNT keyword, because, according to the documentation, COUNT contains the number of lines in Result, if Result is present, which it is not. In fact, Result is not even allowed here, since you're using the UNIT keyword. I doubt that passing the COUNT keyword is causing READU to hang, but it's unnecessary.
Also, check this note from the documentation to make sure that the array you are passing as a command is correct:
If Command is present, it must be specified as follows:
On UNIX, Command is expected to be scalar unless used in conjunction with the NOSHELL keyword, in which case Command is expected to be a string array where each element is passed to the child process as a separate argument.
On Windows, Command can be a scalar string or string array. If it is a string array, SPAWN glues together each element of the string array, with each element separated by whitespace.
I don't know the details of your code, but here's some further wild speculation:
You might try setting the NOSHELL keyword, just as a shot in the dark.
I have occasionally had problems with IDL not seeming to finish writing to disk when I haven't closed the file unit, so make sure that you are using FREE_LUN, UNIT after READU. I know you said it hangs at READU, but my thinking here is that maybe it's only appearing to hang, and just can't continue until the file unit is closed.
Finally, here's something that could actually be the problem, and is worth looking into (from the tutorial you linked to):
A pipe is simply a buffer maintained by the operating system with an interface that makes it appear as a file to the programs using it. It has a fixed length and can therefore become completely filled. When this happens, the operating system puts the process that is filling the pipe to sleep until the process at the other end consumes the buffered data. The use of a bidirectional pipe can lead to deadlock situations in which both processes are waiting for the other. This can happen if the parent and child processes do not synchronize their reading and writing activities.
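To make the synchronization point concrete, here is a minimal Python-only sketch of a parent and child talking over a bidirectional pipe. It is not the IDL code from the question: the 4-byte length header, the reversed-payload reply, and the names CHILD and round_trip are assumptions made up for illustration. The parent uses Popen.communicate(), which writes the input, closes the child's stdin, and reads stdout concurrently, so neither side ends up waiting on a full or empty pipe.

```python
import subprocess
import sys

# Hypothetical child: reads a 4-byte little-endian length, then exactly that
# many payload bytes, and answers with b"results" plus the reversed payload.
CHILD = r"""
import sys
n = int.from_bytes(sys.stdin.buffer.read(4), "little")
data = sys.stdin.buffer.read(n)
sys.stdout.buffer.write(b"results" + data[::-1])
sys.stdout.buffer.flush()  # push the reply into the pipe before exiting
"""


def round_trip(payload: bytes) -> bytes:
    """Send payload to the child and return everything it writes back."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
    )
    message = len(payload).to_bytes(4, "little") + payload
    # communicate() feeds stdin and drains stdout at the same time,
    # then waits for the child to exit.
    out, _ = proc.communicate(message)
    return out


if __name__ == "__main__":
    print(round_trip(b"abc"))
```

The same discipline applies whatever the parent language is: both sides have to agree on exactly how many bytes are written and read in each direction, and buffered output has to be flushed before the other side is expected to read it.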

Related

python subprocess multiple commands with win path [duplicate]

This question already has answers here: How do you activate an Anaconda environment within a Python Script? (5 answers)
I'm trying to trigger the execution of a python script via conda.
I would then capture the output and report it to command prompt where this is executed.
This is basically the concept in the easiest way
wrap.py - wrapper intended to execute the following script multiple times
import subprocess
def wrap():
    while True:
        cmd1=r"call C:\\Users\\my_user\\anaconda3\\Scripts\\activate.bat"
        cmd2=r"cd C:\\myfolder\\mysubfolder"
        cmd3=r"C:\\Users\\my_user\\anaconda3\\python.exe C:\\myfolder\\mysubfolder\\test.py"
        proc = subprocess.run([cmd1,cmd2,cmd3])
if __name__ == '__main__':
    wrap()
test.py - script that has to be executed
def mytest():
    print("success")
if __name__ == '__main__':
    mytest()
Since mytest prints success once per run, I would like the output of the wrapper (run from an Anaconda prompt) to be
(base) C:\myfolder\mysubfolder> python wrap.py
success
success
success
...
I tried:
1 - subprocess.Popen
2 - using shell=True or not
3 - using a list ["first command","second command","third command"] or a single string "first;second;third"
4 - using or removing "r" in front of the string; here the blanks are breaking the game
5 - using single or double quotes
6 - in my_user, the underscore is also resulting in an encoding error
I actually tried to replicate at least 20 different Stack Overflow "solutions", but none of them really worked for me. I also carefully read the subprocess page of the Python documentation, but this didn't help.
Any hint is appreciated, I'm lost.
The syntax subprocess.run([cmd1, cmd2, cmd3]) means run cmd1 with cmd2 and cmd3 as command-line arguments to cmd1. You instead want to execute a single sequence of shell commands; several of the things you are trying to do here require the shell, so you do want shell=True, which dictates the use of a single string as input, rather than a list consisting of a command and its arguments.
(Windows has some finicky processing behind the scenes which makes it not completely impossible to use a list of strings as the first argument with shell=True; but this really isn't portable or obvious. Just don't.)
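The list semantics described above can be checked with a small sketch. The helper name child_argv and the sample strings are made up for illustration; the child is just another Python interpreter that prints the arguments it received.

```python
import subprocess
import sys


def child_argv(extra_args):
    """Run a child Python that prints each argument it received on its own line."""
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print('\\n'.join(sys.argv[1:]))", *extra_args],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()


if __name__ == "__main__":
    # The second and third list items are not run as commands; they simply
    # arrive in the child as plain argument strings.
    print(child_argv(["cd somewhere", "python test.py"]))
```

Everything after the first list element reaches the first program as an argument, which is why passing three commands in one list does not run three commands.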
Regarding the requirement for shell=True here, commands like call and cd (and source or . in Bourne-family shells) are shell built-ins which do not exist as separate binaries; if you don't have shell=True you will simply get "command not found" or your local equivalent. (Under other circumstances, you should generally avoid shell=True when you can. But this is not one of those cases; here, it really is unavoidable without major code changes.)
If your shell is cmd I guess the command might look like
subprocess.run(
r"call C:\Users\my_user\anaconda3\Scripts\activate.bat & C:\Users\my_user\anaconda3\python.exe C:\myfolder\mysubfolder\test.py",
shell=True)
or equivalently the same without r before the string and with all backslashes doubled; the only difference between an r"..." string and a regular "..." string is how the former allows you to put in literal backslashes, whereas the latter requires you to escape them; in the former case, everything in the string is literal, whereas in the latter case, you can use symbolic notations like \n for a newline character, \t for tab, etc.
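A quick way to see the difference is to compare the three spellings directly. The path here is shortened from the one in the question, and the variable names are only for illustration.

```python
raw = r"C:\Users\my_user\anaconda3"              # raw string, single backslashes
escaped = "C:\\Users\\my_user\\anaconda3"        # regular string, escaped backslashes
doubled_raw = r"C:\\Users\\my_user\\anaconda3"   # raw string AND doubled backslashes

print(raw == escaped)        # True: both spell the same path
print(doubled_raw == raw)    # False: the doubled backslashes are kept literally
print(raw.count("\\"), doubled_raw.count("\\"))
```

Combining the r prefix with doubled backslashes, as in the question's wrap.py, therefore produces a string that really contains two backslashes at each separator.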
In Python, it doesn't really matter whether you use single or double quotes; you can switch between them freely, obviously as long as you use the same opening and closing quotes. If you need literal single quotes in the string, use double quotes so you don't have to backslash-escape the literal quote, and vice versa. There's also the triple-quoted string which accepts either quoting character, but is allowed to span multiple lines, i.e. contain literal newlines without quoting them.
If your preferred shell is sh or bash, the same syntax would look like
subprocess.run(r"""
source C:\Users\my_user\anaconda3\Scripts\activate.bat &&
C:\Users\my_user\anaconda3\python.exe C:\myfolder\mysubfolder\test.py""",
shell=True)
I left out the cd in both cases because nothing in your code seems to require the subprocess to run in a particular directory. If you do actually have that requirement, you can add cwd=r'C:\myfolder\mysubfolder' after shell=True to run the entire subprocess in a separate directory.
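As a runnable stand-in for the commands above, the following sketch chains two harmless echo commands with && (which both cmd and sh understand) and sets cwd. The function name run_chain and the temporary directory are assumptions for illustration; substitute the real activation and script commands in practice.

```python
import subprocess
import tempfile


def run_chain(workdir):
    """Run two chained shell commands in workdir and return the words they printed."""
    result = subprocess.run(
        "echo first && echo second",  # one string, interpreted by the shell
        shell=True,
        cwd=workdir,                  # replaces a separate "cd" command
        capture_output=True,
        text=True,
    )
    return result.stdout.split()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        print(run_chain(workdir))
```

Because capture_output=True is set, the text printed by the chained commands is available as result.stdout and can be reported by the wrapper.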
There are situations where the facilities of subprocess.run() are insufficient, and you need to drop down to bare subprocess.Popen() and do the surrounding plumbing yourself; but this emphatically is not one of those scenarios. You should stay far away from Popen() if you can, especially if your understanding of subprocesses is not very sophisticated.

Why does subprocess.run not read new lines yet subprocess.call does?

Why is it that calling an executable via subprocess.call gives different results to subprocess.run?
The output of the call method is perfect - all new lines removed, formatting of the document is exactly right, '-' characters, bullets and tables are handled perfectly.
Running exactly the same function with the run method however and reading the output from stdout completely throws the output. Full of '\n', 'Â\xad', '\x97', '\x8f' characters with spacing all over the place.
Here's the code I'm using:
Subprocess.CALL
result=subprocess.call(['/path_to_pdftotext','-layout','/path_to_file.pdf','-'])
Subprocess.RUN
result=subprocess.run(['/path_to_pdftotext','-layout','/path_to_file.pdf','-'],stdout=PIPE, stderr=PIPE, universal_newlines=True, encoding='utf-8')
I don't understand why the run method doesn't parse and display the file in the same way. I'd use call however I need to save the result of the pdftotext conversion to a variable (in the case of run: var = result.stdout).
I can go through and just identify all the unicode it's not picking up in run and strip it out but I figure there must just be some encoding / decoding settings that the run method changes.
EDIT
Having read a similarly worded question - I believe this is different in scope as I'm wanting to understand why the output is different.
I've made some tests.
Are you printing the content on the console? Try to send the text in a text file with subprocess in both cases and see if it is different:
result=subprocess.call(['/path_to_pdftotext','-layout','/path_to_file.pdf','test.txt'])
result=subprocess.run(['/path_to_pdftotext','-layout','/path_to_file.pdf','test2.txt'])
and compare test.txt and test2.txt. In my case they are identical.
I suspect that the difference you are experiencing is not strictly related to subprocess, but to how the console represents the output in both cases.
As said in the answer I linked in the comments, call():
It is equivalent to: run(...).returncode (except that the input and check parameters are not supported)
That is, your result stores an integer (the return code), and the child's output is printed directly to the console, which seems to show it with the correct encoding, newlines, etc.
With run(), the result is a CompletedProcess instance. The CompletedProcess.stdout attribute is:
Captured stdout from the child process. A bytes sequence, or a string if run() was called with an encoding or errors. None if stdout was not captured.
So, being a bytes sequence or a string, Python represents it differently when it is displayed on the console, showing escapes such as '\n', 'Â\xad', '\x97', '\x8f' and so on.
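A small sketch makes the difference visible without needing pdftotext. The child here is a stand-in (a Python one-liner that prints two lines), so the command and variable names are assumptions for illustration only.

```python
import subprocess
import sys

# Stand-in for the real executable: prints two lines to stdout.
CHILD = [sys.executable, "-c", "print('line one'); print('line two')"]

# call(): the child writes straight to the console; we only get an int back.
returncode = subprocess.call(CHILD)

# run() with stdout=PIPE: the output is captured into a str instead.
completed = subprocess.run(CHILD, stdout=subprocess.PIPE, text=True)

print(type(returncode).__name__, returncode)
print(repr(completed.stdout))     # repr shows the newline escapes
print(completed.stdout, end="")   # print() renders them as real line breaks
```

Inspecting the captured value in an interactive session, or printing the whole CompletedProcess object, shows the repr form, which is where the visible escape sequences come from.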

Using NULL bytes in bash (for buffer overflow)

I programmed a little C program that is vulnerable to a buffer overflow. Everything is working as expected, though I came across a little problem now:
I want to call a function which lies on address 0x00007ffff7a79450 and since I am passing the arguments for the buffer overflow through the bash terminal (like this:
./a "$(python -c 'print "aaaaaaaaaaaaaaaaaaaaaa\x50\x94\xA7\xF7\xFF\x7F\x00\x00"')" )
I get a warning that bash is ignoring the null bytes:
/bin/bash: warning: command substitution: ignored null byte in input
As a result I end up with the wrong address in memory (0x7ffff7a79450 instead of 0x00007ffff7a79450).
Now my question is: How can I produce the leading 0's and give them as an argument to my program?
I'll take a bold move and assert what you want to do is not possible in a POSIX environment, because of the way arguments are passed.
Programs are run using the execve system call.
int execve(const char *filename, char *const argv[], char *const envp[]);
There are a few other functions but all of them wrap execve in the end or use an extended system call with the properties that follow:
Program arguments are passed using an array of NUL-terminated strings.
That means that when the kernel takes your arguments and puts them aside for the new program to use, it only reads each one up to the first NUL character and discards anything that follows.
So there is no way to make your example work if it has to include nul characters. This is why I suggested reading from stdin instead, which has no such limitation:
char buf[256];
read(STDIN_FILENO, buf, 2*sizeof(buf));
You would normally need to check the returned value of read. For a toy problem it should be enough for you to trigger your exploit. Just pipe your malicious input into your program.
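The argument limitation and the stdin alternative can also be observed from Python, which refuses outright to build an argument containing a NUL. This is only a sketch: the helper names and the sample bytes are made up for illustration, and the child is a Python one-liner that reports how many bytes it read.

```python
import subprocess
import sys

SAMPLE = b"abc\x00\x00def"  # arbitrary bytes containing NULs


def try_argv():
    """Try to pass a NUL inside a command-line argument; return the error text."""
    try:
        subprocess.run([sys.executable, "-c", "pass", "abc\x00def"])
    except ValueError as exc:
        return str(exc)
    return None


def via_stdin(data):
    """Pipe data to a child on stdin and return how many bytes it received."""
    result = subprocess.run(
        [sys.executable, "-c",
         "import sys; sys.stdout.write(str(len(sys.stdin.buffer.read())))"],
        input=data,
        stdout=subprocess.PIPE,
    )
    return int(result.stdout)


if __name__ == "__main__":
    print(try_argv())
    print(via_stdin(SAMPLE), len(SAMPLE))
```

Standard input is an ordinary byte stream, so the NUL bytes arrive intact there.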

Is there any way to get the full command line that's executed when using subprocess.call?

I'm using subprocess.call, where you just give it an array of arguments and it will build the command line and execute it.
First of all is there any escaping involved? (for example if I pass as argument a path to a file that has spaces in it, /path/my file.txt will this be escaped? "/path/my file.txt")
And is there any way to get this command line that's generated (after escaping and all) before being executed?
As I need to check if the generated command line is not longer than certain amount of characters (to make sure it will not give an error when it gets executed).
If you're not using shell=True, there isn't really a "command line" involved. subprocess.Popen is just passing your argument list to the underlying execve() system call.
Similarly, there's no escaping, because there's no shell involved and hence nothing to interpret special characters and nothing that is going to attempt to tokenize your string.
There isn't a character limit to worry about because the arguments are never concatenated into a single command line. There may be limits on the maximum number of arguments and/or the length of individual arguments.
If you are using shell=True, you have to construct the command line yourself before passing it to subprocess.
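If a single string is needed anyway (for shell=True, for logging, or for a length check), the standard library can build one from the list. The sketch below uses shlex.join (Python 3.8+), which applies POSIX shell quoting; the argument list is a made-up example. subprocess.list2cmdline is shown as well because, on Windows, subprocess itself converts the list to a string with it, but it is not listed in the module's public API, so treat that part as an assumption about current behavior.

```python
import shlex
import subprocess

args = ["cat", "/path/my file.txt"]  # hypothetical command and path

# POSIX-style command line: the argument with a space gets quoted.
posix_line = shlex.join(args)
print(posix_line)
print(len(posix_line))

# Splitting the string again recovers the original list.
print(shlex.split(posix_line) == args)

# Windows-style quoting, as applied by subprocess on that platform.
windows_line = subprocess.list2cmdline(args)
print(windows_line)
```

With shell=False on POSIX none of this quoting is applied; the list elements are handed to the new process unchanged.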

How do I get the ORIGINAL command line in Python? with spaces, tabs, etc [duplicate]

Possible Duplicate: Full command line as it was typed
sys.argv is already a parsed array, losing double quotes, double spaces and maybe even tab characters (it all depends on the OS/shell, of course).
How can I access the original string before parsing?
In short, you can't.
Longer version: on Unix, the command line is parsed by the calling program, so by the time Python starts, the command line has already been parsed.
PS. On Windows it is possible, but I suppose you are looking for a general answer.
You can't do that directly, because this is how a shell passes the arguments to a program.
sys.argv is all Python gets. The shell has already processed filename generation (globs), parameter (variable) expansion, quotes, and word splitting before passing the arguments to the Python process (on Unix; on Windows it's the program's startup code that actually parses the command line, but for portability, you can't rely on that).
However, remember that POSIX shell quoting rules allow passing any characters you may want (except NUL bytes that terminate strings).
Compare starting a process from Python using subprocess.call with or without the shell argument set. With shell=False the list of strings is what comes up in the sys.argv in the started process (starting with the script path; parameters processed by Python itself are removed) while with shell=True the string is passed to the shell which interprets it according to its own rules.
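What gets lost can be previewed with shlex.split, which approximates POSIX shell word splitting (it does not perform globbing or variable expansion). The sample command line is made up for illustration.

```python
import shlex

# As typed: quotes, a double space inside the quotes, a run of spaces, and a tab.
typed = 'prog "a  b"   c\td'

# Roughly what a POSIX shell would hand to the program as its argument list.
argv = shlex.split(typed)
print(argv)  # ['prog', 'a  b', 'c', 'd']
```

The quotes and the separators between words are gone, while whitespace that was protected by quotes survives inside the argument; the original string cannot be reconstructed from the resulting list.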
