Disable file access to a python program from command line - python

Is there a command line command to disable access to all files and folders in the system from a python program? It should give an error if the program tries to access any file. For example, in the command line:
$ python filename.py <some_command>
or something similar.
It should not allow functions like open('filename.txt') in the program.
Edit : sudo allows to run programs with admin access. Can we create command like sudo which will limits access to another files and folders?
Thank you.

From the list of command line options for python there doesn't seem to be this option (and sandboxing to prevent IO appears to not be too effective). You could make your own command arguments to gain this functionality, for example
import argparse
class blockIOError(Exception):
pass
parser = argparse.ArgumentParser(description='Block IO operations')
parser.add_argument('-bIO','-blockIO',
action='store_true',
help='Flag to prevent input/output (default: False)')
args = parser.parse_args()
blockIO = args.bIO
if not blockIO:
with open('filename.txt') as f:
print(f.read())
else:
raise blockIOError("Error -- input/output not allowed")
The down side is you need to wrap every open, read etc in an if statement. The advantage is you can specify exactly what you want to allow. Output then would look like:
$ python 36477901.py -bIO
Traceback (most recent call last):
File "36477901.py", line 19, in <module>
raise blockIOError("Error -- input/output not allowed")
__main__.blockIOError: Error -- input/output not allowed

Related

Pyinstaller not allowing multiprocessing with MacOS

I have a python file that I would like to package as an executable for MacOS 11.6.
The python file (called Service.py) relies on one other json file and runs perfectly fine when running with python. My file uses argparse as the arguments can differ depending on what is needed.
Example of how the file is called with python:
python3 Service.py -v Zephyr_Scale_Cloud https://myurl.cloud/ philippa#email.com password1 group3
The file is run in exactly the same way when it is an executable:
./Service.py -v Zephyr_Scale_Cloud https://myurl.cloud/ philippa#email.com password1 group3
I can package the file using PyInstaller and the executable runs.
Command used to package the file:
pyinstaller --paths=/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/ Service.py
However, when I get to the point that requires multiprocessing, the arguments get lost. My second argument (here noted as https://myurl.cloud) is a URL that I require.
The error I see is:
[MAIN] Starting new process RUNID9157
url before constructing the client recognised as pipe_handle=15
usage: Service [-h] test_management_tool url
Service: error: the following arguments are required: url
Traceback (most recent call last):
File "urllib3/connection.py", line 174, in _new_conn
File "urllib3/util/connection.py", line 72, in create_connection
File "socket.py", line 954, in getaddrinfo
I have done some logging and the url does get correctly read. But as soon as the process started and picking up what it needs to, the url is changed to 'pipe_handle=x', in the code above it is pipe_handle=15.
I need the url to retrieve an authentication token, but it just stops being read as the correct value and is changed to this pipe_handle value. I have no idea why.
Has anyone else seen this?!
I am using Python 3.9, PyInstaller 4.4 and ArgParse.
I have also added
if __name__ == "__main__":
if sys.platform.startswith('win'):
# On Windows - multiprocessing is different to Unix and Mac.
multiprocessing.freeze_support()
to my if name = "main" section as I saw this on other posts but it doesn't help.
Can someone please assist?
Sending commands via sys.argv is complicated by the fact that multiprocessing's "spawn" start method uses that to pass the file descriptors for the initial communication pipes between the parent and child.
I'm projecting here a little because you did not share the code of how/where you call argparse, and how/where you call multiprocessing
If you are parsing args outside of if __name__ == "__main__":, the args may get parsed (re-parsed on child import __main__) before sys.argv gets automatically cleaned up by multiprocessing.spawn.prepare() in the child. You should be able to fix this by moving the argparse stuff inside your target function. It also may be easier to parse the args in the parent, and simply send the parsed results as an argument to the target function. See this answer of mine for further discussion on sys.argv with multiprocessing.

Finding which files are being read from during a session (python code)

I have a large system written in python. when I run it, it reads all sorts of data from many different files on my filesystem. There are thousands lines of code, and hundreds of files, most of them are not actually being used. I want to see which files are actually being accessed by the system (ubuntu), and hopefully, where in the code they are being opened. Filenames are decided dynamically using variables etc., so the actual filenames cannot be determined just by looking at the code.
I have access to the code, of course, and can change it.
I try to figure how to do this efficiently, with minimal changes in the code:
is there a Linux way to determine which files are accessed, and at what times? this might be useful, although it won't tell me where in the code this happens
is there a simple way to make an "open file" command also log the file name, time, etc... of the open file? hopefully without having to go into the code and change every open command, there are many of them, and some are not being used at runtime.
Thanks
You can trace file accesses without modifying your code, using strace.
Either you start your program with strace, like this
strace -f -e trace=file your_program.py
Otherwise you attach strace to a running program like this
strace -f -e trace=file -p <PID>
For 1 - You can use
ls -la /proc/<PID>/fd`
Replacing <PID> with your process id.
Note that it will give you all the open file descriptors, some of them are stdin stdout stderr, and often other things, such as open websockets (which use a file descriptor), however filtering it for files should be easy.
For 2- See the great solution proposed here -
Override python open function when used with the 'as' keyword to print anything
e.g. overriding the open function with your own, which could include the additional logging.
One possible method is to "overload" the open function. This will have many effects that depend on the code, so I would do that very carefully if needed, but basically here's an example:
>>> _open = open
>>> def open(filename):
... print(filename)
... return _open(filename)
...
>>> open('somefile.txt')
somefile.txt
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 3, in open
FileNotFoundError: [Errno 2] No such file or directory: 'somefile.txt'
As you can see my new open function will return the original open (renamed as _open) but will first print out the argument (the filename). This can be done with more sophistication to log the filename if needed, but the most important thing is that this needs to run before any use of open in your code

Python: Executing a shell command

I need to do this:
paste file1 file2 file3 > result
I have the following in my python script:
from subprocess import call
// other code here.
// Here is how I call the shell command
call ["paste", "file1", "file2", "file3", ">", "result"])
Unfortunately I get this error:
paste: >: No such file or directory.
Any help with this will be great!
You need to implement the redirection yourself, if you're wisely deciding not to use a shell.
The docs at https://docs.python.org/2/library/subprocess.html warn you not to use a pipe -- but, you don't need to:
import subprocess
with open('result', 'w') as out:
subprocess.call(["paste", "file1", "file2", "file3"], stdout=out)
should be just fine.
There are two approaches to this.
Use shell=True:
call("paste file1 file2 file3 >result", shell=True)
Redirection, >, is a shell feature. Consequently, you can only access it when using a shell: shell=True.
Keep shell=False and use python to perform the redirection:
with open('results', 'w') as f:
subprocess.call(["paste", "file1", "file2", "file3"], stdout=f)
The second is normally preferred as it avoids the vagaries of the shell.
Discussion
When the shell is not used, > is just another character on the command line. Thus, consider the error message:
paste: >: No such file or directory.
This indicates that paste had received > as an argument and was trying to open a file by that name. No such file exists. Therefore the message.
As the shell command line, one can create a file by that name:
touch '>'
If such a file had existed, paste, when called by subprocess with shell=False, would have used that file for input.
If you don't mind adding an additional dependency in your code base you might consider installing the sh Python module (from PyPI:sh using pip, of course).
This is a rather clever wrapper around Python's subprocess module's functionality. Using sh your code would look something like:
#!/usr/bin/python
from sh import paste
paste('file1', 'file2', 'file3', _out='result')
... although I think you'd want some exception handling around that so you could use something like:
#!/usr/bin/python
from __future__ import print_function
import sys
from sh import paste
from tempfile import TemporaryFile
with tempfile.TemporaryFile() as err:
try:
paste('file1', 'file2', 'file3', _out='result', _err=err)
except (EnvironmentError, sh.ErrorReturnCode) as e:
err.seek(0)
print("Caught Error: %s" % err.read(), file=sys.stderr)
sh makes such things almost trivially easy although there are some tricks as you get more advanced. You also have to note the difference between _out= and other keyword arguments of that form, vs. sh's magic for most other keyword arguments.
All that sh magic make confuse anyone else who ever reads your code. You might also find that using Python modules with sh code interlaced into it makes you complacent about portability issues. Python code is generally fairly portable while Unix command line utilities can vary considerably from one OS to another and even from one Linux distribution or version to another. Having lots of shell utilities interlaced with your Python code in such a transparent way may make that problem less visible.

Python stdin filename

I'm trying to get the filename thats given in the command line. For example:
python3 ritwc.py < DarkAndStormyNight.txt
I'm trying to get DarkAndStormyNight.txt
When I try fileinput.filename() I get back same with sys.stdin. Is this possible? I'm not looking for sys.argv[0] which returns the current script name.
Thanks!
In general it is not possible to obtain the filename in a platform-agnostic way. The other answers cover sensible alternatives like passing the name on the command-line.
On Linux, and some related systems, you can obtain the name of the file through the following trick:
import os
print(os.readlink('/proc/self/fd/0'))
/proc/ is a special filesystem on Linux that gives information about processes on the machine. self means the current running process (the one that opens the file). fd is a directory containing symbolic links for each open file descriptor in the process. 0 is the file descriptor number for stdin.
You can use ArgumentParser, which automattically gives you interface with commandline arguments, and even provides help, etc
from argparse import ArgumentParser
parser = ArgumentParser()
parser.add_argument('fname', metavar='FILE', help='file to process')
args = parser.parse_args()
with open(args.fname) as f:
#do stuff with f
Now you call python2 ritwc.py DarkAndStormyNight.txt. If you call python3 ritwc.py with no argument, it'll give an error saying it expected argument for FILE. You can also now call python3 ritwc.py -h and it will explain that a file to process is required.
PS here's a great intro in how to use it: http://docs.python.org/3.3/howto/argparse.html
In fact, as it seams that python cannot see that filename when the stdin is redirected from the console, you have an alternative:
Call your program like this:
python3 ritwc.py -i your_file.txt
and then add the following code to redirect the stdin from inside python, so that you have access to the filename through the variable "filename_in":
import sys
flag=0
for arg in sys.argv:
if flag:
filename_in = arg
break
if arg=="-i":
flag=1
sys.stdin = open(filename_in, 'r')
#the rest of your code...
If now you use the command:
print(sys.stdin.name)
you get your filename; however, when you do the same print command after redirecting stdin from the console you would got the result: <stdin>, which shall be an evidence that python can't see the filename in that way.
I don't think it's possible. As far as your python script is concerned it's writing to stdout. The fact that you are capturing what is written to stdout and writing it to file in your shell has nothing to do with the python script.

Using python subprocess to fake running a cmd from a terminal

We have a vendor-supplied python tool ( that's byte-compiled, we don't have the source ). Because of this, we're also locked into using the vendor supplied python 2.4. The way to the util is:
source login.sh
oupload [options]
The login.sh just sets a few env variables, and then 2 aliases:
odownload () {
${PYTHON_CMD} ${OCLIPATH}/ocli/commands/word_download_command.pyc "$#"
}
oupload () {
${PYTHON_CMD} ${OCLIPATH}/ocli/commands/word_upload_command.pyc "$#"
}
Now, when I run it their way - works fine. It will prompt for a username and password, then do it's thing.
I'm trying to create a wrapper around the tool to do some extra steps after it's run and provide some sane defaults for the utility. The problem I'm running into is I cannot, for the life of me, figure out how to use subprocess to successfully do this. It seems to realize that the original command isn't running directly from the terminal and bails.
I created a '/usr/local/bin/oupload' and copied from the original login.sh. Only difference is instead of doing an alias at the end, I actually run the command.
Then, in my python script, I try to run my new shell script:
if os.path.exists(options.zipfile):
try:
cmd = string.join(cmdargs,' ')
p1 = Popen(cmd, shell=True, stdin=PIPE)
But I get:
Enter Opsware Username: Traceback (most recent call last):
File "./command.py", line 31, in main
File "./controller.py", line 51, in handle
File "./controllers/word_upload_controller.py", line 81, in _handle
File "./controller.py", line 66, in _determineNew
File "./lib/util.py", line 83, in determineNew
File "./lib/util.py", line 112, in getAuth
Empty Username not legal
Unknown Error Encountered
SUMMARY:
Name: Empty Username not legal
Description: None
So it seemed like an extra carriage return was getting sent ( I tried rstripping all the options, didn't help ).
If I don't set stdin=PIPE, I get:
Enter Opsware Username: Traceback (most recent call last):
File "./command.py", line 31, in main
File "./controller.py", line 51, in handle
File "./controllers/word_upload_controller.py", line 81, in _handle
File "./controller.py", line 66, in _determineNew
File "./lib/util.py", line 83, in determineNew
File "./lib/util.py", line 109, in getAuth
IOError: [Errno 5] Input/output error
Unknown Error Encountered
I've tried other variations of using p1.communicate, p1.stdin.write() along with shell=False and shell=True, but I've had no luck in trying to figure out how to properly send along the username and password. As a last result, I tried looking at the byte code for the utility they provided - it didn't help - once I called the util's main routine with the proper arguments, it ended up core dumping w/ thread errors.
Final thoughts - the utility doesn't want to seem to 'wait' for any input. When run from the shell, it pauses at the 'Username' prompt. When run through python's popen, it just blazes thru and ends, assuming no password was given. I tried to lookup ways of maybe preloading the stdin buffer - thinking maybe the process would read from that if it was available, but couldn't figure out if that was possible.
I'm trying to stay away from using pexpect, mainly because we have to use the vendor's provided python 2.4 because of the precompiled libraries they provide and I'm trying to keep distribution of the script to as minimal a footprint as possible - if I have to, I have to, but I'd rather not use it ( and I honestly have no idea if it works in this situation either ).
Any thoughts on what else I could try would be most appreciated.
UPDATE
So I solved this by diving further into the bytecode and figuring out what I was missing from the compiled command.
However, this presented two problems -
The vendor code, when called, was doing an exit when it completed
The vendor code was writing to stdout, which I needed to store and operate on ( it contains the ID of the uploaded pkg ). I couldn't just redirect stdout, because the vendor code was still asking for the username/password.
1 was solved easy enough by wrapping their code in a try/except clause.
2 was solved by doing something similar to: https://stackoverflow.com/a/616672/677373
Instead of a log file, I used cStringIO. I also had to implement a fake 'flush' method, since it seems the vendor code was calling that and complaining that the new obj I had provided for stdout didn't supply it - code ends up looking like:
class Logger(object):
def __init__(self):
self.terminal = sys.stdout
self.log = StringIO()
def write(self, message):
self.terminal.write(message)
self.log.write(message)
def flush(self):
self.terminal.flush()
self.log.flush()
if os.path.exists(options.zipfile):
try:
os.environ['OCLI_CODESET'] = 'ISO-8859-1'
backup = sys.stdout
sys.stdout = output = Logger()
# UploadCommand was the command found in the bytecode
upload = UploadCommand()
try:
upload.main(cmdargs)
except Exception, rc:
pass
sys.stdout = backup
# now do some fancy stuff with output from output.log
I should note that the only reason I simply do a 'pass' in the except: clause is that the except clause is always called. The 'rc' is actually the return code from the command, so I will probably add handling for non-zero cases.
I tried to lookup ways of maybe preloading the stdin buffer
Do you perhaps want to create a named fifo, fill it with username/password info, then reopen it in read mode and pass it to popen (as in popen(..., stdin=myfilledbuffer))?
You could also just create an ordinary temporary file, write the data to it, and reopen it in read mode, again, passing the reopened handle as stdin. (This is something I'd personally avoid doing, since writing username/passwords to temporary files is often of the bad. OTOH it's easier to test than FIFOs)
As for the underlying cause: I suspect that the offending software is reading from stdin via a non-blocking method. Not sure why that works when connected to a terminal.
AAAANYWAY: no need to use pipes directly via Popen at all, right? I kinda laugh at the hackishness of this, but I'll bet it'll work for you:
# you don't actually seem to need popen here IMO -- call() does better for this application.
statuscode = call('echo "%s\n%s\n" | oupload %s' % (username, password, options) , shell=True)
tested with status = call('echo "foo\nbar\nbar\nbaz" |wc -l', shell = True) (output is '4', naturally.)
The original question was solved by just avoiding the issue and not using the terminal and instead importing the python code that was being called by the shell script and just using that.
I believe J.F. Sebastian's answer would probably work better for what was originally asked, however, so I'd suggest people looking for an answer to a similar question look down the path of using the pty module.

Categories

Resources