Python: multiple instances of a program writing and then reading the same file

I wrote a piece of Python code that calls an external program to write an intermediate file, which my code then reads. I want to run multiple instances of my code simultaneously. Will there be any conflict if I code it like this?
import subprocess

args = ['/usr/bin/program', '-o', 'intermediate_file']
process = subprocess.Popen(args, shell=False)
process.wait()
if process.returncode == 0:
    fh = open('intermediate_file', 'r')
    process_output(fh)  # placeholder for the code that consumes the file
    ...

Concurrent file access is handled by the operating system. There are several scenarios, depending on the OS and/or filesystem you use. Take a look at the Wikipedia article on file locking.
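For instance, on Linux you could take an advisory lock so two instances never touch the intermediate file at the same time. A minimal sketch using the standard fcntl module ('intermediate_file' is the name from the question):

import fcntl

with open('intermediate_file', 'r') as fh:
    fcntl.flock(fh, fcntl.LOCK_EX)  # blocks until the exclusive lock is free
    data = fh.read()
    fcntl.flock(fh, fcntl.LOCK_UN)  # release the lock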

Take a look here: tempfile
You can use this module to avoid conflicts: temporary files get random names, so each instance works on its own file.
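A minimal sketch of that approach, reusing the command from the question ('/usr/bin/program' and its -o flag are assumptions taken from there):

import subprocess
import tempfile

# Each instance gets a uniquely named intermediate file, so
# concurrent runs cannot clash; the file is deleted on exit.
with tempfile.NamedTemporaryFile(mode='r', suffix='.out') as tmp:
    args = ['/usr/bin/program', '-o', tmp.name]
    process = subprocess.Popen(args, shell=False)
    process.wait()
    if process.returncode == 0:
        data = tmp.read()  # read back this instance's own output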

Related

Remove file in python with absolute path: error "no such file or directory" but file exists

I'm trying to remove a file in Python 3 on Linux (RHEL) the following way:
os.remove(os.getcwd() + '/file.txt')
(sorry not allowed to publish the real paths).
and it gives me the usual error
No such file or directory: '/path/to/file/file.txt'
(I've checked the slash/backslash direction in the path.)
What is strange is that when I just ls the file (by copy pasting, so the very same path) the file does exist.
I've read this post, but I'm not on Windows and the slash direction seems correct.
Any idea ?
EDIT: as suggested by @DominicPrice, os.system('ls') shows the file while os.listdir() does not (though it shows other files in the same directory).
EDIT 2: So my issue was due to a bad usage of os.popen. I used this method to copy the file but did not wait for the subprocess to terminate, so my understanding is that the file had not been copied yet when I tried to delete it.
The problem is that, as you have explained in the comments, you are creating the file using os.popen("cp ..."). This runs asynchronously, so it may not have had time to complete by the time you call os.remove(). You can force Python to wait for it to finish by calling the close method:
proc = os.popen("cp myfile myotherfile")
proc.close() # wait for process to finish
os.remove("myotherfile") # we're all good
I would highly recommend staying away from os.popen in favour of the subprocess library, which has a run function that is much safer to use.
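For the same copy as above, that might look like this (a sketch; the file names are the placeholders from the snippet above):

import subprocess

# run() blocks until the command finishes; check=True raises
# CalledProcessError if the copy fails, instead of failing silently.
subprocess.run(["cp", "myfile", "myotherfile"], check=True)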
For the specific functions of copying a file, an even better (and cross platform) solution is to use the shutil library:
import shutil
shutil.copyfile("myfile", "myotherfile")
You should use os.path.dirname(__file__).
This is a built-in function of Python's os.path module.
You can read more here:
https://www.geeksforgeeks.org/find-path-to-the-given-file-using-python/
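For example (a sketch; 'file.txt' is the name from the question), this builds the path relative to the script's own location rather than the current working directory:

import os

# Resolve file.txt next to this script, independent of the
# directory the script was launched from.
file_path = os.path.join(os.path.dirname(os.path.abspath(__file__)), 'file.txt')
os.remove(file_path)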

Why does python3.7 pass argv elements alongside 'garbage'

I had this script working for me before I decided to rewrite everything and make it portable.
Without delving too much into the details, there's a central Bash script which calls 5 other Bash scripts in their own respective folders. I have no intention of porting to Windows anytime soon; for now this is Linux only.
The execution path of the central Bash script is:
dos.1/1-init.sh dos.1/
dos.2/1-trace-to-file.sh dos.2/ dos.1/
dos.3/1-recognize-categories.sh dos.3/
dos.4/1-ping-in-groups.sh dos.4/ dos.3/
dos.5/init.sh dos.5/ dos.4/
I run with ./init.sh
Before the script was 'portable' I was using explicit file paths inside each respective script. All was well and good. The program itself is a combination of Bash and Python, and writes to files in one directory, so that they can be manipulated in various ways, before being read back into different parts of the program.
I understand that the fastest way to do this would be to write a monolithic Python script, using subprocess calls for the Bash side of things... However, I am doing it this way to ease maintenance, and (before I started making it 'portable') it was lightning fast.
My issue now is this: each time I have to read text into Python (either from SQL or from file) there's always this added garbage. Up until this point, I have been using sed, awk and Python's .rstrip() function to manage this... Which is all well and good, but this one damn function will not play nice... And I feel there must be a better way.
In bash I call it with:
prog_dir=$1
data_dir=$2
$prog_dir/2fast-ping.py $data_dir/group0.txt > $prog_dir/group0_averages.txt
$prog_dir/2fast-ping.py $data_dir/group1.txt > $prog_dir/group1_averages.txt
...
Now I know that I could write to file from within Python, but in this instance I have other reasons not to.
The issue is that when the 2fast-ping.py script is run, it reads the text file in with commas and a newline char. I have vigorously checked and can confirm that the group#.txt files 100% do not contain commas. Here's the Python:
import sys
import subprocess
import select
from concurrent.futures import ThreadPoolExecutor
filename = sys.argv[1]
f = open(filename, "r")
ips = [elem.rstrip('\n') for elem in f]
print(ips)
f.close()
The script goes on to do some work on the IPs afterwards, but this is the painful part. If I call the script directly from the CLI: ./2fast-ping.py ../dos.3/group0.txt, the text is processed PROPERLY and the subsequent instructions actually function. But when called from the first init script, the program basically sh*ts itself because each line is read in with commas. It works until the point where it starts to use the processed info, then:
<actual IP would be here>
ping: ('##.###.###.###',): Name or service not known
Of course, the issue is the ('',) part. But Python is adding that in, and I don't know how to stop it :(
Any ideas?
Resolved: the Python code was okay, I was just passing an additional / with the argument :(
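As an aside, note that print(ips) writes the list's repr (brackets, quotes and commas) to stdout, which is exactly what the > redirect captures. If the output file should hold one IP per line, a sketch like this avoids the repr entirely:

import sys

filename = sys.argv[1]
with open(filename) as f:
    ips = [line.rstrip('\n') for line in f]

# One IP per line, so downstream tools never see brackets or commas.
print('\n'.join(ips))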

Python variable value from separate file?

I have a Python script that runs 24hrs a day.
A module from this script uses variable values that I wish to change from time to time, without having to stop the script, edit the module file, and launch the script again (I need to avoid interruptions as much as I can).
I thought about storing the variables in a separate file, and the module would, when needed, fetch the new values from the file and use them.
Pickle seemed like a solution but is not human readable and therefore not easily changeable. Maybe a JSON file, or another .py file I import over again?
Another advantage of doing so, for me, is that in case of interruption (eg. server restart), I can resume the script with the latest variable values if I load them from a separate file.
Is there a recommended way of doing such things?
Something along the lines of:
# variables.py:
variable1 = 10
variable2 = 25

# main file:
import time
import importlib
import variables

while True:
    importlib.reload(variables)  # a bare import would reuse the cached module
    print('Sum:', variables.variable1 + variables.variable2)
    time.sleep(60)
An easy way to maintain a text file with variables would be the YAML format. This answer explains how to use it; basically:

import yaml

with open("vars.yaml", "r") as stream:
    docs = list(yaml.safe_load_all(stream))  # consume before the file closes
If you have more than a few variables, it may be good to check the file's modification time to see if it was recently updated, and only re-load the variables when the file has changed.
import os
last_updated = os.path.getmtime('vars.yaml')
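Putting both ideas together, a minimal sketch (assuming vars.yaml holds a single mapping with keys variable1 and variable2, as in the question):

import os
import time
import yaml

path = 'vars.yaml'
last_updated = 0.0
variables = {}

while True:
    mtime = os.path.getmtime(path)
    if mtime > last_updated:  # re-read only when the file changed
        with open(path) as stream:
            variables = yaml.safe_load(stream)
        last_updated = mtime
    print('Sum:', variables['variable1'] + variables['variable2'])
    time.sleep(60)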
Finally, since you want to avoid interrupting the script, it may be good to have it catch any errors in the YAML file and warn the user, instead of just throwing an exception and dying. But also remember that "errors should never pass silently". The best approach here will depend on your use case.

How to grab files generated by a subprocess?

I want to run some command line scripts from within my Python program. These scripts generate some output files. I want to grab those output files from the subprocess call as objects in my Python program, while avoiding the files being written to disk. The problem is I don't know how to do it, or whether it is even possible.
A simple example would look like this:
#foo.py
fout1 = open("temp1.txt","w")
fout2 = open("temp2.txt","w")
fout1.write("fout1")
fout2.write("fout2")
fout1.close()
fout2.close()
#test.py
import subprocess
process = subprocess.Popen(["python","foo.py"], ????????) #what arguments to use to grab temp1.txt and temp2.txt
print(process.??????) #how to access those files
I am familiar with subprocess.Popen so that is what the example code uses, but I am open to the use of other modules too if they could do it.
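One possible sketch (not a way to avoid the writes entirely, but it confines them): run the child in a temporary directory, read its files into memory, and let the directory be deleted automatically. This assumes foo.py writes into its current working directory, as above:

import os
import subprocess
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    # Run foo.py with the throwaway directory as its cwd.
    subprocess.run(
        ["python", os.path.abspath("foo.py")],
        cwd=tmpdir,
        check=True,
    )
    # Slurp everything the child wrote into memory before cleanup.
    outputs = {}
    for name in os.listdir(tmpdir):
        with open(os.path.join(tmpdir, name)) as f:
            outputs[name] = f.read()

print(outputs["temp1.txt"])  # "fout1"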

Python run command line (time)

I want to run the 'time' Unix command from a Python script, to time the execution of a non-Python app. I would use the os.system method.
Is there any way to save the output of this in Python? My goal is to run the app several times, save the execution times, and then do some statistics on them.
Thank you.
You really should be using the subprocess module to run external commands. The Popen() constructor lets you specify a file object where stdout should go (note that you can use any Python object that behaves like a file; you don't necessarily need to write it to an actual file).
For instance:
import subprocess
log_file = open('/path/to/file', 'a')
return_code = subprocess.Popen(['/usr/bin/foo', 'arg1', 'arg2'], stdout=log_file).wait()
You can use the timeit module to time code snippets (say, a function that launches external commands using the subprocess module as described in the answer above) and save the data to a CSV file. You can do statistics on the CSV data using a stats module, or externally with Excel, LogParser, R, etc.
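A minimal sketch of that idea (reusing the placeholder command '/usr/bin/foo' from the answer above; 'timings.csv' is an assumed file name):

import csv
import subprocess
import timeit

def run_app():
    subprocess.run(['/usr/bin/foo', 'arg1', 'arg2'], check=True)

# Time one full run of the external app, then append it as a CSV row.
elapsed = timeit.timeit(run_app, number=1)

with open('timings.csv', 'a', newline='') as f:
    csv.writer(f).writerow([elapsed])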
Another approach is to use the hotshot profiler (Python 2 only; it was removed in Python 3). It does the profiling and also returns stats that you can either print using the print_stats() method or save to a file by iterating over them.
