Why is a Python exported file not closing?

When exporting a CSV file from Python, for some reason the file does not close (even when using the 'with' statement), because when I access it afterwards I get the following error:
PermissionError: [WinError 32] The process cannot access the file because it is being used by another process
I suppose it has to be the close that hangs, because the statements after the with block (or after a close() call) do run, e.g. print(fileName) gets printed. Any suggestions that might solve this matter?
(Also, when I try to open the exported CSV file, I get a read-only message because it is in use by another program. I can only access it properly once Python is closed, which is just annoying.)
import csv, numpy, os
import DyMat
import subprocess

os.chdir("C:/Users/myvhove/Documents/ResultsPyDymInt/Dymola/CoupledClutches")
dm = DyMat.DyMatFile("dymatresfile")
print(dm.names())
varList = ('J1.w', 'J2.w', 'J3.w', 'J4.w')
fileName = dm.fileName + '.csv'
with open(fileName, 'w', newline='') as oFile:
    csvWriter = csv.writer(oFile)
    vDict = dm.sortByBlocks(varList)
    for vList in vDict.values():
        vData = dm.getVarArray(vList)
        vList.insert(0, dm._absc[0])
        csvWriter.writerow(vList)
        csvWriter.writerows(numpy.transpose(vData))
subprocess.call("dymatresfile.csv")
print(fileName)

The code is correct; the problem must be somewhere else.
It is either another forgotten Python process or, as @CristiFati mentioned, an open editor.
In the worst case, restart the PC and run the Python script directly after logging in again.
The error should no longer be there.
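As a quick sanity check, you can confirm that the with statement really does close the file before the script continues. A minimal sketch (check.csv is just a placeholder name):
with open("check.csv", "w", newline="") as oFile:
    oFile.write("a,b\n")
print(oFile.closed)  # prints True: this script is no longer holding the file
If Windows still reports the file as in use after this, the handle is held by another process (a viewer, an editor, or a second Python instance), not by the with block.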

Related

If you os.open() and open() a file, do you need to both close() and os.close() it?

I'm opening a file in Python using os.open(), and then I want to write to it using the file interface, i.e. f.write().
So I'm doing the following:
import os
fd = os.open('log', os.O_CREAT | os.O_RDWR, 0b111110110)
f = open(fd, 'w')
But when it's time to clean up, I don't know whether I should call both os.close(fd) and f.close(), or only one.
Calling os.close(fd) xor f.close() is fine (i.e. exactly one of them), but calling both throws a "bad file descriptor" error: open(fd, ...) takes ownership of the descriptor by default, so f.close() already closes fd.
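A minimal sketch of both options (the file names are placeholders):
import os

# Default: the file object takes ownership of fd, so one close suffices
fd = os.open('log', os.O_CREAT | os.O_RDWR, 0o766)
f = open(fd, 'w')
f.close()  # closes the wrapper AND the underlying descriptor

# With closefd=False the descriptor stays yours, so both closes are needed
fd2 = os.open('log2', os.O_CREAT | os.O_RDWR, 0o766)
f2 = open(fd2, 'w', closefd=False)
f2.close()     # closes only the Python-level wrapper
os.close(fd2)  # the raw descriptor must be closed separately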

What is 'target' in 'target = open(...)'

I'm learning Python as someone more familiar with databases and ETL. I'm not sure where target comes from in the following code.
from sys import argv
script, filename = argv
target = open(filename, 'w')
I think argv is a class in the sys module, but I don't think target comes from argv.
If you evaluate target in a REPL (or print(repr(target))), you will get something like this: <_io.TextIOWrapper name='dde-recommendation-engine/sample_data/synthetic-micro/ratings.txt' mode='w' encoding='UTF-8'>
What that means in simple terms is that target is an object accessing that particular file (with only write permission, because you used the 'w' mode).
You can use this object to add content to the file with target.write(...).
Do remember to close the file, however, by calling target.close() at the end.
Another way to do the same, and I prefer this most of the time, is:
with open(filename, 'w') as target:
    target.write(...)
This way the file is closed automatically once you leave the with context.
argv is the list populated with the arguments provided by the user when running the program from the shell. Please see https://docs.python.org/3/library/sys.html#sys.argv for more info on that.
The user supplied the filename from the shell, and the program used the open call (https://docs.python.org/3/library/functions.html#open) to get a file handle on that filename.
And that file handle is stored in a variable called target (which could be named anything you like) so that you can process the file using the other file methods.
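Putting it together, a minimal sketch (script.py and output.txt are placeholder names, assuming the script is run as python script.py output.txt):
from sys import argv

script, filename = argv       # script == 'script.py', filename == 'output.txt'
target = open(filename, 'w')  # target is the file object returned by open()
target.write("first line\n")
target.close()                # flush and release the file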
You are using open(), a built-in function in Python. This function returns a file object, which is assigned to the target variable. Now you can interact with target to write data (since you opened the file with the 'w' mode).

Is there a way to check what part of my code leaves file handles open

Is there a way to trace a Python process to check where a file is being opened? I have too many files open when I use lsof on my running process, but I'm not sure where they are being opened:
ls /proc/$pid/fd/ | wc -l
I suspect one of the libraries I'm using might not have handled its files properly. Is there a way to isolate exactly which line in my Python code the files are being opened from?
In my code I work with third-party libraries to process thousands of media files, and since they are left open I receive the error
OSError: [Errno 24] Too many open files
after running for a few minutes. Now I know that raising the limit of open files is an option, but this would just push the error to a later point in time.
The easiest way to trace the open calls is to use an audit hook (available since Python 3.8). Note that this method only traces Python-level open calls, not system calls.
Let fdmod.py be a module file with a single function foo:
def foo():
    return open("/dev/zero", mode="r")
Now the main code in file fd_trace.py, which traces all open calls and imports fdmod, is defined as follows:
import sys
import inspect
import fdmod

def open_audit_hook(name, *args):
    if name == "open":
        print(name, *args, "was called:")
        caller = inspect.currentframe()
        while caller := caller.f_back:
            print(f"\tFunction {caller.f_code.co_name} "
                  f"in {caller.f_code.co_filename}:"
                  f"{caller.f_lineno}")

sys.addaudithook(open_audit_hook)

# main code
fdmod.foo()
with open("/dev/null", "w") as dev_null:
    dev_null.write("hi")
fdmod.foo()
When we run fd_trace.py, the call stack is printed whenever some component calls open:
% python3 fd_trace.py
open ('/dev/zero', 'r', 524288) was called:
Function foo in /home/tkrennwa/fdmod.py:2
Function <module> in fd_trace.py:17
open ('/dev/null', 'w', 524865) was called:
Function <module> in fd_trace.py:18
open ('/dev/zero', 'r', 524288) was called:
Function foo in /home/tkrennwa/fdmod.py:2
Function <module> in fd_trace.py:20
See sys.addaudithook and inspect.currentframe for details.
You might get useful information using strace. This will show all system calls made by a process, including calls to open(). It will not directly show you where in the Python code those calls occur, but you may be able to deduce some information from the context.
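For example (a sketch; myscript.py is a placeholder for your program):
strace -f -e trace=open,openat python3 myscript.py
The -f flag follows child processes, and -e trace=open,openat limits the output to the open-related system calls.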
Seeing open file handles is easy on Linux:
import os

open_file_handles = os.listdir('/proc/self/fd')
print('open file handles: ' + ', '.join(map(str, open_file_handles)))
You can also use the following on other Unix-like systems (e.g. macOS) where /proc is not available; note that the resource module does not exist on Windows:
import errno, os, resource

open_file_handles = []
for fd in range(resource.getrlimit(resource.RLIMIT_NOFILE)[0]):
    try:
        os.fstat(fd)
    except OSError as e:
        if e.errno == errno.EBADF:
            continue
    open_file_handles.append(fd)
print('open file handles: ' + ', '.join(map(str, open_file_handles)))
Note: this should always work, assuming you're actually (occasionally) running out of file handles. The soft limit is usually small (e.g. 256 on macOS), but it might take a long time if the maximum (set by OS/user policy) is something huge like a billion.
Note also: there will almost always be at least three file handles open, for STDIN, STDOUT, and STDERR respectively.
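A cross-platform alternative is a sketch like the following, assuming the third-party psutil package is installed:
import psutil

# List the regular files currently opened by this process, with their descriptors
for f in psutil.Process().open_files():
    print(f.fd, f.path)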

How to open a character device special file (originally thought to be a named pipe) for reading and writing in Python

I have a service running on a Linux box that creates a character device special file (which I first took for a named pipe), and I want to write a Python 3 program that communicates with the service by writing text commands to, and reading text replies from, the device. I don't have source code for the service.
I can use os.open(named_pipe_pathname, os.O_RDWR), and I can use os.read(...) and os.write(...) to read and write it, but that's a pain because I have to write my own code to convert between bytes and strings, I have to write my own readline(...) function, etc.
I would much rather use a Python 3 io object to read and write the device, but every way I can think of to create one returns the same error:
io.UnsupportedOperation: File or stream is not seekable.
For example, I get that message if I try open(pathname, "r+"), and I get the same message if I try fd = os.open(...) followed by os.fdopen(fd, "r+", ...).
Q: What is the preferred way for a Python 3 program to write text to and read text from a character device?
Edit:
Oops! I assumed that I was dealing with a named pipe because the documentation for the service describes it as a "pipe" and because it doesn't appear in the file system until the user-mode service runs. But the Linux file utility says it is, in fact, a character device special file.
The problem occurs because attempting to use io.open in read-write mode implicitly tries to wrap the underlying file in io.BufferedRandom (which is then wrapped in io.TextIOWrapper if in text mode), which assumes the underlying file is not only read/write, but random access, and it takes liberties (seeking implicitly) based on this. There is a separate class, io.BufferedRWPair, intended for use with read/write pipes (the docstring specifically mentions it being used for sockets and two way pipes).
You can mimic the effects of io.open by manually wrapping layer by layer to produce the same end result. Specifically, for a text mode wrapper, you'd do something like:
import io

rawf = io.FileIO(named_pipe_pathname, mode="rb+")
with io.TextIOWrapper(io.BufferedRWPair(rawf, rawf), encoding='utf-8', write_through=True) as txtf:
    del rawf  # Remove separate reference to rawf; txtf manages its lifetime now
    # Example use that works (but is terrible form, since communicating with
    # oneself without threading, the select module, etc., is highly likely to
    # deadlock). It works for this super-simple case; presumably you have some
    # parallel real code.
    txtf.write("abcé\n")
    txtf.flush()
    print(txtf.readline(), flush=True)
I believe this will close rawf twice when txtf is closed, but luckily, double-close is harmless here (the second close does nothing, realizing it's already closed).
Solution
You can use pexpect. Here is an example using two Python modules:
caller.py
import pexpect

proc = pexpect.spawn('python3 backwards.py')
proc.expect(' > ')
while True:
    n = proc.sendline(input('Feed me - '))
    proc.expect(' > ')
    print(proc.before[n+1:].decode())
backwards.py
x = ''
while True:
    x = input(x[::-1] + ' > ')
Explanation
caller.py is using a "pseudo-TTY device" to talk to backwards.py. We provide input with sendline and capture output with expect (and the before attribute).
It looks like you need to create separate handles for reading and for writing: opening a file in read/write mode requires it to be seekable, which a pipe is not. I couldn't figure out how to time out reading, so it's nice to add an opener (see the docstring for io.open) that opens the reader in non-blocking mode. I set up a simple echo service on a named pipe called /tmp/test_pipe:
In [1]: import io
In [2]: import os
In [3]: nonblockingOpener = lambda name, flags:os.open(name, flags|os.O_NONBLOCK)
In [4]: reader = io.open('/tmp/test_pipe', 'r', opener = nonblockingOpener)
In [5]: writer = io.open('/tmp/test_pipe', 'w')
In [6]: writer.write('Hi have a line\n')
In [7]: writer.flush()
In [8]: reader.readline()
Out[8]: 'You said: Hi have a line\n'
In [9]: reader.readline()
Out[9]: ''

Breaking the Python code when a particular file is being opened

I want to run code under a debugger and stop it when a file is being opened. I want to do that regardless of the technique by which the file was opened. AFAIK there are two ways of opening a file (if there are more, I want to stop the code in those cases too), and I want to stop the code when either of them is executed:
with open(filename, "wb") as outFile:
or
object = open(file_name [, access_mode][, buffering])
Is this possible under pdb or ipdb?
PS: I do not know the line where the file is being opened; if I knew it, I could set the breakpoint manually. I could also grep for open( and set breakpoints on the found lines, but if my code uses modules this might be problematic. Also, if the file is opened another way, not by open() (I do not know if this is possible, just guessing, maybe for appending etc.), this wouldn't work.
Ideally you'd put a breakpoint in the open builtin function, but that is not possible. Instead, you can override it and place the breakpoint there (Python 2 shown here):
import __builtin__

def open(name, mode='r', buffer=-1):
    return __builtin__.open(name, mode, buffer)  # place a BreakPoint here
Of course you'll be breaking at any file opening, not just the one you wanted.
So you can refine that a bit and place a conditional breakpoint:
import ipdb
import __builtin__

def open(name, mode='r', buffer=-1):
    if name == 'myfile.txt':
        ipdb.set_trace()  ######### Break Point ###########
    return __builtin__.open(name, mode, buffer)

f = open('myfile.txt', 'r')
Run your Python program with python -m pdb prog.py.
If you don't know where the open call is, you need to patch the original open at the earliest possible point (e.g. the __main__ guard) like this:
import __builtin__

_old_open = open

def my_open(*args, **kwargs):
    print "my_open"
    return _old_open(*args, **kwargs)

setattr(__builtin__, 'open', my_open)
print open(__file__, "rb").read()
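Both snippets above are Python 2 (__builtin__, print statements). A sketch of the same idea on Python 3, where 'myfile.txt' is a placeholder for the file you care about:
import builtins

_old_open = builtins.open

def my_open(*args, **kwargs):
    # Break only when the file of interest is opened
    if args and args[0] == 'myfile.txt':
        breakpoint()  # drops into pdb (or whatever PYTHONBREAKPOINT selects)
    return _old_open(*args, **kwargs)

builtins.open = my_open
On Python 3.8+ an audit hook (see the sys.addaudithook example earlier on this page) achieves the same without monkey-patching.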
