I want to run code under a debugger and stop it when a file is being opened, regardless of the technique used to open it. AFAIK there are two ways of opening a file (if there are more, I want to stop in those cases too), and I want to stop the code when either of them is executed:
with open(filename, "wb") as outFile:
or
object = open(file_name [, access_mode][, buffering])
Is this possible under pdb or ipdb?
PS: I do not know the line where the file is being opened; if I did, I could set the breakpoint manually. I could also grep for open( and set a breakpoint on each line found, but if my code uses modules this might be problematic. Also, if the file is opened some other way than open (I do not know if this is possible, just guessing, maybe for appending etc.), this wouldn't work.
Ideally you'd put a breakpoint in the open builtin function, but that is not possible. Instead, you can override it, and place the breakpoint there:
import __builtin__  # Python 2 name; the module is called builtins in Python 3
def open(name, mode='r', buffering=-1):
    return __builtin__.open(name, mode, buffering) # place a BreakPoint here
Of course you'll be breaking at any file opening, not just the one you wanted.
So you can refine that a bit and place a conditional breakpoint:
import ipdb
import __builtin__
def open(name, mode='r', buffering=-1):
    if name == 'myfile.txt':
        ipdb.set_trace() ######### Break Point ###########
    return __builtin__.open(name, mode, buffering)
f = open('myfile.txt', 'r')
Run your python program with python -m pdb prog.py.
If you don't know where the open call is, you need to patch the original open at the earliest possible point (e.g. the __main__-guard) like this:
import __builtin__
_old_open = open
def my_open(*args, **kwargs):
    print "my_open"
    return _old_open(*args, **kwargs)
setattr(__builtin__, 'open', my_open)
print open(__file__, "rb").read()
Related
I need to save some output from a script that I run in PyCharm. I know that I can use sys.stdout to do it, but when I do, the PyCharm console doesn't show me anything until the end of the run, and I sometimes need to see the text while the process is running to check whether something went wrong.
Can someone help me with that?
This is the code I'm using to redirect the console text to a .txt file:
import sys
file_path = 'log.txt'
sys.stdout = open(file_path, "w")
OK, I see you're trying to override sys.stdout; that is the wrong approach.
You can save data to a file like this:
file_path = 'log.txt'
my_file = open(file_path, "w")
my_file.write("some text")
my_file.close()
you can also select a file when using the print function to write it to a file instead of sys.stdout (by default "print" writes to sys.stdout):
file_path = 'log.txt'
my_file = open(file_path, "w")
print('some text', file=my_file)
my_file.close()
You might want to try the Python logging library, which lets you send output to the stdout console and to a log file simultaneously. It is much more robust and better documented than the approach I have included next.
The issue seems to be that when you first launch the Python session, the stdout stream is directed to the Python console; when you change stdout to be a text file stream, you redirect all output there, which prevents anything from being sent to the console.
If you want your own solution, you can try to create a wrapper around the two streams like the one below:
import datetime
import sys
from typing import TextIO  # For type annotations
class Wrapper:
    def __init__(self, stdout: TextIO, target_log_file: TextIO):
        self.target_log_file = target_log_file
        self.stdout = stdout
    def write(self, o):
        self.target_log_file.write(o)
        self.stdout.write(o)
    def flush(self):
        self.target_log_file.flush()
        self.stdout.flush()
def main():
    file_path = 'log.txt'
    sys.stdout = Wrapper(sys.stdout, open(file_path, 'w'))
    for i in range(0, 10):
        print(f'stdout test i:{i} time:{datetime.datetime.now()}')
if __name__ == '__main__':
    main()
There is a very nice trick from Mark Lutz (Learning Python) that could be useful for you. The idea is to redirect to a file for as long as you need and then go back to normal mode. Because you need it the other way around, you could comment out all the marked parts while you want to see the output and, as soon as you're satisfied with the result, activate them again, like this:
import sys                         # <-comment out to see the output
file_path = 'log.txt'
tmp = sys.stdout                   # <-comment out to see the output
sys.stdout = open(file_path, "w")  # <-comment out to see the output
print("something")                 # redirected to the file
sys.stdout.close()                 # <-comment out to see the output
sys.stdout = tmp                   # returns to normal mode / comment out
Do you need to use sys.stdout? The easiest solution may be to define a method which does both. For instance:
def print_and_export(text, path):
    print(text)
    # append so earlier lines are kept; add the newline that print() adds
    with open(path, 'a') as f:
        f.write(text + '\n')
Then call this method from anywhere using print_and_export('Some text', 'log.txt').
But specifically for logging, I would preferably use the built in 'logging' module:
import logging
FILE_PATH = 'log.txt'
LOG_FORMAT = '[%(asctime)s] - %(filename)-30s - line %(lineno)-5d - %(levelname)-8s - %(message)s'
logging.basicConfig(level=logging.DEBUG,
                    format=LOG_FORMAT,
                    handlers=[logging.FileHandler(FILE_PATH), logging.StreamHandler()])
log = logging.getLogger()
Then you can simply call log.info("Some text"). It should print to the PyCharm console and append the text to your log file.
All the answers above are awesome and will give you what you are after. I have a different answer, but it will not give you both console output and file output. For that, you need to do file IO. But I'll mention my method just so you know.
Make your program print the exact things you want to the console. Then, save it and open up the command prompt. You can do this by browsing to the location of the .py files, click the file path bar, type cmd which will replace all the text there and hit enter.
Once in the command prompt, type python, followed by the name of the file you want to run. Let's assume the name is myfile.py. You can write:
python myfile.py > output.txt
If you hit enter, whatever would have been printed to the console is diverted to a text file named output.txt, saved in the folder the command was run from (here, the one containing your .py files). You can also pass file=sys.stderr to print() so that error messages go to a separate stream (stderr), and redirect that stream to a different file with 2>. In the end you can have one file for normal output and another file for error output only. The Python documentation for print() and sys.stderr describes this.
https://helpdeskgeek.com/how-to/redirect-output-from-command-line-to-text-file/
will help you with explaining what I am trying to describe. Could be useful for you in the future. In doing so, you are not running the code in PyCharm, but straight in the command prompt. So this way, you can literally do file IO without doing file IO. (without doing file IO in Python at least)
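As a rough illustration of the stdout/stderr split mentioned above, here is a minimal sketch; the helper name report and its is_error parameter are hypothetical and only serve the example.

```python
import sys


def report(message, is_error=False):
    """Print normal output to stdout and error output to stderr."""
    # Look the stream up at call time so later redirections are honoured.
    stream = sys.stderr if is_error else sys.stdout
    print(message, file=stream)
```

With a script that calls report(...), running python myfile.py > output.txt 2> errors.txt from the command prompt would send the normal messages to output.txt and the error messages to errors.txt.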
I'm wondering if there is a way to write to a file that was opened in a separate script in Python. For example if the following was run within main.py:
f = open(fname, "w")
writer.write()
Then, within a separate script called writer.py, we have a function write() with the form:
def write():
    get_currently_open_file().write("message")
Without defining f within writer.py. This would be similar to how matplotlib has the method:
pyplot.gca()
Which returns the current axis that's open for plotting. This allows you to plot to an axis defined previously without redefining it within the script you're working in.
I'm trying to write to a file with inputs from many different scripts and it would help a lot to be able to write to a file without reading a file object or filename as an input to each script.
Yes. Python functions have local variables, but those are only the variables that are assigned in the function. Python will look to the containing scope for the others. If you use f, but don't try to assign f, python will find the one you created in the global scope.
def write():
    f.write("text")
fname = "test"
f = open(fname, "w")
write()
This only works if the function is in the same module as the global variable (python "global" is really "module level").
UPDATE
Leveraging a function's global namespace, you could write a module that holds the writing function and a variable holding the file. Every script/module that imports this module could use the write function that gets its file handles from its own module. In this example, filewriter.py is the common place where test.py and somescript.py cooperate on file management.
filewriter.py
def opener(filename, mode="r"):
    global f
    f = open(filename, mode)
def write(text):
    return f.write(text) # uses the `f` in filewriter namespace
test.py
from filewriter import write
def my_test():
    write("THIS IS A TEST\n")
somescript.py
import filewriter
import test
filewriter.opener("test.txt", "w+") # "w+" so the file can be read back below
test.my_test()
# verify
filewriter.f.seek(0)
assert filewriter.f.read() == "THIS IS A TEST\n"
Writing as a separate answer because it's essentially unrelated to my other answer, the other semi-reasonable solution here is to define a protocol in terms of the contextvars module. In the file containing write, you define:
import contextlib
import io
import sys
from contextvars import ContextVar
outputctx: ContextVar[io.TextIOBase] = ContextVar('outputctx', default=sys.stdout)
@contextlib.contextmanager
def using_output_file(file):
    token = outputctx.set(file)
    try:
        yield
    finally:
        outputctx.reset(token)
Now, your write function gets written as:
def write():
    outputctx.get().write("message")
and when you want to redirect it for a time, the code that wants to do so does:
with open(fname, "w") as f, using_output_file(f):
    ...  # do stuff where calling write implicitly uses the newly opened file
...  # original file is restored
The main differences between this and using sys.stdout with contextlib.redirect_stdout are:
It's opt-in, functions have to cooperate to use it (mild negative)
It's explicit, so no one gets confused when the code says print or sys.stdout.write and nothing ends up on stdout
You don't mess around with sys.stdout (temporarily cutting off sys.stdout from code that doesn't want to be redirected)
By using contextvars, it behaves like thread-local state (changing it in one thread doesn't change it for other threads, which would otherwise cause all sorts of problems in multithreaded code), but more so: even if you're writing asyncio code (cooperative multitasking of tasks that all run in the same thread), the context changes won't leak outside the task that makes them, so there's no risk that task A (which wants to be redirected) changes how task B (which does not wish to be redirected) behaves. By contrast, contextlib.redirect_stdout explicitly makes global changes; all threads and tasks see the change, they can interfere with each other, etc. It's madness.
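The task-isolation point above can be checked with a small experiment. This is a sketch under stated assumptions: it redefines outputctx, using_output_file and write locally with an in-memory default sink (default_sink, a stand-in for sys.stdout so the result can be inspected), and redirected_task, plain_task and demo are hypothetical names used only here.

```python
import asyncio
import contextlib
import io
from contextvars import ContextVar

default_sink = io.StringIO()  # stand-in for sys.stdout, so it can be inspected
outputctx = ContextVar("outputctx", default=default_sink)


@contextlib.contextmanager
def using_output_file(file):
    token = outputctx.set(file)
    try:
        yield
    finally:
        outputctx.reset(token)


def write(message):
    outputctx.get().write(message)


async def redirected_task(sink):
    with using_output_file(sink):
        await asyncio.sleep(0)  # let the other task run while redirected
        write("from A\n")


async def plain_task():
    await asyncio.sleep(0)
    write("from B\n")  # should still reach the default sink


async def demo():
    sink = io.StringIO()
    # gather() wraps each coroutine in its own task with its own context copy
    await asyncio.gather(redirected_task(sink), plain_task())
    return sink.getvalue(), default_sink.getvalue()
```

Calling asyncio.run(demo()) returns the text captured by the redirected sink and by the default sink; task B's message should end up only in the default sink even though it ran while task A's redirection was active.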
Obviously what you're asking for is hacky, but there are semi-standard ways to express the concept "The thing we're currently writing to". sys.stdout is one of those ways, but it's normally sent to the terminal or a specific file chosen outside the program by the user through piping syntax. That said, you can perform temporary replacement of sys.stdout so that it goes to an arbitrary location, and that might satisfy your needs. Specifically, you use contextlib.redirect_stdout in a with statement.
On entering the with, sys.stdout is saved and replaced with an arbitrary open file; while in the with all code (including code called from within the with, not just the code literally shown in the block) that writes to sys.stdout instead writes to the replacement file, and when the with statement ends, the original sys.stdout is restored. Such uses can be nested, effectively creating a stack of sys.stdouts where the top of the stack is the current target for any writes to sys.stdout.
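The stack-like nesting described above can be seen with a short sketch that uses in-memory buffers instead of real files; nested_demo is a hypothetical name used only for this illustration.

```python
import io
from contextlib import redirect_stdout


def nested_demo():
    """Show that nested redirect_stdout blocks behave like a stack."""
    outer, inner = io.StringIO(), io.StringIO()
    with redirect_stdout(outer):
        print("to outer")
        with redirect_stdout(inner):
            print("to inner")       # innermost redirection wins
        print("back to outer")      # previous target is restored
    return outer.getvalue(), inner.getvalue()
```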
So for your use case, you could write:
import sys
def write():
    sys.stdout.write("message")
and it would, by default, write to sys.stdout. But if you called write() like so:
from contextlib import redirect_stdout
with open(fname, "w") as f, redirect_stdout(f): # Open a file and redirect stdout to it
    write()
the output would seamlessly go to the file located wherever fname describes.
To be clear, I don't think this is a good idea. I think the correct solution is for the functions in the various scripts to just accept a file-like object as an argument which they will write to ("Explicit is better than implicit", per the Zen of Python). But it's an option.
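For comparison, the explicit alternative recommended above might look like the following sketch. The out parameter name is an assumption made for this example; any file-like object with a write method would do.

```python
import io
import sys


def write(message, out=None):
    """Write to the given file-like object, or to sys.stdout by default."""
    # Resolve sys.stdout at call time rather than in the signature,
    # so a later replacement of sys.stdout is still respected.
    out = sys.stdout if out is None else out
    out.write(message)
```

A caller that has opened a file simply passes it along, for example write("message", out=f) inside a with open(fname, "w") as f: block, and every other caller keeps the default behaviour.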
Is there a way to trace a Python process to find out where a file is being opened? I see too many open files when I run lsof on my running process, but I'm not sure where they are being opened.
ls /proc/$pid/fd/ | wc -l
I suspect one of the libraries I'm using might have not handled the files properly. Is there a way to isolate exactly which line in my python code the files are being opened?
In my code I work with 3rd party libraries to process thousands of media files and since they are being left open I receive the error
OSError: [Errno 24] Too many open files
after running for a few minutes. Now I know raising the limit of open files is an option but this will just push the error to a later point of time.
The easiest way to trace the open calls is to use an audit hook in Python (available since Python 3.8). Note that this method only traces Python open calls, not the underlying system calls.
Let fdmod.py be a module file with a single function foo:
def foo():
    return open("/dev/zero", mode="r")
Now the main code in file fd_trace.py, which traces all open calls and imports fdmod, is defined as follows:
import sys
import inspect
import fdmod

def open_audit_hook(name, *args):
    if name == "open":
        print(name, *args, "was called:")
        caller = inspect.currentframe()
        while caller := caller.f_back:
            print(f"\tFunction {caller.f_code.co_name} "
                  f"in {caller.f_code.co_filename}:"
                  f"{caller.f_lineno}"
            )
sys.addaudithook(open_audit_hook)

# main code
fdmod.foo()
with open("/dev/null", "w") as dev_null:
    dev_null.write("hi")
fdmod.foo()
When we run fd_trace.py, we will print the call stack whenever some component is calling open:
% python3 fd_trace.py
open ('/dev/zero', 'r', 524288) was called:
        Function foo in /home/tkrennwa/fdmod.py:2
        Function <module> in fd_trace.py:17
open ('/dev/null', 'w', 524865) was called:
        Function <module> in fd_trace.py:18
open ('/dev/zero', 'r', 524288) was called:
        Function foo in /home/tkrennwa/fdmod.py:2
        Function <module> in fd_trace.py:20
See sys.addaudithook and inspect.currentframe for details.
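When thousands of files are involved, printing every stack is noisy. A variation on the hook above, offered as a sketch only, is to count how often each call site opens a file and then look at the most frequent ones. The names open_sites and count_open_sites are hypothetical; sys.addaudithook requires Python 3.8 or later, and an installed audit hook cannot be removed again.

```python
import collections
import inspect
import sys

# Maps (filename, line number) of the code calling open() to a call count.
open_sites = collections.Counter()


def count_open_sites(event, args):
    """Audit hook: record which line of Python code triggered an 'open' event."""
    if event != "open":
        return
    # The hook's own frame is the current one; its parent is the caller of open().
    frame = inspect.currentframe().f_back
    if frame is not None:
        open_sites[(frame.f_code.co_filename, frame.f_lineno)] += 1


sys.addaudithook(count_open_sites)
```

After the program has run for a while, open_sites.most_common(10) lists the busiest call sites. This shows where files are opened, not whether they are closed, so it narrows down suspects rather than proving a leak.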
You might get useful information using strace. This will show all system calls made by a process, including calls to open(). It will not directly show you where in the Python code those calls are occurring, but you may be able to deduce some information from the context.
Seeing open file handles is easy on Linux:
import os
open_file_handles = os.listdir('/proc/self/fd')
print('open file handles: ' + ', '.join(map(str, open_file_handles)))
You can also use the following on other Unix-like systems (e.g. macOS); note that the resource module is not available on Windows:
import errno, os, resource
open_file_handles = []
for fd in range(resource.getrlimit(resource.RLIMIT_NOFILE)[0]):
    try:
        os.fstat(fd)
    except OSError as e:
        if e.errno == errno.EBADF:
            continue
    open_file_handles.append(fd)
print('open file handles: ' + ', '.join(map(str, open_file_handles)))
Note: This should always work, assuming you're actually (occasionally) running out of file handles. The default soft limit is usually small (commonly 256 or 1024, depending on the system). But the loop might take a long time if the limit (set by the OS/user policy) is something huge like a billion.
Note also: There will almost always be at least three file handles open for STDIN, STDOUT, and STDERR respectively.
When I export a CSV file from Python, for some reason it does not get closed (even when using the 'with' statement), because when I call it afterwards I get the following error:
PermissionError: [WinError 32] The process cannot access the file because it is being used by another process
I suppose it has to be the close function that hangs, because when I'm printing behind the with statement or the close() statement, it gets printed (e.g. print fileName). Any suggestions that might solve this matter?
(Also when I'm trying to open the exported CSV-file, I get a read-only message because it's used by another program. I can access it properly only when Python is closed, which is just annoying)
import csv, numpy, os
import DyMat
import subprocess
os.chdir("C:/Users/myvhove/Documents/ResultsPyDymInt/Dymola/CoupledClutches")
dm = DyMat.DyMatFile("dymatresfile")
print(dm.names())
varList = ('J1.w', 'J2.w', 'J3.w', 'J4.w')
fileName = dm.fileName + '.csv'
with open(fileName, 'w', newline='') as oFile:
    csvWriter = csv.writer(oFile)
    vDict = dm.sortByBlocks(varList)
    for vList in vDict.values():
        vData = dm.getVarArray(vList)
        vList.insert(0, dm._absc[0])
        csvWriter.writerow(vList)
        csvWriter.writerows(numpy.transpose(vData))
subprocess.call("dymatresfile.csv")
print(fileName)
The code is correct. The problem must be somewhere else.
It is either another forgotten Python process or, as @CristiFati mentioned, an open editor.
In the worst case, restart the PC and run the Python script directly after logging in again.
The error should no longer appear.
I am trying to print messages to the Python console and, at the same time, save them to a text file; the function will be called repeatedly. I found some code and modified it:
import sys
def write(input_text):
    print("Coming through stdout")
    # stdout is saved
    save_stdout = sys.stdout
    fh = open(path,"w")
    sys.stdout = fh
    print(input_text)
    # return to normal:
    sys.stdout = save_stdout
    fh.close()
def testing():
    write('go')
When I call this function repeatedly, the file only contains the last printed data. Any clue?
Thanks
All you need is (assuming "path" is defined already):
def print_twice(*args,**kwargs):
    print(*args,**kwargs)
    with open(path,"a") as f: # appends to file and closes it when finished
        print(file=f,*args,**kwargs)
Exactly the same thing will be printed and written to the file. The logging module is overkill for this simple task.
Please tell me that you don't actually think that writing data to a file in Python requires messing around with stdout like in your code. That would be ridiculous.
You pass the 'w' mode to the open function, which erases any existing content in the file each time it is opened.
You should use the 'a' mode to append to the file instead.
BTW, you should consider using the logging module with two handlers, one writing to stdout and the other to a file. See logging handlers in the Python documentation.
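The difference between the 'w' and 'a' modes described above can be reproduced with a small sketch; write_lines is a hypothetical helper that reopens the file for every line, just as the function in the question does on every call.

```python
def write_lines(path, mode, lines):
    """Reopen path with the given mode for each line, then return the file content."""
    for line in lines:
        with open(path, mode) as fh:  # 'w' truncates on every open, 'a' does not
            fh.write(line + "\n")
    with open(path) as fh:
        return fh.read()
```

Starting from a file that does not exist yet, write_lines(path, "w", ["first", "second"]) leaves only the last line in the file, while the same call with "a" keeps both lines.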
If you want to see the output on the screen and save them on a text file as well, then:
python <your-script-name> | tee output.txt
You can change "output.txt" to any file name you want. Ignore this if I misunderstood your question.