Getting untraceable OSError (exception ignored) after program terminates - python

I'm afraid that I won't be of any help for tracking this down since this issue might come from basically anywhere in my code.
The exception occurs after running a Tensorflow training (I don't know if the issue comes from Tensorflow though). I can't see anything broken so I assume this is just an exception that happens somewhere during teardown.
My biggest issue, besides this annoying exception itself, is the fact that I have literally zero information on where it originates from.
...
I1123 07:51:21.469002 140168846268160 experiment.py:174] Experiment completed.
Exception ignored in: <function Pool.__del__ at 0x7f7b225304c0>
Traceback (most recent call last):
File "/home/sfalk/miniconda3/envs/speech/lib/python3.9/multiprocessing/pool.py", line 268, in __del__
self._change_notifier.put(None)
File "/home/sfalk/miniconda3/envs/speech/lib/python3.9/multiprocessing/queues.py", line 378, in put
self._writer.send_bytes(obj)
File "/home/sfalk/miniconda3/envs/speech/lib/python3.9/multiprocessing/connection.py", line 205, in send_bytes
self._send_bytes(m[offset:offset + size])
File "/home/sfalk/miniconda3/envs/speech/lib/python3.9/multiprocessing/connection.py", line 416, in _send_bytes
self._send(header + buf)
File "/home/sfalk/miniconda3/envs/speech/lib/python3.9/multiprocessing/connection.py", line 373, in _send
n = write(self._handle, buf)
OSError: [Errno 9] Bad file descriptor
Is there any way I can get an idea where this is happening in my code?

Related

Using VS Code to debug python files. Exception thrown on breakpoint and breakpoint is ignored

Tried with multiple different python files. Every time I try to use the debugger in vs code and set breakpoints the breakpoint gets ignored and exception gets raised and the script continues on. I've been googling and tinkering for over 2 hours and can't seem to figure out what's going on here. Tried rebooting PC, running vs code as admin, uninstall/reinstall the python extension for vs code. Tried to dig into the files mentioned in the traceback and pinpointed the function that seems to be raising the exception but I can't figure out where it's being called from or why it's raising the exception. I'm still new-ish to Python. Debugging works properly on my laptop but for whatever reason my desktop is having this issue.
Traceback (most recent call last):
File "c:\Users\Joel\.vscode\extensions\ms-python.python-2020.7.96456\pythonFiles\lib\python\debugpy\_vendored\pydevd\pydevd_file_utils.py", line 529, in _original_file_to_client
return cache[filename]
KeyError: 'c:\\users\\joel\\local settings\\application data\\programs\\python\\python37-32\\lib\\runpy.py'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "c:\Users\Joel\.vscode\extensions\ms-python.python-2020.7.96456\pythonFiles\lib\python\debugpy\_vendored\pydevd\_pydevd_bundle\pydevd_comm.py", line 330, in _on_run
self.process_net_command_json(self.py_db, json_contents)
File "c:\Users\Joel\.vscode\extensions\ms-python.python-2020.7.96456\pythonFiles\lib\python\debugpy\_vendored\pydevd\_pydevd_bundle\pydevd_process_net_command_json.py", line 190, in process_net_command_json
cmd = on_request(py_db, request)
File "c:\Users\Joel\.vscode\extensions\ms-python.python-2020.7.96456\pythonFiles\lib\python\debugpy\_vendored\pydevd\_pydevd_bundle\pydevd_process_net_command_json.py", line 771, in on_stacktrace_request
self.api.request_stack(py_db, request.seq, thread_id, fmt=fmt, start_frame=start_frame, levels=levels)
File "c:\Users\Joel\.vscode\extensions\ms-python.python-2020.7.96456\pythonFiles\lib\python\debugpy\_vendored\pydevd\_pydevd_bundle\pydevd_api.py", line 214, in request_stack
if internal_get_thread_stack.can_be_executed_by(get_current_thread_id(threading.current_thread())):
File "c:\Users\Joel\.vscode\extensions\ms-python.python-2020.7.96456\pythonFiles\lib\python\debugpy\_vendored\pydevd\_pydevd_bundle\pydevd_comm.py", line 661, in can_be_executed_by
py_db, self.seq, self.thread_id, frame, self._fmt, must_be_suspended=not timed_out, start_frame=self._start_frame, levels=self._levels)
File "c:\Users\Joel\.vscode\extensions\ms-python.python-2020.7.96456\pythonFiles\lib\python\debugpy\_vendored\pydevd\_pydevd_bundle\pydevd_net_command_factory_json.py", line 213, in make_get_thread_stack_message
py_db, frames_list
File "c:\Users\Joel\.vscode\extensions\ms-python.python-2020.7.96456\pythonFiles\lib\python\debugpy\_vendored\pydevd\_pydevd_bundle\pydevd_net_command_factory_xml.py", line 175, in _iter_visible_frames_info
new_filename_in_utf8, applied_mapping = pydevd_file_utils.norm_file_to_client(filename_in_utf8)
File "c:\Users\Joel\.vscode\extensions\ms-python.python-2020.7.96456\pythonFiles\lib\python\debugpy\_vendored\pydevd\pydevd_file_utils.py", line 531, in _original_file_to_client
translated = _path_to_expected_str(get_path_with_real_case(_AbsFile(filename)))
File "c:\Users\Joel\.vscode\extensions\ms-python.python-2020.7.96456\pythonFiles\lib\python\debugpy\_vendored\pydevd\pydevd_file_utils.py", line 221, in _get_path_with_real_case
return _resolve_listing(drive, iter(parts))
File "c:\Users\Joel\.vscode\extensions\ms-python.python-2020.7.96456\pythonFiles\lib\python\debugpy\_vendored\pydevd\pydevd_file_utils.py", line 184, in _resolve_listing
dir_contents = cache[resolved_lower] = os.listdir(resolved)
PermissionError: [WinError 5] Access is denied: 'C:\\Users\\Joel\\Local Settings'
So I get this traceback every time a breakpoint is hit. Taking a peek at the "_original_file_to_client" function in "pydevd_file_utils.py" we get this:
def _original_file_to_client(filename, cache={}):
try:
return cache[filename]
except KeyError:
translated = _path_to_expected_str(get_path_with_real_case(_AbsFile(filename)))
cache[filename] = (translated, False)
return cache[filename]
I wasn't able to figure out where this function was being called from or what the expected output was supposed to be. Any help with this would be greatly appreciated!
Edit: Forgot to mention I'm using Windows 10 if it wasn't obvious from the trace
This is a similar question. The spaces in the filename cause this problem:
"Local Settings", "application data"

pyspark got Py4JNetworkError("Answer from Java side is empty") when exit python

Background:
spark standalone cluster mode on k8s
spark 2.2.1
hadoop 2.7.6
run code in python, not in pyspark
client mode, not cluster mode
The pyspark code in python, not in pyspark env.
Every code can work and get it down. But 'sometimes', when the code finish and exit, below error will show up even time.sleep(10) after spark.stop().
{{py4j.java_gateway:1038}} INFO - Error while receiving.
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/py4j-0.10.4-py2.7.egg/py4j/java_gateway.py", line 1035, in send_command
raise Py4JNetworkError("Answer from Java side is empty")
Py4JNetworkError: Answer from Java side is empty
[2018-11-22 09:06:40,293] {{root:899}} ERROR - Exception while sending command.
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/py4j-0.10.4-py2.7.egg/py4j/java_gateway.py", line 883, in send_command
response = connection.send_command(command)
File "/usr/lib/python2.7/site-packages/py4j-0.10.4-py2.7.egg/py4j/java_gateway.py", line 1040, in send_command
"Error while receiving", e, proto.ERROR_ON_RECEIVE)
Py4JNetworkError: Error while receiving
[2018-11-22 09:06:40,293] {{py4j.java_gateway:443}} DEBUG - Exception while shutting down a socket
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/py4j-0.10.4-py2.7.egg/py4j/java_gateway.py", line 441, in quiet_shutdown
socket_instance.shutdown(socket.SHUT_RDWR)
File "/usr/lib64/python2.7/socket.py", line 224, in meth
return getattr(self._sock,name)(*args)
File "/usr/lib64/python2.7/socket.py", line 170, in _dummy
raise error(EBADF, 'Bad file descriptor')
error: [Errno 9] Bad file descriptor
I guess the reason is parent process python try to get log message from terminated child process 'jvm'. But the wired thing is the error not always raise...
Any suggestion?
This root-cause is 'py4j' log-level.
I set python log-level to DEBUG, this let the 'py4j' client & 'java' raise connection error when close pyspark.
So setting python log-level to INFO or more higher level will resolve this problem.
ref: Gateway raises an exception when shut down
ref: Tune down the logging level for callback server messages
ref: PySpark Internals

Error while writing to file with multiprocessing

I am working on html parser, it uses Python multiprocessing Pool, because it runs through huge number of pages. The output from every page is saved to a separate CSV file. The problem is sometimes I get unexpected error and whole program crashes and I have errors handling almost everywhere - reading pages, parsing pages, even writing files. Moreover it looks like the script crashes after it finishes writing a batch of files, so it shouldn't be anything to crush on. Thus after whole day of debugging I am left clueless.
Error:
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "D:\Programy\Python36-32\lib\multiprocessing\pool.py", line 119, in worker
result = (True, func(*args, **kwds))
File "D:\Programy\Python36-32\lib\multiprocessing\pool.py", line 44, in mapstar
return list(map(*args))
File "D:\ppp\Python\parser\run.py", line 244, in media_process
save_media_product(DIRECTORY, category, media_data)
File "D:\ppp\Python\parser\manage_output.py", line 180, in save_media_product
_file_manager(target_file, temp, temp2)
File "D:\ppp\Python\store_parser\manage_output.py", line 214, in _file_manager
file_to_write.close()
UnboundLocalError: local variable 'file_to_write' referenced before assignment
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "D:\ppp\Python\store_parser\run.py", line 356, in <module>
main()
File "D:\Rzeczy Mariusza\Python\store_parser\run.py", line 318, in main
process.map(media_process, batch)
File "D:\Programy\Python36-32\lib\multiprocessing\pool.py", line 266, in map
return self._map_async(func, iterable, mapstar, chunksize).get()
File "D:\Programy\Python36-32\lib\multiprocessing\pool.py", line 644, in get
raise self._value
UnboundLocalError: local variable 'file_to_write' referenced before assignment
It look like, there is an error with variable assignment, but it is not:
try:
file_to_write = open(target_file, 'w')
except OSError:
message = 'OSError while writing file name - {}'.format(target_file)
log_error(message)
except UnboundLocalError:
message = 'UnboundLocalError while writing file name - {}'.format(target_file)
log_error(message)
except Exception as e:
message = 'Total failure "{}" while writing file name - {}'.format(e, target_file)
log_error(message)
else:
file_to_write.write(temp)
file_to_write.write(temp2)
finally:
file_to_write.close()
Line - except Exception as e:, does not help with anything, the whole thing still crashes. So far i have excluded only Out Of Memory scenario, because this script is designed to be handled on low spec VPS, but in testing stage I run it in environment with 8 GB of ram. So if You have any theories please share.
The exception really says what is happening.
This part is telling you obvious issue:
UnboundLocalError: local variable 'file_to_write' referenced before assignment
Even you have try/except blocks that catches various exceptions, else/finally doesn't.
More specifically in finally block you reference variable that might not exist since exception with doing: file_to_write = open(target_file, 'w') is being handled by at least last except Exception as e block, but then finally is run too.
Since exception happened as a result of not being able to open target file, you do not have anything assigned to file_to_write and that variable doesn't exist after exception is handled. That is why finally block crashes.

scapy bad file descriptor

scapy OSError: [Errno 9] Bad file descriptor
same error as this guy . using python 2.7.5 on windows.downloded all the extentions and followed everything step by step. still doesnt seem to work. every function i try wheter it's sr1 or send or anything else i get this error OSError: [Errno 9] Bad file descriptor..
and it mentiones lines in some of the scapy scripts (scapy\sendrecv.pyc as example..)
the simplest things i try running in the interpreter doesnt work. as in :
from scapy.all import *
p=sr1(IP(dst='190.200.2.5')/ICMP())
so i tried reinstalling this time everything according to the guide they provide int the website and using python 2.6 , same results ..
>>> p=sr1(IP(dst='192.168.1.1')/ICMP())
ERROR: --- Error sending packets
Traceback (most recent call last):
File "C:\Python26\lib\site-packages\scapy\arch\windows\__init__.py", line 374, in sndrcv
pks.send(p)
File "C:\Python26\lib\site-packages\scapy\arch\pcapdnet.py", line 237, in send
ifs = dnet.eth(iff)
File "dnet.pyx", line 112, in dnet.eth.__init__
OSError: Result too large
Traceback (most recent call last):
File "<pyshell#2>", line 1, in <module>
p=sr1(IP(dst='192.168.1.1')/ICMP())
File "C:\Python26\lib\site-packages\scapy\sendrecv.py", line 335, in sr1
a,b=sndrcv(s,x,*args,**kargs)
File "C:\Python26\lib\site-packages\scapy\arch\windows\__init__.py", line 431, in sndrcv
os.write(1, ".")
OSError: [Errno 9] Bad file descriptor

Using numdisplay, "An existing connection was forced to close by the remote host"

I've been learning to use python in astronomy and for that I'm following this notes. In the very beginning the author does the following example:
>>> im = pyfits.getdata('http://das.sdss.org/www/cgi-bin/drC?RUN=3630&RERUN=40&CAMCOL=3&FIELD=83&FILTER=r')
>>> numdisplay.display(im,z1=1000,z2=1500)
I try to replicate it and I get:
>>> numdisplay.display(im,z1=1000,z2=1500)
Image displayed with Z1: 1000 Z2: 1500
Traceback (most recent call last):
File "<pyshell#13>", line 1, in <module>
numdisplay.display(im,z1=1000,z2=1500)
File "C:\Mine\Python\lib\site-packages\numdisplay\__init__.py", line 446, in display
_d.writeImage(bpix,_wcsinfo)
File "C:\Mine\Python\lib\site-packages\numdisplay\displaydev.py", line 513, in writeImage
self.writeData(_lx,_ydisp,_fpix[block,:])
File "C:\Mine\Python\lib\site-packages\numdisplay\displaydev.py", line 379, in writeData
self._writeHeader(opcode,self._MEMORY, -nbytes, x, y, frame, 0)
File "C:\Mine\Python\lib\site-packages\numdisplay\displaydev.py", line 542, in _writeHeader
self._write(a.tostring())
File "C:\Mine\Python\lib\site-packages\numdisplay\displaydev.py", line 580, in _write
nwritten = self._socket.send(s[-n:])
error: [Errno 10054] An existing connection was forced to close by the remote host
I don't understand what I'm doing wrong. I mean if I write numdisplay.open() everything is fine... I'm thinking that it might be my antivirus or something that doesn't let python to communicate with ds9... Can somebody help me?
Edit: Well it doesn't seem to be the antivirus. I stopped it and run the script and I got the same error.
I was getting similar error messages, and I just tried adding a non-'None' argument to the bufname argument and it works (my image is about 4096.4096):
numdisplay.display(data,bufname='imt4096')

Categories

Resources