Python `Error in atexit._run_exitfuncs`

I am trying to use the kafka-python MultiProcessConsumer, but I am getting the following error. It seems like the error is related to Python's multiprocessing.
Here is the code which I am using.
simple.py
from kafka import SimpleProducer, SimpleClient, SimpleConsumer, MultiProcessConsumer

# To consume messages
client = SimpleClient('localhost:9092')
consumer = MultiProcessConsumer(client, "my-group", "testing_topic", num_procs=3)

for message in consumer:
    # message is raw byte string -- decode if necessary!
    # e.g., for unicode: `message.decode('utf-8')`
    print(message)

client.close()
Error while running the above code:
$ python simple.py
Error in atexit._run_exitfuncs:
Traceback (most recent call last):
File "/var/users/ec2-user/.pyenv/versions/3.6.0/lib/python3.6/multiprocessing/managers.py", line 749, in _callmethod
conn = self._tls.connection
AttributeError: 'ForkAwareLocal' object has no attribute 'connection'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/var/users/ec2-user/.pyenv/versions/3.6.0/lib/python3.6/multiprocessing/connection.py", line 614, in SocketClient
s.connect(address)
FileNotFoundError: [Errno 2] No such file or directory
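A commonly suggested mitigation for this kind of atexit/multiprocessing teardown error is to shut the consumer down explicitly before closing the client and letting the interpreter exit, so nothing is left for the multiprocessing atexit handlers to clean up. A minimal sketch, assuming the legacy kafka-python MultiProcessConsumer exposes a stop() method (topic and group names are taken from the question):

from kafka import SimpleClient, MultiProcessConsumer

client = SimpleClient('localhost:9092')
consumer = MultiProcessConsumer(client, "my-group", "testing_topic", num_procs=3)

try:
    for message in consumer:
        print(message)
finally:
    # Stop the consumer's worker processes explicitly, then close the client,
    # so the manager/connection state is torn down before interpreter exit.
    consumer.stop()
    client.close()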

Related

SystemError exception occurs when running click.confirm

I am trying to run a click.confirm() command. It works when I run the file stand-alone, but when I call the function from another module I get a SystemError
Traceback (most recent call last):
File "/home/usr/.local/lib/python3.6/site-packages/click/_compat.py", line 108, in __getattr__
return getattr(self._stream, name)
File "/home/usr/.vscode-server/extensions/ms-python.python-2022.6.0/pythonFiles/lib/python/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_io.py", line 59, in __getattr__
raise AttributeError(name)
AttributeError: closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/home/usr/.local/lib/python3.6/site-packages/click/termui.py", line 231, in confirm
echo(prompt.rstrip(" "), nl=False, err=err)
File "/home/usr/.local/lib/python3.6/site-packages/click/utils.py", line 298, in echo
file.write(out) # type: ignore
SystemError: <built-in method write of _NonClosingTextIOWrapper object at 0x7f20e85ea048> returned a result with an error set
I run my program on a remote machine over SSH using the VS Code extension.
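As a hedged sketch of one possible workaround (safe_confirm is a hypothetical helper, not part of click): if click cannot write to the debugger-wrapped stdout, fall back to a plain input() prompt.

import click

def safe_confirm(prompt_text: str) -> bool:
    # Hypothetical fallback: if click's echo fails because stdout is wrapped
    # by the debugger, ask via plain input() instead.
    try:
        return click.confirm(prompt_text)
    except (SystemError, AttributeError):
        answer = input(f"{prompt_text} [y/N]: ")
        return answer.strip().lower() in ("y", "yes")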

Python try except - Include the custom message in the Error variable

I'm trying to do a simple try/except, and it is working. But I want to add a custom string at the beginning of the error message. If I just add it in the print, it gives an error.
import json
import sys

try:
    with open('./datatype-mapping/file.json') as rs_mapping:
        data_mapping = json.load(rs_mapping)
except Exception as error:
    print('CUSTOM ERROR: ' + error)
    sys.exit(1)
The error I got is:
Traceback (most recent call last):
File "c:/Users/rbhuv/Desktop/code/bqshift.py", line 22, in get_datatype_mapping
with open('./datatype-mapping/file.json') as rs_mapping:
FileNotFoundError: [Errno 2] No such file or directory: './datatype-mapping/file.json'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "c:/Users/rbhuv/Desktop/code/bqshift.py", line 102, in <module>
main()
File "c:/Users/rbhuv/Desktop/code/bqshift.py", line 99, in main
target_mapping()
File "c:/Users/rbhuv/Desktop/code/bqshift.py", line 39, in target_mapping
data_mapping = get_datatype_mapping()
File "c:/Users/rbhuv/Desktop/code/bqshift.py", line 26, in get_datatype_mapping
print('ERROR: '+error)
TypeError: can only concatenate str (not "FileNotFoundError") to str
But if I just use print(error), it works.
You need to convert error to str.
import sys

try:
    int("fail")
except Exception as error:
    print('CUSTOM ERROR: ' + str(error))
    sys.exit(1)
This works flawlessly.
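An equivalent option, assuming Python 3.6+, is to let string formatting do the conversion, since f-strings call str() on the exception implicitly:

try:
    int("fail")
except Exception as error:
    # The f-string converts the exception to its string form automatically.
    print(f'CUSTOM ERROR: {error}')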

pyspark gets Py4JNetworkError("Answer from Java side is empty") when Python exits

Background:
spark standalone cluster mode on k8s
spark 2.2.1
hadoop 2.7.6
the PySpark code is run as a plain Python script, not in the pyspark shell
client mode, not cluster mode
Everything runs and completes successfully. But sometimes, when the code finishes and exits, the error below shows up, even with a time.sleep(10) after spark.stop().
{{py4j.java_gateway:1038}} INFO - Error while receiving.
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/py4j-0.10.4-py2.7.egg/py4j/java_gateway.py", line 1035, in send_command
raise Py4JNetworkError("Answer from Java side is empty")
Py4JNetworkError: Answer from Java side is empty
[2018-11-22 09:06:40,293] {{root:899}} ERROR - Exception while sending command.
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/py4j-0.10.4-py2.7.egg/py4j/java_gateway.py", line 883, in send_command
response = connection.send_command(command)
File "/usr/lib/python2.7/site-packages/py4j-0.10.4-py2.7.egg/py4j/java_gateway.py", line 1040, in send_command
"Error while receiving", e, proto.ERROR_ON_RECEIVE)
Py4JNetworkError: Error while receiving
[2018-11-22 09:06:40,293] {{py4j.java_gateway:443}} DEBUG - Exception while shutting down a socket
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/py4j-0.10.4-py2.7.egg/py4j/java_gateway.py", line 441, in quiet_shutdown
socket_instance.shutdown(socket.SHUT_RDWR)
File "/usr/lib64/python2.7/socket.py", line 224, in meth
return getattr(self._sock,name)(*args)
File "/usr/lib64/python2.7/socket.py", line 170, in _dummy
raise error(EBADF, 'Bad file descriptor')
error: [Errno 9] Bad file descriptor
I guess the reason is that the parent Python process tries to get log messages from the terminated child JVM process. But the weird thing is that the error is not always raised...
Any suggestions?
The root cause is the py4j log level.
I had set the Python log level to DEBUG, which lets the py4j client and the Java side surface connection errors when PySpark shuts down.
So setting the Python log level for py4j to INFO or a higher level resolves the problem.
ref: Gateway raises an exception when shut down
ref: Tune down the logging level for callback server messages
ref: PySpark Internals
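A minimal sketch of that fix with the standard logging module (the logger names come from the log output above; the exact level is up to you):

import logging

# Quiet the py4j loggers so the shutdown-time connection errors are not
# surfaced as DEBUG/INFO noise when PySpark exits.
logging.getLogger("py4j").setLevel(logging.INFO)
logging.getLogger("py4j.java_gateway").setLevel(logging.ERROR)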

Windows error while running standalone pyspark

I am trying to import pyspark in Anaconda and run sample code. However, whenever I try to run the code in Anaconda, I get the following error message.
ERROR:py4j.java_gateway:An error occurred while trying to connect to the Java server (127.0.0.1:53294)
Traceback (most recent call last):
File "C:\spark\python\lib\py4j-0.10.3-src.zip\py4j\java_gateway.py", line 1021, in send_command
self.socket.sendall(command.encode("utf-8"))
ConnectionResetError: [WinError 10054] An existing connection was forcibly closed by the remote host
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\spark\python\lib\py4j-0.10.3-src.zip\py4j\java_gateway.py", line 883, in send_command
response = connection.send_command(command)
File "C:\spark\python\lib\py4j-0.10.3-src.zip\py4j\java_gateway.py", line 1025, in send_command
"Error while sending", e, proto.ERROR_ON_SEND)
py4j.protocol.Py4JNetworkError: Error while sending
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\spark\python\lib\py4j-0.10.3-src.zip\py4j\java_gateway.py", line 827, in _get_connection
connection = self.deque.pop()
IndexError: pop from an empty deque
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\spark\python\lib\py4j-0.10.3-src.zip\py4j\java_gateway.py", line 963, in start
self.socket.connect((self.address, self.port))
ConnectionRefusedError: [WinError 10061] No connection could be made because the target machine actively refused it
(The same "An error occurred while trying to connect to the Java server (127.0.0.1:53294)" block, with the IndexError and ConnectionRefusedError tracebacks, is repeated several more times.)
Reloaded modules: py4j.protocol, pyspark.sql.context, py4j.java_gateway, py4j.compat, pyspark.profiler, pyspark.sql.catalog, pyspark.context, pyspark.sql.group, pyspark.sql.conf, pyspark.sql.readwriter, pyspark.resultiterable, pyspark.sql, pyspark.sql.dataframe, pyspark.traceback_utils, pyspark.cloudpickle, pyspark.rddsampler, pyspark.accumulators, pyspark.broadcast, py4j, pyspark.rdd, pyspark.sql.functions, pyspark.java_gateway, pyspark.statcounter, pyspark.conf, pyspark.serializers, pyspark.files, pyspark.join, pyspark.sql.streaming, pyspark.shuffle, pyspark, py4j.version, pyspark.sql.session, pyspark.sql.column, py4j.finalizer, py4j.java_collections, pyspark.status, pyspark.sql.window, pyspark.sql.utils, pyspark.storagelevel, pyspark.heapq3, py4j.signals, pyspark.sql.types
Traceback (most recent call last):
File "", line 1, in
runfile('C:/Users/hlee/Desktop/pyspark.py', wdir='C:/Users/hlee/Desktop')
File "C:\Program Files\Anaconda3\lib\site-packages\spyder\utils\site\sitecustomize.py", line 866, in runfile
execfile(filename, namespace)
File "C:\Program Files\Anaconda3\lib\site-packages\spyder\utils\site\sitecustomize.py", line 102, in execfile
exec(compile(f.read(), filename, 'exec'), namespace)
File "C:/Users/hlee/Desktop/pyspark.py", line 38, in
sc = SparkContext()
File "C:\spark\python\lib\pyspark.zip\pyspark\context.py", line 115, in init
conf, jsc, profiler_cls)
File "C:\spark\python\lib\pyspark.zip\pyspark\context.py", line 168, in _do_init
self._jsc = jsc or self._initialize_context(self._conf._jconf)
File "C:\spark\python\lib\pyspark.zip\pyspark\context.py", line 233, in _initialize_context
return self._jvm.JavaSparkContext(jconf)
File "C:\spark\python\lib\py4j-0.10.3-src.zip\py4j\java_gateway.py", line 1401, in call
answer, self._gateway_client, None, self._fqn)
File "C:\spark\python\lib\py4j-0.10.3-src.zip\py4j\protocol.py", line 319, in get_return_value
format(target_id, ".", name), value)
Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.
: java.net.BindException: Cannot assign requested address: bind: Service 'sparkDriver' failed after 16 retries! Consider explicitly setting the appropriate port for the service 'sparkDriver' (for example spark.ui.port for SparkUI) to an available port or increasing spark.port.maxRetries.
at sun.nio.ch.Net.bind0(Native Method)
at sun.nio.ch.Net.bind(Net.java:433)
at sun.nio.ch.Net.bind(Net.java:425)
at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
at io.netty.channel.socket.nio.NioServerSocketChannel.doBind(NioServerSocketChannel.java:125)
at io.netty.channel.AbstractChannel$AbstractUnsafe.bind(AbstractChannel.java:485)
at io.netty.channel.DefaultChannelPipeline$HeadContext.bind(DefaultChannelPipeline.java:1089)
at io.netty.channel.AbstractChannelHandlerContext.invokeBind(AbstractChannelHandlerContext.java:430)
at io.netty.channel.AbstractChannelHandlerContext.bind(AbstractChannelHandlerContext.java:415)
at io.netty.channel.DefaultChannelPipeline.bind(DefaultChannelPipeline.java:903)
at io.netty.channel.AbstractChannel.bind(AbstractChannel.java:198)
at io.netty.bootstrap.AbstractBootstrap$2.run(AbstractBootstrap.java:348)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:357)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
at java.lang.Thread.run(Thread.java:745)
The following is my sample code, and I have no problem running Spark itself from cmd.
import os
import sys

spark_path = r"C:\spark"
os.environ['SPARK_HOME'] = spark_path
sys.path.insert(0, spark_path + "/bin")
sys.path.insert(0, spark_path + "/python/pyspark/")
sys.path.insert(0, spark_path + "/python/lib/pyspark.zip")
sys.path.insert(0, spark_path + "/python/lib/py4j-0.10.3-src.zip")

from pyspark import SparkContext

sc = SparkContext()

import random

NUM_SAMPLES = 100000

def sample(p):
    x, y = random.random(), random.random()
    return 1 if x*x + y*y < 1 else 0

count = sc.parallelize(range(0, NUM_SAMPLES)).map(sample) \
    .reduce(lambda a, b: a + b)
print("Pi is roughly %f" % (4.0 * count / NUM_SAMPLES))
I have downloaded winutils.exe and added the HADOOP_HOME environment variable, and added export SPARK_MASTER_IP=127.0.0.1 and export SPARK_LOCAL_IP=127.0.0.1 to the spark-env.sh file. However, I am still getting the same error. Can someone help me and point out what I am missing?
Thank you in advance,
In my case I just had to restart the kernel.
The problem was that I was creating the Spark context twice: every time I made a mistake, I re-ran the code from the beginning.
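If restarting the kernel each time is inconvenient, a minimal alternative sketch is SparkContext.getOrCreate(), which reuses an already-running context instead of creating a second one (the answer above attributes the error to creating the context twice; the master and app name below are placeholder values):

from pyspark import SparkConf, SparkContext

# Reuse the existing SparkContext if one is already running in this kernel,
# instead of constructing a second one from scratch.
conf = SparkConf().setMaster("local[*]").setAppName("pi-sample")
sc = SparkContext.getOrCreate(conf)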

KeyError when assigning praw.Reddit to a variable

I could successfully connect to reddit's servers with OAuth2 some time ago, but when running my script just now, I get a KeyError followed by a NoSectionError. The code is below, followed by the exceptions (it has been reduced to its essentials).
import praw
# Configuration
APP_UA = 'useragent'
...
...
...
r = praw.Reddit(APP_UA)
Error message:
Traceback (most recent call last):
File "D:\Directory\Python\lib\configparser.py", line 843, in items
d.update(self._sections[section])
KeyError: 'useragent'
A NoSectionError occurred when handling the above exception:
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "D:\Directory\Python\Projects\myprj for Reddit, globaloffensive\oddshotcrawler.py", line 19, in <module>
r = praw.Reddit(APP_UA)
File "D:\Directory\Python\lib\site-packages\praw\reddit.py", line 84, in __init__
**config_settings)
File "D:\Directory\Python\lib\site-packages\praw\config.py", line 47, in __init__
raw = dict(Config.CONFIG.items(site_name), **settings)
File "D:\Directory\Python\lib\configparser.py", line 846, in items
raise NoSectionError(section)
configparser.NoSectionError: No section: 'useragent'
[Finished in 0.2s]
Try giving it a user_agent kwarg.
r = praw.Reddit(user_agent=APP_UA)
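For reference, a hedged sketch of constructing the Reddit instance with keyword arguments only; a bare positional string is treated as a praw.ini site name, which is what produces the NoSectionError above (the client_id / client_secret values here are placeholders):

import praw

# Keyword arguments avoid the positional site_name lookup in praw.ini.
r = praw.Reddit(
    client_id="CLIENT_ID",          # placeholder
    client_secret="CLIENT_SECRET",  # placeholder
    user_agent="useragent",
)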
