I have a service which starts three subprocesses from main. I need a Prometheus metrics server in one of the processes, but I get "AttributeError: Can't pickle local object 'MultiProcessValue.<locals>.MmapedValue'" when the process with the Prometheus server is started.
main.py:
from os import environ
from os.path import dirname, join

# Set before importing prometheus_client so multiprocess mode is picked up.
environ['PROMETHEUS_MULTIPROC_DIR'] = join(dirname(__file__), 'prom')

import multiprocessing
from sub_process import SubProcessClass

if __name__ == "__main__":
    ...
    multiprocessing.Process(target=SubProcessClass().f).start()
sub_process.py:
from my_metrics import MyMetric
class SubProcessClass:
    def __init__(self):
        self.metrics = MyMetric(8090)
        self.metrics.start()

    def f(self):
        print("In f")
        self.metrics.inc_my_counter()
my_metrics.py:
from prometheus_client import Counter, CollectorRegistry, multiprocess
from prometheus_client import start_http_server
class MyMetric:
    def __init__(self, port):
        self.port = port
        self.registry = CollectorRegistry()
        multiprocess.MultiProcessCollector(self.registry)
        self.my_counter = Counter('counter_name', 'counter_desc', registry=self.registry)

    def start(self):
        start_http_server(self.port)

    def inc_my_counter(self):
        self.my_counter.inc()
I get the exception below when running main.py:
Traceback (most recent call last):
File "<project_path>\source\main.py", line 15, in <module>
multiprocessing.Process(target=MyClass().f).start()
File "<python_path>\lib\multiprocessing\process.py", line 121, in start
self._popen = self._Popen(self)
File "<python_path>\lib\multiprocessing\context.py", line 224, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "<python_path>\lib\multiprocessing\context.py", line 327, in _Popen
return Popen(process_obj)
File "<python_path>\lib\multiprocessing\popen_spawn_win32.py", line 93, in __init__
reduction.dump(process_obj, to_child)
File "<python_path>\lib\multiprocessing\reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object 'MultiProcessValue.<locals>.MmapedValue'
I am running Python 3.9.7 on Windows.
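For context, on Windows multiprocessing uses the spawn start method, so the Process target (here a bound method of an already-constructed SubProcessClass, whose Counter holds an mmap-backed value) must be pickled. A minimal sketch of one common workaround, assuming the metric is only needed inside the child: construct everything in the spawned process, via a hypothetical module-level helper, so nothing mmap-backed crosses the pickle boundary.

import multiprocessing

def run_sub_process():
    # Hypothetical helper, not from the original post: MyMetric and its
    # Counter are created here, inside the child, so they are never pickled.
    from sub_process import SubProcessClass
    SubProcessClass().f()

if __name__ == "__main__":
    multiprocessing.Process(target=run_sub_process).start()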
Related
I'm trying to use the Python multiprocessing module to run a server in another process using the http.server.BaseHTTPRequestHandler class, but I am stuck and running into a '_thread.lock' pickling issue.
I don't want to use the threading module because I'd rather have true parallelism with the multiprocessing module.
If anyone knows what I am doing incorrectly, or can point me to a good library to use, that would be awesome.
import multiprocessing
from http.server import ThreadingHTTPServer, BaseHTTPRequestHandler

if __name__ == '__main__':
    httpd = ThreadingHTTPServer(('localhost', 4433), BaseHTTPRequestHandler)
    manager = multiprocessing.Manager()
    manager.http_server = httpd
    running_server = multiprocessing.Process(target=manager.http_server.serve_forever)
    running_server.start()
Stack Trace:
File "/Users/redacted/python/test2/test1.py", line 10, in <module>
running_server.start()
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/process.py", line 121, in start
self._popen = self._Popen(self)
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/context.py", line 224, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/context.py", line 284, in _Popen
return Popen(process_obj)
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 32, in __init__
super().__init__(process_obj)
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/popen_fork.py", line 19, in __init__
self._launch(process_obj)
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 47, in _launch
reduction.dump(process_obj, fp)
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)
TypeError: cannot pickle '_thread.lock' object
Python uses pickle to pass objects to another process when using the multiprocessing module. In your case, the thread lock used inside the HTTP server is not picklable, hence the error.
What you can do is create the HTTP server entirely inside the other process, like this:
import multiprocessing
from http.server import ThreadingHTTPServer, BaseHTTPRequestHandler

def startServer():
    # The server (and its internal locks) is created in the child process,
    # so nothing unpicklable crosses the process boundary.
    httpd = ThreadingHTTPServer(('localhost', 4433), BaseHTTPRequestHandler)
    httpd.serve_forever()

if __name__ == '__main__':
    running_server = multiprocessing.Process(target=startServer)
    running_server.start()
Also, you might want to try a port other than 4433. I cannot connect to that port on my Windows machine, but with 8000 everything works fine.
After several tests, I found that this problem is caused by the nesting in manager.list(manager.list(...)). But I really need it to be two-dimensional. Any suggestion would be appreciated!
I'm trying to build a server and multiple clients across multiple nodes.
One node acts as the server, which initializes a manager.list() for the clients to use.
The other nodes act as clients, which attach to the server to get the list and work with it.
The firewall is off, and when the server and a client run on a single node, it works fine.
I get an error like this:
Traceback (most recent call last):
File "main.py", line 352, in <module>
train(args)
File "main.py", line 296, in train
args, proc_manager, device)
File "main.py", line 267, in make_gossip_buffer
mng,sync_freq=args.sync_freq, num_nodes=args.num_nodes)
File "/home/think/gala-master-distprocess-changing_to_multinodes/gala/gpu_gossip_buffer.py", line 49, in __init__
r_events = read_events[rank]
File "<string>", line 2, in __getitem__
File "/home/think/anaconda3/envs/AC/lib/python3.7/multiprocessing/managers.py", line 819, in _callmethod
kind, result = conn.recv()
File "/home/think/anaconda3/envs/AC/lib/python3.7/multiprocessing/connection.py", line 251, in recv
return _ForkingPickler.loads(buf.getbuffer())
File "/home/think/anaconda3/envs/AC/lib/python3.7/multiprocessing/managers.py", line 943, in RebuildProxy
return func(token, serializer, incref=incref, **kwds)
File "/home/think/anaconda3/envs/AC/lib/python3.7/multiprocessing/managers.py", line 793, in __init__
self._incref()
File "/home/think/anaconda3/envs/AC/lib/python3.7/multiprocessing/managers.py", line 847, in _incref
conn = self._Client(self._token.address, authkey=self._authkey)
File "/home/think/anaconda3/envs/AC/lib/python3.7/multiprocessing/connection.py", line 492, in Client
c = SocketClient(address)
File "/home/think/anaconda3/envs/AC/lib/python3.7/multiprocessing/connection.py", line 620, in SocketClient
s.connect(address)
FileNotFoundError: [Errno 2] No such file or directory
The server runs on a single node.
The server code is shown below:
import torch.multiprocessing as mp
from multiprocessing.managers import ListProxy, BarrierProxy, AcquirerProxy, EventProxy
from gala.arguments import get_args

mp.current_process().authkey = b'abc'

def server(manager, host, port, key, args):
    read_events = manager.list([manager.list([manager.Event() for _ in range(num_learners)])
                                for _ in range(num_learners)])
    manager.register('get_read_events', callable=lambda: read_events, proxytype=ListProxy)
    print('start service at', host)
    s = manager.get_server()
    s.serve_forever()

if __name__ == '__main__':
    mp.set_start_method('spawn')
    args = get_args()
    manager = mp.Manager()
    server(manager, '10.107.13.120', 5000, b'abc', args)
The clients run on other nodes, which connect to the server over Ethernet. The client IP is 10.107.13.80.
The client code is shown below:
import torch.multiprocessing as mp

mp.current_process().authkey = b'abc'

def make_gossip_buffer(mng):
    read_events = mng.get_read_events()
    gossip_buffer = GossipBuffer(parameters)

def train(args):
    proc_manager = mp.Manager()
    proc_manager.register('get_read_events')
    proc_manager.__init__(address=('10.107.13.120', 5000), authkey=b'abc')
    proc_manager.connect()
    make_gossip_buffer(proc_manager)

if __name__ == "__main__":
    mp.set_start_method('spawn')
    train(args)
Any help would be appreciated!
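A hedged guess from the FileNotFoundError: a manager proxy embeds the address of the manager that owns its referent, and mp.Manager() listens on a local Unix socket, so when a remote client unpickles the nested inner proxies it tries to open a socket path that doesn't exist on its machine. A minimal sketch of a possible fix, assuming the owning manager should listen on TCP instead (the learner count and second port are placeholders):

from multiprocessing.managers import SyncManager

class NodeManager(SyncManager):
    pass

if __name__ == '__main__':
    num_learners = 2  # hypothetical count, presumably args.num_learners
    # The manager that owns the shared objects listens on TCP, so the
    # tokens embedded in the nested proxies are reachable from other nodes.
    owner = NodeManager(address=('10.107.13.120', 5001), authkey=b'abc')
    owner.start()
    read_events = owner.list([owner.list([owner.Event() for _ in range(num_learners)])
                              for _ in range(num_learners)])

    # A second manager only exposes a handle to the nested structure.
    NodeManager.register('get_read_events', callable=lambda: read_events)
    front = NodeManager(address=('10.107.13.120', 5000), authkey=b'abc')
    front.get_server().serve_forever()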
I'm trying to write some unit tests for functions that use the OpenShift Python client, so I need to mock some OpenShift client calls.
https://github.com/openshift/openshift-restclient-python
from kubernetes import client, config
from openshift.dynamic import DynamicClient, exceptions as openshift_exceptions

_openshift_cert_manager_api_version = 'cert-manager.io/v1'

class OCPException(Exception):
    pass

def certificate_exists(dyn_client: DynamicClient, namespace: str, certificate_name: str) -> bool:
    """ Checks if certificate already exists in namespace. """
    v1_certificates = dyn_client.resources.get(api_version=_openshift_cert_manager_api_version, kind='Certificate')
    try:
        v1_certificates.get(namespace=namespace, name=certificate_name)
    except openshift_exceptions.NotFoundError:
        return False
    except openshift_exceptions.DynamicApiError as error:  # a bare module can't be caught
        raise OCPException(error)
    return True
import unittest
from unittest import mock
from openshift.dynamic import DynamicClient
from awscertmanager import ocp

class TestCertificateExists(unittest.TestCase):
    def test_certificate_exists(self):
        mock_client = mock.create_autospec(DynamicClient)
        ocp.certificate_exists(mock_client, namespace='dummy', certificate_name='tls-wildcard-dummy')
        mock_client.resources.get.assert_called_with(
            api_version=ocp._openshift_cert_manager_api_version,
            kind='Certificate'
        )

if __name__ == '__main__':
    unittest.main()
This test works fine for the first call, but I have had no success verifying that the second call (v1_certificates.get) is made.
I've tried mocking the Resource class, but I get the error below.
import unittest
from unittest import mock
from openshift.dynamic import DynamicClient, Resource
from awscertmanager import ocp

class TestCertificateExists(unittest.TestCase):
    @mock.patch.object(Resource, 'get')
    def test_certificate_exists(self, mock_get):
        mock_client = mock.create_autospec(DynamicClient)
        ocp.certificate_exists(mock_client, namespace='dummy', certificate_name='tls-wildcard-dummy')
        mock_client.resources.get.assert_called_with(
            api_version=ocp._openshift_cert_manager_api_version,
            kind='Certificate'
        )
        mock_get.assert_called_with(namespace='dummy', name='tls-wildcard-dummy')

if __name__ == '__main__':
    unittest.main()
Error
Traceback (most recent call last):
File "/usr/local/Cellar/python#3.8/3.8.6_2/Frameworks/Python.framework/Versions/3.8/lib/python3.8/unittest/case.py", line 60, in testPartExecutor
yield
File "/usr/local/Cellar/python#3.8/3.8.6_2/Frameworks/Python.framework/Versions/3.8/lib/python3.8/unittest/case.py", line 676, in run
self._callTestMethod(testMethod)
File "/usr/local/Cellar/python#3.8/3.8.6_2/Frameworks/Python.framework/Versions/3.8/lib/python3.8/unittest/case.py", line 633, in _callTestMethod
method()
File "/usr/local/Cellar/python#3.8/3.8.6_2/Frameworks/Python.framework/Versions/3.8/lib/python3.8/unittest/mock.py", line 1322, in patched
with self.decoration_helper(patched,
File "/usr/local/Cellar/python#3.8/3.8.6_2/Frameworks/Python.framework/Versions/3.8/lib/python3.8/contextlib.py", line 113, in __enter__
return next(self.gen)
File "/usr/local/Cellar/python#3.8/3.8.6_2/Frameworks/Python.framework/Versions/3.8/lib/python3.8/unittest/mock.py", line 1304, in decoration_helper
arg = exit_stack.enter_context(patching)
File "/usr/local/Cellar/python#3.8/3.8.6_2/Frameworks/Python.framework/Versions/3.8/lib/python3.8/contextlib.py", line 425, in enter_context
result = _cm_type.__enter__(cm)
File "/usr/local/Cellar/python#3.8/3.8.6_2/Frameworks/Python.framework/Versions/3.8/lib/python3.8/unittest/mock.py", line 1393, in __enter__
original, local = self.get_original()
File "/usr/local/Cellar/python#3.8/3.8.6_2/Frameworks/Python.framework/Versions/3.8/lib/python3.8/unittest/mock.py", line 1366, in get_original
raise AttributeError(
AttributeError: <class 'openshift.dynamic.resource.Resource'> does not have the attribute 'get'
Finally, it works by mocking the DynamicClient class and some of its calls:
@mock.patch('awscertmanager.ocp.DynamicClient')
def test_certificate_exists_true(self, mock_dynamicclient):
    mock_resource = mock.Mock()
    mock_resource.get.return_value = None
    mock_dynamicclient.resources.get.return_value = mock_resource
    result = ocp.certificate_exists(mock_dynamicclient, namespace=self.namespace, certificate_name=self.certificate_name)
    mock_resource.get.assert_called_with(namespace=self.namespace, name=self.certificate_name)
    self.assertEqual(result, True)
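As a hedged alternative that avoids patching Resource entirely, and assuming (as in the first test above) that mock_client.resources.get works on the autospecced client: the object returned by resources.get is itself a mock, so the nested call can be asserted through its return_value.

import unittest
from unittest import mock
from openshift.dynamic import DynamicClient
from awscertmanager import ocp

class TestCertificateExistsChained(unittest.TestCase):
    def test_nested_get_called(self):
        mock_client = mock.create_autospec(DynamicClient)
        ocp.certificate_exists(mock_client, namespace='dummy', certificate_name='tls-wildcard-dummy')
        # resources.get returned a plain MagicMock; the code under test
        # called .get(...) on it, which we can assert here.
        v1_certificates = mock_client.resources.get.return_value
        v1_certificates.get.assert_called_with(namespace='dummy', name='tls-wildcard-dummy')

if __name__ == '__main__':
    unittest.main()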
I am trying to delegate a long processing task to the background so I can keep my UI responsive. Instead of using multithreading, which is not truly concurrent in Python, I implemented my background task with multiprocessing. However, I keep encountering an error that says:
Traceback (most recent call last):
File "C:\source\MyApp\MyApp\env\lib\site-packages\engineio\server.py", line 411, in _trigger_event
return self.handlers[event](*args)
File "C:\source\MyApp\MyApp\env\lib\site-packages\socketio\server.py", line 522, in _handle_eio_message
self._handle_event(sid, pkt.namespace, pkt.id, pkt.data)
File "C:\source\MyApp\MyApp\env\lib\site-packages\socketio\server.py", line 458, in _handle_event
self._handle_event_internal(self, sid, data, namespace, id)
File "C:\source\MyApp\MyApp\env\lib\site-packages\socketio\server.py", line 461, in _handle_event_internal
r = server._trigger_event(data[0], namespace, sid, *data[1:])
File "C:source\MyApp\MyApp\env\lib\site-packages\socketio\server.py", line 490, in _trigger_event
return self.handlers[namespace][event](*args)
File "C:\source\MyApp\MyApp\env\lib\site-packages\flask_socketio\__init__.py", line 251, in _handler
*args)
File "C:\source\MyApp\MyApp\env\lib\site-packages\flask_socketio\__init__.py", line 634, in _handle_event
ret = handler(*args)
File "C:\source\MyApp\MyApp\MyApp\routes.py", line 171, in StartProcess
myClassObj.Start()
File "C:\source\MyApp\MyAppv\src\Controller.py", line 121, in Start
p.start()
File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\lib\multiprocessing\process.py", line 105, in start
self._popen = self._Popen(self)
File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\lib\multiprocessing\context.py", line 223, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\lib\multiprocessing\context.py", line 322, in _Popen
return Popen(process_obj)
File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\lib\multiprocessing\popen_spawn_win32.py", line 65, in __init__
reduction.dump(process_obj, to_child)
File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\lib\multiprocessing\reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)
TypeError: can't pickle _thread.RLock objects
I am wondering where I am going wrong in my use of multiprocessing.
class MyClass(object):
    def __init__(self):
        self._logger = logging.getLogger('MyClassObj')

    def Start(self):
        p = Process(target=self.Test, args=(1,))
        p.start()

    def Test(self, number):
        print(number)
I am calling this class method from Flask's routes.py:

@socketio.on('StartProcess')
def StartProcess(msg):
    """Do processing."""
    myClassObj.Start()
    return 'OK'
This routes.py is not the entry point of my application; it is imported by runserver.py:
from os import environ
from MyApp import routes
from MyApp import app
from flask_socketio import SocketIO
from MyApp.routes import socketio

if __name__ == '__main__':
    HOST = environ.get('SERVER_HOST', 'localhost')
    try:
        PORT = int(environ.get('SERVER_PORT', '5555'))
    except ValueError:
        PORT = 5555
    socketio.run(app, host='0.0.0.0', port=PORT, debug=False)
I keep seeing people mention that multiprocessing needs to run under a statement
if __name__ == '__main__':
but I am not sure how to use multiprocessing correctly in my case, because at the entry point (runserver.py) I do not need my background process.
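For what it's worth, here is a minimal standalone sketch (mine, not the app's code) of why the guard matters: with the spawn start method each child re-imports the entry module, so any unguarded top-level code runs again in every child.

import multiprocessing

def work():
    print('in child')

print('module imported')  # printed by the parent and again by each spawned child

if __name__ == '__main__':
    multiprocessing.Process(target=work).start()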
=== Edit ===
I created a very simple example to further illustrate this problem. My application has a simple Flask app structure.
Entry point runserver.py:
from os import environ
from MultiprocessDemo import app

if __name__ == '__main__':
    HOST = environ.get('SERVER_HOST', 'localhost')
    try:
        PORT = int(environ.get('SERVER_PORT', '5100'))
    except ValueError:
        PORT = 5100
    app.run(HOST, PORT, threaded=False)
route-level views.py:
from datetime import datetime
from flask import render_template
from multiprocessing import Process
from MultiprocessDemo import app
from flask_socketio import SocketIO, emit
from MultiprocessDemo.src.MultiprocessClass import MyClass

app.config['SECRET_KEY'] = 'secret!'
# socketio = SocketIO(app)

obj = MyClass()

@app.route('/')
def index():
    """Request to fire multiprocess."""
    return render_template(
        'Demo.html'
    )

@app.route('/TriggerMultiprocess', methods=['GET', 'POST'])
def TriggerMultiprocess():
    print('request to trigger')
    obj.Start()
    return render_template(
        'Demo.html'
    )
The MyClass object, which creates and runs the processes:
# import queue  # Queue as of Python 2.x
from multiprocessing import Queue
import threading
import cv2
import os, errno, sys
import logging
import datetime
import time
from multiprocessing import Process

class MyClass(object):
    def __init__(self):
        # == Where the issue happens ==========================
        self._logger = logging.getLogger('MyClassObj')
        self._logger.setLevel(logging.DEBUG)
        format = logging.Formatter('%(levelname)s - %(asctime)s - %(module)s - %(thread)d - %(message)s')

    def Start(self):
        self.jobs = []
        self._nThread = 3
        for i in range(0, self._nThread):
            thread = Process(target=self.Test, args=('classobj',))
            self.jobs.append(thread)
        # Start the processes
        for j in self.jobs:
            j.start()

    # background process to consume tasks
    def Test(self, name):
        i = 0
        while i < 20:
            print('hello, {}'.format(name))
            i += 1
I found that if MyClass does not hold a Python logger as a member, the processes start without issue; otherwise it throws the same error I encountered before: TypeError: can't pickle _thread.RLock objects.
However, if I use the same class from the following script, without Flask, I don't encounter any pickling issue, whether or not the logger is a class member.
from MultiprocessClass import MyClass
import time

if __name__ == '__main__':
    a = MyClass()
    a.Start()
    print("done")
Is there any way to pool connections or use a single connection across multiple processes?
I am trying to use one connection across multiple processes. Here is the code (running on Python 2.7 with pyodbc).
# Import custom python packages
import pathos.multiprocessing as mp
import pyodbc

class MyManagerClass(object):
    def __init__(self):
        self.conn = None
        self.result = []

    def connect_to_db(self):
        conn = pyodbc.connect("DSN=cpmeast;UID=dntcore;PWD=dntcorevs2")
        cursor = conn.cursor()
        self.conn = conn
        return cursor

    def read_data(self, *args):
        cursor = args[0][0]
        data = args[0][1]
        print 'Running query'
        cursor.execute("WAITFOR DELAY '00:00:02';select GETDATE(), '"+data+"';")
        self.result.append(cursor.fetchall())

def read_data(*args):
    print 'Running query', args
    # cursor.execute("WAITFOR DELAY '00:00:02';select GETDATE(), '"+data+"';")

def main():
    dbm = MyManagerClass()
    conn = pyodbc.connect("DSN=cpmeast;UID=dntcore;PWD=dntcorevs2")
    cursor = conn.cursor()

    pool = mp.ProcessingPool(4)
    for i in pool.imap(dbm.read_data, ((cursor, 'foo'), (cursor, 'bar'))):
        print i
    pool.close()
    pool.join()

    cursor.close()
    dbm.conn.close()
    print 'Result', dbm.result
    print 'Closed'

if __name__ == '__main__':
    main()
I am getting the following error:
Process PoolWorker-1:
Traceback (most recent call last):
File "/home/amit/envs/py_env_clink/lib/python2.7/site-packages/processing/process.py", line 227, in _bootstrap
self.run()
File "/home/amit/envs/py_env_clink/lib/python2.7/site-packages/processing/process.py", line 85, in run
self._target(*self._args, **self._kwargs)
File "/home/amit/envs/py_env_clink/lib/python2.7/site-packages/processing/pool.py", line 54, in worker
for job, i, func, args, kwds in iter(inqueue.get, None):
File "/home/amit/envs/py_env_clink/lib/python2.7/site-packages/processing/queue.py", line 327, in get
return recv()
File "/home/amit/envs/py_env_clink/lib/python2.7/site-packages/dill-0.2.4-py2.7.egg/dill/dill.py", line 209, in loads
return load(file)
File "/home/amit/envs/py_env_clink/lib/python2.7/site-packages/dill-0.2.4-py2.7.egg/dill/dill.py", line 199, in load
obj = pik.load()
File "/home/amit/envs/py_env_clink/lib/python2.7/pickle.py", line 858, in load
dispatch[key](self)
File "/home/amit/envs/py_env_clink/lib/python2.7/pickle.py", line 1083, in load_newobj
obj = cls.__new__(cls, *args)
TypeError: object.__new__(pyodbc.Cursor) is not safe, use pyodbc.Cursor.__new__()
Process PoolWorker-2:
Traceback (most recent call last):
... (identical to the PoolWorker-1 traceback above)
TypeError: object.__new__(pyodbc.Cursor) is not safe, use pyodbc.Cursor.__new__()
The problem is with the pickling stage: pickle doesn't inherently know how to serialize a connection. Consider:
import pickle
import pymssql

a = {'hello': 'world'}

server = 'server'
username = 'username'
password = 'password'
database = 'database'
conn = pymssql.connect(host=server, user=username, password=password, database=database)

with open('filename.pickle', 'wb') as handle:
    pickle.dump(conn, handle, protocol=pickle.HIGHEST_PROTOCOL)

with open('filename.pickle', 'rb') as handle:
    b = pickle.load(handle)

print(a == b)
This results in the following error message:
Traceback (most recent call last):
File "pickle_ex.py", line 10, in <module>
pickle.dump(conn, handle, protocol=pickle.HIGHEST_PROTOCOL)
File "stringsource", line 2, in _mssql.MSSQLConnection.__reduce_cython__
TypeError: no default __reduce__ due to non-trivial __cinit__
But if you replace conn with a in pickle.dump, the code will run and print out True.
You may be able to define a custom __reduce__ method in your class, but I wouldn't try it: it would effectively make temp tables behave like global temp tables that are only accessible across these processes, which shouldn't be allowed to happen anyway.
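A more conventional route, sketched here with the connection string from the question (the initializer pattern itself is my suggestion, not tested against this DSN): give each worker process its own connection, created inside the worker via a Pool initializer, so no connection or cursor is ever pickled.

import pyodbc
from multiprocessing import Pool

_conn = None

def init_worker():
    # Runs once inside each worker process: every worker opens its own
    # connection, so nothing connection-related crosses the pickle boundary.
    global _conn
    _conn = pyodbc.connect("DSN=cpmeast;UID=dntcore;PWD=dntcorevs2")

def read_data(data):
    cursor = _conn.cursor()
    cursor.execute("WAITFOR DELAY '00:00:02';select GETDATE(), ?;", data)
    # Convert rows to plain tuples so the results pickle cleanly on the
    # way back to the parent process.
    return [tuple(row) for row in cursor.fetchall()]

if __name__ == '__main__':
    pool = Pool(4, initializer=init_worker)
    for rows in pool.imap(read_data, ['foo', 'bar']):
        print(rows)
    pool.close()
    pool.join()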
Links:
My pickle code is from here: How can I use pickle to save a dict?