I am using Python multithreading to run a task that takes about 2 to 3 minutes. I have made one API endpoint in a Django project.
Here is my code:
from threading import Thread

from django.http import HttpResponse

def myendpoint(request):
    print("hello")
    lis = [*args]
    obj = Model.objects.get(name="jax")
    T1 = MyThreadClass(lis, obj)
    T1.daemon = True
    T1.start()
    return HttpResponse("successful", status=200)

class MyThreadClass(Thread):
    def __init__(self, lis, obj):
        Thread.__init__(self)
        self.lis = lis
        self.obj = obj

    def run(self):
        for i in self.lis:
            res = Func1(i)
            self.obj.someattribute = res
            self.obj.save()

def Func1(i):
    '''Some big codes'''
    context = func2(*args)
    return context

def func2(*args):
    ''' some codes '''
    return res
With this multithreading I get a quick response from the Django server when the endpoint is called, because the big task is handed off to another thread and the endpoint returns on its own return statement without keeping track of the spawned thread.
This works correctly if I hit the URL once, but if I hit the URL a second time as soon as the first execution starts, I can see the second request on the console but I don't get any response from it.
And if I hit the same URL from 2 different clients at the same time, the two clients' data gets mixed up and I see some records from one client's request in the other client's data.
I am testing this against my local Django runserver.
Please help. I know about Celery, so don't recommend Celery; just tell me why this is happening and whether it can be fixed. My task is not long enough to justify Celery, and I want to achieve this with multithreading.
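A likely source of the mixing is that both requests fetch the same Model row (name="jax") and both threads write their results to that one shared object. Purely as an illustration (the request parameter and the row-per-client layout below are assumptions, not part of the original code), one way to keep each request's work isolated is to pass plain values into the thread and fetch the row inside it:
from threading import Thread

from django.http import HttpResponse

class IsolatedThread(Thread):
    def __init__(self, lis, row_name):
        Thread.__init__(self)
        self.lis = lis
        self.row_name = row_name

    def run(self):
        # fetch inside the thread so nothing is shared between requests
        obj = Model.objects.get(name=self.row_name)
        for i in self.lis:
            obj.someattribute = Func1(i)
            obj.save()

def myendpoint(request):
    lis = [*args]                              # per-request data, as in the original
    row_name = request.GET.get("name", "jax")  # hypothetical: each client targets its own row
    t = IsolatedThread(lis, row_name)
    t.daemon = True
    t.start()
    return HttpResponse("successful", status=200)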
I'm creating an application, and part of it is a web page where a trading algorithm tests itself using live data. All of that works, but the issue is that if I leave (exit) the web page, it stops. I was wondering how I can keep it running in the background indefinitely, as I want the algorithm to keep doing its thing.
This is the route that I would like to run in the background.
@app.route('/live-data-source')
def live_data_source():
    def get_live_data():
        live_options = lo.Options()
        while True:
            live_options.run()
            live_options.update_strategy()
            trades = live_options.get_all_option_trades()
            trades = trades[0]
            json_data = json.dumps({'data': trades})
            yield f"data:{json_data}\n\n"
            time.sleep(5)
    return Response(get_live_data(), mimetype='text/event-stream')
I've looked into multithreading, but I'm not sure whether that's the right tool for the job. I'm still fairly new to Flask, hence the poor question. If you need more info, please do comment.
You can do it the following way; a fully working example is below. Note that in production you should use Celery for such tasks, or write your own daemon app (a separate process) and feed it tasks from the HTTP server via a message queue (e.g. RabbitMQ) or a shared database.
If you have any questions about the code below, feel free to ask; it was quite a good exercise for me:
from flask import Flask, current_app
import threading
from threading import Thread, Event
import time
from random import randint

app = Flask(__name__)

# use this dict to store the events used to stop the worker threads
# one event per thread!
app.config["ThreadWorkerActive"] = dict()

def do_work(e: Event):
    """Function for another thread to do some work in."""
    while True:
        if e.is_set():
            break  # can be stopped from another thread
        print(f"{threading.current_thread().getName()} working now ...")
        time.sleep(2)
    print(f"{threading.current_thread().getName()} was stopped ...")

@app.route("/long_thread", methods=["GET"])
def long_thread_task():
    """Starts a new worker thread."""
    th_name = f"Th-{randint(100000, 999999)}"  # not really unique, actually
    stop_event = Event()  # used to stop the worker thread
    th = Thread(target=do_work, args=(stop_event, ), name=th_name, daemon=True)
    th.start()
    current_app.config["ThreadWorkerActive"][th_name] = stop_event
    return f"{th_name} was created!"

@app.route("/stop_thread/<th_id>", methods=["GET"])
def stop_thread_task(th_id):
    th_name = f"Th-{th_id}"
    if th_name in current_app.config["ThreadWorkerActive"].keys():
        e = current_app.config["ThreadWorkerActive"].get(th_name)
        if e:
            e.set()
            current_app.config["ThreadWorkerActive"].pop(th_name)
            return f"Th-{th_id} was asked to stop"
        else:
            return "Sorry, something went wrong..."
    else:
        return f"Th-{th_id} not found"

@app.route("/", methods=["GET"])
def index_route():
    text = ("/long_thread - create another thread. "
            "/stop_thread/th_id - stop the thread with a certain id. "
            f"Available threads: {'; '.join(current_app.config['ThreadWorkerActive'].keys())}")
    return text

if __name__ == '__main__':
    app.run(host="0.0.0.0", port=9999)
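If it helps, here is a quick way to exercise these routes from another Python shell (a sketch; it assumes the app above is running locally on port 9999 and that the requests package is installed):
import requests

base = "http://localhost:9999"

# start a worker thread and extract its id from the response text, e.g. "Th-123456 was created!"
created = requests.get(f"{base}/long_thread").text
th_id = created.split()[0].split("-")[1]

# the index route lists the running threads; stopping uses the id extracted above
print(requests.get(f"{base}/").text)
print(requests.get(f"{base}/stop_thread/{th_id}").text)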
The main intent of this question is to find out how to refresh a cache from the DB (which is populated by another team, not under our control) in a Django REST service, so that the cache can then be used when serving requests received on the REST endpoint.
Currently I am using the following approach, but my concern is: since Python (CPython, with the GIL) does not run threads truly in parallel, can we have the following type of code in a REST service, where one thread repopulates the cache every 30 minutes while the main thread serves requests on the REST endpoint? Here is sample code, for illustration only.
# mainproject.__init__.py
from threading import Thread, Event

globaldict = {}  # cache

class MyThread(Thread):
    def __init__(self, event):
        Thread.__init__(self)
        self.stopped = event

    def run(self):
        while not self.stopped.wait(1800):
            # refreshing the cache (global data structure) from the db takes around 5-6 mins
            refershcachefromdb()

refershcachefromdb()  # called explicitly to populate the cache initially
stop_flag = Event()
thread = MyThread(stop_flag)
thread.start()  # start the thread that will refresh the cache every 30 mins
# views.py
import mainproject

from django.http import JsonResponse
from rest_framework.decorators import api_view

@api_view(['GET'])
def get_data(request):
    str_param = request.GET.get('paramid')
    if str_param:
        try:
            paramids = [int(x) for x in str_param.split(",")]
        except ValueError:
            return JsonResponse({'Error': 'This REST endpoint only accepts comma-separated integers'}, status=422)
        # use the global cache to get records
        output_dct_lst = [mainproject.globaldict[paramid] for paramid in paramids if paramid in mainproject.globaldict]
        if not output_dct_lst:
            return JsonResponse({'Error': 'Data not available'}, status=422)
        else:
            return JsonResponse(output_dct_lst, status=200, safe=False)
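One detail worth noting about this pattern (a sketch under assumptions, not part of the original code): because request-handling threads read globaldict while it is being refreshed, it is safer to build the new cache in a local dict and swap the module-level reference in a single assignment, so readers always see either the old or the new cache, never a half-built one. The helper below is a hypothetical variant of refershcachefromdb() that returns the new dict instead of mutating the global one in place:
# mainproject.__init__.py (sketch)
from threading import Thread, Event

globaldict = {}  # cache, read by the request-handling threads

def build_cache_from_db():
    """Hypothetical variant of refershcachefromdb() that returns a fresh dict."""
    new_cache = {}
    # ... long-running DB reads populate new_cache here ...
    return new_cache

class MyThread(Thread):
    def __init__(self, event):
        Thread.__init__(self)
        self.stopped = event

    def run(self):
        global globaldict
        while not self.stopped.wait(1800):
            # readers keep using the old dict until this single reference swap
            globaldict = build_cache_from_db()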
I am new to Celery but am failing at what should be simple:
The backend and broker are both configured for RabbitMQ.
The task is as follows:
@app.task
def add(x, y):
    return x + y
Test Code:
File 1:
from tasks import add
from celery import uuid
task_id = uuid()
result = add.delay(7, 2)
task_id = result.task_id
print task_id
# output =
05f3f783-a538-45ed-89e3-c836a2623e8a
print result.get()
# output =
9
File 2:
from tasks import add
from celery.result import AsyncResult
res = AsyncResult('05f3f783-a538-45ed-89e3-c836a2623e8a')
print res.state
# output =
pending
print ('Result = %s' %res.get())
My understanding is that File 2 should retrieve the SUCCESS state and the value 9.
I have installed Flower:
It reports SUCCESS and 9 for the result.
Help. This is driving me nuts.
Thank you
Maybe you should read the FineManual and think twice?
RPC Result Backend (RabbitMQ/QPid)
The RPC result backend (rpc://) is special as it doesn’t actually store the states, but rather sends them as messages. This is an important difference as it means that a result can only be retrieved once, and only by the client that initiated the task. Two different processes can’t wait for the same result.
(...)
The messages are transient (non-persistent) by default, so the results will disappear if the broker restarts. You can configure the result backend to send persistent messages using the result_persistent setting.
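One way around this, sketched below with assumed connection URLs, is to point the result backend at a store that any process can query (Redis here, but a database backend works the same way) instead of the transient rpc:// backend:
# tasks.py (sketch; broker/backend URLs are placeholders)
from celery import Celery

app = Celery(
    'tasks',
    broker='amqp://guest:guest@localhost:5672//',   # RabbitMQ stays the broker
    backend='redis://localhost:6379/0',             # results go to a shared store
)

@app.task
def add(x, y):
    return x + y
With a store-based backend like this, the AsyncResult lookup in File 2 can see SUCCESS and retrieve 9 even though it did not start the task; keeping rpc:// and setting result_persistent = True only makes results survive broker restarts, it still does not let a second process read them.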
I have a problem with Django.
When users submit some data, it goes into views.py for processing, and eventually they get to the success page.
But the processing takes too long, and I don't want users to wait that long. What I want is for users to get to the success page right away after submitting the data, and for the server to process the data after returning the success page.
Can you please tell me how to deal with this?
This is my code, but I don't know why it doesn't work.
url.py
from django.conf.urls import patterns, url
from hebeu.views import handleRequest

urlpatterns = patterns('',
    url(r'^$', handleRequest),
)
view.py
def handleRequest(request):
    if request.method == 'POST':
        response = HttpResponse(parserMsg(request))
        return response
    else:
        return None

def parserMsg(request):
    rawStr = smart_str(request.body)
    msg = paraseMsgXml(ET.fromstring(rawStr))
    queryStr = msg.get('Content')
    openID = msg.get('FromUserName')
    arr = smart_unicode(queryStr).split(' ')
    # start a new thread
    cache_classroom(openID, arr[1], arr[2], arr[3], arr[4]).start()
    return "success"
My English is not good; I hope you can understand.
Take a look at Celery; it is a distributed task queue that will handle your situation perfectly. There is a little bit of setup to get everything working, but once that is out of the way Celery is really easy to work with.
For integration with Django start here: http://docs.celeryproject.org/en/latest/django/index.html
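For a rough idea of what that looks like here (a sketch only; it assumes a Celery app is already configured for the project as in the linked docs, and the task name is made up):
# tasks.py (sketch)
from celery import shared_task

@shared_task
def cache_classroom_task(openID, a, b, c, d):
    # the long-running classroom-caching work from the original view goes here;
    # it now runs in a Celery worker process instead of the web process
    ...

# views.py (sketch): inside parserMsg, replace the thread start with
# cache_classroom_task.delay(openID, arr[1], arr[2], arr[3], arr[4])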
Write a management command for parserMsg, trigger it with subprocess.Popen, and return success to the user; the parserMsg processing will then run in the background. If there are more operations of this kind in the application, you should use Celery. A rough sketch of this approach is below.
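A minimal sketch of that idea; the command name (parsermsg), its arguments, and the app layout are chosen here purely for illustration:
# myapp/management/commands/parsermsg.py (hypothetical command)
from django.core.management.base import BaseCommand

class Command(BaseCommand):
    help = "Runs the slow parserMsg work in a separate process"

    def add_arguments(self, parser):
        parser.add_argument("open_id")
        parser.add_argument("extra", nargs="*")

    def handle(self, *args, **options):
        # the long-running classroom-caching work goes here,
        # using options["open_id"] and options["extra"]
        ...

# views.py (sketch): launch the command without waiting for it
import subprocess, sys

def parserMsg(request):
    # ... same parsing as in the original view ...
    subprocess.Popen(
        [sys.executable, "manage.py", "parsermsg",
         openID, arr[1], arr[2], arr[3], arr[4]]
    )
    return "success"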
This is quite easy: wrap the code under the # start a new thread comment in a thread class like the one below.
from threading import Thread
from datetime import datetime

class ProcessThread(Thread):
    def __init__(self, name, openID, arr):
        Thread.__init__(self)
        self.name = name
        self.openID = openID
        self.arr = arr
        self.started = datetime.now()

    def run(self):
        cache_classroom(self.openID, self.arr[1], self.arr[2], self.arr[3], self.arr[4]).start()
        # I added this so you know how long the process lasted,
        # just in case any optimization of your code is needed
        finished = datetime.now()
        duration = (finished - self.started).seconds
        print "%s thread started at %s and finished at %s in "\
              "%s seconds" % (self.name, self.started, finished, duration)

# now let us start the thread
my_thread = ProcessThread("CacheClassroom", openID, arr)
my_thread.start()
I have trouble with long calculations in Django. I am not able to install Celery because of my company's idiocy, so I have to "reinvent the wheel". I am trying to run all calculations in a TaskQueue class, which stores them in a dictionary called "results". I am also trying to make a "Please wait" page, which asks this TaskQueue whether the task with the provided key is ready.
And the problem is that the results somehow disappear.
I have some view with long calculations.
def some_view(request):
    ...
    uuid = task_queue.add_task(method_name, params)  # method_name(params) returns HttpResponse
    return redirect('/please_wait/?uuid={0}'.format(uuid))
And the please_wait view:
def please_wait(request):
    uuid = request.GET.get('uuid', '0')
    ready = task_queue.task_ready(uuid)
    if ready:
        return task_queue.task_result(uuid)
    elif ready is None:
        return render_to_response('admin/please_wait.html', {'not_found': True})
    else:
        return render_to_response('admin/please_wait.html', {'not_found': False})
And the last piece of code, my TaskQueue:
import uuid
from multiprocessing.pool import ThreadPool
from threading import Lock

class TaskQueue:
    def __init__(self):
        self.pool = ThreadPool()
        self.results = {}
        self.lock = Lock()

    def add_task(self, method, params):
        self.lock.acquire()
        new_uuid = self.generate_new_uuid()
        while self.results.has_key(new_uuid):
            new_uuid = self.generate_new_uuid()
        self.results[new_uuid] = self.pool.apply_async(func=method, args=params)
        self.lock.release()
        return new_uuid

    def generate_new_uuid(self):
        return uuid.uuid1().hex[0:8]

    def task_ready(self, task_id):
        if self.results.has_key(task_id):
            return self.results[task_id].ready()
        else:
            return None

    def task_result(self, task_id):
        if self.task_ready(task_id):
            return self.results[task_id].get()
        else:
            return None

task_queue = TaskQueue()  # module-level ("global") instance
After adding a task I can log its result by providing its uuid for a few seconds, and then it says the task isn't ready. Here is my log (I am outputting task_queue.results):
[INFO] 2013-10-01 16:04:52,782 logger: {'ade5d154': <multiprocessing.pool.ApplyResult object at 0x1989906c>}
[INFO] 2013-10-01 16:05:05,740 logger: {}
Help me, please! Why the hell does the result disappear?
UPD: @freakish helped me find out some new information. The result doesn't disappear forever; it only disappears sometimes when I repeat my attempts to log it.
[INFO] 2013-10-01 16:52:41,743 logger: {}
[INFO] 2013-10-01 16:52:45,775 logger: {}
[INFO] 2013-10-01 16:52:48,855 logger: {'ade5d154': <multiprocessing.pool.ApplyResult object at 0x1989906c>}
OK, so we've established that you are running 4 Django processes. In that case your queue won't be shared between them. There are two possible solutions AFAIK:
Use a shared queueing server. You can write your own (see for example this entry), but using a proper one (like Celery) will be a lot easier (if you can't convince your employer to install it, then quit the job ;)).
Use a database to store the results and let each server do the calculations (via processes or threads). It does not have to be a proper database server; you can use sqlite3, for example. This is a more secure and reliable way, but less efficient, and I think it is also easier than a queueing mechanism. You simply create a table with the columns id, state, result. When you create a job you insert an entry with state=processing; when you finish the job you update the entry with state=done and result=result (for example as a JSON string). This is easy and reliable (you actually don't need a queue here at all; the order of jobs doesn't matter unless I'm missing something). A sketch of this is shown below.
Of course you won't be able to use the .ready() function with this approach (you should store the results in these storages), unless you pickle the results, but that is unnecessary overhead.
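To illustrate the second option with Django itself (the model, field names, and helper functions below are made up for the sketch):
# models.py (sketch)
from django.db import models

class Job(models.Model):
    uuid = models.CharField(max_length=32, primary_key=True)
    state = models.CharField(max_length=16, default="processing")
    result = models.TextField(null=True)

# a thread-based replacement for TaskQueue.add_task whose state any of the 4 processes can read
import json
import uuid as uuid_module
from threading import Thread

def add_task(method, params):
    job = Job.objects.create(uuid=uuid_module.uuid1().hex)

    def worker():
        # assumes method(*params) returns something JSON-serialisable
        value = method(*params)
        Job.objects.filter(uuid=job.uuid).update(state="done", result=json.dumps(value))

    Thread(target=worker).start()
    return job.uuid

def task_ready(task_uuid):
    try:
        return Job.objects.get(uuid=task_uuid).state == "done"
    except Job.DoesNotExist:
        return None
The please_wait view then queries Job by uuid instead of the in-memory dictionary, so it no longer matters which of the 4 processes handled the original request.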