Mis-arranged dictionary pairs when hosted on Flask - python

When I print the dictionary in the interpreter, it comes out as desired, but when I use it as the return value of a Flask API the dictionary becomes a mess: all the key-value pairs are mis-arranged.
Not desired JSON data (got this on Flask API) - https://pastebin.com/jrfLMVNg
Desired JSON data (got this on interpreter) - https://pastebin.com/cDJnah07
Probably the faulty code:
def dataPacker(self, *datas):
    for data in datas:
        if type(data) == dict:
            for key, value in data.items():
                self.returnDataJson[key] = value
        else:
            raise Exception('dict object expected')

def dataCollector(self):
    with concurrent.futures.ThreadPoolExecutor() as executor:
        details_ = executor.submit(self.dataPacker, self.details)
        audiolink_ = executor.submit(self.dataPacker, self.audiolink)
        videolink_ = executor.submit(self.dataPacker, self.videolink)
        lyrics_ = executor.submit(self.dataPacker, self.lyrics)
    return self.returnDataJson
Is this because of threading? But why does it work fine on Interpreter?

So the problem is that the items are in the wrong order, but each key has the right value?
You don't say which version of Python this is; older versions didn't keep items in order (it was arbitrary), and Flask may be deliberately making it random as well as arbitrary in order to protect against attacks.
If it comes to that, the order of items in a JSON dictionary (object) is defined to be unimportant, so you shouldn't rely on it if you can at all help it.
Threading will indeed make the order potentially interleaved. If you need to rely on the order, you'll need to put in some mechanism to guarantee it; currently it's just chance whether it ends up in the order you want or in some other order.
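One possible mechanism, sketched under the assumption that each worker can return its own dict instead of writing into a shared one: keep the futures in submission order and merge their results in that order. The names fetch_part and collect are hypothetical stand-ins for the question's methods, and the sketch relies on dicts preserving insertion order (Python 3.7+).

```python
import concurrent.futures


def fetch_part(name, payload):
    # Hypothetical worker: returns its own dict rather than mutating shared state.
    return {name: payload}


def collect(parts):
    merged = {}
    with concurrent.futures.ThreadPoolExecutor() as executor:
        # The list keeps the futures in submission order, whatever order they finish in.
        futures = [executor.submit(fetch_part, name, payload) for name, payload in parts]
        for future in futures:
            # result() blocks until this particular future is done,
            # so the merge order does not depend on thread timing.
            merged.update(future.result())
    return merged


parts = [("details", 1), ("audiolink", 2), ("videolink", 3), ("lyrics", 4)]
result = collect(parts)
print(list(result))
```

The work still runs concurrently; only the merge step is serialized, which is usually cheap compared with the fetching itself.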

I found Cliff Kerr's answer to this question helpful. I just added app.config['JSON_SORT_KEYS'] = False to my code, and now Flask no longer sorts the keys alphabetically and keeps the dict in its original order.
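The effect of that setting can be illustrated with the standard library alone. This is only a sketch: json.dumps with its sort_keys argument is used as a stand-in for what Flask does when it serializes a response, and the sample dict is made up.

```python
import json

# Made-up sample data; keys are inserted in a non-alphabetical order.
song = {"title": "Example", "artist": "Someone", "lyrics": "...", "audio": "a.mp3"}

sorted_output = json.dumps(song, sort_keys=True)     # keys re-ordered alphabetically
unsorted_output = json.dumps(song, sort_keys=False)  # insertion order preserved

print(sorted_output)
print(unsorted_output)
```

As far as I know, newer Flask releases replaced the JSON_SORT_KEYS config key with app.json.sort_keys = False, so it is worth checking the documentation for the installed version.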

Related

Converting a QTreeWidget to a nested dictionary in PyQt

The reason I want to do this is to allow the user to create a file tree using a QTreeWidget; I then want to extract that tree into a nested dict structure and save it somehow. I've thought about using a txt file and an eval approach to load all the saved schemes into an array, or into another dict where the key is the name of the scheme, so the user can then select a scheme or edit it if they wish. This naturally means I then have to convert that saved dict back into a QTreeWidget after the user has selected edit.
For now though here's my problem.
I've been able to successfully navigate the QTreeWidget using a recursive function. What I struggle with is the logic behind creating the nested dict.
Below is what I have come up with so far:
def tree_to_dict(self, parent, key):
    for i in range(parent.childCount()):
        cur_par = parent.child(i)
        if cur_par.childCount() == 0:
            try:
                if type(self.scheme[key]) is dict:
                    self.scheme[key][cur_par.text(0)] = 'E'
            except KeyError:
                key = cur_par.text(0)
                self.scheme[key] = 'E'
        else:
            key = cur_par.text(0)
            self.scheme[key] = {}
            self.tree_to_dict(cur_par, key)
I know this is wrong. It's why I need help.
The above code generates the following dict from the following QTreeWidget:
a
b
    a
c
{'a':'E', 'b':{'a':'E', 'c':'E'}}
But it should be:
{'a':'E', 'b':{'a':'E'}, 'c':'E'}
The E simply means that there will be no further subdirectories.
I've seen some other implementations of this but they're horribly confusing and I don't quite get their logic. This is a near duplicate of the question I'm asking here, but it's yet to be answered. I've tried adapting his implementation but it's (to me anyway) convoluted and hard to fit into the structure of my program.
Your implementation is probably more complex than required.
Since each item is the key, you need to iterate recursively and return the values for that key.
If the item has no children, the function returns 'E'; otherwise it calls itself again with each child, and so on.
The function doesn't need the key argument, as it will be created by the recursive call.
def tree_to_dict(parent):
    childCount = parent.childCount()
    if not childCount:
        return 'E'
    content = {}
    for row in range(childCount):
        child = parent.child(row)
        content[child.text(0)] = tree_to_dict(child)
    return content
Then, just call the function using the invisibleRootItem().
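To try the recursion without a running Qt application, here is a small sketch that uses a hypothetical FakeItem class exposing only the three QTreeWidgetItem methods the function relies on (childCount, child, text). The class and the sample tree are illustrations, not part of PyQt.

```python
class FakeItem:
    """Hypothetical stand-in for QTreeWidgetItem (only the methods used below)."""

    def __init__(self, label, children=()):
        self._label = label
        self._children = list(children)

    def childCount(self):
        return len(self._children)

    def child(self, row):
        return self._children[row]

    def text(self, column):
        return self._label


def tree_to_dict(parent):
    child_count = parent.childCount()
    if not child_count:
        return 'E'  # leaf: no further subdirectories
    content = {}
    for row in range(child_count):
        child = parent.child(row)
        content[child.text(0)] = tree_to_dict(child)
    return content


# Same shape as the tree in the question: a, b (containing a), c.
root = FakeItem('<root>', [FakeItem('a'), FakeItem('b', [FakeItem('a')]), FakeItem('c')])
scheme = tree_to_dict(root)
print(scheme)
```

With a real widget the call would be tree_to_dict(tree.invisibleRootItem()), where tree is assumed to be the QTreeWidget instance.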

append to request.sessions[list] in Django

Something is bugging me.
I'm following along with this beginner tutorial for django (cs50) and at some point we receive a string back from a form submission and want to add it to a list:
https://www.youtube.com/watch?v=w8q0C-C1js4&list=PLhQjrBD2T380xvFSUmToMMzERZ3qB5Ueu&t=5777s
def add(request):
    if 'tasklist' not in request.session:
        request.session['tasklist'] = []
    if request.method == 'POST':
        form_data = NewTaskForm(request.POST)
        if form_data.is_valid():
            task = form_data.cleaned_data['task']
            request.session['tasklist'] += [task]
            return HttpResponseRedirect(reverse('tasks:index'))
I've checked the type of request.session['tasklist'] and Python shows it's a list.
The task variable is a string.
So why doesn't request.session['tasklist'].append(task) work properly? I can see it being added to the list via some print statements but then it is 'forgotten again' - it doesn't seem to be permanently added to the tasklist.
Why do we use this request.session['tasklist'] += [task] instead?
The only thing I could find is https://ogirardot.wordpress.com/2010/09/17/append-objects-in-request-session-in-django/ but that refers to a site that no longer exists.
The code works fine, but I'm trying to understand why you need to use a different operation and can't / shouldn't use the append method.
Thanks.
The reason why it does not work is because django does not see that you have changed anything in the session by using the append() method on a list that is in the session.
What you are doing here is essentially pulling out the reference to the list and making changes to it without the session backend knowing anything about it. Another way to explain it:
The append() method is on the list itself not on the session object
When you call append() on the list you are only talking to the list and the list's parent (the session) has no idea what you guys are doing
When you however do an assignment on the session itself session['whatever'] = 'something' then it knows that something is up and changes are made
So the key here is that you need to operate on the session object directly if you want your changes to be updated automatically
Django only thinks it needs to save a changed session item if the item got reassigned to the session. See the __setitem__ method in Django's session base code, which contains a self.modified = True statement.
The session['list'] += [new_element] adds a new list item (mutates the list stored in the session, so the list reference stays the same) and then gets it reassigned to the session again -> thus triggering first a __getitem__ call -> then your += / __iadd__ runs on the value read -> then a __setitem__ call is made (with the list ref. passed to it). You can see it in the django codebase that it marks the session after each __setitem__ call as modified.
Doing the same with session['list'] = session['list'] + [new_item] creates a new list every time it runs, so it's a bit less efficient, but you should not store hundreds of items in the session anyway, so you're probably fine. It also triggers __setitem__, exactly as above.
However, if you use sub-keys in the session, like session['list']['x'] = 'whatever', the session will not see itself as modified, so you need to mark it yourself by setting request.session.modified = True.
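The behaviour described above can be sketched without Django. FakeSession below is a hypothetical dict subclass that mimics only the one detail discussed here, namely a modified flag set inside __setitem__; it is not Django's session class.

```python
class FakeSession(dict):
    """Hypothetical stand-in for a session: flags itself on item assignment."""

    def __init__(self):
        super().__init__()
        self.modified = False

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        self.modified = True


session = FakeSession()
session['tasklist'] = []
session.modified = False            # pretend the session has just been saved

session['tasklist'].append('a')     # only the list is touched
after_append = session.modified     # the session never noticed

session['tasklist'] += ['b']        # __getitem__, list.__iadd__, then __setitem__
after_iadd = session.modified       # the final __setitem__ set the flag

session.modified = False
session['tasklist'].append('c')
session.modified = True             # manual flag, as with request.session.modified = True

print(after_append, after_iadd, session['tasklist'])
```

So append() followed by an explicit request.session.modified = True is an alternative to the += form when mutating in place reads better.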
Short answer: It's about how Python chooses to implement the dict data structure.
Long answer:
Let's start by saying that request.session is a dictionary.
Quoting Django's documentation, "By default, Django only saves to the session database when the session has been modified – that is if any of its dictionary values have been assigned or deleted".
So, the problem is that the session database is not being modified by
request.session['tasklist'].append(task)
Looking at the relevant part of Django's session base code (as posted by Csaba K. in another answer), self.modified is set to True when the __setitem__ dunder method is called.
Now, at this step the problem seems to be that __setitem__ is not called by request.session['tasklist'].append(task), whereas it is called by request.session['tasklist'] += [task]. This is not a matter of whether the reference held in request.session['tasklist'] changes, as another answer points out, because the reference to the underlying list stays the same.
To confirm, let's create a custom dictionary that extends Python's dict and prints something whenever __setitem__ is called.
class MyDict(dict):
    def __init__(self, globalVar):
        super().__init__()
        self.globalVar = globalVar

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        print("Called Set item when: ", end="")

myDict = MyDict(0)
print("Creating Dict")
print("-----")
myDict["y"] = []
print("Adding a new key-value pair")
print("-----")
myDict["y"] += ["x"]
print(" using +=")
print("-----")
myDict["y"].append("x")
print("append")
print("-----")
myDict["y"].extend(["x"])
print("extend")
print("-----")
myDict["y"] = myDict["y"] + ["x"]
print(" using +")
print("-----")
It prints:
Creating Dict
-----
Called Set item when: Adding a new key-value pair
-----
Called Set item when: using +=
-----
append
-----
extend
-----
Called Set item when: using +
-----
As we can see, __setitem__ is called (and in turn self.modified would be set to True) only when adding a new key-value pair, using += or using +, but not when creating the dict, appending, or extending an iterable (in this case a list). Now, the operators + and += do very different things in Python, as explained in the other answer. On the list itself += behaves more like the append method, but in this case I guess it's more about how Python implements item assignment on a dict than about how +, += and append behave on lists: the augmented assignment d[key] += value always ends by storing the result back through d.__setitem__(key, result).
I found this while doing some more searching:
https://code.djangoproject.com/wiki/NewbieMistakes
Scroll to 'Appending to a list in session doesn't work'
Again, it is a very dated entry but still seems to hold true.
Not completely satisfied because this does not answer the question as to 'why' this doesn't work, but at the very least confirms 'something's up' and you should probably still use the recommendations there.
(if anyone out there can actually explain this in a more verbose manner then I'd be happy to hear it)

I want to use a variable globally in views.py

views.py
def getBusRouteId(strSrch):
    end_point = "----API url----"
    parameters = "?ServiceKey=" + "----my servicekey----"
    parameters += "&strSrch=" + strSrch
    url = end_point + parameters
    retData = get_request_url(url)
    asd = xmltodict.parse(retData)
    json_type = json.dumps(asd)
    data = json.loads(json_type)
    if (data == None):
        return None
    else:
        return data

def show_list(request):
    Nm_list = []
    dictData_1 = getBusRouteId("110")
    for i in range(len(dictData_1['ServiceResult']['msgBody']['itemList'])):
        Nm_list.append(dictData_1['ServiceResult']['msgBody']['itemList'][i]['busRouteNm'])
    return render(request, 'list.html', {'Nm_list': Nm_list})
The API returns data that I convert to a dict.
In getBusRouteId, the XML data is parsed and stored as a dict.
In show_list, I call getBusRouteId, so dictData_1 holds that dict.
I want to refer to dictData_1 in another function.
Is there any way to use dictData_1 globally?
Either store those data in a session (if those are short-lived data) or in the database (if you want to persist them).
The point is that a WSGI app is typically deployed as a pool of long-running processes, with a "supervisor" process that will dispatch incoming HTTP requests to the first available process (or to a newly spawned one etc), so using process-wide globals to store per-user data does NOT work as you always end up with user A getting data from user B, or no data at all, etc.
NB: this kind of issue may not appear when testing with a single user on the dev server, but it's still GUARANTEED to break in production.
Also, totally unrelated but:
1/ this bit seems totally useless - you serialize a dict to JSON and then deserialize it back to a dict, which, unless you have custom serialization / deserialization hooks (which is not the case here), is functionally a no-op.
json_type = json.dumps(asd)
data = json.loads(json_type)
2/ Here:
end_point = "----API url----"
parameters = "?ServiceKey=" + "----my servicekey----"
parameters += "&strSrch=" + strSrch
url = end_point + parameters
retData = get_request_url(url)
I don't know how get_request_url is implemented but if you are using python-requests, it already knows how to turn a dict into a (properly encoded) querystring. And if you're using the standard urllib packages, they ALSO provide a way to turn a dict into a properly built querystring. This makes for more robust AND more maintainable code.
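As an illustration of the standard-library route, here is a minimal sketch using urllib.parse.urlencode. The build_url helper, the example.com endpoint and the sample key are made up for the example; only the parameter names come from the question.

```python
from urllib.parse import urlencode


def build_url(end_point, service_key, search):
    # urlencode percent-escapes keys and values and joins the pairs with '&'.
    query = urlencode({"ServiceKey": service_key, "strSrch": search})
    return end_point + "?" + query


# A key containing characters that must be escaped in a query string.
url = build_url("https://example.com/api", "my key/with+symbols", "110")
print(url)
```

With python-requests the same dict can be passed directly, as in requests.get(end_point, params={...}), and the library builds the query string itself.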
3/ you may want to learn how to properly use Python's for loops
Here:
Nm_list = []
dictData_1 = getBusRouteId("110")
for i in range(len(dictData_1['ServiceResult']['msgBody']['itemList'])):
    Nm_list.append(dictData_1['ServiceResult']['msgBody']['itemList'][i]['busRouteNm'])
Python's for loop naturally iterates over the sequence, yielding one item from the sequence on each iteration. So the proper way to write this is:
Nm_list = []
for item in dictData_1['ServiceResult']['msgBody']['itemList']:
    Nm_list.append(item['busRouteNm'])
which is both much more readable AND much more efficient.
Also, this can be further improved using list comprehension:
# intermediate var for readability
source = dictData_1['ServiceResult']['msgBody']['itemList']
Nm_list = [item['busRouteNm'] for item in source]
which is usually faster still (the comprehension builds the list without looking up and calling Nm_list.append on every iteration).
4/ this:
if (data == None):
    return None
else:
    return data
is a very convoluted way to write:
return data
(also note that since None is a singleton, the preferred way is to use the identity test operator is, ie if data is None - same result but more idiomatic).
I understood that you want to perform some operations on dict_data returned by getBusRouteId() and pass them to another function.
SOLUTION - Just passing the dict_data as an argument to another function will work. No need to make global variables.
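A minimal sketch of that idea, assuming the response has the nesting shown in the question. The function name extract_route_names and the sample data are made up for illustration.

```python
def extract_route_names(data):
    # Works on whatever dict it is handed; no module-level state is involved.
    items = data['ServiceResult']['msgBody']['itemList']
    return [item['busRouteNm'] for item in items]


# Made-up sample with the same nesting as the question's API response.
sample = {'ServiceResult': {'msgBody': {'itemList': [
    {'busRouteNm': '110A'},
    {'busRouteNm': '110B'},
]}}}

names = extract_route_names(sample)
print(names)
```

In the view, the dict returned by getBusRouteId("110") would be passed in place of sample.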

Avoid extra line for attribute check?

I am developing this Python project where I encounter a situation many times and I wondered if there is a better way.
There is a list of class instances. Some parts of the list are empty (filled with None).
Here is an example list.
ins_list = [ins_1, ins_2, None, ins_3, None]
I have to do some checks throughout the program flow. There are points where I need to check an attribute of these instances. But only an index is given for choosing an instance from the list, and it may point to one of the empty elements, which raises an error when the attribute is accessed. Here is an example program flow.
ind = 2
if ins_list[ind].some_attribute == "thing":
    pass  # This raises an error when an empty element is selected.
I deal with this by using,
if ins_list[ind]:
    if ins_list[ind].some_attribute == "thing":
        pass  # This works
I am okay with using this. However, the program is a long one and I apply this hundreds of times, which means I am producing redundant code and increasing the indentation level for no reason. Is there an easier, better way of doing this?
Use the boolean operator and.
if ins_list[ind] and ins_list[ind].some_attribute == "thing":
    pass  # Code
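Another single-line form, offered as a sketch rather than something from the answers here, is the built-in getattr with a default value. The Thing class and has_value helper are hypothetical stand-ins for the question's instances.

```python
class Thing:
    # Hypothetical class standing in for the question's instances.
    def __init__(self, some_attribute):
        self.some_attribute = some_attribute


ins_list = [Thing("thing"), Thing("other"), None]


def has_value(ind, expected="thing"):
    # getattr returns the default (None) when the element has no such attribute,
    # which is the case for the None placeholders.
    return getattr(ins_list[ind], "some_attribute", None) == expected


print(has_value(0), has_value(1), has_value(2))
```

This only reads well when the default can never equal the value being tested for.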
As coder proposed, you can remove None from your list, or use dictionaries instead, to avoid having to create an entry for each index.
I want to propose another way: you can create a dummy class and replace None with an instance of it. This way there will be no error if you set an attribute:
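class dummy:
    def __bool__(self):          # truth test hook in Python 3
        return False
    __nonzero__ = __bool__       # Python 2 name for the same hook
    def __setattr__(self, k, v):
        return                   # silently ignore attribute assignment

mydummy = dummy()
mylist = [ins_1, ins_2, mydummy, ins_3, mydummy]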
Nothing will be stored on the dummy instances when setting an attribute.
edit:
If the content of the original list cannot be chosen, then this class could help:
class PickyList(list):
    def __init__(self, iterable, dummyval):
        self.dummy = dummyval
        return super(PickyList, self).__init__(iterable)

    def __getitem__(self, k):
        v = super(PickyList, self).__getitem__(k)
        return (self.dummy if v is None else v)

mylist = PickyList(ins_list, mydummy)
There are these two options:
Using a dictionary:
Another way would be to use a dictionary instead. So you could create your dictionary once the list is filled up with elements. The dictionary's keys would be the values of your list and as values you could use the attributes of the elements that are not None and "No_attr" for those that are None. (Note: Have in mind that python dictionaries don't support duplicate keys and that's why I propose below to store as keys your list indexes else you will have to find a way to make keys be different)
For example for a list like:
l = [item1,item2,None,item4]
You could create a dictionary:
d = {item1:"thing1", item2:"thing2", None:"No_attr", item4:"thing4"}
So in this way every time you would need to make a check, you wouldn't have to check two conditions, but you could check only the value, such as:
if list(d.values())[your_index]=="thing":
The only downside of this method is that standard Python dictionaries were inherently unordered before Python 3.7 (insertion order is guaranteed only from 3.7 on), which makes accessing dictionary values by index a bit dangerous sometimes - you have to be careful not to change the arrangement of the dictionary.
Now, if you want to make sure that the index stays stable, then you would have to store it some way, for example select as keys of your dictionary the indexes, as you will have already stored the attributes of the items - But that is something that you will have to decide and depends strongly on the architecture of your project.
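A minimal sketch of the index-keyed variant suggested above. The Thing class, attr_by_index and is_thing are hypothetical names, and the lookup has to be rebuilt whenever the list or the attributes change.

```python
class Thing:
    # Hypothetical class standing in for the question's instances.
    def __init__(self, some_attribute):
        self.some_attribute = some_attribute


ins_list = [Thing("thing"), Thing("other"), None, Thing("thing"), None]

# Map list index -> attribute value, skipping the empty slots.
attr_by_index = {i: ins.some_attribute
                 for i, ins in enumerate(ins_list) if ins is not None}


def is_thing(ind):
    # dict.get returns None for indexes that held an empty slot.
    return attr_by_index.get(ind) == "thing"


print(is_thing(0), is_thing(2), is_thing(3))
```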
Using a list:
With a list, I don't think there is a way to avoid your if statement, and it is not bad actually. You could use an and operator, as already mentioned in another answer, but I don't think that makes any difference.
Also, if you want to use your first approach:
if ins_list[ind].some_attribute == "thing":
You could try using an exception catcher like this:
try:
    if ins_list[ind].some_attribute == "thing":
        pass  # do something
except:
    # an error occurred
    pass
In this case I would use a try-except statement because of EAFP (easier to ask for forgiveness than permission). It won't shorten your code, but it's a more Pythonic way to check for valid attributes. This way you won't violate DRY (Don't Repeat Yourself) either.
try:
    if ins_list[ind].some_attribute == "thing":
        pass  # do_something()
except AttributeError:
    pass  # do_something_else()

exception handling with NameError

I want to append new input to list SESSION_U without erasing its content. I try this:
...
try:
    SESSION_U.append(UNIQUES)
except NameError:
    SESSION_U = []
    SESSION_U.append(UNIQUES)
...
I would think that at first try I would get the NameError and SESSION_U list would be created and appended; the second time try would work. But it does not. Do you know why? If this is not clear let me know and I will post the script. Thanks.
Edit
# save string s submitted from form to list K:
K = []
s = self.request.get('sentence')
K.append(s)
# clean up K and create 2 new lists with unique items only and find their frequency
K = K[0].split('\r\n')
UNIQUES = f2(K)
COUNTS = lcount(K, UNIQUES)
# append UNIQUES and COUNTS TO session lists.
# Session lists should not be initialized with each new submission
SESSION_U.append(UNIQUES)
SESSION_C.append(COUNTS)
If I put SESSION_U and SESSION_C after K = [], their content is erased with each submission; if not, I get a NameError. I am looking for help with the standard way to handle this situation. Thank you. (I am working with Google App Engine.)
It appears that the code you posted is probably contained within a request handler. What are your requirements regarding this SESSION_U list? Clearly you want it to be preserved across requests, but there are several ways to do this and the best choice depends on your requirements.
I suspect you want to store SESSION_U in the datastore. You will need to use a transaction to atomically update the list (since multiple requests may try to simultaneously update it). Storing SESSION_U in the datastore makes it durable (i.e., it will persist across requests).
Alternatively, you could use memcache if you aren't worried about losing the list periodically. You could even store the list in a global variable (due to app caching, it will be maintained between requests to a particular instance and will be lost when the instance terminates).
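A plausible explanation for why the try/except attempt never accumulates anything, assuming the code runs inside a function such as a request handler method: assigning to the name inside the function makes it local, so every call starts without it. The sketch below contrasts that with a module-level list; both handler functions are hypothetical, and the per-instance lifetime caveat above still applies to the global.

```python
def handler_with_local(uniques):
    # Assigning SESSION_LOCAL anywhere in this function makes it a local name,
    # so the first access raises UnboundLocalError (a subclass of NameError)
    # on every call, and a fresh list is built each time.
    try:
        SESSION_LOCAL.append(uniques)
    except NameError:
        SESSION_LOCAL = []
        SESSION_LOCAL.append(uniques)
    return SESSION_LOCAL


SESSION_U = []  # module-level: created once per process


def handler_with_global(uniques):
    # The list is only mutated, never rebound, so no `global` statement is needed.
    SESSION_U.append(uniques)
    return SESSION_U


handler_with_local("a")
local_result = handler_with_local("b")

handler_with_global("a")
global_result = handler_with_global("b")

print(local_result, global_result)
```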
