Django: Aggregation does not group by an integer - python

I want to count the occurrences of each value of an integer field, as I would in SQL with
select q001_r,count(q001_r) from ProjectData group by q001_r;
The result should be a dictionary telling me that I have, e.g., 100 entries with a value of q001_r == 1.
I have a really simple model:
class ProjectData(models.Model):
    """NoSQL flat file of the variables"""
    caseid = models.PositiveIntegerField(unique=True)
    ...
    q001_r = models.PositiveIntegerField()
and this is my aggregation function in the view:
countInfosq1 = ProjectData.objects.values('q001_r').annotate(q1count=Count('q001_r'))
The result is not what I expect. Instead I get:
[{'q1count': 1, 'q001_r': 1}, {'q1count': 1, 'q001_r': 3}, {'q1count': 1, 'q001_r': 1}, {'q1count': 1, 'q001_r': 3}, {'q1count': 1, 'q001_r': 2}, {'q1count': 1, 'q001_r': 2}, {'q1count': 1, 'q001_r': 2}, {'q1count': 1, 'q001_r': 1}, {'q1count': 1, 'q001_r': 3}]
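To make the expected result concrete, here is a minimal plain-Python sketch of the grouping the SQL query describes. It uses the q001_r values from the output above and collections.Counter; it does not touch the Django ORM, and the variable names are only illustrative.

```python
from collections import Counter

# The q001_r values taken from the result shown above, one per row.
q001_r_values = [1, 3, 1, 3, 2, 2, 2, 1, 3]

# Counter groups equal values and counts them, like GROUP BY with COUNT.
counts = dict(Counter(q001_r_values))
print(counts)
```

On the Django side, one row per record is what you would see if an extra column ends up in the GROUP BY, for example through a default ordering in the model's Meta. Adding an empty .order_by() to the queryset is a commonly suggested thing to try. Whether that applies here is an assumption, since the rest of the model is not shown.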

Related

Is it possible to check if a key is in another key and if it is add the values of that key into the other key

Given a dictionary whose values are sets, is it possible to check whether a key appears inside the set of any other key and, if it does, add the values of that key to the other key's set?
my_dict[key] = set()
For example:
{'0': {'3', '1', '0'}, '3': {'3', '1', '4'}, '1': {'2', '1', '0'}, '5': {'3', '5'}}
Since key '0' is in the set for key '1' ({'2', '1', '0'}), the values of key '0' get added to key '1':
{0: {0, 1, 2, 3, 4}, 1: {0, 1, 2, 3}, 3: {0, 1, 2, 3, 4}, 5: {1, 3, 4, 5}}
I've tried putting the keys into a list and then checking whether the values in the list are in any of the sets for the keys.
You can iterate through each set and use dict.get to extract items from a key if it exists, or fall back to the item itself:
d = {0: {3, 1, 0}, 3: {3, 1, 4}, 1: {2, 1, 0}, 5: {3, 5}}
print({k: {c for i in s for c in d.get(i, [i])} for k, [*s] in d.items()})
This outputs, in no particular order (since sets are unordered in Python):
{0: {0, 1, 2, 3, 4}, 3: {0, 1, 2, 3, 4}, 1: {0, 1, 2, 3}, 5: {1, 3, 4, 5}}
Demo: https://replit.com/#blhsing/LinedUnnaturalApi
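The same idea can be written as an explicit loop, which may be easier to follow than the nested comprehension. This is only a sketch: the function name expand_one_level is made up for illustration, and it expands a single level, exactly like the one-liner above.

```python
def expand_one_level(d):
    """Return a new dict in which every set member that is also a key
    is replaced by that key's set; other members are kept as they are."""
    result = {}
    for key, members in d.items():
        merged = set()
        for member in members:
            # If the member is itself a key, pull in its set;
            # otherwise fall back to the member itself.
            merged |= d.get(member, {member})
        result[key] = merged
    return result


d = {0: {3, 1, 0}, 3: {3, 1, 4}, 1: {2, 1, 0}, 5: {3, 5}}
expanded = expand_one_level(d)
print(expanded)
```

Building a new dictionary instead of updating d in place keeps the result independent of the order in which the keys are visited.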

pandas - pd.replace and TypeError

I have all_data dataframe. I want to replace some categorical values in certain columns with numerical values. I'm trying to use this nested dictionary notation (I've checked that the brackets and curly brackets are in place, I don't think that's the issue):
all_data = all_data.replace({'Street': {'Pave': 1, 'Grvl': 0}},
{'LotShape': {'IR3': 1, 'IR2': 2, 'IR1': 3, 'Reg': 4}},
{'Utilities': {'ELO': 0, 'NoSeWa': 0, 'NoSewr': 0, 'AllPub': 1}},
{'LandSlope': {'Sev': 1, 'Mod': 2, 'Gtl': 3}},
{'ExterQual': {'Po': 1, 'Fa': 2, 'TA': 3, 'Gd': 4, 'Ex': 5}},
{'ExterCond': {'Po': 1, 'Fa': 2, 'TA': 3, 'Gd': 4, 'Ex': 5}},
{'BsmtQual': {'NA': 0, 'Po': 1, 'Fa': 2, 'TA': 3, 'Gd': 4,'Ex': 5}},
{'BsmtCond': {'NA': 0, 'Po': 1, 'Fa': 2, 'TA': 3, 'Gd': 4,'Ex': 5}},
{'BsmtExposure': {'NA': 0, 'No': 1, 'Mn': 2, 'Av': 3, 'Gd': 4}},
{'BsmtFinType1': {'NA': 0, 'Unf': 1, 'LwQ': 2, 'Rec': 3, 'BLQ': 4, 'ALQ': 5, 'GLQ': 6}},
{'BsmtFinType2': {'NA': 0, 'Unf': 1,'LwQ': 2,'Rec': 3, 'BLQ': 4,'ALQ': 5, 'GLQ': 6}},
{'HeatingQC': {'Po': 1,'Fa': 2,'TA': 3,'Gd': 4,'Ex': 5}},
{'CentralAir': {'No': 0,'Yes': 1}},
{'KitchenQual': {'Po': 1,'Fa': 2,'TA': 3,'Gd': 4,'Ex': 5}},
{'Functional': {'Sal': -7,'Sev': -6,'Maj1': -5,'Maj2': -4,'Mod': -3,'Min2': -2,'Min1': -1,
'Typ': 0}},
{'FireplaceQu': {'NA': 0,'Po': 1,'Fa': 2,'TA': 3,'Gd': 4,'Ex': 5}},
{'GarageFinish': {'NA': 0,'Unf': 1,'RFn': 2, 'Fin': 3}},
{'GarageQual': {'NA': 0, 'Po': 1,'Fa': 2, 'TA': 3,'Gd': 4, 'Ex': 5}},
{'GarageCond': {'NA': 0,'Po': 1,'Fa': 2,'TA': 3,'Gd': 4,'Ex': 5}},
{'PavedDrive': {'N': 0,'P': 0, 'Y': 1}},
{'Fence': {'NA': 0, 'MnWw': 1,'GdWo': 2,'MnPrv': 3,'GdPrv': 4}},
{'SaleCondition': {'Abnorml': 1, 'Alloca': 1, 'AdjLand': 1, 'Family': 1, 'Normal': 0,
'Partial': 0}}
)
Error:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-40-f9c9c28b7237> in <module>()
22 {'Fence': {'NA': 0, 'MnWw': 1,'GdWo': 2,'MnPrv': 3,'GdPrv': 4}},
23 {'SaleCondition': {'Abnorml': 1, 'Alloca': 1, 'AdjLand': 1, 'Family': 1, 'Normal': 0,
---> 24 'Partial': 0}}
25 )
TypeError: replace() takes from 1 to 8 positional arguments but 23 were given
If I remove the 'SaleCondition' row from the above code, the error is again there but this time referring to 'Fence', and so on, for each line of code from bottom up. I've googled but have no idea what this means. Help MUCH appreciated.
You should do something like this:
df.replace({'Fence':{'NA': 0, 'MnWw': 1,'GdWo': 2,'MnPrv': 3,'GdPrv': 4},'SaleCondition':{'Abnorml': 1, 'Alloca': 1, 'AdjLand': 1, 'Family': 1, 'Normal': 0,
'Partial': 0}})
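The format should be .replace({'col1':{},'col2':{}}), not .replace({'col1':{}},{'col2':{}}). With the second form, every dictionary after the first is passed as a further positional argument of replace(), which is why the error complains about the number of arguments.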
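If the per-column dictionaries already exist as separate objects, they can be merged into the single nested dictionary before calling replace. The sketch below is plain Python and uses three of the columns from the question. The names per_column and mapping are illustrative, and the pandas call is shown only as a comment.

```python
# Per-column mappings, as they were passed (incorrectly) as separate arguments.
per_column = [
    {'Street': {'Pave': 1, 'Grvl': 0}},
    {'CentralAir': {'No': 0, 'Yes': 1}},
    {'PavedDrive': {'N': 0, 'P': 0, 'Y': 1}},
]

# Merge them into one nested dictionary of the form {column: {old: new}}.
mapping = {}
for column_map in per_column:
    mapping.update(column_map)

print(mapping)

# The merged dictionary is then passed as ONE argument:
# all_data = all_data.replace(mapping)
```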

dictionary update sequence element #0 has length 3; 2 is required when updating with a tuple

I have a problem with a dictionary that I want to split into two others.
dico={'GA1': {'main': 1, 'middle': 1, 'sub': 1},
'GA2': {'main': 1, 'middle': 1, 'sub': 2},
'GA3': {'main': 1, 'middle': 1, 'sub': 3},
'GA4': {'main': 1, 'middle': 1, 'sub': 4},
'GA5': {'main': 1, 'middle': 1, 'sub': 5},
'GA6': {'main': 1, 'middle': 1, 'sub': 6},
'GA7': {'main': 1, 'middle': 1, 'sub': 7},
'GA8': {'main': 1, 'middle': 1, 'sub': 8},
'GA9': {'main': 1, 'middle': 1, 'sub': 9},
'GA10': {'main': 1, 'middle': 1, 'sub': 10}}
I want to put GA2 and GA6 to GA10 in a dictionary d1 and GA1 and GA3 to GA5 in a dictionary d2.
When I transform it into a list, I end up with tuples like
list(dico.items())[0]
which gives ('GA1', {'main': 1, 'middle': 1, 'sub': 1})
When I want to set this into my new dictionary,
d2 = {}
d2.update(list(dico.items())[0])
I end up with "builtins.ValueError: dictionary update sequence element #0 has length 3; 2 is required".
Is a dictionary an invalid type for a tuple element?
Thanks for your help
Alexandre
Did you mean this?
d2.update([list(dico.items())[0]])
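A dictionary can be initialised or updated from a list of (key, value) tuples. You were providing only a single tuple, not inside a list, so update() tried to read each element of the tuple as a pair. Wrap the tuple in [] to make a one-element list and pass that. The result is a dictionary with a single entry, for example: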
{'GA10': {'middle': 1, 'main': 1, 'sub': 10}}
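The "length 3" in the error message comes from the key itself: update() iterates over the tuple, and the first element it sees is the three-character string 'GA1'. A minimal sketch that reproduces the error and then applies the fix (the variable names pair and message are illustrative):

```python
pair = ('GA1', {'main': 1, 'middle': 1, 'sub': 1})

d2 = {}
try:
    # update() iterates over the tuple and expects every element to be a
    # (key, value) pair; element #0 is the 3-character string 'GA1'.
    d2.update(pair)
except ValueError as exc:
    message = str(exc)
    print(message)

# Wrapping the pair in a list makes it a sequence of one (key, value) pair.
d2.update([pair])
print(d2)
```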
Also, doing list(dico.items()) and then taking the 0th element is wasteful. If you can, consider changing your approach to your problem.
d1 = { k:dico[k] for k in ['GA2','GA6','GA10'] }
print (d1)
Output:
{'GA2': {'main': 1, 'middle': 1, 'sub': 2}, 'GA6': {'main': 1, 'middle': 1, 'sub': 6}, 'GA10': {'main': 1, 'middle': 1, 'sub': 10}}
Create a list of the keys you want, then use a dict comprehension. The code below creates a dictionary, d2, with the key-value pairs for GA1, GA8 and GA9. The set intersection silently skips any requested key that is missing from dico.
newkeys = ['GA1', 'GA8', 'GA9']
d2 = {k: dico[k] for k in set(newkeys) & set(dico.keys())}
See "Filter dict to contain only certain keys?" for more info.
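Applied to the split described in the question (GA2 and GA6 to GA10 in d1, GA1 and GA3 to GA5 in d2), the same comprehension idea could look like the sketch below. It rebuilds dico with a comprehension for brevity, and d1_keys is an illustrative name.

```python
# Same content as the dico in the question, built compactly.
dico = {f'GA{i}': {'main': 1, 'middle': 1, 'sub': i} for i in range(1, 11)}

# Keys wanted in d1: GA2 and GA6 to GA10; everything else goes to d2.
d1_keys = {'GA2'} | {f'GA{i}' for i in range(6, 11)}

d1 = {k: v for k, v in dico.items() if k in d1_keys}
d2 = {k: v for k, v in dico.items() if k not in d1_keys}

print(list(d1))
print(list(d2))
```

Testing membership against a set keeps each lookup cheap, and every key of dico lands in exactly one of the two dictionaries.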

Pythonic sort a list of dictionaries in a tricky order

I have a list of ids sorted in the proper order:
ids = [1, 2, 4, 6, 5, 0, 3]
I also have a list of dictionaries, sorted in some random way:
rez = [{'val': 7, 'id': 1}, {'val': 8, 'id': 2}, {'val': 2, 'id': 3}, {'val': 0, 'id': 4}, {'val': -1, 'id': 5}, {'val': -4, 'id': 6}, {'val': 9, 'id': 0}]
My intention is to sort rez list in a way that corresponds to ids:
rez = [{'val': 7, 'id': 1}, {'val': 8, 'id': 2}, {'val': 0, 'id': 4}, {'val': -4, 'id': 6}, {'val': -1, 'id': 5}, {'val': 9, 'id': 0}, {'val': 2, 'id': 3}]
I tried:
rez.sort(key = lambda x: ids.index(x['id']))
However, that is too slow for me, as len(ids) > 150K and each dict actually has a lot of keys (some of the values are strings). Any suggestions on how to do this in the most Pythonic, but still fastest, way?
You don't need to sort because ids specifies the entire ordering of the result. You just need to pick the correct elements by their ids:
rez_dict = {d['id']:d for d in rez}
rez_ordered = [rez_dict[id] for id in ids]
Which gives:
>>> rez_ordered
[{'id': 1, 'val': 7}, {'id': 2, 'val': 8}, {'id': 4, 'val': 0}, {'id': 6, 'val': -4}, {'id': 5, 'val': -1}, {'id': 0, 'val': 9}, {'id': 3, 'val': 2}]
This should be faster than sorting because it can be done in linear time on average, while sorting is O(n log n).
Note that this assumes that there will be one entry per id, as in your example.
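If that assumption does not hold, one possible variant is to group the dictionaries by id first. The sketch below uses made-up data in which id 2 appears twice and some ids have no entry. The names by_id and rez_ordered are illustrative.

```python
from collections import defaultdict

ids = [1, 2, 4, 6, 5, 0, 3]

# Hypothetical data: id 2 appears twice, ids 4, 5 and 6 are missing.
rez = [
    {'val': 8, 'id': 2}, {'val': 2, 'id': 3}, {'val': 7, 'id': 1},
    {'val': 5, 'id': 2}, {'val': 9, 'id': 0},
]

# Group the dictionaries by id, keeping their original relative order.
by_id = defaultdict(list)
for d in rez:
    by_id[d['id']].append(d)

# Walk ids in the desired order; ids without entries contribute nothing.
rez_ordered = [d for id_ in ids for d in by_id.get(id_, [])]
print(rez_ordered)
```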
I think you are on the right track. The slowness comes from ids.index, which scans the list on every call and makes the sort quadratic overall. To speed it up, you can turn the list into a dictionary first, mapping the ids to their respective indices:
indices = {id_: pos for pos, id_ in enumerate(ids)}
rez.sort(key = lambda x: indices[x['id']])
This way, indices is {0: 5, 1: 0, 2: 1, 3: 6, 4: 2, 5: 4, 6: 3}, and rez is
[{'id': 1, 'val': 7},
{'id': 2, 'val': 8},
{'id': 4, 'val': 0},
{'id': 6, 'val': -4},
{'id': 5, 'val': -1},
{'id': 0, 'val': 9},
{'id': 3, 'val': 2}]
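A quick way to convince yourself that the approaches agree is to run them side by side on generated data. The sketch below uses 1000 random entries with a fixed seed. The data and variable names are made up for illustration, and no timings are claimed.

```python
import random

random.seed(0)  # fixed seed so the example is reproducible
n = 1000
ids = list(range(n))
random.shuffle(ids)
rez = [{'id': i, 'val': random.randint(-10, 10)} for i in range(n)]

# Original approach: list.index is a linear scan for every element.
slow = sorted(rez, key=lambda x: ids.index(x['id']))

# Index-dictionary approach: each key lookup is O(1) on average.
indices = {id_: pos for pos, id_ in enumerate(ids)}
fast = sorted(rez, key=lambda x: indices[x['id']])

# Direct lookup approach: no sort at all.
rez_dict = {d['id']: d for d in rez}
picked = [rez_dict[id_] for id_ in ids]

print(slow == fast == picked)
```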

mongodb elemMatch in dual array

I have a problem using $elemMatch on a doubly nested array.
Suppose I have this document:
a = {'cart': [[{'id': 1, 'count': 1}, {'id': 2, 'count': 3}], [{'id': 1, 'count': 5}]]}
I want to select a document when an item has id 1 and a count greater than 2:
db.cart.find_one({'cart.0.id': 1, 'cart.0.count': {'$gt': 2}})
But this query returns a.
Then I have tried these queries:
db.cart.find_one({'cart': {'$elemMatch': {'id': 1, 'count': {'$gt': 2}}}})
db.cart.find_one({'cart': {'$elemMatch': {'id': 2, 'count': {'$gt': 2}}}})
db.cart.find_one({'cart.0': {'$elemMatch': {'id': 1, 'count': {'$gt': 2}}}})
db.cart.find_one({'cart.0': {'$elemMatch': {'id': 2, 'count': {'$gt': 2}}}})
But all return None.
So does $elemMatch support matching in nested arrays? If so, how should I adjust my query?
Since you have an array within an array, I think you could try something like:
db.cart.find_one({'cart': {'$elemMatch': { '$elemMatch' : {'id': 1, 'count': {'$gt': 2}}}}})
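To see what the nested query is meant to express, here is a simplified plain-Python model of the matching logic. It is not MongoDB code and not how the server works internally. The helper names item_matches, nested_elem_match and dotted_match are invented for this sketch.

```python
a = {'cart': [[{'id': 1, 'count': 1}, {'id': 2, 'count': 3}], [{'id': 1, 'count': 5}]]}


def item_matches(item):
    """Both conditions must hold on the SAME item."""
    return item.get('id') == 1 and item.get('count', 0) > 2


def nested_elem_match(doc):
    """Model of {'cart': {'$elemMatch': {'$elemMatch': {...}}}}:
    some inner array contains some item satisfying both conditions."""
    return any(any(item_matches(item) for item in inner) for inner in doc['cart'])


def dotted_match(doc):
    """Model of {'cart.0.id': 1, 'cart.0.count': {'$gt': 2}}: each condition
    may be satisfied by a different item of the first inner array."""
    first = doc['cart'][0]
    return (any(item.get('id') == 1 for item in first)
            and any(item.get('count', 0) > 2 for item in first))


nested_result = nested_elem_match(a)   # True, thanks to {'id': 1, 'count': 5}
dotted_result = dotted_match(a)        # True, but via two different items
same_item_in_first = any(item_matches(item) for item in a['cart'][0])  # False
print(nested_result, dotted_result, same_item_in_first)
```

The difference between the last two results is the reason for wanting $elemMatch: the dotted query accepts the document even though no single item in the first inner array has both id 1 and a count above 2.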
