Given a dictionary, is it possible to check whether a key appears in the set of any other key and, if it does, add that key's values to the other key's set?
my_dict[key] = set()
For example:
{'0': {'3', '1', '0'}, '3': {'3', '1', '4'}, '1': {'2', '1', '0'}, '5': {'3', '5'}}
Since key '0' is in the set of key '1' ({'2', '1', '0'}), the values of key '0' get added to key '1':
{0: {0, 1, 2, 3, 4}, 1: {0, 1, 2, 3}, 3: {0, 1, 2, 3, 4}, 5: {1, 3, 4, 5}}
I've tried putting the keys into a list and then checking whether the values in the list are in any of the keys' sets.
You can iterate through each set and use dict.get to extract items from a key if it exists, or fall back to the item itself:
d = {0: {3, 1, 0}, 3: {3, 1, 4}, 1: {2, 1, 0}, 5: {3, 5}}
print({k: {c for i in s for c in d.get(i, [i])} for k, [*s] in d.items()})
This outputs the following (the elements of each set may print in any order, since sets are unordered in Python):
{0: {0, 1, 2, 3, 4}, 3: {0, 1, 2, 3, 4}, 1: {0, 1, 2, 3}, 5: {1, 3, 4, 5}}
Demo: https://replit.com/#blhsing/LinedUnnaturalApi
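The one-line comprehension above can be hard to read, so here is a minimal sketch of the same logic written as explicit loops. The function name expand is hypothetical, and this assumes a single pass is wanted (it does not keep expanding until nothing changes).

```python
d = {0: {3, 1, 0}, 3: {3, 1, 4}, 1: {2, 1, 0}, 5: {3, 5}}

def expand(d):
    """Return a new dict where each member that is also a key is replaced by that key's set."""
    result = {}
    for key, members in d.items():
        merged = set()
        for item in members:
            # If item is itself a key, pull in its whole set;
            # otherwise fall back to a set holding just the item.
            merged |= d.get(item, {item})
        result[key] = merged
    return result

expanded = expand(d)
print(expanded)
```

Because the result is built in a new dict, the lookups always see the original sets, so the outcome does not depend on the order in which keys are visited.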
I have all_data dataframe. I want to replace some categorical values in certain columns with numerical values. I'm trying to use this nested dictionary notation (I've checked that the brackets and curly brackets are in place, I don't think that's the issue):
all_data = all_data.replace({'Street': {'Pave': 1, 'Grvl': 0}},
{'LotShape': {'IR3': 1, 'IR2': 2, 'IR1': 3, 'Reg': 4}},
{'Utilities': {'ELO': 0, 'NoSeWa': 0, 'NoSewr': 0, 'AllPub': 1}},
{'LandSlope': {'Sev': 1, 'Mod': 2, 'Gtl': 3}},
{'ExterQual': {'Po': 1, 'Fa': 2, 'TA': 3, 'Gd': 4, 'Ex': 5}},
{'ExterCond': {'Po': 1, 'Fa': 2, 'TA': 3, 'Gd': 4, 'Ex': 5}},
{'BsmtQual': {'NA': 0, 'Po': 1, 'Fa': 2, 'TA': 3, 'Gd': 4,'Ex': 5}},
{'BsmtCond': {'NA': 0, 'Po': 1, 'Fa': 2, 'TA': 3, 'Gd': 4,'Ex': 5}},
{'BsmtExposure': {'NA': 0, 'No': 1, 'Mn': 2, 'Av': 3, 'Gd': 4}},
{'BsmtFinType1': {'NA': 0, 'Unf': 1, 'LwQ': 2, 'Rec': 3, 'BLQ': 4, 'ALQ': 5, 'GLQ': 6}},
{'BsmtFinType2': {'NA': 0, 'Unf': 1,'LwQ': 2,'Rec': 3, 'BLQ': 4,'ALQ': 5, 'GLQ': 6}},
{'HeatingQC': {'Po': 1,'Fa': 2,'TA': 3,'Gd': 4,'Ex': 5}},
{'CentralAir': {'No': 0,'Yes': 1}},
{'KitchenQual': {'Po': 1,'Fa': 2,'TA': 3,'Gd': 4,'Ex': 5}},
{'Functional': {'Sal': -7,'Sev': -6,'Maj1': -5,'Maj2': -4,'Mod': -3,'Min2': -2,'Min1': -1,
'Typ': 0}},
{'FireplaceQu': {'NA': 0,'Po': 1,'Fa': 2,'TA': 3,'Gd': 4,'Ex': 5}},
{'GarageFinish': {'NA': 0,'Unf': 1,'RFn': 2, 'Fin': 3}},
{'GarageQual': {'NA': 0, 'Po': 1,'Fa': 2, 'TA': 3,'Gd': 4, 'Ex': 5}},
{'GarageCond': {'NA': 0,'Po': 1,'Fa': 2,'TA': 3,'Gd': 4,'Ex': 5}},
{'PavedDrive': {'N': 0,'P': 0, 'Y': 1}},
{'Fence': {'NA': 0, 'MnWw': 1,'GdWo': 2,'MnPrv': 3,'GdPrv': 4}},
{'SaleCondition': {'Abnorml': 1, 'Alloca': 1, 'AdjLand': 1, 'Family': 1, 'Normal': 0,
'Partial': 0}}
)
Error:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-40-f9c9c28b7237> in <module>()
22 {'Fence': {'NA': 0, 'MnWw': 1,'GdWo': 2,'MnPrv': 3,'GdPrv': 4}},
23 {'SaleCondition': {'Abnorml': 1, 'Alloca': 1, 'AdjLand': 1, 'Family': 1, 'Normal': 0,
---> 24 'Partial': 0}}
25 )
TypeError: replace() takes from 1 to 8 positional arguments but 23 were given
If I remove the 'SaleCondition' row from the above code, the error is again there but this time referring to 'Fence', and so on, for each line of code from bottom up. I've googled but have no idea what this means. Help MUCH appreciated.
You should do something like:
df.replace({'Fence':{'NA': 0, 'MnWw': 1,'GdWo': 2,'MnPrv': 3,'GdPrv': 4},'SaleCondition':{'Abnorml': 1, 'Alloca': 1, 'AdjLand': 1, 'Family': 1, 'Normal': 0,
'Partial': 0}})
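The format should be .replace({'col1': {}, 'col2': {}}), not .replace({'col1': {}}, {'col2': {}}). In the second form each dict is passed as a separate positional argument, which is why replace() complains about receiving 23 of them.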
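If the mappings already exist as separate one-column dicts, they can be merged into the single nested dict before calling replace. A minimal sketch in plain Python, assuming a hypothetical list per_column that holds a shortened version of the mappings from the question; the pandas call is left as a comment so the snippet runs without pandas installed.

```python
# Hypothetical list of one-column mappings, shortened from the question.
per_column = [
    {'Street': {'Pave': 1, 'Grvl': 0}},
    {'LandSlope': {'Sev': 1, 'Mod': 2, 'Gtl': 3}},
    {'CentralAir': {'No': 0, 'Yes': 1}},
]

# Merge them into one {column: {old_value: new_value}} dict.
mapping = {}
for single in per_column:
    mapping.update(single)

print(mapping)

# With pandas, the whole mapping is then passed as ONE argument:
# all_data = all_data.replace(mapping)
```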
I have a problem with a dictionary that I want to split into two others.
dico={'GA1': {'main': 1, 'middle': 1, 'sub': 1},
'GA2': {'main': 1, 'middle': 1, 'sub': 2},
'GA3': {'main': 1, 'middle': 1, 'sub': 3},
'GA4': {'main': 1, 'middle': 1, 'sub': 4},
'GA5': {'main': 1, 'middle': 1, 'sub': 5},
'GA6': {'main': 1, 'middle': 1, 'sub': 6},
'GA7': {'main': 1, 'middle': 1, 'sub': 7},
'GA8': {'main': 1, 'middle': 1, 'sub': 8},
'GA9': {'main': 1, 'middle': 1, 'sub': 9},
'GA10': {'main': 1, 'middle': 1, 'sub': 10}}
I want to put GA2 and GA6 to GA10 in a dictionary d1 and GA1 and GA3 to GA5 in a dictionary d2.
When I convert it into a list, I end up with tuples such as
list(dico.items())[0]
which gives ('GA1', {'main': 1, 'middle': 1, 'sub': 1})
When I want to set this into my new dictionary,
d2 = {}
d2.update(list(dico.items())[0])
I end up with "builtins.ValueError: dictionary update sequence element #0 has length 3; 2 is required"
Is a dictionary an invalid format for a tuple element?
Thanks for your help
Alexandre
Did you mean this?
d2.update([list(dico.items())[0]])
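dict.update accepts a list of (key, value) tuples. You were passing a single tuple that was not inside a list, so update tried to treat its first element, the 3-character string 'GA1', as a key-value pair, hence "has length 3; 2 is required". Wrap the tuple in [] to make a one-element list and pass that. The result is a one-entry dictionary, for example: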
{'GA10': {'middle': 1, 'main': 1, 'sub': 10}}
Also, doing list(dico.items()) and then taking the 0th element is wasteful. If you can, consider changing your approach to your problem.
d1 = { k:dico[k] for k in ['GA2','GA6','GA10'] }
print (d1)
Output:
{'GA2': {'main': 1, 'middle': 1, 'sub': 2}, 'GA6': {'main': 1, 'middle': 1, 'sub': 6}, 'GA10': {'main': 1, 'middle': 1, 'sub': 10}}
Create a list of the keys you want, then use a dict comprehension. The code below creates a dictionary, d2, with the GA1, GA8 and GA9 key-value pairs. Intersecting with dico.keys() silently skips any requested key that is not in dico.
newkeys = ['GA1', 'GA8', 'GA9']
d2 = {k: dico[k] for k in set(newkeys) & set(dico.keys())}
See "Filter dict to contain only certain keys?" for more info.
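Applied to the split the question actually asks for (GA2 and GA6 to GA10 in d1, everything else in d2), the same dict-comprehension idea might look like this. It is only a sketch: dico is rebuilt with a comprehension for brevity, and d1_keys is a hypothetical helper name.

```python
# Rebuild the dictionary from the question.
dico = {f'GA{i}': {'main': 1, 'middle': 1, 'sub': i} for i in range(1, 11)}

# Keys that should go into d1: GA2 and GA6..GA10.
d1_keys = ['GA2'] + [f'GA{i}' for i in range(6, 11)]

d1 = {k: dico[k] for k in d1_keys}
# Everything that did not go into d1 goes into d2 (GA1 and GA3..GA5).
d2 = {k: v for k, v in dico.items() if k not in d1}

print(list(d1))
print(list(d2))
```

Note that d1 and d2 hold references to the same inner dictionaries as dico, not copies.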
I have a list of ids sorted in the proper order:
ids = [1, 2, 4, 6, 5, 0, 3]
I also have a list of dictionaries, sorted in some random way:
rez = [{'val': 7, 'id': 1}, {'val': 8, 'id': 2}, {'val': 2, 'id': 3}, {'val': 0, 'id': 4}, {'val': -1, 'id': 5}, {'val': -4, 'id': 6}, {'val': 9, 'id': 0}]
My intention is to sort rez list in a way that corresponds to ids:
rez = [{'val': 7, 'id': 1}, {'val': 8, 'id': 2}, {'val': 0, 'id': 4}, {'val': -4, 'id': 6}, {'val': -1, 'id': 5}, {'val': 9, 'id': 0}, {'val': 2, 'id': 3}]
I tried:
rez.sort(key = lambda x: ids.index(x['id']))
However, that approach is too slow for me, as len(ids) > 150K and each dict actually has a lot of keys (some of the values are strings). Any suggestions on how to do this in the most Pythonic, but still fastest, way?
You don't need to sort because ids specifies the entire ordering of the result. You just need to pick the correct elements by their ids:
rez_dict = {d['id']:d for d in rez}
rez_ordered = [rez_dict[id] for id in ids]
Which gives:
>>> rez_ordered
[{'id': 1, 'val': 7}, {'id': 2, 'val': 8}, {'id': 4, 'val': 0}, {'id': 6, 'val': -4}, {'id': 5, 'val': -1}, {'id': 0, 'val': 9}, {'id': 3, 'val': 2}]
This should be faster than sorting because it can be done in linear time on average, while sorting is O(n log n).
Note that this assumes that there will be one entry per id, as in your example.
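If that one-entry-per-id assumption might not hold, the lookup can be built with an explicit check. A minimal sketch, where reorder_by_ids is a hypothetical helper name and raising ValueError on a duplicate is just one possible policy:

```python
def reorder_by_ids(rez, ids):
    """Return the dicts of rez in the order given by ids (linear time on average)."""
    rez_dict = {}
    for d in rez:
        if d['id'] in rez_dict:
            # A later duplicate would otherwise silently overwrite the earlier dict.
            raise ValueError(f"duplicate id: {d['id']}")
        rez_dict[d['id']] = d
    # A KeyError here means an id in ids has no matching dict in rez.
    return [rez_dict[i] for i in ids]

ids = [1, 2, 4, 6, 5, 0, 3]
rez = [{'val': 7, 'id': 1}, {'val': 8, 'id': 2}, {'val': 2, 'id': 3},
       {'val': 0, 'id': 4}, {'val': -1, 'id': 5}, {'val': -4, 'id': 6},
       {'val': 9, 'id': 0}]

rez_ordered = reorder_by_ids(rez, ids)
print([d['id'] for d in rez_ordered])
```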
I think you are on the right track. If you need to speed it up because the list is long and the repeated ids.index calls make the sort quadratic, you can turn the list into a dictionary first, mapping the ids to their respective indices.
indices = {id_: pos for pos, id_ in enumerate(ids)}
rez.sort(key = lambda x: indices[x['id']])
This way, indices is {0: 5, 1: 0, 2: 1, 3: 6, 4: 2, 5: 4, 6: 3}, and rez is
[{'id': 1, 'val': 7},
{'id': 2, 'val': 8},
{'id': 4, 'val': 0},
{'id': 6, 'val': -4},
{'id': 5, 'val': -1},
{'id': 0, 'val': 9},
{'id': 3, 'val': 2}]
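One thing the index-map sort can handle that the plain lookup cannot is a dict whose id is missing from ids. A small sketch, assuming (this is not in the question) that such entries should simply go to the end; the id 42 is made-up sample data:

```python
ids = [1, 2, 4, 6, 5, 0, 3]
# Shortened sample data; the entry with id 42 does not appear in ids.
rez = [{'val': 7, 'id': 1}, {'val': 5, 'id': 42},
       {'val': 2, 'id': 3}, {'val': 9, 'id': 0}]

indices = {id_: pos for pos, id_ in enumerate(ids)}

# Unknown ids get position len(ids), which is larger than any real position.
rez.sort(key=lambda x: indices.get(x['id'], len(ids)))

print([d['id'] for d in rez])
```

Since list.sort is stable, several unknown entries would keep their original relative order at the end.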
I have a problem trying to use $elemMatch on a doubly nested array.
Suppose I have this document:
a = {'cart': [[{'id': 1, 'count': 1}, {'id': 2, 'count': 3}], [{'id': 1, 'count': 5}]]}
And I want to select a document when id is 1 and count is greater than 2:
db.cart.find_one({'cart.0.id': 1, 'cart.0.count': {'$gt': 2}})
But this query selects a.
Then I have tried these queries:
db.cart.find_one({'cart': {'$elemMatch': {'id': 1, 'count': {'$gt': 2}}}})
db.cart.find_one({'cart': {'$elemMatch': {'id': 2, 'count': {'$gt': 2}}}})
db.cart.find_one({'cart.0': {'$elemMatch': {'id': 1, 'count': {'$gt': 2}}}})
db.cart.find_one({'cart.0': {'$elemMatch': {'id': 2, 'count': {'$gt': 2}}}})
But all return None.
So does $elemMatch support matching in nested arrays? If so, how should I adjust my query?
Given the fact that you have an array within an array, I think you could try something like
db.cart.find_one({'cart': {'$elemMatch': { '$elemMatch' : {'id': 1, 'count': {'$gt': 2}}}}})
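To make the intent of the nested $elemMatch concrete, here is a plain-Python sketch of the condition it is meant to express: some inner array contains a single element that satisfies both criteria. It does not talk to MongoDB at all, and matches_nested is a hypothetical name used only for illustration.

```python
a = {'cart': [[{'id': 1, 'count': 1}, {'id': 2, 'count': 3}],
              [{'id': 1, 'count': 5}]]}

def matches_nested(doc, id_, min_count):
    """True if any inner array of doc['cart'] has ONE element with the given id and count > min_count."""
    return any(
        any(item.get('id') == id_ and item.get('count', 0) > min_count
            for item in inner)
        for inner in doc['cart']
    )

print(matches_nested(a, 1, 2))
print(matches_nested(a, 2, 3))
```

The outer any corresponds to the first $elemMatch (over the arrays inside cart) and the inner any to the second (over the elements of each inner array). Both conditions are checked against the same element, which is the difference from the dotted 'cart.0.id' and 'cart.0.count' query.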