Python: Remove top 'n' keys from a dictionary - python

I have a dictionary dict. For each key in dict, there is a list, that has two items in it. One is another dictionary, the other is an integer.
dict = {
    'hello' : [
        {
            'blah' : 1,
            'dodo' : 2
        },
        3
    ],
    'world' : [
        {
            'foo' : 7,
            'bar' : 1
        },
        8
    ]
}
I want to sort the dictionary dict on the second item in the list, the integer. And then remove the first 'n' keys from the dictionary. Is there any way to do it? The sorted function works only on lists.
Here is the function I'm trying to do this in.
def create_inverted_index(inverted_index, corpus_tokens, corpus_files):
    for file_tokens in corpus_tokens:
        file_id = corpus_files[file_tokens[0]]
        for token in file_tokens[1]:
            if token in inverted_index.keys():
                inverted_index[token][1] += 1
                if file_id in inverted_index[token][0].keys():
                    inverted_index[token][0][file_id] += 1
                else:
                    inverted_index[token][0][file_id] = 1
            else:
                inverted_index[token] = [{file_id : 1}, 1]

You can do it by doing this:
d = {1: [1, 2], 3: [2,4], 4:[3,3], 2:[4,1], 0:[5,0]} # dict to remove items from
sorted_list = sorted(d.items(), key=lambda x: x[1][1]) # (key, value) pairs ordered by value[1]
sorted_keys = [key for key, _ in sorted_list] # keep only the keys, in sorted order
n = 2 # number of items to remove
for key in sorted_keys[:n]:
    del d[key]
This code copies the dict's items into a list ordered by the second element of each value. It then builds a list of just the sorted keys and deletes the first n of them from the dictionary.
For my value of d and n=3, output is:
{3: [2, 4], 4: [3, 3]}
For n=2:
{1: [1, 2], 3: [2, 4], 4: [3, 3]}
P.S. This might not be the most efficient way of doing it, but it does the job.
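For the structure in the question (each value is a list of a dict and an integer count), the same idea can be sketched without sorting the whole dictionary. This is a minimal sketch, not the answerer's code: remove_n_smallest is a hypothetical helper name, and it assumes the "first n" keys means the n keys with the smallest counts. It relies on heapq.nsmallest from the standard library.

```python
import heapq


def remove_n_smallest(index, n):
    """Delete, in place, the n keys whose count (second list item) is smallest."""
    # Iterating a dict yields its keys; the key function looks up each count.
    doomed = heapq.nsmallest(n, index, key=lambda k: index[k][1])
    for k in doomed:
        del index[k]
    return index


# Sample data taken from the question.
inverted_index = {
    'hello': [{'blah': 1, 'dodo': 2}, 3],
    'world': [{'foo': 7, 'bar': 1}, 8],
}
remove_n_smallest(inverted_index, 1)
print(inverted_index)
```

To drop the keys with the largest counts instead, heapq.nlargest takes the same arguments.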

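Before Python 3.7, dictionaries had no guaranteed order, so you could not sort a dict; collections.OrderedDict was the usual workaround. Since Python 3.7 a dict preserves insertion order, so you can build a new dict from the sorted items, but there is still no in-place sort.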
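As a rough illustration of the ordering point, here is a sketch that builds a new dict whose iteration order follows the second element of each value. It assumes Python 3.7 or later, where dicts keep insertion order; the variable name by_count is only for illustration.

```python
# Sample data from the first answer above.
d = {1: [1, 2], 3: [2, 4], 4: [3, 3], 2: [4, 1], 0: [5, 0]}

# sorted() returns a list of (key, value) pairs; dict() keeps that order
# on Python 3.7+ (an assumption of this sketch).
by_count = dict(sorted(d.items(), key=lambda kv: kv[1][1]))

print(list(by_count))
```

On older versions, passing the same sorted pairs to collections.OrderedDict gives an ordered mapping.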

Related

How to write a nested dict with lists as values for each nested dict

I think I know what I want. My output needs to have "teams" as dict keys, for each dict key there will be a nested dict, in each nested dict the key will be a players name, the values for each nested dict key will be a list of goals per game. I want to be able to find out who scored the highest amount of goals for each team eventually.
I don't know if a dataframe is better for this?
code to make nested dict - I didn't know how to do it
mydict = {"team" : {"players" : "goals_each_game"}}
team = list(range(1, 7))
print("team : ", team)
players = ["gk", "lb", "dl", "dr", "rb", "rw", "rm", "lm", "lw", "ls", "rs" ]
goals_each_game = list(range(1,7))
for d in team:
    for k, v in mydict.items():
        mydict["team"] = d
        #mydict[v] = a nested dict of each player and their list of goals
        for p in players:
            teamlist = []
            new_dict = {}
            for k,v in new_dict.items():
                new_dict[k] = p
                new_dict[v] = goals_each_game
            teamlist.append(new_dict)
        mydict[v] = teamlist
for k,v in mydict.items():
    print(k,v)
expected output
I want to know how to make this, and put any list of values inside the nested dicts instead of [1, 2, 3].
mydict = {"1": [{"gk":[1,2,3]}, {"lb":[1,2,3]}] ,"2": [{"gk":[1,2,3]}, {"lb":[1,2,3]}], "3": [{"gk":[1,2,3]}, {"lb":[1,2,3]}] ,"4": [{"gk":[1,2,3]}, {"lb":[1,2,3]}] }
for k,v in mydict.items():
    print(k,v)
team : [1, 2, 3, 4, 5, 6]
1 [{'gk': [1, 2, 3]}, {'lb': [1, 2, 3]}]
2 [{'gk': [1, 2, 3]}, {'lb': [1, 2, 3]}]
3 [{'gk': [1, 2, 3]}, {'lb': [1, 2, 3]}]
4 [{'gk': [1, 2, 3]}, {'lb': [1, 2, 3]}]
Your outer-most dictionary is a dictionary whose keys are team indices and whose values are lists.
If your goal's to have each value be a dictionary of player initials to goals scored, you want teamlist to be a {} (and the call to teamlist.append to be replaced with an assignment).
You could write this more idiomatically with a comprehension:
goals_per_game = list(range(1, 7))
mapping = {
    d: {
        p: goals_per_game
        for p in players
    }
    for d in team  # `team` is the list of team indices from the question
}
There are a few caveats here. The first is that all teams share the same players; you may want to maintain a separate dictionary mapping teams to player initials.
The second is that, as written, goals_per_game will be the same object across all players. As the Python documentation for the copy module puts it:
Assignment statements in Python do not copy objects, they create bindings between a target and an object.
This means that mutating the list for one player will mutate it for all players in the above implementation. Consider:
mapping[1]['gk'].append(100)
print(mapping[2]['lb']) # Has 100!
To address this, you can either construct a new list for each player (e.g. by calling list(range(1, 7)) separately for each one) or use the copy module to make a copy for each player.
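A minimal sketch of the first option, building a fresh list inside the comprehension so no two players share one. The shortened players list and the name teams are assumptions made for brevity.

```python
players = ["gk", "lb", "dl"]   # shortened player list, for illustration only
teams = [1, 2, 3]              # hypothetical team indices

# list(range(1, 7)) is evaluated once per player, so every player
# gets an independent list object.
mapping = {
    team: {p: list(range(1, 7)) for p in players}
    for team in teams
}

mapping[1]["gk"].append(100)
print(mapping[1]["gk"])  # only this list has 100
print(mapping[2]["lb"])  # unaffected
```

If the goals already live in an existing list, list(existing) or copy.copy(existing) inside the comprehension has the same effect.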

Sorting arrays within a dictionary

The information inside the arrays is in the "reverse" order of how I want it. Ideally it could be sorted by the dates within the array but I'm 100% certain just reversing the order would work.
By using something like this:
sorted(Dictionary[self], key=lambda i: i[1][0], reverse=True)
I know that the above JUST sorts the arrays themselves into reverse order and not the data inside the array into reverse order.
With the Dictionary like this (all items are a file name)
Dictionary = {'a':[XPJulianDay(Timefurthestinpastfromnow), ... XPJulianDay(timeclosest2currnttime)], 'b':[RQJulianDay(Timefurthestinpastfromnow), ... RQJulianDay(timeclosest2currnttime)], 'c':[WSJulianDay(Timefurthestinpastfromnow), ... WSJulianDay(timeclosest2currnttime)] ..... (9 different ones total) }
turning into this
Dictionary = {'a':[XPJulianDay(timeclosest2currnttime), ... XPJulianDay(Timefurthestinpastfromnow)], 'b':[RQJulianDay(timeclosest2currnttime), ... RQJulianDay(Timefurthestinpastfromnow)], 'c':[WSJulianDay(timeclosest2currnttime), ... WSJulianDay(Timefurthestinpastfromnow)] .... }
You can try this:
Dictionary.update({ k: sorted(v) for k, v in Dictionary.items() })
It updates the dictionary with its own keys, with sorted values.
Example:
>>> Dictionary = {"a": [7,6,1,2], "b": [8,0,2,5] }
>>> Dictionary.update({ k: sorted(v) for k, v in Dictionary.items() })
>>> Dictionary
{'a': [1, 2, 6, 7], 'b': [0, 2, 5, 8]}
>>>
Note that a new dictionary is created for the call to .update() using a dict comprehension.
If needed, you can replace sorted() with reversed(). Note that reversed() returns an iterator, so wrap it in list() if you need a list (keep the iterator if you only need to loop over it once).
Example with reversed:
>>> Dictionary = {"a": [7,6,1,2], "b": [8,0,2,5] } ; Dictionary.update({ k: reversed(v) for k, v in Dictionary.items() })
>>> Dictionary
{'a': <list_reverseiterator object at 0x7f537a0b3a10>, 'b': <list_reverseiterator object at 0x7f537a0b39d0>}
>>>
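If real lists are needed rather than iterators, two common ways to reverse each list can be sketched as follows. The sample data mirrors the example above; the lowercase names are only for illustration.

```python
dictionary = {"a": [7, 6, 1, 2], "b": [8, 0, 2, 5]}

# Option 1: build a new dict of reversed copies (slicing copies each list).
reversed_copy = {k: v[::-1] for k, v in dictionary.items()}

# Option 2: reverse every list in place, without creating new lists.
for v in dictionary.values():
    v.reverse()

print(reversed_copy)
print(dictionary)
```

If the lists should be ordered by their contents rather than merely flipped, sorted(v, reverse=True) in the comprehension would do that instead.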
You can use the dict comprehension as stated by @mguijarr, or use dict and zip:
Dictionary = dict(zip(Dictionary, map(sorted, Dictionary.values())))
But if your keys really are just the 'a', 'b', ... then why are you using a dict? Just use a list...

How to expand a dictionary to incorporate all matching values?

I have a dictionary as such:
{
    'key1': [1,2,3],
    'key2': [4,5,6],
    '1': [4,5],
    '4': [4,6]
}
Now, I need to unpack this dictionary so all values that also occur as keys get appended to the original key. By that I mean the result should be:
{
    'key1': [1,2,3,4,5,6],
    'key2': [4,5,6],
    '1': [4,5,6],
    '4': [4,6]
}
Basically, the value 1 in key1 has a matching key-value pair, {'1':[4,5,6]}, so that list needs to be appended to key1. The 4 then also has a corresponding key-value pair, so that should get appended to key1 as well, since key1 now contains 4.
Note that I don't know the "depth" of the dictionary beforehand, so I need a solution that scales to arbitrary depth.
So far I have tried this:
new_dict = {}
def expand(dict):
    for k in dict:
        for dep in dict[k]:
            val = dict.get(dep)
            new_dict[k] = [dep, val]
    return new_dict
But this solution can only go 2 depths. And I am not sure how to go about capturing more matches of keys across arbitrary depths.
You can use a while loop to keep expanding each sub-list of the dict: on every pass, take the items that were not in the sub-list on the previous pass, look them up as keys, and merge in their lists. Using sets makes these deltas cheap to compute:
def expand(d):
    for lst in d.values():
        old = set()
        new = set(lst)
        while True:
            delta = new - old
            if not delta:
                break
            old = new.copy()
            for i in map(str, delta):
                if i in d:
                    new.update(d[i])
        lst[:] = new
    return d
so that given your sample input as variable d, expand(d) returns:
{'key1': [1, 2, 3, 4, 5, 6], 'key2': [4, 5, 6], '1': [4, 5, 6], '4': [4, 6]}
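The same transitive expansion can also be sketched with an explicit stack. This variant is not the answerer's code: expand_sorted is a hypothetical name, it returns a new dict instead of mutating the input, and it sorts each result so the order does not depend on set iteration order. Like the answer above, it assumes integer values map to keys via str().

```python
def expand_sorted(d):
    """Return a new dict where each list is expanded transitively and sorted."""
    result = {}
    for key, values in d.items():
        seen = set(values)    # everything collected so far for this key
        stack = list(values)  # items whose matching keys are still unchecked
        while stack:
            item = stack.pop()
            # An item such as 1 matches the string key '1', if present.
            for child in d.get(str(item), []):
                if child not in seen:
                    seen.add(child)
                    stack.append(child)
        result[key] = sorted(seen)
    return result


data = {'key1': [1, 2, 3], 'key2': [4, 5, 6], '1': [4, 5], '4': [4, 6]}
expanded = expand_sorted(data)
print(expanded)
```

Because every item is pushed at most once per key, cycles between keys (for example '1' pointing at 4 and '4' pointing at 1) do not cause an infinite loop.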

Python efficient sort of parallel lists in dictionary

Title pretty much says it all, I'm looking to efficiently sort a dictionary of parallel lists.
unsorted_my_dict = {
    'key_one': [1,6,2,3],
    'key_two': [4,1,9,7],
    'key_three': [1,2,4,3],
    ...
}
sorted_my_dict = {
    'key_one': [1,6,3,2],
    'key_two': [4,1,7,9],
    'key_three': [1,2,3,4],
    ...
}
I want to sort key_three, and all the other lists in that dictionary in parallel. There are a few similar questions but I'm struggling because I have an unknown number of keys in the dictionary to be sorted, and I only know the name of the key I want to sort on (key_three).
Looking to do this with vanilla Python, no 3rd party dependencies.
Edit 1:
What do I mean by in parallel? I mean that if I sort key_three, which requires swapping the last two values, that all other lists in the dictionary will have their last two values swapped as well.
Edit 2: Python 3.4 specifically
You can first sort an enumerate of the target list to recover the desired order of indices and then rearrange each list in that order.
my_dict = {
    'key_one': [1,6,2,3],
    'key_two': [4,1,9,7],
    'key_three': [1,2,4,3],
}
def parallel_sort(d, key):
    index_order = [i for i, _ in sorted(enumerate(d[key]), key=lambda x: x[1])]
    return {k: [v[i] for i in index_order] for k, v in d.items()}
print(parallel_sort(my_dict, 'key_three'))
Output
{'key_one': [1, 6, 3, 2],
'key_two': [4, 1, 7, 9],
'key_three': [1, 2, 3, 4]}
zip the values together, sort with a key function based on the relevant item, then zip again to restore the original form:
sorted_value_groups = sorted(zip(*unsorted_my_dict.values()), key=lambda _, it=iter(unsorted_my_dict['key_three']): next(it))
sorted_values = zip(*sorted_value_groups)
sorted_my_dict = {k: list(newvals) for k, newvals in zip(unsorted_my_dict, sorted_values)}
Not at all clean, I mostly just posted this for funsies. One-liner is:
sorted_my_dict = {k: list(newvals) for k, newvals in zip(unsorted_my_dict, zip(*sorted(zip(*unsorted_my_dict.values()), key=lambda _, it=iter(unsorted_my_dict['key_three']): next(it))))}
This works because, while dict iteration order isn't guaranteed prior to 3.7, the order is guaranteed to be repeatable for an unmodified dict. Similarly, the key function is executed in order from start to finish, so pulling the key by repeated iteration is safe. We just detach all the values, group them by index, sort the groups by the index key, regroup them by key, and reattach them to their original keys.
Output is exactly as requested (and the order of the original keys is preserved on CPython 3.6 or any Python 3.7 or higher):
sorted_my_dict = {
    'key_one': [1,6,3,2],
    'key_two': [4,1,7,9],
    'key_three': [1,2,3,4]
}
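A somewhat more readable take on the zip approach can be sketched by sorting the zipped rows on the column position of the chosen key, rather than pulling keys from a side iterator. parallel_sort_zip is a hypothetical name, and the sketch assumes all lists have the same non-zero length.

```python
def parallel_sort_zip(d, sort_key):
    """Sort every list in d in parallel, ordered by the list at d[sort_key]."""
    keys = list(d)
    pos = keys.index(sort_key)               # column of the sort key in each row
    rows = sorted(zip(*d.values()), key=lambda row: row[pos])
    columns = zip(*rows)                     # transpose back to one tuple per key
    return {k: list(col) for k, col in zip(keys, columns)}


unsorted_my_dict = {
    'key_one': [1, 6, 2, 3],
    'key_two': [4, 1, 9, 7],
    'key_three': [1, 2, 4, 3],
}
print(parallel_sort_zip(unsorted_my_dict, 'key_three'))
```

Since sorted() is stable, rows with equal sort values keep their original relative order.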
First, using the key on which the sort is done, get the order of indices. Then use that sequence to rearrange the remaining lists in the dictionary.
unsorted_my_dict = {
    'key_one': [1, 6, 2, 3],
    'key_two': [4, 1, 9, 7],
    'key_three': [1, 2, 4, 3],
}
def sort_parallel_by_key(my_dict, key):
    def sort_by_indices(idx_seq):
        return {k: [v[i] for i in idx_seq] for k, v in my_dict.items()}
    indexes = [idx for idx, _ in sorted(enumerate(my_dict[key]), key=lambda foo: foo[1])]
    return sort_by_indices(indexes)
print(sort_parallel_by_key(unsorted_my_dict, 'key_three'))

How to merge a dict into nested dict in python with particular format?

I have a dictionary as:
digit = { 'one' : 1, 'two' : 2, 'three' : 3, 'four' : 4, 'five' : 5 }
I want the new nested dictionary to be made like this:
new_dict = [{'eng':'one','math': 1},
            {'eng':'two','math': 2},
            {'eng':'three','math': 3},
            {'eng':'four','math': 4},
            {'eng':'five','math': 5}
           ]
I tried this:
digit = { 'one' : 1, 'two' : 2, 'three' : 3, 'four' : 4, 'five' : 5 }
new_dict={'eng':'','math':''}
for nest_key,nest_val in new_dict.items():
    for (key,value),(k,v) in nest_val.items(), digit.items():
        if nest_val['eng'] == '':
            nest_val.update({k:v})
        nest_val.append({k:v})
print(new_dict)
Gives this error:
for (key,value),(k,v) in nest_val.items(), digit.items():
AttributeError: 'str' object has no attribute 'items'
As I mentioned in the comments, nest_val is actually the value, which is a string and doesn't have an items() method. Besides that, you don't have to create another dictionary and update it with multiple loops like that. Instead, you can create the desired dictionaries with a single loop over the items.
lst = []
for name, val in digit.items():
    lst.append({'eng': name,'math': val})
In a more Pythonic way, you can use a list comprehension to avoid appending to the list at each iteration.
lst = [{'eng': name,'math': val} for name, val in digit.items()]
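For completeness, a short sketch of what the comprehension produces, plus the reverse conversion back to the original mapping. The names records and restored are illustrative only, and the stated order assumes Python 3.7+, where dicts preserve insertion order.

```python
digit = {'one': 1, 'two': 2, 'three': 3, 'four': 4, 'five': 5}

# One small dict per (name, value) pair, in the dict's insertion order.
records = [{'eng': name, 'math': val} for name, val in digit.items()]

# Going the other way: rebuild the flat mapping from the list of records.
restored = {r['eng']: r['math'] for r in records}

print(records[0])
print(restored == digit)
```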
