Copying first n items of a dictionary into another dictionary - python

This is a simple question, but I am unable to code it in Python. I want to copy the first n items (say, 100), i.e. both the keys and the values, into another, empty dictionary. To give a clearer picture: I created a dictionary and sorted it using OrderedDict. My code for sorting it is:
ordP = OrderedDict(reversed(sorted(wcP.items(), key=lambda t: t[1])))
Here ordP is the resulting ordered dictionary, in descending order of value, and wcP is my original dictionary. I want to put the first 100 items of ordP, i.e. the 100 items with the largest values, into a new dictionary.

Before Python 3.7, plain dictionaries did not guarantee any order (they now preserve insertion order), but if you just want an arbitrary selection:
new_values = dict(list(your_values.items())[:n])
Or, for those obsessed with laziness:
import itertools
new_values = dict(itertools.islice(your_values.items(), n))
If there's a particular sort you want to impose, define a key function that takes a (key, value) pair. People usually use lambdas, but there's no reason you can't use a full function.
def example_key_func(item):
    key, value = item  # each item is a (key, value) tuple
    return key * value
new_dict = dict(sorted(your_values.items(), key=example_key_func)[:n])

n = 100
assert len(d.keys()) >= n
dic100 = {k:v for k,v in list(d.items())[:n]}
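For the original question (the n items with the largest values), here is a minimal sketch assuming Python 3.7+, where plain dicts keep insertion order. The names word_counts, first_n and top_n are made up for illustration, and heapq.nlargest is swapped in as an alternative to sorting the whole dictionary first.

```python
from heapq import nlargest
from itertools import islice
from operator import itemgetter

# Hypothetical sample data standing in for wcP.
word_counts = {"the": 50, "cat": 7, "sat": 3, "on": 21, "mat": 2}
n = 3

# First n items in insertion order (no sorting involved).
first_n = dict(islice(word_counts.items(), n))

# The n items with the largest values, in descending order of value.
top_n = dict(nlargest(n, word_counts.items(), key=itemgetter(1)))

print(first_n)  # {'the': 50, 'cat': 7, 'sat': 3}
print(top_n)    # {'the': 50, 'on': 21, 'cat': 7}
```

If the dictionary is already an OrderedDict sorted in descending order, the islice line alone should be enough.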

Python dictionary comprehension to group together equal keys

I have a code snippet that groups together equal keys from a list of dicts: each dict is added to a list under the key given by its ObjectID.
The code below works, but I am trying to convert it to a dictionary comprehension:
# group together subblocks if they have equal ObjectID
output = {}
subblkDBF : list[dict]
for row in subblkDBF:
    if row["OBJECTID"] not in output:
        output[row["OBJECTID"]] = []
    output[row["OBJECTID"]].append(row)
Using a comprehension is possible, but likely inefficient in this case, since you need to (a) check if a key is in the dictionary at every iteration, and (b) append to, rather than set the value. You can, however, eliminate some of the boilerplate using collections.defaultdict:
from collections import defaultdict
output = defaultdict(list)
for row in subblkDBF:
    output[row['OBJECTID']].append(row)
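As a small worked example of the defaultdict approach, here is a sketch with made-up rows (the "name" field and the variable names rows and grouped are assumptions, not part of the question):

```python
from collections import defaultdict

# Hypothetical rows standing in for subblkDBF.
rows = [
    {"OBJECTID": 1, "name": "a"},
    {"OBJECTID": 2, "name": "b"},
    {"OBJECTID": 1, "name": "c"},
]

grouped = defaultdict(list)
for row in rows:
    # A missing key is created with an empty list on first access.
    grouped[row["OBJECTID"]].append(row)

# Optional: convert to a plain dict so later lookups of missing keys raise KeyError.
grouped = dict(grouped)

print(grouped[1])  # [{'OBJECTID': 1, 'name': 'a'}, {'OBJECTID': 1, 'name': 'c'}]
```

This makes a single pass over the list, so it stays O(n).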
The problem with using a comprehension is that if you really want a one-liner, you have to nest a list comprehension that traverses the entire list multiple times (once for each key):
{k: [d for d in subblkDBF if d['OBJECTID'] == k] for k in set(d['OBJECTID'] for d in subblkDBF)}
Iterating over subblkDBF in both the inner and outer loop leads to O(n^2) complexity, which is pointless, especially given how illegible the result is.
As the other answer shows, these problems go away if you're willing to sort the list first, or better yet, if it is already sorted.
If rows are sorted by Object ID (or all rows with equal Object ID are at least next to each other, no matter the overall order of those IDs) you could write a neat dict comprehension using itertools.groupby:
from itertools import groupby
from operator import itemgetter
output = {k: list(g) for k, g in groupby(subblkDBF, key=itemgetter("OBJECTID"))}
However, if this is not the case, you'd have to sort by the same key first, making this a lot less neat, and less efficient than above or the loop (O(nlogn) instead of O(n)).
key = itemgetter("OBJECTID")
output = {k: list(g) for k, g in groupby(sorted(subblkDBF, key=key), key=key)}
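To see why the sorting matters, here is a short sketch with made-up rows (the "name" field is an assumption). On unsorted input, groupby starts a new group every time the key changes, and the dict comprehension then silently overwrites the earlier group for the same key.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical rows; the two rows with OBJECTID 1 are not adjacent.
rows = [
    {"OBJECTID": 1, "name": "a"},
    {"OBJECTID": 2, "name": "b"},
    {"OBJECTID": 1, "name": "c"},
]
key = itemgetter("OBJECTID")

# Unsorted: groupby yields keys 1, 2, 1, and the second group for 1 replaces the first.
unsorted_result = {k: list(g) for k, g in groupby(rows, key=key)}

# Sorted first: all rows with the same key are adjacent, so each key has one group.
sorted_result = {k: list(g) for k, g in groupby(sorted(rows, key=key), key=key)}

print(len(unsorted_result[1]))  # 1
print(len(sorted_result[1]))    # 2
```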
You can add an else block to save some time and slightly improve performance:
output = {}
subblkDBF : list[dict]
for row in subblkDBF:
    if row["OBJECTID"] not in output:
        output[row["OBJECTID"]] = [row]
    else:
        output[row["OBJECTID"]].append(row)

How to get the corresponding key for the maximum value in a dictionary in the most efficient way?

Let's assume that there is a dictionary like this one:
lst = {(1,1):2, (1,2):5, (1,3):10, (1,4):14, (1,6):22}
I want a simple (and the most efficient) function that returns the dictionary key whose value is the maximum.
For example:
key_for_max_value_in_dict(lst) = (1,6)
because the tuple (1,6) has the largest value (22).
I came up with this code, which might be the most efficient one:
max(lst, key=lambda x: lst[x])
Use a generator expression for that, like:
Code:
max((v, k) for k, v in lst.items())[1]
How does it work?
Iterate over the items() of the dict and emit them as (value, key) tuples, with the value first. max() can then find the largest value, because tuples compare element by element, starting with the first. Then take the second element ([1]) of the maximal tuple, since it is the key for the maximum value in the dict.
Test Code:
lst = {(1,1):2, (1,2):5, (1,3):10, (1,4):14, (1,6):22}
print(max((v, k) for k, v in lst.items())[1])
Results:
(1, 6)
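One difference between this and the key-function version in the question shows up when two values tie. A minimal sketch, using a made-up dictionary named scores:

```python
# Hypothetical data with a tie for the maximum value.
scores = {"a": 10, "b": 10, "c": 3}

# (value, key) tuples: a tie on the value is broken by comparing the keys,
# so the larger key wins (and the keys must be comparable with each other).
by_tuple = max((v, k) for k, v in scores.items())[1]

# Key function: max() returns the first maximal item it encounters,
# so the tie is broken by iteration order.
by_key_func = max(scores, key=scores.get)

print(by_tuple)     # b
print(by_key_func)  # a
```

Either is fine when the maximum is unique; which one is preferable otherwise depends on how ties should be resolved.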
Assuming you're using a regular unsorted dictionary, you'll need to walk through the entire thing once. Keep track of the largest value seen so far and update it whenever you see a larger one. If a value ties with the current largest, add its key to the list.
largest_key = []
largest_value = float("-inf")  # start below any real value so negative values work too
for key, value in lst.items():
    if value > largest_value:
        largest_value = value
        largest_key = [key]
    elif value == largest_value:
        largest_key.append(key)
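The same result can also be had in two passes, which is shorter to write and still O(n). This is only a sketch; the data below is a modified version of the question's dictionary with a tie added for illustration.

```python
# Modified sample data: (1, 3) and (1, 6) share the maximum value.
lst = {(1, 1): 2, (1, 2): 5, (1, 3): 22, (1, 6): 22}

# Pass 1: find the maximum value. Pass 2: collect every key that has it.
largest_value = max(lst.values())
largest_keys = [k for k, v in lst.items() if v == largest_value]

print(largest_keys)  # [(1, 3), (1, 6)]
```

Note that max() raises ValueError on an empty dictionary, so that case would need a guard.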

Python printing single key/value pairs from a dict [duplicate]

Say I have the following code that makes a dict:
x = 0
myHash = {}
name = ["Max","Fred","Alice","Bobby"]
while x <= 3:
    myHash[name[x]] = x
    x += 1
l = sorted(myHash.values(), reverse=True)
largestNum = l[0]
# print myHash.getKeyFromValue(largestNum)
Is it possible to easily get the key that is paired to my largestNum variable without looping through the entire dict? Something like the pseudo code in the line at the bottom.
Note: I don't want to get a value from a key. I want the reverse of that.
Don't just sort the values. Sort the items by their values, and get the key for free.
from operator import itemgetter
l = sorted(myHash.items(), key=itemgetter(1), reverse=True)
largestKey, largestNum = l[0]
Note: If you only want the largest value, not the rest of the sort results, you can save some work and skip the full sorted work (reducing work from O(n log n) to O(n)):
largestKey, largestNum = max(myHash.items(), key=itemgetter(1))
For the general case of inverting a dict, if the values are unique, it's trivial to create a reversed mapping:
invert_dict = {v: k for k, v in orig_dict.items()}
If the values aren't unique, and you want to find all keys corresponding to a single value with a single lookup, you'd invert to a multi-dict:
from collections import defaultdict
invert_dict = defaultdict(set)
for k, v in orig_dict.items():
    invert_dict[v].add(k)
# Optionally convert back to regular dict to avoid lookup auto-vivification in the future:
# invert_dict = dict(invert_dict)
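A quick usage sketch of the multi-dict inversion, with made-up data in which two names share a value:

```python
from collections import defaultdict

# Hypothetical data: "Fred" and "Alice" map to the same value.
orig_dict = {"Max": 0, "Fred": 1, "Alice": 1, "Bobby": 3}

invert_dict = defaultdict(set)
for k, v in orig_dict.items():
    invert_dict[v].add(k)

# Convert back to a plain dict so looking up a missing value does not create an entry.
invert_dict = dict(invert_dict)

print(invert_dict[1] == {"Fred", "Alice"})  # True
print(invert_dict.get(2))                   # None
```

Building the inverted mapping is still one pass over the dict, but every later lookup by value is a single dictionary access.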

Extracting keys-values from dictionary

import random
dictionary = {'dog': 1,'cat': 2,'animal': 3,'horse': 4}
keys = random.shuffle(list(dictionary.keys())*3)
values = list(dictionary.values())*3
random_key = []
random_key_value = []
random_key.append(keys.pop())
random_key_value.append(???)
For random_key_value.append, I need to add the value that corresponds to the key that was popped. How can I achieve this? I need to work with multiple copies of the list, and I can't multiply a dictionary directly, either.
I'm assuming Python (you should specify the language in your question).
If I understand correctly, you want to multiply the elements in the dictionary. In that case
list(dictionary.keys()) * 3
is not your solution: [1,2] * 3 results in [1,2,1,2,1,2]
Try a list comprehension instead (note that for string keys, i * 3 repeats the string, e.g. 'dog' * 3 is 'dogdogdog'):
[i * 3 for i in dictionary.keys()]
To keep keys and values in the same order (because you shuffle them), shuffle the keys before the multiplication, then create the values list in the same order as the shuffled keys, and finally multiply the keys:
keys = list(dictionary.keys())  # shuffle needs a mutable sequence
random.shuffle(keys)
values = [dictionary[i]*3 for i in keys]
keys = [i * 3 for i in keys]
And finally:
random_key.append(keys.pop())
random_key_value.append(values.pop())
Also take care with random.shuffle: it doesn't work the way you are using it. It shuffles the list in place and returns None, so assign the list first and then shuffle it. See the documentation.
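If the goal is instead to draw from three copies of every key while keeping each key paired with its value, one option is to shuffle (key, value) pairs rather than two separate lists. This is a sketch under that reading of the question; the name pairs is made up.

```python
import random

dictionary = {"dog": 1, "cat": 2, "animal": 3, "horse": 4}

# Three copies of every (key, value) pair in one list.
pairs = list(dictionary.items()) * 3
random.shuffle(pairs)  # shuffles in place and returns None

random_key = []
random_key_value = []

# Popping a pair keeps the key and its value together.
key, value = pairs.pop()
random_key.append(key)
random_key_value.append(value)
```

Alternatively, shuffle only the keys and look each value up with dictionary[key] after popping.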

Correspondence between list indices originating from a dictionary

I wrote the code below, working with a dictionary and a list:
d = computeRanks() # dictionary of id : interestRank pairs
lst = list(d.items()) # tuples (id, interestRank)
interestingIds = []
for i in range(20): # randomly choose 20 highly ranked ids
    choice = randomWeightedChoice(d.values()) # returns random index from list
    interestingIds.append(lst[choice][0])
There may be an error here, because I'm not sure whether the indices in lst correspond to those in d.values().
Do you know how to write this better?
One of the guarantees of dict is that the results of dict.keys() and dict.values() will correspond so long as the contents of the dictionary are not modified.
As @Ignacio says, the index choice does correspond to the intended element of lst, so your code's logic is correct. But your code could be much simpler: d already contains the IDs for the elements, so rewrite randomWeightedChoice to take a dictionary and return an ID.
Perhaps it will help you to know that you can iterate over a dictionary's key-value pairs with d.items():
for k, v in d.items():
    ...  # etc.
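One way to sketch a dictionary-based version is with random.choices from the standard library (Python 3.6+), which accepts a weights argument. The function name random_weighted_ids and the sample ranks are made up, and this assumes drawing with replacement is acceptable, which the question does not say.

```python
import random

# Hypothetical ranks standing in for the result of computeRanks().
d = {"id_a": 5.0, "id_b": 1.0, "id_c": 0.5}

def random_weighted_ids(ranks, k):
    """Return k ids drawn with replacement, weighted by their rank."""
    ids = list(ranks)
    # Build the weights from the same list of ids, so the two always line up.
    weights = [ranks[i] for i in ids]
    return random.choices(ids, weights=weights, k=k)

interesting_ids = random_weighted_ids(d, 20)
```

Because the weights are looked up from the ids directly, there is no need to rely on the ordering of keys() and values() matching.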
