Picking random values multiple times from the same dict. - python

I'm trying to pick a random value from a dict, and then pick a second value from the same dict, guaranteeing that it is different.
def pick_value():
    value, attribute = random.choice(list(my_dict.items()))
    return value, attribute
If I call the function it works, however there is no guarantee that the second time I call it the value will be different than the first so I tried the following.
my_value_list = []
val1, attr1 = pick_value()
my_value_list.append(val1)
val2, attr2 = pick_value()
if val2 in my_value_list:
    val2, attr2 = pick_value()
I still get matching values occasionally. I tried replacing the if val2 in statement with while val2 in and still no luck. Am I misunderstanding something simple?

If you need exactly two values (or any fixed number you know in advance), use random.sample(). That's what it's for: Sampling "without replacement", i.e. once you've picked an element from the list, it is no longer available to be picked again.
samples = random.sample(list(my_dict.items()), 2)
attr1, val1 = samples[0]
attr2, val2 = samples[1]
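As a minimal sketch of the same idea for any fixed count, assuming a hypothetical helper named pick_distinct and made-up example data (neither comes from the question):

```python
import random

def pick_distinct(data, count):
    """Return `count` distinct (key, value) pairs chosen at random from `data`."""
    # random.sample needs a sequence, so materialize the items first.
    # It raises ValueError if count exceeds the number of items.
    return random.sample(list(data.items()), count)

my_dict = {"a": 1, "b": 2, "c": 3, "d": 4}  # example data, assumed
(key1, val1), (key2, val2) = pick_distinct(my_dict, 2)
print(key1 != key2)  # the two keys always differ
```

Note that the pairs are distinct by key; if the dict holds duplicate values under different keys, two picks can still share a value.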

As alexis has suggested, random.sample() is the right tool for this job, but for the sake of completeness, if you need to pick random fields in an iterative/lazy fashion, you can do it yourself with:
def pick_random_destructive(data):
    # list(data) is needed on Python 3, where dict.keys() is not indexable
    key = random.choice(list(data)) if data else None
    return key, data.pop(key, None)
However, that WILL modify the dict you pass to it. If you want a non-modifying iterative method you can create a generator like:
def pick_random_nondestructive(data):
    keys = list(data)
    random.shuffle(keys)  # shuffles in place and returns None
    while keys:
        key = keys.pop()
        yield key, data[key]
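A usage sketch of the lazy, non-modifying approach, assuming a hypothetical generator named iter_random_items and made-up example data. Each next() call hands out one more pair, and no key is repeated until the generator is exhausted:

```python
import random

def iter_random_items(data):
    """Yield (key, value) pairs of `data` in random order, without modifying it."""
    keys = list(data)      # copy the keys so the dict itself is untouched
    random.shuffle(keys)   # shuffles the copy in place
    for key in keys:
        yield key, data[key]

my_dict = {"a": 1, "b": 2, "c": 3}  # example data, assumed
picker = iter_random_items(my_dict)
first = next(picker)
second = next(picker)   # always has a different key than `first`
```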

random.choice can achieve this, but you need to retain state between calls to the function: when random.choice returns the same item as the previous call, you discard it and call again. Popping the value out of the sequence is another alternative:
def pick_value():
    L = list(my_dict.items())
    if hasattr(pick_value, 'prev'):
        for _ in range(100):
            res = random.choice(L)
            if res != pick_value.prev:
                pick_value.prev = res
                return res
    else:
        res = random.choice(L)
        pick_value.prev = res
        return res
The first time you call pick_value there is no previous value; that is what the outer if statement is for. It checks whether there is a previous value and, if so, compares it against the newly chosen random value. The function returns None when random.choice fails to return a value different from the previous one after 100 calls to random.choice.
*Of course, this is just a variation on a theme, other users provided simpler alternatives, so you don't necessarily need to wrap random.choice with the above logic.

This option is useful if you are already using the numpy library: consider using random.choice from numpy:
import numpy as np
my_dict = {"a": 1, "b": 2, "c": 3, "d": 4}
# first, pick two distinct keys at random (replace=False means no repeats)
twoRandom = np.random.choice(list(my_dict.keys()), 2, replace=False)
# then unpack the two chosen keys
key1 = twoRandom[0]
key2 = twoRandom[1]
# finally, look up the value for each key
value1 = my_dict.get(key1)
value2 = my_dict.get(key2)

Related

Better ways to write a Python function to return highest value corresponding to a given input after comparing a list of dictionaries

I have a list of dictionaries as below,
test = [{'a':100, 'b':1, 'd':3.2},
{'a':200, 'b':5, 'd':8.75},
{'a':500, 'b':2, 'd':6.67},
{'a':150, 'b':7, 'd':3.86},
{'a':425, 'b':2, 'd':7.72},
{'a':424, 'b':2, 'd':7.72}]
Given a 'b' value, I need to find the maximum value of 'd' and extract the corresponding value of 'a' in that dictionary. If there's a tie, then take the highest value of 'a'. e.g. {a:425, b:2, d:7.72} and {a:424, b:2, d:7.72} both have b = 2 and their corresponding d values are equal. In that case, I return a = 425.
Following code runs alright. However, I would like to know possible ways to optimise this or to use an anonymous function (lambda) to solve this.
def gerMaxA(bValue):
    temporary_d = -999.99
    temporary_a = 0
    for i in test:
        if i['b'] == bValue:
            if i['d'] > temporary_d:
                temporary_a = i['a']
                temporary_d = i['d']
            elif i['d'] == temporary_d:
                if i['a'] >= temporary_a:
                    temporary_a = i['a']
    ans = (temporary_a, temporary_d)
    return ans
Appreciate any insights.
However, I would like to know possible ways to optimise this or to use an anonymous function (lambda) to solve this.
"Optimise" is a red herring - you cannot simply "optimise" something in a void; you must optimise it for some quality (speed, memory usage, etc.)
Instead, I will show how to make the code simpler and more elegant. This is theoretically off-topic for Stack Overflow, but IMX the system doesn't work very well if I try to send people elsewhere.
Given a 'b' value
This means that we will be selecting elements of the list that meet a condition (i.e., the 'b' value matches a target). Another word for this is filtering; while Python does have a built-in filter function, it is normally cleaner and more Pythonic to use a comprehension or generator expression for this purpose. Since we will be doing further processing on the results, we shouldn't choose between the two yet.
I need to find the maximum value of 'd'
More accurately: you seek the element which has the maximum value for 'd'. Or, as we like to think of it in the Python world, the maximum element, keyed by 'd'. This is built-in, using the max function. Since we will feed data directly to this function, we don't care about building up a container, so we will choose a generator expression for the first step.
The first step looks like this, and means exactly what it says, read left to right:
(element for element in test if element['b'] == b_value)
"A generator (()) producing: the element, for each element found in test, but only including it if the element's ['b'] value is == b_value".
In the second step, we wrap the max call around that, and supply the appropriate key function. This is, indeed, where we could use lambda:
max(
    (element for element in test if element['b'] == b_value),
    key=lambda element: (element['d'], element['a'])
)
The lambda here is a function that transforms a given element into that pair of values; max will then compare the filtered dicts according to what value is produced by that lambda for each.
Alternately, we could use a named function - lambdas are the same thing, just without a name and with limits on what can be done within them:
def my_key(element):
    return element['d'], element['a']
# and then we do
max((element for element in test if element['b'] == b_value), key=my_key)
Or we could use the standard library:
from operator import itemgetter
max((element for element in test if element['b'] == b_value), key=itemgetter('d', 'a'))
The last step, of course, is simply to extract the ['a'] value from the max result.
Here's an approach that uses built-ins:
In [1]: from operator import itemgetter
In [2]: def max_a_value(b_value, data):
   ...:     matching_values = (d for d in data if d['b'] == b_value)
   ...:     return max(matching_values, key=itemgetter('d','a'))['a']
   ...:
In [3]: test = [{"a":100, "b":1, "d":3.2},
   ...:         {"a":200, "b":5, "d":8.75},
   ...:         {"a":500, "b":2, "d":6.67},
   ...:         {"a":150, "b":7, "d":3.86},
   ...:         {"a":425, "b":2, "d":7.72},
   ...:         {"a":424, "b":2, "d":7.72}]
In [4]: max_a_value(2, test)
Out[4]: 425
Note that this isn't more algorithmically efficient: both versions are O(N).
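One case the max-based versions do not cover is a 'b' value with no matching entries, where max() raises ValueError on the empty iterable. A minimal sketch, assuming that returning None is acceptable in that case (the question does not say what should happen):

```python
from operator import itemgetter

def max_a_value(b_value, data):
    """Return the 'a' of the best match for b_value, or None if nothing matches."""
    matching = (d for d in data if d['b'] == b_value)
    # default=None avoids the ValueError that max() raises on an empty iterable
    best = max(matching, key=itemgetter('d', 'a'), default=None)
    return None if best is None else best['a']

test = [{'a': 100, 'b': 1, 'd': 3.2},
        {'a': 200, 'b': 5, 'd': 8.75},
        {'a': 500, 'b': 2, 'd': 6.67},
        {'a': 150, 'b': 7, 'd': 3.86},
        {'a': 425, 'b': 2, 'd': 7.72},
        {'a': 424, 'b': 2, 'd': 7.72}]

print(max_a_value(2, test))   # 425
print(max_a_value(99, test))  # None
```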
Yes, you can optimize this. With the given specifications, there is no reason to retain inferior entries. There is also no particular reason to keep this as a list of dictionaries. Instead, make this a simple data frame or reference table. The key is the 'b' value; the value is the desired 'a' value.
Make one pass over your data to convert to a single dict:
test = {
    1: 100,
    2: 425,
    5: 200,
    7: 150,
}
There's your better data storage; you've already managed one version of conversion logic.
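The one-pass conversion could be sketched as follows. The names best and lookup are illustrative, and the tie-breaking rule (highest 'd', then highest 'a') is taken from the question:

```python
test = [{'a': 100, 'b': 1, 'd': 3.2},
        {'a': 200, 'b': 5, 'd': 8.75},
        {'a': 500, 'b': 2, 'd': 6.67},
        {'a': 150, 'b': 7, 'd': 3.86},
        {'a': 425, 'b': 2, 'd': 7.72},
        {'a': 424, 'b': 2, 'd': 7.72}]

best = {}  # maps each b to the (d, a) pair of the best entry seen so far
for row in test:
    candidate = (row['d'], row['a'])
    # tuples compare element by element, so this prefers higher d, then higher a
    if row['b'] not in best or candidate > best[row['b']]:
        best[row['b']] = candidate

lookup = {b: a for b, (d, a) in best.items()}
print(lookup)  # {1: 100, 5: 200, 2: 425, 7: 150}
```

After this single O(N) pass, each query is a plain dictionary lookup such as lookup[2].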
Maybe you want to check this. I don't know if it is more efficient, but at least it looks Pythonic:
def gerMaxA(bValue):
    # d_by_idx avoids shadowing the built-in name dict
    d_by_idx = {i: x['d'] for i, x in enumerate(test) if x['b'] == bValue}
    idx = max(d_by_idx, key=d_by_idx.get)
    max_d = test[idx]['d']
    dict_a = {k: test[k]['a'] for k in d_by_idx if d_by_idx[k] == max_d}
    idx_a = max(dict_a, key=dict_a.get)
    return test[idx_a]['a'], test[idx]['d']
The last three lines of code make sure that it'll take the greater 'a' value in the case there were many of them.

Modifying a nested dictionary element by a reference, generated from a list

The code:
def main():
    nested_dict = {'A': {'A_1': 'value_1', 'B_1': 'value_2'},
                   'B': 'value_3'}
    access_pattern = ['A', 'B_1']
    new_value = 'value_4'
    nested_dict[access_pattern] = new_value
    return nested_dict
Background information:
As can be seen, I have a variable called nested_dict - in reality, it contains hundreds of elements with a different number of sub-elements each (I'm simplifying it for the purpose of the example).
I need to modify the value of some elements inside this dictionary, but it is not predetermined which elements exactly. The specific "path" to the elements that need be modified, will be provided by the access_pattern variable, which will be different every time.
The problem:
I know how to reference the value of the dictionary with this function functools.reduce(dict.get, access_pattern, nested_dict). However, I do not know how to universally modify (regardless of the contained variable type) the value of the access_pattern in the dictionary.
The provided code produces a TypeError that I do not know how to overcome elegantly. I did think of a solution, described under "Possible solutions" below.
Possible solutions:
if len(access_pattern) == 1:
    nested_dict[access_pattern[0]] = new_value
elif len(access_pattern) == 2:
    nested_dict[access_pattern[0]][access_pattern[1]] = new_value
...
And so on for every possible len().
This just seems VERY inelegant and painful. Is there a more practical way to achieve this?
Make use of recursion:
def edit_from_access_pattern(access_pattern, nested_dict, new_value):
    if len(access_pattern) == 1:
        nested_dict[access_pattern[0]] = new_value
    else:
        return edit_from_access_pattern(access_pattern[1:], nested_dict[access_pattern[0]], new_value)
You can use recursion
def set_value(container, key, value):
    if len(key) == 1:
        container[key[0]] = value
    else:
        set_value(container[key[0]], key[1:], value)
but an explicit loop is probably going to be more efficient
def set_value(container, key, value):
    for i in range(len(key)-1):
        container = container[key[i]]
    container[key[-1]] = value
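Since the question already reads values with functools.reduce, the same walk can be reused for writing. This is only a sketch: set_by_path is a hypothetical name, and it assumes the path is non-empty and every intermediate key already exists:

```python
from functools import reduce
from operator import getitem

def set_by_path(container, path, value):
    """Set container[path[0]][path[1]]... = value for a non-empty path."""
    # Walk down to the parent of the final key, then assign into it.
    # With a one-element path, reduce simply returns the container itself.
    parent = reduce(getitem, path[:-1], container)
    parent[path[-1]] = value

nested_dict = {'A': {'A_1': 'value_1', 'B_1': 'value_2'},
               'B': 'value_3'}
set_by_path(nested_dict, ['A', 'B_1'], 'value_4')
print(nested_dict['A']['B_1'])  # value_4
```

Using operator.getitem instead of dict.get means a wrong key raises KeyError rather than silently passing None along, and it also works for lists nested inside the dictionary.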

Test if all values of a dictionary are equal - when value is unknown

I have 2 dictionaries:
the values in each dictionary should all be equal.
BUT I don't know what that number will be...
dict1 = {'xx':A, 'yy':A, 'zz':A}
dict2 = {'xx':B, 'yy':B, 'zz':B}
N.B. A does not equal B
N.B. Both A and B are actually strings of decimal numbers (e.g. '-2.304998') as they have been extracted from a text file
I want to create another dictionary - that effectively summarises this data - but only if all the values in each dictionary are the same.
i.e.
summary = {}
if dict1['xx'] == dict1['yy'] == dict1['zz']:
    summary['s'] = dict1['xx']
if dict2['xx'] == dict2['yy'] == dict2['zz']:
    summary['hf'] = dict2['xx']
Is there a neat way of doing this in one line?
I know it is possible to create a dictionary using comprehensions
summary = {k:v for (k,v) in zip(iterable1, iterable2)}
but am struggling with both the underlying for loop and the if statement...
Some advice would be appreciated.
I have seen this question, but the answers all seem to rely on already knowing the value being tested (i.e. are all the entries in the dictionary equal to a known number) - unless I am missing something.
sets are a solid way to go here, but just for code golf purposes here's a version that can handle non-hashable dict values:
expected_value = next(iter(dict1.values())) # check for an empty dictionary first if that's possible
all_equal = all(value == expected_value for value in dict1.values())
all terminates early on a mismatch, but the set constructor is well enough optimized that I wouldn't say that matters without profiling on real test data. Handling non-hashable values is the main advantage to this version.
One way to do this would be to leverage set. You know a set of an iterable has a length of 1 if there is only one value in it:
if len(set(dct.values())) == 1:
    summary[k] = next(iter(dct.values()))
This of course, only works if the values of your dictionary are hashable.
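To build the whole summary in one expression, the set test can go inside a dict comprehension. A sketch with made-up string values standing in for A and B, and a hypothetical sources mapping from summary key to dictionary:

```python
dict1 = {'xx': '-2.304998', 'yy': '-2.304998', 'zz': '-2.304998'}
dict2 = {'xx': '0.5', 'yy': '0.5', 'zz': '0.75'}  # deliberately inconsistent

sources = {'s': dict1, 'hf': dict2}  # summary key -> dictionary to check

summary = {
    name: next(iter(d.values()))
    for name, d in sources.items()
    if len(set(d.values())) == 1   # exactly one distinct value
}
print(summary)  # {'s': '-2.304998'}
```

An empty dictionary yields an empty set, so it is skipped rather than raising StopIteration.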
While we can use set for this, doing so has a number of inefficiencies when the input is large. It can take memory proportional to the size of the input, and it always scans the whole input, even when two distinct values are found early. Also, the input has to be hashable.
For 3-key dicts, this doesn't matter much, but for bigger ones, instead of using set, we can use itertools.groupby and see if it produces multiple groups:
import itertools
groups = itertools.groupby(dict1.values())
# Consume one group if there is one, then see if there's another.
next(groups, None)
if next(groups, None) is None:
    # All values are equal.
    do_something()
else:
    # Unequal values detected.
    do_something_else()
Except for readability, I don't care for the answers involving set or .values. They are always O(N) in time and memory, whereas a search that stops at the first mismatch can be faster in practice, although that depends on the distribution of values.
Also, because set employs hashing operations, you may have a hefty constant multiplier on your time cost. And your values have to be hashable, when a test for equality is all that's needed.
It is theoretically better to take the first value from the dictionary and search the remaining values for the first one that is not equal to it.
set might still be quicker than the solution below because its workings may reduce to C implementations.
def all_values_equal(d):
    if len(d) <= 1:
        return True  # Treat len 0 and len 1 as all equal
    i = iter(d.values())
    firstval = next(i)
    try:
        # Incrementally generate all values not equal to firstval;
        # next() raises StopIteration if there are none.
        next(j for j in i if j != firstval)
        return False
    except StopIteration:
        return True
print(all_values_equal({1:0, 2:1, 3:0, 4:0, 5:0})) # False
print(all_values_equal({1:0, 2:0, 3:0, 4:0, 5:0})) # True
print(all_values_equal({1:"A", 2:"B", 3:"A", 4:"A", 5:"A"})) # False
print(all_values_equal({1:"A", 2:"A", 3:"A", 4:"A", 5:"A"})) # True
In the above:
(j for j in i if j!=firstval)
is equivalent to:
def gen_neq(i, val):
    """
    Give me the values of iterator i that are not equal to val
    """
    for j in i:
        if j != val:
            yield j
I found this solution, which I like quite a bit; I combined it with another solution found elsewhere:
user_min = {'test':1,'test2':2}
all(value == list(user_min.values())[0] for value in user_min.values())
>>> user_min = {'test':1,'test2':2}
>>> all(value == list(user_min.values())[0] for value in user_min.values())
False
>>> user_min = {'test':2,'test2':2}
>>> all(value == list(user_min.values())[0] for value in user_min.values())
True
>>> user_min = {'test':'A','test2':'B'}
>>> all(value == list(user_min.values())[0] for value in user_min.values())
False
>>> user_min = {'test':'A','test2':'A'}
>>> all(value == list(user_min.values())[0] for value in user_min.values())
True
Good for a small dictionary, but I'm not sure about a large dictionary, since we build a list of all the values just to take the first one.

function would not change the parameter as wanted

here is my code
def common_words(count_dict, limit):
    '''
    >>> k = {'you':2, 'made':1, 'me':1}
    >>> common_words(k,2)
    >>> k
    {'you':2}
    '''
    new_list = list(revert_dictionary(count_dict).items())[::-1]
    count_dict = {}
    for number,word in new_list:
        if len(count_dict) + len(word) <= limit:
            for x in word:
                count_dict[x] = number
    print (count_dict)
def revert_dictionary(dictionary):
    '''
    >>> revert_dictionary({'sb':1, 'QAQ':2, 'CCC':2})
    {1: ['sb'], 2: ['CCC', 'QAQ']}
    '''
    reverted = {}
    for key,value in dictionary.items():
        reverted[value] = reverted.get(value,[]) + [key]
    return reverted
count_dict = {'you':2, 'made':1, 'me':1}
common_words(count_dict,2)
print (count_dict)
What I expected is for the count_dict variable to change to {'you':2}.
It worked fine in the function's print statement, but not outside the function.
The problem, as others have already written, is that your function assigns a new empty dictionary to count_dict:
count_dict = {}
When you do this you modify the local variable count_dict, but the variable with the same name in the main part of your program continues to point to the original dictionary.
You should understand that you are allowed to modify the dictionary you passed in the function argument; just don't replace it with a new dictionary. To get your code to work without modifying anything else, you can instead delete all elements of the existing dictionary:
count_dict.clear()
This modifies the dictionary that was passed to the function, deleting all its elements-- which is what you intended. That said, if you are performing a new calculation it's usually a better idea to create a new dictionary in your function, and return it with return.
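The difference between rebinding the parameter and modifying the object it refers to can be seen in a small sketch; the function names rebind and mutate are made up for illustration:

```python
def rebind(d):
    d = {}        # rebinds the local name only; the caller's dict is untouched
    d['new'] = 1

def mutate(d):
    d.clear()     # empties the very object the caller passed in
    d['new'] = 1

original = {'you': 2, 'made': 1}

rebind(original)
after_rebind = dict(original)  # snapshot for comparison

mutate(original)

print(after_rebind)  # {'you': 2, 'made': 1}
print(original)      # {'new': 1}
```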
As already mentioned, the problem is that with count_dict = {} you are not changing the passed in dictionary, but you create a new one, and all subsequent changes are done on that new dictionary. The classical approach would be to just return the new dict, but it seems like you can't do that.
Alternatively, instead of adding the values to a new dictionary, you could reverse your condition and delete values from the existing dictionary. You can't use len(count_dict) in the condition, though, and have to use another variable to keep track of the elements already "added" to (or rather, not removed from) the dictionary.
def common_words(count_dict, limit):
    new_list = list(revert_dictionary(count_dict).items())[::-1]
    count = 0
    for number,word in new_list:
        if count + len(word) > limit:
            for x in word:
                del count_dict[x]
        else:
            count += len(word)
Also note that the dict returned from revert_dictionary does not have a particular order, so the line new_list = list(revert_dictionary(count_dict).items())[::-1] is not guaranteed to give you the items in any particular order, as well. You might want to add sorted here and sort by the count, but I'm not sure if you actually want that.
new_list = sorted(revert_dictionary(count_dict).items(), reverse=True)
Just write
return count_dict
below
print (count_dict)
in the function common_words(), and change
common_words(count_dict,2)
to
count_dict = common_words(count_dict,2)
So basically you need to return the value from the function and store it in your variable. When you call a function with an argument, Python passes a reference to the same object rather than the caller's variable itself, so assigning a new dictionary to the parameter inside the function only rebinds the local name.

python list, working with multiple elements

a is a list filled dynamically with values being received in no specific order. So, if the next value received was (ID2,13), how could I remove the (ID2,10) based on the fact that ID2 was the next value received? Because I don't know the order in which the list is being populated, I won't know the index.
Also, how would I know the count of a specific ID?
I have tried a.count(ID1) but because of the second element, it fails to find any.
a = [(ID1,10),(ID2,10),(ID1,12),(ID2,15)]
My current usage:
while True:
    'Receive ID information in format (IDx,Value)'
    ID_info = (ID2,13) #For example
    if a.count(ID2) == 2: #I need to modify this line as it always returns 0
        del a[0] #This is what I need to modify to delete the correct information, as it is not always index 0
        a.append(ID_info)
    else:
        a.append(ID_info)
Assuming that the IDs are hashable, it sounds like you want to be using a dictionary.
a = {ID1: 10, ID2: 10}
id, val = (ID2, 13)
a[id] = val
With the "keep two" addition, I still think it's easier with a dictionary, though with some modifications.
EDIT: Simpler version using collections.defaultdict.
from collections import defaultdict
a = defaultdict(list)
a[ID1].append(10)
a[ID2].append(10)
id, val = (ID2, 13)
a[id].append(val)
if len(a[id]) > 2:
    a[id].pop(0)
def count(a, id):
    return len(a[id])
The same logic with a plain dict:
a = {ID1: [10], ID2: [10]}
id, val = (ID2, 13)
if id not in a.keys():
    a[id] = []
a[id].append(val)
if len(a[id]) > 2:
    a[id].pop(0)
def count(a, id):
    if id not in a.keys():
        return 0
    else:
        return len(a[id])
You could (and probably should) also encapsulate this behavior into a simple class inherited from dict.
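One possible shape for such a class, as a rough sketch only. The class name LastTwo, the add and count methods, and the string IDs are all assumptions made for illustration:

```python
class LastTwo(dict):
    """Hypothetical dict subclass keeping at most `keep` recent values per ID."""
    keep = 2

    def add(self, id_, value):
        values = self.setdefault(id_, [])  # create the list on first use
        values.append(value)
        if len(values) > self.keep:
            values.pop(0)                  # drop the oldest value

    def count(self, id_):
        return len(self.get(id_, []))      # 0 for IDs never seen

a = LastTwo()
received = [('ID1', 10), ('ID2', 10), ('ID1', 12), ('ID2', 15), ('ID2', 13)]
for id_, value in received:
    a.add(id_, value)

print(a['ID2'])        # [15, 13]
print(a.count('ID1'))  # 2
print(a.count('ID3'))  # 0
```

Unlike the defaultdict version, count() here does not create an empty entry for an ID that was never received.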
