python list, working with multiple elements - python

a is a list filled dynamically with values received in no specific order. If the next value received is (ID2,13), how can I remove (ID2,10), given that ID2 is the ID that just arrived? Because I don't know the order in which the list is populated, I won't know the index.
Also, how can I get the count of a specific ID?
I have tried a.count(ID1), but because each element is a tuple that also holds a value, it never finds a match.
a = [(ID1,10),(ID2,10),(ID1,12),(ID2,15)]
My current usage:
while True:
    'Receive ID information in format (IDx,Value)'
    ID_info = (ID2,13) #For example
    if a.count(ID2) == 2: #I need to modify this line as it always returns 0
        del a[0] #This is what I need to modify to delete the correct information, as it is not always index 0
        a.append(ID_info)
    else:
        a.append(ID_info)

Assuming that the IDs are hashable, it sounds like you want to be using a dictionary.
a = {ID1: 10, ID2: 10}
id, val = (ID2, 13)
a[id] = val
With the "keep two" addition, I still think it's easier with a dictionary, though with some modifications.
EDIT: Simpler version using collections.defaultdict:
from collections import defaultdict
a = defaultdict(list)
a[ID1].append(10)
a[ID2].append(10)
id, val = (ID2, 13)
a[id].append(val)
if len(a[id]) > 2:
    a[id].pop(0)
def count(a, id):
    return len(a[id])
Original version, using a plain dict:
a = {ID1: [10], ID2: [10]}
id, val = (ID2, 13)
if id not in a:
    a[id] = []
a[id].append(val)
if len(a[id]) > 2:
    a[id].pop(0)
def count(a, id):
    if id not in a:
        return 0
    else:
        return len(a[id])
You could (and probably should) also encapsulate this behavior into a simple class inherited from dict.
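One way such a class might look is sketched below. The class name RecentValues, its add and count methods, and the string IDs are all hypothetical, and it uses collections.deque with maxlen in place of the manual pop(0), so treat it as an illustration rather than the answer's own implementation.

```python
from collections import deque


class RecentValues(dict):
    """Hypothetical dict subclass that keeps only the newest `limit` values per ID."""

    def __init__(self, limit=2):
        super().__init__()
        self.limit = limit

    def add(self, key, value):
        # A deque with maxlen silently drops its oldest entry once it is full.
        self.setdefault(key, deque(maxlen=self.limit)).append(value)

    def count(self, key):
        # Unknown IDs count as zero instead of raising KeyError.
        return len(self.get(key, ()))


a = RecentValues(limit=2)
for key, value in [("ID1", 10), ("ID2", 10), ("ID1", 12), ("ID2", 15), ("ID2", 13)]:
    a.add(key, value)

print(list(a["ID2"]))   # the two most recent values for ID2
print(a.count("ID1"))
```

Because the trimming happens inside add, the calling loop no longer needs to know any index.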

Related

I just want to print the key, not their index

I need to print the keys + their values and it always prints the index of the key too, how can I fix that?
def task_3_4(something:str):
    alphabet =list(string.ascii_letters)
    i = 0
    k=0
    while i < len(alphabet):
        dicts = {alphabet[i]: 0}
        count = something.count(alphabet[i])
        dicts[i] = count
        if 0 < count:
            for k in dicts:
                print(k)
        i = i+1
Based on the code, it seems you are trying to count how often each letter occurs in the string.
There is no index. What you call the "index" is i, the counter of your while loop. dicts[i] = count simply creates a new key i in dicts, so when the print loop iterates over the dict, it prints i as well.
Try:
dicts[alphabet[i]] = count
Also, your print call only prints the key of each dict entry instead of the key-value pair. To print both, you can try:
for k in dicts:
    print(k,dicts[k])
Try reading up on the Python docs for dicts.
https://docs.python.org/3/tutorial/datastructures.html
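Putting both fixes together, a minimal sketch of the counter could look like this. The function name count_letters is made up for illustration, and it builds one dict for the whole string rather than a new one per letter, which is an assumption about what the original code was meant to do.

```python
import string


def count_letters(text):
    """Map each ASCII letter that occurs in text to its number of occurrences."""
    counts = {}
    for letter in string.ascii_letters:
        n = text.count(letter)
        if n > 0:
            counts[letter] = n   # key is the letter itself, not a loop index
    return counts


for letter, n in count_letters("Hello").items():
    print(letter, n)
```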

Picking random values multiple times from the same dict.

I'm trying to pick a random value from a dict, and then pick a second value from the same dict, guaranteeing that it is different.
def pick_value():
    value, attribute = random.choice(list(my_dict.items()))
    return(value, attribute)
If I call the function it works. However, there is no guarantee that the second call returns a different value than the first, so I tried the following:
my_value_list = []
val1, attr1 = pick_value()
my_value_list.append(val1)
val2, attr2 = pick_value()
if val2 in my_value_list:
    val2, attr2 = pick_value()
I still get matching values occasionally. I tried replacing the if val2 in statement with while val2 in and still no luck. Am I misunderstanding something simple?
If you need exactly two values (or any fixed number you know in advance), use random.sample(). That's what it's for: sampling "without replacement", i.e. once you've picked an element from the list, it is no longer available to be picked again.
samples = random.sample(list(my_dict.items()), 2)
attr1, val1 = samples[0]
attr2, val2 = samples[1]
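As a self-contained illustration of the same idea (the contents of my_dict here are invented for the example):

```python
import random

# Hypothetical example data; any dict with at least two items works.
my_dict = {"speed": 3, "power": 7, "range": 5, "armor": 1}

# sample() draws without replacement, so the two (key, value) pairs are distinct.
(key1, val1), (key2, val2) = random.sample(list(my_dict.items()), 2)
print(key1, val1)
print(key2, val2)
```

Note that the keys are guaranteed to differ, while the values differ only if the dict's values are themselves unique.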
As alexis has suggested, random.sample() is the right tool for this job, but for the sake of completeness, if you need to pick random fields in an iterative/lazy fashion, you can do it yourself:
def pick_random_destructive(data):
    # random.choice needs a sequence, so materialize the keys first
    key = random.choice(list(data)) if data else None
    return key, data.pop(key, None)
However, that WILL modify the dict you pass to it. If you want a non-modifying iterative method you can create a generator like:
def pick_random_nondestructive(data):
    keys = list(data)
    random.shuffle(keys)  # shuffles in place and returns None
    while keys:
        key = keys.pop()
        yield key, data[key]
random.choice can achieve this, but you need to retain state between calls to the function: when random.choice returns the previous value again, you ignore it and draw again. Popping the value out of the sequence is another alternative.
def pick_value():
    L = list(my_dict.items())
    if hasattr(pick_value, 'prev'):
        for _ in range(100):
            res = random.choice(L)
            if res != pick_value.prev:
                pick_value.prev = res
                return res
    else:
        res = random.choice(L)
        pick_value.prev = res
        return res
The first time you call pick_value there is no previous value; that is what the outer if statement is for. It checks whether there is a previous value and, if so, compares it against the newly chosen random value. The function returns None when random.choice fails to return a value different from the previous one in 100 attempts.
*Of course, this is just a variation on a theme. Other users provided simpler alternatives, so you don't necessarily need to wrap random.choice with the above logic.
This option is useful if you are already using the numpy library: consider using random.choice from numpy.
import numpy as np
my_dict = {"a" : 1, "b": 2, "c": 3, "d" : 4}
#first, pick two distinct random keys (replace=False rules out repeats)
twoRandom = np.random.choice(list(my_dict.keys()),2,replace=False)
#then unpack the two keys
key1 = twoRandom[0]
key2 = twoRandom[1]
#finally, look up the value for each key
value1 = my_dict.get(key1)
value2 = my_dict.get(key2)

function would not change the parameter as wanted

Here is my code:
def common_words(count_dict, limit):
    '''
    >>> k = {'you':2, 'made':1, 'me':1}
    >>> common_words(k,2)
    >>> k
    {'you':2}
    '''
    new_list = list(revert_dictionary(count_dict).items())[::-1]
    count_dict = {}
    for number,word in new_list:
        if len(count_dict) + len(word) <= limit:
            for x in word:
                count_dict[x] = number
    print (count_dict)
def revert_dictionary(dictionary):
    '''
    >>> revert_dictionary({'sb':1, 'QAQ':2, 'CCC':2})
    {1: ['sb'], 2: ['CCC', 'QAQ']}
    '''
    reverted = {}
    for key,value in dictionary.items():
        reverted[value] = reverted.get(value,[]) + [key]
    return reverted
count_dict = {'you':2, 'made':1, 'me':1}
common_words(count_dict,2)
print (count_dict)
What I expected is for the count_dict variable to change to {'you':2}.
It worked fine in the function's print statement, but not outside the function.
The problem, as others have already written, is that your function assigns a new empty dictionary to count_dict:
count_dict = {}
When you do this you modify the local variable count_dict, but the variable with the same name in the main part of your program continues to point to the original dictionary.
You should understand that you are allowed to modify the dictionary you passed in the function argument; just don't replace it with a new dictionary. To get your code to work without modifying anything else, you can instead delete all elements of the existing dictionary:
count_dict.clear()
This modifies the dictionary that was passed to the function, deleting all its elements, which is what you intended. That said, if you are performing a new calculation it's usually a better idea to create a new dictionary in your function, and return it with return.
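The difference between rebinding the parameter and mutating the object it refers to can be seen in a small sketch. The function names rebind and mutate and the sample data are invented for this illustration.

```python
def rebind(d):
    d = {}        # rebinds the local name only; the caller's dict is untouched
    d["x"] = 1


def mutate(d):
    d.clear()     # empties the very dict object the caller passed in
    d["x"] = 1


original = {"you": 2}

rebind(original)
after_rebind = dict(original)   # snapshot: still {'you': 2}

mutate(original)
after_mutate = dict(original)   # snapshot: now {'x': 1}

print(after_rebind, after_mutate)
```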
As already mentioned, the problem is that with count_dict = {} you are not changing the passed in dictionary, but you create a new one, and all subsequent changes are done on that new dictionary. The classical approach would be to just return the new dict, but it seems like you can't do that.
Alternatively, instead of adding the values to a new dictionary, you could reverse your condition and delete values from the existing dictionary. You can't use len(count_dict) in the condition, though, and have to use another variable to keep track of the elements already "added" to (or rather, not removed from) the dictionary.
def common_words(count_dict, limit):
    new_list = list(revert_dictionary(count_dict).items())[::-1]
    count = 0
    for number,word in new_list:
        if count + len(word) > limit:
            for x in word:
                del count_dict[x]
        else:
            count += len(word)
Also note that the dict returned from revert_dictionary is not sorted by count (Python 3.7+ keeps insertion order; older versions give no order guarantee at all), so the line new_list = list(revert_dictionary(count_dict).items())[::-1] is not guaranteed to give you the items from highest to lowest count. You might want to add sorted here and sort by the count, but I'm not sure if you actually want that.
new_list = sorted(revert_dictionary(count_dict).items(), reverse=True)
Just write
return count_dict
below
print (count_dict)
in the function common_words(), and change
common_words(count_dict,2)
to
count_dict=common_words(count_dict,2)
So basically you need to return the value from the function and store it in your variable. When you call a function with a parameter, Python passes a reference to the same object, not the variable itself, so assigning a new dictionary to count_dict inside the function only rebinds the local name.

Python Group by count

Given a dictionary, I need some way to do the following:
In the dictionary, we have names, gender, occupation, and salary. For each name I look up in the dictionary, I need to figure out whether there are no more than 5 other employees with the same name, gender and occupation. If so, I output it. Otherwise, I remove it.
Any help or resources would be appreciated!
What I researched:
count = Counter(tok['Name'] for tok in input_file)
This counts the number of occurrences of each name (e.g. Bob: 2, Amy: 4). However, I need to add the gender and occupation to this as well (e.g. Bob, M, Salesperson: 2, Amy, F, Manager: 1).
Checking whether the dictionary has 5 or more (key, value) pairs in which the employee's name, gender and occupation are the same is quite simple. Removing all such entries is trickier.
# data = {}
# key = 'UID'
# value = ('Name','Male','Accountant','20000')
# data[key] = value
def consistency(dictionary):
    # keep only (name, gender, occupation); .values() replaces Python 2's .itervalues()
    temp_list_of_values_we_care_about = [(x[0],x[1],x[2]) for x in dictionary.values()]
    temp_dict = {}
    for val in temp_list_of_values_we_care_about:
        if val in temp_dict:
            temp_dict[val] += 1
        else:
            temp_dict[val] = 1
    if max(temp_dict.values()) >=5:
        return False
    else:
        return True
To actually get a dictionary with those particular values removed, there are two ways:
1. Edit and update the original dictionary (doing it in place).
2. Create a new dictionary and add only those values which satisfy our constraint.
def consistency(dictionary):
    # count how often each (name, gender, occupation) combination occurs
    temp_list_of_values_we_care_about = [(x[0],x[1],x[2]) for x in dictionary.values()]
    temp_dict = {}
    for val in temp_list_of_values_we_care_about:
        if val in temp_dict:
            temp_dict[val] += 1
        else:
            temp_dict[val] = 1
    # copy over only the entries whose combination is rare enough
    new_dictionary = {}
    for key in dictionary:
        value = dictionary[key]
        temp = (value[0],value[1],value[2])
        if temp_dict[temp] <=5:
            new_dictionary[key] = value
    return new_dictionary
P.S. I have chosen the much easier second way to do it. Choosing the first method will cause a lot of computation overhead, and we certainly would want to avoid that.
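The same second approach can be written more compactly with collections.Counter, which the question already uses. This is only a sketch: the function name keep_small_groups, the max_count parameter, and the sample records are assumptions, and the data layout follows the commented example above (UID mapped to a tuple of name, gender, occupation, salary).

```python
from collections import Counter


def keep_small_groups(records, max_count=5):
    """Return the records whose (name, gender, occupation) group has at most max_count members."""
    # The first three tuple fields form the grouping key.
    group_sizes = Counter(value[:3] for value in records.values())
    return {uid: value for uid, value in records.items()
            if group_sizes[value[:3]] <= max_count}


# Hypothetical data: six identical "Bob" profiles and two "Amy" profiles.
records = {f"bob{i}": ("Bob", "M", "Salesperson", "20000") for i in range(6)}
records.update({f"amy{i}": ("Amy", "F", "Manager", "30000") for i in range(2)})

kept = keep_small_groups(records)
print(sorted(kept))
```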

delta-dictionary/dictionary with revision awareness in python?

I am looking to create a dictionary with 'roll-back' capabilities in python. The dictionary would start with a revision number of 0, and the revision would be bumped up only by explicit method call. I do not need to delete keys, only add and update key,value pairs, and then roll back. I will never need to 'roll forward', that is, when rolling the dictionary back, all the newer revisions can be discarded, and I can start re-reving up again. thus I want behaviour like:
>>> rr = rev_dictionary()
>>> rr.rev
0
>>> rr["a"] = 17
>>> rr[('b',23)] = 'foo'
>>> rr["a"]
17
>>> rr.rev
0
>>> rr.roll_rev()
>>> rr.rev
1
>>> rr["a"]
17
>>> rr["a"] = 0
>>> rr["a"]
0
>>> rr[('b',23)]
'foo'
>>> rr.roll_to(0)
>>> rr.rev
0
>>> rr["a"]
17
>>> rr.roll_to(1)
Exception ...
Just to be clear, the state associated with a revision is the state of the dictionary just prior to the roll_rev() method call. Thus I can alter the value associated with a key several times 'within' a revision, and only the last one is remembered.
I would like a fairly memory-efficient implementation of this: the memory usage should be proportional to the deltas. Thus simply having a list of copies of the dictionary will not scale for my problem. One should assume the keys are in the tens of thousands, and the revisions are in the hundreds of thousands.
We can assume the values are immutable, but need not be numeric. For the case where the values are e.g. integers, there is a fairly straightforward implementation (have a list of dictionaries of the numerical delta from revision to revision). I am not sure how to turn this into the general form. Maybe bootstrap the integer version and add on an array of values?
all help appreciated.
Have just one dictionary, mapping from the key to a list of (revision_number, actual_value) tuples. Current value is the_dict[akey][-1][1]. Rollback merely involves popping the appropriate entries off the end of each list.
Update: examples of rollback
key1 -> [(10, 'v1-10'), (20, 'v1-20')]
Scenario 1: current revision is 30, rollback to 25: nothing happens
Scenario 2: current 30, back to 15: pop last entry
Scenario 3: current 30, back to 5: pop both entries
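A minimal sketch of this simple variant, using the method names from the question (rev, roll_rev, roll_to). The class name SimpleRevDict and the choice of ValueError for an invalid target are assumptions made for the illustration.

```python
class SimpleRevDict:
    """Sketch: one dict mapping each key to a list of (revision, value) tuples."""

    def __init__(self):
        self.rev = 0
        self._history = {}   # key -> [(revision, value), ...], oldest first

    def __setitem__(self, key, value):
        entries = self._history.setdefault(key, [])
        if entries and entries[-1][0] == self.rev:
            entries[-1] = (self.rev, value)   # overwrite within the same revision
        else:
            entries.append((self.rev, value))

    def __getitem__(self, key):
        entries = self._history.get(key)
        if not entries:                       # unknown key or fully rolled back
            raise KeyError(key)
        return entries[-1][1]

    def roll_rev(self):
        self.rev += 1

    def roll_to(self, target):
        if not 0 <= target <= self.rev:
            raise ValueError("cannot roll to revision %r" % (target,))
        # Every list is examined; entries newer than the target are popped.
        for entries in self._history.values():
            while entries and entries[-1][0] > target:
                entries.pop()
        self.rev = target


rr = SimpleRevDict()
rr["a"] = 17
rr[("b", 23)] = "foo"
rr.roll_rev()
rr["a"] = 0
rr.roll_to(0)
print(rr.rev, rr["a"], rr[("b", 23)])
```

Memory grows with the number of changes, but roll_to has to look at every key's list, which is the cost the next update addresses.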
Update 2: faster rollback (with trade-offs)
I think your concern about popping every list is better expressed as "needs to examine every list to see if it needs popping". With a fancier data structure (more memory, more time to maintain the fancy bits in add and update operations) you can reduce the time to roll back.
Add an array (indexed by revision number) whose values are lists of the dictionary values that were changed in that revision.
# Original rollback code:
for rlist in the_dict.values():
    # stop when the list runs empty, otherwise rlist[-1] raises IndexError
    while rlist and rlist[-1][0] > target_revno:
        rlist.pop()
# New rollback code
for revno in range(current_revno, target_revno, -1):
    for rlist in delta_index[revno]:
        assert rlist[-1][0] == revno
        del rlist[-1] # faster than rlist.pop()
del delta_index[target_revno+1:]
Update 3: full code for fancier method
import collections.abc
class RevDict(collections.abc.MutableMapping):
    def __init__(self):
        self.current_revno = 0
        self.dict = {}
        self.delta_index = [[]]
    def __setitem__(self, key, value):
        rlist = self.dict.get(key)
        rtup = (self.current_revno, value)
        if rlist:
            last_revno = rlist[-1][0]
            if last_revno == self.current_revno:
                rlist[-1] = rtup
                # delta_index already has an entry for this rlist
            else:
                rlist.append(rtup)
                self.delta_index[self.current_revno].append(rlist)
        else:
            # new key, or a key whose history was emptied by a rollback
            rlist = [rtup]
            self.dict[key] = rlist
            self.delta_index[self.current_revno].append(rlist)
    def __getitem__(self, key):
        rlist = self.dict.get(key)
        if not rlist:
            raise KeyError(key)
        return rlist[-1][1]
    def new_revision(self):
        self.current_revno += 1
        self.delta_index.append([])
    def roll_back(self, target_revno):
        assert 0 <= target_revno < self.current_revno
        for revno in range(self.current_revno, target_revno, -1):
            for rlist in self.delta_index[revno]:
                assert rlist[-1][0] == revno
                del rlist[-1]
        del self.delta_index[target_revno+1:]
        self.current_revno = target_revno
    def __delitem__(self, key):
        raise TypeError("RevDict doesn't do del")
    def __contains__(self, key):
        return bool(self.dict.get(key))
    def __len__(self):
        # keys whose history was rolled back to nothing do not count
        return sum(1 for rlist in self.dict.values() if rlist)
    def __iter__(self):
        return (key for key, rlist in self.dict.items() if rlist)
    # keys(), items() and values() are provided by MutableMapping
The deluxe solution would be to use B+Trees with copy-on-write. I used a variation on B+Trees to implement my blist data type (which can be used to very efficiently create revisions of lists, exactly analogous to your problem).
The general idea is to store the data in a balanced tree. When you create a new revision, you copy only the root node. If you need to modify a node shared with an older revision, you copy the node and modify the copy instead. That way, the old tree is still completely intact, but you only need memory for the changes (technically, O(k * log n) where k is the number of changes and n is the total number of items).
It's non-trivial to implement, though.
