Looping through a list adding values to dictionaries - python

I have a list of 4 dictionaries. I also have a list of 36 values. I'm trying to loop through the list of dictionaries, filling each dictionary with the list values in ascending order having item 1 as the key, 2 & 3 as the value attached and so on. The final result being 4 dictionaries, each with three keys and each key having two values attached.
The end result would be
dict1={x:(7,4),y:(7,8),z:(7,22)}
dict2={x: (111,4),y:(111,8),z:(111,22)}
...
Currently I have the following which does not work.
my_list = ['x',7,4,'y',7,8,'z',7,22,'x',111,4,'y',111,8,'z',111,22 and so on]
dict1={}
dict2={}
dict3={}
dict4={}
my_dict_list=[dict1,dict2,dict3,dict4]
for dicts in my_dict_list:
    for x in range (0,len(my_list),3):
        dicts[my_list[x]] = my_list[x+1],my_list[x+2]
        break
The output of that code is the first 3 items in my list, in all four of the dictionaries, like so:
>>> dict_1
{'x': (7, 4)}
>>> dict_2
{'x': (7, 4)}
>>> dict_3
{'x': (7, 4)}
>>> dict_4
{'x': (7, 4)}
I think this is the closest I've got so far as it is actually filling each dictionary, previously I've only managed to fill the first dictionary and other similar wrong scenarios. Can anyone help or point me in the right direction?

Try this:
my_list = ['x',7,4,'y',7,8,'z',7,22,'x',111,4,'y',111,8,'z',111,22]
my_dict_list = []
idx = 0
current_dict = {}
while idx < len(my_list):
    key, *vals = my_list[idx : idx + 3]
    # If the key repeats, start a new current_dict and append the old one to my_dict_list
    if key in current_dict:
        my_dict_list.append(current_dict)
        current_dict = {}
    current_dict[key] = vals
    idx += 3
if current_dict:
    my_dict_list.append(current_dict)
print(my_dict_list)
In the solution above we loop over my_list, reading 3 values at a time. If a key is already present in the current dictionary, that is interpreted as the start of a new dictionary, so the old one is stored in my_dict_list.
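If every dictionary is known to hold the same number of keys (three in the question), the flat list can also be cut into fixed-size chunks instead of watching for a repeated key. This is only a sketch under that assumption: split_into_dicts and keys_per_dict are hypothetical names, and the list length is assumed to be a multiple of 3.

```python
def split_into_dicts(flat, keys_per_dict=3):
    """Group a flat [key, v1, v2, key, v1, v2, ...] list into dictionaries.

    Assumes len(flat) is a multiple of 3 and that every dictionary
    holds exactly keys_per_dict keys.
    """
    # Cut the flat list into [key, v1, v2] triples
    triples = [flat[i:i + 3] for i in range(0, len(flat), 3)]
    dicts = []
    # Every keys_per_dict consecutive triples form one dictionary
    for start in range(0, len(triples), keys_per_dict):
        chunk = triples[start:start + keys_per_dict]
        dicts.append({key: (a, b) for key, a, b in chunk})
    return dicts


my_list = ['x', 7, 4, 'y', 7, 8, 'z', 7, 22,
           'x', 111, 4, 'y', 111, 8, 'z', 111, 22]
result = split_into_dicts(my_list)
print(result)
```

Unlike the loop above, this version stores the two values as a tuple, which matches the output shown in the question.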

Related

Convert a list with duplicating keys into a dictionary and sum the values for each duplicating key

I am new to Python so I do apologize that my first question might not be asked clearly to achieve the right answer.
I thought if I converted a list with duplicating keys into a dictionary then I would be able to sum the values of each duplicating key. I have tried to search on Google and Stack Overflow but I actually still can't solve this problem.
Can anybody help, please? Thank you very much in advance and I truly appreciate your help.
list1 = ["a:2", "b:5", "c:7", "a:8", "b:12"]
My expected output is:
dict = {a: 10, b: 17, c: 7}
You can try this code:
list1 = ["a:2", "b:5", "c:7", "a:8", "b:12"]
l1 = [each.split(":") for each in list1]
d1 = {}
for each in l1:
    if each[0] not in d1:
        d1[each[0]] = int(each[1])
    else:
        d1[each[0]] += int(each[1])
d1
Output: {'a': 10, 'b': 17, 'c': 7}
Explanation:
Step 1. Convert the given list to key-value pairs by splitting each element of the original list on ":" and storing the results in a list.
Step 2. Initialize an empty dictionary.
Step 3. Iterate through each key-value pair in the newly created list and store it in the dictionary. If the key doesn't exist yet, add a new key-value pair; otherwise add the value to the one already stored under that key.
A list does not have "keys" per se; it has elements. In your example, the elements themselves are key-value pairs. To make the dictionary you want, you have to do 3 things:
Parse each element into its key-value pair
Handle duplicate keys
Add each pair to the dictionary.
The code should look like this:
list1 = ["a:2", "b:5", "c:7", "a:8", "b:12"]
dict1 = {}  # make an empty dictionary
for element in list1:
    key, value = element.split(':')  # split each element into a key and a value
    if key in dict1:  # check if the key is already in the dictionary
        dict1[key] += int(value)  # add to the existing key
    else:
        dict1[key] = int(value)  # initialize a new key
print(dict1)
That code prints out
{'a': 10, 'c': 7, 'b': 17}
You could use a defaultdict, iterate over each string and add the corresponding value after splitting it to a pair (key, value).
>>> from collections import defaultdict
>>> res = defaultdict(int)
>>> for el in list1:
...     k, v = el.split(':')
...     res[k] += int(v)
...
>>> res
defaultdict(<class 'int'>, {'a': 10, 'b': 17, 'c': 7})
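For comparison, the same summing can be written with a plain dict and dict.get, which avoids both the if/else branch and the import. This is a minimal sketch; sum_by_key is a hypothetical helper name, and it assumes every element has the form "key:integer".

```python
def sum_by_key(pairs):
    """Sum the integer values of "key:value" strings per key."""
    totals = {}
    for item in pairs:
        key, value = item.split(":")
        # get() returns 0 for a key that has not been seen yet
        totals[key] = totals.get(key, 0) + int(value)
    return totals


list1 = ["a:2", "b:5", "c:7", "a:8", "b:12"]
print(sum_by_key(list1))  # {'a': 10, 'b': 17, 'c': 7}
```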

Python - Get key of specific tuple index minimum in dictionary of tuples

I have a dict of tuples such as:
d = {'a': (3, 5), 'b': (5, 8), 'c': (9, 3)}
I want to return the key of the minimum of the tuple values based on the tuple index. For example, if using tuple index = 0, then 'a' would be returned. if index = 1, then 'c' would be returned. I have tried using min(), for example
min(d, key=d.get)
but am not sure how to manipulate it to select the tuple index to use. Although there are similar questions, I have not found an answer to this. Apologies in advance if this is a duped question, and please link to the answer. Thanks
You can write a lambda function to get the elements from the value by their index:
min(d, key=lambda k: d[k][0])
# 'a'
min(d, key=lambda k: d[k][1])
# 'c'
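The two lambda calls can be wrapped in a small helper so that the tuple index becomes a parameter. A sketch, with key_of_min as a hypothetical name; if several keys share the minimum, min() returns the first one it encounters.

```python
def key_of_min(d, index):
    """Return the key whose tuple has the smallest item at position index."""
    return min(d, key=lambda k: d[k][index])


d = {'a': (3, 5), 'b': (5, 8), 'c': (9, 3)}
print(key_of_min(d, 0))  # a
print(key_of_min(d, 1))  # c
```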
Since multiple keys could have the same value, you might want to return a list of matching keys, not just a single key.
def min_keys(d, index):
    # Initialize lists
    values = []
    matches = []
    # Append tuple items to list based on index
    for t in d.values():
        values.append(t[index])
    # Compute the minimum once, outside the loop
    smallest = min(values)
    # If the item matches the min, append the key to list
    for key in d:
        if d[key][index] == smallest:
            matches.append(key)
    # Return a list of all keys with min value at index
    return matches
Dictionaries are not sorted and cannot be indexed by position.
If you want to return the key that comes first alphabetically, you could use the ASCII order (this works for single-character keys):
print(chr(min([ord(key) for key in d.keys()])))
Here's a portable method you can use for dicts with a structure like yours, and feel free to choose the index of interest in the tuple:
def extract_min_key_by_index(cache, index):
    min_val = float('inf')
    min_key = None
    # Iterate over the dict passed in as cache, not the global d
    for k, v in cache.items():
        if v[index] < min_val:
            min_key, min_val = k, v[index]
    return min_key
d = {'a': (3, 5), 'b': (5, 8), 'c': (9, 3)}
INDEX = 0
print(extract_min_key_by_index(d, INDEX))

Ordering a nested dictionary by the frequency of the nested value

I have this list made from a csv which is massive.
For every item in the list, I have broken it into its id and details. The id is always between 0 and 3 characters long, and details is of variable length.
I created an empty dictionary, D...(rest of code below):
D={}
for v in list:
    id = v[0:3]
    details = v[3:]
    if id not in D:
        D[id] = {}
    if details not in D[id]:
        D[id][details] = 0
    D[id][details] += 1
aside: Can you help me understand what the two if statements are doing? Very new to python and programming.
Anyway, it produces something like this:
{'KEY1_1': {'key2_1' : value2_1, 'key2_2' : value2_2, 'key2_3' : value2_3},
'KEY1_2': {'key2_1' : value2_1, 'key2_2' : value2_2, 'key2_3' : value2_3},
and many more KEY1's with variable numbers of key2's
Each 'KEY1' is unique but each 'key2' isn't necessarily. The value2_x values are all different.
Ok so, right now I found a way to sort by the first KEY
for k, v in sorted(D.items()):
    print(k, ':', v)
I have done enough research to know that dictionaries can't really be sorted but I don't care about sorting, I care about ordering or more specifically frequencies of occurrence. In my code value2_x is the number of times its corresponding key2_x occurs for that particular KEY1_x. I am starting to think I should have used better variable names.
Question: How do I order the top-level/overall dictionary by the number in value2_x which is in the nested dictionary? I want to do some statistics to those numbers like...
How many times does the most frequent KEY1_x:key2_x pair show up?
What are the 10, 20, 30 most frequent KEY1_x:key2_x pairs?
Can I only do that by each KEY1 or can I do it overall? Bonus: If I could order it that way for presentation/sharing that would be very helpful because it is such a large data set. So much thanks in advance and I hope I've made my question and intent clear.
You could use Counter to order the key pairs based on their frequency. It also provides an easy way to get x most frequent items:
from collections import Counter
d = {
'KEY1': {
'key2_1': 5,
'key2_2': 1,
'key2_3': 3
},
'KEY2': {
'key2_1': 2,
'key2_2': 3,
'key2_3': 4
}
}
c = Counter()
for k, v in d.items():
    c.update({(k, k1): v1 for k1, v1 in v.items()})
print(c.most_common(3))
Output (two pairs tie with a count of 3, so which of them appears third can depend on the Python version):
[(('KEY1', 'key2_1'), 5), (('KEY2', 'key2_3'), 4), (('KEY2', 'key2_2'), 3)]
If you only care about the most common key pairs and have no other reason to build nested dictionary you could just use the following code:
from collections import Counter
l = ['foobar', 'foofoo', 'foobar', 'barfoo']
D = Counter((v[:3], v[3:]) for v in l)
print(D.most_common())  # [(('foo', 'bar'), 2), (('foo', 'foo'), 1), (('bar', 'foo'), 1)]
Short explanation: ((v[:3], v[3:]) for v in l) is a generator expression that will generate tuples where first item is the same as top level key in your original dict and second item is the same as key in nested dict.
>>> x = list((v[:3], v[3:]) for v in l)
>>> x
[('foo', 'bar'), ('foo', 'foo'), ('foo', 'bar'), ('bar', 'foo')]
Counter is a subclass of dict. It accepts an iterable as an argument; each unique element of the iterable becomes a key, and its value is the number of times that element occurs in the iterable.
>>> c = Counter(x)
>>> c
Counter({('foo', 'bar'): 2, ('foo', 'foo'): 1, ('bar', 'foo'): 1})
Since generator expression is an iterable there's no need to convert it to list in between so construction can simply be done with Counter((v[:3], v[3:]) for v in l).
The if statements you asked about are checking if the key exists in dict:
>>> d = {1: 'foo'}
>>> 1 in d
True
>>> 2 in d
False
So the following code checks whether the key id exists in dict D, and if it doesn't, it assigns an empty dict to that key.
if id not in D:
    D[id] = {}
The second if does exactly the same for nested dictionaries.
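The two if statements create missing entries by hand. As an illustration only, the same nested counts can be built with defaultdict and Counter, which create those missing entries automatically. The sample rows and the names nested, prefix and details below are assumptions made for this sketch, not data from the question.

```python
from collections import Counter, defaultdict

rows = ['foobar', 'foofoo', 'foobar', 'barfoo']  # hypothetical sample data

# A missing top-level key gets an empty Counter,
# and a missing nested key counts as 0
nested = defaultdict(Counter)
for row in rows:
    prefix, details = row[:3], row[3:]
    nested[prefix][details] += 1

print(dict(nested))
```

Here nested['foo']['bar'] ends up as 2, the same value the explicit if checks would produce.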

comparing list of tuple elements python

I have two lists of tuples
t1 = [ ('a',3,4), ('b',3,4), ('c',4,5) ]
t2 = [ ('a',4,6), ('c',3,4), ('b',3,6), ('d',4,5) ]
Such that
the order of the tuples may not be the same order and
the lists may not contain the same amount of tuple elements.
My goal is to compare the two lists such that if the string element matches, the last integer elements of the two tuples are compared, returning a list that contains -1 if t1[2] < t2[2], 0 if they are equal and 1 if t1[2] > t2[2].
I've tried different variations, but the problem I have is finding a way to match the strings so that the comparison is done on the right tuples.
return [diff_unique(x[2],y[2]) for x,y in zip(new_list,old_list) ]
Where diff_unique does the aforementioned comparison of the integers, and new_list is t1 and old_list is t2.
I've also tried this:
return [diff_unique(x[2],y[2]) for x,y in zip(new_list,old_list) if x[0]==y[0]]
What I intend to do is use the returned list and create a new four-tuple list with the original t1 values along with the difference from the matching t2 tuple. i.e
inc_dec_list = compare_list(new,old)
final_list = [ (f,r,u,chge) for (f,r,u), chge in zip(new,inc_dec_list)]
Where new = t1 and old = t2. This may have been an important detail, sorry I missed it.
Any help in the right direction?
Edit: I have added my test case program that mimics what my original intent is for those who want to help. Thank you all.
import os
import sys
old = [('a',10,1),('b',10,2),('c',100,4),('d',200,4),('f',45,2)]
new = [('a',10,2),('c',10,2),('b',100,2),('d',200,6),('e',233,4),('g',45,66)]
def diff_unique(a,b):
    print("a:{} = b:{}".format(a,b))
    if a < b:
        return -1
    elif a==b:
        return 0
    else:
        return 1
def compare_list(new_list, old_list):
    a = { t[0]:t[1:] for t in new_list }
    b = { t[0]:t[1:] for t in old_list }
    common = list( set(a.keys())&set(b.keys()))
    return [diff_unique(a[key][1], b[key][1]) for key in common]
    #get common tuples
    #common = [x for x,y in zip(new_list,old_list) if x[0] == y[0] ]
    #compare common to old list
    #return [diff_unique(x[2],y[2]) for x,y in zip(new_list,old_list) ]
inc_dec_list = compare_list(new,old)
print(inc_dec_list)
final_list = [ (f,r,u,chge) for (f,r,u), chge in zip(new,inc_dec_list)]
print(final_list)
To match the tuples by string from different lists, you can use dict comprehension (order inside the tuples is preserved):
a = {t[0]:t[1:] for t in t1} # {'a': (3, 4), 'b': (3, 4), 'c': (4, 5)}
b = {t[0]:t[1:] for t in t2} # {'a': (4, 6), 'c': (3, 4), 'b': (3, 6), 'd': (4, 5)}
Then you can iterate over the keys of both dictionaries and do the comparison. Assuming you only want to do the comparison for keys/tuples present in t1 and t2, you can join the keys using sets:
common_keys = list(set(a.keys())&set(b.keys()))
And finally compare the dictionary's items and create the list you want like this:
return [diff_unique(a[key][1],b[key][1]) for key in common_keys ]
If you need the output in the order of the alphabetically sorted characters, use the sorted function on the keys:
return [diff_unique(a[key][1],b[key][1]) for key in sorted(common_keys) ]
If you want all keys to be considered, you can do the following:
all_keys = set(a) | set(b)  # union of the keys of both dictionaries
l = list()
for key in sorted(all_keys):
    try:
        l.append(diff_unique(a[key][1],b[key][1]))
    except KeyError:
        l.append("whatever you want")
return l
With the new information about what values should be returned in what order, the solution would be this:
ordered_keys = [t[0] for t in t1]
a = {t[0]:t[1:] for t in t1} # {'a': (3, 4), 'b': (3, 4), 'c': (4, 5)}
b = {t[0]:t[1:] for t in t2} # {'a': (4, 6), 'c': (3, 4), 'b': (3, 6), 'd': (4, 5)}
l = list()
for key in ordered_keys:  # keep the order of t1 so the result can be zipped with it
    try:
        l.append(diff_unique(a[key][1],b[key][1]))
    except KeyError:
        l.append(0) # default value
return l
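Putting these pieces together, a self-contained version might look like the sketch below. It is not the asker's exact program: diff_unique here is a compact stand-in for the function in the question, the default parameter is an assumption for names missing from the old list, and names are assumed to be unique within each list.

```python
def diff_unique(a, b):
    """Return -1, 0 or 1 when a is below, equal to or above b."""
    return (a > b) - (a < b)


def compare_list(new_list, old_list, default=0):
    """Compare the last elements of tuples that share the same first element.

    The result follows the order of new_list, so it can be zipped with it.
    """
    old_by_name = {t[0]: t[1:] for t in old_list}
    result = []
    for name, *rest in new_list:
        if name in old_by_name:
            result.append(diff_unique(rest[-1], old_by_name[name][-1]))
        else:
            result.append(default)  # no matching tuple in old_list
    return result


t1 = [('a', 3, 4), ('b', 3, 4), ('c', 4, 5)]
t2 = [('a', 4, 6), ('c', 3, 4), ('b', 3, 6), ('d', 4, 5)]
inc_dec_list = compare_list(t1, t2)
final_list = [(f, r, u, chge) for (f, r, u), chge in zip(t1, inc_dec_list)]
print(inc_dec_list)  # [-1, -1, 1]
print(final_list)    # [('a', 3, 4, -1), ('b', 3, 4, -1), ('c', 4, 5, 1)]
```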
First, build a default dictionary from each list, with the default value for a nonexistent key being a tuple whose last element is the smallest possible value for a comparison.
SMALL = (-float('inf'),)
from collections import defaultdict
d1 = defaultdict(lambda: SMALL, [(t[0], t[1:]) for t in t1])
d2 = defaultdict(lambda: SMALL, [(t[0], t[1:]) for t in t2])
Next, iterate over the keys in each dictionary (which can be created easily with itertools.chain). You probably want to sort the keys for the resulting list to have any meaning (otherwise, how do you know which keys produced which of -1/0/1?)
from itertools import chain
def cmp(x, y):  # cmp() was removed in Python 3; this reproduces it
    return (x > y) - (x < y)
all_keys = set(chain(d1, d2))
result = [cmp(d1[k][-1], d2[k][-1]) for k in sorted(all_keys)]
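Here is a simple solution to your problem. It is not a one-liner like your attempts, but I hope it still helps:
def compare_list(t1, t2):
    result = []
    for a in t1:
        for b in t2:
            if a[0] == b[0]:
                # cmp() was removed in Python 3; this expression is equivalent
                result.append((a[-1] > b[-1]) - (a[-1] < b[-1]))
    return result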
In Python 3.x, you can compare two lists of tuples a and b thus:
import operator
a = [(1,2),(3,4)]
b = [(3,4),(1,2)]
# convert both lists to sets before calling the eq function
print(operator.eq(set(a),set(b))) #True

Inverting a dictionary when some of the original values are identical

Say I have a dictionary called word_counter_dictionary that counts how many words are in the document in the form {'word' : number}. For example, the word "secondly" appears one time, so the key/value pair would be {'secondly' : 1}. I want to make an inverted list so that the numbers will become keys and the words will become the values for those keys so I can then graph the top 25 most used words. I saw somewhere where the setdefault() function might come in handy, but regardless I cannot use it because so far in the class I am in we have only covered get().
inverted_dictionary = {}
for key in word_counter_dictionary:
    new_key = word_counter_dictionary[key]
    inverted_dictionary[new_key] = word_counter_dictionary.get(new_key, '') + str(key)
inverted_dictionary
So far, using this method above, it works fine until it reaches another word with the same value. For example, the word "saves" also appears once in the document, so Python will add the new key/value pair just fine. BUT it erases the {1 : 'secondly'} with the new pair so that only {1 : 'saves'} is in the dictionary.
So, bottom line, my goal is to get ALL of the words and their respective number of repetitions in this new dictionary called inverted_dictionary.
A defaultdict is perfect for this
word_counter_dictionary = {'first':1, 'second':2, 'third':3, 'fourth':2}
from collections import defaultdict
d = defaultdict(list)
for key, value in word_counter_dictionary.items():
    d[value].append(key)
print(d)
Output:
defaultdict(<class 'list'>, {1: ['first'], 2: ['second', 'fourth'], 3: ['third']})
What you can do is turn each value into a list of the words that share the same count:
word_counter_dictionary = {'first':1, 'second':2, 'third':3, 'fourth':2}
inverted_dictionary = {}
for key in word_counter_dictionary:
    new_key = word_counter_dictionary[key]
    if new_key in inverted_dictionary:
        inverted_dictionary[new_key].append(str(key))
    else:
        inverted_dictionary[new_key] = [str(key)]
print(inverted_dictionary)
# {1: ['first'], 2: ['second', 'fourth'], 3: ['third']}
Python dicts do NOT allow repeated keys, so you can't use a simple dictionary to store multiple elements with the same key (1 in your case). For your example, I'd rather have a list as the value of your inverted dictionary, and store in that list the words that share the number of appearances, like:
inverted_dictionary = {}
for key in word_counter_dictionary:
    new_key = word_counter_dictionary[key]
    if new_key in inverted_dictionary:
        inverted_dictionary[new_key].append(key)
    else:
        inverted_dictionary[new_key] = [key]
In order to get the 25 most repeated words, you should iterate through the (sorted) keys in the inverted_dictionary and store the words:
common_words = []
for key in sorted(inverted_dictionary.keys(), reverse=True):
    if len(common_words) < 25:
        common_words.extend(inverted_dictionary[key])
    else:
        break
common_words = common_words[:25] # In case there are more than 25 words
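Since the question says only get() has been covered in class, here is a sketch of the same inversion using nothing but get(). Note that the attempt in the question calls get() on word_counter_dictionary, while the lookup has to go to the inverted dictionary. The sample counts are the ones used in the answers above.

```python
word_counter_dictionary = {'first': 1, 'second': 2, 'third': 3, 'fourth': 2}

inverted_dictionary = {}
for word in word_counter_dictionary:
    count = word_counter_dictionary[word]
    # Look the count up in the *inverted* dictionary; a count that has
    # not been seen yet starts from an empty list
    inverted_dictionary[count] = inverted_dictionary.get(count, []) + [word]

print(inverted_dictionary)  # {1: ['first'], 2: ['second', 'fourth'], 3: ['third']}
```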
Here's a version that doesn't "invert" the dictionary:
>>> import operator
>>> A = {'a':10, 'b':843, 'c': 39, 'd': 10}
>>> B = sorted(A.items(), key=operator.itemgetter(1), reverse=True)
>>> B
[('b', 843), ('c', 39), ('a', 10), ('d', 10)]
Instead, it creates a list that is sorted, highest to lowest, by value.
To get the top 25, you simply slice it: B[:25].
And here's one way to get the keys and values separated (after putting them into a list of tuples):
>>> [x[0] for x in B]
['b', 'c', 'a', 'd']
>>> [x[1] for x in B]
[843, 39, 10, 10]
or
>>> C, D = zip(*B)
>>> C
('b', 'c', 'a', 'd')
>>> D
(843, 39, 10, 10)
Note that if you only want to extract the keys or the values (and not both) you should have done so earlier. These are just examples of how to handle the tuple list.
For getting the largest elements of some dataset an inverted dictionary might not be the best data structure.
Either put the items in a sorted list (the example assumes you want the two most frequent words):
word_counter_dictionary = {'first':1, 'second':2, 'third':3, 'fourth':2}
counter_word_list = sorted((count, word) for word, count in word_counter_dictionary.items())
Result:
>>> print(counter_word_list[-2:])
[(2, 'second'), (3, 'third')]
Or use Python's included batteries (heapq.nlargest in this case):
import heapq, operator
print(heapq.nlargest(2, word_counter_dictionary.items(), key=operator.itemgetter(1)))
Result:
[('third', 3), ('second', 2)]
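Another option from the standard library is collections.Counter, whose most_common method returns the n entries with the highest counts. A minimal sketch using the same sample dictionary; top_two is just an illustrative name, and for the original question n would be 25.

```python
from collections import Counter

word_counter_dictionary = {'first': 1, 'second': 2, 'third': 3, 'fourth': 2}

# Counter accepts an existing mapping of counts
top_two = Counter(word_counter_dictionary).most_common(2)
print(top_two)  # [('third', 3), ('second', 2)]
```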
