This question already has answers here:
list to dictionary conversion with multiple values per key?
(7 answers)
Closed 4 years ago.
I am new to python and I have a list of years and values for each year. What I want to do is check if the year already exists in a dictionary and if it does, append the value to that list of values for the specific key.
So for instance, I have a list of years and have one value for each year:
2010
2
2009
4
1989
8
2009
7
What I want to do is populate a dictionary with the years as keys and those single digit numbers as values. However, if I have 2009 listed twice, I want to append that second value to my list of values in that dictionary, so I want:
2010: 2
2009: 4, 7
1989: 8
Right now I have the following:
d = dict()
years = []
(get 2 column list of years and values)
for line in list:
    year = line[0]
    value = line[1]
for line in list:
    if year in d.keys():
        d[value].append(value)
    else:
        d[value] = value
        d[year] = year
If I can rephrase your question, what you want is a dictionary with the years as keys and an array for each year containing a list of values associated with that year, right? Here's how I'd do it:
years_dict = dict()
for line in list:
    if line[0] in years_dict:
        # append the new number to the existing array at this slot
        years_dict[line[0]].append(line[1])
    else:
        # create a new array in this slot
        years_dict[line[0]] = [line[1]]
What you should end up with in years_dict is a dictionary that looks like the following:
{
"2010": [2],
"2009": [4,7],
"1989": [8]
}
In general, it's poor programming practice to create "parallel arrays", where items are implicitly associated with each other by having the same index rather than being proper children of a container that encompasses them both.
You would be best off using collections.defaultdict (added in Python 2.5). This allows you to specify the default object type of a missing key (such as a list).
So instead of creating a key if it doesn't exist first and then appending to the value of the key, you cut out the middle-man and just directly append to non-existing keys to get the desired result.
A quick example using your data:
>>> from collections import defaultdict
>>> data = [(2010, 2), (2009, 4), (1989, 8), (2009, 7)]
>>> d = defaultdict(list)
>>> d
defaultdict(<type 'list'>, {})
>>> for year, month in data:
...     d[year].append(month)
...
>>> d
defaultdict(<type 'list'>, {2009: [4, 7], 2010: [2], 1989: [8]})
This way you don't have to worry about whether you've seen a digit associated with a year or not. You just append and forget, knowing that a missing key will always be a list. If a key already exists, then it will just be appended to.
You can use setdefault.
for line in list:
    # line[0] is the year, line[1] is the value
    d.setdefault(line[0], []).append(line[1])
This works because setdefault returns the list as well as setting it on the dictionary, and because a list is mutable, appending to the version returned by setdefault is the same as appending it to the version inside the dictionary itself. If that makes any sense.
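A minimal sketch of that point, written for Python 3. The sample pairs and the variable names are illustrative assumptions, not the asker's actual data source.

```python
# Illustrative data: (year, value) pairs like those in the question.
pairs = [(2010, 2), (2009, 4), (1989, 8), (2009, 7)]

d = {}
for year, value in pairs:
    # setdefault stores [] under `year` if the key is missing, then returns
    # whatever list is stored there, so append mutates the dict's own list.
    d.setdefault(year, []).append(value)

# The object returned by setdefault is the very list held in the dictionary.
same_object = d.setdefault(2009, []) is d[2009]

print(d)            # {2010: [2], 2009: [4, 7], 1989: [8]}
print(same_object)  # True
```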
d = {}
# import list of year,value pairs
for year,value in mylist:
    try:
        d[year].append(value)
    except KeyError:
        d[year] = [value]
The Python way - it is easier to receive forgiveness than ask permission!
Here is an alternative way of doing this using the not in operator:
# define an empty dict
years_dict = dict()
for line in list:
    # here define what key is, for example,
    key = line[0]
    # check if key is already present in dict
    if key not in years_dict:
        years_dict[key] = []
    # append some value
    years_dict[key].append(some.value)
It's easier if you get these values into a list of tuples. To do this, you can use list slicing and the zip function.
data_in = [2010,2,2009,4,1989,8,2009,7]
data_pairs = zip(data_in[::2],data_in[1::2])
zip takes an arbitrary number of lists, in this case the even-indexed and odd-indexed entries of data_in, and pairs them up element by element into tuples.
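To make the slicing concrete, here is a small Python 3 sketch of the same idea. The intermediate names years and values are added for illustration only.

```python
data_in = [2010, 2, 2009, 4, 1989, 8, 2009, 7]

years = data_in[::2]     # every second item starting at index 0
values = data_in[1::2]   # every second item starting at index 1

# In Python 3, zip returns a lazy iterator, so wrap it in list() to inspect it.
data_pairs = list(zip(years, values))

print(years)       # [2010, 2009, 1989, 2009]
print(values)      # [2, 4, 8, 7]
print(data_pairs)  # [(2010, 2), (2009, 4), (1989, 8), (2009, 7)]
```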
Now we can use the setdefault method.
data_dict = {}
for x in data_pairs:
    data_dict.setdefault(x[0],[]).append(x[1])
setdefault takes a key and a default value, and returns either the value already associated with the key or, if there is no current value, the default value (which it also stores under that key). In this case, we will get either an empty or a populated list, to which we then append the current value.
If you want an (almost) one-liner:
from collections import deque
d = {}
deque((d.setdefault(year, []).append(value) for year, value in source_of_data), maxlen=0)
Using dict.setdefault, you can encapsulate the idea of "check if the key already exists and make a new list if not" into a single call. This allows you to write a generator expression which is consumed by deque as efficiently as possible since the queue length is set to zero. The deque will be discarded immediately and the result will be in d.
This is something I just did for fun. I don't recommend using it. There is a time and a place to consume arbitrary iterables through a deque, and this is definitely not it.
I am trying to define a series of dictionaries and then iterate through them below. I've tried putting the dictionary names into a list to iterate over, but that then throws up an error that a string does not have .keys() method. I'm presuming that is because Python thinks the values in the list are just strings and not meant to represent the dictionaries above.
I'm not sure how else I could code this though. The code is here:
prem_year_map = {
2011: 2935,
2012: 3389,
2013: 3853,
2014: 4311,
}
year_tournament_map = {
2013: 8273,
2012: 6978,
2011: 5861,
2010: 4940,
}
tournament_list = [prem_year_map, year_tournament_map]
for x in tournament_list:
    years = sorted(tournament_list.keys())
    print years
Can anyone suggest an alternative method?
Thanks
I'm presuming that is because Python thinks the values in the list are just strings and not meant to represent the dictionaries above.
This is not right. A list in Python can contain any type of reference.
You simply need to call x.keys() on each dictionary instead of tournament_list.keys(). With the latter you are asking for the keys of the list, which do not exist; it is the dictionaries that have the keys.
for x in tournament_list:
    years = sorted(x.keys())
    print years
As pointed out by @JonClements, you can also use sorted(x), which returns a list (and is a bit more efficient in Python 2.x). Note that it can't return a dictionary, since the standard dictionary cannot preserve an order.
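For readers on Python 3, a minimal sketch of the corrected loop using sorted(x) might look like this. The all_years list is an addition for illustration, so the results can be inspected after the loop.

```python
prem_year_map = {2011: 2935, 2012: 3389, 2013: 3853, 2014: 4311}
year_tournament_map = {2013: 8273, 2012: 6978, 2011: 5861, 2010: 4940}

tournament_list = [prem_year_map, year_tournament_map]

all_years = []
for x in tournament_list:
    # Iterating a dict yields its keys, so sorted(x) sorts the keys directly.
    years = sorted(x)
    all_years.append(years)
    print(years)
# [2011, 2012, 2013, 2014]
# [2010, 2011, 2012, 2013]
```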
To iterate over a dictionary's keys and values together, you need a different iterator.
for key, value in {}.iteritems():
    print key, value
To walk several dictionaries in one loop, the standard itertools module is a good choice.
import itertools
for key, value in itertools.chain(dict1.iteritems(), dict2.iteritems()):
    print key, value
You can also build a list of iterators and loop over them.
iterators = []
iterators.append(oneDict.iteritems())
# the loops below must live inside a generator function for yield to be valid
for iterator in iterators:
    for item in iterator:
        yield item # (key, value)
Choose what is the simplest.
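As a rough Python 3 counterpart of the itertools approach: iteritems() no longer exists there, so items() is used instead. The names dict1 and dict2 and their contents are placeholders.

```python
import itertools

# Placeholder dictionaries, for illustration only.
dict1 = {"a": 1, "b": 2}
dict2 = {"c": 3}

# chain walks the first iterable to exhaustion, then the next one.
pairs = list(itertools.chain(dict1.items(), dict2.items()))
for key, value in pairs:
    print(key, value)
```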
It should be:
for x in tournament_list:
    years = sorted(x.keys())
    print years
Otherwise you are trying to get the keys from the list of dictionaries (which of course makes no sense).
i have a dictionary, in which each key has a list as its value and those lists are of different sizes. I populated keys and values using add and set(to avoid duplicates). If i output my dictionary, the output is:
blizzard set(['00:13:e8:17:9f:25', '00:21:6a:33:81:50', '58:bc:27:13:37:c9', '00:19:d2:33:ad:9d'])
alpha_jian set(['00:13:e8:17:9f:25'])
Here, blizzard and alpha_jian are two keys in my dictionary.
Now, i have another text file which has two columns like
00:21:6a:33:81:50 45
00:13:e8:17:9f:25 59
As you can see, the first column items are one of the entries in each list of my dictionary. For example, 00:21:6a:33:81:50 belongs to the key 'blizzard' and 00:13:e8:17:9f:25 belongs to the key 'alpha_jian'.
What I want to do is go through the first-column items in my text file and, if an entry is found in the dictionary, find its corresponding key, take the length of that key's collection, and add it to a new dictionary, say newDict.
For example 00:21:6a:33:81:50 belongs to blizzard. Hence, newDict entry will be:
newDict[blizzard] = 4 // since the blizzard key corresponds to a list of length 4.
This is the code i expected to do this task:
newDict = dict()
# myDict is present with entries like specified above
with open("input.txt") as f:
    for line in f:
        fields = line.split("\t")
        for key, value in myDict.items():
            if fields[0] == #Some Expression:
                newDict[key] = len(value)
print newDict
Here, my question is what should be #Some Expression in my code above. If values are not lists, this is very easy. But how to search in lists? Thanks in advance.
You are looking for in
if fields[0] in value:
But this isn't a very efficient method, as it involves scanning the dict values over and over
You can make a temporary datastructure to help
helper_dict = {k: v for v, x in myDict.items() for k in x}
So your code becomes
# map each address back to the key whose set contains it
helper_dict = {k: v for v, x in myDict.items() for k in x}
newDict = {}
with open("input.txt") as f:
    for line in f:
        fields = line.split("\t")
        key = fields[0]
        if key in helper_dict:
            newDict[helper_dict[key]] = len(myDict[helper_dict[key]])
Doesn't
if fields[0] in value:
solve your problem? Or have I misunderstood your question?
Looks like
if fields[0] in value:
should do the trick. I.e. check if the field is a member of the set (this also works for lists, but a bit slower at least if the lists are large).
(note that lists and sets are two different things; one is an ordered container that can contain multiple copies of the same value, the other an unordered container that can contain only one copy of each value.)
You may also want to add a break after the newdict assignment, so you don't keep checking all the other dictionary entries.
if fields[0] in value: should do the trick given that from what you say above every value in the dictionary is a set, whether of length 1 or greater.
It would probably be more efficient to build a new dictionary with keys like '00:13:e8:17:9f:25' (assuming these are unique), and associated values being the number of entries in their set before you start though - that way you will avoid recalculating this stuff repeatedly. Obviously, if the list isn't that long then it doesn't make much difference.
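A hedged Python 3 sketch of that precomputed lookup follows. The sample data is copied from the question, the io.StringIO object stands in for input.txt so the example runs on its own, and the names lookup, sample_file and new_dict are illustrative. Because one address can appear in more than one set, the lookup stores a list of (key, size) pairs per address rather than assuming uniqueness.

```python
import io

my_dict = {
    "blizzard": {"00:13:e8:17:9f:25", "00:21:6a:33:81:50",
                 "58:bc:27:13:37:c9", "00:19:d2:33:ad:9d"},
    "alpha_jian": {"00:13:e8:17:9f:25"},
}

# Precompute address -> [(key, set size), ...] once, up front.
lookup = {}
for key, addresses in my_dict.items():
    for address in addresses:
        lookup.setdefault(address, []).append((key, len(addresses)))

# Stand-in for input.txt (tab-separated columns).
sample_file = io.StringIO("00:21:6a:33:81:50\t45\n00:13:e8:17:9f:25\t59\n")

new_dict = {}
for line in sample_file:
    fields = line.rstrip("\n").split("\t")
    # One dictionary lookup per line instead of scanning every set.
    for key, size in lookup.get(fields[0], []):
        new_dict[key] = size

print(new_dict)
```

With the sample data, the second address belongs to both sets, so both keys end up in new_dict.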
I've been working on this thing for hours, still cant figure it out :O
The problem I'm having is this. Let's say I have a dictionary with 4-element tuples as keys and integers as values. When an element is removed from every tuple in the dictionary, making two of the tuples the same, the values of the two tuples don't add up. Instead, a single entry is formed, whose value is one of the previous two values.
Let's say I have a dictionary:
dict={('A','B','D','C'): 4, ('C','B','A','D'):5, ('D','A','C','B'):3,('D','A','B','C'):1}
Now I wanna remove one letter from the entire dictionary.
for example, If I wanna remove 'B'. The following new dictionary is formed, but isn't returned, because two of the elements are the same.
{('A','D','C'): 4, ('C','A','D'):5, ('D','A','C'):3,('D','A','C'):1}
Instead of ('D','A','C'):3,('D','A','C'):1 becoming ('D','A','C'):4, this is what ends up happening:
('D','A','C'):3 along with other tuples
So basically, one of the tuples disappears.
This is the method I'm currently using:
for next in dict:
    new_tuple=()
    for i in next:
        if i!='B':
            new_tuple+=(i,)
    new_dict[new_tuple]=dict[next]
The above code returns new_dict as the following:
{('A','D','C'): 4, ('C','A','D'):5, ('D','A','C'):3}
So what can I do to remove one letter from every tuple in the entire dictionary so that, if two of the tuples end up the same, they merge and their values add up?
You will have to rebuild your entire dictionary, as each key/value pair is going to be affected. You can use a defaultdict to make the merging easier when you encounter now-overlapping keys:
from collections import defaultdict
new_dict = defaultdict(int)
for key, value in old_dict.items():
    # drop the letter being removed ('B' in the question's example)
    new_key = tuple(i for i in key if i != 'B')
    new_dict[new_key] += value
Because when first looking up new_key in new_dict it'll be set to 0 by default, all we have to do is add the old value to update new_dict for when we first encounter a key. The next time we encounter the key the values are 'merged' by adding them up.
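Here is a self-contained sketch of the same approach applied to the dictionary from the question, removing 'B'. The helper name remove_letter is hypothetical and introduced only for illustration.

```python
from collections import defaultdict

old_dict = {
    ('A', 'B', 'D', 'C'): 4,
    ('C', 'B', 'A', 'D'): 5,
    ('D', 'A', 'C', 'B'): 3,
    ('D', 'A', 'B', 'C'): 1,
}

def remove_letter(source, letter):
    """Drop `letter` from every key tuple, summing values of keys that collide."""
    merged = defaultdict(int)
    for key, value in source.items():
        new_key = tuple(item for item in key if item != letter)
        merged[new_key] += value  # missing keys start at 0
    return dict(merged)

new_dict = remove_letter(old_dict, 'B')
print(new_dict)
# {('A', 'D', 'C'): 4, ('C', 'A', 'D'): 5, ('D', 'A', 'C'): 4}
```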
Possible Duplicate:
Python List vs. Array - when to use?
I'm working on a few projects in Python, and I have a few questions:
What's the difference between Arrays and Lists?
If it's not obvious from question 1, which should I use?
How do you use the preferred one? (create array/list, add item, remove item, pick random item)
Use lists unless you want some very specific features that are in the C array libraries.
Python really has three primitive data structures:
tuple = ('a','b','c')
list = ['a','b','c']
dict = {'a':1, 'b': True, 'c': "name"}
list.append('d') #will add 'd' to the list
list[0] #will get the first item 'a'
list.insert(i, x) # Insert an item at a given position. The first argument is the index of the element before which to insert, so a.insert(0, x) inserts at the front of the list, and a.insert(len(a), x) is equivalent to a.append(x).
list.pop(2) # will remove items by position (index), remove the 3rd item
list.remove(x) # Remove the first item from the list whose value is x.
list.index(x) # Return the index in the list of the first item whose value is x. It is an error if there is no such item.
list.count(x) # Return the number of times x appears in the list.
list.sort(cmp=None, key=None, reverse=False) # Sort the items of the list in place (the arguments can be used for sort customization, see sorted() for their explanation). This is the Python 2 signature; Python 3 removed the cmp argument.
list.reverse() # Reverse the elements of the list, in place.
More on data structures here:
http://docs.python.org/tutorial/datastructures.html
Nothing really concrete here and this answer is a bit subjective...
In general, I feel you should use a list just because it is supported in the syntax and is used more widely in the other libraries, etc.
You should use arrays if you know that everything in the "list" will be of the same type and you want to store the data more compactly.
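As a rough illustration of the operations asked about (create, add an item, remove an item, pick a random item), here is a Python 3 sketch covering both a list and the standard library's array.array. The variable names and sample values are made up for the example.

```python
import random
from array import array

# A list can hold mixed types; this is the general-purpose choice.
items = ['a', 'b', 'c']
items.append('d')              # add an item to the end
items.remove('b')              # remove the first item equal to 'b'
picked = random.choice(items)  # pick a random item

# array.array stores a single fixed type compactly; 'i' means signed int.
numbers = array('i', [1, 2, 3])
numbers.append(4)
numbers.remove(2)
picked_number = random.choice(numbers)

print(items)    # ['a', 'c', 'd']
print(numbers)  # array('i', [1, 3, 4])
```

Both types share most of the same methods, which is part of why a list is usually the default and an array is only swapped in when compact same-type storage matters.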
How come that I can easily do a for-loop in Python to loop through all the elements of a dictionary in the order I appended the elements but there's no obvious way to access a specific element? When I say element I'm talking about a key+value pair.
I've looked through what some basic tutorials on Python says on dictionaries but not a single one answers my question, I can't find my answer in docs.python.org either.
According to:
accessing specific elements from python dictionary (Senderies comment) a dict is supposed to be unordered but then why does the for-loop print them in the order I put them in?
You access a specific element in a dictionary by key. That's what a dictionary is. If that behavior isn't what you want, use something other than a dictionary.
a dict is supposed to be unordered but then why does the for-loop print them in the order I put them in?
Coincidence: basically, you happened to put them in in the order that Python prefers. This isn't too hard to do, especially with integers (ints are their own hashes and will tend to come out from a dict in ascending numeric order, though this is an implementation detail of CPython and may not be true in other Python implementations), and especially if you put them in in numerical order to begin with.
"Unordered" really means that you don't control the order, and it may change due to various implementation-specific criteria, so you should not rely on it. Of course when you iterate over a dictionary elements come out in some order.
If you need to be able to access dictionary elements by numeric index, there are lots of ways to do that. collections.OrderedDict is the standard way; the keys are always returned in the order you added them, so in Python 2 you can always do foo[foo.keys()[i]] to access the ith element (in Python 3, keys() returns a view that cannot be indexed, so use foo[list(foo)[i]]). There are other schemes you could use as well.
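A small Python 3 sketch of index-based access through an OrderedDict. The keys and values are illustrative. Note that building list(foo) costs time proportional to the size of the dictionary, and that since Python 3.7 a plain dict also preserves insertion order as a language guarantee, which was not the case for the Python 2 interpreters discussed in this thread.

```python
from collections import OrderedDict

foo = OrderedDict()
foo['first'] = 1
foo['second'] = 2
foo['third'] = 3

i = 1
# keys() is a view in Python 3 and cannot be indexed, so build a list first.
key_at_i = list(foo)[i]
item_at_i = (key_at_i, foo[key_at_i])

print(item_at_i)  # ('second', 2)
```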
Python dicts are accessed by hashing the key. So if you have any sort of a sizable dict and things are coming out in the order you put them in, then it's time to start betting on the lottery!
my_dict = {}
my_dict['a'] = 1
my_dict['b'] = 2
my_dict['c'] = 3
my_dict['d'] = 4
for k,v in my_dict.items():
    print k, v
yields:
a 1
c 3
b 2
d 4
d = {}
d['first'] = 1
d['second'] = 2
d['third'] = 3
print d
# prints {'second': 2, 'third': 3, 'first': 1}
# Hmm, just like the docs say, order of insertion isn't preserved.
print d['third']
# prints 3
# And there you have it: access to a specific element
If you want to iterate through the items in insertion order, you should be using OrderedDict. A regular dict is not guaranteed to do the same, so you're asking for trouble later if you rely on it to do so.
If you want to access a particular item, you should access it by its key using the [] operator or the get() method. That's the primary function of a dict, after all.
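The difference between the two access styles can be sketched as follows (Python 3 syntax; the dictionary contents are illustrative).

```python
d = {'first': 1, 'second': 2, 'third': 3}

by_index_operator = d['third']      # raises KeyError if the key is missing
missing = d.get('fourth')           # returns None instead of raising
with_default = d.get('fourth', 0)   # returns the supplied default

print(by_index_operator, missing, with_default)  # 3 None 0
```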
The iteration order that results from hashing several keys depends on the keys themselves:
sometimes the order appears to be kept, as in the following example with d_one;
generally, the order is not kept, as in the following example with d_two.
If you believe that insertion order is always kept, you have been misled by particular cases in which it happens to be preserved.
d_one = {}
for i,x in enumerate((122,'xo','roto',885)):
    print x
    d_one[x] = i
print
for k in d_one:
    print k
print '\n=======================\n'
d_two = {}
for i,x in enumerate((122,'xo','roto','taratata',885)):
    print x
    d_two[x] = i
print
for k in d_two:
    print k
result
122
xo
roto
885
122
xo
roto
885
=======================
122
xo
roto
taratata
885
122
taratata
xo
roto
885
By the way, what you call "elements of a dictionary" are commonly called "items of the dictionary" (hence the items() and iteritems() methods of a dictionary).