Multiple values to one key in python - python

My main goal is simple: I want multiple values to be returned when I look up a single key. However, I'm getting errors and confusing behavior. I am new to Python, so I fully expect there to be a simple reason for this issue.
I have a list of objects, where each object only holds its own index, i.e.
1
2
3
4
etc..
and a file containing the groups that each of the objects belong to, listed in the same order. The file is a single value for n lines (n being the length of the list of objects as well.) i.e. the file looks like this:
2
5
2
4
etc..
meaning the first object belongs in group 2, the second object in group 5, the third in group 2 and fourth in group 4. This file will change depending on my input files. I have attempted the two following suggestions (that I could find).
EDIT: my end goal: to have a dictionary with group numbers as the keys and the objects in the groups as the values.
I looked to this StackOverflow question first for help since it is so similar, and ended up with this code:
def createdDict(list, file):
    f = open(file, 'r')
    d={}
    i=0
    for line in f:
        groupIndex = int(line)
        if groupIndex in d:
            d[groupIndex].append(list[i])
        else:
            d[groupIndex] = list[i]
        i +=1
    print d
    f.close()
And this error:
AttributeError: 'Element' object has no attribute 'append'
d[groupIndex] is just a dictionary and its key and groupIndex should also just be an integer.. not an object from a class I created earlier in the script. (Why is this error showing up?)
I then revised my code after coming upon this other question to be like the following, since I thought this was an alternative way to accomplish my task. My code then looked like this:
def createdDict(list, file):
    f = open(file, 'r')
    d={}
    i=0
    for line in f:
        groupIndex = int(line)
        if groupIndex in d:
            d.setdefault('groupIndex', []).append(list[i])
        else:
            d[groupIndex] = list[i]
        i +=1
    print d
    f.close()
This code snippet doesn't end in an error or what I want, but rather (what I believe) are the last objects in the groups... so print d gives me the key and the last object placed in the group (instead of the desired: ALL of the objects in that group) and then terminal randomly spits out groupIndex followed by all of the objects in list.
My question: what exactly am I missing here? People upvoted the answers to the questions I linked, so they are most likely correct and I am likely implementing them incorrectly. I don't need a correction to both procedures, but the most efficient answer to my problem of getting multiple values attached to one key. What would be the most pythonic way to accomplish this task?
EDIT 2: if this helps at all, here is the class that the error in the first method refers to. I have no idea how it identified any part of this code as belonging to this class. I haven't really developed it yet, but I'm all for an answer, so in case this helps in locating the error:
class Element(object):
    def __init__(self, globalIndex):
        self.globalIndex = globalIndex
    def GetGlobalIndex (self):
        return self.globalIndex
globalIndex is a separate index of objects (Elements). With my current problem, I am taking a list of these Elements (this is the list mentioned earlier) and grouping them into smaller groups based upon my file (also mentioned earlier). The reason I thought it shouldn't matter is that the list is essentially a counting up of integers... How would it mess with my code?

The base of your problem is in this line:
d[groupIndex] = list[i]
In other words, when a key is not in the dictionary, you add a single value (Element object) under that key. The next time you see that key, you try to append to that single value. But you can't append to single values. You need a container type, such as a list, to append. Python doesn't magically turn your Element object into a list!
The solution is simple. If you want your dictionary's values to be lists, then do that. When you add the first item, assign a one-element list:
d[groupIndex] = [list[i]]
Alternatively, you can take a one-item slice of the original list. This will be a list already.
d[groupIndex] = list[i:i+1]
Now the dictionary's values are always lists, and you can append the second and subsequent values to them without error.
As ecatmur points out, you can further simplify this (eliminating the if statement) using d.setdefault(groupIndex, []).append(list[i]). If the key doesn't exist, then the value is taken to be an empty list, and you simply always append the new item. You could also use collections.defaultdict(list).
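A minimal sketch of the setdefault approach, written for Python 3 (the question's code is Python 2). The function name group_items and the sample data are made up for illustration, and a list of lines stands in for the file:

```python
def group_items(items, group_lines):
    """Map each group number to the list of items assigned to it."""
    groups = {}
    for item, line in zip(items, group_lines):
        group_index = int(line)  # int() ignores the trailing newline
        # setdefault returns the existing list, or stores and returns a new one
        groups.setdefault(group_index, []).append(item)
    return groups


# Hypothetical data: objects 1..4 and the group each one belongs to
groups = group_items([1, 2, 3, 4], ["2\n", "5\n", "2\n", "4\n"])
print(groups)
```

Because every value starts life as a list, there is no special case for the first item in a group.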

Just use
d.setdefault(groupIndex, []).append(list[i])
This will check whether groupIndex is in d, so you don't need the if groupIndex in d: line.

from itertools import izip
from collections import defaultdict
dd = defaultdict(list)
with open('filename.txt') as fin:
    for idx, line in izip(my_list, fin):
        num = int(line)
        dd[num].append(idx)  # append to the dd instance, not the defaultdict class
This creates a defaultdict with a default type of list, so you can append without using setdefault. It then reads each element of my_list together with the corresponding line from the file, converts the line to an integer, and adds the element to the group represented by num.
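In Python 3, itertools.izip no longer exists and the built-in zip is already lazy. A sketch of the same idea under that assumption, with io.StringIO standing in for the file and my_list holding made-up values:

```python
from collections import defaultdict
from io import StringIO

my_list = ["obj1", "obj2", "obj3", "obj4"]   # hypothetical objects
fin = StringIO("2\n5\n2\n4\n")               # stands in for open('filename.txt')

dd = defaultdict(list)
for obj, line in zip(my_list, fin):
    dd[int(line)].append(obj)   # a missing key starts out as an empty list

print(dict(dd))
```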

In your first try, you seem to correctly understand that adding the first element to the dictionary item is a special case, and you cannot append yet, since the dictionary item has no value yet.
In your case you set it to list[i]. However, list[i] is not a list, so you cannot run append on it in later iterations.
I would do something like:
for line in f:
    groupIndex = int(line)
    try:
        blah = d[groupIndex] # to check if it exists
    except KeyError:
        d[groupIndex] = [] # if not, assign empty list
    d[groupIndex].append(list[i])
    i += 1 # move on to the next object
print d
f.close()

Related

Search tuple elements within in list

I have a list in Python as
list_data = [('a','b',5),('aa','bb',50)]
and some variables:
a = ('a','b','2')
c = ('aaa','bbb','500')
Now how can I search if a is already there in list_data?
If yes add 2 to the value of a, if not append to list_data?
The result should be as
list_data = [('a','b',7),('aa','bb',50),('aaa','bbb','500')]
Actually, this question is a good way to demonstrate several Pythonic ways of doing things. So let's see what we can do.
In order to check if something is in a Python list you can just use the in operator:
if a in list_data:
    do_stuff()
But what you ask is a bit different. You want to do something like a search by multiple keys, if I understand correctly. In this case you can 'trim' your tuple by discarding the last entry.
Slicing is handy for this:
value_trimmed = value[:-1]
Now you can make a list of trimmed tuples:
list_trimmed = []
for item in list_data:
    list_trimmed.append(item[:-1])
And then search there:
if a[:-1] in list_trimmed:
    do_smth()
This list can be constructed in a less verbose way using a list comprehension:
list_trimmed = [item[:-1] for item in list_data]
To find exactly where your item is, you can use the list's index() method:
list_trimmed.index(a[:-1])
This will return the index of the first occurrence of a[:-1] in list_trimmed, or raise ValueError if it can't be found. We can avoid explicitly checking whether the item is in the list, and do the insertion only if the exception is caught.
Your full code will look like this:
list_data = [('a','b',5), ('aa','bb',50)]
values_to_find = [('a','b','2'), ('aaa','bbb','500')]
list_trimmed = [item[:-1] for item in list_data]
for val in values_to_find:
    val_trimmed = val[:-1]
    try:
        ind = list_trimmed.index(val_trimmed)
        src_tuple = list_data[ind]
        # we can't edit tuple inplace, since they are immutable in python
        list_data[ind] = (src_tuple[0], src_tuple[1], src_tuple[2]+2)
    except ValueError:
        list_data.append(val)
print list_data
Of course, if speed or memory efficiency is your main concern, this code is not very appropriate, but you haven't mentioned these in your question, and that is not what Python is really about, in my opinion.
Edit:
You haven't specified what happens when you check for ('aaa','bbb','500') a second time. Should we use the updated list and increment the matching tuple's last element, or should we stick to the original list and insert another copy?
If we use the updated list, it is not clear how to handle incrementing the string '500' by 2 (we can convert it to an integer, but you should have constructed your query appropriately in the first place).
Or maybe you meant to add the last element of the tuple being searched for to the matching tuple in the list, if found? Please edit your question to make it clear.
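If speed does matter, one option is to key a dictionary on the first two fields so that each lookup avoids scanning the list. This is only a sketch: it assumes the counts are integers and that a found value is added to the stored one, which is just one reading of the question, and the name totals is made up:

```python
list_data = [('a', 'b', 5), ('aa', 'bb', 50)]
values_to_find = [('a', 'b', 2), ('aaa', 'bbb', 500)]  # integers assumed

# Key on the first two fields; dicts keep insertion order in Python 3.7+
totals = {item[:-1]: item[-1] for item in list_data}
for val in values_to_find:
    key = val[:-1]
    # add to an existing entry, or start a new one at val's own count
    totals[key] = totals.get(key, 0) + val[-1]

# Rebuild the list of 3-tuples from the dictionary
list_data = [key + (count,) for key, count in totals.items()]
print(list_data)
```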

How to implement a counter for each element of a python list?

Using Python 3.4
I've got a way that works, but I think there might be a better way.
I want to have a list with a method expand() which chooses a random element from the list, but every time that element is chosen, a counter is incremented. I tried subclassing str to be able to add attributes but it didn't work.
My main problem with what I've got is that the expression random.randint(0,len(self)-1) and using a local variable doesn't seem very Pythonic. Before I added the counter, I could just type random.choice(self)
import random

class clauses(list):
    def __init__(self):
        self.uses = []
    def __setitem__(self,key,value):
        self.uses[key]=0  # a replaced item starts with a fresh counter
        super().__setitem__(key,value)  # super() already binds self
    def __delitem__(self,key):
        del(self.uses[key])
        super().__delitem__(key)
    def append(self,value):
        self.uses.append(0)
        super().append(value)
    def extend(self,sequence):
        for x in sequence:
            self.uses.append(0)
            super().append(x)
    def expand(self):
        n = random.randint(0,len(self)-1)
        self.uses[n] += 1
        return(self[n])
Initializing an empty dictionary along with your list should solve this, assuming there are no duplicate entries within the list.
When adding an element to the list, you can also add it to the dictionary with myDict[element]=0, where myDict is the initialized dictionary and element is the item being added to the list.
Then, when the item is selected, you can simply do: myDict[element]+=1.
When dealing with duplicate entries, you could create a dictionary of dictionaries in which each key is a word, and the nested dictionary's keys for each word are, say, the index positions of the duplicate word (the values of course being the actual counts). This does add substantial complication, however, because when you remove an item from your list you will also need to update the index positions. This nested data structure would look something like this though: { word1: {position1: count1}, word2: {position1: count1, position2: count2}....}
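A rough sketch of the simple, no-duplicates case described above, which also lets random.choice be used again. The class name CountingChooser and its method names are invented for illustration:

```python
import random


class CountingChooser:
    """Hypothetical helper: a list of items plus a dict of usage counts."""

    def __init__(self, items=()):
        self.items = []
        self.uses = {}
        for item in items:
            self.add(item)

    def add(self, item):
        self.items.append(item)
        self.uses.setdefault(item, 0)  # start the counter at zero

    def expand(self):
        item = random.choice(self.items)  # no index bookkeeping needed
        self.uses[item] += 1
        return item


chooser = CountingChooser(["a", "b", "c"])
picks = [chooser.expand() for _ in range(10)]
print(chooser.uses)
```

The items must be hashable for this to work, and two equal items share one counter.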

finding first item in a list whose first item in a tuple is matched

I have a list of several thousand unordered tuples that are of the format
(mainValue, (value, value, value, value))
Given a main value (which may or may not be present), is there a 'nice' way, other than iterating through every item looking and incrementing a value, where I can produce a list of indexes of tuples that match like this:
index = 0
for destEntry in destList:
    if destEntry[0] == sourceMatch:
        destMatches.append(index)
    index = index + 1
So I can compare the sub values against another set, and remove the best match from the list if necessary.
This works fine, but just seems like python would have a better way!
Edit:
As per the question, when writing the original question, I realised that I could use a dictionary instead of the first value (in fact this list is within another dictionary), but after removing the question, I still wanted to know how to do it as a tuple.
With list comprehension your for loop can be reduced to this expression:
destMatches = [i for i,destEntry in enumerate(destList) if destEntry[0] == sourceMatch]
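You can also use the built-in filter() function [1] to filter your data. Note that this returns the matching entries themselves rather than their indexes:
destMatches = filter(lambda destEntry:destEntry[0] == sourceMatch, destList)
[1] In Python 3, filter is a class and returns a lazy filter object rather than a list.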
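To make the difference between the two expressions concrete, here is a small Python 3 sketch with made-up sample data; the name matchingEntries is not from the original:

```python
destList = [("a", (1, 2, 3, 4)), ("b", (5, 6, 7, 8)), ("a", (9, 10, 11, 12))]
sourceMatch = "a"

# Indexes of the matching tuples
destMatches = [i for i, destEntry in enumerate(destList) if destEntry[0] == sourceMatch]

# filter() yields the matching entries; list() materializes them on Python 3
matchingEntries = list(filter(lambda destEntry: destEntry[0] == sourceMatch, destList))

print(destMatches)
print(matchingEntries)
```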

Python - List not converting to Tuple inorder to Sort

def mkEntry(file1):
    for line in file1:
        lst = (line.rstrip().split(","))
        print("Old", lst)
        print(type(lst))
        tuple(lst)
        print(type(lst)) #still showing type='list'
        sorted(lst, key=operator.itemgetter(1, 2))
def main():
    openFile = 'yob' + input("Enter the year <Do NOT include 'yob' or .'txt' : ") + '.txt'
    file1 = open(openFile)
    mkEntry(file1)
main()
TextFile:
Emma,F,20791
Tom,M,1658
Anthony,M,985
Lisa,F,88976
Ben,M,6989
Shelly,F,8975
and I get this output:
IndexError: string index out of range
I am trying to convert lst from a list to a tuple so that I can order the rows by F before M and then from smallest number to largest. Around line 7, it's still printing type list instead of type tuple. I don't know why it's doing that.
print(type(lst))
tuple(lst)
print(type(lst)) #still showing type='list'
You're not changing what lst refers to. You create a new tuple with tuple(lst) and immediately throw it away because you don't assign it to anything. You can do:
lst = tuple(lst)
Note that this will not fix your program. Notice that your sort operation is happening once per line of your file, which is not what you want. Try collecting each line into one sequence of tuples and then doing the sort.
Firstly, you are not saving the tuple you created anywhere:
tup = tuple(lst)
Secondly, there is no point in making it a tuple before sorting it - in fact, a list could be sorted in place as it's mutable, while a tuple would need another copy (although that's fairly cheap, the items it contains aren't copied).
Thirdly, the IndexError has nothing to do with whether it's a list or tuple, nor whether it is sorted. It most likely comes from the itemgetter, because there's a list item that doesn't have three entries in turn - for instance, the strings "F" or "M".
Fourthly, the sort you're doing, but not saving anywhere, is done on each individual line, not the table of data. Considering this means you're comparing a name, a number, and a gender, I rather doubt it's what you intended.
It's completely unclear why you're trying to convert data types, and the code doesn't match the structure of the data. How about moving back to the overview plan and sorting out what you want done? It could well be something like Python's csv module could help considerably.
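One possible shape for the "collect first, sort once" plan, using the csv module mentioned above. This is a sketch, not a drop-in fix: io.StringIO stands in for the opened file, and it assumes the intended order is gender first, then count as a number:

```python
import csv
from io import StringIO
from operator import itemgetter

# Stands in for the opened yob file; contents copied from the question
file1 = StringIO("Emma,F,20791\nTom,M,1658\nAnthony,M,985\n"
                 "Lisa,F,88976\nBen,M,6989\nShelly,F,8975\n")

# Collect every line first, converting the count so it sorts numerically
rows = [(name, gender, int(count)) for name, gender, count in csv.reader(file1)]

# Sort the whole table once: gender first, then count
rows.sort(key=itemgetter(1, 2))
for row in rows:
    print(row)
```

Here itemgetter(1, 2) is applied to whole rows, so indexes 1 and 2 pick out the gender and the count rather than single characters of a string.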

searching a value in a list and outputting its key

I have a dictionary in which each key has a list as its value, and those lists are of different sizes. I populated the keys and values using add and set (to avoid duplicates). If I output my dictionary, the output is:
blizzard set(['00:13:e8:17:9f:25', '00:21:6a:33:81:50', '58:bc:27:13:37:c9', '00:19:d2:33:ad:9d'])
alpha_jian set(['00:13:e8:17:9f:25'])
Here, blizzard and alpha_jian are two keys in my dictionary.
Now, I have another text file which has two columns like
00:21:6a:33:81:50 45
00:13:e8:17:9f:25 59
As you can see, each first-column item is one of the entries in the lists of my dictionary. For example, 00:21:6a:33:81:50 belongs to the key 'blizzard' and 00:13:e8:17:9f:25 belongs to the key 'alpha_jian'.
What I want is to go through the first-column items in my text file and, if that column entry is found in the dictionary, find its corresponding key, find the length of that key's list in the dictionary, and add them to a new dictionary, say newDict.
For example 00:21:6a:33:81:50 belongs to blizzard. Hence, newDict entry will be:
newDict[blizzard] = 4 // since the blizzard key corresponds to a list of length 4.
This is the code I expected to do this task:
newDict = dict()
# myDict is present with entries like specified above
with open("input.txt") as f:
    for line in f:
        fields = line.split("\t")
        for key, value in myDict.items():
            if fields[0] == #Some Expression:
                newDict[key] = len(value)
print newDict
Here, my question is what should be #Some Expression in my code above. If values are not lists, this is very easy. But how to search in lists? Thanks in advance.
You are looking for in
if fields[0] in value:
But this isn't a very efficient method, as it involves scanning the dict values over and over.
You can make a temporary data structure to help:
helper_dict = {k: v for v, x in myDict.items() for k in x}
So your code becomes
helper_dict = {k: v for v, x in myDict.items() for k in x}
with open("input.txt") as f:
    for line in f:
        fields = line.split("\t")
        key = fields[0]
        if key in helper_dict:
            newDict[helper_dict[key]] = len(myDict[helper_dict[key]])
Doesn't
if fields[0] in value:
solve your problem? Or am I misunderstanding your question?
Looks like
if fields[0] in value:
should do the trick. I.e. check if the field is a member of the set (this also works for lists, but a bit slower at least if the lists are large).
(note that lists and sets are two different things; one is an ordered container that can contain multiple copies of the same value, the other an unordered container that can contain only one copy of each value.)
You may also want to add a break after the newdict assignment, so you don't keep checking all the other dictionary entries.
if fields[0] in value: should do the trick given that from what you say above every value in the dictionary is a set, whether of length 1 or greater.
It would probably be more efficient to build a new dictionary with keys like '00:13:e8:17:9f:25' (assuming these are unique), and associated values being the number of entries in their set before you start though - that way you will avoid recalculating this stuff repeatedly. Obviously, if the list isn't that long then it doesn't make much difference.
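A sketch of the precomputed lookup idea in Python 3. Since 00:13:e8:17:9f:25 appears under both keys in the sample output, this version does not assume the addresses are unique and maps each one to every key that owns it. The name owners and the in-memory lines list (standing in for input.txt) are made up:

```python
from collections import defaultdict

myDict = {
    "blizzard": {"00:13:e8:17:9f:25", "00:21:6a:33:81:50",
                 "58:bc:27:13:37:c9", "00:19:d2:33:ad:9d"},
    "alpha_jian": {"00:13:e8:17:9f:25"},
}
lines = ["00:21:6a:33:81:50\t45\n", "00:13:e8:17:9f:25\t59\n"]  # stands in for input.txt

# Build the reverse lookup once: address -> every key whose set contains it
owners = defaultdict(list)
for key, addresses in myDict.items():
    for address in addresses:
        owners[address].append(key)

newDict = {}
for line in lines:
    address = line.split("\t")[0]
    for key in owners.get(address, []):
        newDict[key] = len(myDict[key])

print(newDict)
```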
