Python: creating empty lists within empty lists? - python

EDIT: so I figured out that if I declare this at the beginning it works fine:
RelayPlaceHolder = [[],[],[],[],[],[],[],[],[]]
Why can't something like this create the same sort of empty containers? The number of empty lists might change:
for SwimTeams in SwimTeamList:
    empty = []
    RelayPlaceHolder.append(empty)
this was my old question...
I have a list of lists of further lists of single dictionaries:
TeamAgeGroupGender[team#][swimmer#][their dictionary with a key such as {"free":19.05}]
I have a loop that, for every team in the first level of my lists, loops through every swimmer within that team's list and adds the swim that corresponds to the key "free" to a new list called RelayPlaceHolder[teamid][***the thing I just added***]
for SwimTeams in SwimTeamList:
    empty = []
    RelayPlaceHolder.append(empty)
    teamid = SwimTeamList.index(SwimTeams)
    print SwimTeams
    print teamid
    for swimmers in TeamAgeGroupGender[teamid]:
        swimmerID = TeamAgeGroupGender[teamid].index(swimmers)
        RelayPlaceHolder[teamid].append({SwimTeams:TeamAgeGroupGender[teamid][swimmerID]["Free"]})
    print RelayPlaceHolder[teamid][0]
Desired:
RelayPlaceHolder[0][*** list of swims that go with this team#0 ***]
RelayPlaceHolder[1][*** list of swims that go with team#1, ie the next teamid in the loop***]
For some reason, my loop only adds swims to RelayPlaceHolder[0], even the swims from team #1. I tried using print to troubleshoot: the teamid index and swim team names change just fine from #0 to #1, yet RelayPlaceHolder[teamid].append still adds to the #0 list and not the #1 list. I also know this because later code fails to find the correct key in RelayPlaceHolder[1] (because it turns up empty). I'm not sure why my loop is failing. I've used a similar structure in other loops...
Thank you.

As commented by @doukremt: a concise syntax if you need to define an arbitrary number of lists is:
[[] for i in range(some_number)]
If you need to do it more often, you can implement it in a function:
>>> lists = lambda x: [[] for i in range(x)]
>>> lists(3)
[[], [], []]
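Applied to the question's data, the comprehension can replace the append-inside-the-loop step. The following is a minimal sketch, assuming made-up team names and times (the sample values are hypothetical, not from the original post). It uses enumerate rather than list.index, since index() returns the position of the first equal element and can point at the wrong slot if two entries compare equal:

```python
# Hypothetical sample data shaped like the structures in the question.
SwimTeamList = ["Sharks", "Dolphins"]
TeamAgeGroupGender = [
    [{"Free": 19.05}, {"Free": 20.10}],  # swimmers of team 0
    [{"Free": 18.75}],                   # swimmers of team 1
]

# One independent empty list per team, however many teams there are.
RelayPlaceHolder = [[] for _ in SwimTeamList]

# enumerate yields the position directly, so no index() lookup is needed.
for teamid, team in enumerate(SwimTeamList):
    for swimmer in TeamAgeGroupGender[teamid]:
        RelayPlaceHolder[teamid].append({team: swimmer["Free"]})

print(RelayPlaceHolder)
```

With this sample data, each team's swims end up in the sublist at that team's own position.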

Related

Python Craps: Creating a list with sets. Unexpected types

I am fairly new to programming and now I want to create a simple craps game in Python.
However, I instantly run into a problem. I want to make a list of 12 elements and then make all of them empty sets.
roll = [i for i in range(12)]
for i in roll:
    roll[i] = {}
print(roll)
Sure it works, but there is some issue. It indicates i in the for-loop and says unexpected types, which makes me think there is a more legitimate way of doing this, and I'd like to know why it doesn't work. I did the test to make roll a list of 12 sets in the first place by changing the function, sure it avoids the problem, but doesn't make me learn anything about this issue.
roll = [i for i in range(12)]
roll is a list of integers, from 0 - 11 inclusive.
for i in roll:
For every integer in the list...
roll[i] = {}
Treat the current integer as an index to the list, and replace the integer at that index with a dictionary (The literal {} is not a set). This happens to work the way you intended because the integers in the list happen to be the same values as the indices at which they appear in the list.
The whole thing is redundant to begin with - you don't need to create 12 integers to "allocate" space for the sets later on, just create the sets in the list comprehension:
roll = [set() for _ in range(12)]
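To make the dict-versus-set point concrete, here is a small check that can be run in any Python 3 interpreter. The value 7 added at the end is only an illustration:

```python
empty_braces = {}
empty_set = set()

print(type(empty_braces).__name__)  # dict
print(type(empty_set).__name__)     # set

# Each element of the comprehension is a separate set object,
# so changing one does not affect the others.
roll = [set() for _ in range(12)]
roll[0].add(7)
print(len(roll[0]), len(roll[1]))
```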
You can define an empty set with set(); the literal {} creates an empty dictionary.
In your example you can use:
roll = [i for i in range(12)]
for i in roll:
    roll[i] = set()
print(roll)
If you just want a list of sets you can do:
[set() for i in range(12)]
In your code you're both looping through roll and modifying the elements.
Also {} makes a dict not a set.

Get unique entries in list of lists by an item

This seems like a fairly straightforward problem but I can't seem to find an efficient way to do it. I have a list of lists like this:
list = [['abc','def','123'],['abc','xyz','123'],['ghi','jqk','456']]
I want to get a list of unique entries by the third item in each child list (the 'id'), i.e. the end result should be
unique_entries = [['abc','def','123'],['ghi','jqk','456']]
What is the most efficient way to do this? I know I can use set to get the unique ids, and then loop through the whole list again. However, there are more than 2 million entries in my list and this is taking too long. Appreciate any pointers you can offer! Thanks.
How about this: create a set that keeps track of the ids already seen, and only append sublists whose id has not been seen yet.
l = [['abc','def','123'],['abc','xyz','123'],['ghi','jqk','456']]
seen = set()
new_list = []
for sl in l:
    if sl[2] not in seen:
        new_list.append(sl)
        seen.add(sl[2])
print new_list
Result:
[['abc', 'def', '123'], ['ghi', 'jqk', '456']]
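The seen-set approach above can also be wrapped in a generator, which avoids building the result until it is needed and makes a single pass over the data. This is a sketch only; the name unique_by and its index parameter are made up for illustration:

```python
def unique_by(rows, index):
    """Yield the first row encountered for each distinct value of row[index]."""
    seen = set()
    for row in rows:
        key = row[index]
        if key not in seen:
            seen.add(key)
            yield row


entries = [['abc', 'def', '123'], ['abc', 'xyz', '123'], ['ghi', 'jqk', '456']]
unique_entries = list(unique_by(entries, 2))
print(unique_entries)
```

Set membership tests take constant time on average, so the cost grows linearly with the number of rows.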
Another approach is a nested loop. First create a result list containing the first element. Then, in the outer loop, iterate over the original list starting from index 1. In the inner loop, starting from index 0, check whether the third element of the current entry matches the third element of any entry already held in the result list. If no match is found, add the entry to the result list (which must be declared outside the outer loop); otherwise use the "continue" keyword. Finally, print the result list.

Lookup, concatenate and remove items in python list

I have a list of lists of lists that looks like this:
[[[1], ['apple'], ['AAA']],
 [[2], ['banana'], ['BBB']],
 [[3], ['orange'], ['CCC']],
 [[4], ['pineapple'], ['AAA']],
 [[5], ['tomato'], ['ABC']]]
Probably the wrong terminology, but: I want to find duplicates in the third column, add that row's second column item to the first instance of the duplicates, and then remove the duplicate row.
So using the example: I want to iterate through the list, find the duplicate value 'AAA', add 'pineapple' after 'apple' and remove the (second level) list containing the second instance of 'AAA'.
The list I want to end up with should look like:
[[[1], ['apple', 'pineapple'], ['AAA']],
 [[2], ['banana'], ['BBB']],
 [[3], ['orange'], ['CCC']],
 [[5], ['tomato'], ['ABC']]]
I tried the following but I can't figure out how to do this..
seen = set()
for l in final:
    if l[2] not in seen: # TypeError: unhashable type: 'list'
        # Here I want to add value to first instance
        seen.add(l[2])
        # Remove list
This will do what you're asking for... but I seriously wonder whether you can't change your data structure. It's strange and hard to work with!
newList = []
lookup = {}
for l in final:
    if l[2][0] not in lookup:
        lookup[l[2][0]] = l
        newList.append(l)
    else:
        lookup[l[2][0]][1].append(l[1][0])
print newList
The reason you were getting the TypeError is that you were using l[2] instead of l[2][0]. Remember, l[2] is a list. What you want is to grab the item inside that list (index 0 in this case) and check whether that is in lookup. The lookup dictionary replaces the seen set from your example because it can also retrieve the entry that a duplicate l[2][0] corresponds to, since your data structure currently isn't set up to do something like final['AAA']. However, this isn't ideal and I'd strongly recommend changing the data structure, if possible.
Something else to think about...
Currently, because your items are all essentially lists within lists, the current algorithm will essentially change the nested objects (lists) you were working with, because of object mutability. This means that while final would contain the same objects it did originally, those objects will have changed (in this case with ['apple', 'pineapple']).
If you want to prevent that from happening, look into using the copy module. Specifically, using the deepcopy method to copy all objects (even through the nesting).
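As a rough illustration of that suggestion, the sketch below runs the same merge on a deep copy, so the original nested lists stay untouched. The three sample rows are a shortened version of the question's data, and the variable names working and new_list are only illustrative:

```python
import copy

final = [
    [[1], ['apple'], ['AAA']],
    [[2], ['banana'], ['BBB']],
    [[4], ['pineapple'], ['AAA']],
]

# deepcopy duplicates every nested list, not just the outer one.
working = copy.deepcopy(final)

new_list = []
lookup = {}
for row in working:
    code = row[2][0]
    if code not in lookup:
        lookup[code] = row
        new_list.append(row)
    else:
        # Mutates the copied row only; `final` is not affected.
        lookup[code][1].extend(row[1])

print(new_list)
print(final[0][1])
```

A shallow copy (for example final[:]) would not be enough here, because the inner lists would still be shared with the original.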
Edit:
w0lf's version (Improved readability)
newList = []
lookup = {}
for l in final:
    row_no, fruit, code = l
    unique_id = code[0] # because `code` is a one element list
    if unique_id not in lookup:
        lookup[unique_id] = l
        newList.append(l)
    else:
        lookup[unique_id][1].extend(fruit)
print(newList)
Also note: He remembered to do print(newList) instead of print newList for Py3k users. Since the question is tagged for Python 3, that's the way to go.
A list is an unhashable type, i.e. you cannot add it (as is) to data structures that use hash tables (like a Python dictionary or set). Strings, however, are hashable.
I'd do
seen.add(str(l[2]))
and use the same str(l[2]) in the membership test. This will solve the TypeError.

Compare items in list with nested for-loop

I have a list of URLs in an open CSV which I have ordered alphabetically, and now I would like to iterate through the list and check for duplicate URLs. In a second step, the duplicate should then be removed from the list, but I am currently stuck on the checking part which I have tried to solve with a nested for-loop as follows:
for i in short_urls:
    first_url = i
    for s in short_urls:
        second_url = s
    if i == s:
        print "duplicate"
    else:
        print "all good"
The print statements will obviously be replaced once the nested for-loop is working. Currently, the list contains a few duplicates, but my nested loop does not seem to work correctly as it does not recognise any of the duplicates.
My question is: are there better ways to perform this exercise, and what is the problem with the current nested for-loop?
Many thanks :)
By construction, your method is faulty, even if you indent the if/else block correctly. For instance, imagine you had [1, 2, 3] as short_urls for the sake of argument. The outer for loop will pick out 1 to compare against the list. It will think it has found a duplicate when the inner for loop encounters the first element, which is also 1. Essentially, every element will be tagged as a duplicate, and if you plan on removing duplicates, you'll end up with an empty list.
The better solution is to call set(short_urls) to get a set of your urls with the duplicates removed. If you want a list (as opposed to a set) of urls with the duplicates removed, you can convert the set back into a list with list(set(short_urls)).
In other words:
short_urls = ['google.com', 'twitter.com', 'google.com']
duplicates_removed_list = list(set(short_urls))
print duplicates_removed_list # Prints e.g. ['google.com', 'twitter.com'] (set order is not guaranteed)
if i == s:
is not inside the second for loop. You missed an indentation
for i in short_urls:
    first_url = i
    for s in short_urls:
        second_url = s
        if i == s:
            print "duplicate"
        else:
            print "all good"
EDIT: Also, you are comparing every element of an array with every element of the same array. This means you compare the element at position 0 with the element at position 0, which is obviously the same.
What you need to do is start the second for loop at the position after the one reached in the first for loop.
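The suggestion of starting the inner loop one position after the outer loop could look roughly like this. It is a sketch using the three sample URLs from the other answer; the duplicates list is an illustrative addition:

```python
short_urls = sorted(['google.com', 'twitter.com', 'google.com'])

duplicates = []
for i in range(len(short_urls)):
    # Start at i + 1 so an element is never compared with itself
    # and each pair is checked only once.
    for j in range(i + 1, len(short_urls)):
        if short_urls[i] == short_urls[j]:
            duplicates.append(short_urls[i])

print(duplicates)
```

Note that a URL appearing three or more times would be reported once per matching pair, so this is best treated as a check rather than a removal step.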

Python: Adding element to list while iterating

I know that it is not allowed to remove elements while iterating over a list, but is it allowed to add elements to a Python list while iterating? Here is an example:
for a in myarr:
    if somecond(a):
        myarr.append(newObj())
I have tried this in my code and it seems to work fine, however I don't know if it's because I am just lucky and that it will break at some point in the future?
EDIT: I prefer not to copy the list since "myarr" is huge, and therefore it would be too slow. Also I need to check the appended objects with "somecond()".
EDIT: At some point "somecond(a)" will be false, so there can not be an infinite loop.
EDIT: Someone asked about the "somecond()" function. Each object in myarr has a size, and each time "somecond(a)" is true and a new object is appended to the list, the new object will have a size smaller than a. "somecond()" has an epsilon for how small objects can be and if they are too small it will return "false"
Why don't you just do it the idiomatic C way? This ought to be bullet-proof. Indexing into a Python list is a constant-time operation (CPython lists are arrays, not linked lists), so this is not a "Shlemiel the Painter" algorithm, although an explicit index loop is usually somewhat slower than a plain for loop. In any case, I tend not to worry about optimization until it becomes clear that a particular section of code is really a problem. First make it work; then worry about making it fast, if necessary.
If you want to iterate over all the elements:
i = 0
while i < len(some_list):
    more_elements = do_something_with(some_list[i])
    some_list.extend(more_elements)
    i += 1
If you only want to iterate over the elements that were originally in the list:
i = 0
original_len = len(some_list)
while i < original_len:
    more_elements = do_something_with(some_list[i])
    some_list.extend(more_elements)
    i += 1
Well, according to http://docs.python.org/tutorial/controlflow.html:
"It is not safe to modify the sequence being iterated over in the loop (this can only happen for mutable sequence types, such as lists). If you need to modify the list you are iterating over (for example, to duplicate selected items) you must iterate over a copy."
You could use islice from itertools to create an iterator over only the elements that were in the list when the loop started. Then you can append entries to the list without the new entries being visited:
islice(myarr, 0, len(myarr))
islice also accepts a step argument, so you don't even have to visit every element.
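A minimal sketch of the islice idea follows. The integer list and the "greater than 4" condition are stand-ins for myarr's objects and somecond(), chosen only so the example runs:

```python
from itertools import islice

myarr = [8, 3, 6]

# len(myarr) is evaluated once, before the loop starts, so only the
# three original elements are visited even though the list grows.
for a in islice(myarr, 0, len(myarr)):
    if a > 4:                 # stand-in for somecond(a)
        myarr.append(a // 2)  # stand-in for newObj()

print(myarr)
```

Because no copy of the list is made, this keeps memory use flat, but the appended items are never checked.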
In short: if you're absolutely sure all new objects fail the somecond() check, then your code works fine; it just wastes some time iterating over the newly added objects.
Before giving a proper answer, you have to understand why it is considered a bad idea to change a list/dict while iterating. When you use a for statement, Python does not take a snapshot; it computes the next item dynamically each time. Taking a list as an example, Python remembers an index, and each time it returns l[index] to you. If you are changing l, the result of l[index] can be unexpected.
NOTE: Here is a stackoverflow question to demonstrate this.
The worst case for adding elements while iterating is an infinite loop. Try (or don't, if you can spot the bug by reading) the following in a Python REPL:
import random
l = [0]
for item in l:
    l.append(random.randint(1, 1000))
    print item
It will print numbers non-stop until memory is used up or the process is killed by the system or the user.
Now that the underlying reason is clear, let's discuss the solutions. Here are a few:
1. make a copy of the original list
Iterate over the original list and modify the copy.
result = l[:]
for item in l:
    if somecond(item):
        result.append(Obj())
2. control when the loop ends
Instead of handing control to Python, you decide how to iterate over the list:
length = len(l)
for index in range(length):
    if somecond(l[index]):
        l.append(Obj())
Before iterating, calculate the list length, and only loop length times.
3. store added objects in a new list
Instead of modifying the original list, store the new objects in a new list and concatenate the two afterward.
added = [Obj() for item in l if somecond(item)]
l.extend(added)
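To see the difference between letting the for loop run over a growing list and fixing the length up front (solution 2), the following sketch runs both on the same input. The integer "sizes", the threshold of 4, and the function names are assumptions made for the demonstration, standing in for the question's objects and its somecond():

```python
def somecond(x):
    # Hypothetical stand-in: "object is still large enough to split".
    return x > 4


def halves_live(values):
    """Plain for loop: appended items are visited too."""
    items = list(values)
    for a in items:
        if somecond(a):
            items.append(a // 2)
    return items


def halves_snapshot(values):
    """Length fixed before the loop: only original items are visited."""
    items = list(values)
    for index in range(len(items)):
        if somecond(items[index]):
            items.append(items[index] // 2)
    return items


print(halves_live([20, 3]))
print(halves_snapshot([20, 3]))
```

The first variant terminates only because every appended value is smaller than the one that produced it, which matches the epsilon argument given in the question.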
You can do this.
bonus_rows = []
for a in myarr:
    if somecond(a):
        bonus_rows.append(newObj())
myarr.extend( bonus_rows )
Access your list elements directly by index i. Then you can append to your list:
for i in xrange(len(myarr)):
    if somecond(myarr[i]):
        myarr.append(newObj())
Make a copy of your original list and iterate over the copy; see the modified code below:
for a in myarr[:]:
    if somecond(a):
        myarr.append(newObj())
I had a similar problem today. I had a list of items that needed checking; if the objects passed the check, they were added to a result list. If they didn't pass, I changed them a bit and if they might still work (size > 0 after the change), I'd add them on to the back of the list for rechecking.
I went for a solution like
items = [...what I want to check...]
result = []
while items:
    recheck_items = []
    for item in items:
        if check(item):
            result.append(item)
        else:
            item = change(item) # Note that this always lowers the integer size(),
                                # so no danger of an infinite loop
            if item.size() > 0:
                recheck_items.append(item)
    items = recheck_items # Let the loop restart with these, if any
My list is effectively a queue, should probably have used some sort of queue. But my lists are small (like 10 items) and this works too.
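For comparison, a queue-based variant of the same idea might look like the sketch below, using collections.deque from the standard library. Plain integers stand in for the items (the integer itself plays the role of size()), and the check and change callables are hypothetical:

```python
from collections import deque


def process(values, check, change):
    """Pop items from the front; re-queue changed items that may still pass."""
    queue = deque(values)
    result = []
    while queue:
        item = queue.popleft()
        if check(item):
            result.append(item)
        else:
            item = change(item)  # assumed to always shrink the item
            if item > 0:
                queue.append(item)
    return result


# Hypothetical rules: even numbers pass; failing numbers are reduced by 3.
processed = process([10, 7, 4], check=lambda n: n % 2 == 0, change=lambda n: n - 3)
print(processed)
```

Unlike the batch version above, items are rechecked as soon as they reach the front of the queue, so the order of the result can differ.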
You can use an index and a while loop instead of a for loop if you want the loop to also cover the elements that are added to the list during the loop:
i = 0
while i < len(myarr):
    a = myarr[i]
    i = i + 1
    if somecond(a):
        myarr.append(newObj())
Expanding S.Lott's answer so that new items are processed as well:
todo = myarr
done = []
while todo:
    added = []
    for a in todo:
        if somecond(a):
            added.append(newObj())
    done.extend(todo)
    todo = added
The final list is in done.
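Alternate solution (this returns a new list rather than modifying myarr in place):
from functools import reduce  # reduce is a builtin on Python 2
reduce(lambda acc, a: acc + [newObj()] if somecond(a) else acc, myarr, myarr)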
Assuming you are adding at the end of the list arr, you can try this method, which I often use:
arr = [...The list I want to work with]
current_length = len(arr)
i = 0
while i < current_length:
    current_element = arr[i]
    do_something(arr[i])
    # Time to insert
    insert_count = 1 # How many items you are adding at the end
    arr.append(item_to_be_inserted)
    # IMPORTANT!!!! increase the current limit and indexer
    i += 1
    current_length += insert_count
This is just boilerplate. If you run it as written, your program will freeze in an infinite loop, because one item is appended on every pass. DO NOT FORGET TO TERMINATE THE LOOP unless an endless loop is what you need.