list comprehension to merge various lists in python - python

I need to plot a lot of data samples, each stored in a list of integers. I want to create a list from a lot of concatenated lists, in order to plot it with enumerate(big_list) in order to get a fixed-offset x coordinate.
My current code is:
biglist = []
for n in xrange(number_of_lists):
biglist.extend(recordings[n][chosen_channel])
for x,y in enumerate(biglist):
print x,y
Notes: number_of_lists and chosen_channel are integer parameters defined elsewhere, and print x,y is for example (actually there are other statements to plot the points.
My question is:
is there a better way, for example, list comprehensions or other operation, to achieve the same result (merged list) without the loop and the pre-declared empty list?
Thanks

import itertools
for x,y in enumerate(itertools.chain(*(recordings[n][chosen_channel] for n in xrange(number_of_lists))):
print x,y
You can think of itertools.chain() as managing an iterator over the individual lists. It remembers which list and where in the list you are. This saves you all memory you would need to create the big list.

>>> import itertools
>>> l1 = [2,3,4,5]
>>> l2=[9,8,7]
>>> itertools.chain(l1,l2)
<itertools.chain object at 0x100429f90>
>>> list(itertools.chain(l1,l2))
[2, 3, 4, 5, 9, 8, 7]

Related

how do I append a list with two variables to another list that already has that list in it but with different values? python

I'm doing my programming coursework, and I've come across an issue
gamecentre1 = [winnerscore, winner]
organise = []
organise.extend([gamecentre1])
from operator import attrgetter, itemgetter
sorted(organise, key= itemgetter(0))
print(organise)
f = open("gameresults.txt","w")
f.write("Here are the winners of the games: \n")
f.write(str(organise))
f.write("\n")
f.close()
I'm trying to add two variables to a list, and add that list to another list. Then, I want to organise that larger list based off of the integer variable of the sublist (the integer is the winnerscore). But, the problem is that I haven't been able to organise them properly, and I worry that since I have to append the same list with the same variables to the larger list without overwriting the existing list in it, the larger list will just have the same values over and over again.
This is because I want to store the variables every time the program runs, without getting rid of the values of the variable from the previous game.
How do I do this?
I'm trying to add two variables to a list, and add that list to another list. Then, I want to organise that larger list based off of the integer variable of the sublist
It sounds like what you want is a nested list - a big list of small lists, such that each small list is independent of the others. The key problem in your code right now that's blocking you from accomplishing this is extend() - this is essentially list concatenation, which isn't what you want. For example,
x = [1, 2]
x.extend([3, 4]) # x == [1, 2, 3, 4]
Instead, try using append(), which adds its argument as the next value in the list. For example:
x = []
x.append([3, 4]) # [[3, 4]]
x.append([1, 2]) # [[3, 4], [1, 2]]
Now in the above example, we have a list x that's two elements long, and each of those elements is itself a list of two elements. Now we can plug that into sort:
y = sorted(x, key=lambda i: i[0])
print(y)
# [[1, 2], [3, 4]]
(the lambda i: i[0] is really just a more elegant way of what you're doing with itemgettr(0) - it's a lambda, a small inline function, that takes one argument i and returns the 0th element of i. In this case, the i that gets passed in is one of the smaller lists).
Now you have a sorted list of smaller lists, and you can do whatever you need with them.

What is the difference between `sorted(list)` vs `list.sort()`?

list.sort() sorts the list and replaces the original list, whereas sorted(list) returns a sorted copy of the list, without changing the original list.
When is one preferred over the other?
Which is more efficient? By how much?
Can a list be reverted to the unsorted state after list.sort() has been performed?
Please use Why do these list operations (methods) return None, rather than the resulting list? to close questions where OP has inadvertently assigned the result of .sort(), rather than using sorted or a separate statement. Proper debugging would reveal that .sort() had returned None, at which point "why?" is the remaining question.
sorted() returns a new sorted list, leaving the original list unaffected. list.sort() sorts the list in-place, mutating the list indices, and returns None (like all in-place operations).
sorted() works on any iterable, not just lists. Strings, tuples, dictionaries (you'll get the keys), generators, etc., returning a list containing all elements, sorted.
Use list.sort() when you want to mutate the list, sorted() when you want a new sorted object back. Use sorted() when you want to sort something that is an iterable, not a list yet.
For lists, list.sort() is faster than sorted() because it doesn't have to create a copy. For any other iterable, you have no choice.
No, you cannot retrieve the original positions. Once you called list.sort() the original order is gone.
What is the difference between sorted(list) vs list.sort()?
list.sort mutates the list in-place & returns None
sorted takes any iterable & returns a new list, sorted.
sorted is equivalent to this Python implementation, but the CPython builtin function should run measurably faster as it is written in C:
def sorted(iterable, key=None):
new_list = list(iterable) # make a new list
new_list.sort(key=key) # sort it
return new_list # return it
when to use which?
Use list.sort when you do not wish to retain the original sort order
(Thus you will be able to reuse the list in-place in memory.) and when
you are the sole owner of the list (if the list is shared by other code
and you mutate it, you could introduce bugs where that list is used.)
Use sorted when you want to retain the original sort order or when you
wish to create a new list that only your local code owns.
Can a list's original positions be retrieved after list.sort()?
No - unless you made a copy yourself, that information is lost because the sort is done in-place.
"And which is faster? And how much faster?"
To illustrate the penalty of creating a new list, use the timeit module, here's our setup:
import timeit
setup = """
import random
lists = [list(range(10000)) for _ in range(1000)] # list of lists
for l in lists:
random.shuffle(l) # shuffle each list
shuffled_iter = iter(lists) # wrap as iterator so next() yields one at a time
"""
And here's our results for a list of randomly arranged 10000 integers, as we can see here, we've disproven an older list creation expense myth:
Python 2.7
>>> timeit.repeat("next(shuffled_iter).sort()", setup=setup, number = 1000)
[3.75168503401801, 3.7473005310166627, 3.753129180986434]
>>> timeit.repeat("sorted(next(shuffled_iter))", setup=setup, number = 1000)
[3.702025591977872, 3.709248117986135, 3.71071034099441]
Python 3
>>> timeit.repeat("next(shuffled_iter).sort()", setup=setup, number = 1000)
[2.797430992126465, 2.796825885772705, 2.7744789123535156]
>>> timeit.repeat("sorted(next(shuffled_iter))", setup=setup, number = 1000)
[2.675589084625244, 2.8019039630889893, 2.849375009536743]
After some feedback, I decided another test would be desirable with different characteristics. Here I provide the same randomly ordered list of 100,000 in length for each iteration 1,000 times.
import timeit
setup = """
import random
random.seed(0)
lst = list(range(100000))
random.shuffle(lst)
"""
I interpret this larger sort's difference coming from the copying mentioned by Martijn, but it does not dominate to the point stated in the older more popular answer here, here the increase in time is only about 10%
>>> timeit.repeat("lst[:].sort()", setup=setup, number = 10000)
[572.919036605, 573.1384446719999, 568.5923951]
>>> timeit.repeat("sorted(lst[:])", setup=setup, number = 10000)
[647.0584738299999, 653.4040515829997, 657.9457361929999]
I also ran the above on a much smaller sort, and saw that the new sorted copy version still takes about 2% longer running time on a sort of 1000 length.
Poke ran his own code as well, here's the code:
setup = '''
import random
random.seed(12122353453462456)
lst = list(range({length}))
random.shuffle(lst)
lists = [lst[:] for _ in range({repeats})]
it = iter(lists)
'''
t1 = 'l = next(it); l.sort()'
t2 = 'l = next(it); sorted(l)'
length = 10 ** 7
repeats = 10 ** 2
print(length, repeats)
for t in t1, t2:
print(t)
print(timeit(t, setup=setup.format(length=length, repeats=repeats), number=repeats))
He found for 1000000 length sort, (ran 100 times) a similar result, but only about a 5% increase in time, here's the output:
10000000 100
l = next(it); l.sort()
610.5015971539542
l = next(it); sorted(l)
646.7786222379655
Conclusion:
A large sized list being sorted with sorted making a copy will likely dominate differences, but the sorting itself dominates the operation, and organizing your code around these differences would be premature optimization. I would use sorted when I need a new sorted list of the data, and I would use list.sort when I need to sort a list in-place, and let that determine my usage.
The main difference is that sorted(some_list) returns a new list:
a = [3, 2, 1]
print sorted(a) # new list
print a # is not modified
and some_list.sort(), sorts the list in place:
a = [3, 2, 1]
print a.sort() # in place
print a # it's modified
Note that since a.sort() doesn't return anything, print a.sort() will print None.
Can a list original positions be retrieved after list.sort()?
No, because it modifies the original list.
Here are a few simple examples to see the difference in action:
See the list of numbers here:
nums = [1, 9, -3, 4, 8, 5, 7, 14]
When calling sorted on this list, sorted will make a copy of the list. (Meaning your original list will remain unchanged.)
Let's see.
sorted(nums)
returns
[-3, 1, 4, 5, 7, 8, 9, 14]
Looking at the nums again
nums
We see the original list (unaltered and NOT sorted.). sorted did not change the original list
[1, 2, -3, 4, 8, 5, 7, 14]
Taking the same nums list and applying the sort function on it, will change the actual list.
Let's see.
Starting with our nums list to make sure, the content is still the same.
nums
[-3, 1, 4, 5, 7, 8, 9, 14]
nums.sort()
Now the original nums list is changed and looking at nums we see our original list has changed and is now sorted.
nums
[-3, 1, 2, 4, 5, 7, 8, 14]
Note: Simplest difference between sort() and sorted() is: sort()
doesn't return any value while, sorted() returns an iterable list.
sort() doesn't return any value.
The sort() method just sorts the elements of a given list in a specific order - Ascending or Descending without returning any value.
The syntax of sort() method is:
list.sort(key=..., reverse=...)
Alternatively, you can also use Python's in-built function sorted()
for the same purpose. sorted function return sorted list
list=sorted(list, key=..., reverse=...)
The .sort() function stores the value of new list directly in the list variable; so answer for your third question would be NO.
Also if you do this using sorted(list), then you can get it use because it is not stored in the list variable. Also sometimes .sort() method acts as function, or say that it takes arguments in it.
You have to store the value of sorted(list) in a variable explicitly.
Also for short data processing the speed will have no difference; but for long lists; you should directly use .sort() method for fast work; but again you will face irreversible actions.
With list.sort() you are altering the list variable but with sorted(list) you are not altering the variable.
Using sort:
list = [4, 5, 20, 1, 3, 2]
list.sort()
print(list)
print(type(list))
print(type(list.sort())
Should return this:
[1, 2, 3, 4, 5, 20]
<class 'NoneType'>
But using sorted():
list = [4, 5, 20, 1, 3, 2]
print(sorted(list))
print(list)
print(type(sorted(list)))
Should return this:
[1, 2, 3, 4, 5, 20]
[4, 5, 20, 1, 3, 2]
<class 'list'>

Removing duplicates and preserving order when elements inside the list is list itself

I have a following problem while trying to do some nodal analysis:
For example:
my_list=[[1,2,3,1],[2,3,1,2],[3,2,1,3]]
I want to write a function that treats the element_list inside my_list in a following way:
-The number of occurrence of certain element inside the list of my_list is not important and, as long as the unique elements inside the list are same, they are identical.
Find the identical loop based on the above premises and only keep the
first one and ignore other identical lists of my_list while preserving
the order.
Thus, in above example the function should return just the first list which is [1,2,3,1] because all the lists inside my_list are equal based on above premises.
I wrote a function in python to do this but I think it can be shortened and I am not sure if this is an efficient way to do it. Here is my code:
def _remove_duplicate_loops(duplicate_loop):
loops=[]
for i in range(len(duplicate_loop)):
unique_el_list=[]
for j in range(len(duplicate_loop[i])):
if (duplicate_loop[i][j] not in unique_el_list):
unique_el_list.append(duplicate_loop[i][j])
loops.append(unique_el_list[:])
loops_set=[set(x) for x in loops]
unique_loop_dict={}
for k in range(len(loops_set)):
if (loops_set[k] not in list(unique_loop_dict.values())):
unique_loop_dict[k]=loops_set[k]
unique_loop_pos=list(unique_loop_dict.keys())
unique_loops=[]
for l in range(len(unique_loop_pos)):
unique_loops.append(duplicate_loop[l])
return unique_loops
from collections import OrderedDict
my_list = [[1, 2, 3, 1], [2, 3, 1, 2], [3, 2, 1, 3]]
seen_combos = OrderedDict()
for sublist in my_list:
unique_elements = frozenset(sublist)
if unique_elements not in seen_combos:
seen_combos[unique_elements] = sublist
my_list = seen_combos.values()
you could do it in a fairly straightforward way using dictionaries. but you'll need to use frozenset instead of set, as sets are mutable and therefore not hashable.
def _remove_duplicate_lists(duplicate_loop):
dupdict = OrderedDict((frozenset(x), x) for x in reversed(duplicate_loop))
return reversed(dupdict.values())
should do it. Note the double reversed() because normally the last item is the one that is preserved, where you want the first, and the double reverses accomplish that.
edit: correction, yes, per Steven's answer, it must be an OrderedDict(), or the values returned will not be correct. His version might be slightly faster too..
edit again: You need an ordered dict if the order of the lists is important. Say your list is
[[1,2,3,4], [4,3,2,1], [5,6,7,8]]
The ordered dict version will ALWAYS return
[[1,2,3,4], [5,6,7,8]]
However, the regular dict version may return the above, or may return
[[5,6,7,8], [1,2,3,4]]
If you don't care, a non-ordered dict version may be faster/use less memory.

Basic python: how to increase value of item in list [duplicate]

This question already has answers here:
Why does this iterative list-growing code give IndexError: list assignment index out of range? How can I repeatedly add (append) elements to a list?
(9 answers)
Closed 4 months ago.
This is such a simple issue that I don't know what I'm doing wrong. Basically I want to iterate through the items in an empty list and increase each one according to some criteria. This is an example of what I'm trying to do:
list1 = []
for i in range(5):
list1[i] = list1[i] + 2*i
This fails with an list index out of range error and I'm stuck. The expected result (what I'm aiming at) would be a list with values:
[0, 2, 4, 6, 8]
Just to be more clear: I'm not after producing that particular list. The question is about how can I modify items of an empty list in a recursive way. As gnibbler showed below, initializing the list was the answer. Cheers.
Ruby (for example) lets you assign items beyond the end of the list. Python doesn't - you would have to initialise list1 like this
list1 = [0] * 5
So when doing this you are actually using i so you can just do your math to i and just set it to do that. there is no need to try and do the math to what is going to be in the list when you already have i. So just do list comprehension:
list1 = [2*i for i in range(5)]
Since you say that it is more complex, just don't use list comprehension, edit your for loop as such:
for i in range(5):
x = 2*i
list1[i] = x
This way you can keep doing things until you finally have the outcome you want, store it in a variable, and set it accordingly! You could also do list1.append(x), which I actually prefer because it will work with any list even if it's not in order like a list made with range
Edit: Since you want to be able to manipulate the array like you do, I would suggest using numpy! There is this great thing called vectorize so you can actually apply a function to a 1D array:
import numpy as np
list1 = range(5)
def my_func(x):
y = x * 2
vfunc = np.vectorize(my_func)
vfunc(list1)
>>> array([0, 2, 4, 6, 8])
I would advise only using this for more complex functions, because you can use numpy broadcasting for easy things like multiplying by two.
Your list is empty, so when you try to read an element of the list (right hand side of this line)
list1[i] = list1[i] + 2*i
it doesn't exist, so you get the error message.
You may also wish to consider using numpy. The multiplication operation is overloaded to be performed on each element of the array. Depending on the size of your list and the operations you plan to perform on it, using numpy very well may be the most efficient approach.
Example:
>>> import numpy
>>> 2 * numpy.arange(5)
array([0, 2, 4, 6, 8])
I would instead write
for i in range(5):
list1.append(2*i)
Yet another way to do this is to use the append method on your list. The reason you're getting an out of range error is because you're saying:
list1 = []
list1.__getitem__(0)
and then manipulate this item, BUT that item does not exist since your made an empty list.
Proof of concept:
list1 = []
list1[1]
IndexError: list index out of range
We can, however, append new stuff to this list like so:
list1 = []
for i in range(5):
list1.append(i * 2)

Sorting based on one of the list among Nested list in python

I have a list as [[4,5,6],[2,3,1]]. Now I want to sort the list based on list[1] i.e. output should be [[6,4,5],[1,2,3]]. So basically I am sorting 2,3,1 and maintaining the order of list[0].
While searching I got a function which sorts based on first element of every list but not for this. Also I do not want to recreate list as [[4,2],[5,3],[6,1]] and then use the function.
Since [4, 5, 6] and [2, 3, 1] serves two different purposes I will make a function taking two arguments: the list to be reordered, and the list whose sorting will decide the order. I'll only return the reordered list.
This answer has timings of three different solutions for creating a permutation list for a sort. Using the fastest option gives this solution:
def pyargsort(seq):
return sorted(range(len(seq)), key=seq.__getitem__)
def using_pyargsort(a, b):
"Reorder the list a the same way as list b would be reordered by a normal sort"
return [a[i] for i in pyargsort(b)]
print using_pyargsort([4, 5, 6], [2, 3, 1]) # [6, 4, 5]
The pyargsort method is inspired by the numpy argsort method, which does the same thing much faster. Numpy also has advanced indexing operations whereby an array can be used as an index, making possible very quick reordering of an array.
So if your need for speed is great, one would assume that this numpy solution would be faster:
import numpy as np
def using_numpy(a, b):
"Reorder the list a the same way as list b would be reordered by a normal sort"
return np.array(a)[np.argsort(b)].tolist()
print using_numpy([4, 5, 6], [2, 3, 1]) # [6, 4, 5]
However, for short lists (length < 1000), this solution is in fact slower than the first. This is because we're first converting the a and b lists to array and then converting the result back to list before returning. If we instead assume you're using numpy arrays throughout your application so that we do not need to convert back and forth, we get this solution:
def all_numpy(a, b):
"Reorder array a the same way as array b would be reordered by a normal sort"
return a[np.argsort(b)]
print all_numpy(np.array([4, 5, 6]), np.array([2, 3, 1])) # array([6, 4, 5])
The all_numpy function executes up to 10 times faster than the using_pyargsort function.
The following logaritmic graph compares these three solutions with the two alternative solutions from the other answers. The arguments are two randomly shuffled ranges of equal length, and the functions all receive identically ordered lists. I'm timing only the time the function takes to execute. For illustrative purposes I've added in an extra graph line for each numpy solution where the 60 ms overhead for loading numpy is added to the time.
As we can see, the all-numpy solution beats the others by an order of magnitude. Converting from python list and back slows the using_numpy solution down considerably in comparison, but it still beats pure python for large lists.
For a list length of about 1'000'000, using_pyargsort takes 2.0 seconds, using_nympy + overhead is only 1.3 seconds, while all_numpy + overhead is 0.3 seconds.
The sorting you describe is not very easy to accomplish. The only way that I can think of to do it is to use zip to create the list you say you don't want to create:
lst = [[4,5,6],[2,3,1]]
# key = operator.itemgetter(1) works too, and may be slightly faster ...
transpose_sort = sorted(zip(*lst),key = lambda x: x[1])
lst = zip(*transpose_sort)
Is there a reason for this constraint?
(Also note that you could do this all in one line if you really want to:
lst = zip(*sorted(zip(*lst),key = lambda x: x[1]))
This also results in a list of tuples. If you really want a list of lists, you can map the result:
lst = map(list, lst)
Or a list comprehension would work as well:
lst = [ list(x) for x in lst ]
If the second list doesn't contain duplicates, you could just do this:
l = [[4,5,6],[2,3,1]] #the list
l1 = l[1][:] #a copy of the to-be-sorted sublist
l[1].sort() #sort the sublist
l[0] = [l[0][l1.index(x)] for x in l[1]] #order the first sublist accordingly
(As this saves the sublist l[1] it might be a bad idea if your input list is huge)
How about this one:
a = [[4,5,6],[2,3,1]]
[a[0][i] for i in sorted(range(len(a[1])), key=lambda x: a[1][x])]
It uses the principal way numpy does it without having to use numpy and without the zip stuff.
Neither using numpy nor the zipping around seems to be the cheapest way for giant structures. Unfortunately the .sort() method is built into the list type and uses hard-wired access to the elements in the list (overriding __getitem__() or similar does not have any effect here).
So you can implement your own sort() which sorts two or more lists according to the values in one; this is basically what numpy does.
Or you can create a list of values to sort, sort that, and recreate the sorted original list out of it.

Categories

Resources