Duplicate list items using list comprehension - python

I would like to duplicate the items of a list into a new list, for example
a=[1,2]
b=[[i,i] for i in a]
gives [[1, 1], [2, 2]], whereas I would like to have [1, 1, 2, 2].
I also found that I could use:
b=[i for i in a for j in a]
but it seemed like overkill to use two for loops. Is it possible to do this using a single for loop?

You want itertools.chain.from_iterable(), which takes an iterable of iterables and returns a single iterable with all the elements of the sub-iterables (flattening by one level):
import itertools
b = itertools.chain.from_iterable((i, i) for i in a)
Combined with a generator expression, you get the result you want. Obviously, if you need a list, just call list() on the iterator, but in most cases that isn't needed (and is less efficient).
If, as Ashwini suggests, you want each item len(a) times, it's simple to do that as well:
duplicates = len(a)
b = itertools.chain.from_iterable([i] * duplicates for i in a)
Note that none of these solutions copy i; they give you multiple references to the same element. Most of the time, that should be fine.

Your two-loop code does not actually do what you want, because the inner loop is evaluated for every step of the outer loop. Here is an easy solution:
b = [j for i in a for j in (i, i)]
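As a quick check of the approaches above, here is a minimal sketch that compares them on a three-element list. The helper names `duplicate_chain` and `duplicate_nested` are made up for illustration. It also shows why the two-loop version from the question only appears to work when the list happens to have exactly two elements.

```python
import itertools

def duplicate_chain(items, times=2):
    """Repeat each item `times` times, flattened with chain.from_iterable."""
    return list(itertools.chain.from_iterable([item] * times for item in items))

def duplicate_nested(items, times=2):
    """Same result with a nested comprehension; the inner loop only counts."""
    return [item for item in items for _ in range(times)]

a = [1, 2, 3]
print(duplicate_chain(a))         # [1, 1, 2, 2, 3, 3]
print(duplicate_nested(a))        # [1, 1, 2, 2, 3, 3]
# The question's version repeats each item len(a) times, not twice:
print([i for i in a for j in a])  # [1, 1, 1, 2, 2, 2, 3, 3, 3]
```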

You could use range (xrange in Python 2) with either a generator expression or a list comprehension:
b = (x for x in a for _ in range(2))
b = [x for x in a for _ in range(2)]

If you do not mind the order:
>>> a = [1,2]
>>> a * 2
[1, 2, 1, 2]


Add part of the list to another list in Python [duplicate]

I am trying to understand if it makes sense to take the content of a list and append it to another list.
I have the first list created through a loop function, that will get specific lines out of a file and will save them in a list.
Then a second list is used to save these lines, and start a new cycle over another file.
My idea was to get the list once that the for cycle is done, dump it into the second list, then start a new cycle, dump the content of the first list again into the second but appending it, so the second list will be the sum of all the smaller list files created in my loop. The list has to be appended only if certain conditions met.
It looks like something similar to this:
# This is done for each log in my directory, i have a loop running
for logs in mydir:
    for line in mylog:
        #...if the conditions are met
        list1.append(line)
    for item in list1:
        if "string" in item: #if somewhere in the list1 i have a match for a string
            list2.append(list1) # append every line in list1 to list2
            del list1 [:] # delete the content of the list1
            break
        else:
            del list1 [:] # delete the list content and start all over
Does this make sense or should I go for a different route?
I need something efficient that would not take up too many cycles, since the list of logs is long and each text file is pretty big; so I thought that the lists would fit the purpose.
You probably want
list2.extend(list1)
instead of
list2.append(list1)
Here's the difference:
>>> a = [1, 2, 3]
>>> b = [4, 5, 6]
>>> c = [7, 8, 9]
>>> b.append(a)
>>> b
[4, 5, 6, [1, 2, 3]]
>>> c.extend(a)
>>> c
[7, 8, 9, 1, 2, 3]
Since list.extend() accepts an arbitrary iterable, you can also replace
for line in mylog:
    list1.append(line)
by
list1.extend(mylog)
To recap the previous answers: if you have a list [0,1,2] and another one [3,4,5] and you want to merge them into [0,1,2,3,4,5], you can use either chaining or extending, and you should know the differences in order to choose wisely for your needs.
Extending a list
Using the list class's extend method, you can copy the elements from one list onto another. However, this causes extra memory usage, which should be fine in most cases but might cause problems if you want to be memory efficient.
a = [0,1,2]
b = [3,4,5]
a.extend(b)
print(a)
>>[0,1,2,3,4,5]
Chaining a list
Conversely, you can use itertools.chain to wire many lists together, which returns a so-called iterator that can be used to iterate over the lists. This is more memory efficient, as it does not copy elements over but just moves on to the next list.
import itertools
a = [0,1,2]
b = [3,4,5]
c = itertools.chain(a, b)
Make an iterator that returns elements from the first iterable until it is exhausted, then proceeds to the next iterable, until all of the iterables are exhausted. Used for treating consecutive sequences as a single sequence.
Take a look at itertools.chain for a fast way to treat many small lists as a single big list (or at least as a single big iterable) without copying the smaller lists:
>>> import itertools
>>> p = ['a', 'b', 'c']
>>> q = ['d', 'e', 'f']
>>> r = ['g', 'h', 'i']
>>> for x in itertools.chain(p, q, r):
...     print(x.upper())
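One property of itertools.chain that is easy to miss: the object it returns is a one-shot iterator, not a list. A minimal sketch, with illustrative variable names, showing that it can be consumed only once and that it reads the underlying lists lazily:

```python
import itertools

p = ['a', 'b']
q = ['c']

chained = itertools.chain(p, q)
first_pass = list(chained)
second_pass = list(chained)   # the iterator is already exhausted
print(first_pass)             # ['a', 'b', 'c']
print(second_pass)            # []

# chain does not copy: a list changed before iteration is seen as changed.
lazy = itertools.chain(p, q)
q.append('d')
print(list(lazy))             # ['a', 'b', 'c', 'd']
```

If you need to iterate more than once, or need indexing, build a real list with list(itertools.chain(...)) or use extend.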
You can also combine two lists (say a,b) using the '+' operator.
For example,
a = [1,2,3,4]
b = [4,5,6,7]
c = a + b
Output:
>>> c
[1, 2, 3, 4, 4, 5, 6, 7]
That seems fairly reasonable for what you're trying to do.
A slightly shorter version which leans on Python to do more of the heavy lifting might be:
for logs in mydir:
    for line in mylog:
        #...if the conditions are met
        list1.append(line)
    if any(True for line in list1 if "string" in line):
        list2.extend(list1)
    del list1[:]  # empty the list but keep the name bound for the next log
    ....
The (True for line in list1 if "string" in line) iterates over list1 and emits True whenever a match is found. any() uses short-circuit evaluation to return True as soon as the first True element is found. list2.extend() appends the contents of list1 to the end of list2.
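For a version that can actually be run, here is a small sketch of the same idea. Everything concrete in it is an assumption made for illustration: the logs are modelled as lists of strings instead of files, "the conditions" are replaced by a placeholder test that skips blank lines, and the function name `collect_matching_logs` is hypothetical.

```python
def collect_matching_logs(logs, needle):
    """Return the kept lines of every log that mentions `needle` at least once."""
    collected = []
    for lines in logs:
        # A fresh list per log, so nothing has to be cleared between logs.
        kept = [line for line in lines if line.strip()]  # placeholder condition
        # any() stops at the first matching line.
        if any(needle in line for line in kept):
            collected.extend(kept)
    return collected

logs = [
    ["boot ok", "error: disk full"],
    ["boot ok", "all good"],
    ["error: timeout", ""],
]
print(collect_matching_logs(logs, "error"))
# ['boot ok', 'error: disk full', 'error: timeout']
```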
You can simply concatenate two lists, e.g.:
list1 = [0, 1]
list2 = [2, 3]
list3 = list1 + list2
print(list3)
>> [0, 1, 2, 3]
Using the map() and reduce() functions (reduce is a built-in in Python 2 and lives in functools in Python 3):
from functools import reduce
def file_to_list(file):
    lines = []
    #stuff to parse file to a list
    return lines
files = [...list of files...]
L = map(file_to_list, files)
flat_L = reduce(lambda x,y:x+y, L)
Minimal "for looping" and elegant coding pattern :)
You can use the __add__ magic method:
a = [1,2,3]
b = [4,5,6]
c = a.__add__(b)
Output:
>>> c
[1,2,3,4,5,6]
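Suppose we have a list like this:
lst = [2,2,3,4]
There are two ways to put it into another list, and only the second one copies the elements.
1. Wrapping it nests the original list as a single element:
x = [lst] # same as x = []; x.append(lst)
print("length is {}".format(len(x)))
for i in x:
    print(i)
length is 1
[2, 2, 3, 4]
2. A list comprehension copies the elements one by one:
x = [l for l in lst]
print("length is {}".format(len(x)))
for i in x:
    print(i)
length is 4
2
2
3
4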

Simple way of excluding an element from a calculation on a list?

For example I want to check the correlation coefficient between two lists like:
r = np.corrcoef(list25, list26)[0,1]
but I want to exclude -1's in the lists from the calculation. Is there a simple one-liner way of doing this instead of making a new copies of the lists and iterating through to remove all -1's and such?
There is a one-liner solution: create a new list without the -1s. It can be done using a list comprehension:
new_list = [x for x in old_list if x != -1]
it basically copies everything that matches the condition from the old list to the new list.
So, for your example:
r = np.corrcoef([x for x in list25 if x != -1], [x for x in list26 if x != -1])[0,1]
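One caveat, assuming the two lists hold paired observations (the question does not say so explicitly, but a correlation coefficient only makes sense for pairs): filtering each list on its own can leave the lists with different lengths or shift the pairs out of alignment. A sketch that drops a pair whenever either side is -1; the function name `drop_pairs_with_sentinel` and the sample data are hypothetical.

```python
def drop_pairs_with_sentinel(xs, ys, sentinel=-1):
    """Keep only the positions where neither value equals `sentinel`."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x != sentinel and y != sentinel]
    if not pairs:
        return [], []
    kept_xs, kept_ys = zip(*pairs)  # split the pairs back into two sequences
    return list(kept_xs), list(kept_ys)

list25 = [1, -1, 3, 4, 5]
list26 = [2, 4, -1, 8, 10]
xs, ys = drop_pairs_with_sentinel(list25, list26)
print(xs)  # [1, 4, 5]
print(ys)  # [2, 8, 10]
# With NumPy available, the filtered lists can then be passed on:
# r = np.corrcoef(xs, ys)[0, 1]
```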
Use a generator
def greater_neg_1(items):
    for item in items:
        if item>-1:
            yield item
Usage:
>>> L = [1,-1,2,3,4,-1,4]
>>> list(greater_neg_1(L))
[1, 2, 3, 4, 4]
or:
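r = np.corrcoef(list(greater_neg_1(list25)), list(greater_neg_1(list26)))[0,1]
The generator itself does not build an intermediate list, but np.corrcoef needs actual sequences, so the results are materialized with list() here.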
If you actually want to remove the -1 from the lists:
while -1 in list25: list25.remove(-1)

Dynamic self-referencing conditional in list comprehension

Goal: Create a conditional statement in a list comprehension that (1) dynamically tests -- i.e., upon each iteration -- if the element is not in the list being comprehended given (2) the list is itself updated on each iteration.
Background code:
arr = [2, 2, 4]
l = list()
Desired output:
l = [2, 4]
Desired behavior via for loop:
for element in arr:
if element not in l:
l.append(element)
Incorrect list comprehension not generating desired behavior:
l = [element for element in arr if element not in l]
Question restated: How do I fix the list comprehension so that it generates the desired behavior, i.e., the desired output stated above?
If you absolutely must use a list comprehension, you can just recast your for loop into one. The downside is that you will end up with a list of None elements, since that is what list.append returns:
>>> arr = [2, 2, 4]
>>> l = list()
>>> _ = [l.append(element) for element in arr if element not in l]
>>> print(l)
[2, 4]
>>> print(_)
[None, None]
If you are tied to comprehensions, but not necessarily to list comprehensions, you can use the generator expression suggested by @tdelaney. This will not create any unwanted byproducts and will do exactly what you want.
>>> arr = [2, 2, 4]
>>> l = list()
>>> l.extend(element for element in arr if element not in l)
A better way than either would probably be to put the original list into a set and then back into a list. The advantage of using a set over extending a list is that sets are much faster at checking for prior containment before adding an element, whereas a list has to do a linear search for every membership test. Note that a set does not preserve the original order of the elements.
>>> l = list(set(arr))
If you want to remove duplicates, why not use set(the list containing duplicates) or list(dict.fromkeys(the list containing duplicates))?
But to answer your question:
I think the whole approach is just wrong: l (your list) doesn't get updated with each iteration, since the name l is only rebound after the whole list comprehension has finished.
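If the order of first occurrence matters, a common pattern is to keep a separate set of items already seen. A minimal sketch, assuming the items are hashable; the function name `unique_in_order` is made up for illustration:

```python
def unique_in_order(items):
    """Return the items with duplicates removed, keeping first occurrences."""
    seen = set()      # fast membership tests
    result = []       # preserves the original order
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result

arr = [2, 2, 4]
print(unique_in_order(arr))      # [2, 4]
# dicts keep insertion order in Python 3.7+, so this gives the same result:
print(list(dict.fromkeys(arr)))  # [2, 4]
```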

Most Pythonic way to iteratively build up a list? [closed]

I was trying to do something in Python that uses the following general procedure, and I want to know what the best way to approach this is.
First, an initialization step:
Create an item M.
Create a list L and add M to L.
Second, loop through the following:
Create a new item by modifying the last item added to L.
Add the new item to L.
As a simple example, say I want to create a list of lists where the nth list contains the numbers from 1 to n. I could use the following (silly) procedure.
Initially M is [1] and L=[[1]].
Next, modify [1] by adding 2 to it to create the new item [1,2], then add [1,2] to L so L=[[1],[1,2]].
Next, modify [1,2] by adding 3 to it to create the new item [1,2,3], then add [1,2,3] to L so L=[[1],[1,2],[1,2,3]].
Next, modify [1,2,3] by adding 4 to it to create the new item [1,2,3,4], then add [1,2,3,4] to L so L=[[1],[1,2],[1,2,3],[1,2,3,4]].
etc.
I tried a few things, but most of them would modify not just the last item added but also items added to L in previous steps. For the particular problem I was interested in, I did manage to find a solution that behaves properly (at least for small cases), but it seems inelegant, I’m not sure why it works when other things didn’t, and I’m not even confident that it would still behave as desired for large cases. I’m also not confident that I could adapt my approach to similar problems. It's not a case of me not understanding the problem, since I've coded the same thing in other programming languages without issues.
So I’m wondering how more experienced Python programmers would handle this general task.
(I’m omitting my own code in part because I’m new here and I haven’t figured out how to enter it on stackoverflow, but also because it's long-ish and I don’t want help with the particular problem, but rather with how to handle the more general procedure I described above.)
When adding a list object M to another list, you are only adding a reference; continuing to manipulate the list M means you will see those changes reflected through the other reference(s) too:
>>> M = []
>>> resultlist = []
>>> resultlist.append(M)
>>> M is resultlist[0]
True
>>> M.append(1)
>>> resultlist[0]
[1]
>>> M
[1]
Note that M is resultlist[0] is True; it is the same object.
You'd add a copy of M instead:
resultlist.append(M[:])
The whole slice here ([:] means to slice from start to end) creates a new list with a shallow copy of the contents of M.
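A short sketch of the difference between appending a reference and appending a slice copy. It also shows what "shallow" means: the copy is a new outer list, but any nested lists inside it are still shared, which is where copy.deepcopy comes in. Variable names are illustrative.

```python
import copy

M = [1]
L = []
L.append(M)       # stores a reference to M
L.append(M[:])    # stores a shallow copy of M
M.append(2)       # only visible through the reference
print(L)          # [[1, 2], [1]]

# A shallow copy shares nested objects with the original.
nested = [[0]]
shallow = nested[:]
deep = copy.deepcopy(nested)
nested[0].append(1)
print(shallow)    # [[0, 1]]
print(deep)       # [[0]]
```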
The generic way to produce a series L from a continuously altered starting point M is to use a generator function. Your simple "add the next number to M" series could be implemented as:
def growing_sequence():
    M = []
    counter = 0
    while True:
        M.append(counter)
        counter += 1
        yield M[:]
This will yield ever longer lists each time you iterate, on demand:
>>> gen = growing_sequence()
>>> next(gen)
[0]
>>> next(gen)
[0, 1]
>>> for i, lst in enumerate(gen):
...     print(i, lst)
...     if i == 2: break
...
0 [0, 1, 2]
1 [0, 1, 2, 3]
2 [0, 1, 2, 3, 4]
You can do:
M=[1]
L=[M]
for e in range(5):
    li=L[-1][:]
    li.append(li[-1]+1)
    L.append(li)
Or more tersely:
for e in range(5):
    L.append(L[-1][:]+[L[-1][-1]+1])
I think that the best way to do this is with a generator. That way, you don't have to deal with list.append, deep-copying lists or any of that nonsense.
def my_generator(max):
    for n in range(max+1):
        yield list(range(n+1))
Then, you just have to list-ify it:
>>> list(my_generator(5))
[[0], [0, 1], [0, 1, 2], [0, 1, 2, 3], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4, 5]]
This approach is also more flexible if you want to make it an infinite generator. Simply replace the for loop with a while True loop and a counter.
This will be based on iterate from Haskell.
iterate :: (a -> a) -> a -> [a]
iterate f x returns an infinite list of repeated applications of f to x:
iterate f x == [x, f x, f (f x), ...]
In Python:
def iterate(f, x):
while True:
yield x
x = f(x)
Example usage:
>>> from itertools import islice
>>> def take(n, iterable):
...     return list(islice(iterable, n))
...
>>> take(4, iterate(lambda x: x + [len(x) + 1], [1]))
[[1], [1, 2], [1, 2, 3], [1, 2, 3, 4]]
To produce a finite list, the type signature (again starting in Haskell just for clarity) could be iterateFinitely :: (a -> Maybe a) -> a -> [a].
If we were to use list in place of Maybe in Python:
from itertools import takewhile
def iterateFinitely(f, x):
    return map(lambda a: a[0], takewhile(len, iterate(lambda y: f(y[0]), [x])))
Example usage (// keeps the values as integers in Python 3):
>>> list(iterateFinitely(lambda x: [x // 2] if x else [], 20))
[20, 10, 5, 2, 1, 0]
Since ending with a falsy value is probably pretty common, you might also add a version of this function that does that.
def iterateUntilFalsy(f, x):
    return iterateFinitely(lambda y: [f(y)] if y else [], x)
Example usage:
>>> list(iterateUntilFalsy(lambda x: x // 2, 20))
[20, 10, 5, 2, 1, 0]
>>> list(iterateUntilFalsy(lambda x: x[1:], [1,2,3,4]))
[[1, 2, 3, 4], [2, 3, 4], [3, 4], [4], []]
Try this:
M = [1]
L = [M]
for _ in range(3):  # xrange in Python 2
    L += [L[-1] + [L[-1][-1] + 1]]
After the above code is executed, L will contain [[1], [1, 2], [1, 2, 3], [1, 2, 3, 4]]. Explanation:
The first two lines simply seed the iteration with initial values
The for line states how many loops we want to perform after the initial value has been set, 3 in this case. I'm using _ as the iteration variable because we're not interested in its value, we just want to do a certain number of loops
Now for the interesting part; and remember that in Python a negative index in a list starts counting from the end, so an index of -1 points to the last element.
This: L += … updates the list, appending a new sublist at the end as many times as specified in the loop
This: [L[-1] + …] creates a new sublist by taking the last sublist and adding a new element at the end
And finally this: [L[-1][-1] + 1] obtains the previous last element in the last sublist, adds one to it and returns a single-element list to be concatenated at the end of the previous expression
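The same "build each item from the previous one" pattern can also be written with itertools.accumulate from the standard library (Python 3). This is an alternative sketch, not what the answers above use; the function name `growing_lists` is made up for illustration. Because list concatenation always creates a new list, no sublist ends up aliased to another.

```python
from itertools import accumulate
import operator

def growing_lists(n):
    """Return [[1], [1, 2], ..., [1, ..., n]]."""
    # accumulate feeds the running result and the next one-element list
    # to operator.add, which concatenates them into a new list.
    return list(accumulate(([k] for k in range(1, n + 1)), operator.add))

result = growing_lists(4)
print(result)  # [[1], [1, 2], [1, 2, 3], [1, 2, 3, 4]]
```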

how to safely remove elements from a list in Python

I loop through a list and remove the elements that satisfy my condition. But why doesn't this work, as noted below? Thank you.
>>> a=[ i for i in range(4)]
>>> a
[0, 1, 2, 3]
>>> for e in a:
...     if (e > 1) and (e < 4):
...         a.remove(e)
...
>>> a
[0, 1, 3]
>>> a=[ i for i in range(4)]
>>> for e in a:
...     if (e > -1) and (e < 3):
...         a.remove(e)
...
>>> a
[1, 3]
You should not change a list while you're iterating over it. The results are weird and counter-intuitive, and nearly never what you want. In fact, many collections explicitly disallow this (e.g. sets and dicts raise an error if their size changes during iteration).
Instead, iterate over a copy (for e in a[:]: ...) or, instead of modifying an existing list, filter it to get a new list containing the items you want ([e for e in a if ...]). Note that in many cases, you don't have to iterate again to filter, just merge the filtering with the generation of the data.
Why don't you just do this initially in the list comprehension? E.g.
[i for i in range(4) if i <= 1 or i >= 4]
You can also use this to construct a new list from the existing list, e.g.
[x for x in a if x <= 1 or x >= 4]
The idea of filtering is a good one, however it misses the point which is that some lists may be very large and the number of elements to remove may be very small.
In which case the answer is to remember the list indexes of the elements to remove and then iterate through the list of indexes, sorted from largest to smallest, removing the elements.
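A minimal sketch of that index-based approach, with a hypothetical helper name `remove_matching`. The indexes are collected in a first pass and then deleted from the largest to the smallest, so that each deletion leaves the positions of the remaining indexes untouched.

```python
def remove_matching(items, should_remove):
    """Delete, in place, every element for which should_remove(element) is true."""
    # First pass: remember where the unwanted elements are.
    indexes = [i for i, item in enumerate(items) if should_remove(item)]
    # Second pass: delete from the end so earlier indexes stay valid.
    for i in reversed(indexes):
        del items[i]
    return items

a = list(range(4))
remove_matching(a, lambda e: 1 < e < 4)
print(a)  # [0, 1]
```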
The easiest way to visualize it is to think of the iteration working on list-offsets instead of the actual items - do something to the first item, then the second item, then the third item, until it runs out of items. If you change the number of items in the list, it changes the offsets of all the remaining items in the list:
lst = [1,2,3,4]
for item in lst:
    if item==2:
        lst.remove(item)
    else:
        print(item)
print(lst)
results in
1
4
[1, 3, 4]
which makes sense if you step through it like so:
[1,2,3,4]
 ^
first item is not 2, so print it -> 1
[1,2,3,4]
   ^
second item is 2, so remove it
[1,3,4]
     ^
third item is 4, so print it -> 4
The only real solution is do not change the number of items in the list while you are iterating over it. Copy the items you want to keep to a new list, or keep track of the values you want to remove and do the remove-by-value in a separate pass.
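If other parts of the program hold a reference to the same list, the "copy what you want to keep" approach can still update that list in place by assigning to a full slice. A small sketch with illustrative names:

```python
lst = [1, 2, 3, 4]
alias = lst                      # a second reference to the same list object

# Build the filtered list first, then replace the contents in one step.
lst[:] = [item for item in lst if item != 2]

print(lst)           # [1, 3, 4]
print(alias is lst)  # True
print(alias)         # [1, 3, 4]
```

The comprehension on the right-hand side is fully evaluated before the assignment happens, so the list is never modified while it is being iterated over.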
It is not safe to remove elements from a list while iterating through it. That is what the filter function is for. It takes a function (that accepts one argument) and an iterable (in this case your list). In Python 2 it returns a new list containing the elements for which the function returned True; since you want to remove the elements between 1 and 4, the function must return True for the elements you want to keep.
In your case you can use a lambda function like this:
a = list(filter(lambda x: not (x > 1 and x < 4), range(4)))
or if you have the list already:
a = list(range(4))
a = list(filter(lambda x: not (x > 1 and x < 4), a))
Remember that in Python 3 filter returns an iterator and not a list, which is why the result is wrapped in list() here.
