Goal: Create a conditional statement in a list comprehension that (1) dynamically tests -- i.e., upon each iteration -- if the element is not in the list being comprehended given (2) the list is itself updated on each iteration.
Background code:
arr = [2, 2, 4]
l = list()
Desired output:
l = [2, 4]
Desired behavior via for loop:
for element in arr:
    if element not in l:
        l.append(element)
Incorrect list comprehension not generating desired behavior:
l = [element for element in arr if element not in l]
Question restated: How do I fix the list comprehension so that it generates the desired behavior, i.e., the desired output stated above?
If you absolutely must use a list comprehension, you can just recast your for loop into one. The downside is that you will end up with a throwaway list of None elements, since that is what list.append returns:
>>> arr = [2, 2, 4]
>>> l = list()
>>> _ = [l.append(element) for element in arr if element not in l]
>>> print(l)
[2, 4]
>>> print(_)
[None, None]
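If you are tied to comprehensions, but not necessarily to list comprehensions, you can use the generator expression suggested by @tdelaney. This does not create any unwanted byproducts and does what you want, because (in CPython) list.extend appends each element as the generator produces it, so the membership test sees the updated list.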
>>> arr = [2, 2, 4]
>>> l = list()
>>> l.extend(element for element in arr if element not in l)
A better way than either would probably be to put the original list into a set and then back into a list. The advantage of a set over extending a list is that sets are much faster at checking for prior containment before adding an element: a list has to do a linear search on every check, whereas a set lookup takes constant time on average. Keep in mind that a set does not preserve the original order.
>>> l = list(set(arr))
If you want to remove duplicates, why not use set(the list containing duplicates) or list(dict.fromkeys(the list containing duplicates))?
But to answer your question:
I think the whole approach is just wrong: l (your list) doesn't get updated on each iteration, because the name l is only rebound after the list comprehension has finished.
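To make that concrete, here is a small sketch (the name deduped is only for illustration). It reruns the comprehension from the question and then uses the dict.fromkeys approach mentioned above, which relies on dicts preserving insertion order (guaranteed from Python 3.7 on).

```python
arr = [2, 2, 4]
l = []

# The comprehension builds a brand-new list; the name `l` still refers to the
# old empty list during every membership test, so nothing is filtered out.
l = [element for element in arr if element not in l]
print(l)  # [2, 2, 4]

# dict keys are unique and keep insertion order, so this removes duplicates
# while preserving the order of first appearance.
deduped = list(dict.fromkeys(arr))
print(deduped)  # [2, 4]
```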
I am trying to understand if it makes sense to take the content of a list and append it to another list.
I have the first list created through a loop function, that will get specific lines out of a file and will save them in a list.
Then a second list is used to save these lines, and start a new cycle over another file.
My idea was to take the first list once the for loop is done, dump it into the second list, then start a new cycle and append the content of the first list to the second again, so that the second list ends up as the sum of all the smaller lists created in my loop. The list should be appended only if certain conditions are met.
It looks like something similar to this:
# This is done for each log in my directory, i have a loop running
for logs in mydir:
    for line in mylog:
        #...if the conditions are met
        list1.append(line)
    for item in list1:
        if "string" in item: #if somewhere in the list1 i have a match for a string
            list2.append(list1) # append every line in list1 to list2
            del list1 [:] # delete the content of the list1
            break
        else:
            del list1 [:] # delete the list content and start all over
Does this make sense, or should I go a different route?
I need something efficient that would not take up too many cycles, since the list of logs is long and each text file is pretty big; so I thought that the lists would fit the purpose.
You probably want
list2.extend(list1)
instead of
list2.append(list1)
Here's the difference:
>>> a = [1, 2, 3]
>>> b = [4, 5, 6]
>>> c = [7, 8, 9]
>>> b.append(a)
>>> b
[4, 5, 6, [1, 2, 3]]
>>> c.extend(a)
>>> c
[7, 8, 9, 1, 2, 3]
Since list.extend() accepts an arbitrary iterable, you can also replace
for line in mylog:
    list1.append(line)
by
list1.extend(mylog)
To recap on the previous answers. If you have a list with [0,1,2] and another one with [3,4,5] and you want to merge them, so it becomes [0,1,2,3,4,5], you can either use chaining or extending and should know the differences to use it wisely for your needs.
Extending a list
Using the list class's extend method, you can copy the elements from one list onto another. However, this uses extra memory, which should be fine in most cases but might cause problems if you need to be memory efficient.
a = [0,1,2]
b = [3,4,5]
a.extend(b)  # extends a in place and returns None
print(a)  # [0, 1, 2, 3, 4, 5]
Chaining a list
In contrast, you can use itertools.chain to wire many lists together. It returns an iterator that can be used to iterate over the lists. This is more memory efficient, as it does not copy the elements over but just moves on to the next list.
import itertools
a = [0,1,2]
b = [3,4,5]
c = itertools.chain(a, b)
Make an iterator that returns elements from the first iterable until it is exhausted, then proceeds to the next iterable, until all of the iterables are exhausted. Used for treating consecutive sequences as a single sequence.
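One practical consequence is easy to miss: the chain object is a one-shot iterator, not a list. A minimal sketch (the names merged and leftover are only for illustration):

```python
import itertools

a = [0, 1, 2]
b = [3, 4, 5]

c = itertools.chain(a, b)  # nothing is copied yet; c only refers to a and b

merged = list(c)    # consuming the iterator walks a, then b
leftover = list(c)  # the iterator is now exhausted

print(merged)    # [0, 1, 2, 3, 4, 5]
print(leftover)  # []
```

If you need to loop over the combined data more than once, either build the chain again or keep the list.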
Take a look at itertools.chain for a fast way to treat many small lists as a single big list (or at least as a single big iterable) without copying the smaller lists:
>>> import itertools
>>> p = ['a', 'b', 'c']
>>> q = ['d', 'e', 'f']
>>> r = ['g', 'h', 'i']
>>> for x in itertools.chain(p, q, r):
...     print(x.upper())
You can also combine two lists (say a,b) using the '+' operator.
For example,
a = [1,2,3,4]
b = [4,5,6,7]
c = a + b
Output:
>>> c
[1, 2, 3, 4, 4, 5, 6, 7]
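The difference from extend() is that + builds a new list and leaves both operands alone, while extend() changes the existing list in place. A short sketch showing this (alias is just a second name used for illustration):

```python
a = [1, 2, 3]
alias = a        # a second name for the same list object

c = a + [4]      # builds a new list; a is not modified
print(a)         # [1, 2, 3]
print(c)         # [1, 2, 3, 4]

a.extend([4])    # modifies the existing list, so alias sees the change too
print(alias)     # [1, 2, 3, 4]
print(c is a)    # False
```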
That seems fairly reasonable for what you're trying to do.
A slightly shorter version which leans on Python to do more of the heavy lifting might be:
for logs in mydir:
    for line in mylog:
        #...if the conditions are met
        list1.append(line)
    if any(True for line in list1 if "string" in line):
        list2.extend(list1)
    del list1[:]  # empty list1 in place; a bare "del list1" would unbind the name
    ....
The (True for line in list1 if "string" in line) generator iterates over list1 and emits True whenever a match is found. any() uses short-circuit evaluation to return True as soon as the first True element is found. list2.extend() appends the contents of list1 to the end of list2.
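Put together as a self-contained function, the pattern might look like the sketch below. The function name collect_matching, its parameters, and the sample data are hypothetical; each log is assumed to be already available as a list of lines.

```python
def collect_matching(logs, needle):
    """Return all lines from every log that contains `needle` in at least one line."""
    collected = []
    for lines in logs:
        # any() stops at the first matching line (short-circuit evaluation)
        if any(needle in line for line in lines):
            collected.extend(lines)  # add the lines themselves, not a nested list
    return collected


sample_logs = [
    ["start", "string found here"],
    ["nothing of interest"],
    ["string again", "end"],
]
result = collect_matching(sample_logs, "string")
print(result)  # ['start', 'string found here', 'string again', 'end']
```

Building a fresh list per log avoids having to clear a shared list1 between iterations.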
You can simply concatenate two lists, e.g.:
list1 = [0, 1]
list2 = [2, 3]
list3 = list1 + list2
print(list3)
>> [0, 1, 2, 3]
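Using map() and functools.reduce()
from functools import reduce  # reduce is no longer a built-in in Python 3
def file_to_list(file):
    # stuff to parse file to a list; reading the lines is only a placeholder
    with open(file) as f:
        return f.read().splitlines()
files = [...list of files...]
L = map(file_to_list, files)
flat_L = reduce(lambda x, y: x + y, L)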
Minimal "for looping" and elegant coding pattern :)
You can also call the __add__ magic method directly, which is what the + operator uses:
a = [1,2,3]
b = [4,5,6]
c = a.__add__(b)
Output:
>>> c
[1, 2, 3, 4, 5, 6]
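Suppose we have a list like this:
items = [2, 2, 3, 4]
There are two ways to put it into another list, and they behave differently.
1. Wrapping it nests the whole list as a single element:
x = [items]  # same as x = []; x.append(items)
print("length is {}".format(len(x)))
for i in x:
    print(i)
length is 1
[2, 2, 3, 4]
2. A list comprehension copies the elements one by one:
x = [item for item in items]
print("length is {}".format(len(x)))
for i in x:
    print(i)
length is 4
2
2
3
4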
I wrote some code that eliminates duplicates from a list in Python. Here it is:
List = [4, 2, 3, 1, 7, 4, 5, 6, 5]
NewList = []
for i in List:
    if List[i] not in NewList:
        NewList.append(i)
print ("Original List:", List)
print ("Reworked List:", NewList)
However the output is:
Original List: [4, 2, 3, 1, 7, 4, 5, 6, 5]
Reworked List: [4, 2, 3, 7, 6]
Why is the 1 missing from the output?
Using set() kills the order. You can try this :
>>> from collections import OrderedDict
>>> NewList = list(OrderedDict.fromkeys(List))
You misunderstood how for loops in Python work. If you write for i in List:, i will take the values from the list one after another, so in your case 4, 2, 3 ...
I assume you thought it'd be counting up.
You have several different ways of removing duplicates from lists in python that you don't need to write yourself, like converting it to a set and back to a list.
list(set(List))
Also, you should read PEP 8 and name your variables differently, but that's just by the way.
Also if you really want a loop with indices, you can use enumerate in python.
for idx, value in enumerate(myList):
    print(idx)
    print(myList[idx])
Your code is not doing what you think it does. Your problem is these two constructs:
for i in List: # 1
    if List[i] # 2
1. Here you are using i to represent the elements inside the list: 4, 2, 3, ...
2. Here you are using i to represent the indices of the List: 0, 1, 2, ...
Obviously, 1. and 2. are not compatible. In short, your check is performed for a different element than the one you put in your list.
You can fix this by treating i consistently at both steps:
for i in List:
    if i not in NewList:
        NewList.append(i)
Your method for iterating over lists is not correct. Your code currently iterates over elements, but then does not use that element in your logic. Your code doesn't error because the values of your list happen also to be valid list indices.
You have a few options:
#1 Iterate over elements directly
Use elements of a list as you iterate over them directly:
NewList = []
for el in L:
    if el not in NewList:
        NewList.append(el)
#2 Iterate over list index
This is often considered an anti-pattern, but is not invalid. You can iterate over the range of the size of the list and then use list indexing:
NewList = []
for idx in range(len(L)):
    if L[idx] not in NewList:
        NewList.append(L[idx])
In both cases, notice how we avoid naming variables after built-ins. Don't use list or List, you can use L instead.
#3 unique_everseen
It's more efficient to implement hashing for O(1) lookup complexity. There is a unique_everseen recipe in the itertools docs, replicated in 3rd party toolz.unique. This works by using a seen set and tracking items as you iterate.
from toolz import unique
NewList = list(unique(L))
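If you would rather not add a dependency, the idea behind that recipe can be sketched in a few lines. This is a simplified version (without the key parameter of the itertools recipe) and the name unique_in_order is made up for this example; it assumes the items are hashable.

```python
def unique_in_order(iterable):
    """Yield each item the first time it appears, preserving order."""
    seen = set()  # set membership tests are O(1) on average
    for item in iterable:
        if item not in seen:
            seen.add(item)
            yield item


L = [4, 2, 3, 1, 7, 4, 5, 6, 5]
NewList = list(unique_in_order(L))
print(NewList)  # [4, 2, 3, 1, 7, 5, 6]
```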
For example I want to check the correlation coefficient between two lists like:
r = np.corrcoef(list25, list26)[0,1]
but I want to exclude -1's in the lists from the calculation. Is there a simple one-liner way of doing this instead of making new copies of the lists and iterating through them to remove all the -1's?
There is a one-liner solution. It creates a new list without the -1s, using a list comprehension:
new_list = [x for x in old_list if x != -1]
it basically copies everything that matches the condition from the old list to the new list.
So, for your example:
r = np.corrcoef([x for x in list25 if x != -1], [x for x in list26 if x != -1])[0,1]
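One caveat: filtering each list on its own only works if the -1s sit at the same positions in both lists. Otherwise the filtered lists end up with different lengths or misaligned pairs, and a correlation between them is no longer meaningful. Assuming the two lists are paired by position (an assumption, since the question does not say so), a sketch that drops a pair whenever either value is -1 could look like this. The sample data is made up.

```python
list25 = [1, -1, 2, 3, 4]
list26 = [2, 5, -1, 6, 8]

# Keep only positions where neither list has the -1 placeholder
pairs = [(x, y) for x, y in zip(list25, list26) if x != -1 and y != -1]
xs = [x for x, _ in pairs]
ys = [y for _, y in pairs]

print(xs)  # [1, 3, 4]
print(ys)  # [2, 6, 8]

# xs and ys now have equal length and can be passed on, e.g.:
# r = np.corrcoef(xs, ys)[0, 1]
```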
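Use a generator
def greater_neg_1(items):
    # yield only the values greater than -1 (this also drops anything below -1)
    for item in items:
        if item > -1:
            yield item
Usage:
>>> L = [1,-1,2,3,4,-1,4]
>>> list(greater_neg_1(L))
[1, 2, 3, 4, 4]
or:
r = np.corrcoef(list(greater_neg_1(list25)), list(greater_neg_1(list26)))[0,1]
The generator itself does not build an intermediate list, but np.corrcoef needs real sequences, so each result is materialized with list() here.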
If you actually want to remove the -1 from the lists:
while -1 in list25: list25.remove(-1)
I'm using python 3.4 and just learning the basics, so please bear with me..
listA = [1,2]
for a in listA:
    listA.remove(a)
print(listA)
What I expected was an empty list, but what I get is a list containing the value 2. I debugged the code with a large number of values in the list, and when the list has a single element left, the for loop exits.
Why is the last element not removed from the list..?
You should not change a list while iterating over it. The indices of the list change as you remove items, so that some items are never evaluated. Use a list comprehension instead, which creates a new list:
[a for a in list if ...]
In other words, try something like this:
>>> A = [1, 2, 3, 4]
>>> A = [a for a in A if a < 4] # creates new list and evaluates each element of old
>>> A
[1, 2, 3]
When you use a for-loop, an internal counter is used. If you shift the remaining elements to the left while iterating over the list, the left-most element in the remaining list will not be evaluated. See the note for the for statement.
That happens because the loop tracks its position with an internal index rather than a snapshot of the list, and you modify the list while looping over it:
>>> l = [1,2,3]
>>> l
[1, 2, 3]
>>> for a in l:
...     print(a)
...     print(l)
...     l.remove(a)
...     print(a)
...     print(l)
...     print("---")
1
[1, 2, 3]
1
[2, 3]
---
3
[2, 3]
3
[2]
---
>>>
See? The value of the implicit variable used to index the list and loop over it increases, so the loop skips the second element.
If you want to empty a list, do a clear:
>>> l.clear()
>>> l
[]
Or use a different way of looping over the list, if you need to modify it while looping over it.
As mentioned by @Justin in the comments, do not alter the list while iterating over it. As you keep removing elements from the list, the list shrinks, which changes the indices of the remaining elements.
If you need to remove elements from the list one-by-one, iterate over a copy of the list leaving the original list intact, while modifying the duplicated list in the process.
>>> listA = [1,2,3,4]
>>> listB = [1,2,3,4]
>>> for each in listA:
...     print(each)
...     listB.remove(each)
1
2
3
4
>>> listB
[]
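For conditional removal, the same idea can be sketched with a slice copy, next to the list-comprehension alternative. The variable names and the "remove even numbers" condition are only for illustration.

```python
# Removing while iterating over the same list skips elements
skipping = [1, 2, 3, 4]
for a in skipping:
    skipping.remove(a)
print(skipping)  # [2, 4]

# Iterating over a copy (listA[:]) lets the loop see every element
listA = [1, 2, 3, 4]
for a in listA[:]:
    if a % 2 == 0:
        listA.remove(a)
print(listA)  # [1, 3]

# Often simpler: build a new list containing only what should stay
kept = [a for a in [1, 2, 3, 4] if a % 2 != 0]
print(kept)  # [1, 3]
```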
I would like to duplicate the items of a list into a new list, for example
a=[1,2]
b=[[i,i] for i in a]
gives [[1, 1], [2, 2]], whereas I would like to have [1, 1, 2, 2].
I also found that I could use:
b=[i for i in a for j in a]
but it seemed like overkill to use two for loops. Is it possible to do this using a single for loop?
You want itertools.chain.from_iterable(), which takes an iterable of iterables and returns a single iterable with all the elements of the sub-iterables (flattening by one level):
import itertools
b = itertools.chain.from_iterable((i, i) for i in a)
Combined with a generator expression, you get the result you want. Obviously, if you need a list, just call list() on the iterator, but in most cases that isn't needed (and is less efficient).
If, as Ashwini suggests, you want each item len(a) times, it's simple to do that as well:
duplicates = len(a)
b = itertools.chain.from_iterable([i] * duplicates for i in a)
Note that none of these solutions copy i; they give you multiple references to the same element. Most of the time, that should be fine.
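Where the shared references do matter is with mutable items. A small sketch, using nested lists as made-up sample data:

```python
import itertools

a = [[1], [2]]
b = list(itertools.chain.from_iterable((i, i) for i in a))
print(b)  # [[1], [1], [2], [2]]

# Both "copies" are the same object, so a change through one shows up in the other
b[0].append(99)
print(b[1])          # [1, 99]
print(b[0] is b[1])  # True
print(a[0])          # [1, 99]
```

If independent copies are needed, something like copy.copy(i) (or copy.deepcopy for nested data) on each item would be the usual route.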
Your two-loop code does not actually do what you want, because the inner loop is evaluated for every step of the outer loop. Here is an easy solution:
b = [j for i in a for j in (i, i)]
You could use range (xrange in Python 2) with either a generator expression or a list comprehension:
b = (x for x in a for _ in range(2))
b = [x for x in a for _ in range(2)]
if you do not mind the order:
>>> a = [1,2]
>>> a * 2
[1, 2, 1, 2]