Remove empty nested lists - Python

I'm reading a .csv file into a list and it appends empty lists. I'm using the code below to do this.
import csv
with open('Scores.csv', 'r') as scores:
    reader = csv.reader(scores)
    tscores = [[str(e) for e in r] for r in reader]
It creates a list of nested lists correctly but appends an empty list after every row read in like so:
[[score1, name1], [], [score2, name2], []]
I believe it's reading \n as an empty string which is why I'm getting that, so I tried to remove empty lists using:
tscores = [tscores.remove(x) for x in tscores if x]
which does delete empty nested lists, but it sets all other nested lists that contained data to None i.e. [None, None]. I modified to:
tscores = [tscores.remove(x) for x in tscores if []]
which wipes out all nested lists completely.
How can I read the file with the same output (a list of nested lists) without appending empty lists or how can I remove all empty lists after being read in?

I think what you want to do is
tscores = [x for x in tscores if x != []]
which makes a list of only the non-empty lists in tscores.
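To see why the attempt in the question produces None values: list.remove() changes the list in place and returns None, so a comprehension built from its return values can only collect None. A minimal sketch, with made-up sample data standing in for the CSV rows:

```python
# Made-up sample data standing in for the rows read from the CSV file.
tscores = [['90', 'alice'], [], ['85', 'bob'], []]

# list.remove() mutates the list it is called on and returns None.
# It is called on a copy here so that tscores itself stays intact.
result = tscores[:].remove([])
print(result)  # None

# Filtering builds a new list and leaves the original untouched.
cleaned = [row for row in tscores if row]
print(cleaned)  # [['90', 'alice'], ['85', 'bob']]
```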

As an alternative to user2990008's answer, you can avoid creating the empty lists in the first place:
tscores = [[str(e) for e in r] for r in reader if len(r) > 0]

Just for completeness: in such cases I think that list comprehensions are not the simplest solution. Functional programming would make sense here, IMHO.
To "automatically" iterate over a list and filter specific elements, you could use the built-in function filter:
In [89]: a = [ [1, 2], [], [3, 4], [], [5, 6], [], [], [9, 5, 2, 5]]
In [91]: filter(lambda x: len(x) > 0, a)
Out[91]: [[1, 2], [3, 4], [5, 6], [9, 5, 2, 5]]
Every element x of the list a is passed to the lambda function, and the result contains an element of a if and only if the condition len(x) > 0 is met. Therefore a list without the nested empty lists is returned. (The session above is Python 2; in Python 3, filter returns a lazy iterator, so wrap the call in list() to get a list.)
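For Python 3, the same filter call might look like the sketch below. The variable names are only illustrative; passing None as the function is a standard shortcut that keeps every truthy element.

```python
a = [[1, 2], [], [3, 4], [], [5, 6], [], [], [9, 5, 2, 5]]

# In Python 3, filter() returns a lazy iterator, so materialize it with list().
non_empty = list(filter(lambda x: len(x) > 0, a))

# With None as the function, filter keeps truthy elements; empty lists are falsy.
also_non_empty = list(filter(None, a))

print(non_empty)  # [[1, 2], [3, 4], [5, 6], [9, 5, 2, 5]]
```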

I'm not sure I'm understanding your question correctly, but you can remove the empty entries from a list of lists (or a list of tuples or a list of other sequences) using something like this:
#!/bin/python
# ...
import csv
with open('Scores.csv', 'r') as scores:
    reader = csv.reader(scores)
    tscores = [[str(e) for e in r] for r in reader if len(r)]
... remember that a list comprehension accepts an optional conditional clause for filtering. This only works if every element of the list you're traversing supports the len() function (you can ensure this with a more complex condition such as: hasattr(r, '__len__') and len(r)).
Note: this only tests one level of depth ... it is not recursive.

tscores = [x for x in tscores if x]
An empty list is falsy, so the condition fails for it and it is not included in tscores.
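Putting the answers above together, the rows can be filtered while they are read. The sketch below is only an illustration: io.StringIO stands in for Scores.csv, the sample contents are made up, and read_non_empty_rows is a hypothetical helper name. The csv module documentation recommends opening files with newline=''; whether that alone removes the blank rows depends on how the file was written, which the question does not say.

```python
import csv
import io

def read_non_empty_rows(fileobj):
    """Return the CSV rows as lists of strings, skipping blank lines."""
    reader = csv.reader(fileobj)
    # csv.reader yields [] for a blank line, and [] is falsy.
    return [row for row in reader if row]

# Made-up contents with a blank line after every record.
sample = io.StringIO("90,alice\n\n85,bob\n\n", newline="")
tscores = read_non_empty_rows(sample)
print(tscores)  # [['90', 'alice'], ['85', 'bob']]
```

With a real file, the call would be wrapped as: with open('Scores.csv', newline='') as scores: tscores = read_non_empty_rows(scores).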

Related

Add part of the list to another list in Python [duplicate]

I am trying to understand if it makes sense to take the content of a list and append it to another list.
I have the first list created through a loop function, that will get specific lines out of a file and will save them in a list.
Then a second list is used to save these lines, and start a new cycle over another file.
My idea was to take the first list once the for loop is done, dump it into the second list, then start a new cycle and dump the content of the first list into the second again, appending it, so that the second list ends up as the sum of all the smaller lists created in my loop. The list should be appended only if certain conditions are met.
It looks like something similar to this:
# This is done for each log in my directory, i have a loop running
for logs in mydir:
    for line in mylog:
        #...if the conditions are met
        list1.append(line)
    for item in list1:
        if "string" in item: #if somewhere in the list1 i have a match for a string
            list2.append(list1) # append every line in list1 to list2
            del list1 [:] # delete the content of the list1
            break
        else:
            del list1 [:] # delete the list content and start all over
Does this make sense or should I go for a different route?
I need something efficient that would not take up too many cycles, since the list of logs is long and each text file is pretty big; so I thought that the lists would fit the purpose.
You probably want
list2.extend(list1)
instead of
list2.append(list1)
Here's the difference:
>>> a = [1, 2, 3]
>>> b = [4, 5, 6]
>>> c = [7, 8, 9]
>>> b.append(a)
>>> b
[4, 5, 6, [1, 2, 3]]
>>> c.extend(a)
>>> c
[7, 8, 9, 1, 2, 3]
Since list.extend() accepts an arbitrary iterable, you can also replace
for line in mylog:
    list1.append(line)
by
list1.extend(mylog)
To recap on the previous answers. If you have a list with [0,1,2] and another one with [3,4,5] and you want to merge them, so it becomes [0,1,2,3,4,5], you can either use chaining or extending and should know the differences to use it wisely for your needs.
Extending a list
Using the list class's extend method, you can copy the elements from one list onto another. However, this causes extra memory usage, which should be fine in most cases but might cause problems if you want to be memory efficient.
a = [0,1,2]
b = [3,4,5]
a.extend(b)
print(a)
>> [0, 1, 2, 3, 4, 5]
Chaining a list
Conversely, you can use itertools.chain to wire many lists together. It returns a so-called iterator that can be used to iterate over the lists. This is more memory efficient, as it does not copy elements over but just moves on to the next list.
import itertools
a = [0,1,2]
b = [3,4,5]
c = itertools.chain(a, b)
Make an iterator that returns elements from the first iterable until it is exhausted, then proceeds to the next iterable, until all of the iterables are exhausted. Used for treating consecutive sequences as a single sequence.
Take a look at itertools.chain for a fast way to treat many small lists as a single big list (or at least as a single big iterable) without copying the smaller lists:
>>> import itertools
>>> p = ['a', 'b', 'c']
>>> q = ['d', 'e', 'f']
>>> r = ['g', 'h', 'i']
>>> for x in itertools.chain(p, q, r):
...     print(x.upper())
You can also combine two lists (say a,b) using the '+' operator.
For example,
a = [1,2,3,4]
b = [4,5,6,7]
c = a + b
Output:
>>> c
[1, 2, 3, 4, 4, 5, 6, 7]
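One difference between + and extend() that the answers above touch on only briefly: + builds a new list and leaves both operands alone, while extend() changes the existing list, so every other reference to it sees the change. A small sketch (the name alias is only illustrative):

```python
a = [1, 2, 3, 4]
b = [4, 5, 6, 7]

# + creates a brand-new list; a and b are unchanged.
c = a + b

# extend() mutates a in place, so a second name for the same list sees it too.
alias = a
a.extend(b)

print(c)      # [1, 2, 3, 4, 4, 5, 6, 7]
print(alias)  # [1, 2, 3, 4, 4, 5, 6, 7]
```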
That seems fairly reasonable for what you're trying to do.
A slightly shorter version which leans on Python to do more of the heavy lifting might be:
for logs in mydir:
    for line in mylog:
        #...if the conditions are met
        list1.append(line)
    if any(True for line in list1 if "string" in line):
        list2.extend(list1)
    del list1[:]  # empty list1 in place; a bare "del list1" would unbind the name
    ....
The generator expression (True for line in list1 if "string" in line) iterates over list1 and emits True whenever a match is found. any() uses short-circuit evaluation to return True as soon as the first True element is found. list2.extend() appends the contents of list1 to the end of list2.
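The any() plus extend() pattern can be sketched as a self-contained function. Everything here is an assumption for illustration: collect_matching is a hypothetical name, and the logs are made-up in-memory lists rather than real files.

```python
def collect_matching(logs, needle):
    """Return all lines from every log that has needle in at least one line."""
    collected = []
    for lines in logs:
        # any() stops at the first line that contains the needle.
        if any(needle in line for line in lines):
            collected.extend(lines)
    return collected

# Made-up in-memory "logs"; real code would read the lines from files.
logs = [
    ["start", "error: disk full", "stop"],
    ["start", "all good", "stop"],
    ["error: timeout"],
]
list2 = collect_matching(logs, "error")
print(list2)  # ['start', 'error: disk full', 'stop', 'error: timeout']
```

Using a fresh list per log, instead of clearing a shared list1, avoids having to reset state between iterations.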
You can simply concatenate two lists, e.g.:
list1 = [0, 1]
list2 = [2, 3]
list3 = list1 + list2
print(list3)
>> [0, 1, 2, 3]
Using the map() and reduce() built-in functions (in Python 3, reduce has to be imported from functools):
from functools import reduce
def file_to_list(file):
    #stuff to parse file to a list
    return list  # placeholder for the parsed list
files = [...list of files...]
L = map(file_to_list, files)
flat_L = reduce(lambda x,y:x+y, L)
Minimal "for looping" and elegant coding pattern :)
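A runnable sketch of the map/reduce idea follows. The parser and the file contents are made up for illustration (file_to_list here just splits a string into lines). It also shows itertools.chain.from_iterable, which gives the same result without building an intermediate list at every step the way repeated + does.

```python
from functools import reduce
from itertools import chain
import operator

def file_to_list(text):
    """Hypothetical parser: split a text blob into its non-empty lines."""
    return [line for line in text.splitlines() if line]

# Made-up file contents standing in for real files.
files = ["a\nb\n", "c\n", "d\ne\n"]

lists = list(map(file_to_list, files))

# reduce with + creates a new intermediate list at every step.
flat_reduce = reduce(operator.add, lists, [])

# chain.from_iterable walks the inner lists one after another.
flat_chain = list(chain.from_iterable(lists))

print(flat_reduce)  # ['a', 'b', 'c', 'd', 'e']
```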
You can use the __add__ magic method:
a = [1,2,3]
b = [4,5,6]
c = a.__add__(b)
Output:
>>> c
[1,2,3,4,5,6]
If we have a list like below:
list = [2,2,3,4]
there are two ways to put it into another list.
1. Nest it as a single element:
x = [list] # same as: x = []; x.append(list)
print("length is {}".format(len(x)))
for i in x:
    print(i)
length is 1
[2, 2, 3, 4]
2. Copy its elements:
x = [l for l in list]
print("length is {}".format(len(x)))
for i in x:
    print(i)
length is 4
2
2
3
4

Iterate through a list, compare values and remove duplicate - Python

I am curious. How can I correctly iterate through a list, compare two values and delete the duplicate if it exists.
Here I created a nested for loop:
my_list = [ 1, 2, 3, 4, 5 ]
temp = [1, 5, 6]
def remove_items_from_list(ordered_list, temp):
    # Removes all values, found in items_to_remove list, from my_list
    for j in range(0, len(temp)):
        for i in range(0, len(ordered_list)):
            if ordered_list[i] == temp[j]:
                ordered_list.remove(ordered_list[i])
But when I execute my code I get an error:
File "./lab3f.py", line 15, in remove_items_from_list
if ordered_list[i] == items_to_remove[j]:
can anyone explain why?
This question wanted me to compare two lists with one another, and these lists have two different lengths. If an item in list a matched a value in list b, we then wanted to delete it from list a.
You actually can remove items from a list while iterating over it, but do read the links by @ReblochonMasque.
Here is one way of removing duplicates:
def remove_items_from_list(ordered_list, temp):
    n = len(ordered_list)
    for i in range(n - 1, -1, -1):
        if ordered_list[i] in temp:
            del ordered_list[i]
Then
>>> remove_items_from_list(my_list, temp)
>>> print(my_list)
[2, 3, 4]
However, one of the easiest ways of solving your problem is to use sets:
list(set(my_list) - set(temp))
When using this approach, order of items in the resulting list may be arbitrary. Also, this will create a new list instead of modifying an existing list object. If order is important - use list comprehension:
[v for v in my_list if v not in temp]
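The two ideas can be combined: keep the order-preserving comprehension, but test membership against a set so each lookup is a hash lookup rather than a scan of temp. This is a sketch that assumes the items are hashable; to_remove is just an illustrative name.

```python
my_list = [1, 2, 3, 4, 5]
temp = [1, 5, 6]

# Build the set once so each membership test is a hash lookup.
to_remove = set(temp)

# The comprehension keeps the original order of my_list.
kept = [v for v in my_list if v not in to_remove]
print(kept)  # [2, 3, 4]
```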
While iterating in your loop, you remove items from ordered_list, which causes the index error.
Try this:
def remove_items_from_list(ordered_list, temp):
    list_ = [x for x in ordered_list if x not in temp]
    return list_
First find the duplicated elements and then remove them from the original list.
dup_list = [item for item in temp if item in my_list]
for ele in dup_list:
    my_list.remove(ele)
remove() source
You can't safely remove items from the list you are iterating over. You can create a copy of the list and iterate over that while removing items from the original.
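The copy-based approach might look like the sketch below. The error in the question arises because range(0, len(ordered_list)) is computed once, while the list shrinks as items are removed, so later indexes run past the end. Iterating over a shallow copy avoids that.

```python
def remove_items_from_list(ordered_list, temp):
    """Remove, in place, every value of ordered_list that also appears in temp."""
    # Iterate over a shallow copy so removals do not disturb the iteration.
    for value in ordered_list[:]:
        if value in temp:
            ordered_list.remove(value)

my_list = [1, 2, 3, 4, 5]
remove_items_from_list(my_list, [1, 5, 6])
print(my_list)  # [2, 3, 4]
```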

List of list, converting all strings to int, Python 3

I am trying to convert all elements of the small lists in the big list to integers, so it should look like this:
current list:
list = [['1','2','3'],['8','6','8'],['2','9','3'],['2','5','7'],['5','4','1'],['0','8','7']]
for e in list:
    for i in e:
        i = int(i)
new list:
list = [[1,2,3],[8,6,8],[2,9,3],[2,5,7],[5,4,1],[0,8,7]]
Could anyone tell me why doesn't this work and show me a method that does work? Thanks!
You can use a nested list comprehension:
converted = [[int(num) for num in sub] for sub in lst]
I also renamed list to lst, because list is the name of the list type and not recommended to use for variable names.
for e in range(len(List)):
    for p in range(len(List[e])):
        List[e][p] = int(List[e][p])
Or, you could create a new list:
New = [list(map(int, sublist)) for sublist in List]
Nested list comprehension is the best solution, but you can also consider map with lambda function:
lista = [['1','2','3'],['8','6','8'],['2','9','3'],['2','5','7'],['5','4','1'],['0','8','7']]
new_list = list(map(lambda line: [int(x) for x in line], lista))  # list() is needed in Python 3, where map returns an iterator
# Line is your small list.
# With int(x) you are casting every element of your small list to an integer
# [[1, 2, 3], [8, 6, 8], [2, 9, 3], [2, 5, 7], [5, 4, 1], [0, 8, 7]]
In short, you are not mutating lst:
for e in lst:
    for i in e:
        # do stuff with i
is the equivalent of
for e in lst:
    for n in range(len(e)):
        i = e[n] # i and e[n] are assigned to the identical object
        # do stuff with i
Now, whether the "stuff" you are doing to i is reflected in the original data, depends on whether it is a mutation of the object, e.g.
i.attr = 'value' # mutation of the object is reflected both in i and e[n]
However, string types (str, bytes, unicode) and int are immutable in Python and variable assignment is not a mutation, but a rebinding operation.
i = int(i)
# i is now assigned to a new different object
# e[n] is still assigned to the original string
So, you can make your code work:
for e in lst:
    for n in range(len(e)):
        e[n] = int(e[n])
or use a shorter comprehension notation:
new_lst = [[int(x) for x in sub] for sub in lst]
Note, however, that the former mutates the existing list object lst, while the latter creates a new object new_lst leaving the original unchanged. Which one you choose will depend on the needs of your program.
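The difference between the two approaches shows up when another name refers to the same list. A short sketch with shortened sample data (alias is only an illustrative name); slice assignment is used here as a compact way to replace the contents of each inner list in place:

```python
lst = [['1', '2', '3'], ['8', '6', '8']]
alias = lst  # a second name for the same outer list

# New object: the nested strings in lst are left alone.
new_lst = [[int(x) for x in sub] for sub in lst]

# In place: slice assignment replaces the contents of each existing inner list,
# so every reference to lst sees the converted values.
for sub in lst:
    sub[:] = [int(x) for x in sub]

print(new_lst)  # [[1, 2, 3], [8, 6, 8]]
print(alias)    # [[1, 2, 3], [8, 6, 8]]
```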

Simple way of excluding an element from a calculation on a list?

For example I want to check the correlation coefficient between two lists like:
r = np.corrcoef(list25, list26)[0,1]
but I want to exclude -1's in the lists from the calculation. Is there a simple one-liner way of doing this instead of making a new copies of the lists and iterating through to remove all -1's and such?
There is a one-liner solution: create a new list without the -1s. It can be done using a list comprehension:
new_list = [x for x in old_list if x != -1]
It basically copies everything that matches the condition from the old list to the new list.
So, for your example:
r = np.corrcoef([x for x in list25 if x != -1], [x for x in list26 if x != -1])[0,1]
Use a generator
def greater_neg_1(items):
    for item in items:
        if item>-1:
            yield item
Usage:
>>> L = [1,-1,2,3,4,-1,4]
>>> list(greater_neg_1(L))
[1, 2, 3, 4, 4]
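or, materializing the generators first, because np.corrcoef expects array-like inputs rather than generator objects:
r = np.corrcoef(list(greater_neg_1(list25)), list(greater_neg_1(list26)))[0,1]
The generator filters lazily, but the values still have to be collected before NumPy can use them.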
If you actually want to remove the -1 from the lists:
while -1 in list25: list25.remove(-1)
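One caveat with filtering each list separately: if the -1 values sit at different positions, the two results can end up with different lengths or misaligned pairs. Assuming the lists hold paired observations (the question does not say so explicitly), a pairwise filter may be closer to what is wanted. In this sketch drop_missing_pairs and pearson are hypothetical helpers; pearson is a plain-Python stand-in so the example runs without NumPy, and with NumPy the filtered lists would go to np.corrcoef instead.

```python
import math

def drop_missing_pairs(xs, ys, missing=-1):
    """Keep only the positions where neither value equals the missing marker."""
    kept = [(x, y) for x, y in zip(xs, ys) if x != missing and y != missing]
    if not kept:
        return [], []
    new_xs, new_ys = zip(*kept)
    return list(new_xs), list(new_ys)

def pearson(xs, ys):
    """Plain-Python Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Made-up sample data with -1 at different positions in each list.
list25 = [1, -1, 3, 4, 5]
list26 = [2, 4, -1, 8, 10]

xs, ys = drop_missing_pairs(list25, list26)
print(xs, ys)  # [1, 4, 5] [2, 8, 10]
r = pearson(xs, ys)
```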

Dynamic self-referencing conditional in list comprehension

Goal: Create a conditional statement in a list comprehension that (1) dynamically tests -- i.e., upon each iteration -- if the element is not in the list being comprehended given (2) the list is itself updated on each iteration.
Background code:
arr = [2, 2, 4]
l = list()
Desired output:
l = [2, 4]
Desired behavior via for loop:
for element in arr:
    if element not in l:
        l.append(element)
Incorrect list comprehension not generating desired behavior:
l = [element for element in arr if element not in l]
Question restated: How do I fix the list comprehension so that it generates the desired behavior, i.e., the desired output stated above?
If you absolutely must use a list comprehension, you can just recast your for loop into one. The downside is that you will end up with a list of None elements, since that is what list.append returns:
>>> arr = [2, 2, 4]
>>> l = list()
>>> _ = [l.append(element) for element in arr if element not in l]
>>> print(l)
[2, 4]
>>> print(_)
[None, None]
If you are tied to comprehensions, but not necessarily to list comprehensions, you can use the generator expression suggested by @tdelaney. This will not create any unwanted byproducts and will do exactly what you want.
>>> arr = [2, 2, 4]
>>> l = list()
>>> l.extend(element for element in arr if element not in l)
A better way than either would probably be to put the original list into a set and then back into a list. The advantage of using a set over extending a list is that sets are much faster at checking for prior containment before adding an element, whereas a list has to do a linear search for every membership test. Note that a set does not preserve the original order of the elements.
>>> l = list(set(arr))
If you want to remove duplicates, why not use set(the list containing duplicates) or list(dict.fromkeys(the list containing duplicates))?
But to answer your question:
I think the whole approach is wrong: l (your list) doesn't get updated on each iteration, because the name l is only rebound after the list comprehension has finished.
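A short sketch of why the comprehension in the question keeps the duplicate, along with two order-preserving alternatives. The names naive, unique_ordered, unique_comp and seen are only illustrative.

```python
arr = [2, 2, 4]

# The original attempt: l stays empty while the comprehension runs,
# so the "not in l" test never filters anything out.
l = []
naive = [element for element in arr if element not in l]
print(naive)  # [2, 2, 4]

# dict keys are unique and keep insertion order (Python 3.7+).
unique_ordered = list(dict.fromkeys(arr))
print(unique_ordered)  # [2, 4]

# A comprehension with an explicit "seen" set. set.add() returns None,
# so the "or" clause records the element without excluding it.
seen = set()
unique_comp = [x for x in arr if not (x in seen or seen.add(x))]
print(unique_comp)  # [2, 4]
```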
