Remove matching items in a list

Remove matching items in a list - python

Sorry if this is a duplicate question, I searched and couldn't find anything to help.
I'm currently trying to compare two lists. If there are any matching items I will remove them all from one of the lists.
However the results I have are buggy. Here is a rough but accurate representation of the method I'm using:
>>> i = [1,2,3,4,5,6,7,8,9]
>>> a = i
>>> c = a
>>> for b in c:
...     if b in i:
...         a.remove(b)
...
>>> a
[2, 4, 6, 8]
>>> c
[2, 4, 6, 8]
So I realised that the main issue is that as I remove items it shortens the list, so Python then skips over the intermediate item (seriously annoying). As a result I made a third list to act as an intermediate that can be looped over.
What really baffles me is that this list seems to change also even when I haven't directly asked it to!

In Python, when you write this:
i = [1,2,3,4,5,6,7,8,9]
you create an object (in this case, a list) and bind it to the name i. Your next line, a = i, tells the interpreter that the name a refers to the same object. If you want them to be separate objects, you need to copy the original list. You can do that with the slicing shorthand, i[:], or you can use a = list(i) to be more explicit.
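The difference between a second name and a real copy can be checked with the is operator. This is a minimal sketch; the names alias, copy_a and copy_b are made up for illustration.

```python
i = [1, 2, 3]
alias = i          # second name for the same list object
copy_a = i[:]      # slice copy: a new list with the same elements
copy_b = list(i)   # equivalent, more explicit copy

alias.append(4)    # mutates the one shared object

print(i)            # [1, 2, 3, 4]
print(alias is i)   # True
print(copy_a)       # [1, 2, 3]
print(copy_a is i)  # False
```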

The easiest way to do this is use a set to determine shared items in a and b:
for x in set(a).intersection(b):
    a.remove(x)
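Note that list.remove deletes only the first matching element, so the loop above leaves later duplicates of a shared value in place. If every occurrence should go, one possible sketch is to rebuild the list with a comprehension. The sample values here are made up.

```python
a = [1, 2, 2, 3, 4, 5]
b = [2, 4, 6]

shared = set(a).intersection(b)   # {2, 4}

# Build a new list instead of mutating a while looping over it.
a = [x for x in a if x not in shared]
print(a)  # [1, 3, 5]
```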

Your statements a = i and c = a merely create new names that reference the same object. Then, as you remove things from a, they are also removed from c and i, since all three names refer to the same object. You'll want to make copies of the lists instead, like so:
a = i[:]
c = a[:]

a = i doesn't make a copy of the list; it just makes another name, a, point at the same list as i. To build a separate list without repeated values, try something like this:
>>> t = [1, 2, 3, 2, 5, 6]
>>> s = []
>>> for i in t:
...     if i not in s:
...         s.append(i)
...
>>> s
[1, 2, 3, 5, 6]
You can also use set, which guarantees no duplicates but doesn't preserve the order:
list(set(t))
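If order matters and the items are hashable, dict.fromkeys is another option, since dictionaries keep insertion order in Python 3.7 and later. A short sketch with made-up sample data:

```python
t = [1, 2, 3, 2, 5, 6]

# Keys of a dict are unique and keep first-seen order.
unique_in_order = list(dict.fromkeys(t))
print(unique_in_order)  # [1, 2, 3, 5, 6]
```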

Related

Add part of the list to another list in Python [duplicate]

This question already has answers here:
How do I concatenate two lists in Python?
(31 answers)
Closed 2 months ago.
I am trying to understand if it makes sense to take the content of a list and append it to another list.
I have the first list created through a loop function, that will get specific lines out of a file and will save them in a list.
Then a second list is used to save these lines, and start a new cycle over another file.
My idea was to take the first list once the for loop is done and dump it into the second list, then start a new cycle and append the first list's new content to the second list, so that the second list ends up as the sum of all the smaller per-file lists created in my loop. The list should be appended only if certain conditions are met.
It looks like something similar to this:
# This is done for each log in my directory, i have a loop running
for logs in mydir:
    for line in mylog:
        #...if the conditions are met
        list1.append(line)
    for item in list1:
        if "string" in item: #if somewhere in the list1 i have a match for a string
            list2.append(list1) # append every line in list1 to list2
            del list1 [:] # delete the content of the list1
            break
        else:
            del list1 [:] # delete the list content and start all over
Does this make sense, or should I go a different route?
I need something efficient that would not take up too many cycles, since the list of logs is long and each text file is pretty big; so I thought that the lists would fit the purpose.

You probably want
list2.extend(list1)
instead of
list2.append(list1)
Here's the difference:
>>> a = [1, 2, 3]
>>> b = [4, 5, 6]
>>> c = [7, 8, 9]
>>> b.append(a)
>>> b
[4, 5, 6, [1, 2, 3]]
>>> c.extend(a)
>>> c
[7, 8, 9, 1, 2, 3]
Since list.extend() accepts an arbitrary iterable, you can also replace
for line in mylog:
    list1.append(line)
by
list1.extend(mylog)

To recap the previous answers: if you have a list [0,1,2] and another [3,4,5] and you want to merge them into [0,1,2,3,4,5], you can use either extending or chaining, and you should know the differences so you can pick the one that fits your needs.
Extending a list
Using the list class's extend method, you can copy the elements from one list onto the end of another. This uses extra memory, which is fine in most cases but might cause problems if you need to be memory efficient.
a = [0,1,2]
b = [3,4,5]
a.extend(b)
print(a)
>> [0, 1, 2, 3, 4, 5]
Chaining a list
In contrast, you can use itertools.chain to wire many lists together. It returns an iterator that can be used to iterate over the lists in sequence. This is more memory efficient, as it does not copy elements but just moves on to the next list.
import itertools
a = [0,1,2]
b = [3,4,5]
c = itertools.chain(a, b)
Make an iterator that returns elements from the first iterable until it is exhausted, then proceeds to the next iterable, until all of the iterables are exhausted. Used for treating consecutive sequences as a single sequence.
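One consequence of chain returning an iterator is that it can be consumed only once. The following sketch shows that, and how to materialise a real list when one is needed; the name merged is made up for illustration.

```python
import itertools

a = [0, 1, 2]
b = [3, 4, 5]

c = itertools.chain(a, b)   # lazy iterator; nothing is copied yet
merged = list(c)            # materialise only if a real list is needed
print(merged)               # [0, 1, 2, 3, 4, 5]
print(list(c))              # [] because the iterator is now exhausted
```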

Take a look at itertools.chain for a fast way to treat many small lists as a single big list (or at least as a single big iterable) without copying the smaller lists:
>>> import itertools
>>> p = ['a', 'b', 'c']
>>> q = ['d', 'e', 'f']
>>> r = ['g', 'h', 'i']
>>> for x in itertools.chain(p, q, r):
...     print(x.upper())

You can also combine two lists (say a,b) using the '+' operator.
For example,
a = [1,2,3,4]
b = [4,5,6,7]
c = a + b
Output:
>>> c
[1, 2, 3, 4, 4, 5, 6, 7]

That seems fairly reasonable for what you're trying to do.
A slightly shorter version which leans on Python to do more of the heavy lifting might be:
for logs in mydir:
    for line in mylog:
        #...if the conditions are met
        list1.append(line)
    if any(True for line in list1 if "string" in line):
        list2.extend(list1)
    del list1[:]
    ....
The (True for line in list1 if "string" in line) generator iterates over list1 and emits True whenever a match is found. any() uses short-circuit evaluation to return True as soon as the first True element is found. list2.extend() appends the contents of list1 to the end of list2. del list1[:] then empties list1 in place, so the name is still usable on the next pass.
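For a version that can be run as-is, here is a rough sketch with in-memory stand-ins for the log files. The sample lines, the "skip INFO lines" condition and the "ERROR" marker are all assumptions made for illustration, not part of the question.

```python
# Hypothetical stand-in for the log files: each inner list is one file's lines.
logs = [
    ["INFO start", "ERROR disk full", "INFO stop"],
    ["INFO start", "INFO stop"],
    ["WARN slow", "ERROR timeout"],
]

list2 = []
for mylog in logs:
    # Keep only the lines of interest (placeholder condition: skip INFO lines).
    list1 = [line for line in mylog if not line.startswith("INFO")]
    # Extend list2 only if some kept line contains the marker string.
    if any("ERROR" in line for line in list1):
        list2.extend(list1)

print(list2)  # ['ERROR disk full', 'WARN slow', 'ERROR timeout']
```

Building a fresh list1 on every pass avoids having to empty it by hand.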

You can simply concatenate two lists, e.g.:
list1 = [0, 1]
list2 = [2, 3]
list3 = list1 + list2
print(list3)
>> [0, 1, 2, 3]

Using the built-in map() function and functools.reduce()
from functools import reduce  # reduce is not a built-in in Python 3
def file_to_list(file):
    #stuff to parse file to a list
    return list
files = [...list of files...]
L = map(file_to_list, files)
flat_L = reduce(lambda x,y:x+y, L)
Minimal "for looping" and an elegant coding pattern :)
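One caveat with reduce and +: each step builds a new, longer list, so the work grows quadratically with the number of lines. A possible alternative sketch uses itertools.chain.from_iterable, which flattens in a single pass. The parser body and the file names below are placeholders, not real parsing code.

```python
from itertools import chain

def file_to_list(name):
    # Hypothetical placeholder parser: pretend each "file" yields two lines.
    return [name + ":line1", name + ":line2"]

files = ["a.log", "b.log"]   # made-up file names
flat_L = list(chain.from_iterable(map(file_to_list, files)))
print(flat_L)
# ['a.log:line1', 'a.log:line2', 'b.log:line1', 'b.log:line2']
```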

you can use __add__ Magic method:
a = [1,2,3]
b = [4,5,6]
c = a.__add__(b)
Output:
>>> c
[1,2,3,4,5,6]

If we have a list like the one below:
nums = [2,2,3,4]
there are two ways to put it into another list.
1. Nest it as a single element:
x = [nums] # same as x = [] followed by x.append(nums)
print("length is {}".format(len(x)))
for i in x:
    print(i)
length is 1
[2, 2, 3, 4]
2. Copy its elements:
x = [l for l in nums]
print("length is {}".format(len(x)))
for i in x:
    print(i)
length is 4
2
2
3
4

Python append function is not working as expected

>>> a = [1,2,3]
>>> b = []
>>> b.append(a)
>>> print(b)
[[1, 2, 3]]
>>> num = a.pop(0)
>>> a.append(num)
>>> print(a)
[2, 3, 1]
>>> b.append(a)
>>> print(b)
[[2, 3, 1], [2, 3, 1]]
>>>
Why is this happening and how to fix it? I need the list like
[[1, 2, 3], [2, 3, 1]]
Thank you.
Edit:
Also, why is this working?
>>> a = []
>>> b = []
>>> a = [1,2,3]
>>> b.append(a)
>>> a = [1,2,3,4]
>>> b.append(a)
>>> print(b)
[[1, 2, 3], [1, 2, 3, 4]]
>>>

Append a copy of your list a, at least the first time. Otherwise, you've appended the same list both times.
b.append(a[:])

When you append the list a, Python stores a reference to that list object inside the list b, not a copy. So when you later modify a in place, the change shows up in b as well. You need to create a copy of the list and append that to get the desired result.
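Applied to the rotation from the question, appending a copy each time might look like this minimal sketch:

```python
a = [1, 2, 3]
b = []

b.append(a[:])        # store a snapshot, not the list itself
a.append(a.pop(0))    # rotate a in place
b.append(a[:])        # store a second, independent snapshot

print(b)              # [[1, 2, 3], [2, 3, 1]]
print(b[0] is b[1])   # False
```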

Every variable name in Python should be thought of as a reference to a piece of data. In your first listing, b contains two references to the same underlying object that is also referenced by the name a. That object gets changed in-place by the operations you’re using to rotate its members. The effect of that change is seen when you look at either of the two references to the object found in b, or indeed when you look at the reference associated with the name a.
That they are identical can be seen by using the id() function: id(a), id(b[0]) and id(b[1]) all return the same number, which is the unique identifier of the underlying list object that they all refer to. Or you can use the is operator: b[0] is b[1] evaluates to True.
By contrast, in the second listing, you reassign a. In other words, by using the assignment operator = you cause that name to become associated with a different object: in this case, a new list object that you just created with your square-bracketed literal expression. b still contains one reference to the old list, and now you append a new reference that points to this different piece of underlying data. So the two elements of b now look different from each other, and indeed they are different objects and accordingly have different id() numbers, only one of which is the same as the current id(a). b[0] is b[1] now evaluates to False.
How to fix it? Reassign the name a before changing it: for example, create a copy:
a = list(a)
or:
import copy
a = copy.copy(a)
(or you could even use copy.deepcopy(); study the difference). Alternatively, rotate the members of a using methods that entail reassignment rather than in-place changes, e.g.:
a = a[1:] + a[:1]
(NB immutable objects such as the tuple avoid this whole confusion —not because they behave fundamentally differently but because they lack methods that produce in-place changes and therefore force you to use reassignment strategies.)
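On the copy.copy() versus copy.deepcopy() difference mentioned above: it only shows up when the list contains other mutable objects. A small sketch, with made-up names:

```python
import copy

nested = [[1, 2], [3, 4]]
shallow = copy.copy(nested)      # new outer list, same inner lists
deep = copy.deepcopy(nested)     # new outer list and new inner lists

nested[0].append(99)             # mutate an inner list in place

print(shallow[0])  # [1, 2, 99]
print(deep[0])     # [1, 2]
```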

Besides copying a with a[:], you can also use collections.deque.rotate to rotate a copy of your list:
from collections import deque
a = [1,2,3]
#Make a deque of copy of a
b = deque(a[:])
#Rotate the deque
b.rotate(len(a)-1)
#Create the list and print it
print([a,list(b)])
#[[1, 2, 3], [2, 3, 1]]

Function which removes the first item in a list (Python)

I am trying to write a function which removes the first item in a Python list. This is what I've tried. Why doesn't remove_first_wrong change l when I call the function on it? And why does the list slicing approach work when I do it in the main function?
def remove_first_wrong(lst):
    lst = lst[1:]
def remove_first_right(lst):
    lst.pop(0)
if __name__ == '__main__':
    l = [1, 2, 3, 4, 5]
    remove_first_wrong(l)
    print(l)
    l_2 = [1, 2, 3, 4, 5]
    remove_first_right(l_2)
    print(l_2)
    # Why does this work and remove_first_wrong doesn't?
    l_3 = [1, 2, 3, 4, 5]
    l_3 = l_3[1:]
    print(l_3)

Slicing a list returns a new list object, which is a copy of the original list indices you indicated in the slice. You then rebound lst (a local name in the function) to reference that new list instead. The old list is never altered in that process.
list.pop() on the other hand, operates on the list object itself. It doesn't matter what reference you used to reach the list.
You'd see the same thing without functions:
>>> a = [1, 2]
>>> b = a[:] # slice with all the elements, produces a *copy*
>>> b
[1, 2]
>>> a.pop() # removing an element from a won't change b
2
>>> b
[1, 2]
>>> a
[1]
Using [:] is one of two ways of making a shallow copy of a list, see How to clone or copy a list?
You may want to read or watch Ned Batchelder's Names and Values presentation to further help you understand how Python names and objects work.

Inside the function remove_first_wrong, the = sign rebinds the name lst to the object on the right, which is a brand new object created by the slicing operation lst[1:]. Thus, the object that lst now refers to is local to that function (and it will actually disappear on return).
That is what Martijn means by "You then rebound lst (a local name in the function) to reference that new list instead."
By contrast, lst.pop(0) is a method call on the given object: it operates on the object itself.
For example, this will work correctly too:
def remove_first_right2(lst):
    x = lst # x refers to the same object as lst
    x.pop(0) # pop the item from the object

Alternatively, you can use the del statement:
def remove_first_element(lst):
    del lst[0]
    return lst
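If you prefer the slicing style, slice assignment is another in-place option: it replaces the contents of the existing list object rather than rebinding the local name. A minimal sketch, where the function name remove_first_in_place is made up:

```python
def remove_first_in_place(lst):
    # Slice assignment replaces the contents of the existing list object,
    # so the caller's list changes even though lst[1:] is a temporary copy.
    lst[:] = lst[1:]

l = [1, 2, 3, 4, 5]
remove_first_in_place(l)
print(l)  # [2, 3, 4, 5]
```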

Merge two lists that are generated by a python code

I know we can merge two lists by using something like final_list= list1 + list2 but if the lists are generated by a python code and they don't have a variable associated with them like list1 and list2, how can we merge them? Say, my code does something like print output to give:
[1,2,3,4]
[2,0,5,6]
I'd like to merge them so I can get unique values using set(final_list). But how do I get the final_list?
PS- My code can return multiple lists. It is not restricted to two.

def somefunc(param):
    #does something
    return alist,blist
my_alist,my_blist = somefunc(myparam)
print(my_alist, my_blist)
#prints both lists.
When you return multiple values from a function, they are returned in a tuple. You can easily unpack the tuple, as shown above.

You can either modify the function that generates the output or, the harder way, manually parse the printed output as a string and convert it into a set.
values_list = []
strings_of_list = output.split('\n')
for string in strings_of_list:
    values = string[1:-1].split(',')
    for val in values:
        values_list.append(int(val))
set(values_list)

Assign a variable to a function. Taking the lists the function generated, join them together in another variable. Just make sure that your function returns the generated list, and doesn't just print it out.
# my_list_generator returns two values.
>>> a, b = my_list_generator()
>>> a
[1, 2, 3, 4]
>>> b
[2, 0, 5, 6]
>>> final_list = a + b
>>> final_list
[1, 2, 3, 4, 2, 0, 5, 6]
Cross all that out! Now that I know the function can return multiple objects, let's do this (with a little list comprehension):
lists = [i for i in my_list_generator()]
# lists might look like [[1, 2, 3, 4], [2, 0, 5, 6]]
# And now use a for loop to get each value
final_list = []
for sublist in lists:
    final_list.extend(sublist)
# final_list will look like [1,2,3,4,2,0,5,6]
Also, if you don't want duplicates, just do one more thing:
real_final_list = []
for i in final_list:
    if i not in real_final_list:
        real_final_list.append(i)

If I understand correctly:
You have a function (let's call it listGen() for now) which returns some number of lists. Now, you want to put these list together into one big list, final_list.
You could do the following:
# listGen defined earlier
final_list = []
for i in listGen():
    final_list += i
unique_values = set(final_list) # or whatever you wanted to do with it
Since listGen returns a tuple, we can loop over its contents, those being the lists you want to append to each other.
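If only the unique values are needed, the intermediate list can be skipped, because set.union accepts any number of iterables. A sketch under the assumption that the function returns a tuple of lists; list_gen and its return values are made up here.

```python
def list_gen():
    # Hypothetical stand-in for a function returning several lists.
    return [1, 2, 3, 4], [2, 0, 5, 6], [6, 7]

# Unpack the returned tuple so each list becomes one argument to union().
unique_values = set().union(*list_gen())
print(sorted(unique_values))  # [0, 1, 2, 3, 4, 5, 6, 7]
```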

How to check if a list exists in python?

I have a networkx graph.
I am adding nodes by adding edges
G.add_edge(route[i-1],route[i]);
Now once the node is created by directly adding edges,
I add a list named
G.node[route[i]]['position'] = list()
and I append positions to it when I get same nodes again and again
G.node[route[i]]['position'].append( i - 3 )
Now when I want to append how do I check whether the list exist?
does doing
G.node[route[i]]['position'] = list()
clear the list of already existing elements?
edit----- my earlier question was confusing
I want to keep adding to the list,
but I can't append unless a list exists, right?
So I have to do
G.node[route[i]]['position'] = list() in my loop.
So the next time I want to add to the same list in another loop iteration, how do I know that a list already exists for G.node[route[i]]['position'] and that I don't have to create it again?
edit-----
I think my list itself is a key here
so I did
if not 'position' in G.node[route[i]]:
and it works

G.node[route[i]]['position'] = list() will leave the slot G.node[route[i]]['position'] holding an empty list, but it will not affect the list that it previously held, which other objects may have a reference to.
To empty the existing list in place instead, use del l[:], where l is a reference to that list.
If you want to have a list automatically created, use collections.defaultdict to have newly created entries default to a list.
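As a rough illustration of the defaultdict idea, using a plain dictionary keyed by node name rather than a networkx graph (the route and the positions mapping are made up):

```python
from collections import defaultdict

positions = defaultdict(list)        # hypothetical stand-in for per-node attributes
route = ["a", "b", "a", "c", "a"]    # made-up route

for i, node in enumerate(route):
    positions[node].append(i)        # a missing key gets a fresh [] first

print(dict(positions))  # {'a': [0, 2, 4], 'b': [1], 'c': [3]}
```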

Yes, assigning list() again replaces the existing list with a new, empty one, so the elements stored under 'position' are lost. You could try
G.node[route[i]].setdefault('position', []).append(...)
whenever you want to append elements.
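The behaviour of setdefault can be tried on an ordinary dictionary. In this sketch attrs is a made-up stand-in for one node's attribute dictionary:

```python
attrs = {}                                   # hypothetical node-attribute dict
attrs.setdefault('position', []).append(1)   # key missing: creates [] then appends
attrs.setdefault('position', []).append(2)   # key present: reuses the same list
print(attrs)  # {'position': [1, 2]}
```

Because setdefault only inserts the default when the key is absent, no separate existence check is needed.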

Not sure if this is what you mean, but assigning list() should make sure that there is a list to append to. If there's already a list the assignment creates a new one (see answer of Marcin). Test:
>>> a = list()
>>> for i in range(10):
...     a.append(i)
...
>>> a
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> b = a
>>> b
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> a = list()
>>> a
[]
>>> b
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

Just use get:
if G.node[route[i]].get('position') is None:
    G.node[route[i]]['position'] = list()
else:
    G.node[route[i]]['position'].append(stuff)

1) How do I check whether the list exists?
Use isinstance() and do something like:
if not isinstance(G.node[route[i]]['position'], list):
    G.node[route[i]]['position'] = list()
G.node[route[i]]['position'].append(i - 3)
Or use type like:
if not type(G.node[route[i]]['position']) is list:
I must say that this kind of checking is rather un-Pythonic; usually you should know what G.node[route[i]]['position'] was before becoming a list and check for that.
For example, if it was None you could do (assuming that the key 'position' exists; otherwise just call get('position')):
if G.node[route[i]]['position'] is None:
    G.node[route[i]]['position'] = list()
G.node[route[i]]['position'].append(i - 3)
2) Does doing .. = list() clear the list of already existing elements?
The answer is No.
list() will instantiate a new empty list.
You may want to take a look at this SO question: How to empty a list in Python?.
In short:
G.node[route[i]]['position'][:] = []
will clear your list.
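The difference between emptying in place and assigning a new list matters only when something else still refers to the old list. A small sketch with made-up names:

```python
positions = [1, 2, 3]
other_ref = positions        # second reference to the same list

positions[:] = []            # empties the shared list in place
print(other_ref)             # []

positions = [4, 5]
other_ref = positions
positions = list()           # rebinding: other_ref still sees the old list
print(other_ref)             # [4, 5]
```

In Python 3.3 and later, positions.clear() is an equivalent way to empty a list in place.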
