Python: if element in one list, change element in other? - python

I have two lists (of different lengths). One changes throughout the program (list1), the other (longer) doesn't (list2). Basically I have a function that is supposed to compare the elements in both lists, and if an element in list1 is in list2, that element in a copy of list2 is changed to 'A', and all other elements in the copy are changed to 'B'. I can get it to work when there is only one element in list1. But for some reason if the list is longer, all the elements in list2 turn to B....
def newList(list1,list2):
newList= list2[:]
for i in range(len(list2)):
for element in list1:
if element==newList[i]:
newList[i]='A'
else:
newList[i]='B'
return newList

Try this:
newlist = ['A' if x in list1 else 'B' for x in list2]
Works for the following example, I hope I understood you correctly? If a value in B exists in A, insert 'A' otherwise insert 'B' into a new list?
>>> a = [1,2,3,4,5]
>>> b = [1,3,4,6]
>>> ['A' if x in a else 'B' for x in b]
['A', 'A', 'A', 'B']

It could be because instead of
newList: list2[:]
you should have
newList = list2[:]
Personally, I prefer the following syntax, which I find to be more explicit:
import copy
newList = copy.copy(list2) # or copy.deepcopy
Now, I'd imagine part of the problem here is also that you use the same name, newList, for both your function and a local variable. That's not so good.
def newList(changing_list, static_list):
temporary_list = static_list[:]
for index, content in enumerate(temporary_list):
if content in changing_list:
temporary_list[index] = 'A'
else:
temporary_list[index] = 'B'
return temporary_list
Note here that you have not made it clear what to do when there are multiple entries in list1 and list2 that match. My code marks all of the matching ones 'A'. Example:
>>> a = [1, 2, 3]
>>> b = [3,4,7,2,6,8,9,1]
>>> newList(a,b)
['A', 'B', 'B', 'A', 'B', 'B', 'B', 'A']

I think this is what you want to do and can put newLis = list2[:] instead of the below but prefer to use list in these cases:
def newList1(list1,list2):
newLis = list(list2)
for i in range(len(list2)):
if newLis[i] in list1:
newLis[i]='A'
else: newLis[i]='B'
return newLis
The answer when passed
newList1(range(5),range(10))
is:
['A', 'A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'B']

Related

pytest dynamically generate cases by comparing 2 lists

I am having so much trouble in dynamically generating pytest test cases. here is the scenario.
I have 2 lists with string elements
List1 = ['a', 'b', 'c']
List2 = ['a', 'c', 'd']
I want to compare these 2 lists and compare first element from List1 with first element of List2.
For instance if first element from List1 'a' == first element from List2 'a' then it's a PASS. Now 2nd element from List1 does not match with 2nd element from List2 so it should throw FAIL in this case 'b' != 'c'
I am not sure how to write pytest test case for this. These 2 lists are long lists with too many elements
this is what I am doing
def list1():
for i in some_csv:
list1.append(i)
def list2():
for i in some_csv:
list2.append(i)
#pytest.mark.parametrize("list1", list1())
def test_validation(list1):
assert list1 in list2()
What am I doing wrong?
Why not just compare the list directly?
assert list1() == list2()
But if you insist on testing element-wise:
#pytest.mark.parametrize(
"one, two", # these are the args of the test case
zip(list1(), list2()), # generate tuples to unpack into the args
)
def test_element(one, two):
assert one == two
The test names will be messy, though. You may want to provide a list / iterator to ids to make the names look nicer.
In a first time you forgot the return in each list method.
In a second time the test is not the test you describe.
prametrize will loop over the result of list1() this mean you will compare if 'a' is in the result of list2().
Exemple
'a' in ['a', 'c', 'd']
Using in could introduce some issue for examle
List1 = ['a', 'b', 'c', 'd']
List2 = ['a', 'c', 'd' 'b']
'b' in list2() # This is True
My understanding of your issue is to compare the equality of the two list.To do it you can do
['a', 'b', 'c'] == ['a', 'c', 'd']
Your example will look like this
def list1():
for i in some_csv:
list1.append(i)
return list1
def list2():
for i in some_csv:
list2.append(i)
return list2
def test_validation():
assert list1() == list2()
Best regards

Would like to prevent dupes in a python list of lists

I am creating a list of lists and want to prevent dupes. For example, I have:
mainlist = [[a,b],[c,d],[a,d]]
the next item (list) to be added is [b,a] which is considered a duplicate of [a,b].
UPDATE
mainlist = [[a,b],[c,d],[a,d]]
swap = [b,a]
for item in mainlist:
if set(item) & set(swap):
print "match was found", item
else:
mainlist.append(swap)
Any suggestions as to how I can test whether the next item to be added is already in the list?
Here's an approach using frozensets within a set to check for duplicates. It's a bit ugly since I'm invoking a function that works with global variables.
def add_to_mainlist(new_list):
if frozenset(new_list) not in dups:
mainlist.append(new_list)
mainlist = [['a', 'b'],['c', 'd'],['a', 'd']]
dups = set()
for l in mainlist:
dups.add(frozenset(l))
print("Before:", mainlist)
add_to_mainlist(['a', 'b'])
print("After:", mainlist)
This outputs:
Before: [['a', 'b'], ['c', 'd'], ['a', 'd']]
After: [['a', 'b'], ['c', 'd'], ['a', 'd']]
Showing that the new list was indeed not added to the original.
Here's a cleaner version that calculates the existing set on the fly inside a function that does everything locally:
def add_to_mainlist(mainlist, new_list):
dups = set()
for l in mainlist:
dups.add(frozenset(l))
if frozenset(new_list) not in dups:
mainlist.append(new_list)
return mainlist
mainlist = [['a', 'b'],['c', 'd'],['a', 'd']]
print("Before:", mainlist)
mainlist = add_to_mainlist(mainlist, ['a', 'b']) # the assignment isn't needed, but done anyway :-)
print("After:", mainlist)
Why doesn't your existing code work?
This is what you're doing:
...
for item in mainlist:
if set(item) & set(swap):
print "match was found", item
else:
mainlist.append(swap)
You're intersecting two sets and checking the truthiness of the result. While this might be okay for 0 intersections, in the event that even one of the elements are common (example, ['a', 'b'] and ['b', 'd']), you'd still declare a match which is false.
Ideally you'd want to check the length of the resultant set and make sure its length is equal to than 2:
dups = False
for item in mainlist:
if len(set(item) & set(swap)) == 2:
dups = True
break
if dups == False:
mainlist.append(swap)
You'd also ideally want a flag to ensure that you did not find duplicates. Your previous code would add without checking all items first.
If the order of your inner lists doesn't matter, then this can trivially be accomplished using frozenset()s:
>>> mainlist = [['a', 'b'],['c', 'd'],['a', 'd']]
>>> mainlist = [frozenset(sublist) for sublist in mainlist]
>>>
>>> def add_to_list(lst, sublist):
... if frozenset(sublist) not in lst:
... lst.append(frozenset(sublist))
...
>>> mainlist
[frozenset({'a', 'b'}), frozenset({'d', 'c'}), frozenset({'a', 'd'})]
>>> add_to_list(mainlist, ['b', 'a'])
>>> mainlist
[frozenset({'a', 'b'}), frozenset({'d', 'c'}), frozenset({'a', 'd'})]
>>>
If the order does matter you can either do what #Coldspeed suggested - Construct a set() from your list, construct a frozenset() from the list to be added, and test for membership - or you can use all() and sorted() to test if the list to be added is equivalent to any of the other lists:
>>> def add_to_list(lst, sublist):
... for l in lst:
... if all(a == b for a, b in zip(sorted(sublist), sorted(l))):
... return
... lst.append(sublist)
...
>>> mainlist
[['a', 'b'], ['c', 'd'], ['a', 'd']]
>>> add_to_list(mainlist, ['b', 'a'])
>>> mainlist
[['a', 'b'], ['c', 'd'], ['a', 'd']]
>>>

Including first and last elements in list comprehension

I would like to keep the first and last elements of a list, and exclude others that meet defined criteria without using a loop. The first and last elements may or may not have the criteria of elements being removed.
As a very basic example,
aList = ['a','b','a','b','a']
[x for x in aList if x !='a']
returns ['b', 'b']
I need ['a','b','b','a']
I can split off the first and last values and then re-concatenate them together, but this doesn't seem very Pythonic.
You can use slice assignment:
>>> aList = ['a','b','a','b','a']
>>> aList[1:-1]=[x for x in aList[1:-1] if x !='a']
>>> aList
['a', 'b', 'b', 'a']
Yup, it looks like dawg’s and jez’s suggested answers are the right ones, here. Leaving the below for posterity.
Hmmm, your sample input and output don’t match what I think your question is, and it is absolutely pythonic to use slicing:
a_list = ['a','b','a','b','a']
# a_list = a_list[1:-1] # take everything but the first and last elements
a_list = a_list[:2] + a_list[-2:] # this gets you the [ 'a', 'b', 'b', 'a' ]
Here's a list comprehension that explicitly makes the first and last elements immune from removal, regardless of their value:
>>> aList = ['a', 'b', 'a', 'b', 'a']
>>> [ letter for index, letter in enumerate(aList) if letter != 'a' or index in [0, len(x)-1] ]
['a', 'b', 'b', 'a']
Try this:
>>> list_ = ['a', 'b', 'a', 'b', 'a']
>>> [value for index, value in enumerate(list_) if index in {0, len(list_)-1} or value == 'b']
['a', 'b', 'b', 'a']
Although, the list comprehension is becoming unwieldy. Consider writing a generator like so:
>>> def keep_bookends_and_bs(list_):
... for index, value in enumerate(list_):
... if index in {0, len(list_)-1}:
... yield value
... elif value == 'b':
... yield value
...
>>> list(keep_bookends_and_bs(list_))
['a', 'b', 'b', 'a']

Comparing Order of 2 Python Lists

I am looking for some help comparing the order of 2 Python lists, list1 and list2, to detect when list2 is out of order.
list1 is static and contains the strings a,b,c,d,e,f,g,h,i,j. This is the "correct" order.
list2 contains the same strings, but the order and the number of strings may change. (e.g. a,b,f,d,e,g,c,h,i,j or a,b,c,d,e)
I am looking for an efficient way to detect when list2 is our of order by comparing it against list1.
For example, if list2 is a,c,d,e,g,i should return true (as the strings are in order)
While, if list2 is a,d,b,c,e should return false (as string d appears out of order)
First, let's define list1:
>>> list1='a,b,c,d,e,f,g,h,i,j'.split(',')
>>> list1
['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
While your list1 happens to be in alphabetical order, we will not assume that. This code works regardless.
Now, let's create a list2 that is out-of-order:
>>> list2 = 'a,b,f,d,e,g,c,h,i,j'.split(',')
>>> list2
['a', 'b', 'f', 'd', 'e', 'g', 'c', 'h', 'i', 'j']
Here is how to test whether list2 is out of order or not:
>>> list2 == sorted(list2, key=lambda c: list1.index(c))
False
False means out-of-order.
Here is an example that is in order:
>>> list2 = 'a,b,d,e'.split(',')
>>> list2 == sorted(list2, key=lambda c: list1.index(c))
True
True means in-order.
Ignoring elements of list1 not in list2
Let's consider a list2 that has an element not in list1:
>>> list2 = 'a,b,d,d,e,z'.split(',')
To ignore the unwanted element, let's create list2b:
>>> list2b = [c for c in list2 if c in list1]
We can then test as before:
>>> list2b == sorted(list2b, key=lambda c: list1.index(c))
True
Alternative not using sorted
>>> list2b = ['a', 'b', 'd', 'd', 'e']
>>> indices = [list1.index(c) for c in list2b]
>>> all(c <= indices[i+1] for i, c in enumerate(indices[:-1]))
True
Why do you need to compare it to list1 since it seems like list1 is in alphabetical order? Can't you do the following?
def is_sorted(alist):
return alist == sorted(alist)
print is_sorted(['a','c','d','e','g','i'])
# True
print is_sorted(['a','d','b','c','e'])
# False
Here's a solution that runs in expected linear time. That isn't too important if list1 is always 10 elements and list2 isn't any longer, but with longer lists, solutions based on index will experience extreme slowdowns.
First, we preprocess list1 so we can quickly find the index of any element. (If we have multiple list2s, we can do this once and then use the preprocessed output to quickly determine whether multiple list2s are sorted):
list1_indices = {item: i for i, item in enumerate(list1)}
Then, we check whether each element of list2 has a lower index in list1 than the next element of list2:
is_sorted = all(list1_indices[x] < list1_indices[y] for x, y in zip(list2, list2[1:]))
We can do better with itertools.izip and itertools.islice to avoid materializing the whole zip list, letting us save a substantial amount of work if we detect that list2 is out of order early in the list:
# On Python 3, we can just use zip. islice is still needed, though.
from itertools import izip, islice
is_sorted = all(list1_indices[x] < list1_indices[y]
for x, y in izip(list2, islice(list2, 1, None)))
is_sorted = not any(list1.index(list2[i]) > list1.index(list2[i+1]) for i in range(len(list2)-1))
The function any returns true if any of the items in an iterable are true. I combined this with a generator expression that loops through all the values of list2 and makes sure they're in order according to list1.
if list2 == sorted(list2,key=lambda element:list1.index(element)):
print('sorted')
Let's assume that when you are writing that list1 is strings a,b,c,d,e,f,g,h,i that this means that a could be 'zebra' and string b could actually be 'elephant' so the order may not be alphabetical. Also, this approach will return false if an item is in list2 but not in list1.
good_list2 = ['a','c','d','e','g','i']
bad_list2 = ['a','d','b','c','e']
def verify(somelist):
list1 = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
while len(list1) > 0:
try:
list1 = list1[:list1.index(somelist.pop())]
except ValueError:
return False
return True

How do I write "for item in list not in other_list" in one line using Python?

I want to make a loop for items in list that are not present in other_list, in one line. Something like this:
>>> list = ['a', 'b', 'c', 'd']
>>> other_list = ['a', 'd']
>>> for item in list not in other_list:
... print item
...
b
c
How is this possible?
for item in (i for i in my_list if i not in other_list):
print item
Its a bit more verbose, but its just as efficient, as it only renders each next element on the following loop.
Using set (which might do more than what you actually want to do) :
for item in set(list)-set(other_list):
print item
A third option: for i in filter(lambda x: x not in b, a): print(i)
list comprehension is your friend
>>> list = ['a', 'b', 'c', 'd']
>>> other_list = ['a', 'd']
>>> [x for x in list if x not in other_list]
['b', 'c']
also dont name things "list"

Categories

Resources