How to copy data in Python - python

After entering a command I am given data, which I then transform into a list. How do I copy all of the data from that list [A] and save it, so that when I enter a command and am given a second list of data [B], I can compare the two? Data that is the same in both lists should cancel out, so that only what differs between [A] and [B] is output. For example:
List [A]
1
2
3
List [B]
1
2
3
4
Using Python, I now want to compare the two lists to each other, and then output the differences.
Output = 4
Hopefully this makes sense!

You can use set operations.
a = [1,2,3]
b = [1,2,3,4]
print set(b) - set(a)
To output the data in list format, use the following print statement:
print list(set(b) - set(a))
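For reference, a minimal Python 3 sketch of the same idea (the snippets above use the Python 2 print statement). The helper name only_in_second is made up for illustration. Note that converting to sets discards duplicates and ordering, so the result is sorted here only to make the output predictable.

```python
def only_in_second(first, second):
    """Return the values present in `second` but not in `first`.

    Hypothetical helper: duplicates collapse because sets are used,
    and the result is sorted only to give a deterministic order.
    """
    return sorted(set(second) - set(first))


a = [1, 2, 3]
b = [1, 2, 3, 4]
print(only_in_second(a, b))  # [4]
```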

>>> b=[1,2,3,4]
>>> a=[1,2,3]
>>> [x for x in b if x not in a]
[4]

for element in b:
    if element in a:
        a.remove(element)
This approach modifies a in place, leaves a list rather than a set, and takes duplicates into account. That way [1,2,1] - [1,2] gives [1], not [].
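If mutating the original list is not desirable, the same duplicate-aware subtraction could be sketched with collections.Counter. This is a swapped-in technique, not the loop above: it assumes the elements are hashable, and the name multiset_difference is hypothetical.

```python
from collections import Counter


def multiset_difference(first, second):
    """Remove one occurrence from `first` for each occurrence in `second`.

    Hypothetical helper; elements must be hashable. Neither input list
    is modified. Equal elements end up grouped together in the result.
    """
    # Counter subtraction keeps only elements with a positive count.
    remaining = Counter(first) - Counter(second)
    return list(remaining.elements())


print(multiset_difference([1, 2, 1], [1, 2]))  # [1]
```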

Try itertools.izip_longest
import itertools
a = [1,2,3]
b = [1,2,3,4]
[y for x, y in itertools.izip_longest(a, b) if x != y]
# [4]
You could easily modify this further to return a 2-tuple for each difference, where the first item is the position in b and the second item is the value.
[(i, pair[1]) for i, pair in enumerate(itertools.izip_longest(a, b)) if pair[0] != pair[1]]
# [(3, 4)]
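In Python 3, itertools.izip_longest is named itertools.zip_longest. A minimal sketch of the same two comprehensions under that assumption (the variable names values and positions are illustrative):

```python
from itertools import zip_longest

a = [1, 2, 3]
b = [1, 2, 3, 4]

# Pair items by position; the shorter list is padded with None.
values = [y for x, y in zip_longest(a, b) if x != y]
print(values)  # [4]

# Same comparison, but also record the position of each difference.
positions = [(i, y) for i, (x, y) in enumerate(zip_longest(a, b)) if x != y]
print(positions)  # [(3, 4)]
```

Because the padding value is None, this only behaves as expected when None is not a legitimate item in either list.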

For entering the data use a loop:
def enterList():
    result = []
    while True:
        value = raw_input()
        if value:
            result.append(value)
        else:
            return result
A = enterList()
B = enterList()
For comparing you can use zip to build pairs and compare each of them:
for a, b in zip(A, B):
    if a != b:
        print a, "!=", b
This will truncate the comparison at the length of the shorter list; use the solution in another answer given here using itertools.izip_longest() to handle that.

Related

assert that all lists in a list of lists are equal

I want to check if all lists in a list of lists are equal.
One example for which I succeeded is a list of two lists l2, for which
all([a == b for a, b in zip(*l2)])
correctly returns True if l2 = [[1,2],[1,2]] and False when l2 = [[1,2],[1,666]].
I expected to be able to use the same code directly when the list of lists l contains more lists, but it does not seem to work.
For example, when
l=[[1,2],[1,2],[1,2]]
all([a == b for a, b in zip(*l)])
returns the following error:
ValueError: too many values to unpack (expected 2)
I do not understand why this is the case as the result of zip(*l) looks like it should work:
list(zip(*l))
>> [(1, 1, 1), (2, 2, 2)]
Use the observation that a == b and a == c imply b == c: you only need to test the first list against the other lists.
def equalLists(lists):
    return not lists or all(lists[0] == b for b in lists[1:])
>>> equalLists([])
True
>>> equalLists([[1,2],[1,2]])
True
>>> equalLists([[1,2],[1,2],[1,2]])
True
>>> equalLists([[1,2],[1,2],[1,3]])
False
You could create a set of tuples and check the size is 1, otherwise they are not all the same:
len(set(tuple(elem) for elem in l)) == 1
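This works for a list containing any number of lists, provided the elements are hashable. It still has to convert every list to a tuple, so it is linear in the total number of elements and, unlike a direct comparison, cannot stop at the first mismatch.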
(You have to convert to a tuple first because a list is not hashable and a set requires its members to be hashable.)
Your method (and the other answers here) doesn't consider that if the lists' lengths vary, zip will shorten them to the length of the shortest:
all(a == b for a,b in zip([1,2], [1,2,3]))
>>> True
First, note that it's not necessary to build a list inside all, as in all([...]): that constructs the whole list before all can start checking it, whereas the generator expression used above is evaluated lazily and stops at the first False.
If each list has hashable elements, I'd exploit set to calculate the distinct lists and check there's only 1:
len(set(tuple(x) for x in l)) == 1
If the elements aren't hashable, but do define equality via __eq__ (unlike your examples, since int is hashable), I'd compare each list to the first, using an iterator to avoid comparing the first list to itself:
li = iter(l)
first = next(li)
all(x == first for x in li)
This still makes use of Python's built-in list equality and won't do more comparisons than any zip method in the case that all the lists are equal.
The only case where the above is inefficient is if you have a list of long lists, where most but not all are equal. In that case it's possible a zip method would be quicker:
from itertools import zip_longest
all(len(set(x)) == 1 for x in zip_longest(*l))
Here I used zip_longest for the case where the list lengths are unequal. If you know the lengths are equal you can use zip. By default it fills in None for the shorter lists once they run out, so only use this if your lists contain no legitimate None values! (Otherwise you can set zip_longest(..., fillvalue="<something not in the lists>").)
Equivalent for non-hashable list elements (that define equality):
all(all(i == x[0] for i in x[1:]) for x in zip_longest(*l))
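The compare-each-list-to-the-first idea above can be wrapped in a small function. This is only a sketch: the name all_equal is made up, and treating an empty input as "all equal" is an assumption rather than something the answer states.

```python
def all_equal(lists):
    """Return True if every item of `lists` equals the first one."""
    iterator = iter(lists)
    try:
        first = next(iterator)
    except StopIteration:
        # Assumption: an empty collection counts as "all equal".
        return True
    # all() stops at the first mismatch; lists of different
    # lengths compare unequal, so nothing is silently truncated.
    return all(item == first for item in iterator)


print(all_equal([[1, 2], [1, 2], [1, 2]]))  # True
print(all_equal([[1, 2], [1, 2, 3]]))       # False
```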
l=[[1,2],[1,2],[1,2]]
all([a == b == c for a, b, c in zip(*l)])
The loop target a, b tells Python to unpack two values, but with three lists each tuple from zip(*l) has three items, such as (1, 1, 1).
To compare only the first two lists, unpack (or discard) the third value:
l=[[1,2],[1,2],[1,2]]
all([a == b for a, b, c in zip(*l)])
or
l=[[1,2],[1,2],[1,2]]
all([a == b for a, b, *_ in zip(*l)])
# OR all([a == b for a, b, _ in zip(*l)])
# OR all([i[0] == i[1] for i in zip(*l)])
Edit, as per a comment.
To test that all sub-elements are equal:
all([len(set(i)) == 1 for i in zip(*l)])

Python tuples: compare and merge without for loops

I have two lists of over 100,000 tuples each. Each tuple in the first list holds two strings; each tuple in the second holds five. Each tuple in the first list shares a common value with a tuple in the other list. For example:
tuple1 = [('a','1'), ('b','2'), ('c','3')]
tuple2 = [('$$$','a','222','###','HHH'), ('ASA','b','QWER','TY','GFD'), ('aS','3','dsfs','sfs','sfs')]
I have a function that is able to remove redundant tuple values and match on the information that is important:
def match_comment_and_thread_data(tuple1, tuple2):
    i = 0
    out_thread_tuples = [(b, c, d, e) for a, b, c, d, e in tuple2]
    print('Out Thread Tuples Done')
    final_list = [x + y for x in tuple2 for y in tuple1 if x[0] == y[0]]
    return final_list
which ought to return:
final_list = [('a','1','222','###','HHH'), ('b','2','QWER','TY','GFD'), ('c','3','dsfs','sfs','sfs')]
However, the lists are insanely long. Is there any way to get around the computational time commitment of for loops when comparing and matching tuple values?
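By using a dictionary, this can be done in O(n):
dict1 = dict(tuple1)
# Tuples whose second field is not a key of dict1 (such as '3' above) are skipped.
final_list = [(tup[1], dict1[tup[1]]) + tup[2:] for tup in tuple2 if tup[1] in dict1]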
tuple1 = [('a','1'), ('b','2'), ('c','3')]
tuple2 = [('$$$','a','222','###','HHH'), ('ASA','b','QWER','TY','GFD'), ('aS','3','dsfs','sfs','sfs')]
def match_comment_and_thread_data(tuple1, tuple2):
    out_thread_dict = dict([(b, (c, d, e)) for a, b, c, d, e in tuple2])
    final_list = [x + out_thread_dict.get(x[0], out_thread_dict.get(x[1])) for x in tuple1]
    return final_list
By using a dictionary instead, each lookup is O(1) on average. You still have to visit each item in tuple1, but the match is fast, although you need a lot more than 3 values to see the benefit.
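A self-contained sketch of the dictionary approach, run against the example data from the question. The function name merge_on_key is hypothetical, and the sketch assumes that the second field of each five-item tuple matches either the first or the second field of some pair, as in the sample data; pairs with no match are skipped.

```python
def merge_on_key(pairs, records):
    """Join each 2-tuple in `pairs` with the tail of its matching record."""
    # Index the records by their second field for O(1) average lookups.
    tail_by_field = {record[1]: record[2:] for record in records}
    merged = []
    for key, value in pairs:
        tail = tail_by_field.get(key)
        if tail is None:
            # Fall back to matching on the pair's second field.
            tail = tail_by_field.get(value)
        if tail is not None:  # skip pairs with no matching record
            merged.append((key, value) + tail)
    return merged


tuple1 = [('a', '1'), ('b', '2'), ('c', '3')]
tuple2 = [('$$$', 'a', '222', '###', 'HHH'),
          ('ASA', 'b', 'QWER', 'TY', 'GFD'),
          ('aS', '3', 'dsfs', 'sfs', 'sfs')]
final_list = merge_on_key(tuple1, tuple2)
print(final_list)
```

Building the index is one pass over the second list and the merge is one pass over the first, so the nested loop over both lists is avoided.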

How to XOR two lists in Python? [duplicate]

I've got two lists, for example:
a = ['hello','world']
b = ['hello','world','im','steve']
If I want to create a third list that only contains elements NOT in both:
c = ['im','steve']
How do I do this if the order of the elements IS important? I know I can use sets but they keep throwing out the order of my lists. I could use ' '.join(list) to convert them to strings but not sure how to do this operation in that format either.
You can concatenate the lists and use list comprehension:
a = ['hello','world']
b = ['hello','world','im','steve']
final_vals = [i for i in a+b if i not in a or i not in b]
Output:
['im', 'steve']
Option 1: set method (recommended)
Sets have a symmetric_difference method that returns the elements found in exactly one of a or b. Order can be preserved with a list comprehension over the concatenated list a + b.
comp = set(a).symmetric_difference(b)
[x for x in a + b if x in comp]
# ['im', 'steve']
Option 2: pathlib method
For reference, another way to diff two lists might be with pathlib.Path.relative_to method:
import pathlib
p = pathlib.Path(*b)
r = p.relative_to(*a)
list(r.parts)
# ['im', 'steve']
Note: b is the longer list, and a must be a leading prefix of b; otherwise relative_to raises ValueError. This option is potentially less efficient than a simple list comprehension.
Concatenate the two lists and remove every element of their intersection from the result. Order is preserved.
c = a + b
for v in set(a).intersection(set(b)):
    while v in c:
        c.remove(v)
a = ['hello','world']
b = ['hello','world','im','steve']
a = set(a)
b = set(b)
print(a.symmetric_difference(b))
This code prints the elements that appear in only one of the lists. Note that the result is a set, so the original order is not preserved.
See:
https://learnpython.org/en/Sets
You could also just create a function that filters elements from l1 that don't exist in l2, and call it twice with the arguments flipped:
a = ['hello','world', 'foo']
b = ['hello','world','im','steve']
def difference(l1, l2):
    return list(filter(lambda x: x not in l2, l1))
print(difference(a, b) + difference(b, a))
# ['foo', 'im', 'steve']
If you don't wish to use filter(), a simple list comprehension like this also works:
[item for item in l1 if item not in l2]
The question is not very clear, and you're probably fine with @Ajax1234's answer, but here's another take on it.
If you want to compare positions (roughly what a bitwise XOR would do), you can walk the shortest list, check position by position whether each word matches the word at the same position in the longest list, and then add the remainder (the "unwalked" part of the longest list). Something like the following:
a = ['hello', 'world']
b = ['hello', 'world', 'im', 'steve']
min_list = a if len(a) < len(b) else b
max_list = b if len(b) > len(a) else a
results = []
for i, item in enumerate(min_list):
    # Iterate through the shortest list to avoid IndexError(s)
    if item != max_list[i]:
        results.append(item)
        results.append(max_list[i])
# Add the "unwalked" remainder; slicing by length also works if min_list is empty
results.extend(max_list[len(min_list):])
print(results)
# Prints: ['im', 'steve']
However, you then have to decide what to do when the items at the same position don't match. In the code above, I just added both entries to the results list, which means that the following inputs:
a = ['hello', 'foo']
b = ['hello', 'world', 'im', 'steve']
would output:
>>> ['foo', 'world', 'im', 'steve']
(notice both foo from list a and world from list b have been added)
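The same position-by-position comparison could also be sketched with itertools.zip_longest, which removes the need to work out which list is shorter. This is an alternative formulation, not the code above: the function name positional_difference and the _MISSING sentinel are made up for illustration. For mismatched positions it emits the item from the first argument before the item from the second.

```python
from itertools import zip_longest

# Unique sentinel so that None can still be a legitimate list item.
_MISSING = object()


def positional_difference(first, second):
    """Collect the items that differ at the same position in two lists."""
    result = []
    for left, right in zip_longest(first, second, fillvalue=_MISSING):
        if left == right:
            continue
        # Keep both sides of a mismatch, dropping the padding sentinel.
        result.extend(item for item in (left, right) if item is not _MISSING)
    return result


print(positional_difference(['hello', 'world'],
                            ['hello', 'world', 'im', 'steve']))
# ['im', 'steve']
print(positional_difference(['hello', 'foo'],
                            ['hello', 'world', 'im', 'steve']))
# ['foo', 'world', 'im', 'steve']
```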
Using standard for loop to check for items not in one or the other list (may be more understandable than list comprehension):
a = ['hello','world', 'foo']
b = ['hello','world','im','steve']
c = a+b
ans = []
for i in c:
    if i not in a or i not in b:
        ans.append(i)
print(ans)
Output:
['foo', 'im', 'steve']
I recommend using the ^ operator with sets, as in set(a) ^ set(b). Example (demo):
>>> a = ['hello','world']
>>> b = ['hello','world','im','steve']
>>> set(a) ^ set(b)
{'steve', 'im'}
>>> sorted(set(a) ^ set(b),key=max([a,b],key=len).index)
['im', 'steve']
>>>
https://docs.python.org/2/library/stdtypes.html#frozenset.symmetric_difference

Opposite of set.intersection in python?

In Python you can use a.intersection(b) to find the items common to both sets.
Is there a way to do the disjoint opposite version of this? Items that are not common to both a and b; the unique items in a unioned with the unique items in b?
You are looking for the symmetric difference; all elements that appear only in set a or in set b, but not both:
a.symmetric_difference(b)
From the set.symmetric_difference() method documentation:
Return a new set with elements in either the set or other but not both.
You can use the ^ operator too, if both a and b are sets:
a ^ b
while set.symmetric_difference() takes any iterable for the other argument.
The output is the equivalent of (a | b) - (a & b), the union of both sets minus the intersection of both sets.
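A quick sketch to check that the three forms agree; the sample sets are arbitrary values chosen for illustration.

```python
a = {1, 2, 4, 5, 6}
b = {5, 6, 4, 9}

by_method = a.symmetric_difference(b)
by_operator = a ^ b
by_identity = (a | b) - (a & b)  # union minus intersection

print(by_method == by_operator == by_identity)  # True
print(sorted(by_method))  # [1, 2, 9]

# The method form also accepts a non-set iterable such as a list.
from_list = a.symmetric_difference([5, 6, 4, 9])
print(from_list == by_method)  # True
```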
a={1,2,4,5,6}
b={5,6,4,9}
c=(a^b)&b
print(c) # you got {9}
The best way is a list comprehension.
a = [ 1,2,3,4]
b = [ 8,7,9,2,1]
c = [ element for element in a if element not in b]
d = [ element for element in b if element not in a]
print(c)
# output is [ 3,4]
print(d)
# output is [8,7,9]
You can then join both lists (c + d) to get the items that are in only one of a and b.
Try this code for a minus the intersection of a and b:
a = [1,2,3,4,5,6]
b = [2,3]
for i in b:
    if i in a:
        a.remove(i)
print(a)
The output is [1,4,5,6].
I hope it works for you.
e and f are the two lists you want to compare; c collects the elements that are not shared.
a = [1,2,3,4]
b = [8,7,9,2,1]
c = []
def loop_to_check(e,f):
    for i in range(len(e)):
        if e[i] not in f:
            c.append(e[i])
loop_to_check(a,b)
loop_to_check(b,a)
print(c)
## output is [3,4,8,7,9]
This loops over both lists and collects the elements that appear in only one of them.

Map the max function for list of the lists

I am stuck on the following problem: I need to find the maximum at each position across several lists. The map function works well when the lists are passed separately, but how can I make it work for a list of lists? Using map(max, d) gives the maximum of each list instead. The problem is that the number of lists in the list is variable. Any suggestions are welcome!
The input for the problem is d, not a, b, c: d is a list of lists, and the comparison is per position across the lists.
a = [0,1,2,6]
b = [5,1,0,7]
c = [3,8,0,8]
map(max,a,b,c)
# [5,8,2,8]
d = [a,b,c]
map(max,d)
[6,7,8]
a = [0,1,2,6]
b = [5,1,0,7]
c = [3,8,0,8]
print [max(itm) for itm in zip(a, b, c)]
or even shorter:
print map(max, zip(a, b, c))
How about this:
max(map(max,d))
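Since the question asks for a per-position maximum over a variable number of lists held in d, one possible Python 3 sketch unpacks d with zip(*d). The variable name column_max is illustrative.

```python
a = [0, 1, 2, 6]
b = [5, 1, 0, 7]
c = [3, 8, 0, 8]
d = [a, b, c]

# zip(*d) regroups the rows into columns, however many lists d holds.
column_max = [max(column) for column in zip(*d)]
print(column_max)  # [5, 8, 2, 8]

# Equivalent with map; this form needs at least two lists in d,
# because max() called with a single number raises TypeError.
print(list(map(max, *d)))  # [5, 8, 2, 8]
```

Note that zip stops at the shortest list, so positions beyond it are ignored if the lists differ in length.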
