Comparing elements in two lists in Python

I have a function that compares the elements of two lists and returns how many times each list has the larger element. I have two versions of it. The first one works, but the second one does not. What is wrong with the second function? The inputs a and b are two lists of the same length.
# Version 1 (works)
def compareLists(a, b):
    A = sum([1 if i > j else 0 for i, j in zip(a, b)])
    B = sum([1 if j > i else 0 for i, j in zip(a, b)])
    return (A, B)
# Version 2 (does not work)
def compareLists(a, b):
    A = sum([1 for i in range(0, len(a)) if a[i] > b[i] else 0])
    B = sum([1 for i in range(0, len(a)) if b[i] > a[i] else 0])
    return (A, B)
Example input and output: a = [1, 2, 3, 4]; b = [0, -2, 5, 6]; output = (2, 2)

You don't need the conditional expression (if-else) in the second version: a trailing if clause in a list comprehension already filters which items are produced:
A = sum([1 for i in range(0, len(a)) if a[i] > b[i]])
B = sum([1 for i in range(0, len(a)) if b[i] > a[i]])
Adding else to that filter clause, as you do in your second version, makes the syntax invalid.
For completeness, as @wim noted in the comments, the conditional expression is unnecessary in your first version too: bool is a subclass of int in Python, with True equal to 1 and False equal to 0, so you can sum the Boolean values returned by the comparison operators directly:
A = sum([i > j for i, j in zip(a, b)])
B = sum([j > i for i, j in zip(a, b)])
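As a quick check of the points above, the following sketch puts the three working variants side by side and runs them on the example input from the question. The function names are made up for illustration and do not appear in the original posts.

```python
def compare_ternary(a, b):
    # Conditional expression in the output position: every pair yields 1 or 0.
    wins_a = sum([1 if i > j else 0 for i, j in zip(a, b)])
    wins_b = sum([1 if j > i else 0 for i, j in zip(a, b)])
    return (wins_a, wins_b)


def compare_filter(a, b):
    # Trailing `if` clause: pairs that fail the test are skipped entirely,
    # so no `else` branch is needed (or allowed) here.
    wins_a = sum([1 for i, j in zip(a, b) if i > j])
    wins_b = sum([1 for i, j in zip(a, b) if j > i])
    return (wins_a, wins_b)


def compare_bool(a, b):
    # Comparisons return bool, and bool is a subclass of int,
    # so the results can be summed directly.
    wins_a = sum(i > j for i, j in zip(a, b))
    wins_b = sum(j > i for i, j in zip(a, b))
    return (wins_a, wins_b)


a = [1, 2, 3, 4]
b = [0, -2, 5, 6]
results = [f(a, b) for f in (compare_ternary, compare_filter, compare_bool)]
print(results)  # [(2, 2), (2, 2), (2, 2)]
```

All three agree on this input; the last form is the shortest, while the filter form may read more clearly to people who are not used to summing Booleans.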

Related

How to judge if a string contains a given substring (with gaps)

e.g.
a = 'abc123def'
b = 'abcdef'
I want a function that can judge whether b is contained in a, allowing gaps:
contains(a,b)=True
P.S. A gap is also allowed in the representation of b, e.g.
b='abc_def'
but regular expressions are not allowed.
If what you want to do is to check whether b is a subsequence of a, you can write:
def contains(a, b):
    n, m = len(a), len(b)
    j = 0
    for i in range(n):
        if j < m and a[i] == b[j]:
            j += 1
    return j == m
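The same subsequence test can also be written with an iterator. This is only a sketch: the name `is_subsequence` and the `gap` parameter are hypothetical, and it assumes the gap marker in b (the `_` from the question) carries no content and can simply be dropped.

```python
def is_subsequence(a, b, gap="_"):
    # Assumption: the gap marker in b is a placeholder, so remove it first.
    needle = b.replace(gap, "")
    it = iter(a)
    # `ch in it` advances the iterator until it finds ch, so each later
    # character is only searched for in the remainder of a.
    return all(ch in it for ch in needle)


print(is_subsequence("abc123def", "abcdef"))   # True
print(is_subsequence("abc123def", "abc_def"))  # True
print(is_subsequence("abc123def", "fed"))      # False
```

Because the iterator is consumed as it goes, the characters of b must appear in a in the same order, which is what distinguishes a subsequence test from a plain membership test per character.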
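Try using a list comprehension:
def contains(main_string, sub_string):
    return all([i in main_string for i in sub_string])
NOTE: 'all' is a built-in function that takes an iterable and returns True if every element is true. This version only checks that each character of sub_string occurs somewhere in main_string; it does not check their order.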
def new_contained(a,b):
    boo = False
    c = [c for c in a]
    d = [i for i in b]
    if len(c)<=len(d):
        for i in c:
            if i in d:
                boo = True
    return boo

Python: return the maximum of the first n elements in a list

I have a list a and need to build two lists from it that keep track of each new running maximum and each new running minimum, respectively.
Is there a function in some package or numpy that doesn't require a loop? I need to speed up my code as my dataset is huge.
a=[4,2,6,5,2,6,9,7,10,1,2,1]
b=[];c=[];
for i in range(len(a)):
    if i==0:
        b.append(a[i])
    elif a[i]>b[-1]:
        b.append(a[i])
for i in range(len(a)):
    if i==0:
        c.append(a[i])
    elif a[i]<c[-1]:
        c.append(a[i])
# The output should be:
# b=[4,6,9,10]; c=[4,2,1]
Since you say you are dealing with a very large dataset and want to avoid loops, here is a potential solution that keeps the number of loop iterations to a minimum:
import numpy as np
def while_loop(a):
    b = [a[0]]
    c = [a[0]]
    a = np.array(a[1:])
    while a.size:
        if a[0] > b[-1]:
            b.append(a[0])
        elif a[0] < c[-1]:
            c.append(a[0])
        # Keep only the values that can still set a new maximum or minimum.
        a = a[(a > b[-1]) | (a < c[-1])]
    return b, c
EDIT:
from timeit import timeit
def for_loop(a):
    b = [a[0]]
    c = [a[0]]
    for x in a[1:]:
        if x > b[-1]:
            b.append(x)
        elif x < c[-1]:
            c.append(x)
    return b, c
print(
    timeit(lambda: while_loop(np.random.randint(0, 10000, 10000)), number=100000)
) # 27.847886939000002
print(
    timeit(lambda: for_loop(np.random.randint(0, 10000, 10000)), number=100000)
) # 112.90950811199998
OK, so I just checked the timing against the regular for loop, and on this random input the while loop is about 4x faster. No guarantee, though, since this strongly depends on the structure of your dataset (see comments).
To start, you can simply initialize b and c with the first element of a. This simplifies the loop (of which you only need 1):
a = [...]
b = [a[0]]
c = [a[0]]
for x in a[1:]:
    if x > b[-1]:
        b.append(x)
    elif x < c[-1]:
        c.append(x)
Note that inside the loop, a value of x cannot be both larger than the current maximum and smaller than the current minimum, hence the elif rather than two separate if statements.
Another optimization would be to use additional variables to avoid indexing b and c repeatedly, as well as an explicit iterator to avoid making a shallow copy of a.
a = [...]
a_iter = iter(a)
curr_min = curr_max = next(a_iter)
b = [curr_max]
c = [curr_min]
for x in a_iter:
    if x > curr_max:
        b.append(x)
        curr_max = x
    elif x < curr_min:
        c.append(x)
        curr_min = x
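Since the question asks for a numpy approach without an explicit Python loop, here is one possible sketch using `np.maximum.accumulate` and `np.minimum.accumulate`. It was not proposed in the answers above; the function name `running_records` is made up, and the code assumes a is non-empty and numeric.

```python
import numpy as np


def running_records(a):
    # Assumption: a is a non-empty sequence of numbers.
    a = np.asarray(a)
    running_max = np.maximum.accumulate(a)  # max of a[:i+1] at each position i
    running_min = np.minimum.accumulate(a)  # min of a[:i+1] at each position i
    # An element is a new record if it beats the extreme of everything
    # before it; the first element always counts.
    is_new_max = np.concatenate(([True], a[1:] > running_max[:-1]))
    is_new_min = np.concatenate(([True], a[1:] < running_min[:-1]))
    return a[is_new_max].tolist(), a[is_new_min].tolist()


a = [4, 2, 6, 5, 2, 6, 9, 7, 10, 1, 2, 1]
b, c = running_records(a)
print(b)  # [4, 6, 9, 10]
print(c)  # [4, 2, 1]
```

Whether this is faster than the single-pass loop depends on the data and its size, so it is worth timing on a realistic sample before committing to it.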

Find number of pairs that add up to a specific number from two different lists?

a = [1,2,3,4,5,6,7]
b = [56,59,62,65,67,69]
def sumOfTwo(a,b,v):
    for i in range (len(a)):
        val_needed = v - a[i]
        for j in range (len(b)):
            if b[j] == val_needed:
                x = b[j]
                y = a[i]
    print(x,y)
sumOfTwo(a,b,v=70)
Output: 5 65
What if more than one pair is possible for the given lists? How do I print all of them?
What other ways are there to achieve this?
If you just want to print the matched values, you only have to indent the print statement so that it is inside the if, as shown below. You should also use a more Pythonic approach to the for loops and the variable assignments.
a = [1,2,3,4,5,6,7]
b = [56,59,62,65,67,69]
def sumOfTwo(a,b,v):
    for i in a:
        val_needed = v - i
        for j in b:
            if j == val_needed:
                x, y = j, i
                print(x,y)
sumOfTwo(a,b,v=70)
Using a list comprehension:
a = [1,2,3,4,5,6,7]
b = [56,59,62,65,67,69]
c = [(x, y)
     for x in a
     for y in b
     if x + y == 70]
print(c)
This yields
[(1, 69), (3, 67), (5, 65)]
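Both versions above compare every element of a with every element of b. If the lists are large, one alternative is to put b into a set and look up the needed value directly. This is a sketch rather than one of the original answers: `pairs_with_sum` is a made-up name, and it assumes duplicates in b do not need to be reported separately.

```python
def pairs_with_sum(a, b, v):
    # Assumption: duplicates in b do not need to be reported separately.
    b_values = set(b)  # membership tests on a set take constant time on average
    return [(x, v - x) for x in a if (v - x) in b_values]


a = [1, 2, 3, 4, 5, 6, 7]
b = [56, 59, 62, 65, 67, 69]
pairs = pairs_with_sum(a, b, 70)
print(pairs)  # [(1, 69), (3, 67), (5, 65)]
```

This walks a once instead of running a nested loop, at the cost of building the set up front.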

Finding missing elements in a List

Hello, I have a list with a lot of elements in it. They are numbers and they are ordered, but some numbers are missing.
Example: L =[1,2,3,4,6,7,10]
Missing: M = [5,8,9]
How can I find missing numbers in Python?
Take the difference between the sets:
set(range(min(L),max(L))) - set(L)
If you are really crunched for time and L is truly sorted, then
set(range(L[0], L[-1])) - set(L)
This function should do the trick
def missing_elements(L):
    s, e = L[0], L[-1]
    return sorted(set(range(s, e + 1)).difference(L))
miss = missing_elements(L)
Here you are:
L =[1,2,3,4,6,7,10]
M = [i for i in range(1, max(L)) if i not in L]
# If 0 should be included, replace range(1, max(L)) with range(max(L))
With a comprehension it would look like this:
L = [1,2,3,4,6,7,10]
M = [i for i in range(min(L), max(L)+1) if i not in L]
M
#[5,8,9]
And a fun one, just to add to the bunch:
[i for a, b in zip(L, L[1:]) for i in range(a + 1, b) if b - a > 1]
L =[1,2,3,4,6,7,10]
R = range(1, max(L) + 1)
# list(R) is [1,2,3,4,5,6,7,8,9,10]
M = list(set(R) - set(L))
# M contains 5, 8 and 9
Note that M will not necessarily be ordered, but can easily be sorted.
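The comprehension answers above test `i not in L` against a list, which scans the list for every candidate number. A small variation, sketched here with the made-up name `missing_numbers`, builds a set once for the lookups and still returns the result in ascending order. It assumes L is non-empty and contains integers.

```python
def missing_numbers(L):
    # Assumption: L is non-empty and contains integers.
    present = set(L)  # built once, so each lookup below is cheap
    # Iterating over the range keeps the result in ascending order.
    return [i for i in range(min(L), max(L) + 1) if i not in present]


L = [1, 2, 3, 4, 6, 7, 10]
missing = missing_numbers(L)
print(missing)  # [5, 8, 9]
```

Using `min(L)` and `max(L)` means the input does not have to be sorted; for a list known to be sorted, `L[0]` and `L[-1]` would do the same job.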

Elegant way to filter two related lists

I have a simple for loop that I would like to simplify. Here's an MWE:
a = [0.6767, -0.0386, 0.6767, 0.4621, 0.6052, 0.3906, 0.6052, 0.3906, 0.6052, 0.4621, 0.6052, 0.4621, 0.5337]
b = [3.6212, 1.5415, 3.4871, 1.8889, 3.3709, 2.078, 3.3012, 2.2236, 3.2265, 2.369, 3.1273, 2.522, 3.0076]
low_lim, high_lim = 0.5, 0.7
c, d = [], []
for indx,i in enumerate(a):
    if low_lim <= i <= high_lim:
        c.append(i)
        d.append(b[indx])
This for loop checks whether an item in a is within a certain range and, if it is, stores that element in c and the corresponding element of b (i.e. the element with the same index) in d.
How can I write the last block of code more elegantly/succinctly?
numpy is your friend here :)
import numpy as np
a = np.array([0.6767, -0.0386, 0.6767, 0.4621, 0.6052, 0.3906, 0.6052, 0.3906, 0.6052, 0.4621, 0.6052, 0.4621, 0.5337])
b = np.array([3.6212, 1.5415, 3.4871, 1.8889, 3.3709, 2.078, 3.3012, 2.2236, 3.2265, 2.369, 3.1273, 2.522, 3.0076])
low_lim, high_lim = 0.5, 0.7
mask = (low_lim <= a) & (a <= high_lim)
c = a[mask]
d = b[mask]
cd = np.array([a[mask], b[mask]])
#now if you want a one dimensional array, flatten it.
cd = cd.flatten()
Use zip to pair and unpair the lists:
c,d = zip(*[(ia,ib) for (ia, ib) in zip(a,b) if low_lim <= ia <= high_lim])
The splat operator * is necessary here. It is possible to splat a generator expression, but I have used a list comprehension here for readability.
Very similar to Marcin's answer, but this one uses indexes. If you need to do this for more than just two arrays, enumerate(a) might be more efficient than using zip(a,b,c,d,..):
c,d = zip(*((a[i],b[i]) for i, x in enumerate(a) if low_lim <= x <= high_lim))
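import itertools
c, d = [], []
# itertools.izip exists only in Python 2; in Python 3 use the built-in zip instead.
for i, j in itertools.izip(a, b):
    if low_lim <= i <= high_lim:
        c.append(i)
        d.append(j)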
Using zip to do the exact same thing:
c, d = [], []
for a_elem, b_elem in zip(a, b):
    if low_lim <= a_elem <= high_lim:
        c.append(a_elem)
        d.append(b_elem)
If it's acceptable to make a list of tuples instead of two lists, then
cd = [(a_elem, b_elem)
      for a_elem, b_elem in zip(a,b)
      if low_lim <= a_elem <= high_lim]
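One detail worth noting about the `c, d = zip(*...)` one-liners above: if no element of a falls inside the range, there is nothing to unpack and the assignment raises a ValueError. A possible wrapper that handles this case is sketched below; the name `filter_pairs` is hypothetical, and returning two empty lists is just one reasonable choice.

```python
def filter_pairs(a, b, low_lim, high_lim):
    kept = [(x, y) for x, y in zip(a, b) if low_lim <= x <= high_lim]
    if not kept:
        # zip(*[]) yields nothing, so `c, d = zip(*kept)` would raise ValueError.
        return [], []
    c, d = zip(*kept)
    return list(c), list(d)


a = [0.6767, -0.0386, 0.6767, 0.4621, 0.6052, 0.3906, 0.6052,
     0.3906, 0.6052, 0.4621, 0.6052, 0.4621, 0.5337]
b = [3.6212, 1.5415, 3.4871, 1.8889, 3.3709, 2.078, 3.3012,
     2.2236, 3.2265, 2.369, 3.1273, 2.522, 3.0076]

c, d = filter_pairs(a, b, 0.5, 0.7)
print(c)  # [0.6767, 0.6767, 0.6052, 0.6052, 0.6052, 0.6052, 0.5337]
print(d)  # [3.6212, 3.4871, 3.3709, 3.3012, 3.2265, 3.1273, 3.0076]

empty_c, empty_d = filter_pairs(a, b, 5.0, 6.0)  # no matches, no exception
```

The function returns lists rather than the tuples that zip produces, which matches the c and d lists built by the original loop.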
