Two number Sum program in python O(N^2) - python

I am used to writing code in C++, but now I am trying to learn Python. I came to know about the Python language and how popular it is, so I thought, let's give it a shot.
Currently I am preparing for company interview questions and am able to solve most of them in C++. Alongside that, I am trying to write the same solutions in Python. For the things I am not familiar with, I do a Google search or watch tutorials.
While rewriting my previously solved easy interview questions in Python, I ran into a problem.
Problem: Given an array of integers, return (here: print) the indices of the two numbers such that they add up to a specific target.
You may assume that each input has exactly one solution, and you may not use the same element twice.
def twoNum(*arr, t):
    cur = 0
    x = 0
    y = 0
    for i in range (len(arr) - 1):
        for j in range (len(arr) - 1):
            if(i == j):
                break
            cur = arr[i] + arr[j]
            if(t == cur):
                x = arr[i]
                y = arr[j]
                break
        if(t == cur):
            break
    print(f"{x} + {y} = {x+y} ")
arr = [3, 5, -4, 8, 11, 1, -1, 6]
target = 10
twoNum(arr, t=target)
So here is the problem: I define x and y in the function, then assign x = arr[i] and y = arr[j], and print those values.
The output I get is: 0 + 0 = 10 (where the target is 10).
I guess this is because I set x = 0 and y = 0 at the start of the function and the values are never updated. In the Outline section of VS Code I also saw x and y listed twice, once at the start of the function and once inside the for loop.
Can anyone explain to me what is going on here?
For reference, here is an image of the code I wrote in C++

Change this:
def twoNum(*arr, t):
to this:
def twoNum(arr, t):
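In a parameter list, * indicates a variable number of positional arguments, which Python collects into a tuple. It is not for pointers as in C++. With *arr, the whole list arrives as the single item of a one-element tuple, so len(arr) - 1 is 0 and the loops never run.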
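To make the difference visible, here is a minimal sketch comparing the two signatures. The function names packed and plain are made up for illustration; only the parameter lists mirror the question.

```python
def packed(*arr, t):
    # *arr collects every positional argument into a tuple,
    # so a single list argument becomes a one-element tuple.
    return arr, len(arr)


def plain(arr, t):
    # Without the star, arr is simply the list that was passed in.
    return arr, len(arr)


nums = [3, 5, -4, 8, 11, 1, -1, 6]

packed_arr, packed_len = packed(nums, t=10)
plain_arr, plain_len = plain(nums, t=10)

print(packed_arr)  # ([3, 5, -4, 8, 11, 1, -1, 6],)
print(packed_len)  # 1
print(plain_len)   # 8
```

Because packed_len is 1, range(len(arr) - 1) is empty in the original function, which is why x and y keep their initial value of 0.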

Basically, what you are trying to do is write C code in Python.
I would instead focus first on how to write Python code in a 'pythonic' way. But for your question, solving it your way using brute force in Python:
def two_num(arr, t):
    # enumerate gives the position, so the inner loop only looks at later elements
    for pos, a in enumerate(arr):
        for b in arr[pos + 1:]:
            if a + b == t:
                print(f"{a} + {b} = {t}")
                return

Here's a way to implement a brute force approach using a generator expression:
arr = [1,3,5,7,9]
target = 6
i,j = next((i,j) for i,n in enumerate(arr[:-1]) for j,m in enumerate(arr[i+1:],i+1) if n+m==target)
output:
print(f"arr[{i}] + arr[{j}] = {arr[i]} + {arr[j]} = {target}")
# arr[0] + arr[2] = 1 + 5 = 6
Perhaps even more pythonic would be to let itertools generate the index pairs:
from itertools import combinations
i,j = next((i,j) for (i,n),(j,m) in combinations(enumerate(arr),2) if n+m==target)
When you get to implementing an O(n) solution, you should look into dictionaries:
d = { target-n:j for j,n in enumerate(arr) }
i,j = next( (i,d[m]) for i,m in enumerate(arr) if m in d and d[m] != i )
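As a rough sketch of the same dictionary idea written as a single pass, the function below remembers each value's index and looks up the complement as it goes. The name two_sum_indices and the choice to return None when no pair exists are assumptions for illustration, not part of the answer above.

```python
def two_sum_indices(arr, target):
    """Return (i, j) with i < j and arr[i] + arr[j] == target, or None."""
    seen = {}  # value -> index of its first occurrence
    for j, n in enumerate(arr):
        complement = target - n
        if complement in seen:
            return seen[complement], j
        # Keep the first index for repeated values.
        seen.setdefault(n, j)
    return None


print(two_sum_indices([3, 5, -4, 8, 11, 1, -1, 6], 10))  # (4, 6)
print(two_sum_indices([1, 3, 5, 7, 9], 6))               # (0, 2)
```

Each element is visited once and dictionary lookups are constant time on average, so the whole search is O(n) instead of O(n^2).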

Related

Changing a list (passed as a function parameter) changes the list with the same name in the previous function call

Recently a friend of mine asked me to explain a strange behavior in a piece of code originally intended to count permutations using recursion. There were many improvements that could be made to the code, which I noted, but these seemed to not have any real impact.
I simplified the code down to the following, which reproduces only the problem, and not the permutations.
def foo(bar, lst):
    if(bar == 1):
        lst.append(0)
        return
    print(lst)
    foo(1, lst)
    print(lst)
foo(2, [])
The output is
[]
[0]
I tried lst += [0] or deleting lst after appending 0, but these did not help. Doing lst2 = lst.copy() followed by lst2.append(0) gave the expected result of two []s, however. I am confused as to why appending 0 (or any value) to lst where bar == 1 would have an effect on the lst where bar == 2. I do not consider myself a total beginner to Python, and I usually can determine the behavior of local variables. This has baffled me though. An explanation would be really appreciated.
In case you want the original code, though I don't think it'll give much more info, here it is:
A = 0
A2 = 0
NN = 3
def P(N, C):
    global A
    Temp = [X for X in range(1, NN + 1) if X not in C]
    if(N == 1):
        C.append(None)
        A += 1
        return
    for e in Temp:
        C.append(e)
        P(N - 1, C)
        del C[-1]
def P2(N, C):
    global A2
    Temp = [X for X in range(1, NN + 1) if X not in C]
    if(N == 1):
        A2 += 1
        return
    for e in Temp:
        C.append(e)
        P2(N - 1, C)
        del C[-1]
P(NN, [])
P2(NN, [])
print(A, A2, sep = " ")
print(A == A2)
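One way to read the behavior described in this question: a Python function receives a reference to the caller's list object, so in-place changes such as append (or lst += [0]) are visible to the caller, while changes to a copy are not. The sketch below illustrates that under standard Python semantics; the helper names are made up for the example.

```python
def append_zero(lst):
    # Mutates the very list object the caller passed in.
    lst.append(0)


def append_zero_to_copy(lst):
    # lst.copy() builds a new list, so the caller's list is untouched.
    local = lst.copy()
    local.append(0)
    return local


shared = []
append_zero(shared)
print(shared)  # [0]

untouched = []
result = append_zero_to_copy(untouched)
print(untouched, result)  # [] [0]
```

This matches the observation in the question that lst.copy() followed by append gave two empty lists, while appending to lst directly did not.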

How to make nested list behave like numpy array?

I'm trying to implement an algorithm in Python that counts the subsets with a given sum:
import numpy as np
maxN = 20
maxSum = 1000
minSum = 1000
base = 1000
dp = np.zeros((maxN, maxSum + minSum))
v = np.zeros((maxN, maxSum + minSum))
# Function to return the required count
def findCnt(arr, i, required_sum, n) :
    # Base case
    if (i == n) :
        if (required_sum == 0) :
            return 1
        else :
            return 0
    # If the state has been solved before
    # return the value of the state
    if (v[i][required_sum + base]) :
        return dp[i][required_sum + base]
    # Setting the state as solved
    v[i][required_sum + base] = 1
    # Recurrence relation
    dp[i][required_sum + base] = findCnt(arr, i + 1, required_sum, n) + findCnt(arr, i + 1, required_sum - arr[i], n)
    return dp[i][required_sum + base]
arr = [ 2, 2, 2, 4 ]
n = len(arr)
k = 4
print(findCnt(arr, 0, k, n))
It gives the expected result, but I was asked not to use numpy, so I replaced the numpy arrays with nested lists like this:
#dp = np.zeros((maxN, maxSum + minSum)) replaced by
dp = [[0]*(maxSum + minSum)]*maxN
#v = np.zeros((maxN, maxSum + minSum)) replaced by
v = [[0]*(maxSum + minSum)]*maxN
But now the program always prints 0. I think this is because of some difference in behavior between numpy arrays and nested lists, but I don't know how to fix it.
EDIT :
thanks to #venky__ who provided this solution in the comments :
[[0 for i in range( maxSum + minSum)] for i in range(maxN)]
It worked, but I still don't understand the difference between this and what I was doing before. I tried:
print( [[0 for i in range( maxSum + minSum)] for i in range(maxN)] == [[0]*(maxSum + minSum)]*maxN )
And the result is True, so how was this able to fix the problem?
It turns out that I was using nested lists the wrong way to represent 2D arrays. [[0]*m]*n does not create n separate rows: it creates one row and stores n references to that same list, so writing to one row changes all of them. The == comparison only compares values, not object identity, which is why it printed True.
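A small sketch of the difference, using tiny dimensions instead of the ones in the question. The variable names are illustrative only.

```python
rows, cols = 3, 4

# One inner list, referenced three times.
aliased = [[0] * cols] * rows
# A fresh inner list is built on every iteration.
independent = [[0] * cols for _ in range(rows)]

print(aliased == independent)      # True: equal values, identity is not compared

aliased[0][0] = 1
independent[0][0] = 1

print(aliased)      # [[1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0]]
print(independent)  # [[1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]

print(aliased[0] is aliased[1])          # True: same object
print(independent[0] is independent[1])  # False: separate objects
```

In the memoization code this means marking v[i][...] as solved marks the same cell for every i, so later states return the default 0 from dp without ever being computed.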

How can this function be vectorized?

I have a NumPy array with the following properties:
shape: (9986080, 2)
dtype: np.float32
I have a method that loops over the rows of the array, performs an operation on each row, and stores the result in a new array:
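def foo(arr):
    n_rows = arr.shape[0]  # arr.size would count every element (2 per row)
    new_arr = np.empty(n_rows, dtype=np.uint64)
    for i in range(n_rows):
        x, y = arr[i]
        # Assumed intent: the original else-branches assigned to unused names w and s
        e = '1' if x < 0 else '2'
        n = '3' if y > 0 else '4'
        new_arr[i] = int(f'{abs(x)}{e}{abs(y)}{n}'.replace('.', ''))
    return new_arr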
I agree with Iguananaut's comment that this data structure seems a bit odd. My biggest problem with it is that it is really tricky to try and vectorize the putting together of integers in a string and then re-converting that to an integer. Still, this will certainly help speed up the function:
def foo(arr):
    x_values = arr[:,0]
    y_values = arr[:,1]
    ones = np.ones(arr.shape[0], dtype=np.uint64)
    # 1 where x is negative, otherwise 2
    e = np.char.array(np.where(x_values < 0, ones, ones * 2))
    # 3 where y is positive, otherwise 4 (same condition as the question's loop)
    n = np.char.array(np.where(y_values > 0, ones * 3, ones * 4))
    x_values = np.char.array(np.absolute(x_values))
    y_values = np.char.array(np.absolute(y_values))
    x_values = np.char.replace(x_values, '.', '')
    y_values = np.char.replace(y_values, '.', '')
    new_arr = np.char.add(np.char.add(x_values, e), np.char.add(y_values, n))
    return new_arr.astype(np.uint64)
Here, the x and y values of the input array are first split up. Then we use a vectorized computation to determine where e and n should be 1 or 2, 3 or 4. The last line uses a standard list comprehension to do the string merging bit, which is still undesirably slow for super large arrays but faster than a regular for loop. Also vectorizing the previous computations should speed the function up hugely.
Edit:
I was mistaken before. Numpy does have a nice way of handling string concatenation using the np.char.add() method. This requires converting x_values and y_values to Numpy character arrays using np.char.array(). Also for some reason, the np.char.add() method only takes two arrays as inputs, so it is necessary to first concatenate x_values and e and y_values and n and then concatenate these results. Still, this vectorizes the computations and should be pretty fast. The code is still a bit clunky because of the rather odd operation you are after, but I think this will help you speed up the function greatly.
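To illustrate just the pairwise concatenation step described here, the following is a minimal sketch on small, hand-written string arrays. It assumes NumPy is installed; the sample values and the helper name join_codes are invented for the example and are not taken from the question's data.

```python
import numpy as np


def join_codes(x_str, e_str, y_str, n_str):
    # np.char.add concatenates two string arrays element-wise,
    # so four arrays need two inner calls and one outer call.
    left = np.char.add(x_str, e_str)
    right = np.char.add(y_str, n_str)
    return np.char.add(left, right)


x_str = np.array(["15", "20"])
e_str = np.array(["1", "2"])
y_str = np.array(["25", "30"])
n_str = np.array(["3", "4"])

joined = join_codes(x_str, e_str, y_str, n_str)
print(joined.tolist())  # ['151253', '202304']

# Digit-only strings can then be cast to an unsigned integer dtype.
as_int = joined.astype(np.uint64)
print(as_int.tolist())  # [151253, 202304]
```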
You may use np.apply_along_axis. When you feed it a function that takes a row (or column) as its argument, it applies that function along the chosen axis. Note that it still calls the Python function once per row, so it is a convenience rather than true vectorization.
For your case, you may rewrite the function as below:
def foo(row):
    x, y = row
    # Assumed intent: the original else-branches assigned to unused names w and s
    e = '1' if x < 0 else '2'
    n = '3' if y > 0 else '4'
    return int(f'{abs(x)}{e}{abs(y)}{n}'.replace('.', ''))
# Where you want to use it (arr is the (N, 2) input array):
new_arr = np.apply_along_axis(foo, 1, arr)

What is the meaning of following line of python code?

Please elaborate on the following lines of code:
def lcs(X , Y):
    # find the length of the strings
    m = len(X)
    n = len(Y)
    l = [[None] * (n + 1) for i in xrange(m + 1)]
I would advise you to adopt ways to figure this out yourself.
Edit 1: The first thing to do is print(l) and see what's up.
This is a Pythonic way of creating a 2D array (a list of lists):
l = [[None]*(n+1) for i in xrange(m+1)]
and it could be written as
l = []
for i in xrange( m + 1 ):
    l.append( [None]*(n+1) )
Now it's clearer, right?
Then you could try print( [None] * 3 ) to see what that does.
And since the comment says "length of the strings", X and Y are strings.
Then pass some strings to the function and see what comes out :)
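Following that advice, here is a small runnable sketch of both forms. It uses range because xrange only exists in Python 2; the function names and the sample strings are made up for the example.

```python
def make_table(m, n):
    # Comprehension form: m + 1 rows, each a new list of n + 1 None values.
    return [[None] * (n + 1) for _ in range(m + 1)]


def make_table_loop(m, n):
    # The same thing spelled out as an explicit loop.
    table = []
    for _ in range(m + 1):
        table.append([None] * (n + 1))
    return table


X, Y = "AB", "XYZ"
table = make_table(len(X), len(Y))

print([None] * 3)                                 # [None, None, None]
print(len(table), len(table[0]))                  # 3 4
print(table == make_table_loop(len(X), len(Y)))   # True
```

The extra row and column (the + 1) leave room for the empty-prefix case that longest-common-subsequence tables usually need.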

How to stop a loop?

def sum_div(x, y):
    for k in range(x,y+1):
        for z in range(x,y+1):
            sx = 0
            sy = 0
            for i in range(1, k+1):
                if k % i == 0:
                    sx += i
            for j in range(1, z+1):
                if z % j == 0:
                    sy += j
            if sx == sy and k!= z:
                print "(", k ,",", z, ")"
x = input("Dati x : ")
y = input("Dati y : ")
sum_div(x, y)
How do I stop the looping when z == y?
The loops print pairs of numbers in the range from x to y, but when they hit the y value, they also print each pair in reverse order, which I don't need.
The break command will break out of the loop. So a line like this:
if (z == y):
    break
should do what you want.
What you think you are asking for is the break command, but what you're actually looking for is the removal of duplicates.
Your program lacks some clarity. For instance:
for i in range(1, k+1):
    if k % i == 0:
        sx += i
for j in range(1, z+1):
    if z % j == 0:
        sy += j
These two loops do essentially the same thing, which can be written more cleanly with a list comprehension (in the REPL):
>>> def get_divisors(r: int) -> list:
...     return [i if r % i == 0 else 0 for i in range(1, r+1)]
...
>>> get_divisors(4)
[1, 2, 0, 4]
>>> sum(get_divisors(4))
7
Your line:
while y:
... will loop forever if you find a match. You should just remove it. while y means "while y is true", and any non-zero value there evaluates as true.
This reduces your program to the following:
def get_divisors(r: int) -> list:
    return [i if r % i == 0 else 0 for i in range(1, r+1)]
def sum_div(x, y):
    for k in range(x,y+1):
        sum_of_x_divisors = sum(get_divisors(k)) # Note this is moved here to avoid repeating work.
        for z in range(x,y+1):
            sum_of_y_divisors = sum(get_divisors(z))
            if sum_of_x_divisors == sum_of_y_divisors and k!= z:
                print("({},{})".format(k, z))
Testing this in the REPL it seems correct based on the logic of the code:
>>> sum_div(9,15)
(14,15)
(15,14)
>>> sum_div(21, 35)
(21,31)
(31,21)
(33,35)
(35,33)
But it's possible that for sum_div(9,15) you want only one of (14,15) and (15,14). That has nothing to do with breaking your loop: the check simply succeeds twice, once for each order, whenever k and z differ. The second test case demonstrates this. (33,35) is also reported in both orders, but if you broke out of the for loop at (21,31) you would never reach that second pair.
One way we can account for this is by reordering when work is done:
def sum_div(x, y):
    result_set = set() # Sets cannot have duplicate values
    for k in range(x,y+1):
        sum_of_x_divisors = sum(get_divisors(k))
        for z in range(x,y+1):
            sum_of_y_divisors = sum(get_divisors(z))
            if sum_of_x_divisors == sum_of_y_divisors and k!= z:
                result_set.add(tuple(sorted((k,z)))) # compile the result set by sorting it and casting to a tuple, so duplicates are implicitly removed.
    for k, z in result_set: # Print result set after it's been compiled
        print("({},{})".format(k, z))
And we see a correct result:
>>> sum_div(9,15)
(14,15)
>>> sum_div(21,35)
(21,31)
(33,35)
Or, the test case you provided in comments. Note the lack of duplicates:
>>> sum_div(10,25)
(16,25)
(14,15)
(15,23)
(10,17)
(14,23)
Some takeaways:
Break out functions that are doing the same thing so you can reason more easily about it.
Name your variables in a human-readable way so that we, the readers of your code (which includes you), understand what is going on.
Don't use loops unless you're actually looping over something. for, while, etc. only need to be used if you're planning on going over a list of things.
When asking questions, be sure to always include test input, expected output and what you're actually getting back.
A clear way to print strings is the .format() function (or an f-string in Python 3.6+), which makes it really obvious what you're printing.
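As an alternative sketch to the set-based deduplication above (not the answer's own method), the inner loop can start at k + 1 so that each unordered pair is visited only once, and the divisor sums can be computed once per number. The function names below are invented for the example.

```python
def divisor_sum(r):
    # Sum of all positive divisors of r, including 1 and r itself.
    return sum(i for i in range(1, r + 1) if r % i == 0)


def equal_divisor_sum_pairs(x, y):
    # Compute each divisor sum once instead of once per pair.
    sums = {k: divisor_sum(k) for k in range(x, y + 1)}
    pairs = []
    for k in range(x, y + 1):
        # z always exceeds k, so (k, z) and (z, k) cannot both appear.
        for z in range(k + 1, y + 1):
            if sums[k] == sums[z]:
                pairs.append((k, z))
    return pairs


print(equal_divisor_sum_pairs(9, 15))   # [(14, 15)]
print(equal_divisor_sum_pairs(21, 35))  # [(21, 31), (33, 35)]
```

Returning a list rather than printing inside the loop also keeps the output order deterministic, which the set version does not guarantee.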
