Does anyone know of any good resources to learn big o notation? In particular learning how to walk through some code and being able to see that it would be O(N^2) or O(logN)? Preferably something that can tell me why a code like this is equal to O(N log N)
def complex(numbers):
    N = len(numbers)
    result = 0
    for i in range(N):
        j = 1
        while j < N:
            result += numbers[i]*numbers[j]
            j = j*2
    return result
Thanks!
To start, let me define what O(N log N) means. It means that the program performs at most about N log N operations, up to a constant factor, i.e. its running time has an upper bound proportional to N log N (where N is the size of the input).
Now here, your N is the size of numbers, from this line of your code:
N = len(numbers)
Notice that the first for loop runs from 0 to N-1, for a total of N operations. This is where the first N comes from.
Then, where does the log N come from? It is from the while loop.
In the while loop, you keep multiplying j by 2 until j is greater than or equal to N.
This will be completed when we have executed the loop ~log2(N) times, which describes how many times we have to multiply j by 2 to get to N. For example, log2(8) = 3, because we multiply j by 2 three times to get 8:
# of mult.   j   j <- old j * 2
1            2   2 <- 1 * 2
2            4   4 <- 2 * 2
3            8   8 <- 4 * 2
To better illustrate this, I have added a print statement in your code, for i and j:
def complex(numbers):
    N = len(numbers)
    result = 0
    for i in range(N):
        j = 1
        while j < N:
            print(str(i) + " " + str(j))
            result += numbers[i]*numbers[j]
            j = j*2
    return result
When this is run:
>>> complex([2,3,5,1,5,3,7,3])
This is what is outputted:
0 1
0 2
0 4
1 1
1 2
1 4
2 1
2 2
2 4
3 1
3 2
3 4
4 1
4 2
4 4
5 1
5 2
5 4
6 1
6 2
6 4
7 1
7 2
7 4
Notice how i goes from 0 to 7 (N iterations, for a total of O(N)), and for every i there are always 3 (log2(N)) values of j printed.
So, the code is O(N log2 N).
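As a quick empirical check of the reasoning above, here is a minimal sketch that counts how often the body of the while loop runs and compares it with N * ceil(log2 N). The function name count_inner_iterations is made up for this illustration and is not part of the original code.

```python
import math

def count_inner_iterations(n):
    """Count how many times the body of the while loop runs for input size n."""
    count = 0
    for i in range(n):
        j = 1
        while j < n:
            count += 1  # stands in for the work done inside the loop
            j *= 2
    return count

# Compare the measured count with N * ceil(log2 N) for a few sizes.
for n in (8, 10, 16, 1024):
    print(n, count_inner_iterations(n), n * math.ceil(math.log2(n)))
```

The two numbers printed on each line agree for these sizes, which is what the O(N log N) bound predicts for this loop structure.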
Also, some good websites I would recommend are:
https://rob-bell.net/2009/06/a-beginners-guide-to-big-o-notation/
And, a video from a lecture series from a Stanford professor:
https://www.youtube.com/watch?v=eNsKNfFUqFo
When you multiply j by 2, you're effectively saying "I've done half the remaining problem!". At each step in the while loop, you're solving half the remaining problem. Therefore, if your problem has size x, the number of iterations required is i = log_2 x, which we just write as log x. In this case your x is just equal to N.
The for loop has you do the above section N times again, so you get N * log N.
We use O(N log N) to mean that, at each step, we might do any CONSTANT number of things (for example inside the while loop I might do a billion operations), but we don't care about this constant, because generally N is usually bigger, and can get arbitrarily big (beyond a certain size point, even a billion is nothing in comparison to what N COULD be, i.e. a googol). Hence we have O(N log N).
Here's a short crash course in the form of a pdf:
http://www1.icsi.berkeley.edu/~barath/cs61b-summer2002/lectures/lecture10.pdf
Here's a short crash course in the form of a lecture:
https://www.youtube.com/watch?v=VIS4YDpuP98
I am solving a problem where I am given three integers (a,b,c), all three can be very large and (a>b>c)
I want to identify which base between b and c produces the smallest sum of digits when we convert 'a' to that base.
For example a = 216, b=2, c=7 -> the output= 6, because: 216 base 2 = 11011000, and the sum of digits = 4, if we do the same for all bases between 2 and 7, we find that 216 base 6 produces the smallest sum of digits, because 216 base 6 = 1000, which has sum 1.
My question is, is there any function out there that can convert a number to any base in constant time faster than the below algorithm? Or any suggestions on how to optimise my algorithm?
from collections import defaultdict
n = int(input())
for _ in range(n):
    (N,X) = map(int,input().split())
    array = list(map(int,input().split()))
    my_dict = defaultdict(int)
    #original count of elements in array
    for i in range(len(array)):
        my_dict[array[i]] +=1
    #ensure array contains distinct elements
    array = set(array)
    count = max(my_dict.values()) #count= max of single value
    temp = count
    res = None
    XOR_count = float("inf")
    if X==0:
        print(count,0)
        break
    for j in array:
        if j^X in my_dict:
            curr = my_dict[j^X] + my_dict[j]
            if curr>=count:
                count = curr
                XOR_count = min(my_dict[j],XOR_count)
    if count ==temp:
        XOR_count = 0
    print(f"{count} {XOR_count}")
Here are some sample input and outputs:
Sample Input
3
3 2
1 2 3
5 100
1 2 3 4 5
4 1
2 2 6 6
Sample Output
2 1
1 0
2 0
Which for the problem I am solving runs into time limit exceeded error.
I found this link to be quite useful (https://www.purplemath.com/modules/logrules5.htm) in terms of converting log bases, which I can kind of see how it relates, but I couldn't use it to get a solution for my above problem.
You could separate the problem in smaller concerns by writing a function that returns the sum of digits in a given base and another one that returns a number expressed in a given base (base 2 to 36 in my example below):
def digitSum(N,b=10):
    return N if N<b else N%b+digitSum(N//b,b)
digits = "0123456789abcdefghijklmnopqrstuvwxyz"
def asBase(N,b):
    return "" if N==0 else asBase(N//b,b)+digits[N%b]
def lowestBase(N,a,b):
    return asBase(N, min(range(a,b+1),key=lambda c:digitSum(N,c)) )
output:
print(lowestBase(216,2,7))
1000 # base 6
print(lowestBase(216,2,5))
11011000 # base 2
Note that both digitSum and asBase could be written iteratively instead of recursively if you're manipulating numbers that are greater than base^1000 and don't want to deal with recursion depth limits.
Here's a procedural version of digitSum (to avoid recursion limits):
def digitSum(N,b=10):
    result = 0
    while N:
        result += N%b
        N //=b
    return result
and returning only the base (not the encoded number):
def lowestBase(N,a,b):
    return min(range(a,b+1),key=lambda c:digitSum(N,c))
# in which case you don't need the asBase() function at all.
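If you do still want the encoded number without recursion, asBase can be made iterative in the same way as digitSum. The following is only a sketch: the name as_base_iterative and the DIGITS constant are illustrative, and unlike the recursive asBase above (which returns an empty string for 0) this version returns "0" for 0.

```python
DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"

def as_base_iterative(n, b):
    """Return the non-negative integer n written in base b (2 to 36) as a string."""
    if not 2 <= b <= len(DIGITS):
        raise ValueError("base must be between 2 and 36")
    if n == 0:
        return "0"  # assumption: represent zero as "0" rather than ""
    out = []
    while n:
        n, remainder = divmod(n, b)  # peel off the least significant digit
        out.append(DIGITS[remainder])
    return "".join(reversed(out))    # digits were collected in reverse order

print(as_base_iterative(216, 6))
print(as_base_iterative(216, 2))
```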
With those changes results for a range of bases from 2 to 1000 are returned in less than 60 milliseconds:
lowestBase(10**250+1,2,1000) --> 10 in 57 ms
lowestBase(10**1000-1,2,1000) --> 3 in 47 ms
I don't know how large is "very large" but it is still sub-second for millions of bases (yet for a relatively smaller number):
lowestBase(10**10-1,2,1000000) --> 99999 in 0.47 second
lowestBase(10**25-7,2,1000000) --> 2 in 0.85 second
[EDIT] optimization
By providing a maximum sum to the digitSum() function, you can make it stop counting as soon as it goes beyond that maximum. This will allow the lowestBase() function to obtain potential improvements more efficiently based on its current best (minimal sum so far). Going through the bases backwards also gives a better chance of hitting small digit sums faster (thus leveraging the maxSum parameter of digitSum()):
def digitSum(N,b=10,maxSum=None):
    result = 0
    while N:
        result += N%b
        if maxSum and result>=maxSum:break
        N //= b
    return result
def lowestBase(N,a,b):
    minBase = a
    minSum = digitSum(N,a)
    for base in range(b,a,-1):
        if N%base >= minSum: continue # last digit already too large
        baseSum = digitSum(N,base,minSum)
        if baseSum < minSum:
            minBase,minSum = base,baseSum
            if minSum == 1: break
    return minBase
This should yield a significant performance improvement in most cases.
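Because the early exit makes the code a little harder to reason about, one way to gain confidence is to compare it with a plain brute-force minimum on small inputs. The sketch below restates the two functions above under snake_case names (digit_sum, lowest_base) so that it is self-contained; brute_force_min_sum is a helper invented for this check. When several bases tie for the smallest digit sum, the optimized search may return a different base than min() would, so the check compares digit sums rather than bases.

```python
def digit_sum(n, b, max_sum=None):
    """Sum of the digits of n in base b, stopping early once max_sum is reached."""
    result = 0
    while n:
        result += n % b
        if max_sum and result >= max_sum:
            break
        n //= b
    return result

def lowest_base(n, a, b):
    """Base in [a, b] giving a minimal digit sum, searched from b downwards."""
    min_base, min_sum = a, digit_sum(n, a)
    for base in range(b, a, -1):
        if n % base >= min_sum:
            continue  # last digit alone is already too large
        base_sum = digit_sum(n, base, min_sum)
        if base_sum < min_sum:
            min_base, min_sum = base, base_sum
            if min_sum == 1:
                break
    return min_base

def brute_force_min_sum(n, a, b):
    """Smallest digit sum over all bases in [a, b], with no shortcuts."""
    return min(digit_sum(n, base) for base in range(a, b + 1))

# The base found by the optimized search must reach the brute-force minimum.
for n in range(1, 500):
    assert digit_sum(n, lowest_base(n, 2, 40)) == brute_force_min_sum(n, 2, 40)
print(lowest_base(216, 2, 7))
```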
fib1 = 1
fib2 = 2
i = 0
sum = 0
while i < 3999998:
    fibn = fib1 + fib2
    fib1 = fib2
    fib2 = fibn
    i += 1
    if fibn % 2 == 0:
        sum = sum + fibn
print(sum + 2)
The challenge is to add the even Fibonacci numbers under 4000000. It works for small limits, say 10 numbers, but goes on forever when the limit is set to 4000000.
Code is in Python
Yes, there are inefficiencies in your code, but the biggest one is that you're mistaken about what you're computing.
At each iteration i increases by one, and you are checking at each step whether i < 3999998. You are effectively finding the first 4 million fibonacci numbers.
You should change your loop condition to while fib2 < 3999998.
A couple of other minor optimisations: leverage Python's swapping syntax x, y = y, x and its sum function. Computing the sum once over a list is slightly faster than summing the values up successively in a loop.
a, b = 1, 2
fib = []
while b < 3999998:
    a, b = b, a + b
    if b % 2 == 0:
        fib.append(b)
sum(fib) + 2
This runs in 100000 loops, best of 3: 7.51 µs per loop, a whopping 3 microseconds faster than your current code (once you fix it, that is).
You are computing the first 4 million fibonacci numbers. It's going to take a while. It took me almost 5 minutes to compute the result, which was about 817 KB of digits, after I replaced fibn % 2 == 0 with fibn & 1 == 0 - an optimization that makes a big difference on such large numbers.
In other words, your code will eventually finish - it will just take a long time.
Update: your version finished after 42 minutes.
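To make the point of both answers concrete (bound the loop by the value of the Fibonacci number, not by how many have been generated), here is a minimal sketch. The function name even_fib_sum and the limit parameter are chosen for this illustration; it tests each value against the limit before adding it, so nothing at or above the limit is ever included.

```python
def even_fib_sum(limit):
    """Sum the even Fibonacci numbers (1, 2, 3, 5, 8, ...) strictly below limit."""
    total = 0
    a, b = 1, 2
    while b < limit:          # stop based on the value, not an iteration count
        if b % 2 == 0:
            total += b
        a, b = b, a + b
    return total

print(even_fib_sum(4_000_000))  # prints 4613732
```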
I'm new to python (and programming in general), I was asked in my class to calculate Catalan numbers up to a billion but the program I wrote for it is not working as intended.
from numpy import division
C=1
n=0
while C<=10000000000:
    print (C)
    C=(4*n+2)/(n+2)*C
    n=n+1
This is what it prints
1,
1,
2,
4,
8,
24,
72,
216,
648,
1944,
5832,
17496,
52488,
157464,
472392,
1417176,
4251528,
12754584,
38263752,
114791256,
344373768,
1033121304,
3099363912,
9298091736,
As you can see from my fourth iteration onwards, I get the wrong number and I don't understand why.
EDIT:
The mathematical definition I used is not wrong! I know the Wiki has another definition but this one is not wrong.
Co=1, Cn+1 = (4*n+2)/(n+2)*Cn
C=(4*n+2)/(n+2)*C
This applies the calculation in the wrong order. Because you are using integer arithmetic, (4*n+2)/(n+2) loses information if you have not already factored in C. The correct calculation is:
C=C*(4*n+2)/(n+2)
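For reference, here is a sketch of the same fix written for Python 3, where / always produces a float and integer division has to be spelled //. The function name catalan_numbers and its limit parameter are made up for this example. Multiplying by C first keeps the intermediate value exactly divisible by n+2.

```python
def catalan_numbers(limit):
    """Return all Catalan numbers that do not exceed limit, in order."""
    result = []
    c, n = 1, 0
    while c <= limit:
        result.append(c)
        # Multiply first, then divide: c * (4n + 2) is always divisible by n + 2.
        c = c * (4 * n + 2) // (n + 2)
        n += 1
    return result

print(catalan_numbers(100))  # prints [1, 1, 2, 5, 14, 42]
```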
Try this:
from scipy.special import factorial
C = 1
n = 0
while C <= 10000000000:
    print(C)
    C = factorial(2*n, exact=True)/(factorial((n+1), exact=True)*factorial(n, exact=True))
    n = n + 1
It works for me :)
This is solved using recursion:
def catalan(n):
    if n <=1 :
        return 1
    res = 0
    for i in range(n):
        res += catalan(i) * catalan(n-i-1)
    return res
for i in range(10000000000):
    print (catalan(i))
you can read more about Catalan numbers here or here
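Note that the plain recursion above recomputes the same values many times, so it becomes very slow after the first couple of dozen terms. A possible variation, sketched here with functools.lru_cache from the standard library, caches each result so that every catalan(k) is computed only once. This is one option among several, not necessarily what the answer's author had in mind.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def catalan(n):
    """nth Catalan number via the same recurrence, with results cached."""
    if n <= 1:
        return 1
    return sum(catalan(i) * catalan(n - i - 1) for i in range(n))

print([catalan(i) for i in range(10)])
# prints [1, 1, 2, 5, 14, 42, 132, 429, 1430, 4862]
```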
Based on this expression for Catalan Numbers:
from math import factorial
C = 1
n = 0
while C <= 10000000000:
    print(C)
    C = (factorial(2 * n)) / (factorial(n + 1) * factorial(n))
    n = n + 1
Returns:
1
1.0
1.0
2.0
5.0
14.0
42.0
132.0
429.0
1430.0
4862.0
16796.0
58786.0
208012.0
742900.0
2674440.0
9694845.0
35357670.0
129644790.0
477638700.0
1767263190.0
6564120420.0
The problem
Your mathematical definition of Catalan numbers is incorrect when translated into code.
This is because of operator precedence in programming languages such as Python.
Multiplication and division both have the same precedence, so they are computed left to right. What happens is that the division operation (4*n+2)/(n+2) happens before the multiplication with C. When n is 2, (4*n+2)/(n+2) becomes 10/4, which is 2 in integer arithmetic. Multiplying that by C, which is 2 at this stage, gives 4 instead of the expected 5.
Once a number in the series is incorrect, every later number computed iteratively from it is incorrect as well.
A possible work around
Here's the definition of Catalan numbers: C_n = (2n)! / ((n+1)! * n!)
Which means the nth Catalan number is given by the binomial coefficient "2n choose n" divided by n+1:
import operator as op
def ncr(n, r):
    r = min(r, n-r)
    if r == 0: return 1
    numer = reduce(op.mul, xrange(n, n-r, -1))
    denom = reduce(op.mul, xrange(1, r+1))
    return numer//denom
def catalan(n):
    return ncr(2*n, n)/(n+1)
This is not very efficient, but it is at least correct.
The right fix
To compute the series, you can use the recursive formula directly, multiplying before dividing:
N=1000000 # limit
C=1
for i in xrange(0, N+1):
    print i,C
    C = (2*(2*i +1)*C)/(i+2)
For the first few, it looks like this:
0 1
1 1
2 2
3 5
4 14
5 42
6 132
7 429
8 1430
9 4862
10 16796
I'm still pretty new to Python and I'm trying to get all of the prime factors of 600851475143 into a list. However, I keep getting a random assortment of numbers in the list instead of the prime numbers. I'm not really sure where I am going wrong. Thank you for your time.
import math
factors_list = []
prime_factors = []
def number_factors(s):
    s = int(math.sqrt(s))
    for num in range(2, s):
        for i in range(2, num):
            if (num % i) == 0:
                factors_list.append(num)
            else:
                prime_factors.append(num)
number_factors(600851475143)
print factors_list
print prime_factors
Currently you append to prime_factors every time (num % i) != 0, i.e. whenever a single i fails to divide num. So, for example, if num=12 (not prime) and i=5, you'll do the append to prime_factors.
Instead, you should only append if num has no divisors at all, not merely when a single number fails to divide it evenly.
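A minimal sketch of that idea, written as a separate helper: decide that a number is prime only after every candidate divisor has been tried. The name is_prime is chosen for this illustration and does not appear in the question's code. It is deliberately simple and slow, in the spirit of getting the logic right first.

```python
def is_prime(num):
    """Return True only if no i in 2..num-1 divides num evenly."""
    if num < 2:
        return False
    for i in range(2, num):
        if num % i == 0:
            return False   # one divisor is enough to rule num out
    return True            # reached only after every candidate was tried

print([n for n in range(2, 20) if is_prime(n)])
# prints [2, 3, 5, 7, 11, 13, 17, 19]
```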
I'll warn you ahead of time though, this problem is not only about calculating prime numbers, but that 600851475143 is a very large number. So you should probably get your current code working as a learning exercise, but you'll need to rethink your approach to the full solution.
Here's a better algorithm for factoring n. I'll describe it in words, so you can work out the coding yourself.
1) Set f = 2. Variable f represents the current trial factor.
2) If f * f > n, then n must be prime, so output n and stop.
3) Divide n by f. If the remainder is 0, then f is a factor of n,
so output f and set n = n / f, then return to Step 2.
4) Since the remainder in the prior step was not 0, set f = f + 1
and return to Step 2.
For instance, to factor 13195, first set f = 2; the test in Step 2 is not satisfied, the remainder in Step 3 is 1, so in Step 4 set f = 3 and return to Step 2. Now the test in Step 2 is not satisfied, the remainder in Step 3 is 1, so in Step 4 set f = 4 and return to Step 2. Now the test in Step 2 is not satisfied, the remainder in Step 3 is 3, so in Step 4 set f = 5 and return to Step 2.
Now the test in Step 2 is not satisfied, but the remainder in Step 3 is 0, so 5 is a factor of 13195; output 5, set n = 2639, and return to Step 2. Now the test in Step 2 is not satisfied, the remainder in Step 3 is 4, so in Step 4 set f = 6 and return to Step 2. Now the test in Step 2 is not satisfied, the remainder in Step 3 is 5, so in Step 4 set f = 7 and return to Step 2.
Now the test in Step 2 is not satisfied, but the remainder in Step 3 is 0, so 7 is a factor of 2639 (and also of 13195); output 7, set n = 377, and return to Step 2. Now the test in Step 2 is not satisfied, the remainder in Step 3 is 6, so in Step 4 set f = 8 and return to Step 2. Continue in this way until f = 13.
Now the test in Step 2 is not satisfied, but the remainder in Step 3 is 0, so 13 is a factor of 377 (and also of 2639 and 13195); output 13, set n = 29, and return to Step 2. Here the test in Step 2 is satisfied, since 13 * 13 = 169 which is greater than 29, so 29 is prime, output it and halt. The final factorization is 5 * 7 * 13 * 29 = 13195.
The factorization of 600851475143 works in exactly the same way, except that it takes longer. There are better ways to factor integers. But this algorithm is simple, and is sufficient for PE3.
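If you want to check your own implementation against something, the four steps above can be sketched roughly as follows. The function name prime_factors is an arbitrary choice, and instead of printing each factor as the steps describe, this version collects them in a list.

```python
def prime_factors(n):
    """Factor n (n >= 2) by trial division, following the four steps above."""
    factors = []
    f = 2                      # Step 1: current trial factor
    while f * f <= n:          # Step 2: stop once f * f > n
        if n % f == 0:         # Step 3: f divides n
            factors.append(f)
            n //= f
        else:                  # Step 4: try the next candidate
            f += 1
    if n > 1:
        factors.append(n)      # whatever remains is prime
    return factors

print(prime_factors(13195))         # prints [5, 7, 13, 29]
print(prime_factors(600851475143))  # prints [71, 839, 1471, 6857]
```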
This will run quite slowly for large numbers. Consider the case where num = 1000000: your nested for loop will perform about 1 million operations before the next number is even considered!
Consider using the Sieve of Eratosthenes to get all of the prime numbers up to a certain integer. It is not as efficient as certain other sieves, but it is easy to implement. Spend some time reading the theory behind the sieve before implementing it; this will help your understanding of later problems.
http://en.wikipedia.org/wiki/Sieve_of_Eratosthenes
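For orientation, a basic version of the sieve might look like the sketch below. The function name sieve and the list-of-booleans representation are just one common way to write it; treat it as a starting point rather than a tuned implementation.

```python
def sieve(limit):
    """Return all primes <= limit using the Sieve of Eratosthenes."""
    if limit < 2:
        return []
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    p = 2
    while p * p <= limit:
        if is_prime[p]:
            # Cross out multiples of p, starting at p*p (smaller ones are done).
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
        p += 1
    return [i for i, flag in enumerate(is_prime) if flag]

print(sieve(30))  # prints [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```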
A Python HOMEWORK assignment asks me to write a function "that takes as input a positive whole number, and prints out a multiplication table showing all the whole number multiplications up to and including the input number." (Also using the while loop.)
# This is an example of the output of the function
print_multiplication_table(3)
>>> 1 * 1 = 1
>>> 1 * 2 = 2
>>> 1 * 3 = 3
>>> 2 * 1 = 2
>>> 2 * 2 = 4
>>> 2 * 3 = 6
>>> 3 * 1 = 3
>>> 3 * 2 = 6
>>> 3 * 3 = 9
I know how to start, but don’t know what to do next. I just need some help with the algorithm. Please DO NOT WRITE THE CORRECT CODE, because I want to learn. Instead tell me the logic and reasoning.
Here is my reasoning:
The function should multiply all real numbers to the given value(n) times 1 less than n or (n-1)
The function should multiply all real numbers to n(including n) times two less than n or (n-2)
The function should multiply all real numbers to n(including n) times three less than n or (n-3) and so on... until we reach n
When the function reaches n, the function should also multiply all real numbers to n(including n) times n
The function should then stop or in the while loop "break"
Then the function has to print the results
So this is what I have so far:
def print_multiplication_table(n): # n for a number
    if n >=0:
        while something:
            # The rest of the code that I need help on
    else:
        return "The input is not a positive whole number. Try another input!"
Edit: Here's what I have after all the wonderful answers from everyone
"""
i * j = answer
i is counting from 1 to n
for each i, j is counting from 1 to n
"""
def print_multiplication_table(n): # n for a number
    if n >=0:
        i = 0
        j = 0
        while i <n:
            i = i + 1
            while j <i:
                j = j + 1
                answer = i * j
                print i, " * ",j,"=",answer
    else:
        return "The input is not a positive whole number.Try another input!"
It's still not completely done!
For example:
print_multiplication_table(2)
# The output
>>>1 * 1 = 1
>>>2 * 2 = 4
And NOT
>>> 1 * 1 = 1
>>> 1 * 2 = 2
>>> 2 * 1 = 2
>>> 2 * 2 = 4
What am I doing wrong?
I'm a little mad about the while loop requirement, because for loops are better suited for this in Python. But learning is learning!
Let's think. Why do a While True? That will never terminate without a break statement, which I think is kind of lame. How about another condition?
What about variables? I think you might need two. One for each number you want to multiply. And make sure you add to them in the while loop.
I'm happy to add to this answer if you need more help.
Your logic is pretty good. But here's a summary of mine:
stop the loop when the product of the 2 numbers is n * n.
In the mean time, print each number and their product. If the first number isn't n, increment it. Once that's n, start incrementing the second one. (This could be done with if statements, but nested loops would be better.) If they're both n, the while block will break because the condition will be met.
As per your comment, here's a little piece of hint-y pseudocode:
while something:
    while something else:
        do something fun
        j += 1
    i += 1
Where should the original assignment of i and j go? What is something, something else, and something fun?
This problem is better implemented using nested loops since you have two counters. First figure out the limits (start, end values) for the two counters. Initialize your counters to lower limits at the beginning of the function, and test the upper limits in the while loops.
The first step towards being able to produce a certain output is to recognize the pattern in that output.
1 * 1 = 1
1 * 2 = 2
1 * 3 = 3
2 * 1 = 2
2 * 2 = 4
2 * 3 = 6
3 * 1 = 3
3 * 2 = 6
3 * 3 = 9
The number on the right of = should be trivial to determine, since we can calculate it by multiplying the other two numbers on each row; obtaining those is the core of the assignment. Think of the two operands of * as two counters, let's call them i and j. We can see that i is counting from 1 to 3, but for each i, j is counting from 1 to 3 (resulting in a total of 9 rows; more generally there will be n^2 rows). Therefore, you might try using nested loops, one to loop i (from 1 to n) and another to loop j (from 1 to n) for each i. On each iteration of the nested loop, you can print the string containing i, j and i*j in the desired format.