Floyd's Algorithm in Python

I am not sure how to implement Floyd's algorithm in the following program. It must print a 5x5 array that represents this graph on page 466 and include a counter which is used to print the total number of comparisons when the algorithm is executed - each execution of the "if" structure counts as one comparison.
Does anyone know how to even start this program? I am not sure how to begin.

The following is purely a transcription of the pseudocode you linked. I changed almost nothing.
for k in range(n):
    for i in range(n):
        for j in range(n):
            if A[i][k] + A[k][j] < A[i][j]:
                A[i][j] = A[i][k] + A[k][j]
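
The question also asks for a comparison counter and a printed 5x5 result. Since I don't have the book's page 466 graph, here is a minimal sketch with a placeholder weight matrix (the values are mine, not the book's); each execution of the "if" structure counts as one comparison:

INF = float('inf')
# Placeholder 5x5 weight matrix -- substitute the page 466 graph here.
A = [[0,   2,   INF, INF, 1],
     [2,   0,   3,   INF, INF],
     [INF, 3,   0,   4,   INF],
     [INF, INF, 4,   0,   5],
     [1,   INF, INF, 5,   0]]
n = len(A)

comparisons = 0
for k in range(n):
    for i in range(n):
        for j in range(n):
            comparisons += 1                  # one execution of the "if"
            if A[i][k] + A[k][j] < A[i][j]:
                A[i][j] = A[i][k] + A[k][j]

for row in A:
    print(row)
print("comparisons:", comparisons)            # always n**3 = 125 here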

Translated from the page you linked to,
k = 0
while k <= n-1:
    i = 0
    while i <= n-1:
        j = 0
        while j <= n-1:
            if A[i,k] + A[k,j] < A[i,j]:
                A[i,j] = A[i,k] + A[k,j]
            j += 1
        i += 1
    k += 1
NB This is the exact translation to Python. Better, more Pythonic code is also possible - see, e.g., 5xum's answer, which uses the range function instead of manually incrementing the loop counters. Also, A here would be a 2-D matrix (e.g. a numpy ndarray); see the numpy documentation for more information.
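For instance, the A[i,k] indexing in the loop above works as-is once A is built as an ndarray (a small sketch; the 3-node weights are mine):

import numpy as np

INF = np.inf
A = np.array([[0.0, 4.0, INF],
              [4.0, 0.0, 1.0],
              [INF, 1.0, 0.0]])
n = A.shape[0]
# A[i, k] is numpy's 2-D indexing; plain nested lists would need A[i][k].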

Related

How to find the time complexity of this function?

def f1(n):
    cnt = 0
    for i in range(n):
        for j in range(n):
            k = 1
            while k < i*j:
                k *= 2
                cnt += 1
    return cnt
I am trying to analyze the time complexity of this function f1, and I'm having some trouble dealing with the k < i*j condition in the loop.
My point of view:
I'm trying to find the number of iterations of the inner loop, so what I'm basically trying to find is when 2^k >= i*j, but I'm having trouble working out how to account for i*j at each step and find the overall time complexity. I know that at the end I will have 2^k >= n^2, which gives me k >= log(n^2) = 2*log(n), but I must be missing all the iterations before this and I would be happy to know how to calculate them. Any help is really appreciated, thanks in advance!
EDIT:
With Prune's help I reached this: we're trying to calculate how many times the inner loop iterates, which is log(i*j).
Taking i=2 we get log(2) + log(j), summed over the n values of j, which is n + log(1) + log(2) + ... + log(n) = n + log(n!).
So we have n + log(n!) for each i = 0, 1, ..., n, basically n(n + log(n!)), which is either O(n^2) or O(n*log(n!)). As it's my first time meeting log(n!), I'm not sure which is considered to be the time complexity.
For convenience, let m = i*j.
As you've noted, you execute the while loop log(m) times.
The complexity figure you're missing is the summation:
sum([log(i*j)
     for j in range(n)
     for i in range(n)])
Would it help to attack this with actual numbers? For instance, try one iteration of the outer loop, with i=2. Also, we'll simplify the log expression:
sum([log(2) + log(j) for j in range(n)])
Using base 2 logs for convenience, and separating this, we have
n*1 + sum([log(j) for j in range(n)])
That's your start. Now, you need to find a closed form for sum(log(j)), and then sum that over i = 0..n.
Can you take it from there?
After OP update
The desired closed form for sum(log(j)) is, indeed, log(n!).
This isn't simply "go n times": for each i the inner total is n*log(i) + log(n!), and summing that over the range gives n*log(n!) + n*log(n!) = 2n*log(n!). Since log(n!) is on the order of n log n (Stirling), O(n*log(n!)) = O(n^2 log n), which dominates the n^2 term, and that is the overall time complexity.
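If it helps to sanity-check the closed form, here is a quick empirical comparison (the predicted helper is mine; it counts ceil(log2(i*j)) per pair, which is what the doubling loop executes):

import math

def f1(n):
    cnt = 0
    for i in range(n):
        for j in range(n):
            k = 1
            while k < i*j:
                k *= 2
                cnt += 1
    return cnt

def predicted(n):
    # Each inner loop runs ceil(log2(i*j)) times when i*j > 1, else 0 times.
    return sum(math.ceil(math.log2(i*j))
               for i in range(1, n)
               for j in range(1, n)
               if i*j > 1)

for n in (50, 100, 200):
    print(n, f1(n), predicted(n))   # the two counts agree exactly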

Why does my prime number sieve return the same result slower than the brute force method for finding primes in Python 2.7?

I am fairly new to Python and I have been trying to find a fast way to find primes up to a given number.
When I use the Sieve of Eratosthenes with the following code:
# Finding primes till 40000.
import time

start = time.time()

def prime_eratosthenes(n):
    list = []
    prime_list = []
    for i in range(2, n+1):
        if i not in list:
            prime_list.append(i)
            for j in range(i*i, n+1, i):
                list.append(j)
    return prime_list

lists = prime_eratosthenes(40000)
print lists
end = time.time()
runtime = end - start
print "runtime =", runtime
Along with the list containing the primes, I get a line like the one below as output:
runtime = 20.4290001392
Depending upon the RAM being used etc., I consistently get a value within a range of ±0.5.
However when I try to find the primes till 40000 using a brute force method as in the following code:
import time

start = time.time()

prime_lists = []
for i in range(1, 40000+1):
    for j in range(2, i):
        if i % j == 0:
            break
    else:
        prime_lists.append(i)

print prime_lists
end = time.time()
runtime = end - start
print "runtime =", runtime
This time, along with the list of primes, I get a smaller value for the runtime:
runtime = 16.0729999542
The value again only varies within a range of ±0.5.
Clearly, the sieve is slower than the brute force method.
I also observed that the difference between the two runtimes only increases as the limit 'n' grows.
Can anyone give a logical explanation for the above-mentioned behavior? I expected the sieve to be more efficient than the brute force method, but it seems to work the other way around here.
While appending to a list is not the best way to implement this algorithm (the original algorithm uses fixed-size arrays), appends are amortized constant time. I think a bigger issue is the check if i not in list, which takes time linear in the length of the list. The best change you can make for larger inputs is having the outer for loop only check up to sqrt(n), which saves a lot of computation.
A better approach is to keep a boolean array which keeps track of struck-off numbers, like what is seen in the Wikipedia article for the Sieve. This way, skipping numbers is constant time since it's an array access.
For example:
def sieve(n):
    nums = [0] * n
    for i in range(2, int(n**0.5)+1):
        if nums[i] == 0:
            for j in range(i*i, n, i):
                nums[j] = 1
    return [i for i in range(2, n) if nums[i] == 0]
So to answer your question: your two for loops plus the linear "not in" check make the original algorithm do potentially O(n^2) work, while the boolean array and the sqrt(n) bound on the outer for loop bring the new algorithm down to at most O(n sqrt(n)) time. (The classic analysis of this sieve is even tighter, O(n log log n), so in practice the runtime scales close to O(n).)
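As a side note, even keeping the original structure, the linear "if i not in list" check can be fixed on its own by using a set (a sketch; the composites name is mine):

import time

def prime_eratosthenes_set(n):
    composites = set()          # set membership is O(1) on average
    prime_list = []
    for i in range(2, n+1):
        if i not in composites:
            prime_list.append(i)
            for j in range(i*i, n+1, i):
                composites.add(j)
    return prime_list

start = time.time()
primes = prime_eratosthenes_set(40000)
print("runtime = %.3f" % (time.time() - start))   # works in Python 2 and 3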

My code is very slow. How to optimize it? Python

import numpy as np

def function_1(arr):
    return [j for i in range(len(arr)) for j in range(len(arr))
            if np.array(arr)[i] == np.sort(arr)[::-1][j]]
An array arr is given. It is required, for each position i, to find the index of the element arr[i] in the array arr sorted in descending order. All values of the arr array are distinct.
I have to write the function in one line. It works, but very slowly. I have to do this:
np.random.seed(42)
arr = function_1(np.random.uniform(size=1000000))
print(arr[7] + arr[42] + arr[445677] + arr[53422])
Please help to optimize the code.
You are repeatedly sorting and reversing the array, but the result of that operation is independent of the current value of i or j. The simple thing to do is to pre-compute that, then use its value in the list comprehension.
For that matter, range(len(arr)) can also be computed once.
Finally, arr is already an array; you don't need to make a copy each time through the i loop.
import numpy as np

def function_1(arr):
    arr_sr = np.sort(arr)[::-1]
    r = range(len(arr))
    return [j for i in r for j in r if arr[i] == arr_sr[j]]
Fitting this into a single line becomes trickier. Aside from extremely artificial outside constraints, there is no reason to do so. Once Python 3.8 is released, it will be tempting to inline the precomputation with assignment expressions, but there are two traps: an assignment expression is not allowed in a comprehension's iterable, and a walrus inside the if clause would re-run the sort on every check. A single-element for clause hoists the sorted copy instead, and I think the following is equivalent:
def function_1(arr):
    return [j for arr_sr in [np.sort(arr)[::-1]]
            for i in range(len(arr)) for j in range(len(arr))
            if arr[i] == arr_sr[j]]
Have a think about the steps that are going on in here:
[j
 for i in range(len(arr))
 for j in range(len(arr))
 if np.array(arr)[i] == np.sort(arr)[::-1][j]]
Suppose your array contains N elements.
You pick an i, N different times.
You pick a j, N different times.
Then for each (i, j) pair you are executing the final line.
That is, you're executing the final line N^2 times.
But that final line copies and sorts an array containing N elements: an O(N log N) operation. So the complexity of your code is O(N^3 log N).
Try making a sorted copy of the array before your [... for i ... for j ...] comprehension runs. That'll reduce the time complexity to O(N^2 + N log N), which is O(N^2).
I think...
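For what it's worth, numpy can compute these descending ranks directly, without the O(N^2) comprehension at all. This is a different technique from the answers above, just a sketch; it runs in O(N log N) overall:

import numpy as np

def function_1(arr):
    arr = np.asarray(arr)
    order = np.argsort(-arr)              # indices that sort arr descending
    ranks = np.empty(len(arr), dtype=int)
    ranks[order] = np.arange(len(arr))    # invert the permutation
    return list(ranks)                    # ranks[i] = position of arr[i]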

Fastest way to count duplicate integers in two distinct sections of an array

In this snippet of Python code,
fun iterates through the array arr and, for every pair of sections, counts the number of identical integers at matching positions in the two sections. (It simulates a matrix.) This makes n*(n-1)/2 * m comparisons in total, giving a time complexity of O(n^2) for the fixed m = 100.
Are there programming solutions or ways of reframing this problem that would yield equivalent results but have reduced time complexity?
import ctypes
import multiprocessing as mp

# n > 500000, 0 < i < n, m = 100
# dim(arr) = n*m, 0 < arr[x] < 4294967311
arr = mp.RawArray(ctypes.c_uint, n*m)

def fun(i):
    for j in range(i-1, 0, -1):
        count = 0
        for k in range(0, m):
            count += (arr[i*m+k] == arr[j*m+k])
        if count/m > 0.7:
            return (i, j)
    return ()
arr is a shared-memory array, so it's best kept read-only for simplicity and performance reasons.
arr is implemented as a 1D RawArray from multiprocessing. The reason for this is that, according to my tests, it has by far the fastest performance. Using a numpy 2D array, for example, like this:
arr = np.ctypeslib.as_array(mp.RawArray(ctypes.c_uint, n*m)).reshape(n,m)
would provide vectorization capabilities, but increases the total runtime by an order of magnitude: 250s vs. 30s for n = 1500, an increase of about 733%.
Since you can't change the array characteristics at all, I think you're stuck with O(n^2). numpy would gain some vectorization, but would change the access for others sharing the array. Start with the innermost operation:
for k in range(0, m):
    count += (arr[i][k] == arr[j][k])
Change this to a one-line assignment:
count = sum(arr[i][k] == arr[j][k] for k in range(m))
Now, if this is truly an array, rather than a list of lists, use the array package's vectorization to simplify the loops, one at a time:
count = (arr[i] == arr[j]).sum()        # removes the k loop: vector compare, then sum
counts = (arr == arr[i]).sum(axis=1)    # removes the j loop too: one count per row
You can now return the j indices where counts[j] / m > 0.7. Note that there's no real need to return i for each one: it's constant within the function, and the calling program already has the value. Your array package likely has a pair of vectorized indexing operations that can return those indices. If you're using numpy, those are easy enough to look up on this site.
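A sketch of those steps put together, assuming arr is a 2-D numpy view of shape (n, m) (the fun_vec name is mine; np.nonzero does the index lookup):

import numpy as np

def fun_vec(i, arr, m):
    counts = (arr == arr[i]).sum(axis=1)   # per-row match counts vs. row i
    js = np.nonzero(counts > 0.7 * m)[0]   # rows over the 70% threshold
    return js[(js > 0) & (js < i)]         # original loop only visits 0 < j < i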
So after fiddling around some more, I was able to cut down the running time greatly with help from NumPy's vectorization and Numba's JIT compiler. Going back to the original code:
arr = mp.RawArray(ctypes.c_uint, n*m)

def fun(i):
    for j in range(i-1, 0, -1):
        count = 0
        for k in range(0, m):
            count += (arr[i*m+k] == arr[j*m+k])
        if count/m > 0.7:
            return (i, j)
    return ()
We can leave out the bottom return statement as well as dismiss the idea of using count entirely, leaving us with:
def fun(i):
    for j in range(i-1, 0, -1):
        if sum(arr[i*m+k] == arr[j*m+k] for k in range(m)) > 0.7*m:
            return (i, j)
Then, we change the array arr to a NumPy format:
np_arr = np.frombuffer(arr, dtype='uint32').reshape(n, m)   # one length-m section per row
The important thing to note here is that we do not use a NumPy array as a shared memory array to be written from multiple processes, avoiding the overhead pitfall.
Finally, we apply Numba's decorator and rewrite the sum function in vector form so that it works with the new array:
import numba as nb

@nb.njit(fastmath=True, parallel=True)
def fun(i):
    for j in range(i-1, 0, -1):
        if np.sum(np_arr[i] == np_arr[j]) > 0.7*m:
            return (i, j)
This reduced the running time to 7.9s, which is definitely a victory for me.

Fibonacci and Prime numbers Python

Let F[n] and P[n] be the nth Fibonacci and prime number respectively. There are some values of n for which F[n] % P[n] = 0.
Let the first k indices which satisfy this condition be n_1 < n_2 < ... < n_k.
I want to calculate the sum of these first k indices (i.e. n_1 + ... + n_k). The program below is fine for k = 2 but too slow for k = 5.
Is there any way I can speed this up?
def primelist(n):
    prime = [True]*n
    for p in range(3, n, 2):
        if p**2 > n:
            break
        if prime[p]:
            for i in range(p*p, n, 2*p):
                prime[i] = False
    return [2] + [p for p in range(3, n, 2) if prime[p]]

l = primelist(100000)
l.insert(0, 0)
fib = [0, 1]
for i in range(2, len(l)):
    fib.append(fib[i-1] + fib[i-2])
k = 0
sum_ = 0
i = 1
while i < len(l):
    if fib[i] % l[i] == 0:
        k = k+1
        sum_ = sum_+i
        if k == 5:
            i = len(l)-1
    i = i+1
print sum_
Both of those series are calculation-intensive; in other words, it doesn't surprise me that it takes so much time to compute the values, especially since Python is an interpreted language, which makes it slower at these kinds of calculations. I would suggest you use the numpy library for the calculations you need; it will make them much faster.
First of all, you should update your variables like:
i += 1
not:
i = i+1
Second, tuple operations are faster than list operations, so you could use tuples instead of lists if you are not going to change anything.
Also, in this statement:
for p in range(3, n, 2):
    if p**2 > n:
        break
you probably want to compare p against the square root of n, so you should change that line to:
for p in range(3, int(n**0.5)+1, 2):
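
Neither suggestion touches the dominant cost, though: the Fibonacci numbers here grow to thousands of digits, so fib[i] % l[i] is an expensive big-integer operation. A different approach (not from the answers above, just a sketch) computes F[i] mod P[i] directly with fast doubling, so every intermediate value stays small:

def fib_pair_mod(n, m):
    # Fast doubling: returns (F(n) mod m, F(n+1) mod m).
    if n == 0:
        return (0, 1)
    a, b = fib_pair_mod(n >> 1, m)
    c = (a * ((2*b - a) % m)) % m    # F(2k)   = F(k) * (2*F(k+1) - F(k))
    d = (a*a + b*b) % m              # F(2k+1) = F(k)^2 + F(k+1)^2
    if n & 1:
        return (d, (c + d) % m)
    return (c, d)

k = 0
sum_ = 0
for i, p in enumerate(primelist(100000), start=1):
    if fib_pair_mod(i, p)[0] == 0:
        k += 1
        sum_ += i
        if k == 5:
            break
print(sum_)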
