any tip to improve performance when using nested loops with python - python

I had an exercise where I receive a list of integers and have to count how many pairs sum to a multiple of 60.
Example 1:
input: list01 = [10,90,50,40,30]
result = 2
explanation: 10 + 50, 90 + 30
Example 2:
input: list02 = [60,60,60]
result = 3
explanation: list02[0] + list02[1], list02[0] + list02[2], list02[1] + list02[2]
It seemed pretty easy, so here is my code:
def getPairCount(numbers):
    total = 0
    cont = 0
    for n in numbers:
        cont+=1
        for n2 in numbers[cont:]:
            if (n + n2) % 60 == 0:
                total += 1
    return total
It works, but for a big input with over 100k numbers it takes too long to run, and I need it to finish in under 8 seconds. Any tips on how to solve this, either with a library I'm unaware of or by avoiding the nested loop?

Here's a simple solution that should be extremely fast (it runs in O(n) time). It makes use of the following observation: We only care about each value mod 60. E.g. 23 and 143 are effectively the same.
So rather than making an O(n**2) nested pass over the list, we instead count how many of each value we have, mod 60, so each value we count is in the range 0 - 59.
Once we have the counts, we can consider the pairs that sum to 0 or 60. The pairs that work are:
0 + 0
1 + 59
2 + 58
...
29 + 31
30 + 30
After this, the order is reversed, but we only want to count each pair once.
There are two cases where the values are the same: 0 + 0 and 30 + 30. For each of these, the number of pairs is (count * (count - 1)) // 2. Note that this works when count is 0 or 1, since in both cases we're multiplying by zero.
If the two values are different, then the number of cases is simply the product of their counts.
Here's the code:
def getPairCount(numbers):
    # Count how many of each value we have, mod 60
    count_list = [0] * 60
    for n in numbers:
        n2 = n % 60
        count_list[n2] += 1
    # Now find the total
    total = 0
    c0 = count_list[0]
    c30 = count_list[30]
    total += (c0 * (c0 - 1)) // 2
    total += (c30 * (c30 - 1)) // 2
    for i in range(1, 30):
        j = 60 - i
        total += count_list[i] * count_list[j]
    return total
This runs in O(n) time, due to the initial one-time pass we make over the list of input values. The loop at the end is just iterating from 1 through 29 and isn't nested, so it should run almost instantly.
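One way to gain confidence in the counting approach is to compare it against a brute-force count on small random inputs. The following is only a sketch and not part of the answer above; the names count_pairs_brute and count_pairs_mod60 are made up for illustration, and the random test sizes are arbitrary.

```python
import random
from itertools import combinations


def count_pairs_brute(numbers):
    """O(n**2) reference: test every unordered pair directly."""
    return sum((a + b) % 60 == 0 for a, b in combinations(numbers, 2))


def count_pairs_mod60(numbers):
    """O(n) version: bucket values by residue mod 60, then combine buckets."""
    counts = [0] * 60
    for n in numbers:
        counts[n % 60] += 1
    # Residues 0 and 30 pair with themselves: choose 2 out of the bucket.
    total = counts[0] * (counts[0] - 1) // 2
    total += counts[30] * (counts[30] - 1) // 2
    # Every other residue i pairs with residue 60 - i.
    for i in range(1, 30):
        total += counts[i] * counts[60 - i]
    return total


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the check is repeatable
    for _ in range(200):
        sample = [rng.randrange(1, 1000) for _ in range(rng.randrange(0, 40))]
        assert count_pairs_brute(sample) == count_pairs_mod60(sample)
    print(count_pairs_mod60([10, 90, 50, 40, 30]))  # 2, as in the question
    print(count_pairs_mod60([60, 60, 60]))          # 3, as in the question
```

The brute-force function is only there as a reference; for the 100k-element inputs from the question you would call the mod-60 version alone.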

Below is a translation of Tom Karzes's answer but using numpy. I benchmarked it and it is only faster if the input is already a numpy array, not a list. I still want to write it here because it nicely shows how loops in python can be one-liners in numpy.
import numpy as np

def get_pairs_count(numbers, /):
    # Count how many of each value we have, modulo 60.
    numbers_mod60 = np.mod(np.asarray(numbers, dtype=np.int64), 60)
    # bincount with minlength=60 keeps one slot per residue, even for residues
    # that never occur (np.unique would skip them and shift the indices).
    counts = np.bincount(numbers_mod60, minlength=60)
    # Now find the total.
    total = 0
    c0 = counts[0]
    c30 = counts[30]
    total += (c0 * (c0 - 1)) // 2
    total += (c30 * (c30 - 1)) // 2
    total += np.dot(counts[1:30:+1], counts[59:30:-1])  # Pairs residue i with residue 60 - i.
    return int(total)

Related

count '9's from 1 to n python optimization

def count_nines(n):
    x = list(map(str,range(n + 1)))
    count = 0
    for i in x:
        c = i.count('9')
        count += c
    return count
Execution Timed Out (12000 ms)
How can I optimize this code?
Here's some working code (I tested it on some not too big numbers, and it returns the same results as yours) based on my comment:
def count_nines(n):
    if n==0:
        return 0
    k = len(str(n))-1
    leading_digit, remainder = divmod(n, 10**k) # Thanks @Stef for this optimization
    # Number of nines in numbers from 1 to leading_digit * 10**k - 1
    count1 = leading_digit * k*10**(k-1)
    # If the leading_digit is 9, number of times it appears
    # (in numbers from 9 * 10**k to n)
    count2 = remainder+1 if leading_digit==9 else 0
    # Number of nines in remainder
    count3 = count_nines(remainder)
    # Total number of nines
    return int(count1 + count2 + count3)
Explanations
For starters, the numbers of nines (shortened as c9() hereafter) in 1-10^k is k * 10^(k-1); this is easy to prove by recurrence, but I'll just explain on an example:
assuming c9(1000) = 300, the number of nines in the xxx part of numbers 0xxx, 1xxx ... 9xxx is equal to 10 * 300; add to that the number of 9 in 9xxx is 1000 (from 9000 to 9999), which yields c9(10000) = 10*300 + 1000 = 4000 .
Now imagine you want c9(7935) : you have 7 * 300 nines in numbers 1-7000, then 9*20 nines in numbers 7 000 to 7 900, then 36 leading nines in number 900 to 935, then ...
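The decomposition can be checked against a direct count on small inputs. The sketch below restates the recursion from the answer using integer-only arithmetic (the k > 0 guard is my own variation, so the int() conversion is not needed) and compares it with a brute-force helper; count_nines_brute is a name introduced here for illustration.

```python
def count_nines_brute(n):
    """Count the digit 9 in every number from 0 to n by direct inspection."""
    return sum(str(i).count("9") for i in range(n + 1))


def count_nines(n):
    """Recursive count using the same decomposition as the answer, integers only."""
    if n == 0:
        return 0
    k = len(str(n)) - 1
    leading_digit, remainder = divmod(n, 10**k)
    # Nines in the k trailing positions, over the first leading_digit blocks of 10**k numbers.
    count1 = leading_digit * k * 10**(k - 1) if k > 0 else 0
    # Leading nines, present only when the leading digit itself is 9.
    count2 = remainder + 1 if leading_digit == 9 else 0
    return count1 + count2 + count_nines(remainder)


if __name__ == "__main__":
    assert all(count_nines(n) == count_nines_brute(n) for n in range(1500))
    print(count_nines(9999))  # 4000, matching c9(10000) = 4000 from the explanation
```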
Example
count_nines(9254287593789050756)
Out[12]: 16880680640899572416

How to recursively find the sequence in a list with the highest difference?

Given this grid:
grid = [[10,23,16,25,12],
[19,11,8,1,4],
[3,6,9,7,20],
[18,24,4,17,5],
[7,3,4,6,1]]
The sequence with the greatest difference between the sum of its odd rows and the sum of its even rows is the sequence of row 1 to row 3. This is because (10 + 23 + 16 + 25 + 12) - (19 + 11 + 8 + 1 + 4) + (3 + 6 + 9 + 7 + 20) = 88 which is the maximum difference out of all sequences like this.
The sequence should have an even row and an odd row so it must have at least 2 rows. The maximum number of rows depends on the size of the grid.
The problem is I need it to work on an O(log n) time complexity. My idea is to use recursion to divide the grid into 2 and solve it from there. However, it doesn't work as I wanted to.
This is my whole code:
import math

class Sequence:
    def __init__(self,grids):
        self.grids = grids
        self.calculate_max_difference()

    def calculate_max_difference(self):
        # Get the odd and even rows using list slicing
        odd_rows = self.grids[::2]
        even_rows = self.grids[1::2]
        odd_sum = 0
        even_sum = 0
        for odd_lst in odd_rows:
            odd_sum += sum(odd_lst)
        for even_lst in even_rows:
            even_sum += sum(even_lst)
        self.diff = odd_sum - even_sum

def consecutive_seq(start,end,max,grids):
    middle = math.ceil((start+end)/2)
    sequence = []
    for row in range(end-start):
        sequence.append(grids[start+row])
    seq_ins = Sequence(sequence)
    if (end-start) <= 3 and (end-start) > 1:
        return seq_ins.grids
    upper_seq = consecutive_seq(start,middle,seq_ins.diff,seq_ins.grids)
    lower_seq = consecutive_seq(middle+1,end,seq_ins.diff,seq_ins.grids)
    greater_seq = upper_seq
    if upper_seq.diff < lower_seq.diff:
        greater_seq = lower_seq
    if greater_seq.diff < max:
        return seq_ins.grids

# Sample Input
grid = [[10,23,16,25,12],
        [19,11,8,1,4],
        [3,6,9,7,20],
        [18,24,4,17,5],
        [7,3,4,6,1]]
n = len(grid)
max_seq = consecutive_seq(0,n-1,0,grid)
print(max_seq)
How should I go about this?
Firstly you don't really need a 2d array for this. You can sum up all the rows and only store the sums in a 1D array. So for example
grid = [[10,23,16,25,12],
[19,11,8,1,4],
[3,6,9,7,20],
[18,24,4,17,5],
[7,3,4,6,1]]
turns into
sums = [sum(row) for row in grid] # sums = [86, 43, 45, 68, 21]
Once you have the sums you have to simply invert the signs for odd indices
[86, 43, 45, 68, 21] becomes => [86, -43, 45, -68, 21]
Once you have the data in this format, you can use the algorithm for Finding the largest sum in a contiguous subarray which has a time complexity of O(n). You might have to make a few small tweaks to that to include at least 2 numbers.
Also if you care only about the difference, you will have to run the algorithm again but this time multiply the even indices by -1.
I really don't think you can solve this in O(log n) time.
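The approach described in this answer could be sketched roughly as follows. This is one possible reading, not a definitive implementation: the function names are invented here, and the "at least 2 rows" tweak is done by only recording a candidate once the current run has been extended by one more element. For the absolute difference, the same helper would be called a second time on the negated list.

```python
def best_run_min2(values):
    """Maximum-sum contiguous run of length >= 2, as (sum, start, end) with end inclusive.

    Kadane-style single pass; returns None when there are fewer than 2 values.
    """
    if len(values) < 2:
        return None
    best = None
    run_sum, run_start = values[0], 0  # best run ending at the previous index
    for i in range(1, len(values)):
        candidate = run_sum + values[i]  # extending guarantees length >= 2
        if best is None or candidate > best[0]:
            best = (candidate, run_start, i)
        # Kadane step: keep extending, or restart the run at index i.
        if candidate >= values[i]:
            run_sum = candidate
        else:
            run_sum, run_start = values[i], i
    return best


def best_row_sequence(grid):
    """Sum each row, flip the sign of the odd indices, then search for the best run."""
    sums = [sum(row) for row in grid]
    signed = [s if i % 2 == 0 else -s for i, s in enumerate(sums)]
    return best_run_min2(signed)


if __name__ == "__main__":
    grid = [[10, 23, 16, 25, 12],
            [19, 11, 8, 1, 4],
            [3, 6, 9, 7, 20],
            [18, 24, 4, 17, 5],
            [7, 3, 4, 6, 1]]
    print(best_row_sequence(grid))  # (88, 0, 2): rows 1 to 3 in the question's numbering
```

Both passes are linear in the number of rows, plus the cost of summing each row once.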

How to slice array quickly based on conditions?

I have a giant nested for loop, 10 levels in all, but for illustration I am including 6 here. I am doing a summation over multiple indices, and the indices are not independent: the index in any inner for loop depends on the index of the outer loop (except for one instance). The innermost loop contains an operation where I slice an array (named 'w') based on 8 different conditions, all combined using '&' and '|'. There is also an 'HB' function that takes this sliced array (named 'wrange') as an argument, performs some operations on it, and returns an array of the same size.
The slicing takes 300-400 microseconds and the 'HB' function about 100 microseconds to execute. I need to bring that down drastically, to nanoseconds.
I tried using a dictionary instead of an array (where I am slicing); it is much slower. I tried storing the sliced array for all possible values, but that is a very large computation in its own right, since there are many possible combinations of the conditions (these conditions depend indirectly on the indices of the for loop).
s goes from 1 to 49
t goes from -s to s
and there are 641 combinations of l,n
Here, i have posted one value of s,t and an l,n combination for illustration.
s = 7
t = -7
l = 72
n = 12
Nl = Dictnorm[n,l]
Gamma_l = Dictfwhm[n,l]
Dictc1 = {}
Dictc2 = {}
Dictwrange = {}
DictH = {}
DictG = {}
product = []
startm = max(-l-t,-l)
endm = min(l-t,l)+1
sum5 = 0
for sp in range(s-2,s+3): #s'
    sum4 = 0
    for tp in range(-sp,-sp+1): #t'
        #print(tp)
        sum3 = 0
        integral = 1
        for lp in range(l-2,l+3): #l'
            sum2 = 0
            if (n,lp) in Dictknl2.keys():
                N1 = Dictnorm[n,lp]
                Gamma_1 = Dictfwhm[n,lp]
                for lpp in range(l-2,l+3): #l"
                    sum1 = 0
                    if ((sp+lpp-lp)%2 == 1 and sp>=abs(lpp-lp) and
                            lp>=abs(sp-lpp) and lpp>=abs(sp-lp) and
                            (n,lpp) in Dictknl2.keys()):
                        F = f(lpp,lp,sp)
                        N2 = Dictnorm[n,lpp]
                        Gamma_2 = Dictfwhm[n,lpp]
                        for m in range(startm, endm): #m
                            sum0 = 0
                            L1 = LKD(n,l,m,l,m)
                            L2 = LKD(n,l,m+t,l,m+t)
                            for mp in range(max(m+t-tp-5,m-5),
                                            min(m+5,m+t-tp+5)+1): #m'
                                if (abs(mp)<=lp and abs(mp)<=lpp and
                                        abs(mp+tp)<=lp and abs(mp+tp)<=lpp
                                        and LKD(n,l,m,lp,mp)!=0
                                        and LKD(n,l,m+t,lpp,mp+tp)!=0):
                                    c3 = Dictomega[n,lp,mp+tp]
                                    c4 = Dictomega[n,lpp,mp]
                                    wrange = np.unique(np.concatenate
                                        ((Dictwrange[m],
                                          w[((w>=(c3-Gamma_1))&
                                             ((c3+Gamma_1)>=w))|
                                            ((w>=(c4-Gamma_2))&
                                             ((c4+Gamma_2)>=w))])))
                                    factor = (sum(
                                        HB(Dictc1[n,l,m+t],
                                           Dictc2[n,l,m],Nl,
                                           Nl,Gamma_l,
                                           Gamma_l,wrange,
                                           Sigma).conjugate()
                                        *HB(c3,c4,N1,N2,Gamma_1,
                                            Gamma_2,wrange,0)*L1*L2)
                                        *LKD(n,l,m,lp,mp)
                                        *LKD(n,l,m+t,lpp,mp+tp) *DictG[m]
                                        *gamma(lpp,sp,lp,tp,mp)
                                        *F)
                                    sum0 = sum0 + factor #sum over m'
                            sum1 = sum1 + sum0 #sum over m
                    sum2 = sum2 + sum1 #sum over l"
            sum3 = sum3 + sum2 #sum over l'
        sum4 = sum4 + sum3*integral #sum over t'
    sum5 = sum5 + sum4 #sum over s'
z = (1/(sum(product)))*sum5
print(z.real,z.imag,l,n)
TL;DR
def HB(a,...f,array1): #########timesucker
    perform_some_operations_on_array1_using_a_b_c_d
    return operated_on_array1
for i in ():
    for j in ():
        ...
            ...
                for o in ():
                    array1 = w[w>some_function1(i,j,..k) &
                               w<some_function2(i,j,..k) |.....] #########timesucker
                    factor = HB(a,....f,array1) * HB(g,...k,array1) *
                             alpha*beta*gamma....
It takes about 30 seconds to run this whole section once. I need to bring it down to as low as possible. 1 second is the minimum target
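One idea that may help with the slicing step, under an assumption the question does not state: if w is sorted in ascending order, each window such as c3-Gamma_1 <= w <= c3+Gamma_1 is a contiguous index range, so it can be located by binary search instead of building a boolean mask over the whole array. The sketch below uses the standard-library bisect module on any sorted sequence; window_slices is a hypothetical helper name, and the widths are assumed to be non-negative. With NumPy, np.searchsorted plays the same role as bisect_left and bisect_right.

```python
from bisect import bisect_left, bisect_right


def window_slices(w_sorted, c3, gamma_1, c4, gamma_2):
    """Index ranges (start, stop) covering the union of two closed windows.

    Assumes w_sorted is ascending and gamma_1, gamma_2 >= 0.
    Returns one range if the windows overlap or touch, otherwise two.
    """
    a = (bisect_left(w_sorted, c3 - gamma_1), bisect_right(w_sorted, c3 + gamma_1))
    b = (bisect_left(w_sorted, c4 - gamma_2), bisect_right(w_sorted, c4 + gamma_2))
    first, second = sorted((a, b))
    if second[0] <= first[1]:
        # Overlapping or adjacent index ranges merge into a single slice.
        return [(first[0], max(first[1], second[1]))]
    return [first, second]


if __name__ == "__main__":
    w = [i / 10 for i in range(100)]  # stand-in for the sorted array w
    ranges = window_slices(w, 2.0, 0.5, 3.0, 1.0)
    selected = [x for start, stop in ranges for x in w[start:stop]]
    masked = [x for x in w if (1.5 <= x <= 2.5) or (2.0 <= x <= 4.0)]
    assert selected == masked
```

Each lookup costs O(log n) rather than a full pass over w, and the resulting slices of a NumPy array are views rather than copies. Whether this is enough to reach the stated target depends on the rest of the loop, which is not measured here.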

python - print squares of numbers which are palindromes : improve efficiency

I have an assignment to do. The problem is something like this. You give a number, say x. The program calculates the square of the numbers starting from 1 and prints it only if it's a palindrome. The program continues to print such numbers till it reaches the number x provided by you.
I have solved the problem. It works fine for uptil x = 10000000. Works fine as in executes in a reasonable amount of time. I want to improve upon the efficiency of my code. I am open to changing the entire code, if required. My aim is to make a program that could execute 10^20 within around 5 mins.
limit = int(input("Enter a number"))
def palindrome(limit):
    count = 1
    base = 1
    while count < limit:
        base = base * base #square the number
        base = list(str(base)) #convert the number into a list of strings
        rbase = base[:] #make a copy of the number
        rbase.reverse() #reverse this copy
        if len(base) > 1:
            i = 0
            flag = 1
            while i < len(base) and flag == 1:
                if base[i] == rbase[i]: #compare the values at the indices
                    flag = 1
                else:
                    flag = 0
                i += 1
            if flag == 1:
                print(''.join(base)) #print if values match
        base = ''.join(base)
        base = int(base)
        base = count + 1
        count = count + 1
palindrome(limit)
Here's my version:
import sys
def palindrome(limit):
    for i in range(limit):
        istring = str(i*i)
        if istring == istring[::-1]:
            print(istring,end=" ")
    print()
palindrome(int(sys.argv[1]))
Timings for your version on my machine:
pu@pumbair: ~/Projects/Stackexchange time python3 palin1.py 100000
121 484 676 10201 12321 14641 40804 44944 69696 94249 698896 1002001 1234321
4008004 5221225 6948496 100020001 102030201 104060401 121242121 123454321 125686521
400080004 404090404 522808225 617323716 942060249
real 0m0.457s
user 0m0.437s
sys 0m0.012s
and for mine:
pu@pumbair: ~/Projects/Stackexchange time python3 palin2.py 100000
0 1 4 9
121 484 676 10201 12321 14641 40804 44944 69696 94249 698896 1002001 1234321
4008004 5221225 6948496 100020001 102030201 104060401 121242121 123454321 125686521
400080004 404090404 522808225 617323716 942060249
real 0m0.122s
user 0m0.104s
sys 0m0.010s
BTW, my version gives more results (0, 1, 4, 9).
Surely something like this will perform better (avoiding the unnecessary extra list operations) and is more readable:
def palindrome(limit):
    base = 1
    while base < limit:
        squared = str(base * base)
        reversed = squared[::-1]
        if squared == reversed:
            print(squared)
        base += 1
limit = int(input("Enter a number: "))
palindrome(limit)
I think we can do it a little bit easier.
def palindrome(limit):
    count = 1
    while count < limit:
        base = count * count # square the number
        base = str(base) # convert the number into a string
        rbase = base[::-1] # make a reverse of the string
        if base == rbase:
            print(base) #print if values match
        count += 1
limit = int(input("Enter a number: "))
palindrome(limit)
The repeated conversions between strings and numbers were unnecessary. Strings can be compared directly, so you don't need the character-by-character loop.
You can keep a list of square palindromes up to a certain limit (say L) in memory. If the input number x is less than sqrt(L), you can simply iterate over the list of palindromes and print them. This way you won't have to iterate over every number and check whether its square is a palindrome.
You can find a list of square palindromes here : http://www.fengyuan.com/palindrome.html
OK, here's my program. It caches valid suffixes for squares (i.e. the values of n^2 mod 10^k for a fixed k), and then searches for squares which have both that suffix and start with the suffix reversed. This program is very fast: in 24 seconds, it lists all the palindromic squares up to 10^24.
from collections import defaultdict

# algorithm will print palindromic squares x**2 up to x = 10**n.
# efficiency is O(max(10**k, n*10**(n-k)))
n = 16
k = 6
cache = defaultdict(list)
print(0, 0)  # special case
# Calculate everything up to 10**k; these will be the prefix/suffix pairs we use later
tail = 10**k
for i in range(tail):
    if i % 10 == 0:  # can't end with 0 and still be a palindrome
        continue
    sq = i*i
    s = str(sq)
    if s == s[::-1]:
        print(i, s)
    prefix = int(str(sq % tail).zfill(k)[::-1])
    cache[prefix].append(i)
prefixes = sorted(cache)
# Loop through the rest, but only consider matching prefix/suffix pairs
for l in range(k*2+1, n*2+1):
    for p in prefixes:
        low = (p * 10**(l-k))**.5
        high = ((p+1) * 10**(l-k))**.5
        low = int(low / tail) * tail
        high = (int(high / tail) + 1) * tail
        for head in range(low, high, tail):  # renamed from n so the limit above is not shadowed
            for suf in cache[p]:
                x = head + suf
                s = str(x*x)
                if s == s[::-1]:
                    print(x, s)
Sample output:
0 0
1 1
2 4
3 9
11 121
22 484
26 676
101 10201
111 12321
121 14641
202 40804
212 44944
<snip>
111010010111 12323222344844322232321
111100001111 12343210246864201234321
111283619361 12384043938083934048321
112247658961 12599536942224963599521
128817084669 16593841302620314839561
200000000002 40000000000800000000004

How to keep a number below n

I have a python program that outputs a list of coordinates that correspond to points in a survey. To keep this simple, I'm trying to make any coordinate above n (36) display something like: 1.8+36, which is 37.8, however 1x1.8 (same number) could also work, or any similar permutation... the coordinates are in lists (one for x and one for y). I currently use an if statement, but that obviously only works for numbers less than 72.
The simplest way is probably to use integer division and the modulus operator (which takes the remainder), so;
blocks = n // 36
small = n % 36
format_n = str(small) + ' + ' + str(blocks) + '*36'
Should give i + k*36, where i < 36 and k is an integer.
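The same idea can be wrapped in a small helper using divmod, which returns the quotient and remainder in one call. This is only a sketch: format_coordinate is a name chosen here, the ":g" format is used to hide floating-point noise in the remainder, and non-negative inputs are assumed.

```python
def format_coordinate(value, block=36):
    """Render value as 'remainder + k*block'; values below block are shown unchanged."""
    blocks, small = divmod(value, block)
    if blocks == 0:
        return f"{value:g}"
    # ":g" trims noise such as 1.7999999999999972 down to 1.8
    return f"{small:g} + {int(blocks)}*{block}"


if __name__ == "__main__":
    for coordinate in (12, 37.8, 105):
        print(format_coordinate(coordinate))
    # 12
    # 1.8 + 1*36
    # 33 + 2*36
```

Applied to the two coordinate lists, this would be something like [format_coordinate(x) for x in xs], where xs stands for whichever list holds the x values.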
As long as your values remain below 1296 (36*36), you can divide your number by 36 and represent it as a multiple of 36.
input_1 = 105
output_1 = input_1 / 36  # 2.9167 (rounded)
print('36*' + str(round(output_1, 4)))  # 36*2.9167
n = float(input())
if n > 36:
    result = str(n - 36) + "+36"
else:
    result = n
print(result)
This subtracts 36 from n and appends "+36"; for example, if n is 124.79, 88.79+36 is output.
