Am I doing something wrong trying to #jit this Python function - python

I've got this Python function which I'm trying to #jit, but I can't really get around it (it's the first time I try jitting something, so I'm here to learn more than anything). I'll include the code (commented) and I'll explain what it does briefly:
def pwrchk(n, m):
no_prob = [] #The list that will contain the exponents
for i in range(2, m+1): #Start from 2 because 1 is useless
power = n**i
is_good = True #If this becomes false later, there's a zero.
for j in range(math.floor(math.log(power, 10)) + 1):
#The formula in the range is the number of digits of 'power'
digit = power % 10 #Returns the last digit so it can be checked
if digit in digits:
is_good = False #This is the check
power = 0
break
power = power // 10 #Gets rid of the digit just checked
if is_good:
no_prob.append(i) #Append to the list of "good" exponents
return no_prob
This function computes n^i , with 2 < i < m, and checks if n^i contains a zero in its digits, and then returns a list of which exponents are such that n^i contains no zeros. It works absolutely fine in normal Python compilation.
Since for big values of m the execution time gets really long (I've tried m = 10^6 and it goes to a crawl), I thought of putting it in Anaconda to #jit it. The problem is that when I use the #jit decorator it tells me that it keeps falling back to object mode, so I can't compile it in nopython mode.
I tried changing the lists to numpy arrays and populating them with the powers of n out of the for cycle, changing math.floor(math.log(power, 10)) using numpy to manage the arrays, nothing.
Am I doing something wrong? I'm sure there's a simple explanation to it that's just going over my head, but as said before I'm kinda new to Numba, so if I'm doing something really dumb please tell me, so I won't do it again in the future, and if I need to provide something else I'll update with anything needed.

Related

Been using rand.int for a while and seeing unexpected results

I've been running some code for an hour or so using a rand.int function, where the code models a dice's roll, where the dice has ten faces, and you have to roll it six times in a row, and each time it has to roll the same number, and it is tracking how many tries it takes for this to happen.
success = 0
times = 0
count = 0
total = 0
for h in range(0,100):
for i in range(0,10):
times = 0
while success == 0:
numbers = [0,0,0,0,0,0,0,0,0,0]
for j in range(0,6):
x = int(random.randint(0,9))
numbers[x] = 1
count = numbers.count(1)
if count == 1:
success = 1
else:
times += 1
print(i)
total += times
success = 0
randtst = open("RandomTesting.txt", "a" )
randtst.write(str(total / 10)+"\n")
randtst.close()
And running this code, this has been going into a file, the contents of which is below
https://pastebin.com/7kRK1Z5f
And taking the average of these numbers using
newtotal = 0
totalamounts = 0
with open ('RandomTesting.txt', 'rt') as rndtxt:
for myline in rndtxt: ,
newtotal += float(myline)
totalamounts += 1
print(newtotal / totalamounts)
Which returns 742073.7449342106. This number is incorrect, (I think) as this is not near to 10^6. I tried getting rid of the contents and doing it again, but to no avail, the number is nowhere near 10^6. Can anyone see a problem with this?
Note: I am not asking for fixes to the code or anything, I am asking whether something has gone wrong to get the above number rather that 100,000
There are several issues working against you here. Bottom line up front:
your code doesn't do what you described as your intent;
you currently have no yardstick for measuring whether your results agree with the theoretical answer; and
your expectations regarding the correct answer are incorrect.
I felt that your code was overly complex for the task you were describing, so I wrote my own version from scratch. I factored out the basic experiment of rolling six 10-sided dice and checking to see if the outcomes were all equal by creating a list of length 6 comprised of 10-sided die rolls. Borrowing shamelessly from BoarGules' comment, I threw the results into a set—which only stores unique elements—and counted the size of the set. The dice are all the same value if and only if the size of the set is 1. I kept repeating this while the number of distinct elements was greater than 1, maintaining a tally of how many trials that required, and returned the number of trials once identical die rolls were obtained.
That basic experiment is then run for any desired number of replications, with the results placed in a numpy array. The resulting data was processed by numpy and scipy to yield the average number of trials and a 95% confidence interval for the mean. The confidence interval uses the estimated variability of the results to construct a lower and an upper bound for the mean. The bounds produced this way should contain the true mean for 95% of estimates generated in this way if the underlying assumptions are met, and address the second point in my BLUF.
Here's the code:
import random
import scipy.stats as st
import numpy as np
NUM_DIGITS = 6
SAMPLE_SIZE = 1000
def expt():
num_trials = 1
while(len(set([random.randrange(10) for _ in range(NUM_DIGITS)])) > 1):
num_trials += 1
return num_trials
data = np.array([expt() for _ in range(SAMPLE_SIZE)])
mu_hat = np.mean(data)
ci = st.t.interval(alpha=0.95, df=SAMPLE_SIZE-1, loc=mu_hat, scale=st.sem(data))
print(mu_hat, ci)
The probability of producing 6 identical results of a particular value from a 10-sided die is 10-6, but there are 10 possible particular values so the overall probability of producing all duplicates is 10*10-6, or 10-5. Consequently, the expected number of trials until you obtain a set of duplicates is 105. The code above took a little over 5 minutes to run on my computer, and produced 102493.559 (96461.16185897154, 108525.95614102845) as the output. Rounding to integers, this means that the average number of trials was 102493 and we're 95% confident that the true mean lies somewhere between 96461 and 108526. This particular range contains 105, i.e., it is consistent with the expected value. Rerunning the program will yield different numbers, but 95% of such runs should also contain the expected value, and the handful that don't should still be close.
Might I suggest if you're working with whole integers that you should be receiving a whole number back instead of a floating point(if I'm understanding what you're trying to do.).
##randtst.write(str(total / 10)+"\n") Original
##randtst.write(str(total // 10)+"\n")
Using a floor division instead of a division sign will round down the number to a whole number which is more idea for what you're trying to do.
If you ARE using floating point numbers, perhaps using the % instead. This will not only divide your number, but also ONLY returns the remainder.
% is Modulo in python
// is floor division in python
Those signs will keep your numbers stable and easier to work if your total returns a floating point integer.
If this isn't the case, you will have to account for every number behind the decimal to the right of it.
And if this IS the case, your result will never reach 10x^6 because the line for totalling your value is stuck in a loop.
I hope this helps you in anyway and if not, please let me know as I'm also learning python.

Subtracting two numbers of any base between 2 and 10

For two numbers x and y that are base b, does this work for subtracting them? The numbers are given in string format and 2 <= b <= 10.
def n2b(n, b): # function to convert number n from base 10 to base b
if n == 0:
return 0
d = []
while n:
d.append(int(n % b))
n /= b
return ''.join(map(str,d[::-1]))
x = int(x,b) # convert to integers in base 10
y = int(y,b)
z = x - y
z = n2b(z,b) # convert back to base b, still in integer form
You have some confusion about how integers work in python. As the comments above say: python always stores integers in binary form and only converts them to a base when you print them. Depending on how you get x and y and how you need to give back z the code needs to be different
Situation 1: x, y and z are all integers
In this case you only need to do
z = x - y
And you're done.
Situation 2: x, y and z are all strings
In this case you first need to convert the strings into integers with the right base. I think that's your case, since you already deal with int(x, b) which is correct to transform a string into an integer (e.g. int("11", 2) gives 3 (integer represented in base 10). I would advice you to reform your code into something like this:
x_int = int(x, b)
y_int = int(y, b)
z_str = n2b(x_int - y_int, b)
In your code x is first a string and then an integer, which is bad practice. So e.g. use x_int instead of x.
Now it comes down to if your n2b function is correct. It looks good from the distance, although you're not handling signs and bases bigger than 10. There is a broadly accepted convert integer to base b answer so you might take this to be sure.
This is exactly the problem I just ran into in the google foobar challenge (which I suspect is the same source of ops problem). Granted its years later and op has no use for my answer but someone else might.
The issue
The function op used looked a lot like a copy and paste of this provided by the accepted answer but slightly modified (although I didn't look to closely).
I used the same function and quickly realized that I needed my output to be a string. Op appears to have realized the same thing based on the return statement at the end of the function.
This is why most of the test cases passed. Op did almost everything right.
See, the function begins with
if n==0:
return 0
One of the test cases in the foobar challenge uses 0 as the input. Its an easy line to miss but a very important one.
Solution
When I was presented this problem, I thought about the possible outlier cases. 0 was my first idea (which turned out to be right). I ran the program in vscode and would ya look at that - it failed.
This let me see the error message (in my case it was a list rather than int but op would have received a similar error).
The solution is simply changing return 0 to return '0' (a string rather than int)
I wasn't super excited to write out this answer since it feels a bit like cheating but its such a small detail that can be so infuriating to deal with. It doesn't solve the challenge that foobar gives you, just tells you how to get past a speed bump along the way.
Good luck infiltrating commander lambda's castle and hopefully this helps.

Are two integers equal in SOME base?

Disclaimer: this problem is purely for fun. There may already be a solution (I've searched, no luck). I have several questions below, I'd appreciate an answer to any of them. RIYL...
I was inspired by a Three Stooges video I saw earlier, where it's shown that 13x7 = 28. You may have seen it. But I started wondering: is there SOME "base" in which this equation is true (I put base in quotations because I'm using the term in the wrong sense..see the final paragraph for what I mean)?
The answer is clearly no, if we define multiplication the same as for integers. If you break up 13 into "base" i, say 13 = 1*i+3, and 28 = 2*i+8, the multiplication factor of 7 ensures equality won't happen.
Okay, but now suppose you want to ask the question, is there some base where two numbers are equal, say 8 = 10 (I'm probably using the term "base" wrong, sorry for that)?
What I mean is, if we write 8 = 008 = 0*8^2+0*8+8, 10 = 010 = 0*8^2+1*8^1+0, then according to my (clearly wrong) usage of base, we have equality. I wrote some simple code, up to 3 digit numbers, to verify this. But my code sucks.
''' We define two numbers, such that n1 > n2...tho I'm flexible'''
n1 = "013"
n2 = "025"
''' Set the numbers as arrays. '''
num1 = list(range(len(n1)))
num2 = list(range(len(n2)))
for i in range(len(n1)):
num1[i] = int(n1[i])
for i in range(len(n2)):
num2[i] = int(n2[i])
''' Now we loop until we find a match, or no match is possible. '''
i = 1
j = 0
while True:
t1=(num1[0]*(i**2)+num1[1]*i+num1[2])
t2=(num2[0]*(i**2)+num2[1]*i+num2[2])
''' We need some way to check if t1 > t2 changes to t1 < t2 at some point
or vise-versa -> then we know no match is possible '''
if(i == 1):
if t1>t2:
j = 0
else: j = 1
if(t1==t2):
print("The numbers are equal in base %d" % i)
break
if(t2 > t1 and j == 0):
print("No base possible! After %d steps" % i)
break
if(t1 > t2 and j == 1):
print("No base possible! After %d steps" % i)
break
i=i+1
if (i > 2**6):
print("your search might be hopeless")
break
Sorry if your eyes hurt from that hideous code. I didn't even use numpy arrays. What I'm wondering is,
Has this problem been solved before, for arbitrary digits? If not..
I wanted to be flexible about the number of digits entered in n1 and n2. Is there a more clever way to define the functions t1 and t2 so that they adaptively expand in base i, depending on the number of digits entered?
Performance-wise I'm sure there's a better way to do my main iteration. This is the fun part for me, combined with the answer to part 2. Any suggestions?
If it happens that t1 and t2 forever remain ordered the same, as in the example from the code, the iteration will play out 2^^6 times. The iteration number was chosen arbitrarily, but one could imagine if we extended to many digits, one might need even more iterations! Surely there's a more clever way to stop the iteration?
Is my stopping condition just wrong?
Thanks so much for reading. This is not a homework assignment and the answers are probably completely useless. I'm just interested in seeing a real coder's take on it. It would be neat if this wasn't an already-solved problem. Much love.
You can make the problem simpler by applying a little bit of number of theory. Taking your first example 13x7=28 we can expand this out into an explicit polynomial over the base: (1n+3)*7=2n+8.
If this solution has real roots then the roots are values of n (bases) for which the equation is true. If you like this sort of problem then you should read Computational Number Theory by Shoup. It's a fun book.

Solving recursive sequence

Lately I've been solving some challenges from Google Foobar for fun, and now I've been stuck in one of them for more than 4 days. It is about a recursive function defined as follows:
R(0) = 1
R(1) = 1
R(2) = 2
R(2n) = R(n) + R(n + 1) + n (for n > 1)
R(2n + 1) = R(n - 1) + R(n) + 1 (for n >= 1)
The challenge is writing a function answer(str_S) where str_S is a base-10 string representation of an integer S, which returns the largest n such that R(n) = S. If there is no such n, return "None". Also, S will be a positive integer no greater than 10^25.
I have investigated a lot about recursive functions and about solving recurrence relations, but with no luck. I outputted the first 500 numbers and I found no relation with each one whatsoever. I used the following code, which uses recursion, so it gets really slow when numbers start getting big.
def getNumberOfZombits(time):
if time == 0 or time == 1:
return 1
elif time == 2:
return 2
else:
if time % 2 == 0:
newTime = time/2
return getNumberOfZombits(newTime) + getNumberOfZombits(newTime+1) + newTime
else:
newTime = time/2 # integer, so rounds down
return getNumberOfZombits(newTime-1) + getNumberOfZombits(newTime) + 1
The challenge also included some test cases so, here they are:
Test cases
==========
Inputs:
(string) str_S = "7"
Output:
(string) "4"
Inputs:
(string) str_S = "100"
Output:
(string) "None"
I don't know if I need to solve the recurrence relation to anything simpler, but as there is one for even and one for odd numbers, I find it really hard to do (I haven't learned about it in school yet, so everything I know about this subject is from internet articles).
So, any help at all guiding me to finish this challenge will be welcome :)
Instead of trying to simplify this function mathematically, I simplified the algorithm in Python. As suggested by #LambdaFairy, I implemented memoization in the getNumberOfZombits(time) function. This optimization sped up the function a lot.
Then, I passed to the next step, of trying to see what was the input to that number of rabbits. I had analyzed the function before, by watching its plot, and I knew the even numbers got higher outputs first and only after some time the odd numbers got to the same level. As we want the highest input for that output, I first needed to search in the even numbers and then in the odd numbers.
As you can see, the odd numbers take always more time than the even to reach the same output.
The problem is that we could not search for the numbers increasing 1 each time (it was too slow). What I did to solve that was to implement a binary search-like algorithm. First, I would search the even numbers (with the binary search like algorithm) until I found one answer or I had no more numbers to search. Then, I did the same to the odd numbers (again, with the binary search like algorithm) and if an answer was found, I replaced whatever I had before with it (as it was necessarily bigger than the previous answer).
I have the source code I used to solve this, so if anyone needs it I don't mind sharing it :)
The key to solving this puzzle was using a binary search.
As you can see from the sequence generators, they rely on a roughly n/2 recursion, so calculating R(N) takes about 2*log2(N) recursive calls; and of course you need to do it for both the odd and the even.
Thats not too bad, but you need to figure out where to search for the N which will give you the input. To do this, I first implemented a search for upper and lower bounds for N. I walked up N by powers of 2, until I had N and 2N that formed the lower and upper bounds respectively for each sequence (odd and even).
With these bounds, I could then do a binary search between them to quickly find the value of N, or its non-existence.

Slow Big Int Output in python

Is there anyway to improve performance of "str(bigint)" and "print bigint" in python ? Printing big integer values takes a lot of time. I tried to use the following recursive technique :
def p(x,n):
if n < 10:
sys.stdout.write(str(x))
return
n >>= 1
l = 10**n
k = x/l
p(k,n)
p(x-k*l,n)
n = number of digits,
x = bigint
But the method fails for certain cases where x in a sub call has leading zeros. Is there any alternative to it or any faster method. ( Please do not suggest using any external module or library ).
Conversion from a Python integer to a string has a running of O(n^2) where n is the length of the number. For sufficiently large numbers, it will be slow. For a 1,000,001 digit number, str() takes approximately 24 seconds on my computer.
If you are really needing to convert very large numbers to a string, your recursive algorithm is a good approach.
The following version of your recursive code should work:
def p(x,n=0):
if n == 0:
n = int(x.bit_length() * 0.3)
if n < 100:
return str(x)
n >>= 1
l = 10**n
a,b = divmod(x, l)
upper = p(a,n)
lower = p(b,n).rjust(n, "0")
return upper + lower
It automatically estimates the number of digits in the output. It is about 4x faster for a 1,000,001 digit number.
If you need to go faster, you'll probably need to use an external library.
For interactive applications, the built-in print and str functions run in the blink of an eye.
>>> print(2435**356)
392312129667763499898262143039114894750417507355276682533585134425186395679473824899297157270033375504856169200419790241076407862555973647354250524748912846623242257527142883035360865888685267386832304026227703002862158054991819517588882346178140891206845776401970463656725623839442836540489638768126315244542314683938913576544051925370624663114138982037489687849052948878188837292688265616405774377520006375994949701519494522395226583576242344239113115827276205685762765108568669292303049637000429363186413856005994770187918867698591851295816517558832718248949393330804685089066399603091911285844172167548214009780037628890526044957760742395926235582458565322761344968885262239207421474370777496310304525709023682281880997037864251638836009263968398622163509788100571164918283951366862838187930843528788482813390723672536414889756154950781741921331767254375186751657589782510334001427152820459605953449036021467737998917512341953008677012880972708316862112445813219301272179609511447382276509319506771439679815804130595523836440825373857906867090741932138749478241373687043584739886123984717258259445661838205364797315487681003613101753488707273055848670365977127506840194115511621930636465549468994140625
>>> str(2435**356)
'392312129667763499898262143039114894750417507355276682533585134425186395679473824899297157270033375504856169200419790241076407862555973647354250524748912846623242257527142883035360865888685267386832304026227703002862158054991819517588882346178140891206845776401970463656725623839442836540489638768126315244542314683938913576544051925370624663114138982037489687849052948878188837292688265616405774377520006375994949701519494522395226583576242344239113115827276205685762765108568669292303049637000429363186413856005994770187918867698591851295816517558832718248949393330804685089066399603091911285844172167548214009780037628890526044957760742395926235582458565322761344968885262239207421474370777496310304525709023682281880997037864251638836009263968398622163509788100571164918283951366862838187930843528788482813390723672536414889756154950781741921331767254375186751657589782510334001427152820459605953449036021467737998917512341953008677012880972708316862112445813219301272179609511447382276509319506771439679815804130595523836440825373857906867090741932138749478241373687043584739886123984717258259445661838205364797315487681003613101753488707273055848670365977127506840194115511621930636465549468994140625'
If however you are printing big integers to (standard output, say) so that they can be read (from standard input) by another process, and you are finding the binary-to-decimal operations impacting the overall performance, you can look at Is there a faster way to convert an arbitrary large integer to a big endian sequence of bytes? (although the accepted answer suggests numpy, which is an external library, though there are other suggestions).

Categories

Resources