Crossing number/Winding number polygon test expressed as binary (Python)

I'm trying to implement either a winding number or a crossing number test, built mainly around boolean operations.
The boolean requirement comes from the methods and efficiency of the underlying dataset, which make it sub-optimal to dedicate variables to counting anything other than a boolean value.
Crossing number seems easiest to implement (I would think), since it is inherently binary (even (0) vs. odd (1)): the result of the crossing test for each side can be XOR-ed with the previous results, as in the code given below, where xyz is the coordinate being evaluated. The code is adapted from the code at the end of http://geomalgorithms.com/a03-_inclusion.html.
import sys

#Original points:
pts=[[100,100],[200,200],[300,100],[400,300],[300,400],[200,300],[100,100]]
#Extremas:
min=[pts[0][0],pts[0][1]]
max=[pts[0][0],pts[0][1]]
for i in pts:
    for j in range(2):
        if i[j]<min[j]:
            min[j]=i[j]
        if i[j]>max[j]:
            max[j]=i[j]
#New dimensions:
w=max[0]-min[0]
h=max[1]-min[1]
if len(sys.argv) > 2:
    xyz=[int(sys.argv[1]),int(sys.argv[2])]
else:
    xyz=[200,100]
#Normalize by cutting off lower than minimum, higher than maximum:
for i,p in enumerate(pts):
    pts[i]=[p[0]-min[0],p[1]-min[1]]
x=0
y=1
logic=None
counting=0
for i in range(len(pts)-1):
    #Edge crosses the horizontal ray (upward or downward) and the crossing lies to the right of xyz:
    test=( ( (pts[i][y] <= xyz[y]) and (pts[i+1][y] > xyz[y]) ) or \
    ( (pts[i][y] > xyz[y]) and (pts[i+1][y] <= xyz[y]) ) ) and \
    (xyz[x] < pts[i][x] + ( (xyz[y]-pts[i][y])/(pts[i+1][y]-pts[i][y]) ) * (pts[i+1][x] - pts[i][x]))
    if logic is None:
        logic=test
    else:
        logic^=test
    if test:
        counting+=1
print(logic)
print(counting)
Results:
For a whole image the binary flow results in these images where each square is one step.
Obviously something is amiss, yet I can't seem to figure out why it goes all haywire after it rounds the lower right corner... Any thoughts?

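Aha!
& is not the same as and, and | is not the same as or: & and | are bitwise operators, which bind more tightly than comparisons and do not short-circuit. By changing the operators to and/or it worked.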
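As a minimal sketch of the same crossing-number idea, the test can be packed into a function that keeps only one boolean and toggles it with XOR. The function name crossing_parity and the sample points are illustrative assumptions, not the asker's final code. The last two lines show one way `&` can differ from `and`; whether this is exactly what went wrong in the original attempt is not known from the post.

```python
def crossing_parity(point, polygon):
    """Even-odd (crossing number) test that keeps only one boolean.

    polygon is a list of [x, y] vertices with the first vertex repeated
    at the end, as in the question's pts list.
    """
    px, py = point
    inside = False
    for (x0, y0), (x1, y1) in zip(polygon, polygon[1:]):
        # Upward or downward crossing of the horizontal line through the point.
        straddles = (y0 <= py < y1) or (y1 <= py < y0)
        # `and` short-circuits, so the division only runs when y0 != y1.
        if straddles and px < x0 + (py - y0) / float(y1 - y0) * (x1 - x0):
            inside ^= True  # flip parity for each crossing to the right
    return inside


pts = [[100, 100], [200, 200], [300, 100], [400, 300],
       [300, 400], [200, 300], [100, 100]]
print(crossing_parity([300, 250], pts))  # True: inside the polygon
print(crossing_parity([200, 100], pts))  # False: in the notch between the two bottom vertices

# Precedence pitfall: `&` binds tighter than the comparisons around it.
print(0 <= 1 and 2 > 1)  # True
print(0 <= 1 & 2 > 1)    # False, parsed as 0 <= (1 & 2) > 1
```

Wrapping every comparison in its own parentheses removes the precedence problem, but `&` and `|` still evaluate both operands, so the division would run even for horizontal edges.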

Related

Creating an s-curve based on data points

I have a series of data points which form a curve I do not have an equation for, and for which I have not been able to calculate a satisfactory equation with either LibreOffice or the online curve-fitting tools in the first 2 pages of Google results.
I would like the equation for the curve and ideally a Python implementation that calculates y values for a given x value along that curve, in case there are unexpected hoops to jump through. Failing that, I would like any Python solution more elegant than a list of elif statements incrementing y if x is high enough for it to increase by a whole number, which is the ugly solution of last resort. My immediate plans do not require decimal precision.
The curve crosses the zero line at 10, and every whole-number increment of y requires x to be incremented by one more whole number than the previous one, so y=1 is reached at x=11, y=2 at x=13, y=3 at x=16 etc., with the curve bending in the other direction in the negatives such that y=-1 is at x=9, y=-2 is at x=7 etc. I suspect I am missing something obvious as far as finding the curve equation goes, given that I already have this knowledge.
In addition to trying LibreOffice Calc and several online curve-fitting websites to no avail, I have tried slicing the s-curve into two logarithmic curves, which almost works. (I have given up on searching for the term "sigmoid function", as all my results are either related to neural nets or expect my y values never to exceed +-1.) 5 * (np.log(x) - 11) gets frustratingly close to the top half of the curve, but I ultimately haven't been able to use it: in addition to crossing the number line at 9, it produced some odd behaviour when I returned round()-rounded y values directly, displaying results in the negative 40s when returned directly but seeming to work fine when those numbers are fed into other calculations.
If somebody can give me two working logarithms that round to the right numbers for x values between 0 and 50, that is good enough for this project.
Thank you for your time and patience.
-EDIT-
These are triangular numbers, apparently: x-10 is equal to the number of dots in a triangle with y dots on each side, so what I need is the inverse of the triangular number formula. Thank you to everyone who commented.
As mentioned in my edit, the y i am trying to find is the triangular root of x. This solution:
import numpy as np

def get_triangle_root(x: int) -> int:
    # The curve crosses zero at x = 10, so shift before inverting.
    current_value = x - 10
    negative = False
    if current_value < 0:
        current_value = current_value * -1
        negative = True
    # Inverse of the triangular number formula t = y*(y+1)/2.
    current_value = np.sqrt(1 + (current_value * 8))
    current_value = (current_value - 1)/2
    if negative == True:
        current_value = current_value * -1
    # int() truncates toward zero.
    current_value = int(current_value)
    return current_value
seems to work fine for now. Curiously, when I calculate (-1+(sqrt(1+(8*x)))/2) using LibreOffice or Google, rather than getting the same results this Python script gives me, I get results 0.5 lower than the actual triangle root. Unimportant at this time, but I am curious as to what would cause it.
At any rate, thank you to everyone who lent their time to me. I apologise to anyone looking at this question who was looking for a universal solution for creating S-curves rather than just one that works for my specific task, but I feel it is best to attach an answer to this question so as not to prevail on more people's time.
-EDIT- Changed the Python script to handle negative triangular numbers as well, something I had overlooked in my excitement.
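On the 0.5 discrepancy mentioned above: as typed, the spreadsheet expression (-1+(sqrt(1+(8*x)))/2) divides only the square root by 2, whereas the Python function subtracts 1 first and then divides the whole difference by 2. The two differ by exactly 0.5. A small sketch that compares both groupings (the function names are made up for illustration, and the x-10 offset is left out):

```python
import math


def triangle_root(t):
    # Grouping used by the Python function: (sqrt(1 + 8t) - 1) / 2
    return (math.sqrt(1 + 8 * t) - 1) / 2


def spreadsheet_as_typed(t):
    # Grouping of the expression as typed: -1 + (sqrt(1 + 8t) / 2)
    return -1 + math.sqrt(1 + 8 * t) / 2


for t in (1, 3, 6):
    print(t, triangle_root(t), spreadsheet_as_typed(t))
# 1 1.0 0.5
# 3 2.0 1.5
# 6 3.0 2.5
```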
What you're looking for is a class of functions called "sigmoid functions". They have a characteristic S-shape. Go to Wolfram and play around with some common sigmoid functions, remembering that the "a" in f(x-a) shifts the entire curve left or right, and that adding a value "b", as in f(x-a) + b, shifts the curve up or down. A coefficient "c", as in f(c*x - a) + b, scales the curve along the x axis. That should get you where you want to be in short order.
Example: (1/(1 + C*exp(-(x + A)))) + B
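To make the roles of the parameters concrete, here is a small sketch using the logistic function in the f(c*x - a) + b form described above. The name shifted_logistic and the default values are assumptions for illustration; this is not a fit to the asker's data.

```python
import math


def shifted_logistic(x, a=0.0, b=0.0, c=1.0):
    """Logistic curve f(c*x - a) + b, with f(u) = 1 / (1 + exp(-u)).

    a moves the curve left/right, b moves it up/down,
    c stretches or compresses it along the x axis.
    """
    return 1.0 / (1.0 + math.exp(-(c * x - a))) + b


print(shifted_logistic(0))                    # 0.5: midpoint of the plain curve
print(shifted_logistic(10, a=10))             # 0.5: midpoint moved to x = 10
print(shifted_logistic(10, a=10, b=-0.5))     # 0.0: curve now crosses zero at x = 10
```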

Binary-like search

This is a theoretical question, based on an exercise from LeetCode.
My own solution for the task is a binary search, but the question is not about that.
I found a perfect solution on the Discuss tab (the following code is taken from there):
class Solution:
    def mySqrt(self, x: int) -> int:
        low, high= 1, x
        while low<high:
            high = (low + high) // 2
            low = x // high
        return high
It works perfectly. My question is:
In a regular binary search, we take the middle of the sequence and, depending on the result of a comparison, discard the excess part (left or right), then repeat until we have the result.
What is this implementation based on?
This solution cuts off the part of the sequence right after the middle, plus a small part from the start.
This code isn't based on binary search. It's based instead on adapting the ancient "Babylonian method" to integer arithmetic. That in turn can be viewed as anticipating an instance of Newton's more-general method for finding a root of an equation.
Keeping distinct low and high variables isn't important in this code. For example, it's more commonly coded along these lines:
def intsqrt(n):
    if n == 0:
        return 0  # guard: the loop below would divide by zero
    guess = n # must be >= true floor(sqrt(n))
    while True:
        newguess = (guess + (n // guess)) // 2
        if guess <= newguess:
            return guess
        guess = newguess
but with more care taken to find a better initial guess.
BTW, binary search increases the number of "good bits" by 1 per iteration. This method approximately doubles the number of "good bits" per iteration, so is much more efficient the closer the guess gets to the final result.
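A rough way to see the difference in convergence is to count loop iterations for both approaches. The sketch below is an illustration, not the answerer's code: the initial guess based on bit_length is one possible choice of "better initial guess", and both function names are made up here.

```python
def newton_isqrt(n):
    """Integer Newton/Babylonian iteration; returns (floor(sqrt(n)), iterations)."""
    if n == 0:
        return 0, 0
    # Assumed starting point: a power of two that is >= sqrt(n).
    guess = 1 << ((n.bit_length() + 1) // 2)
    steps = 0
    while True:
        new_guess = (guess + n // guess) // 2
        steps += 1
        if new_guess >= guess:
            return guess, steps
        guess = new_guess


def bisect_isqrt(n):
    """Plain binary search on [0, n]; returns (floor(sqrt(n)), iterations)."""
    low, high, steps = 0, n, 0
    while low < high:
        mid = (low + high + 1) // 2  # upper midpoint so the range always shrinks
        steps += 1
        if mid * mid <= n:
            low = mid
        else:
            high = mid - 1
    return low, steps


n = 10 ** 40
print(newton_isqrt(n))  # same root, far fewer iterations
print(bisect_isqrt(n))
```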
This method is subtle, though it was known to the Babylonians (see Tim's answer).
Assume that h > √x. Then
1. l = x/h < √x, and
2. (l+h)/2 > √x.
The first property is obvious. For the second, substituting l = x/h and multiplying by 2h turns it into
x+h² > 2h√x, or (h-√x)² > 0, which is true.
So h remains above √x, but it gets closer and closer (because (l+h)/2 < h). And when the computation is made with integers, there comes a moment when l ≥ h.
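The behaviour described above can be observed by recording the (low, high) pairs that the loop from the question produces. This is only an instrumented copy of that loop; the name sqrt_trace is made up for illustration.

```python
import math


def sqrt_trace(x):
    """Run the loop from the question and record (low, high) after each pass."""
    low, high = 1, x
    steps = []
    while low < high:
        high = (low + high) // 2  # average of the two bounds
        low = x // high           # integer counterpart of l = x/h
        steps.append((low, high))
    return steps


print(sqrt_trace(8))   # [(2, 4), (2, 3), (4, 2)]
print(sqrt_trace(10))  # [(2, 5), (3, 3)]
print(math.isqrt(8), math.isqrt(10))  # 2 3
```

In both traces high never drops below the integer square root, and the loop stops at the first pass where low >= high.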
How was this method discovered?
Assume that we have an approximation h of √x and want to improve it with a correction δ. We write x = (h-δ)² = h²-2hδ + δ². If we neglect δ², we obtain h-δ = (h²+x)/2h = (h+x/h)/2, which is our (h+l)/2.

Been using rand.int for a while and seeing unexpected results

I've been running some code for an hour or so that uses random.randint to model rolling a ten-faced die: you have to roll it six times in a row, each roll has to give the same number, and the code tracks how many tries it takes for this to happen.
import random

success = 0
times = 0
count = 0
total = 0
for h in range(0,100):
    for i in range(0,10):
        times = 0
        while success == 0:
            numbers = [0,0,0,0,0,0,0,0,0,0]
            for j in range(0,6):
                x = int(random.randint(0,9))
                numbers[x] = 1
            count = numbers.count(1)
            if count == 1:
                success = 1
            else:
                times += 1
        print(i)
        total += times
        success = 0
    randtst = open("RandomTesting.txt", "a" )
    randtst.write(str(total / 10)+"\n")
    randtst.close()
And running this code, this has been going into a file, the contents of which is below
https://pastebin.com/7kRK1Z5f
And taking the average of these numbers using
newtotal = 0
totalamounts = 0
with open ('RandomTesting.txt', 'rt') as rndtxt:
    for myline in rndtxt:
        newtotal += float(myline)
        totalamounts += 1
print(newtotal / totalamounts)
Which returns 742073.7449342106. This number is incorrect (I think), as it is not near to 10^6. I tried getting rid of the contents and doing it again, but to no avail: the number is nowhere near 10^6. Can anyone see a problem with this?
Note: I am not asking for fixes to the code or anything, I am asking whether something has gone wrong to get the above number rather than 100,000.
There are several issues working against you here. Bottom line up front:
your code doesn't do what you described as your intent;
you currently have no yardstick for measuring whether your results agree with the theoretical answer; and
your expectations regarding the correct answer are incorrect.
I felt that your code was overly complex for the task you were describing, so I wrote my own version from scratch. I factored out the basic experiment of rolling six 10-sided dice and checking to see if the outcomes were all equal by creating a list of length 6 comprised of 10-sided die rolls. Borrowing shamelessly from BoarGules' comment, I threw the results into a set—which only stores unique elements—and counted the size of the set. The dice are all the same value if and only if the size of the set is 1. I kept repeating this while the number of distinct elements was greater than 1, maintaining a tally of how many trials that required, and returned the number of trials once identical die rolls were obtained.
That basic experiment is then run for any desired number of replications, with the results placed in a numpy array. The resulting data was processed by numpy and scipy to yield the average number of trials and a 95% confidence interval for the mean. The confidence interval uses the estimated variability of the results to construct a lower and an upper bound for the mean. The bounds produced this way should contain the true mean for 95% of estimates generated in this way if the underlying assumptions are met, and address the second point in my BLUF.
Here's the code:
import random
import scipy.stats as st
import numpy as np

NUM_DIGITS = 6
SAMPLE_SIZE = 1000

def expt():
    # Count trials until all NUM_DIGITS rolls of a 10-sided die are equal.
    num_trials = 1
    while(len(set([random.randrange(10) for _ in range(NUM_DIGITS)])) > 1):
        num_trials += 1
    return num_trials

data = np.array([expt() for _ in range(SAMPLE_SIZE)])
mu_hat = np.mean(data)
# Confidence level passed positionally: newer SciPy releases renamed the `alpha` keyword to `confidence`.
ci = st.t.interval(0.95, df=SAMPLE_SIZE-1, loc=mu_hat, scale=st.sem(data))
print(mu_hat, ci)
The probability of producing 6 identical results of a particular value from a 10-sided die is 10^-6, but there are 10 possible particular values, so the overall probability of producing all duplicates is 10*10^-6, or 10^-5. Consequently, the expected number of trials until you obtain a set of duplicates is 10^5. The code above took a little over 5 minutes to run on my computer, and produced 102493.559 (96461.16185897154, 108525.95614102845) as the output. Rounding to integers, this means that the average number of trials was 102493 and we're 95% confident that the true mean lies somewhere between 96461 and 108526. This particular range contains 10^5, i.e., it is consistent with the expected value. Rerunning the program will yield different numbers, but 95% of such runs should also contain the expected value, and the handful that don't should still be close.
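The arithmetic above can be checked exactly with rational numbers. The sketch below assumes independent, fair rolls, so the number of trials until the first success follows a geometric distribution with mean 1/p; the function name expected_trials is made up for illustration.

```python
from fractions import Fraction


def expected_trials(sides=10, rolls=6):
    """Expected number of trials until all `rolls` dice show the same face."""
    p_specific = Fraction(1, sides) ** rolls  # all rolls equal one given face
    p_any = sides * p_specific                # all rolls equal, any face
    return 1 / p_any                          # mean of a geometric distribution


print(expected_trials())       # 100000
print(expected_trials(6, 2))   # 6: two ordinary dice match once in six throws on average
```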
Might I suggest that, if you're working with whole integers, you should be getting a whole number back instead of a floating-point value (if I'm understanding what you're trying to do).
##randtst.write(str(total / 10)+"\n") Original
##randtst.write(str(total // 10)+"\n")
Using floor division instead of true division rounds the result down to a whole number, which is more ideal for what you're trying to do.
If you ARE using floating-point numbers, perhaps use % instead. Note that % does not give the quotient; it returns ONLY the remainder of the division.
% is modulo in Python
// is floor division in Python
Those operators will keep your numbers stable and easier to work with if your total comes back as a floating-point value.
If this isn't the case, you will have to account for every digit to the right of the decimal point.
And if this IS the case, your result will never reach 10^6, because the line for totalling your value is stuck in a loop.
I hope this helps you in some way, and if not, please let me know, as I'm also learning Python.

Binary search to find the highest possible value in Python

I have a very large number (20 digits) which I need to find, so it lies in a range between 0 and 99999999999999999999.
I can perform a check if the number is larger or smaller than the guessed number, so for example:
is_smaller(12341234123412341234)
# True
is_smaller(98769876987698769876)
# False
How the function is_smaller works is unknown, but the result for a given number is constant.
Could this be solved with a binary search? I'm not quite sure how to implement this, as I only ever know if the number is smaller/larger.
Most implementations of the binary search I've come across, use it to find a given number in an array, which doesn't work for me as the number is unknown.
How could I use binary search in this scenario, or would another method be better suited?
The goal is to find the highest possible value, that still returns True for is_smaller.
edit: I do not have a way of telling if the number is bigger, so I have no is_bigger function. So in a smaller range (e.g. 0 to 10), if the number of interest is 6, the function I have would return:
[...]
is_smaller(4)
# True
is_smaller(5)
# True
is_smaller(6)
# True
is_smaller(7)
# False
is_smaller(8)
# False
I have to admit the function's name in the question was very poorly chosen.
If something is neither bigger nor smaller than the number you're looking for, it's the number you're looking for.
def is_answer(n):
    return not is_smaller(n) and not is_larger(n)
Now you can use standard binary search; just replace conditionals that look like
if x == search_term:
if x < search_term:
if x > search_term:
With
if is_answer(x):
if is_smaller(x):
if is_larger(x):
Respectively. If you want a <= or >= operator for your binary search, you can construct it yourself from these building blocks.
Binary search splits a range lower_boundary .. higher_boundary into either lower_boundary .. (lower_boundary + higher_boundary) // 2 or (lower_boundary + higher_boundary) // 2 + 1 .. higher_boundary, depending on the outcome of your is_smaller function.
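A minimal sketch of that idea, using only the is_smaller predicate from the question. It assumes the predicate is monotonic (True up to some value, False above it), as in the 0 to 10 example; the function name find_highest_true and the stand-in predicates are made up for illustration.

```python
def find_highest_true(is_smaller, low, high):
    """Largest n in [low, high] with is_smaller(n) True, or None if there is none.

    Assumes is_smaller is True for all n up to some threshold and False above it.
    """
    if not is_smaller(low):
        return None
    while low < high:
        mid = (low + high + 1) // 2  # upper midpoint so the range always shrinks
        if is_smaller(mid):
            low = mid        # mid still passes, answer is mid or higher
        else:
            high = mid - 1   # mid fails, answer is below mid
    return low


# Stand-in predicates; the real is_smaller is a black box.
print(find_highest_true(lambda n: n <= 6, 0, 10))  # 6

secret = 12345678901234567890
print(find_highest_true(lambda n: n <= secret, 0, 10 ** 20 - 1) == secret)  # True
```

For a 20-digit range this needs fewer than 70 calls to is_smaller, since each call halves the remaining range.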

Solving recursive sequence

Lately I've been solving some challenges from Google Foobar for fun, and now I've been stuck in one of them for more than 4 days. It is about a recursive function defined as follows:
R(0) = 1
R(1) = 1
R(2) = 2
R(2n) = R(n) + R(n + 1) + n (for n > 1)
R(2n + 1) = R(n - 1) + R(n) + 1 (for n >= 1)
The challenge is writing a function answer(str_S) where str_S is a base-10 string representation of an integer S, which returns the largest n such that R(n) = S. If there is no such n, return "None". Also, S will be a positive integer no greater than 10^25.
I have investigated a lot about recursive functions and about solving recurrence relations, but with no luck. I outputted the first 500 numbers and I found no relation with each one whatsoever. I used the following code, which uses recursion, so it gets really slow when numbers start getting big.
def getNumberOfZombits(time):
    if time == 0 or time == 1:
        return 1
    elif time == 2:
        return 2
    else:
        if time % 2 == 0:
            newTime = time // 2
            return getNumberOfZombits(newTime) + getNumberOfZombits(newTime+1) + newTime
        else:
            newTime = time // 2 # integer division, so rounds down
            return getNumberOfZombits(newTime-1) + getNumberOfZombits(newTime) + 1
The challenge also included some test cases so, here they are:
Test cases
==========
Inputs:
(string) str_S = "7"
Output:
(string) "4"
Inputs:
(string) str_S = "100"
Output:
(string) "None"
I don't know if I need to solve the recurrence relation to anything simpler, but as there is one for even and one for odd numbers, I find it really hard to do (I haven't learned about it in school yet, so everything I know about this subject is from internet articles).
So, any help at all guiding me to finish this challenge will be welcome :)
Instead of trying to simplify this function mathematically, I simplified the algorithm in Python. As suggested by @LambdaFairy, I implemented memoization in the getNumberOfZombits(time) function. This optimization sped up the function a lot.
Then I moved on to the next step: finding the input that produces a given output. I had analyzed the function before by looking at its plot, and I knew the even numbers reached higher outputs first, and only after some time did the odd numbers get to the same level. As we want the highest input for that output, I first needed to search in the even numbers and then in the odd numbers.
The odd numbers always take more time than the even ones to reach the same output.
The problem is that we could not search through the numbers by increasing them by 1 each time (it was too slow). What I did to solve that was to implement a binary-search-like algorithm. First, I would search the even numbers (with the binary-search-like algorithm) until I found an answer or had no more numbers to search. Then I did the same for the odd numbers (again with the binary-search-like algorithm), and if an answer was found, I replaced whatever I had before with it (as it was necessarily bigger than the previous answer).
I have the source code I used to solve this, so if anyone needs it I don't mind sharing it :)
The key to solving this puzzle was using a binary search.
As you can see from the sequence generators, they rely on a roughly n/2 recursion, so calculating R(N) takes about 2*log2(N) recursive calls; and of course you need to do it for both the odd and the even sequence.
That's not too bad, but you need to figure out where to search for the N whose R(N) equals the input. To do this, I first implemented a search for upper and lower bounds for N. I walked up N by powers of 2, until I had N and 2N that formed the lower and upper bounds respectively for each sequence (odd and even).
With these bounds, I could then do a binary search between them to quickly find the value of N, or its non-existence.
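Putting the pieces from both answers together, a sketch could look like the following. It is not either answerer's actual source. It assumes, as both answers observe, that R is increasing when restricted to even n and when restricted to odd n (this holds for the first values but is not proven here), and the helper name search_parity is made up.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def R(n):
    """Memoized version of the recurrence from the question."""
    if n < 2:
        return 1
    if n == 2:
        return 2
    half = n // 2
    if n % 2 == 0:
        return R(half) + R(half + 1) + half
    return R(half - 1) + R(half) + 1


def search_parity(target, parity):
    """Find n = 2k + parity with R(n) == target, or None.

    Assumes R is increasing along n = parity, parity + 2, parity + 4, ...
    """
    low, high = 0, 1
    while R(2 * high + parity) < target:  # walk up by powers of 2 for an upper bound
        high *= 2
    while low < high:                     # binary search for first k with R >= target
        mid = (low + high) // 2
        if R(2 * mid + parity) < target:
            low = mid + 1
        else:
            high = mid
    n = 2 * low + parity
    return n if R(n) == target else None


def answer(str_S):
    target = int(str_S)
    found = [n for n in (search_parity(target, 0), search_parity(target, 1))
             if n is not None]
    return str(max(found)) if found else "None"


print(answer("7"))    # 4
print(answer("100"))  # None
print(answer("22"))   # 17 (R(10) and R(17) are both 22)
```

Taking the maximum of the even and odd candidates avoids relying on which parity yields the larger n.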
