The R ppoints function is described as:
Ordinates for Probability Plotting
Description:
Generates the sequence of probability points ‘(1:m - a)/(m +
(1-a)-a)’ where ‘m’ is either ‘n’, if ‘length(n)==1’, or
‘length(n)’.
Usage:
ppoints(n, a = ifelse(n <= 10, 3/8, 1/2))
...
I've been trying to replicate this function in python and I have a couple of doubts.
1- The first m in (1:m - a)/(m + (1-a)-a) is always an integer: int(n) (ie: the integer of n) if length(n)==1 and length(n) otherwise.
2- The second m in the same equation is NOT an integer if length(n)==1 (it assumes the real value of n) and it IS an integer (length(n)) otherwise.
3- The n in a = ifelse(n <= 10, 3/8, 1/2) is the real number n if length(n)==1 and the integer length(n) otherwise.
These points are not made clear at all in the description, and I'd very much appreciate it if someone could confirm that this is the case.
Add
Well, this was initially posted at https://stats.stackexchange.com/ because I was hoping to get input from statisticians who work with the ppoints function. Since it has been migrated here, I'll paste below the function I wrote to replicate ppoints in Python. I've tested it and both seem to give the same results, but it'd be great if someone could clarify the points made above, because the function's description does not make them clear at all.
def ppoints(vector):
    '''
    Mimics R's function 'ppoints'.
    '''
    m_range = int(vector[0]) if len(vector)==1 else len(vector)
    n = vector[0] if len(vector)==1 else len(vector)
    a = 3./8. if n <= 10 else 1./2
    m_value = n if len(vector)==1 else m_range
    pp_list = [((m+1)-a)/(m_value+(1-a)-a) for m in range(m_range)]
    return pp_list
I would implement this with numpy:
import numpy as np

def ppoints(n, a):
    """ numpy analogue of `R`'s `ppoints` function
        see details at http://stat.ethz.ch/R-manual/R-patched/library/stats/html/ppoints.html
        :param n: array type or number"""
    try:
        # np.float was removed from NumPy; the builtin float does the same job
        n = float(len(n))
    except TypeError:
        n = float(n)
    return (np.arange(n) + 1 - a)/(n + 1 - 2*a)
Sample output:
>>> ppoints(5, 1./2)
array([ 0.1, 0.3, 0.5, 0.7, 0.9])
>>> ppoints(5, 1./4)
array([ 0.13636364, 0.31818182, 0.5 , 0.68181818, 0.86363636])
>>> n = 10
>>> a = 3./8. if n <= 10 else 1./2
>>> ppoints(n, a)
array([ 0.06097561, 0.15853659, 0.25609756, 0.35365854, 0.45121951,
0.54878049, 0.64634146, 0.74390244, 0.84146341, 0.93902439])
One can use R fiddle to test implementation.
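The R usage line picks `a` from `n` by default, which the function above leaves to the caller. A minimal pure-Python sketch of that default follows; it assumes `n` is either a non-negative integer count or a sequence whose length is used, and the name `ppoints_default` is only illustrative:

```python
def ppoints_default(n, a=None):
    """Probability points (i - a) / (m + 1 - 2a) for i = 1..m.

    Assumption: n is an int count or a sequence (its length is used).
    """
    m = n if isinstance(n, int) else len(n)
    if a is None:
        # Default from the R usage line: 3/8 for m <= 10, otherwise 1/2.
        a = 3.0 / 8.0 if m <= 10 else 0.5
    return [(i - a) / (m + 1 - 2 * a) for i in range(1, m + 1)]


print(ppoints_default(5, 0.5))
print(ppoints_default([2.3, 4.1, 0.7]))
```

The non-integer scalar case raised in the question is deliberately left out of this sketch.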
Edit: more details
Hello, I got this problem from one of my teachers, but I still don't understand how to approach it, and I would like to know if anyone has any ideas for it:
Create a program capable of generating systems of equations (randomly) that contain between 2 and 8 variables. The program will ask the user for a number of variables in the system of equations using the input function. The range of the coefficients must be between [-10,10], however, no coefficient should be 0. Both the coefficients and the solutions must be integers.
The goal is to print the system and show the solution to the variables (x,y,z,...). NumPy is allowed.
As far as I understand it should work this way:
Enter the number of variables: 2
x + y = 7
4x - y = 3
x = 2
y = 5
I'm still learning arrays in python, but do they work the same as in matlab?
Thank you in advance :)!
For k variables, the lhs of the equations will be k number of unknowns and a kxk matrix for the coefficients. The dot product of those two should give you the rhs. Then it's a simple case of printing that however you want.
import numpy as np

def generate_linear_equations(k):
    coeffs = [*range(-10, 0), *range(1, 11)]
    rng = np.random.default_rng()
    return rng.choice(coeffs, size=(k, k)), rng.integers(-10, 11, k)

k = int(input('Enter the number of variables: '))
if not 2 <= k <= 8:
    raise ValueError('The number of variables must be between 2 and 8.')
coeffs, variables = generate_linear_equations(k)
solution = coeffs.dot(variables)
symbols = 'abcdefgh'[:k]
for row, sol in zip(coeffs, solution):
    lhs = ' '.join(f'{r:+}{s}' for r, s in zip(row, symbols)).lstrip('+')
    print(f'{lhs} = {sol}')
print()
for s, v in zip(symbols, variables):
    print(f'{s} = {v}')
Which for example can give
Enter the number of variables: 3
8a +6b -4c = -108
9a -9b -4c = 3
10a +10b +9c = -197
a = -9
b = -8
c = -3
If you specifically want the formatting of the lhs to have a space between the sign and to not show a coefficient if it has a value of 1, then you need something more complex. Substitute lhs for the following:
def sign(n):
    return '+' if n > 0 else '-'

lhs = ' '.join(f'{sign(r)} {abs(r)}{s}' if r not in (-1, 1) else f'{sign(r)} {s}' for r, s in zip(row, symbols))
lhs = lhs[2:] if lhs.startswith('+') else f'-{lhs[2:]}'
I did this by randomly generating the left hand side and the solution within your constraints, then plugging the solutions into the equations to generate the right hand side. Feel free to ask for clarification about any part of the code.
import numpy as np

num_variables = int(input('Number of variables:'))
# Allowed coefficients: integers in [-10, 10] except 0
valid_integers = np.asarray([x for x in range(-10,11) if x != 0])
# One row of coefficients per equation
lhs_shape = (num_variables, num_variables)
lhs = np.random.choice(valid_integers, lhs_shape)
solution = np.random.randint(-10, 11, num_variables)
rhs = lhs.dot(solution)
for i in range(num_variables):
    for j in range(num_variables):
        symbol = '=' if j == num_variables-1 else '+'
        print(f'{lhs[i, j]:3d}*x{j+1} {symbol} ', end='')
    print(rhs[i])
for i in range(num_variables):
    print(f'x{i+1} = {solution[i]}')
Example output:
Number of variables:2
2*x1 + -7*x2 = -84
-4*x1 + 1*x2 = 38
x1 = -7
x2 = 10
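Neither version above checks that the random coefficient matrix is invertible, so the printed system can occasionally have more than one solution. A hedged sketch of one way to guard against that using only the standard library: compute the determinant exactly with `fractions.Fraction` and redraw until it is non-zero. The helper names (`determinant`, `generate_system`) are illustrative, not from either answer.

```python
import random
from fractions import Fraction


def determinant(matrix):
    """Exact determinant by Gaussian elimination over Fractions."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    size = len(rows)
    det = Fraction(1)
    for col in range(size):
        # Find a row at or below the diagonal with a non-zero pivot.
        pivot = next((r for r in range(col, size) if rows[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            rows[col], rows[pivot] = rows[pivot], rows[col]
            det = -det  # a row swap flips the sign
        det *= rows[col][col]
        for r in range(col + 1, size):
            factor = rows[r][col] / rows[col][col]
            for c in range(col, size):
                rows[r][c] -= factor * rows[col][c]
    return det


def generate_system(k, rng=random):
    """Return (coeffs, solution, rhs) with non-zero integer coefficients."""
    nonzero = [v for v in range(-10, 11) if v != 0]
    while True:
        coeffs = [[rng.choice(nonzero) for _ in range(k)] for _ in range(k)]
        if determinant(coeffs) != 0:  # redraw singular matrices
            break
    solution = [rng.randint(-10, 10) for _ in range(k)]
    rhs = [sum(c * s for c, s in zip(row, solution)) for row in coeffs]
    return coeffs, solution, rhs


coeffs, solution, rhs = generate_system(3)
print(coeffs, solution, rhs)
```

Singular draws are rare for random coefficients in this range, so the retry loop normally runs once.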
I want to split a decimal number into a list of random values whose sum equals the original number
# Call a function which receives a decimal number
from decimal import Decimal
from something import split_random_decimal
split_decimal = split_random_decimal(Decimal('10.00'))
print(split_decimal)
# Output: [1.3, 0.7, 1.2, 0.8, 1.0, 1.5, 0.5, 1.9, 0.1, 1.0]
print(sum(split_decimal))
# Output: Decimal('10.00') - The original decimal value
Does anyone have an idea how I could do this in pure Python without using a library?
Solved!
Thanks to everyone who helped me. The final code that saved my life is this:
import random

def random_by_number(number, min_random, max_random, spaces=1, precision=2):
    if spaces <= 0:
        return number
    random_numbers = [random.uniform(min_random, max_random) for i in range(0, spaces)]
    increment_number = (number - sum(random_numbers)) / spaces
    return [round(n + increment_number, precision) for n in random_numbers]

number = 2500.50
spaces = 30
max_random = number / spaces
min_random = max_random * 0.6
random_numbers = random_by_number(number, min_random, max_random, spaces=spaces, precision=2)
print(random_numbers)
print(len(random_numbers))
print(sum(random_numbers))
You could start with something like:
import random

numberLeft = 10.0
decList = list()
while numberLeft > 0:
    cur = random.uniform(0, numberLeft)
    decList.append(cur)
    numberLeft -= cur
This implementation tends to pick larger random numbers first, which would not be hard to change.
numberLeft will practically never hit exactly 0, so you need to do something with rounding. Alternatively, you could wait for numberLeft to get low enough and use the remainder as the last number in the list.
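One way to act on that last suggestion, sketched under the assumption that a floating-point total and a small cutoff are acceptable: stop once the remainder drops below the cutoff and append it as the final piece. `split_until_small` and `cutoff` are names made up for this example.

```python
import random


def split_until_small(total, cutoff=0.01):
    """Split total into random non-negative pieces; the last one is the remainder."""
    pieces = []
    remaining = total
    while remaining > cutoff:
        cur = random.uniform(0, remaining)
        pieces.append(cur)
        remaining -= cur
    # Whatever is left (at most the cutoff) becomes the final piece.
    pieces.append(remaining)
    return pieces


parts = split_until_small(10.0)
print(parts, sum(parts))
```

Because the pieces are floats, the sum matches the total only up to rounding error.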
The problem is a little underdefined: into how many pieces should the number be split, and how large may any piece be? Should the values only be positive? An approximate solution from what you've said would be to pick a number of pieces (defaulting to 10) and draw the values from a normal distribution centered on the average piece size, with a standard deviation of 1/10 of the average. The last piece is whatever remains, so the sum is exact:
from decimal import Decimal

def split_random_decimal(x, n=10):
    assert n > 0
    if n == 1:
        return [x]
    from random import gauss
    mu = float(x)/n
    s = mu/10
    if '.' in str(x):
        p = len(str(x)) - str(x).find('.') - 1
    else:
        p = 0
    rv = [Decimal(str(round(gauss(mu, s), p))) for i in range(n-1)]
    rv.append(x - sum(rv))
    return rv
>>> splited_decimal = split_random_decimal(Decimal('10.00'))
>>> print(splited_decimal)
[Decimal('0.84'), Decimal('1.08'), Decimal('0.85'), Decimal('1.04'),
Decimal('0.96'), Decimal('1.2'), Decimal('0.9'), Decimal('1.09'),
Decimal('1.08'), Decimal('0.96')]
I think this is what you're looking for:
import random as r

def random_sum_to(n, num_terms = None):
    n = n*100
    num_terms = (num_terms or r.randint(2, n)) - 1
    a = r.sample(range(1, n), num_terms) + [0, n]
    a.sort()
    return [(a[i+1] - a[i])/100.00 for i in range(len(a) - 1)]
print(random_sum_to(20, 3)) # [8.11, 3.21, 8.68] example
print(random_sum_to(20, 5)) # [5.21, 7.57, 0.43, 3.83, 2.96] example
print(random_sum_to(20)) # [1 ,2 ,1 ,4, 4, 2, 2, 1, 3] example
n is the number you are summing to, and num_terms is the length of the list you would like as a result. As the last example shows, num_terms is optional: if you leave it out, a random length is chosen for you.
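The same cut-point idea can be kept exact by working in integer hundredths and converting back to `Decimal`, which matches the `Decimal('10.00')` call in the question. A sketch, assuming two decimal places and strictly positive parts; `split_decimal_exact` is a hypothetical name:

```python
import random
from decimal import Decimal


def split_decimal_exact(value, num_terms):
    """Split a Decimal with two places into num_terms positive Decimals."""
    cents = int(value * 100)  # work in integer hundredths
    if not 1 <= num_terms <= cents:
        raise ValueError("num_terms must be between 1 and the number of cents")
    # Distinct interior cut points, plus both ends of the interval.
    cuts = sorted(random.sample(range(1, cents), num_terms - 1))
    bounds = [0] + cuts + [cents]
    return [Decimal(b - a) / 100 for a, b in zip(bounds, bounds[1:])]


parts = split_decimal_exact(Decimal("10.00"), 10)
print(parts)
print(sum(parts))
```

Since every part is a whole number of hundredths, the sum equals the input exactly, with no rounding step.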
I am trying to complete the following exercise:
https://www.codewars.com/kata/whats-a-perfect-power-anyway/train/python
I tried multiple variations, but my code breaks down when big numbers are involved (I tried multiple variations with solutions involving log and power functions):
Exercise:
Your task is to check whether a given integer is a perfect power. If it is a perfect power, return a pair m and k with m^k = n as a proof. Otherwise return Nothing, Nil, null, None or your language's equivalent.
Note: For a perfect power, there might be several pairs. For example 81 = 3^4 = 9^2, so (3,4) and (9,2) are valid solutions. However, the tests take care of this, so if a number is a perfect power, return any pair that proves it.
The exercise uses Python 3.4.3
My code:
import math

def isPP(n):
    for i in range(2 +n%2,n,2):
        a = math.log(n,i)
        if int(a) == round(a, 1):
            if pow(i, int(a)) == n:
                return [i, int(a)]
    return None
Question:
How is it possible that I keep getting incorrect answers for bigger numbers? I read that in Python 3, all ints are treated as "long" from Python 2, i.e. they can be very large and still represented accurately. Thus, since i and int(a) are both ints, shouldn't the pow(i, int(a)) == n be assessed correctly? I'm actually baffled.
(Edit note: also added an integer nth root below.)
You are on the right track with logarithms, but you are doing the math wrong. You are also skipping numbers you should not, testing only the even numbers or only the odd numbers, without considering that a number can be even with an odd power or vice versa.
Check this:
>>> math.log(170**3,3)
14.02441559235585
>>>
Not even close. The correct method is described under Nth root,
which is:
let $x$ be the number whose nth root we want, $n$ the degree of the root and $r$ the result; then we get
$r^n = x$
take the log in any base $b$ of both sides, and solve for $r$
$\log_b(r^n) = \log_b(x)$
$n \log_b(r) = \log_b(x)$
$\log_b(r) = \log_b(x) / n$
$b^{\log_b(r)} = b^{\log_b(x) / n}$
$r = b^{\log_b(x) / n}$
so for instance with log in base 10 we get
>>> pow(10, math.log10(170**3)/3 )
169.9999999999999
>>>
That is much closer, and simply rounding it gives the answer:
>>> round(169.9999999999999)
170
>>>
therefore the function should be something like this
import math

def isPP(x):
    for n in range(2, 1+round(math.log2(x)) ):
        root = pow( 10, math.log10(x)/n )
        result = round(root)
        if result**n == x:
            return result,n
the upper limit in range is to avoid testing numbers that will certainly fail
test
>>> isPP(170**3)
(170, 3)
>>> isPP(6434856)
(186, 3)
>>> isPP(9**2)
(9, 2)
>>> isPP(23**8)
(279841, 2)
>>> isPP(279841)
(529, 2)
>>> isPP(529)
(23, 2)
>>>
EDIT
Or, as Tim Peters points out, you can use pow(x,1./n), since the nth root of a number can also be written as x^(1/n),
for example
>>> pow(170**3, 1./3)
169.99999999999994
>>> round(_)
170
>>>
but keep in mind that this will fail for extremely large numbers, for example
>>> pow(8191**107,1./107)
Traceback (most recent call last):
File "<pyshell#90>", line 1, in <module>
pow(8191**107,1./107)
OverflowError: int too large to convert to float
>>>
while the logarithmic approach succeeds
>>> pow(10, math.log10(8191**107)/107)
8190.999999999999
>>>
The reason is that 8191**107 is simply too big: it has 419 digits, which is more than the largest representable float (about 1.8e308), but reducing it with a log produces a more reasonable number.
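The digit count and the float limit mentioned above are easy to check directly. A small sketch using only the standard library:

```python
import math
import sys

big = 8191 ** 107

print(len(str(big)))        # number of decimal digits
print(sys.float_info.max)   # largest finite float, about 1.8e308

try:
    float(big)
except OverflowError as exc:
    print("float() failed:", exc)

# math.log10 accepts arbitrarily large ints, so the log route still works.
root = 10 ** (math.log10(big) / 107)
print(round(root))
```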
EDIT 2
Now, if you want to work with ridiculously big numbers, or simply want to avoid floating-point arithmetic altogether and use only integer arithmetic, the best course of action is Newton's method. The helpful link provided by Tim Peters covers the particular case of the cube root, and together with the Wikipedia article it shows how to do it in general:
def inthroot(A,n):
    if A<0:
        if n%2 == 0:
            raise ValueError
        return - inthroot(-A,n)
    if A==0:
        return 0
    n1 = n-1
    if A.bit_length() < 1024: # float(A) safe from overflow
        xk = int( round( pow(A,1/n) ) )
        xk = ( n1*xk + A//pow(xk,n1) )//n # Ensure xk >= floor(nthroot(A)).
    else:
        xk = 1 << -(-A.bit_length()//n) # power of 2 closer but greater than the nth root of A
    while True:
        sig = A // pow(xk,n1)
        if xk <= sig:
            return xk
        xk = ( n1*xk + sig )//n
Check the explanation by Mark Dickinson to understand how the algorithm works for the cube root case; the general case is basically the same.
Now let's compare this with the other one:
>>> def nthroot(x,n):
        return pow(10, math.log10(x)/n )
>>> n = 2**(2**12) + 1 # a ridiculously big number
>>> r = nthroot(n**2,2)
Traceback (most recent call last):
File "<pyshell#48>", line 1, in <module>
nthroot(n**2,2)
File "<pyshell#47>", line 2, in nthroot
return pow(10, math.log10(x)/n )
OverflowError: (34, 'Result too large')
>>> r = inthroot(n**2,2)
>>> r == n
True
>>>
then the function is now
import math

def isPPv2(x):
    for n in range(2,1+round(math.log2(x))):
        root = inthroot(x,n)
        if root**n == x:
            return root,n
test
>>> n = 2**(2**12) + 1 # a ridiculously big number
>>> r,p = isPPv2(n**23)
>>> p
23
>>> r == n
True
>>> isPPv2(170**3)
(170, 3)
>>> isPPv2(8191**107)
(8191, 107)
>>> isPPv2(6434856)
(186, 3)
>>>
now let's check isPP vs isPPv2
>>> x = (1 << 53) + 1
>>> x
9007199254740993
>>> isPP(x**2)
>>> isPPv2(x**2)
(9007199254740993, 2)
>>>
clearly, avoiding floating point is the best choice
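The failure above comes from the fact that a float carries only 53 bits of mantissa, so `2**53 + 1` cannot be represented and is rounded to a neighbour. A brief sketch of that effect, with `math.isqrt` (Python 3.8+) shown as an integer-only alternative for the square-root case only:

```python
import math

x = (1 << 53) + 1

# Converting to float loses the low bit: the nearest float is 2**53.
print(float(x) == float(x - 1))

# Integer square root stays exact for arbitrarily large ints.
print(math.isqrt(x * x) == x)
```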
I have to write a function, s(x) = x * sin(3/x) in python that is capable of taking single values or vectors/arrays, but I'm having a little trouble handling the cases when x is zero (or has an element that's zero). This is what I have so far:
def s(x):
    result = zeros(size(x))
    for a in range(0,size(x)):
        if (x[a] == 0):
            result[a] = 0
        else:
            result[a] = float(x[a] * sin(3.0/x[a]))
    return result
Which...doesn't work for x = 0. And it's kinda messy. Even worse, I'm unable to use sympy's integrate function on it, or use it in my own simpson/trapezoidal rule code. Any ideas?
When I use integrate() on this function, I get the following error message: "Symbol" object does not support indexing.
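For the numerical route, one option is to make `s` safe at zero for scalars and then feed it to a hand-written rule. A rough sketch, assuming a composite Simpson rule is acceptable and that defining s(0) = 0 (the limit of x*sin(3/x) as x tends to 0) is what is wanted; `simpson` and its parameters are illustrative names:

```python
import math


def s(x):
    """x * sin(3/x), with the removable singularity at 0 filled in as 0."""
    return 0.0 if x == 0 else x * math.sin(3.0 / x)


def simpson(f, a, b, n=20000):
    """Composite Simpson rule with n (even) subintervals."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        # Odd interior points get weight 4, even ones weight 2.
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


print(simpson(s, 0.0, 2.0))
```

The oscillations near 0 are not resolved by a fixed step, so treat the last digits of the result as approximate.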
This takes about 30 seconds per integrate call:
import sympy as sp
x = sp.Symbol('x')
int2 = sp.integrate(x*sp.sin(3./x),(x,0.000001,2)).evalf(8)
print(int2)
int1 = sp.integrate(x*sp.sin(3./x),(x,0,2)).evalf(8)
print(int1)
The results are:
1.0996940
-4.5*Si(zoo) + 8.1682775
Clearly you want to start the integration from a small positive number to avoid the problem at x = 0.
You can also assign x*sin(3./x) to a variable, e.g.:
s = x*sp.sin(3./x)
int1 = sp.integrate(s, (x, 0.00001, 2))
My original answer using scipy to compute the integral:
import scipy.integrate
import math

def s(x):
    # Treat values very close to 0 as 0 to avoid dividing by zero
    if abs(x) < 0.00001:
        return 0
    else:
        return x*math.sin(3.0/x)

s_exact = scipy.integrate.quad(s, 0, 2)
print(s_exact)
See the scipy docs for more integration options.
If you want to use SymPy's integrate, you need a symbolic function. A wrong value at a point doesn't really matter for integration (at least mathematically), so you shouldn't worry about it.
It seems there is a bug in SymPy that gives an answer in terms of zoo at 0, because it isn't using limit correctly. You'll need to compute the limits manually. For example, the integral from 0 to 1:
In [14]: res = integrate(x*sin(3/x), x)
In [15]: ans = limit(res, x, 1) - limit(res, x, 0)
In [16]: ans
Out[16]:
  9⋅π   3⋅cos(3)   sin(3)   9⋅Si(3)
- ─── + ──────── + ────── + ───────
   4       2         2         2
In [17]: ans.evalf()
Out[17]: -0.164075835450162
I'm a relative newcomer to programming: I was educated as a mathematician and have no experience with Python. I would like to know how to solve the following problem in Python, which came up while I was studying a maths problem on my own:
The program asks for a positive integer m. If m is of the form 2^n-1, it returns T(m)=n*2^{n-1}. Otherwise it writes m in the form 2^n+x, where -1 < x < 2^n, and returns T(m)=T(2^n-1)+x+1+T(x). Finally it outputs the answer.
I thought this was a neat problem so I attempted a solution. As far as I can tell, this satisfies the parameters in the original question.
#!/usr/bin/python
import math

def calculate(m: int) -> int:
    """
    >>> calculate(10)
    20
    >>> calculate(100)
    329
    >>> calculate(1.2)
    >>> calculate(-1)
    """
    if (m <= 0 or math.modf(m)[0] != 0):
        return None
    n, x = decompose(m + 1)
    if (x == 0):
        return n * 2**(n - 1)
    else:
        return calculate(2**n - 1) + x + 1 + calculate(x)

def decompose(m: int) -> (int, int):
    """
    Returns two numbers (n, x), where
    m = 2**n + x and -1 < x < 2^n
    """
    n = int(math.log(m, 2))
    return (n, m - 2**n)

if __name__ == "__main__":
    import doctest
    doctest.testmod(verbose = True)
Assuming the numbers included in the calculate function's unit tests are the correct results for the problem, this solution should be accurate. Feedback is most welcome, of course.
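Read literally, the question writes m itself (not m + 1) as 2^n + x, and under that reading the recurrence gives 17 for m = 10 rather than the 20 in the doctest above. A minimal sketch of that literal reading, offered as an alternative interpretation rather than a correction; it assumes T(0) = 0 (the n = 0 case of the first rule) and uses `bit_length` instead of a floating-point logarithm. The brute-force helper `ones_up_to` is only there as a cross-check:

```python
def T(m):
    """T(2^n - 1) = n * 2^(n-1); otherwise m = 2^n + x, T(m) = T(2^n - 1) + x + 1 + T(x)."""
    if m < 0:
        raise ValueError("m must be non-negative")
    if (m & (m + 1)) == 0:
        # m is of the form 2^n - 1 (this includes m = 0, where n = 0).
        n = (m + 1).bit_length() - 1
        return n * 2 ** n // 2
    n = m.bit_length() - 1      # largest n with 2^n <= m
    x = m - 2 ** n              # 0 <= x < 2^n
    return T(2 ** n - 1) + x + 1 + T(x)


def ones_up_to(m):
    """Total number of 1 bits in the binary forms of 0..m (brute force)."""
    return sum(bin(k).count("1") for k in range(m + 1))


print(T(10), ones_up_to(10))
```

For the small values tried here the two functions agree, which suggests the recurrence counts the 1 bits in the binary representations of 0..m.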