Calculate the midpoints of a vector using Python

I have started learning Python from scratch and ran into a problem with the following exercise.
I have the vector x_vector = (0,1,2,3,4,5,6,7,8,9). From it, I need to create the new vector x1 = (-0.5,0.5,1.5,2.5,3.5,4.5,5.5,6.5,7.5,8.5,9.5).
In other words, the desired vector should start with the first element minus 0.5, then contain the midpoints between each pair of adjacent elements, and end with the last element plus 0.5.
The code I have tried so far is as follows:
import numpy as np
x_vector=np.array([0,1,2,3,4,5,6,7,8,9])
x=len(x_vector)
mid=np.zeros(x+1)
for i in range (0,x):
    if i==0 :
        mid[i]= x_vector[i]-0.5
    else :
        mid[i]=(x_vector[i] + x_vector[i+1])/2
    i +=1
This doesn't give the desired output. Can anyone help me figure out what I should do to get the correct output?

Using the pairwise recipe from the itertools documentation:
from itertools import tee
def pairwise(iterable):
    "s -> (s0,s1), (s1,s2), (s2, s3), ..."
    a, b = tee(iterable)
    next(b, None)
    return zip(a, b)
res = []
res.append(min(x_vector)-0.5)
res.append(max(x_vector)+0.5)
res.extend([np.mean(z) for z in pairwise(x_vector)])
sorted(res)
Output:
[-0.5, 0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5, 8.5, 9.5]

Consider what happens for i = 0 and i = 1 in your loop:
mid[0] = x_vector[0] - 0.5 # = -0.5
mid[1] = (x_vector[1] + x_vector[2]) / 2 # (1 + 2) / 2 = 3 / 2 = 1 (or 1.5 in Python 3)
Your indexes are off by one: mid[i] should be the midpoint of x_vector[i - 1] and x_vector[i]. The loop also has to run one step further so that the last element of mid gets filled.
Try this:
for i in range(0, x + 1):
    if i == 0:
        mid[i] = x_vector[i] - 0.5
    elif i == x:
        mid[i] = x_vector[i - 1] + 0.5
    else:
        mid[i] = (x_vector[i - 1] + x_vector[i]) / 2.0
Note that I changed the division to divide by 2.0 instead of 2. This makes sure the result is a float (a number with a fractional part) rather than an integer: in Python 2, dividing two integers rounds down to an integer.
Also, i += 1 is redundant: the for loop reassigns i on every iteration, overwriting your += 1 statement.

It is not clear whether this is homework, but given that you are using numpy I think it is fair game to use its full potential. In this case you can just do:
import numpy as np
x_vector=np.array([0,1,2,3,4,5,6,7,8,9])
a = np.insert(x_vector, 0, x_vector[0] - 1)
b = np.append(x_vector, x_vector[-1] + 1)
mid = (a + b) / 2.0
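An alternative along the same lines, shown only as a minimal sketch: compute the inner midpoints by slicing and attach the two outer values explicitly, so the result does not depend on the input being spaced exactly 1 apart. The function name midpoint_edges and the pad parameter are made up for this illustration and do not come from the question.

```python
import numpy as np

def midpoint_edges(x, pad=0.5):
    """Return first - pad, the midpoints of adjacent elements, and last + pad."""
    x = np.asarray(x, dtype=float)
    # Midpoints between neighbours: pair x[0..n-2] with x[1..n-1].
    inner = (x[:-1] + x[1:]) / 2.0
    # Attach the two outer values on either side.
    return np.concatenate(([x[0] - pad], inner, [x[-1] + pad]))

x_vector = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
print(midpoint_edges(x_vector))
```

For the vector in the question this yields the eleven values from -0.5 to 9.5.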

Related

Find keys of the largest 4 values in a dictionary

Here's part of my Python code. I'm trying to find the 4 consecutive values in spec that yield the maximum sum, and then compute the weighted average of the corresponding 4 keys:
spec = {1.5:8, 1.3:9, 4.3:7, 3.2:3, 5.3:5, 4:1, 5.2:6, 4.2:4, 2.5:9}
k = 4
consecutive_elements = zip(*(islice(spec.values(), x, None) for x in range(k)))
max(map(sum, consecutive_elements)) # The maximum sum.
Wavg = np.average(list(???.keys()), weights=list(???.values())) # The weighted average
I'm not sure how I can access the 4 keys. In this case, they should be 1.5, 1.3, 4.3, 3.2 since the sum of their values is 27 (the maximum). Which tools should I use?
Your data is not really a mapping so much as a pair of sequences with corresponding elements. I would recommend treating it as such:
keys = list(spec.keys())
vals = list(spec.values())
From here it should be fairly clear how to proceed, using something like a pure-Python argmax:
from itertools import islice
from operator import itemgetter
consecutive_elements = zip(*(islice(spec.values(), x, None) for x in range(k)))
idx = max(enumerate(map(sum, consecutive_elements)), key=itemgetter(1))[0] # Start index of the window with the maximum sum.
Wavg = sum(a * b for a, b in zip(keys[idx:idx + k], vals[idx:idx + k])) / sum(vals[idx:idx + k])
That being said, if you're using numpy anyway, I'd say commit to it:
keys = np.fromiter(spec.keys(), float, count=len(spec))
vals = np.fromiter(spec.values(), float, count=len(spec))
idx = np.convolve(vals, np.ones(k), 'valid').argmax()
Wavg = np.average(keys[idx:idx + k], weights=vals[idx:idx + k])
The results from both versions are exactly identical:
2.348148148148148
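To make the np.convolve step less opaque, here is a small self-contained sketch that prints the sliding-window sums it produces for the data in the question. It assumes Python 3.7+ so that the dictionary keeps its insertion order; the names window_sums and best_keys are only illustrative.

```python
import numpy as np

spec = {1.5: 8, 1.3: 9, 4.3: 7, 3.2: 3, 5.3: 5, 4: 1, 5.2: 6, 4.2: 4, 2.5: 9}
k = 4

# Assumption: dict insertion order is preserved (Python 3.7+).
keys = np.array(list(spec.keys()), dtype=float)
vals = np.array(list(spec.values()), dtype=float)

# Convolving with k ones gives the sum of every run of k consecutive values.
window_sums = np.convolve(vals, np.ones(k), "valid")
idx = int(window_sums.argmax())          # start of the best window

best_keys = keys[idx:idx + k]
wavg = np.average(best_keys, weights=vals[idx:idx + k])

print(window_sums)   # one sum per window position
print(best_keys)     # the keys the question asks for
print(wavg)
```

The first window has the largest sum (27), so best_keys holds 1.5, 1.3, 4.3 and 3.2, as expected in the question.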

Split a Decimal number into a random array where the sum of the numbers equals the split number

I want to split a decimal number into a random array whose elements sum to the original number:
# Call a function which receives a decimal number
from decimal import Decimal
from something import split_random_decimal
split_decimal = split_random_decimal(Decimal('10.00'))
print(split_decimal)
# Output: [1.3, 0.7, 1.2, 0.8, 1.0, 1.5, 0.5, 1.9, 0.1, 1.0]
print(sum(split_decimal))
# Output: Decimal('10.00') - The original decimal value
Does anyone have an idea how I could do this in pure Python without using a library?
Solved!
Thanks to everyone who helped me. The final code that saved my life is this:
import random
def random_by_number(number, min_random, max_random, spaces=1, precision=2):
    if spaces <= 0:
        return number
    random_numbers = [random.uniform(min_random, max_random) for i in range(0, spaces)]
    increment_number = (number - sum(random_numbers)) / spaces
    return [round(n + increment_number, precision) for n in random_numbers]
number = 2500.50
spaces = 30
max_random = number / spaces
min_random = max_random * 0.6
random_numbers = random_by_number(number, min_random, max_random, spaces=spaces, precision=2)
print(random_numbers)
print(len(random_numbers))
print(sum(random_numbers))
You could start with something like:
import random
numberLeft = 10.0
decList = list()
while numberLeft > 0:
    cur = random.uniform(0, numberLeft)
    decList.append(cur)
    numberLeft -= cur
This implementation tends to choose larger random numbers first, which would not be hard to change.
numberLeft will essentially never hit exactly 0, so you need to do something with rounding. Alternatively, wait for numberLeft to get low enough and make the remainder the last random number in the list.
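The "stop when little is left" idea could look roughly like the sketch below. The function name split_with_remainder and the parameters min_left and places are assumptions made for this example; they are not part of the answer above.

```python
import random

def split_with_remainder(total, min_left=0.5, places=2):
    """Split total into random pieces; the last piece is whatever remains."""
    pieces = []
    left = total
    while left > min_left:
        cur = round(random.uniform(0, left), places)
        if cur == 0:
            continue                      # skip pieces that round down to zero
        pieces.append(cur)
        left = round(left - cur, places)  # keep the remainder on the same grid
    if left > 0:
        pieces.append(left)               # the remainder closes the list
    return pieces

parts = split_with_remainder(10.0)
print(parts)
print(sum(parts))
```

Because the pieces are floats, the printed sum can differ from the total by a tiny floating-point error; working with Decimal avoids that.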
The problem is a little underdefined: into how many pieces should the number be split, and how large may any piece be? Should the values be positive only? An approximate solution from what you've said would be to pick a number of pieces (defaulting to 10) and draw the values from a normal distribution around the average piece size, with a standard deviation of 1/10 of the average:
from decimal import Decimal
def split_random_decimal(x, n=10):
    assert n > 0
    if n == 1:
        return [x]
    from random import gauss
    mu = float(x)/n
    s = mu/10
    if '.' in str(x):
        p = len(str(x)) - str(x).find('.') - 1
    else:
        p = 0
    rv = [Decimal(str(round(gauss(mu, s), p))) for i in range(n-1)]
    rv.append(x - sum(rv))
    return rv
>>> splited_decimal = split_random_decimal(Decimal('10.00'))
>>> print(splited_decimal)
[Decimal('0.84'), Decimal('1.08'), Decimal('0.85'), Decimal('1.04'),
 Decimal('0.96'), Decimal('1.2'), Decimal('0.9'), Decimal('1.09'),
 Decimal('1.08'), Decimal('0.96')]
I think this is what you're looking for:
import random as r
def random_sum_to(n, num_terms = None):
    n = n*100
    num_terms = (num_terms or r.randint(2, n)) - 1
    a = r.sample(range(1, n), num_terms) + [0, n]
    list.sort(a)
    return [(a[i+1] - a[i])/100.00 for i in range(len(a) - 1)]
print(random_sum_to(20, 3)) # [8.11, 3.21, 8.68] example
print(random_sum_to(20, 5)) # [5.21, 7.57, 0.43, 3.83, 2.96] example
print(random_sum_to(20)) # [1 ,2 ,1 ,4, 4, 2, 2, 1, 3] example
n is the number you are summing to, and num_terms is the length of the list you would like as a result. As the last example shows, if you don't specify num_terms, a random number of terms is chosen for you.
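The same cut-point idea can be carried over to Decimal so that the pieces add up to the original value exactly, which is what the question asks for. This is only a sketch under stated assumptions: the name split_decimal is invented here, the input is taken to be a positive Decimal, and its own number of decimal places defines the smallest unit.

```python
import random
from decimal import Decimal

def split_decimal(total, n=10):
    """Split a positive Decimal into n positive Decimals that sum to it exactly."""
    # Smallest unit implied by the input, e.g. Decimal('0.01') for Decimal('10.00').
    unit = Decimal(1).scaleb(total.as_tuple().exponent)
    units = int(total / unit)             # the total expressed in whole units
    # n - 1 distinct cut points strictly inside (0, units) give n positive gaps.
    cuts = sorted(random.sample(range(1, units), n - 1))
    bounds = [0] + cuts + [units]
    return [(hi - lo) * unit for lo, hi in zip(bounds, bounds[1:])]

parts = split_decimal(Decimal('10.00'))
print(parts)
print(sum(parts))
```

Since all arithmetic happens on integers and Decimals, no rounding step is needed. The call fails with a ValueError from random.sample if n is larger than the number of available units.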

Function "Interval_point"

I have a task for my Python 101 course at uni which is as follows:
Create a function interval_point(a, b, x) that takes three numbers and interprets a and b as the start and end point of an interval, and x as a fraction between 0 and 1 that determines how far to go towards b, starting at a.
Examples (IPython):
In [ ]: interval_point(100, 200, 0.5)
Out[ ]: 150.0
In [ ]: interval_point(100, 200, 0.2)
Out[ ]: 120.0
I came up with this:
def interval_point(a, b, x):
    """takes three numbers and interprets a and b as the start and end point
    of an interval, and x as a fraction between 0 and 1 that determines how
    far to go towards b, starting at a"""
    if (a == b):
        return a
    if (x == 0):
        return a
    if (x > 0):
        return((abs(a - b) + a) * x)
It passed most of the tests in the automated test system, but it couldn't deal with a or b being negative.
Can someone suggest how I could write a function that handles negative numbers, including the case where a is negative and b is positive?
Here is the report from the automated test system:
Test failure report
test_interval_point
def test_interval_point():
    #if x=0, we expect to get value a back
    assert s.interval_point(1.0, 2.0, 0.0) == 1.0
    #if x=1, we expect to get value b back
    assert s.interval_point(1.0, 2.0, 1.0) == 2.0
    #test half-way, expect (a+b)/2
    assert s.interval_point(1.0, 2.0, 0.5) == 1.5
    #test trivial case of a=b
    a, b = 1., 1.
    x = 0.0
    assert s.interval_point(a, b, x) == a
    x = 1.0
    assert s.interval_point(a, b, x) == a
    x = 0.5
    assert s.interval_point(a, b, x) == a
    #test for negative numbers
    assert s.interval_point(-2.0, -1.0, 0.5) == -1.5
    #test for negative numbers
    assert s.interval_point(-2.0, -1.0, 0.0) == -2.0
    #test for negative numbers
    assert s.interval_point(-2.0, -1.0, 1.0) == -1.0
    #test for positive and negative limits
    assert s.interval_point(-10, 10, 0.25) == -5.0
In addition to Goyo's comment, I would personally not check for edge cases:
def interval_point(a, b, x):
    return (b - a) * x + a
This should work for all values of a, b and x, including your special cases a == b and x == 0. If there is any performance gain from those checks, it will be negligible in most cases.
This version will also work if x is not in [0, 1], for example:
>>> interval_point(100, 200, 2)
300
>>> interval_point(100, 200, -1)
0
That is probably acceptable. If not, your function should check that 0 <= x <= 1, because your current implementation returns nothing (None) if x < 0.
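If the range check is wanted, one possible shape is sketched below. The name interval_point_checked and the choice of raising ValueError are assumptions for this example; the task description does not say what should happen for x outside [0, 1].

```python
def interval_point_checked(a, b, x):
    """Point a fraction x of the way from a to b, with x restricted to [0, 1]."""
    if not 0 <= x <= 1:
        # Assumption: out-of-range fractions are treated as an error.
        raise ValueError("x must be between 0 and 1")
    return a + (b - a) * x

print(interval_point_checked(100, 200, 0.5))
print(interval_point_checked(-10, 10, 0.25))
```

The formula itself is unchanged, so negative endpoints and a == b need no special handling.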

Writing a function for x * sin(3/x) in python

I have to write a function, s(x) = x * sin(3/x) in python that is capable of taking single values or vectors/arrays, but I'm having a little trouble handling the cases when x is zero (or has an element that's zero). This is what I have so far:
def s(x):
    result = zeros(size(x))
    for a in range(0,size(x)):
        if (x[a] == 0):
            result[a] = 0
        else:
            result[a] = float(x[a] * sin(3.0/x[a]))
    return result
Which...doesn't work for x = 0. And it's kinda messy. Even worse, I'm unable to use sympy's integrate function on it, or use it in my own simpson/trapezoidal rule code. Any ideas?
When I use integrate() on this function, I get the following error message: "Symbol" object does not support indexing.
This takes about 30 seconds per integrate call:
import sympy as sp
x = sp.Symbol('x')
int2 = sp.integrate(x*sp.sin(3./x),(x,0.000001,2)).evalf(8)
print(int2)
int1 = sp.integrate(x*sp.sin(3./x),(x,0,2)).evalf(8)
print(int1)
The results are:
1.0996940
-4.5*Si(zoo) + 8.1682775
Clearly you want to start the integration from a small positive number to avoid the problem at x = 0.
You can also assign x*sp.sin(3./x) to a variable, e.g.:
s = x*sp.sin(3./x)
int1 = sp.integrate(s, (x, 0.00001, 2))
My original answer, using scipy to compute the integral:
import scipy.integrate
import math
def s(x):
    if abs(x) < 0.00001:
        return 0
    else:
        return x*math.sin(3.0/x)
s_exact = scipy.integrate.quad(s, 0, 2)
print(s_exact)
See the scipy docs for more integration options.
If you want to use SymPy's integrate, you need a symbolic function. A wrong value at a point doesn't really matter for integration (at least mathematically), so you shouldn't worry about it.
It seems there is a bug in SymPy that gives an answer in terms of zoo at 0, because it isn't using limit correctly. You'll need to compute the limits manually. For example, the integral from 0 to 1:
In [14]: res = integrate(x*sin(3/x), x)
In [15]: ans = limit(res, x, 1) - limit(res, x, 0)
In [16]: ans
Out[16]:
  9⋅π   3⋅cos(3)   sin(3)   9⋅Si(3)
- ─── + ──────── + ────── + ───────
   4       2         2         2
In [17]: ans.evalf()
Out[17]: -0.164075835450162
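For the part of the question about accepting both single values and arrays, a NumPy-only sketch is given below. The name s_vec is made up for this example, and the function is meant for numerical code such as a hand-written Simpson or trapezoidal rule; it will not accept SymPy symbols.

```python
import numpy as np

def s_vec(x):
    """x * sin(3/x) for scalars or arrays, defined as 0 where x == 0."""
    x = np.asarray(x, dtype=float)
    # 3/x is evaluated everywhere, including x == 0, so silence those warnings;
    # np.where then replaces the resulting nan entries with 0.
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.where(x == 0, 0.0, x * np.sin(3.0 / x))

print(s_vec(0.0))
print(s_vec([0.0, 1.0, 2.0]))
```

Using 0 at x = 0 matches the limit of the function there, since |x * sin(3/x)| <= |x|.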

Imitating 'ppoints' R function in python

The R ppoints function is described as:
Ordinates for Probability Plotting
Description:
Generates the sequence of probability points ‘(1:m - a)/(m +
(1-a)-a)’ where ‘m’ is either ‘n’, if ‘length(n)==1’, or
‘length(n)’.
Usage:
ppoints(n, a = ifelse(n <= 10, 3/8, 1/2))
...
I've been trying to replicate this function in python and I have a couple of doubts.
1- The first m in (1:m - a)/(m + (1-a)-a) is always an integer: int(n) (ie: the integer of n) if length(n)==1 and length(n) otherwise.
2- The second m in the same equation is NOT an integer if length(n)==1 (it assumes the real value of n) and it IS an integer (length(n)) otherwise.
3- The n in a = ifelse(n <= 10, 3/8, 1/2) is the real number n if length(n)==1 and the integer length(n) otherwise.
These points are not made clear at all in the description, and I'd very much appreciate it if someone could confirm that this is the case.
Add
This was initially posted at https://stats.stackexchange.com/ because I was hoping to get the input of statisticians who work with the ppoints function. Since it has been migrated here, I'll paste below the function I wrote to replicate ppoints in Python. I've tested it and both seem to give back the same results, but it would be great if someone could clarify the points made above, because the function's description does not make them clear at all.
def ppoints(vector):
    '''
    Mimics R's function 'ppoints'.
    '''
    m_range = int(vector[0]) if len(vector)==1 else len(vector)
    n = vector[0] if len(vector)==1 else len(vector)
    a = 3./8. if n <= 10 else 1./2
    m_value = n if len(vector)==1 else m_range
    pp_list = [((m+1)-a)/(m_value+(1-a)-a) for m in range(m_range)]
    return pp_list
I would implement this with numpy:
import numpy as np
def ppoints(n, a):
    """ numpy analogue of `R`'s `ppoints` function
        see details at http://stat.ethz.ch/R-manual/R-patched/library/stats/html/ppoints.html
        :param n: array type or number"""
    try:
        n = float(len(n))
    except TypeError:
        n = float(n)
    return (np.arange(n) + 1 - a)/(n + 1 - 2*a)
Sample output:
>>> ppoints(5, 1./2)
array([ 0.1, 0.3, 0.5, 0.7, 0.9])
>>> ppoints(5, 1./4)
array([ 0.13636364, 0.31818182, 0.5 , 0.68181818, 0.86363636])
>>> n = 10
>>> a = 3./8. if n <= 10 else 1./2
>>> ppoints(n, a)
array([ 0.06097561, 0.15853659, 0.25609756, 0.35365854, 0.45121951,
0.54878049, 0.64634146, 0.74390244, 0.84146341, 0.93902439])
One can use R-Fiddle to test the implementation against R.
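A variant that also picks a by itself, following the default quoted from the R documentation in the question (3/8 when n <= 10, otherwise 1/2), might look like the sketch below. The name ppoints_default is invented here, and the sketch assumes n is either a whole number or a sequence whose length is used; it does not try to reproduce R's handling of a non-integer scalar n.

```python
import numpy as np

def ppoints_default(n, a=None):
    """Probability points (i - a) / (m + 1 - 2a) for i = 1..m."""
    # Assumption: a sequence argument stands for its length, a number for itself.
    m = len(n) if hasattr(n, "__len__") else int(n)
    if a is None:
        # Default taken from the usage line quoted in the question.
        a = 3.0 / 8.0 if m <= 10 else 0.5
    return (np.arange(1, m + 1) - a) / (m + 1 - 2 * a)

print(ppoints_default(5, 0.5))
print(ppoints_default(10))
```

With n = 10 and no explicit a, this gives the same values as the last sample output above.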
