Python - How to get integrated values from np.array?

I am trying to get integrated values from an np.array (a list of values), not the area under the function, but the values themselves. I have values of acceleration and want to get values of velocity.
So let's say I have an array like:
a_x = np.array([111.2, 323.2, 123.3, 99.38, 65.23, -0.19, -34.67])
And I try to integrate this array to get the values of velocity.
If I use, let's say, simps, quad, or trapz, I get one number (the area).
So how do you integrate np.array values and get integrated values that you can store in a list?

You can't get that the way you want it, because of the math behind it. If you are given only the acceleration, then from the equation

v(t) = ∫ a(t) dt + C

you are only able to find an INDEFINITE integral: you know the acceleration, but you don't know the starting conditions, so your solution isn't unique.
As the solution to each such question is "find velocity given an acceleration", the answer is v(t) = ∫ a(t) dt + C. If your acceleration is constant, it doesn't depend on t and this can be written as v(t) = at + C; but still, we don't know how long the acceleration lasted or what the starting condition is.
But answering the question about getting values which can be stored in a list - you do it by indexing your values of np.array:
import numpy as np
a_x = np.array([111.2,323.2,123.3])
#Gets first value
print(a_x[0])
If I use, let's say, simps, quad, or trapz, I get one number (the area).

That's because quad, simps, and trapz are methods which, given a set of points, return the value of the integral over those points computed with the corresponding method. For example:
numpy.trapz(y, x=None, dx=1.0, axis=-1)
If x isn't specified (as in your case), it uses the trapezoidal rule to estimate the area under the given points y, assuming an equal spacing dx between them. It has to return one value.
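That said, if what you are after is a running (cumulative) integral, one velocity value per sample, you can accumulate the trapezoid areas yourself. A minimal sketch with plain numpy, where the time step dt and the starting velocity v0 are assumptions (the question gives neither; v0 plays the role of the integration constant C):

```python
import numpy as np

a_x = np.array([111.2, 323.2, 123.3, 99.38, 65.23, -0.19, -34.67])
dt = 0.1   # assumed uniform sampling interval (not given in the question)
v0 = 0.0   # assumed starting velocity -- the integration constant C

# Trapezoid area of each sampling interval, accumulated into a running sum:
increments = 0.5 * (a_x[1:] + a_x[:-1]) * dt
v = v0 + np.concatenate(([0.0], np.cumsum(increments)))
print(v)  # one velocity value per sample of a_x
```

scipy.integrate.cumulative_trapezoid does the same accumulation if you'd rather not write it by hand.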

Related

How to avoid numerical overflow in log of absolute value of sum of cubes?

I'm interested in computing the quantity log(| sum_i [(x_i)^3] |). The problem with directly using np.log(abs((x**3).sum())) (where x is an array and x**3 applies the cube function element-wise) is that some values in x**3 could be so large that they cause numerical issues.
My plan is to use logsumexp trick. However, the absolute value outside of the sum is hard to get rid of. Any help?
We can use a little bit of math to avoid numerical overflow.
Suppose x is a numpy array.
The problem comes from the abs((x**3).sum()), specifically the cubing operation. We can make the computation more stable by scaling down each number in x by a constant. Because we are dividing by a constant inside the array before the cubing, we need to multiply by the constant cubed outside the summation.
In other words:
abs((x**3).sum()) = (constant**3)*abs(((x/constant)**3).sum())
Using properties of logs (log(c**3) = 3*log(c)), you can simplify your final expression to the following:
3*np.log(constant) + np.log(abs(((x/constant)**3).sum()))
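A minimal sketch putting this together. The choice of constant here, the largest magnitude in x, is mine, not from the question; it is convenient because it keeps every scaled cube inside [-1, 1]:

```python
import numpy as np

def log_abs_sum_cubes(x):
    # Scale by the largest magnitude so every (x/c)**3 lies in [-1, 1],
    # then add the log of the scale factor back: log(c**3) = 3*log(c).
    c = np.abs(x).max()
    if c == 0.0:
        return -np.inf  # the sum is exactly zero
    return 3.0 * np.log(c) + np.log(np.abs(((x / c) ** 3).sum()))

print(log_abs_sum_cubes(np.array([2.0, 3.0, -1.0])))   # log|8 + 27 - 1| = log(34)
print(log_abs_sum_cubes(np.array([1e200, 1e200])))     # finite, although 1e200**3 would overflow
```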

Callable variable in a loop

I'm having trouble in a loop.
I have a bunch of points (a 5-D space) saved in an array
Coords=[]
Coords.append(zip(x,y,z,t,w))
Coords=np.array(Coords,float)
so that I can call all coords of 1 particle by, for example
Coords[0,0] = [0., 0., 0., 0., 1.]
Coords[0,1] = [0.1, 0., 0., 0., 0.9]
Coords[0,1,0] = 0.1
Now, I need to calculate some properties for every particle, so I create a dictionary, where for every key (i.e. every particle) I compute something:
A = {}
Aa = np.arange(0.0, 1.0 + 0.001, 0.001)
for s in range(len(Coords[0])):
    A[s] = []
    for a in Aa:
        if Coords[0,s,2] >= a and np.sqrt(Coords[0,s,0]*Coords[0,s,4]) >= a:
            A[s].append(a)
Here I get the proper dictionary, so I'm calling the variables Coords[0,s,0] and Coords[0,s,4] properly; there is no problem.
Now, this is where I have problems.
I need to compute another property for every particle for every value in A, therefore I create a dictionary of dictionaries.
L = {}
for s in range(len(Coords[0])):
    L[s] = {}
    for a in A[s]:
        L[s][a] = []
        for i in Aa:
            if (Coords[0,s,0]-i)*(Coords[0,s,4]-i) - a**2 == 0:
                L[s][a].append(i)
Now I have a problem: the Coords variables are not read properly, and there are missing values.
For example, the Coords[0,2,0]=0.1 and Coords[0,2,4]=0.6 should produce two values in the list: 0.1 and 0.6 (for a=0). However, in the list only appears the value 0.1, like the variable Coords[0,2,4]=0.6 doesn't exist.
However, if I write the if condition by hand, like (for a=0)
if (0.1-i)*(0.6-i)-a**2==0
then I get the proper values.
Does anyone know why this is happening? Is it because I have dictionaries inside dictionaries?
Thanks.
In your second condition:
(Coords[0,s,0]-i)*(Coords[0,s,4]-i) - a**2 == 0
you are testing floating-point values for exact equality, which fails because of rounding error. Try using a tolerance for the comparison instead, something like:
abs((Coords[0,s,0]-i)*(Coords[0,s,4]-i) - a**2) < 10**-10
There's a more detailed description here:
Rounding errors with floats in Python using Numpy
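A quick illustration of why exact equality on floats misfires, and why a tolerance (or math.isclose) behaves as expected:

```python
import math

# 0.1 and 0.2 have no exact binary representation, so their sum
# is not exactly 0.3:
print(0.1 + 0.2 == 0.3)                    # False
print(0.1 + 0.2)                           # 0.30000000000000004

# Comparing with a tolerance works as expected:
print(abs((0.1 + 0.2) - 0.3) < 10**-10)    # True
print(math.isclose(0.1 + 0.2, 0.3))        # True
```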

how to make specific polynomials in a set with python

I am working with SageMath, which uses the Python language. In the polynomial ring over the integer ring, I want to define a set whose elements have degree at most a given bound and coefficients with absolute value at most a given bound.
How can I achieve this? For a polynomial, I have already defined the degree function and the max_coefficient function.
For example,
(x^3-3*x-5).degree(x) will return 3
max_coefficient(x^3-3*x-5) will return 5
The following is my code:
R = PolynomialRing(ZZ, 'x')
def A(deg_bound, coefficient_bound):
    S = set()
    for poly in R:
        if poly.degree(x) <= deg_bound and max_coefficient(poly) <= coefficient_bound:
            S = S.add(poly)
    return S
But sagemath tells me I can't do for in the polynomial ring.
It is rather unclear what sort of objects are in R, but you seem to know how to manipulate them...
There is one error that may be causing you trouble:
S = S.add(poly)
is first adding poly to S, then assigning None to S (set.add returns None), which is rather unfortunate.
Try replacing it with:
S.add(poly)
which accumulates distinct objects into S.
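As for the original goal: R is infinite, so `for poly in R` can never terminate. Since both bounds make your set finite, one way is to enumerate all bounded coefficient vectors directly. A sketch in plain Python with itertools (the function name is mine; in Sage you would turn each tuple into a polynomial with something like R(list(coeffs))):

```python
from itertools import product

def bounded_polys(deg_bound, coeff_bound):
    # Every tuple (c0, ..., c_deg_bound) with |ci| <= coeff_bound stands
    # for the polynomial c0 + c1*x + ... + c_deg_bound*x^deg_bound.
    coeff_range = range(-coeff_bound, coeff_bound + 1)
    return set(product(coeff_range, repeat=deg_bound + 1))

S = bounded_polys(1, 1)
print(len(S))  # 9: three choices (-1, 0, 1) for each of the two coefficients
```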

What does numpy.percentile mean and how to use this for splitting array?

I am trying to understand percentiles in numpy.
import numpy as np
nd_array = np.array([3.6216, 4.5459, -3.5637, -2.5419])
step_intervals = range(100, 0, -5)
for percentile_interval in step_intervals:
    threshold_attr_value = np.percentile(np.array(nd_array), percentile_interval)
    print "percentile interval ={interval}, threshold_attr_value = {threshold_attr_value}, {arr}".format(interval=percentile_interval, threshold_attr_value=threshold_attr_value, arr=sorted(nd_array))
I get a value of these as
percentile interval =100, threshold_attr_value = 4.5459, [-3.5636999999999999, -2.5419, 3.6215999999999999, 4.5458999999999996]
...
percentile interval =5, threshold_attr_value = -3.41043, [-3.5636999999999999, -2.5419, 3.6215999999999999, 4.5458999999999996]
What do the percentile values mean?
100% of the values in the array are < 4.5459?
5% of the values in the array are < -3.41043?
Is that the correct way to read these?
I also want to split the numpy array into small sub-arrays, based on the percentile occurrences of the elements. How can I do this?
To be more precise, you should say that a = np.percentile(arr, q) indicates that nearly q% of the elements of arr are lower than a. Why do I emphasize nearly?
If q=100, it always returns the maximum of arr. So, you cannot say that q% of elements are "lower than" a.
If q=0, it always returns the minimum of arr. So, you cannot say that q% of elements are "lower than or equal to" a.
In addition, the returned value depends on the type of interpolation.
The following code shows the role of interpolation parameter:
>>> import numpy as np
>>> arr = np.array([1,2,3,4,5])
>>> np.percentile(arr, 90) # default interpolation='linear'
4.5999999999999996
>>> np.percentile(arr, 90, interpolation='lower')
4
>>> np.percentile(arr, 90, interpolation='higher')
5
No: as you can see by inspection, only 75% of the values in your array are strictly less than 4.5459, and 25% of the values are strictly less than -3.41043. If you had written "less than or equal to", you would have been giving one common definition of "percentile", which, however, is also not what is applied in your case.

Instead, what's happening is that numpy applies an interpolation scheme to ensure that the mapping taking a given number in [0, 100] to the corresponding percentile is continuous and piecewise linear, while still giving the "right" value at ranks corresponding to values in the given array. As it turns out, even this can be done in many different ways, all of which are reasonable, as described in the Wikipedia article on the subject. As you can see in the documentation of numpy.percentile, you have some control over the interpolation behaviour, and by default it uses what the Wikipedia article calls the "second variant, $C = 1$".
Perhaps the easiest way to understand the implications of this is to simply plot the result of calculating the different values of np.percentile for your fixed length 4 array:
Note how the kinks are spread evenly across [0, 100] and that the percentiles corresponding to the actual values in your array are given by evaluating lambda p: np.percentile(nd_array, p) at 0*100/(4-1), 1*100/(4-1), 2*100/(4-1), and 3*100/(4-1) respectively.
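To address the splitting part of the question: one way (a sketch, the quartile choice is mine) is to take percentile values as bucket edges and assign each element with np.digitize:

```python
import numpy as np

nd_array = np.array([3.6216, 4.5459, -3.5637, -2.5419])

# Quartile edges (25th, 50th, 75th percentiles)...
edges = np.percentile(nd_array, [25, 50, 75])
# ...then assign each element to the bucket it falls into.
bucket = np.digitize(nd_array, edges)
sub_arrays = [nd_array[bucket == b] for b in range(len(edges) + 1)]
print(sub_arrays)  # four sub-arrays, one element each for this input
```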

Getting Keys Within Range/Finding Nearest Neighbor From Dictionary Keys Stored As Tuples

I have a dictionary which has coordinates as keys. They are by default in 3 dimensions, like dictionary[(x,y,z)] = values, but may be in any dimension, so the code can't be hard-coded for 3.
I need to find out if there are other values within a certain radius of a new coordinate, and I ideally need to do it without importing any plugins such as numpy.
My initial thought was to split the input into a cube and check that no points match, but obviously that is limited to integer coordinates and would grow exponentially slower (a radius of 5 would require 729x the processing), and with my initial code taking at least a minute for relatively small values, I can't really afford this.
I heard finding the nearest neighbor may be the best way, and ideally, cutting the keys used down to a range of +- a certain amount would be good, but I don't know how you'd do that when there's more than one point being used. Here's how I'd do it with my current knowledge:
dimensions = 3
minimumDistance = 0.9

# example dictionary + input
dictionary = {}
dictionary[(0, 0, 0)] = []
dictionary[(0, 0, 1)] = []
keyToAdd = [0, 1, 1]

closestMatch = 2**1000
tooClose = False
for key in dictionary:
    # calculate distance to the new point (Pythagoras, any dimension)
    distanceToPoint = sum((key[i] - keyToAdd[i])**2 for i in range(dimensions))**0.5
    # if you want the overall closest match
    if distanceToPoint < closestMatch:
        closestMatch = distanceToPoint
    # if you want to just check it's not within that radius
    if distanceToPoint < minimumDistance:
        tooClose = True
        break
However, performing calculations this way may still run very slow (it must do this to millions of values). I've searched the problem, but most people seem to have simpler sets of data to do this to. If anyone can offer any tips I'd be grateful.
You say you need to determine IF there are any keys within a given radius of a particular point. Thus, you only need to scan the keys, computing the distance of each to the point until you find one within the specified radius. (And if you do comparisons to the square of the radius, you can avoid the square roots needed for the actual distance.)
One optimization would be to pre-filter keys by their largest single-coordinate offset from the point (the Chebyshev distance), since the Euclidean distance can never be less than this: any key whose largest offset already exceeds the radius can be skipped without the more expensive full calculation. (No trigonometry is needed either way.)
If, as you suggest later in the question, you need to handle multiple points, you can obviously process each individually, or you could find the center of those points and sort based on that.
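A minimal pure-Python sketch of this early-exit scan (the function name and the per-coordinate pre-filter are my choices, not from the question):

```python
def too_close(points, new_point, radius):
    # Scan the keys, stopping at the first one within `radius` of
    # new_point. Works for any dimension; no numpy required.
    r2 = radius * radius
    for key in points:
        # Cheap pre-filter: if any single coordinate already differs by
        # more than the radius, the Euclidean distance must exceed it too.
        if any(abs(k - p) > radius for k, p in zip(key, new_point)):
            continue
        # Compare squared distances to avoid taking square roots.
        if sum((k - p) ** 2 for k, p in zip(key, new_point)) < r2:
            return True
    return False

dictionary = {(0, 0, 0): [], (0, 0, 1): []}
print(too_close(dictionary, (0, 1, 1), 0.9))  # False: nearest key is 1.0 away
print(too_close(dictionary, (0, 1, 1), 1.1))  # True: (0, 0, 1) is 1.0 away
```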
