import numpy as np

step = [0.1, 0.2, 0.3, 0.4, 0.5]
static = []
for x in step:
    range = np.arange(5, 10 + x, x)
    static.append(range)
# this returns a list that looks something like this: [[5.,5.1,5.2,...],[5.,5.2,5.4,...],[5.,5.3,5.6,...],...]
I'm trying to create standard and dynamic stop/step ranges from 5.0 to 10.0. For the standard ranges I used a list with the steps and then looped over it to get the different interval lists.
What I want now is to get varying step sizes within the 5.0-10.0 interval. So, for example, from 5.0-7.3 the step size is 0.2, from 7.3-8.3 the step is 0.5, and from 8.3-10.0 let's say the step is 0.8. What I don't understand is how to make the dynamic version run through and get all the possible combinations.
Using a list of steps and a list of "milestones" that we are going to use to determine the start and end points of each np.arange, we can do this:
import numpy as np
def dynamic_range(milestones, steps) -> list:
    start = milestones[0]
    dynamic_range = []
    for end, step in zip(milestones[1:], steps):
        dynamic_range += np.arange(start, end, step).tolist()
        start = end
    return dynamic_range
print(dynamic_range(milestones=(5.0, 7.3, 8.3, 10.0), steps=(0.2, 0.5, 0.8)))
# [5.0, 5.2, 5.4, 5.6, 5.8, 6.0, 6.2, 6.4, 6.6, 6.8, 7.0,
# 7.2, 7.3, 7.8, 8.3, 8.3, 9.1, 9.9]
Note on performance: this answer assumes that you are going to use a few hundred points in your dynamic range. If you want millions of points, we should try another approach with pure numpy and no list concatenation.
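For reference, here is one possible pure-numpy variant (my own sketch, not part of the original answer): it still builds one np.arange per segment, but joins them with a single np.concatenate call instead of repeatedly extending a Python list.
import numpy as np

def dynamic_range_np(milestones, steps):
    # one arange per (start, stop, step) segment, joined once at the end
    segments = [np.arange(start, stop, step)
                for start, stop, step in zip(milestones[:-1], milestones[1:], steps)]
    return np.concatenate(segments)

print(dynamic_range_np((5.0, 7.3, 8.3, 10.0), (0.2, 0.5, 0.8)))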
If you want it to stay within the [5, 10] interval, then don't add x to 10:
import numpy as np
step = [0.1, 0.2, 0.3, 0.4, 0.5]
static = []
for x in step:
    range = np.arange(5, 10, x)
    static.append(range)
print(static)
Dynamic:
import numpy as np

step = [0.1, 0.2, 0.3, 0.4, 0.5]
breakingpoints = [6, 7, 8, 9, 10]
dynamic = []
i = 0
startingPoint = 5
for x in step:
    # print(breakingpoints[i])
    # each segment runs from the last generated point up to the next breaking point
    range = np.arange(startingPoint, breakingpoints[i], x)
    dynamic.append(range)
    i += 1
    # print(range[-1])
    startingPoint = range[-1]
print(dynamic)
I need a function to find all float numbers that have at least two multiples in a given list.
Do you know if an efficient function for this purpose already exists in pandas, scipy or numpy?
Example of expected behavior
Given the list [3.3, 3.4, 4.4, 5.1], I want a function that returns [.2, .3, 1.1, 1.7]
You can do something like:
import itertools
from itertools import chain
from math import sqrt

import numpy as np

l = [3.3, 3.4, 4.4, 5.1]

def divisors(n):
    # Find all divisors of n
    return set(chain.from_iterable((i, n//i) for i in range(1, int(sqrt(n))+1) if n % i == 0))

# Multiply all numbers by 10, make them integers, concatenate the divisors of all numbers
divisors_all = list(itertools.chain(*[list(divisors(int(x*10))) for x in l]))

# Keep divisors shared by at least two numbers, then divide by 10 to undo the scaling
div, counts = np.unique(divisors_all, return_counts=True)
result = div[counts > 1]/10
Output:
array([0.1, 0.2, 0.3, 1.1, 1.7])
This assumes that all numbers have at most one decimal place in the original set.
This keeps 0.1 (the divisor 1, scaled back), since 1 divides everything, but it can be removed easily, as shown below.
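For instance, one way to drop it (a quick sketch working on the result array from above):
result = result[result != 0.1]  # drop the trivial divisor 1 (0.1 after scaling back)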
I think numpy.gcd() can be used to do what your question asks subject to the following clarifying constraints:
the input numbers will be examined to 1 decimal precision
inputs must be > 0
results must be > 1.0
import numpy as np
a = [3.3, 3.4, 4.4, 5.1]
b = [int(10*x) for x in a]
res = {np.gcd(x, y) for i, x in enumerate(b) for j, y in enumerate(b) if i != j}
res = [x/10 for x in res if x > 10]
Output:
[1.1, 1.7]
UPDATE:
To exactly match the results in the question after edit by OP (namely: [.2, .3, 1.1, 1.7]), we can do this:
import numpy as np
a = [3.3, 3.4, 4.4, 5.1]
b = [int(10*x) for x in a]
res = sorted({np.gcd(x, y) / 10 for i, x in enumerate(b) for j, y in enumerate(b) if i != j} - {1 / 10})
Output:
[0.2, 0.3, 1.1, 1.7]
I'm trying to compute the standard deviation of a list vr. The list has 32 entries, each an array of size 3980. Each array holds one value per height (3980 heights).
First I split the data into 15-minute chunks, where the minutes are given in raytimes. raytimes is also a list of size 32 (containing just the time of each observation in vr).
I want the standard deviation computed at each height level, so that I end up with one final array of size 3980. That part works in my code. However, my code does not produce the correct standard deviation values when I test it — the values output to w1sd, w2sd etc. are not correct (although the array is the correct size: 3980 elements). I assume I am mixing up the indices when computing the standard deviation.
Below are example values from the dataset. All data should fall into w1 and w1sd, as the raytimes in this example are all within 15 minutes (< 0.25). I want to compute the standard deviation of the first elements of vr, that is, the standard deviation of 2.0, 3.1 and 2.1, then of the second elements, i.e. the standard deviation of 3.1, 4.1 and nan, etc.
The result for w1sd SHOULD BE [0.497, 0.499, 1.0, 7.5], but instead the code below gives w1sd = [0.497, 0.77, 1.31, 5.301]. Is something wrong with nanstd or with my indexing?
from numpy import nan, nanstd

vr = [
    [2.0, 3.1, 4.1, nan],
    [3.1, 4.1, nan, 5.1],
    [2.1, nan, 6.1, 20.1]
]
Height = [10.0, 20.0, 30.0, 40]
raytimes = [0, 0.1, 0.2]

w1, w2, w3, w4 = [], [], [], []
w1sd, w2sd, w3sd, w4sd = [], [], [], []
for j, h in enumerate(Height):
    for i, t in enumerate(raytimes):
        if raytimes[i] < 0.25:
            w1.append(float(vr[i][j]))
        elif 0.25 <= raytimes[i] < 0.5:
            w2.append(float(vr[i][j]))
        elif 0.5 <= raytimes[i] < 0.75:
            w3.append(float(vr[i][j]))
        else:
            w4.append(float(vr[i][j]))
    w1sd.append(round(nanstd(w1), 3))
    w2sd.append(round(nanstd(w2), 3))
    w3sd.append(round(nanstd(w3), 3))
    w4sd.append(round(nanstd(w4), 3))
w1 = []
w2 = []
w3 = []
w4 = []
I would consider using pandas for this. It is a library that allows for efficient processing of datasets in numpy arrays and takes all the looping and indexing out of your hands.
In this case I would define a dataframe with N_raytimes rows and N_Height columns, which allows you to easily slice and aggregate the data any way you like.
This code gives the expected output.
import pandas as pd
import numpy as np
vr = [
    [2.0, 3.1, 4.1, np.nan],
    [3.1, 4.1, np.nan, 5.1],
    [2.1, np.nan, 6.1, 20.1]
]
Height = [10.0, 20.0, 30.0, 40]
raytimes = [0, 0.1, 0.2]
# Define a dataframe with the data
df = pd.DataFrame(vr, columns=Height, index=raytimes)
df.columns.name = "Height"
df.index.name = "raytimes"
# Split it out (this could be more elegant)
w1 = df[df.index < 0.25]
w2 = df[(df.index >= 0.25) & (df.index < 0.5)]
w3 = df[(df.index >= 0.5) & (df.index < 0.75)]
w4 = df[df.index >= 0.75]
# Compute standard deviations
w1sd = w1.std(axis=0, ddof=0).values
w2sd = w2.std(axis=0, ddof=0).values
w3sd = w3.std(axis=0, ddof=0).values
w4sd = w4.std(axis=0, ddof=0).values
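As a quick check (my addition) for the sample data above, printing the first window's result should give values close to the expected ones (the other windows are empty here):
print(w1sd.round(3))
# roughly: [0.497 0.5   1.    7.5  ]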
Hi, I have an array of float [time, position] coordinates in a sparse format, e.g.
times = [0.1, 0.1, 1.5, 1.9, 1.9, 1.9]
posit = [2.1, 3.5, 0.4, 1.3, 2.7, 3.5]
and an array of velocities, eg
vel = [0.5,0.7,1.0]
I have to multiply each position at the i-th time by the i-th element of vel.
In numpy it's quite simple with a for loop:
import numpy
times = numpy.array([0.1, 0.1, 1.5, 1.9, 1.9, 1.9])
posit = numpy.array([2.1, 3.5, 0.4, 1.3, 2.7, 3.5])
vel = numpy.array([0.5,0.7,1.0])
uniqueTimes = numpy.unique(times, return_index=True)
uniqueIndices = uniqueTimes[1]
uniqueTimes = uniqueTimes[0]
numIndices = numpy.size(uniqueTimes)-1
iterator = numpy.arange(numIndices)+1
for i in iterator:
    posit[uniqueIndices[i-1]:uniqueIndices[i]] = posit[uniqueIndices[i-1]:uniqueIndices[i]]*vel[i-1]
In tensorflow I can gather all the information I need with
import tensorflow as tf
times = tf.constant([0.1, 0.1, 1.5, 1.9, 1.9, 1.9])
posit = tf.constant([2.1, 3.5, 0.4, 1.3, 2.7, 3.5])
vel = tf.constant([0.5,0.7,1.0])
uniqueTimes, uniqueIndices, counts = tf.unique_with_counts(times)
uniqueIndices = tf.cumsum(tf.pad(tf.unique_with_counts(uniqueIndices)[2],[[1,0]]))[:-1]
but I can't figure out how to do the product. With int elements I could use sparse-to-dense tensors and tf.matmul, but with floats I can't.
Moreover, looping is difficult, since map_fn and while_loop require each 'row' to have the same size, but I have a different number of positions at each time. For the same reason I can't process each time separately and update the final positions tensor with tf.concat. Any help? Maybe with scatter_update or Variable assignment?
Following the answer from vijai m, I get differences of up to 1.5% between the numpy and tensorflow code. You can check it using this data:
times [0.1, 0.1, 0.2, 0.2]
posit [58.98962402, 58.9921875, 60.00390625, 60.00878906]
vel [0.99705114,0.99974157]
They return
np: [ 58.81567188 58.8182278 60.00390625 60.00878906]
tf: [ 58.81567001 58.81822586 59.98839951 59.9932785 ]
differences: [ 1.86388465e-06 1.93737304e-06 1.55067444e-02 1.55105566e-02]
Your numpy code doesn't work. I hope this is what you are looking for:
uniqueTimes, uniqueIndices, counts = tf.unique_with_counts(times)
out = tf.gather_nd(vel,uniqueIndices[:,None])*posit
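As a side note (my addition, not part of the original answer), since vel is 1-D the same lookup can also be written with tf.gather:
out = tf.gather(vel, uniqueIndices) * posit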
I wanted to use the built-in range function for floats, but apparently it doesn't work, and from some quick research I understood that there isn't a built-in option for that and that I'll need to code my own function. So I did:
def fltrange(mini, maxi, step):
    lst = []
    while mini < maxi:
        lst.append(mini)
        mini += step
    return lst
rang = fltrange(-20.0, 20.1, 0.1)
print(rang)
input()
but this is what I get: the step should be just 0.1000000..., but instead it's about (sometimes it changes) 0.100000000000001.
Thanks in advance.
Fun fact: 1/10 can't be exactly represented by floating point numbers. The closest you can get is 0.1000000000000000055511151231257827021181583404541015625. The rightmost digits usually get left out when you print them, but they're still there. This explains the accumulation of errors as you continually add more 0.1s to the sum.
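You can see the exact value that gets stored with the decimal module, for example:
from decimal import Decimal

print(Decimal(0.1))
# 0.1000000000000000055511151231257827021181583404541015625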
You can eliminate some inaccuracy (but not all of it) by using a multiplication approach instead of a cumulative sum:
def fltrange(mini, maxi, step):
    lst = []
    width = maxi - mini
    num_steps = int(width/step)
    for i in range(num_steps):
        lst.append(mini + i*step)
    return lst
rang = fltrange(-20.0, 20.1, 0.1)
print(rang)
Result (newlines added by me for clarity):
[-20.0, -19.9, -19.8, -19.7, -19.6, -19.5, -19.4, -19.3, -19.2, -19.1,
-19.0, -18.9, -18.8, -18.7, -18.6, -18.5, -18.4, -18.3, -18.2, -18.1,
-18.0, -17.9, -17.8, -17.7, -17.6, -17.5, -17.4, -17.3, -17.2, -17.1,
-17.0, -16.9, -16.8, -16.7, -16.6, -16.5, -16.4, -16.3, -16.2, -16.1,
-16.0, -15.899999999999999, -15.8, -15.7, -15.6, -15.5, -15.399999999999999, -15.3, -15.2, -15.1, -15.0,
...
19.1, 19.200000000000003, 19.300000000000004, 19.400000000000006, 19.5, 19.6, 19.700000000000003, 19.800000000000004, 19.900000000000006, 20.0]
You can use numpy for it. There are a few functions for your needs.
import numpy as np # of course :)
linspace:
np.linspace(1, 10, num=200)
array([ 1. , 1.04522613, 1.09045226, 1.13567839,
1.18090452, 1.22613065, 1.27135678, 1.31658291,
...
9.68341709, 9.72864322, 9.77386935, 9.81909548,
9.86432161, 9.90954774, 9.95477387, 10. ])
arange:
np.arange(1., 10., 0.1)
array([ 1. , 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2. ,
2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 2.7, 2.8, 2.9, 3. , 3.1,
...
8.7, 8.8, 8.9, 9. , 9.1, 9.2, 9.3, 9.4, 9.5, 9.6, 9.7,
9.8, 9.9])
P.S. Note, however, that these return arrays, not a lazy generator like range in Python 3 (xrange in Python 2.x).
R has a well-known library for permutation tests, i.e. perm.
The example I'm interested in is this:
x <- c(12.6, 11.4, 13.2, 11.2, 9.4, 12.0)
y <- c(16.4, 14.1, 13.4, 15.4, 14.0, 11.3)
permTS(x,y, alternative="two.sided", method="exact.mc", control=permControl(nmc=30000))$p.value
Which prints result with p-value: 0.01999933.
Note that the function permTS allows us to set the number of permutations to 30000.
Is there a similar implementation in Python?
I was looking at Python's perm_stat, but it's not what I'm looking for and it seems to be buggy.
This is a possible implementation of a permutation test using the Monte Carlo method:
import numpy as np

def exact_mc_perm_test(xs, ys, nmc):
    n, k = len(xs), 0
    diff = np.abs(np.mean(xs) - np.mean(ys))
    zs = np.concatenate([xs, ys])
    for j in range(nmc):
        np.random.shuffle(zs)
        k += diff < np.abs(np.mean(zs[:n]) - np.mean(zs[n:]))
    return k / nmc
Note that, given the Monte Carlo nature of the algorithm, you will not get exactly the same number on each run:
>>> xs = np.array([12.6, 11.4, 13.2, 11.2, 9.4, 12.0])
>>> ys = np.array([16.4, 14.1, 13.4, 15.4, 14.0, 11.3])
>>> exact_mc_perm_test(xs, ys, 30000)
0.019466666666666667
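If you need a reproducible number, you can seed numpy's random generator before calling the function (my addition, not part of the original answer):
>>> np.random.seed(42)  # fixes the shuffle order, so repeated runs give the same estimate
>>> exact_mc_perm_test(xs, ys, 30000)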