I have an array of length approximately 12000, something like array([0.3, 0.6, 0.3, 0.5, 0.1, 0.9, 0.4...]). I also have a dataframe column holding values like 2,3,7,3,2,7.... The column has 48 entries, and its values sum to 36.
I want to split the array in proportion to those values, so that each value in the column receives a consecutive slice of the 12000-element array. For example, the first value in the column (= 2) gets a slice of length 12000*(2/36) (maybe [0.3, 0.6, 0.3]), the second value (= 3) gets the next slice of length 12000*(3/36), continuing where the first one left off (something like [0.5, 0.1, 0.9, 0.4]), and so on.
import pandas as pd
import numpy as np
# mock some data
a = np.random.random(12000)
df = pd.DataFrame({'col': np.random.randint(1, 5, 48)})
indices = (len(a) * df.col.to_numpy() / sum(df.col)).cumsum()
indices = np.concatenate(([0], indices)).round().astype(int)
res = []
for s, e in zip(indices[:-1], indices[1:]):
    res.append(a[round(s):round(e)])
# some tests
target_pcts = df.col.to_numpy() / sum(df.col)
realized_pcts = np.array([len(sl) / len(a) for sl in res])
diffs = target_pcts / realized_pcts
assert 0.99 < np.min(diffs) and np.max(diffs) < 1.01
assert all(np.concatenate([*res]) == a)
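The same proportional split can probably be written without an explicit loop. The sketch below is only one possible variant: it uses np.split on the interior cut points, and the names rng, weights, cuts and chunks are made up for illustration (weights stands in for df.col).

```python
import numpy as np

rng = np.random.default_rng(0)      # seeded mock data (assumption, for repeatability)
a = rng.random(12000)
weights = rng.integers(1, 5, 48)    # hypothetical stand-in for df.col

# Cumulative share of the array owed to each weight, rounded to integer cut points.
cuts = (len(a) * weights.cumsum() / weights.sum()).round().astype(int)

# The last cut equals len(a), so only the interior cuts are passed to np.split.
chunks = np.split(a, cuts[:-1])
```

Each element of chunks is a view into a, so no data is copied.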
How do I iterate between 0 and 1 by a step of 0.1?
This says that the step argument cannot be zero:
for i in range(0, 1, 0.1):
    print(i)
Rather than using a decimal step directly, it's much safer to express this in terms of how many points you want. Otherwise, floating-point rounding error is likely to give you a wrong result.
Use the linspace function from the NumPy library (which isn't part of the standard library but is relatively easy to obtain). linspace takes a number of points to return, and also lets you specify whether or not to include the right endpoint:
>>> np.linspace(0,1,11)
array([ 0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1. ])
>>> np.linspace(0,1,10,endpoint=False)
array([ 0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])
If you really want to use a floating-point step value, use numpy.arange:
>>> import numpy as np
>>> np.arange(0.0, 1.0, 0.1)
array([ 0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])
Floating-point rounding error will cause problems, though. Here's a simple case where rounding error causes arange to produce a length-4 array when it should only produce 3 numbers:
>>> np.arange(1, 1.3, 0.1)
array([1. , 1.1, 1.2, 1.3])
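If you start from a step size anyway, one way to follow the advice above is to turn the step into a point count first and then hand that count to linspace. This is a minimal sketch, assuming the span is a whole multiple of the step; steps_via_linspace is a hypothetical helper name, not a NumPy function.

```python
import numpy as np

def steps_via_linspace(start, stop, step):
    """Half-open range [start, stop) built from a point count instead of a float step."""
    # Assumption: (stop - start) is a whole multiple of step, so rounding
    # removes the floating-point noise in the quotient.
    count = int(round((stop - start) / step))
    return np.linspace(start, stop, count, endpoint=False)

values = steps_via_linspace(1, 1.3, 0.1)
```

For the arange example above, this yields three values rather than four.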
range() can only do integers, not floating point.
Use a list comprehension instead to obtain a list of steps:
[x * 0.1 for x in range(0, 10)]
More generally, a generator expression minimizes memory allocations:
xs = (x * 0.1 for x in range(0, 10))
for x in xs:
    print(x)
Building on 'xrange([start], stop[, step])', you can define a generator that accepts and produces any type you choose (stick to types supporting + and <):
>>> def drange(start, stop, step):
...     r = start
...     while r < stop:
...         yield r
...         r += step
...
>>> i0=drange(0.0, 1.0, 0.1)
>>> ["%g" % x for x in i0]
['0', '0.1', '0.2', '0.3', '0.4', '0.5', '0.6', '0.7', '0.8', '0.9', '1']
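Note that the output above ends with '1' even though stop is 1.0: repeatedly adding 0.1 accumulates rounding error, so the running total is still slightly below 1.0 after ten additions. A possible variant, sketched here under the assumption that a small tolerance (1e-9, chosen arbitrarily) is acceptable, computes the number of steps up front and multiplies instead of accumulating. The name drange_indexed is hypothetical.

```python
import math

def drange_indexed(start, stop, step, tol=1e-9):
    """Yield start, start+step, ... strictly below stop, without accumulating error."""
    # tol (an assumed value) absorbs noise such as 0.3 / 0.1 == 3.0000000000000004
    count = max(0, math.ceil((stop - start) / step - tol))
    for i in range(count):
        yield start + i * step

values = list(drange_indexed(0.0, 1.0, 0.1))
```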
Increase the magnitude of i for the loop and then reduce it when you need it.
for i * 100 in range(0, 100, 10):
    print i / 100.0
EDIT: I honestly cannot remember why I thought that would work syntactically
for i in range(0, 11, 1):
    print(i / 10.0)
That should have the desired output.
NumPy is a bit overkill, I think.
[p/10 for p in range(0, 10)]
[0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
Generally speaking, to do a step-by-1/x up to y you would do
x=100
y=2
[p/x for p in range(0, int(x*y))]
[0.0, 0.01, 0.02, 0.03, ..., 1.97, 1.98, 1.99]
(1/x produced less rounding noise when I tested).
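To make the rounding-noise remark concrete, here is a small comparison of dividing by 10 versus multiplying by 0.1, checked against the float literals. It is only an illustration; the variable names are made up.

```python
literals = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]

by_division = [p / 10 for p in range(10)]         # one correctly rounded division each
by_multiplication = [p * 0.1 for p in range(10)]  # inherits the error already in 0.1

# Indices where multiplying by 0.1 does not reproduce the literal
mismatches = [p for p in range(10) if by_multiplication[p] != literals[p]]
```

Dividing an integer by an integer rounds only once, which is why the division variant matches the literals here.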
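SciPy used to re-export NumPy's arange function, which generalizes Python's range() constructor to satisfy your requirement of float handling. That alias is deprecated in newer SciPy releases, so it is safer to import it from NumPy directly:
from numpy import arange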
Similar to R's seq function, this one returns a sequence in any order given the correct step value. The last value is equal to the stop value.
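def seq(start, stop, step=1):
    # n is the number of steps between start and stop
    n = int(round((stop - start)/float(step)))
    if n >= 1:
        return [start + step*i for i in range(n+1)]
    elif n == 0:
        # start and stop coincide: a single value
        return [start]
    else:
        # step points away from stop
        return []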
Results
seq(1, 5, 0.5)
[1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
seq(10, 0, -1)
[10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
seq(10, 0, -2)
[10, 8, 6, 4, 2, 0]
seq(1, 1)
[ 1 ]
The range() built-in function returns a sequence of integer values, I'm afraid, so you can't use it to do a decimal step.
I'd say just use a while loop:
i = 0.0
while i <= 1.0:
    print(i)
    i += 0.1
If you're curious, Python 2 converts your 0.1 to 0, which is why it's telling you the argument can't be zero. Python 3 instead raises a TypeError, because a float cannot be interpreted as an integer.
Here's a solution using itertools:
import itertools
def seq(start, end, step):
    if step == 0:
        raise ValueError("step must not be 0")
    sample_count = int(abs(end - start) / step)
    return itertools.islice(itertools.count(start, step), sample_count)
Usage Example:
for i in seq(0, 1, 0.1):
    print(i)
[x * 0.1 for x in range(0, 10)]
in Python 2.7x gives you the result of:
[0.0, 0.1, 0.2, 0.30000000000000004, 0.4, 0.5, 0.6000000000000001, 0.7000000000000001, 0.8, 0.9]
but if you use:
[ round(x * 0.1, 1) for x in range(0, 10)]
gives you the desired:
[0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
import numpy as np
for i in np.arange(0, 1, 0.1):
    print(i)
Best Solution: no rounding error
>>> step = .1
>>> N = 10 # number of data points
>>> [ x / pow(step, -1) for x in range(0, N + 1) ]
[0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
Or, for a set range instead of set data points (e.g. continuous function), use:
>>> step = .1
>>> rnge = 1 # NOTE range = 1, i.e. span of data points
>>> N = int(rnge / step)
>>> [ x / pow(step,-1) for x in range(0, N + 1) ]
[0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
To implement a function: replace x / pow(step, -1) with f( x / pow(step, -1) ), and define f.
For example:
>>> import math
>>> def f(x):
...     return math.sin(x)
...
>>> step = .1
>>> rnge = 1 # NOTE range = 1, i.e. span of data points
>>> N = int(rnge / step)
>>> [ f( x / pow(step,-1) ) for x in range(0, N + 1) ]
[0.0, 0.09983341664682815, 0.19866933079506122, 0.29552020666133955, 0.3894183423086505,
0.479425538604203, 0.5646424733950354, 0.644217687237691, 0.7173560908995228,
0.7833269096274834, 0.8414709848078965]
And if you do this often, you might want to save the generated list r:
r = list(map(lambda x: x/10.0, range(0, 10)))
for i in r:
    print(i)
more_itertools is a third-party library that implements a numeric_range tool:
import more_itertools as mit
for x in mit.numeric_range(0, 1, 0.1):
    print("{:.1f}".format(x))
Output
0.0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
This tool also works for Decimal and Fraction.
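If installing a third-party package is not an option, the standard library's fractions module can give exact steps as well. The following is a rough sketch, not the more_itertools implementation; fraction_range is a hypothetical name, and the arguments are passed as strings so that 0.1 is parsed exactly rather than through a float.

```python
from fractions import Fraction

def fraction_range(start, stop, step):
    """Yield exact Fraction values from start up to, but excluding, stop."""
    start, stop, step = Fraction(start), Fraction(stop), Fraction(step)
    if step <= 0:
        raise ValueError("step must be positive in this sketch")
    current = start
    while current < stop:
        yield current
        current += step   # exact rational arithmetic, so no drift

values = list(fraction_range("0", "1", "0.1"))
```

Calling float() on each value converts back to ordinary floats when needed.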
My versions use the original range function to create multiplicative indices for the shift. This allows the same syntax as the original range function.
I have made two versions, one using float, and one using Decimal, because I found that in some cases I wanted to avoid the roundoff drift introduced by the floating point arithmetic.
It is consistent with empty set results as in range/xrange.
Passing only a single numeric value to either function will return the standard range output to the integer ceiling value of the input parameter (so if you gave it 5.5, it would return range(6).)
Edit: the code below is now available as package on pypi: Franges
## frange.py
from math import ceil
# find best range function available to version (2.7.x / 3.x.x)
try:
    _xrange = xrange
except NameError:
    _xrange = range
def frange(start, stop = None, step = 1):
    """frange generates a set of floating point values over the
    range [start, stop) with step size step
    frange([start,] stop [, step ])"""
    if stop is None:
        for x in _xrange(int(ceil(start))):
            yield x
    else:
        # create a generator expression for the index values
        indices = (i for i in _xrange(0, int((stop-start)/step)))
        # yield results
        for i in indices:
            yield start + step*i
## drange.py
import decimal
from math import ceil
# find best range function available to version (2.7.x / 3.x.x)
try:
    _xrange = xrange
except NameError:
    _xrange = range
def drange(start, stop = None, step = 1, precision = None):
    """drange generates a set of Decimal values over the
    range [start, stop) with step size step
    drange([start,] stop, [step [,precision]])"""
    if stop is None:
        for x in _xrange(int(ceil(start))):
            yield x
    else:
        # find precision
        if precision is not None:
            decimal.getcontext().prec = precision
        # convert values to decimals
        start = decimal.Decimal(start)
        stop = decimal.Decimal(stop)
        step = decimal.Decimal(step)
        # create a generator expression for the index values
        # (int() is needed because range does not accept a Decimal in Python 3)
        indices = (
            i for i in _xrange(
                0,
                int(((stop-start)/step).to_integral_value())
            )
        )
        # yield results
        for i in indices:
            yield float(start + step*i)
## testranges.py
import frange
import drange
list(frange.frange(0, 2, 0.5)) # [0.0, 0.5, 1.0, 1.5]
list(drange.drange(0, 2, 0.5, precision = 6)) # [0.0, 0.5, 1.0, 1.5]
list(frange.frange(3)) # [0, 1, 2]
list(frange.frange(3.5)) # [0, 1, 2, 3]
list(frange.frange(0,10, -1)) # []
Lots of the solutions here still had floating point errors in Python 3.6 and didn't do exactly what I personally needed.
The function below takes integers or floats, doesn't require imports and doesn't return floating point errors.
def frange(x, y, step):
    if int(x + y + step) == (x + y + step):
        r = list(range(int(x), int(y), int(step)))
    else:
        f = 10 ** (len(str(step)) - str(step).find('.') - 1)
        rf = list(range(int(x * f), int(y * f), int(step * f)))
        r = [i / f for i in rf]
    return r
Surprised no one has yet mentioned the recommended solution in the Python 3 docs:
See also:
The linspace recipe shows how to implement a lazy version of range that is suitable for floating point applications.
Once defined, the recipe is easy to use and does not require numpy or any other external libraries, but functions like numpy.linspace(). Note that rather than a step argument, the third num argument specifies the number of desired values, for example:
print(linspace(0, 10, 5))
# linspace(0, 10, 5)
print(list(linspace(0, 10, 5)))
# [0.0, 2.5, 5.0, 7.5, 10]
I quote a modified version of the full Python 3 recipe from Andrew Barnert below:
import collections.abc
import numbers
class linspace(collections.abc.Sequence):
    """linspace(start, stop, num) -> linspace object
    Return a virtual sequence of num numbers from start to stop (inclusive).
    If you need a half-open range, use linspace(start, stop, num+1)[:-1].
    """
    def __init__(self, start, stop, num):
        if not isinstance(num, numbers.Integral) or num <= 1:
            raise ValueError('num must be an integer > 1')
        self.start, self.stop, self.num = start, stop, num
        self.step = (stop-start)/(num-1)
    def __len__(self):
        return self.num
    def __getitem__(self, i):
        if isinstance(i, slice):
            return [self[x] for x in range(*i.indices(len(self)))]
        if i < 0:
            i = self.num + i
        if i >= self.num:
            raise IndexError('linspace object index out of range')
        if i == self.num-1:
            return self.stop
        return self.start + i*self.step
    def __repr__(self):
        return '{}({}, {}, {})'.format(type(self).__name__,
                                       self.start, self.stop, self.num)
    def __eq__(self, other):
        if not isinstance(other, linspace):
            return False
        return ((self.start, self.stop, self.num) ==
                (other.start, other.stop, other.num))
    def __ne__(self, other):
        return not self==other
    def __hash__(self):
        return hash((type(self), self.start, self.stop, self.num))
This is my solution to get ranges with float steps.
Using this function it's not necessary to import numpy, nor install it.
I'm pretty sure that it could be improved and optimized. Feel free to do it and post it here.
from __future__ import division
from math import log
def xfrange(start, stop, step):
    old_start = start #backup this value
    digits = int(round(log(10000, 10)))+1 #get number of digits
    magnitude = 10**digits
    stop = int(magnitude * stop) #convert from
    step = int(magnitude * step) #0.1 to 10 (e.g.)
    if start == 0:
        start = 10**(digits-1)
    else:
        start = 10**(digits)*start
    data = [] #create array
    #calc number of iterations
    end_loop = int((stop-start)//step)
    if old_start == 0:
        end_loop += 1
    acc = start
    for i in xrange(0, end_loop):
        data.append(acc/magnitude)
        acc += step
    return data
print xfrange(1, 2.1, 0.1)
print xfrange(0, 1.1, 0.1)
print xfrange(-1, 0.1, 0.1)
The output is:
[1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2.0]
[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1]
[-1.0, -0.9, -0.8, -0.7, -0.6, -0.5, -0.4, -0.3, -0.2, -0.1, 0.0]
For completeness, a boutique functional (recursive) solution:
def frange(a,b,s):
    return [] if s > 0 and a > b or s < 0 and a < b or s==0 else [a]+frange(a+s,b,s)
You can use this function:
def frange(start,end,step):
    return map(lambda x: x*step, range(int(start*1./step),int(end*1./step)))
This can be done with the NumPy library. Its arange() function allows float steps, but it returns a NumPy array, which can be converted to a list with tolist() for convenience.
import numpy as np
for i in np.arange(0, 1, 0.1).tolist():
    print(i)
Here start and stop are both inclusive (usually stop is excluded); the function needs no imports and is a generator:
def rangef(start, stop, step, fround=5):
    """
    Yields sequence of numbers from start (inclusive) to stop (inclusive)
    by step (increment) with rounding set to n digits.
    :param start: start of sequence
    :param stop: end of sequence
    :param step: int or float increment (e.g. 1 or 0.001)
    :param fround: float rounding, n decimal places
    :return:
    """
    try:
        i = 0
        while stop >= start and step > 0:
            if i==0:
                yield start
            elif start >= stop:
                yield stop
            elif start < stop:
                if start == 0:
                    yield 0
                if start != 0:
                    yield start
            i += 1
            start += step
            start = round(start, fround)
        else:
            pass
    except TypeError as e:
        yield "type-error({})".format(e)
    else:
        pass
# passing
print(list(rangef(-100.0,10.0,1)))
print(list(rangef(-100,0,0.5)))
print(list(rangef(-1,1,0.2)))
print(list(rangef(-1,1,0.1)))
print(list(rangef(-1,1,0.05)))
print(list(rangef(-1,1,0.02)))
print(list(rangef(-1,1,0.01)))
print(list(rangef(-1,1,0.005)))
# failing: type-error:
print(list(rangef("1","10","1")))
print(list(rangef(1,10,"1")))
Python 3.6.2 (v3.6.2:5fd33b5, Jul 8 2017, 04:57:36) [MSC v.1900 64
bit (AMD64)]
I know I'm late to the party here, but here's a trivial generator solution that's working in 3.6:
def floatRange(*args):
    start, step = 0, 1
    if len(args) == 1:
        stop = args[0]
    elif len(args) == 2:
        start, stop = args[0], args[1]
    elif len(args) == 3:
        start, stop, step = args[0], args[1], args[2]
    else:
        raise TypeError("floatRange accepts 1, 2, or 3 arguments. ({0} given)".format(len(args)))
    for num in start, step, stop:
        if not isinstance(num, (int, float)):
            raise TypeError("floatRange only accepts float and integer arguments. ({0} : {1} given)".format(type(num), str(num)))
    for x in range(int((stop-start)/step)):
        yield start + (x * step)
    return
Then you can call it just like the original range(). There's no error handling beyond the argument checks, but let me know if there is an error that can reasonably be caught, and I'll update it. Or you can update it. This is StackOverflow.
To counter the float precision issues, you could use the Decimal module.
This demands an extra effort of converting to Decimal from int or float while writing the code, but you can instead pass str and modify the function if that sort of convenience is indeed necessary.
from decimal import Decimal
def decimal_range(*args):
    zero, one = Decimal('0'), Decimal('1')
    if len(args) == 1:
        start, stop, step = zero, args[0], one
    elif len(args) == 2:
        start, stop, step = args + (one,)
    elif len(args) == 3:
        start, stop, step = args
    else:
        raise ValueError('Expected 1 to 3 arguments, got %s' % len(args))
    if not all([type(arg) == Decimal for arg in (start, stop, step)]):
        raise ValueError('Arguments must be passed as <type: Decimal>')
    # neglect bad cases
    if (start == stop) or (start > stop and step >= zero) or \
            (start < stop and step <= zero):
        # a bare return ends the generator without yielding anything
        return
    current = start
    while abs(current) < abs(stop):
        yield current
        current += step
Sample outputs -
from decimal import Decimal as D
list(decimal_range(D('2')))
# [Decimal('0'), Decimal('1')]
list(decimal_range(D('2'), D('4.5')))
# [Decimal('2'), Decimal('3'), Decimal('4')]
list(decimal_range(D('2'), D('4.5'), D('0.5')))
# [Decimal('2'), Decimal('2.5'), Decimal('3.0'), Decimal('3.5'), Decimal('4.0')]
list(decimal_range(D('2'), D('4.5'), D('-0.5')))
# []
list(decimal_range(D('2'), D('-4.5'), D('-0.5')))
# [Decimal('2'),
# Decimal('1.5'),
# Decimal('1.0'),
# Decimal('0.5'),
# Decimal('0.0'),
# Decimal('-0.5'),
# Decimal('-1.0'),
# Decimal('-1.5'),
# Decimal('-2.0'),
# Decimal('-2.5'),
# Decimal('-3.0'),
# Decimal('-3.5'),
# Decimal('-4.0')]
Add auto-correction for the possibility of an incorrect sign on step:
def frange(start,step,stop):
    step *= 2*((stop>start)^(step<0))-1
    return [start+i*step for i in range(int((stop-start)/step))]
My solution:
def seq(start, stop, step=1, digit=0):
    x = float(start)
    v = []
    while x <= stop:
        v.append(round(x,digit))
        x += step
    return v
Here is my solution which works fine with float_range(-1, 0, 0.01) and works without floating point representation errors. It is not very fast, but works fine:
from decimal import Decimal
def get_multiplier(_from, _to, step):
    digits = []
    for number in [_from, _to, step]:
        pre = Decimal(str(number)) % 1
        digit = len(str(pre)) - 2
        digits.append(digit)
    max_digits = max(digits)
    return float(10 ** (max_digits))
def float_range(_from, _to, step, include=False):
    """Generates a range list of floating point values over the Range [start, stop]
    with step size step
    include=True - allows to include right value to if possible
    !! Works fine with floating point representation !!
    """
    mult = get_multiplier(_from, _to, step)
    # print mult
    int_from = int(round(_from * mult))
    int_to = int(round(_to * mult))
    int_step = int(round(step * mult))
    # print int_from,int_to,int_step
    if include:
        result = range(int_from, int_to + int_step, int_step)
        result = [r for r in result if r <= int_to]
    else:
        result = range(int_from, int_to, int_step)
    # print result
    float_result = [r / mult for r in result]
    return float_result
print(float_range(-1, 0, 0.01,include=False))
assert float_range(1.01, 2.06, 5.05 % 1, True) ==\
[1.01, 1.06, 1.11, 1.16, 1.21, 1.26, 1.31, 1.36, 1.41, 1.46, 1.51, 1.56, 1.61, 1.66, 1.71, 1.76, 1.81, 1.86, 1.91, 1.96, 2.01, 2.06]
assert float_range(1.01, 2.06, 5.05 % 1, False)==\
[1.01, 1.06, 1.11, 1.16, 1.21, 1.26, 1.31, 1.36, 1.41, 1.46, 1.51, 1.56, 1.61, 1.66, 1.71, 1.76, 1.81, 1.86, 1.91, 1.96, 2.01]
I am only a beginner, but I had the same problem, when simulating some calculations. Here is how I attempted to work this out, which seems to be working with decimal steps.
I am also quite lazy and so I found it hard to write my own range function.
Basically what I did is changed my xrange(0.0, 1.0, 0.01) to xrange(0, 100, 1) and used the division by 100.0 inside the loop.
I was also concerned, if there will be rounding mistakes. So I decided to test, whether there are any. Now I heard, that if for example 0.01 from a calculation isn't exactly the float 0.01 comparing them should return False (if I am wrong, please let me know).
So I decided to test if my solution will work for my range by running a short test:
for d100 in xrange(0, 100, 1):
    d = d100 / 100.0
    fl = float("0.00"[:4 - len(str(d100))] + str(d100))
    print d, "=", fl , d == fl
And it printed True for each.
Now, if I'm getting it totally wrong, please let me know.
The trick to avoid the round-off problem is to use a separate number to move through the range, one that starts half a step ahead of start.
# floating point range
def frange(a, b, stp=1.0):
    i = a+stp/2.0
    while i<b:
        yield a
        a += stp
        i += stp
Alternatively, numpy.arange can be used.
My answer is similar to others using map(), without need of NumPy, and without using lambda (though you could). To get a list of float values from 0.0 to t_max in steps of dt (both assumed to be defined already):
def xdt(n):
    return dt*float(n)
tlist = list(map(xdt, range(int(t_max/dt)+1)))
So I have an array with 2 columns (x, y).
I need to find the values in the y column that match some other set of numbers, say [0.05, 0.5, 0.99], and return the values from the x column at the same indices in a new variable.
x=np.linspace(50,70,20)
y=np.linspace(0,1,20)
c=np.zeros((2,len(x)))
x=np.around(x,3)
y=np.around(y,3)
for ii, (left, right) in enumerate(zip(x[1:], y[1:])):
    print(left, right)
    c[0, ii] = left
    c[1, ii] = right
q=[0.05,0.5,0.99]
So I need to compare c[1,:] to q and then return the values from c[0,:] with the corresponding indices.
I tried for and enumerate but I can't figure out whether I need to use iterator once or twice (for c and q).
Thanks!
You could use np.nonzero to find the values of q in y.
The question is what is the expected behaviour if the value is not present in y.
Right now, the values for this case are -1.
import numpy as np
n = 100
x = np.linspace(50, 70, n)
y = np.linspace(0, 1, n)
x = np.around(x, 2)
y = np.around(y, 2)
q = [0.05, 0.50, 0.99]
res = np.full((len(q), 2), -1)
for i, qq in enumerate(q):
    j = np.nonzero(y == qq)[0]
    if np.size(j) == 1:
        # j is a length-1 index array; unpack it so a flat (index, value) pair is stored
        res[i] = (j[0], x[j[0]])
res
# index, value
# array([[ 5, 51],
# [-1, -1],
# [98, 69]])
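A loop-free variant of the lookup is sketched below. It is not the method from the answer above: it swaps exact equality for np.isclose with its default tolerances and broadcasts q against y. The names matches, found, idx and x_found are made up for illustration, and unmatched entries are reported as NaN instead of -1.

```python
import numpy as np

x = np.around(np.linspace(50, 70, 100), 2)
y = np.around(np.linspace(0, 1, 100), 2)
q = np.array([0.05, 0.50, 0.99])

# Compare every q against every y; the result has shape (len(q), len(y)).
matches = np.isclose(q[:, None], y[None, :])

found = matches.any(axis=1)    # whether each q occurs in y at all
idx = matches.argmax(axis=1)   # first matching index (0 where nothing matched)

# x values at the matching indices, NaN where q is absent from y
x_found = np.where(found, x[idx], np.nan)
```

This builds a len(q) by len(y) boolean array, so it trades memory for the loop.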
Is there a fast way to merge two NumPy histograms with different bin ranges and numbers of bins?
For example:
x = [1,2,2,3]
y = [4,5,5,6]
a = np.histogram(x, bins=10)
# a[0] = [1, 0, 0, 0, 0, 2, 0, 0, 0, 1]
# a[1] = [ 1. , 1.2, 1.4, 1.6, 1.8, 2. , 2.2, 2.4, 2.6, 2.8, 3. ]
b = np.histogram(y, bins=5)
# b[0] = [1, 0, 2, 0, 1]
# b[1] = [ 4. , 4.4, 4.8, 5.2, 5.6, 6. ]
Now I want to have some function like this:
def merge(a, b):
    # some actions here #
    return merged_a_b_values, merged_a_b_bins
Actually I don't have x and y; only a and b are known.
But the result of merge(a, b) must be equal to np.histogram(x+y, bins=10):
m = merge(a, b)
# m[0] = [1, 0, 2, 0, 1, 0, 1, 0, 2, 1]
# m[1] = [ 1. , 1.5, 2. , 2.5, 3. , 3.5, 4. , 4.5, 5. , 5.5, 6. ]
I'd actually have added a comment to dangom's answer, but I lack the reputation required.
I'm a little confused by your example. You're plotting the histogram of the histogram bins if I'm not mistaken. It should rather be this, right?
plt.figure()
plt.plot(a[1][:-1], a[0], marker='.', label='a')
plt.plot(b[1][:-1], b[0], marker='.', label='b')
plt.plot(c[1][:-1], c[0], marker='.', label='c')
plt.legend()
plt.show()
Also a note on your suggestion for combining the histograms. You are of course right that there's no unique solution, as you simply don't know where the samples would have been in the finer grid you use for the combination. When the two histograms have significantly different bin widths, the suggested merging function may produce a sparse and artificial-looking histogram.
I tried combining the histograms by interpolation (assuming the samples within each bin were distributed uniformly in the original bin, which is of course also only an assumption).
This does, however, lead to a more natural-looking result, at least for data sampled from the distributions I typically encounter.
import numpy as np
def merge_hist(a, b):
    edgesa = a[1]
    edgesb = b[1]
    da = edgesa[1]-edgesa[0]
    db = edgesb[1]-edgesb[0]
    dint = np.min([da, db])
    # lo/hi instead of min/max, to avoid shadowing the built-ins
    lo = np.min(np.hstack([edgesa, edgesb]))
    hi = np.max(np.hstack([edgesa, edgesb]))
    # extend by one step so the last edge reaches hi and no counts are dropped
    edgesc = np.arange(lo, hi + dint, dint)
    def interpolate_hist(edgesint, edges, hist):
        cumhist = np.hstack([0, np.cumsum(hist)])
        cumhistint = np.interp(edgesint, edges, cumhist)
        histint = np.diff(cumhistint)
        return histint
    histaint = interpolate_hist(edgesc, edgesa, a[0])
    histbint = interpolate_hist(edgesc, edgesb, b[0])
    c = histaint + histbint
    return c, edgesc
An example for two gaussian distributions:
import numpy as np
import matplotlib.pyplot as plt
a = 5 + 1*np.random.randn(100)
b = 10 + 2*np.random.randn(100)
hista, edgesa = np.histogram(a, bins=10)
histb, edgesb = np.histogram(b, bins=5)
histc, edgesc = merge_hist([hista, edgesa], [histb, edgesb])
plt.figure()
width = edgesa[1]-edgesa[0]
plt.bar(edgesa[:-1], hista, width=width)
width = edgesb[1]-edgesb[0]
plt.bar(edgesb[:-1], histb, width=width)
plt.figure()
width = edgesc[1]-edgesc[0]
plt.bar(edgesc[:-1], histc, width=width)
plt.show()
I, however, am no statistician, so please let me know if the suggested approach is viable.
There is no unique solution to the problem of merging two different histograms. I propose here a simple and quick solution based on two design assumptions necessary to deal with the loss of information inherent in binning sequences:
1. Recovered values are represented by the start of the bin they belong to.
2. The merge shall keep the highest bin resolution to avoid further loss of information and shall completely encompass the intervals of the children histograms.
Here's the code:
import numpy as np
def merge(a, b):
    def extract_vals(hist):
        # Recover values based on assumption 1.
        values = [[y]*x for x, y in zip(hist[0], hist[1])]
        # Return flattened list.
        return [z for s in values for z in s]
    def extract_bin_resolution(hist):
        return hist[1][1] - hist[1][0]
    def generate_num_bins(minval, maxval, bin_resolution):
        # Generate number of bins necessary to satisfy assumption 2
        return int(np.ceil((maxval - minval) / bin_resolution))
    vals = extract_vals(a) + extract_vals(b)
    bin_resolution = min(map(extract_bin_resolution, [a, b]))
    num_bins = generate_num_bins(min(vals), max(vals), bin_resolution)
    return np.histogram(vals, bins=num_bins)
Here's the example code:
import matplotlib.pyplot as plt
x = [1,2,2,3]
y = [4,5,5,6]
a = np.histogram(x, bins=10)
# a[0] = [1, 0, 0, 0, 0, 2, 0, 0, 0, 1]
# a[1] = [ 1. , 1.2, 1.4, 1.6, 1.8, 2. , 2.2, 2.4, 2.6, 2.8, 3. ]
b = np.histogram(y, bins=5)
# b[0] = [1, 0, 2, 0, 1]
# b[1] = [ 4. , 4.4, 4.8, 5.2, 5.6, 6. ]
# Merge and plot results
c = merge(a, b)
c_num_bins = c[1].size - 1
plt.hist(a[0], bins=5, label='a')
plt.hist(b[0], bins=10, label='b')
plt.hist(c[0], bins=c_num_bins, label='c')
plt.legend()
plt.show()
How do I iterate between 0 and 1 by a step of 0.1?
This says that the step argument cannot be zero:
for i in range(0, 1, 0.1):
print(i)
Rather than using a decimal step directly, it's much safer to express this in terms of how many points you want. Otherwise, floating-point rounding error is likely to give you a wrong result.
Use the linspace function from the NumPy library (which isn't part of the standard library but is relatively easy to obtain). linspace takes a number of points to return, and also lets you specify whether or not to include the right endpoint:
>>> np.linspace(0,1,11)
array([ 0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1. ])
>>> np.linspace(0,1,10,endpoint=False)
array([ 0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])
If you really want to use a floating-point step value, use numpy.arange:
>>> import numpy as np
>>> np.arange(0.0, 1.0, 0.1)
array([ 0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])
Floating-point rounding error will cause problems, though. Here's a simple case where rounding error causes arange to produce a length-4 array when it should only produce 3 numbers:
>>> numpy.arange(1, 1.3, 0.1)
array([1. , 1.1, 1.2, 1.3])
range() can only do integers, not floating point.
Use a list comprehension instead to obtain a list of steps:
[x * 0.1 for x in range(0, 10)]
More generally, a generator comprehension minimizes memory allocations:
xs = (x * 0.1 for x in range(0, 10))
for x in xs:
print(x)
Building on 'xrange([start], stop[, step])', you can define a generator that accepts and produces any type you choose (stick to types supporting + and <):
>>> def drange(start, stop, step):
... r = start
... while r < stop:
... yield r
... r += step
...
>>> i0=drange(0.0, 1.0, 0.1)
>>> ["%g" % x for x in i0]
['0', '0.1', '0.2', '0.3', '0.4', '0.5', '0.6', '0.7', '0.8', '0.9', '1']
>>>
Increase the magnitude of i for the loop and then reduce it when you need it.
for i * 100 in range(0, 100, 10):
print i / 100.0
EDIT: I honestly cannot remember why I thought that would work syntactically
for i in range(0, 11, 1):
print i / 10.0
That should have the desired output.
NumPy is a bit overkill, I think.
[p/10 for p in range(0, 10)]
[0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
Generally speaking, to do a step-by-1/x up to y you would do
x=100
y=2
[p/x for p in range(0, int(x*y))]
[0.0, 0.01, 0.02, 0.03, ..., 1.97, 1.98, 1.99]
(1/x produced less rounding noise when I tested).
scipy has a built in function arange which generalizes Python's range() constructor to satisfy your requirement of float handling.
from scipy import arange
Similar to R's seq function, this one returns a sequence in any order given the correct step value. The last value is equal to the stop value.
def seq(start, stop, step=1):
n = int(round((stop - start)/float(step)))
if n > 1:
return([start + step*i for i in range(n+1)])
elif n == 1:
return([start])
else:
return([])
Results
seq(1, 5, 0.5)
[1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
seq(10, 0, -1)
[10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
seq(10, 0, -2)
[10, 8, 6, 4, 2, 0]
seq(1, 1)
[ 1 ]
The range() built-in function returns a sequence of integer values, I'm afraid, so you can't use it to do a decimal step.
I'd say just use a while loop:
i = 0.0
while i <= 1.0:
print i
i += 0.1
If you're curious, Python is converting your 0.1 to 0, which is why it's telling you the argument can't be zero.
Here's a solution using itertools:
import itertools
def seq(start, end, step):
if step == 0:
raise ValueError("step must not be 0")
sample_count = int(abs(end - start) / step)
return itertools.islice(itertools.count(start, step), sample_count)
Usage Example:
for i in seq(0, 1, 0.1):
print(i)
[x * 0.1 for x in range(0, 10)]
in Python 2.7x gives you the result of:
[0.0, 0.1, 0.2, 0.30000000000000004, 0.4, 0.5, 0.6000000000000001, 0.7000000000000001, 0.8, 0.9]
but if you use:
[ round(x * 0.1, 1) for x in range(0, 10)]
gives you the desired:
[0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
import numpy as np
for i in np.arange(0, 1, 0.1):
print i
Best Solution: no rounding error
>>> step = .1
>>> N = 10 # number of data points
>>> [ x / pow(step, -1) for x in range(0, N + 1) ]
[0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
Or, for a set range instead of set data points (e.g. continuous function), use:
>>> step = .1
>>> rnge = 1 # NOTE range = 1, i.e. span of data points
>>> N = int(rnge / step
>>> [ x / pow(step,-1) for x in range(0, N + 1) ]
[0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
To implement a function: replace x / pow(step, -1) with f( x / pow(step, -1) ), and define f.
For example:
>>> import math
>>> def f(x):
return math.sin(x)
>>> step = .1
>>> rnge = 1 # NOTE range = 1, i.e. span of data points
>>> N = int(rnge / step)
>>> [ f( x / pow(step,-1) ) for x in range(0, N + 1) ]
[0.0, 0.09983341664682815, 0.19866933079506122, 0.29552020666133955, 0.3894183423086505,
0.479425538604203, 0.5646424733950354, 0.644217687237691, 0.7173560908995228,
0.7833269096274834, 0.8414709848078965]
And if you do this often, you might want to save the generated list r
r=map(lambda x: x/10.0,range(0,10))
for i in r:
print i
more_itertools is a third-party library that implements a numeric_range tool:
import more_itertools as mit
for x in mit.numeric_range(0, 1, 0.1):
print("{:.1f}".format(x))
Output
0.0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
This tool also works for Decimal and Fraction.
My versions use the original range function to create multiplicative indices for the shift. This allows same syntax to the original range function.
I have made two versions, one using float, and one using Decimal, because I found that in some cases I wanted to avoid the roundoff drift introduced by the floating point arithmetic.
It is consistent with empty set results as in range/xrange.
Passing only a single numeric value to either function will return the standard range output to the integer ceiling value of the input parameter (so if you gave it 5.5, it would return range(6).)
Edit: the code below is now available as package on pypi: Franges
## frange.py
from math import ceil
# find best range function available to version (2.7.x / 3.x.x)
try:
    _xrange = xrange
except NameError:
    _xrange = range
def frange(start, stop = None, step = 1):
    """frange generates a set of floating point values over the
    range [start, stop) with step size step
    frange([start,] stop [, step ])"""
    if stop is None:
        for x in _xrange(int(ceil(start))):
            yield x
    else:
        # create a generator expression for the index values
        indices = (i for i in _xrange(0, int((stop-start)/step)))
        # yield results
        for i in indices:
            yield start + step*i
## drange.py
import decimal
from math import ceil
# find best range function available to version (2.7.x / 3.x.x)
try:
    _xrange = xrange
except NameError:
    _xrange = range
def drange(start, stop = None, step = 1, precision = None):
    """drange generates a set of Decimal values over the
    range [start, stop) with step size step
    drange([start,] stop, [step [,precision]])"""
    if stop is None:
        for x in _xrange(int(ceil(start))):
            yield x
    else:
        # find precision
        if precision is not None:
            decimal.getcontext().prec = precision
        # convert values to decimals
        start = decimal.Decimal(start)
        stop = decimal.Decimal(stop)
        step = decimal.Decimal(step)
        # create a generator expression for the index values
        # (range needs an int, so convert the Decimal count explicitly)
        indices = (
            i for i in _xrange(
                0,
                int(((stop-start)/step).to_integral_value())
            )
        )
        # yield results
        for i in indices:
            yield float(start + step*i)
## testranges.py
import frange
import drange
list(frange.frange(0, 2, 0.5)) # [0.0, 0.5, 1.0, 1.5]
list(drange.drange(0, 2, 0.5, precision = 6)) # [0.0, 0.5, 1.0, 1.5]
list(frange.frange(3)) # [0, 1, 2]
list(frange.frange(3.5)) # [0, 1, 2, 3]
list(frange.frange(0,10, -1)) # []
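One caveat with computing the number of steps as int((stop-start)/step): the float quotient can land just below a whole number, in which case the last value is silently dropped. As a rough sketch only (frange_exact is a hypothetical name, not part of the Franges package, and converting the arguments through str() is an assumption about what the caller means by 0.1), the count can be computed with exact rational arithmetic instead:

```python
from fractions import Fraction
from math import ceil


def frange_exact(start, stop, step):
    """Yield floats over [start, stop), counting steps exactly.

    Hypothetical helper for illustration. Arguments go through str()
    so that 0.1 is treated as exactly 1/10 (an assumption).
    """
    start_f, stop_f, step_f = (Fraction(str(v)) for v in (start, stop, step))
    if step_f == 0:
        raise ValueError("step must not be zero")
    # Exact number of steps; a wrong-signed step gives a negative
    # quotient, which is clamped to an empty result like range().
    count = max(0, ceil((stop_f - start_f) / step_f))
    for i in range(count):
        yield float(start_f + i * step_f)


# int((0.3 - 0) / 0.1) evaluates to 2, because 0.3 / 0.1 is slightly below 3.
print(list(frange_exact(0, 0.3, 0.1)))   # [0.0, 0.1, 0.2]
print(list(frange_exact(0, 10, -1)))     # []
```

The trade-off is speed: Fraction arithmetic is slower than plain floats, so this is only worth it when the exact number of points matters.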
Lots of the solutions here still had floating-point errors in Python 3.6 and didn't do exactly what I personally needed.
The function below takes integers or floats, doesn't require imports and doesn't return floating-point errors.
def frange(x, y, step):
    if int(x + y + step) == (x + y + step):
        r = list(range(int(x), int(y), int(step)))
    else:
        f = 10 ** (len(str(step)) - str(step).find('.') - 1)
        rf = list(range(int(x * f), int(y * f), int(step * f)))
        r = [i / f for i in rf]
    return r
Surprised no one has yet mentioned the recommended solution in the Python 3 docs:
See also:
The linspace recipe shows how to implement a lazy version of range suitable for floating point applications.
Once defined, the recipe is easy to use and does not require numpy or any other external libraries, but functions like numpy.linspace(). Note that rather than a step argument, the third num argument specifies the number of desired values, for example:
print(linspace(0, 10, 5))
# linspace(0, 10, 5)
print(list(linspace(0, 10, 5)))
# [0.0, 2.5, 5.0, 7.5, 10]
I quote a modified version of the full Python 3 recipe from Andrew Barnert below:
import collections.abc
import numbers
class linspace(collections.abc.Sequence):
    """linspace(start, stop, num) -> linspace object
    Return a virtual sequence of num numbers from start to stop (inclusive).
    If you need a half-open range, use linspace(start, stop, num+1)[:-1].
    """
    def __init__(self, start, stop, num):
        if not isinstance(num, numbers.Integral) or num <= 1:
            raise ValueError('num must be an integer > 1')
        self.start, self.stop, self.num = start, stop, num
        self.step = (stop-start)/(num-1)
    def __len__(self):
        return self.num
    def __getitem__(self, i):
        if isinstance(i, slice):
            return [self[x] for x in range(*i.indices(len(self)))]
        if i < 0:
            i = self.num + i
        if i >= self.num:
            raise IndexError('linspace object index out of range')
        if i == self.num-1:
            return self.stop
        return self.start + i*self.step
    def __repr__(self):
        return '{}({}, {}, {})'.format(type(self).__name__,
                                       self.start, self.stop, self.num)
    def __eq__(self, other):
        if not isinstance(other, linspace):
            return False
        return ((self.start, self.stop, self.num) ==
                (other.start, other.stop, other.num))
    def __ne__(self, other):
        return not self==other
    def __hash__(self):
        return hash((type(self), self.start, self.stop, self.num))
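If you are starting from a step rather than a point count, the count has to be derived first. The following is a minimal sketch, not part of the recipe: num_from_step and linspace_list are hypothetical names, and rounding the quotient assumes that stop is meant to sit on a whole multiple of step from start.

```python
def num_from_step(start, stop, step):
    """Number of points from start to stop inclusive, spaced by step.

    Assumes (stop - start) is intended to be a whole multiple of step,
    so the float quotient is rounded rather than truncated.
    """
    return int(round((stop - start) / step)) + 1


def linspace_list(start, stop, num):
    """Eager version of the same idea: num points, both ends included."""
    if num < 2:
        raise ValueError("num must be an integer > 1")
    width = (stop - start) / (num - 1)
    # Each point is computed from its index, so errors do not accumulate;
    # the last point is stop itself, as in the recipe.
    return [start + i * width for i in range(num - 1)] + [stop]


print(num_from_step(0, 1, 0.1))   # 11
print(linspace_list(0, 10, 5))    # [0.0, 2.5, 5.0, 7.5, 10]
```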
This is my solution to get ranges with float steps.
Using this function it's not necessary to import numpy, nor install it.
I'm pretty sure that it could be improved and optimized. Feel free to do it and post it here.
from __future__ import division
from math import log
def xfrange(start, stop, step):
    old_start = start #backup this value
    digits = int(round(log(10000, 10)))+1 #get number of digits
    magnitude = 10**digits
    stop = int(magnitude * stop) #convert from
    step = int(magnitude * step) #0.1 to 10 (e.g.)
    if start == 0:
        start = 10**(digits-1)
    else:
        start = 10**(digits)*start
    data = [] #create array
    #calc number of iterations
    end_loop = int((stop-start)//step)
    if old_start == 0:
        end_loop += 1
    acc = start
    for i in xrange(0, end_loop):
        data.append(acc/magnitude)
        acc += step
    return data
print xfrange(1, 2.1, 0.1)
print xfrange(0, 1.1, 0.1)
print xfrange(-1, 0.1, 0.1)
The output is:
[1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2.0]
[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1]
[-1.0, -0.9, -0.8, -0.7, -0.6, -0.5, -0.4, -0.3, -0.2, -0.1, 0.0]
For completeness of boutique, a functional solution:
def frange(a,b,s):
    return [] if s > 0 and a > b or s < 0 and a < b or s==0 else [a]+frange(a+s,b,s)
You can use this function:
def frange(start,end,step):
    return map(lambda x: x*step, range(int(start*1./step),int(end*1./step)))
This can be done with the NumPy library: its arange() function accepts a float step. It returns a NumPy array, which can be converted to a list with tolist() for convenience.
import numpy as np
for i in np.arange(0, 1, 0.1).tolist():
    print(i)
Here start and stop are both inclusive (usually stop is excluded). It needs no imports and uses a generator:
def rangef(start, stop, step, fround=5):
    """
    Yields sequence of numbers from start (inclusive) to stop (inclusive)
    by step (increment) with rounding set to n digits.
    :param start: start of sequence
    :param stop: end of sequence
    :param step: int or float increment (e.g. 1 or 0.001)
    :param fround: float rounding, n decimal places
    :return:
    """
    try:
        i = 0
        while stop >= start and step > 0:
            if i==0:
                yield start
            elif start >= stop:
                yield stop
            elif start < stop:
                if start == 0:
                    yield 0
                if start != 0:
                    yield start
            i += 1
            start += step
            start = round(start, fround)
        else:
            pass
    except TypeError as e:
        yield "type-error({})".format(e)
    else:
        pass
# passing
print(list(rangef(-100.0,10.0,1)))
print(list(rangef(-100,0,0.5)))
print(list(rangef(-1,1,0.2)))
print(list(rangef(-1,1,0.1)))
print(list(rangef(-1,1,0.05)))
print(list(rangef(-1,1,0.02)))
print(list(rangef(-1,1,0.01)))
print(list(rangef(-1,1,0.005)))
# failing: type-error:
print(list(rangef("1","10","1")))
print(list(rangef(1,10,"1")))
Python 3.6.2 (v3.6.2:5fd33b5, Jul 8 2017, 04:57:36) [MSC v.1900 64 bit (AMD64)]
I know I'm late to the party here, but here's a trivial generator solution that's working in 3.6:
def floatRange(*args):
    start, step = 0, 1
    if len(args) == 1:
        stop = args[0]
    elif len(args) == 2:
        start, stop = args[0], args[1]
    elif len(args) == 3:
        start, stop, step = args[0], args[1], args[2]
    else:
        raise TypeError("floatRange accepts 1, 2, or 3 arguments. ({0} given)".format(len(args)))
    for num in start, step, stop:
        if not isinstance(num, (int, float)):
            raise TypeError("floatRange only accepts float and integer arguments. ({0} : {1} given)".format(type(num), str(num)))
    for x in range(int((stop-start)/step)):
        yield start + (x * step)
    return
Then you can call it just like the original range(). There's little error handling, so let me know if there is an error that can reasonably be caught and I'll update. Or you can update it; this is StackOverflow.
To counter the float precision issues, you could use the Decimal module.
This demands an extra effort of converting to Decimal from int or float while writing the code, but you can instead pass str and modify the function if that sort of convenience is indeed necessary.
from decimal import Decimal
def decimal_range(*args):
    zero, one = Decimal('0'), Decimal('1')
    if len(args) == 1:
        start, stop, step = zero, args[0], one
    elif len(args) == 2:
        start, stop, step = args + (one,)
    elif len(args) == 3:
        start, stop, step = args
    else:
        raise ValueError('Expected 1 to 3 arguments, got %s' % len(args))
    if not all([type(arg) == Decimal for arg in (start, stop, step)]):
        raise ValueError('Arguments must be passed as <type: Decimal>')
    # neglect bad cases
    if (start == stop) or (start > stop and step >= zero) or \
            (start < stop and step <= zero):
        return []
    current = start
    while abs(current) < abs(stop):
        yield current
        current += step
Sample outputs -
from decimal import Decimal as D
list(decimal_range(D('2')))
# [Decimal('0'), Decimal('1')]
list(decimal_range(D('2'), D('4.5')))
# [Decimal('2'), Decimal('3'), Decimal('4')]
list(decimal_range(D('2'), D('4.5'), D('0.5')))
# [Decimal('2'), Decimal('2.5'), Decimal('3.0'), Decimal('3.5'), Decimal('4.0')]
list(decimal_range(D('2'), D('4.5'), D('-0.5')))
# []
list(decimal_range(D('2'), D('-4.5'), D('-0.5')))
# [Decimal('2'),
# Decimal('1.5'),
# Decimal('1.0'),
# Decimal('0.5'),
# Decimal('0.0'),
# Decimal('-0.5'),
# Decimal('-1.0'),
# Decimal('-1.5'),
# Decimal('-2.0'),
# Decimal('-2.5'),
# Decimal('-3.0'),
# Decimal('-3.5'),
# Decimal('-4.0')]
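The suggestion above about accepting str instead of Decimal could look roughly like the sketch below. It is an illustration under assumptions, not the function from this answer: decimal_range_any is a hypothetical name, every argument is converted through str() so that a float such as 0.1 becomes Decimal('0.1'), and the loop compares against stop directly by step direction instead of using abs().

```python
from decimal import Decimal


def decimal_range_any(start, stop, step=1):
    """Yield Decimal values over [start, stop) for str, int, float or Decimal input.

    Hypothetical convenience wrapper; conversion goes through str() so
    that 0.1 is read as the decimal 0.1 (an assumption about intent).
    """
    start, stop, step = (Decimal(str(v)) for v in (start, stop, step))
    if step == 0:
        raise ValueError("step must not be zero")
    ascending = step > 0
    current = start
    # Walk towards stop; a wrong-signed step yields nothing, like range().
    while (current < stop) if ascending else (current > stop):
        yield current
        current += step


print(list(decimal_range_any("0", "1", "0.25")))
# [Decimal('0'), Decimal('0.25'), Decimal('0.50'), Decimal('0.75')]
print([float(x) for x in decimal_range_any(0, 0.3, 0.1)])
# [0.0, 0.1, 0.2]
```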
Add auto-correction for the possibility of an incorrect sign on step:
def frange(start,step,stop):
    step *= 2*((stop>start)^(step<0))-1
    return [start+i*step for i in range(int((stop-start)/step))]
My solution:
def seq(start, stop, step=1, digit=0):
    x = float(start)
    v = []
    while x <= stop:
        v.append(round(x,digit))
        x += step
    return v
Here is my solution which works fine with float_range(-1, 0, 0.01) and works without floating point representation errors. It is not very fast, but works fine:
from decimal import Decimal
def get_multiplier(_from, _to, step):
    digits = []
    for number in [_from, _to, step]:
        pre = Decimal(str(number)) % 1
        digit = len(str(pre)) - 2
        digits.append(digit)
    max_digits = max(digits)
    return float(10 ** (max_digits))
def float_range(_from, _to, step, include=False):
    """Generates a range list of floating point values over the Range [start, stop]
    with step size step
    include=True - allows to include right value to if possible
    !! Works fine with floating point representation !!
    """
    mult = get_multiplier(_from, _to, step)
    # print mult
    int_from = int(round(_from * mult))
    int_to = int(round(_to * mult))
    int_step = int(round(step * mult))
    # print int_from,int_to,int_step
    if include:
        result = range(int_from, int_to + int_step, int_step)
        result = [r for r in result if r <= int_to]
    else:
        result = range(int_from, int_to, int_step)
    # print result
    float_result = [r / mult for r in result]
    return float_result
print(float_range(-1, 0, 0.01,include=False))
assert float_range(1.01, 2.06, 5.05 % 1, True) ==\
[1.01, 1.06, 1.11, 1.16, 1.21, 1.26, 1.31, 1.36, 1.41, 1.46, 1.51, 1.56, 1.61, 1.66, 1.71, 1.76, 1.81, 1.86, 1.91, 1.96, 2.01, 2.06]
assert float_range(1.01, 2.06, 5.05 % 1, False)==\
[1.01, 1.06, 1.11, 1.16, 1.21, 1.26, 1.31, 1.36, 1.41, 1.46, 1.51, 1.56, 1.61, 1.66, 1.71, 1.76, 1.81, 1.86, 1.91, 1.96, 2.01]
I am only a beginner, but I had the same problem when simulating some calculations. Here is how I worked around it, which seems to work with decimal steps.
I am also quite lazy, so I found it hard to write my own range function.
Basically, I changed my xrange(0.0, 1.0, 0.01) to xrange(0, 100, 1) and divided by 100.0 inside the loop.
I was also concerned about rounding errors, so I decided to test whether there are any. I had heard that if, for example, 0.01 from a calculation isn't exactly the float 0.01, comparing them should return False (if I am wrong, please let me know).
So I checked whether my solution works for my range by running a short test:
for d100 in xrange(0, 100, 1):
    d = d100 / 100.0
    fl = float("0.00"[:4 - len(str(d100))] + str(d100))
    print d, "=", fl , d == fl
And it printed True for each.
Now, if I'm getting it totally wrong, please let me know.
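For what it's worth, the observation holds up: dividing an integer counter produces the correctly rounded float each time, whereas repeatedly adding a float step lets rounding error build up. A small Python 3 sketch of the difference (the variable names are just for illustration):

```python
import math

# Dividing an integer counter: each result is rounded once, so it
# matches the float parsed from the corresponding decimal literal.
divided = [i / 100.0 for i in range(100)]
parsed = [float("0.%02d" % i) for i in range(100)]
print(divided == parsed)          # True

# Accumulating a float step: the rounding errors add up.
total = 0.0
for _ in range(10):
    total += 0.1
print(total == 1.0)               # False
print(math.isclose(total, 1.0))   # True
```

When exact equality is not guaranteed, math.isclose is usually a safer comparison than ==.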
The trick to avoid the round-off problem is to use a separate number to move through the range, one that starts half a step ahead of start.
# floating point range
def frange(a, b, stp=1.0):
    i = a+stp/2.0
    while i<b:
        yield a
        a += stp
        i += stp
Alternatively, numpy.arange can be used.
My answer is similar to others using map(), without need of NumPy, and without using lambda (though you could). To get a list of float values from 0.0 to t_max in steps of dt:
def xdt(n):
    return dt*float(n)
tlist = map(xdt, range(int(t_max/dt)+1))