improve performance of list creation - python

How can I significantly improve the speed of the following code? Can mapping, NumPy, matrix operations, or something else be used efficiently to eliminate the for loop?
import time

def func(x):
    if x % 2 == 0:
        return 'even'
    else:
        return 'odd'

starttime = time.time()
MAX = 1000000
y = list(range(MAX))
for n in range(MAX):
    y[n] = [n, n**2, func(n)]
print('That took {} seconds'.format(time.time() - starttime))
The following replacement does not improve the speed:
import numpy as np
r = np.array(range(MAX))
labels = ['even', 'odd']
result = np.array([r, r ** 2, list(map(lambda x: labels[x % 2], r))])
y = result.T

I think you can do it this way; the idea is to use as many NumPy built-in functions as possible:
%%timeit
y = np.arange(MAX)
y_2 = y**2
y_str = np.where(y%2==0,'even','odd')
res = np.rec.fromarrays((y,y_2,y_str), names=('y', 'y_2', 'y_str'))
#
# Some examples for working with the record array
res[3]
# (3, 9, 'odd')
res[:3]
# rec.array([(0, 0, 'even'), (1, 1, 'odd'), (2, 4, 'even')],
# dtype=[('y', '<i8'), ('y_2', '<i8'), ('y_str', '<U4')])
res['y_str'][:7]
# array(['even', 'odd', 'even', 'odd', 'even', 'odd', 'even'], dtype='<U4')
res.y_2[:7]
# array([ 0, 1, 4, 9, 16, 25, 36])
I have run several tests, and it is significantly faster.
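If you need a plain Python list like in the original code, the record array can be converted back with tolist() (one tuple per row); this is just a sketch of that step, and the conversion does cost extra time on top of building res:
y = res.tolist()
y[:3]
# [(0, 0, 'even'), (1, 1, 'odd'), (2, 4, 'even')]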

For large arrays of the same type, NumPy is the way to go. But np.where is relatively slow, so if you just want alternating 'odd' and 'even', you can use np.tile or something like it:
MAX = 1000000
%%timeit
y = np.arange(MAX)
ystr = np.where(y%2==0,'even','odd')
# 14.9 ms ± 61.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%%timeit
temp = np.array(['even', 'odd'])
ystr = np.tile(temp, MAX//2)
# 4.1 ms ± 112 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
So tile is about 3-4x faster.
If you want something more complex, I'd still try to avoid where if speed is important. There's almost always a way, because the logic inside a where call is usually simple enough to rewrite as an expression between NumPy arrays. (Also, to be sure, using NumPy with where is still much faster than pure Python lists; it's just usually slow relative to other NumPy options.)
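For the even/odd case specifically, one such rewrite (an untimed sketch, reusing y and MAX from the snippets above) is to index a small label array with the parity expression instead of calling where:
labels = np.array(['even', 'odd'])
ystr = labels[y % 2]  # fancy indexing with the parity values, no where needed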
The others are fairly obvious:
y = np.arange(MAX)
y2 = y**2
Personally, I'd just stick these together in a list,
result = [y, y2, ystr]
Putting this all together (using tile), I get:
# 6.82 ms ± 84.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
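For reference, the combined version being timed above is roughly the following, assembled from the snippets in this answer:
y = np.arange(MAX)
y2 = y**2
temp = np.array(['even', 'odd'])
ystr = np.tile(temp, MAX//2)
result = [y, y2, ystr]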

The short answer would be you can't.
Let's explore this a bit through examples.
# Original code
y = list(range(MAX))
for n in range(MAX):
    y[n] = [n, n**2, func(n)]
# That took 0.86 seconds
This is the result on my machine so we have a baseline for comparison.
Let us make that a single line and shave off some time.
y = [[n, n ** 2, func(n)] for n in range(MAX)]
# That took 0.74 seconds
We are creating a list of lists and the Python interpreter needs to allocate an empty list MAX times.
In case you don't need to change the number of elements after initialization it might be better to use tuples instead of lists.
y = [(n, n ** 2, func(n)) for n in range(MAX)]
# That took 0.43 seconds
This is twice as fast as the original method.
Let's now assume that we could optimize even more by using some special library, so that we only need to parse its result to populate the list. To simulate this, we can pickle the list to a binary format and then measure the time it takes to load it.
import pickle
b = pickle.dumps([(n, n ** 2, func(n)) for n in range(MAX)])
starttime = time.time()
y = pickle.loads(b)
print('That took {:.2f} seconds'.format(time.time() - starttime))
# That took 0.23 seconds
This is probably close to what is possible to achieve without coding anything in a lower level language like C and creating Python objects from that language.
Alternative approach
If there is no requirement to create exactly the same object as in the original example, and if it is enough that we can read y[10] or y[100:1000], we can do something completely different.
class LazyList():
    def __init__(self, size):
        self.size = size

    def __getitem__(self, key):
        if isinstance(key, slice):
            r = range(self.size)[key]
            return [(n, n ** 2, func(n)) for n in r]
        return (key, key ** 2, func(key))
starttime = time.time()
y = LazyList(MAX)
print('That took {:.6f} seconds'.format(time.time() - starttime))
# That took 0.000005 seconds
This is multiple orders of magnitude faster. Of course, this is not a list, and the results of the computation are not in memory. We created an object that acts like a list in some cases but not in others (e.g. y[MAX*2] will work, even though it shouldn't). Note that with more work the object can become even more similar to a list, and it could even use list as its base class.
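As a rough sketch of that idea (the class name here is just illustrative), adding __len__ and a bounds check already makes it behave much more like a list:
class LazyList2(LazyList):
    def __len__(self):
        return self.size

    def __getitem__(self, key):
        if isinstance(key, int):
            if key < 0:
                key += self.size  # support negative indices like a list
            if not 0 <= key < self.size:
                raise IndexError('list index out of range')
        return super().__getitem__(key)

y = LazyList2(MAX)
len(y), y[-1][0]
# (1000000, 999999)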
If the object we got is converted to a list, the process spends the time that was saved by the alternative approach and the result is the same as in one of the previous examples.
y = y[:]
# That took 0.43 seconds
The longer answer is that it depends on the type of the result that is expected.

See below (it takes ~1 sec on my Mac):
import time
MAX = 1000000
starttime = time.time()
y = [[n, n ** 2, 'even' if n % 2 == 0 else 'odd'] for n in range(MAX)]
print('That took {} seconds'.format(time.time() - starttime))

I would recommend you to run it with multiple processes in parallel:
import time
import multiprocessing
def func(x):
    return [x, x ** 2, "even" if x % 2 == 0 else "odd"]

if __name__ == '__main__':
    starttime = time.time()
    MAX = 1000000
    pool = multiprocessing.Pool(10)
    y = pool.map(func, range(MAX))
    print('That took {} seconds'.format(time.time() - starttime))
Try to tune the number of processes to get the optimal value for your environment. On mine, it took ~0.8 secs with 20 processes while your original snippet took ~1.1 secs.
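Depending on the workload, passing an explicit chunksize to pool.map may also reduce inter-process overhead; the value below is only an arbitrary starting point to tune, and the call still belongs under the if __name__ == '__main__': guard:
with multiprocessing.Pool(10) as pool:
    y = pool.map(func, range(MAX), chunksize=10000)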

Here is a numpy approach:
import time
import numpy as np

MAX = 1000000
starttime = time.time()
r = np.arange(MAX)
res = [r, r ** 2, np.where(r % 2, 'odd', 'even')]
print('That took {:.4} seconds'.format(time.time() - starttime))
# That took 0.05125 seconds || original function took 1.5s
As @Divakar pointed out, how to move on from here depends on what end result you want.
One option would be to have an object array with mixed types:
res = np.array(res, dtype=object).T
print('That took {:.4} seconds'.format(time.time() - starttime))
# That took 0.1863 seconds
res[17]
# array([17, 289, 'odd'], dtype=object)
res[18] + res[17]
# array([35, 613, 'evenodd'], dtype=object) # add for int and str
Unfortunately it is quite expensive to combine the 3 different arrays. It is still much faster than using loops, but depending on your next steps you may be able to make further improvements, as in the sketch below.
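One possible further improvement (an untimed sketch, similar in spirit to the record-array answer above; the dtype and field names are illustrative) is to skip the object-array conversion and store the three columns in a structured array, so every column keeps a homogeneous type:
dt = np.dtype([('n', np.int64), ('n_sq', np.int64), ('parity', 'U4')])
out = np.empty(MAX, dtype=dt)
out['n'] = r
out['n_sq'] = r ** 2
out['parity'] = np.where(r % 2, 'odd', 'even')
out[17]
# (17, 289, 'odd')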

On my computer:
the original loop took about 1.01 seconds
NumPy solution took 10.3 ms
Numba solution took 4.25 ms
from numba import njit
import numpy as np
def f(x):
    y = x ** 2
    z = x % 2
    return y, z

@njit
def g(x):
    y = x ** 2
    z = x % 2
    return y, z
n_max = 1_000_000
x = np.arange(n_max, dtype=int)
NumPy:
%%timeit
y, z = f(x)
10.3 ms ± 296 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
And Numba:
y, z = g(x) # don't time first run, which does compile AND execute
%%timeit
y, z = g(x)
4.25 ms ± 85.9 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
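If the 'even'/'odd' strings are needed as well, one option (a sketch, not timed) is to keep the jitted function purely numeric and map the parity codes to labels afterwards with NumPy indexing:
labels = np.array(['even', 'odd'])
y, z = g(x)         # z is 0 for even, 1 for odd
x_str = labels[z]   # map the integer codes to strings outside the jitted code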

Related

Building a numpy array as a function of previous element

I would like to create a numpy array where the first element is a defined constant, and each subsequent element is defined as a function of the previous element in the following way:
import numpy as np

def build_array_recursively(length, V_0, function):
    returnList = np.empty(length)
    returnList[0] = V_0
    for i in range(1, length):
        returnList[i] = function(returnList[i-1])
    return returnList

d_t = 0.05
print(build_array_recursively(20, 0.3, lambda x: x-x*d_t+x*x/2*d_t*d_t-x*x*x/6*d_t*d_t*d_t))
The print call above outputs
[0.3 0.28511194 0.27095747 0.25750095 0.24470843 0.23254756 0.22098752
0.20999896 0.19955394 0.18962586 0.18018937 0.17122037 0.16269589
0.15459409 0.14689418 0.13957638 0.13262186 0.1260127 0.11973187 0.11376316]
Is there a fast way of doing this in numpy without a for loop?
If so is there a way to handle two elements before the current one, e.g. can a Fibonacci array be constructed similarly?
I found a similar question here
Is it possible to vectorize recursive calculation of a NumPy array where each element depends on the previous one?
but it was not answered in general. In my example, the difference equation is difficult to solve manually.
This is faster for what you want to do. You don't have to use recursion for the function: calculate each element based on the previous one, append it to a list, and then convert the list to a NumPy array.
def method2(length, V_0, d_t):
    k = [V_0]
    x = V_0
    for i in range(1, length):
        x = x - x * d_t + x * x / 2 * d_t * d_t - x * x * x / 6 * d_t * d_t * d_t
        k.append(x)
    return np.asarray(k)

print(method2(20, 0.3, 0.05))
Running your existing method 10000 times takes 0.438 seconds, while method2 takes 0.097 seconds.
Using a function to make the code clearer (instead of the inline lambda):
def fn(x):
    return x-x*d_t+x*x/2*d_t*d_t-x*x*x/6*d_t*d_t*d_t
And a function that combines elements of build_array_recursively and method2:
def foo1(length, V_0, function):
    returnList = np.empty(length)
    returnList[0] = x = V_0
    for i in range(1, length):
        returnList[i] = x = function(x)
    return returnList
In [887]: timeit build_array_recursively(20,0.3, fn);
61.4 µs ± 63 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
In [888]: timeit method2(20,0.3, fn);
16.9 µs ± 103 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [889]: timeit foo1(20,0.3, fn);
13 µs ± 29.7 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
The main time saver in method2 and foo1 is carrying over x, the last value, from one iteration to the next, rather than indexing with returnList[i-1].
The accumulation method (assigning to a preallocated array versus appending to a list) is less important; performance is usually similar. Here the calculation is simple enough that the details of what you do in the loop make a big difference in the overall time.
All of these are loops. Some ufuncs have a reduce (and accumulate) method that can apply the function repeatedly to the elements of the input array; np.sum, np.cumsum, etc. make use of this. But you can't do that with a general Python function.
You have to use some sort of compilation tool like numba to perform this sort of loop much faster.
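For example, a sketch of that idea with numba (the update rule is written inline because passing an arbitrary Python function into an njit-compiled loop is not straightforward; timings will vary by machine):
from numba import njit
import numpy as np

@njit
def build_array_numba(length, V_0, d_t):
    out = np.empty(length)
    x = V_0
    out[0] = x
    for i in range(1, length):
        # same update rule as the lambda in the question
        x = x - x*d_t + x*x/2*d_t*d_t - x*x*x/6*d_t*d_t*d_t
        out[i] = x
    return out

print(build_array_numba(20, 0.3, 0.05))  # the first call also pays the compile cost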

Iterative outer addition Numpy

I want to apply outer addition of multiple vectors/matrices. Let's say four times:
import numpy as np
x = np.arange(100)
B = np.add.outer(x,x)
B = np.add.outer(B,x)
B = np.add.outer(B,x)
I would like it best if the number of additions could be a variable, like a=4 --> 4 additions. Is this possible?
Approach #1
Here's one with array-initialization -
n = 4 # number of iterations to add outer versions
l = len(x)
out = np.zeros([l]*n,dtype=x.dtype)
for i in range(n):
    out += x.reshape(np.insert([1]*(n-1),i,l))
Why this approach and not iterative addition that creates new arrays at each iteration?
Creating a new array at each iteration requires more memory and hence adds memory overhead. With array-initialization, we add the elements of x into an already initialized array, so this approach tries to be memory-efficient.
Alternative #1
We can remove one iteration by initializing the output with x. Hence, the changes would be -
out = np.broadcast_to(x,[l]*n).copy()
for i in range(n-1):
    out += x.reshape(np.insert([1]*(n-1),i,l))
Approach # 2: With np.add.reduce -
Another way would be with np.add.reduce, which again doesn't create any intermediate arrays. Being a reduction method, it might be better suited here, as that's what it's implemented for -
l = len(x); n = 4
np.add.reduce([x.reshape(np.insert([1]*(n-1),i,l)) for i in range(n)])
Timings -
In [17]: x = np.arange(100)
In [18]: %%timeit
...: n = 4 # number of iterations to add outer versions
...: l = len(x)
...: out = np.zeros([l]*n,dtype=x.dtype)
...: for i in range(n):
...: out += x.reshape(np.insert([1]*(n-1),i,l))
829 ms ± 28.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
In [19]: l = len(x); n = 4
In [20]: %timeit np.add.reduce([x.reshape(np.insert([1]*(n-1),i,l)) for i in range(n)])
183 ms ± 2.52 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
I don't think there's a builtin argument to repeat this procedure several times, but you can define a custom function for it fairly easily
def recursive_outer_add(arr, num):
    if num == 1:
        return arr
    x = np.add.outer(arr, arr)
    for i in range(num - 1):
        x = np.add.outer(x, arr)
    return x
Just as a warning: the array gets really big really fast. With l = 100 and n = 4 it already has 100**4 = 10**8 elements, which is about 800 MB at 8 bytes per element.
Short and reasonably fast:
n = 4
l = 10
x = np.arange(l)
sum(np.ix_(*n*(x,)))
from timeit import timeit
timeit(lambda:sum(np.ix_(*n*(x,))),number=1000)
# 0.049082988989539444
We can speed this up a little by going back to front:
timeit(lambda:sum(reversed(np.ix_(*n*(x,)))),number=1000)
# 0.03847671199764591
We can also build our own reversed np.ix_:
from operator import getitem
from itertools import accumulate,chain,repeat
sum(accumulate(chain((x,),repeat((slice(None),None),n-1)),getitem))
timeit(lambda:sum(accumulate(chain((x,),repeat((slice(None),None),n-1)),getitem)),number=1000)
# 0.02427654700295534
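As a quick sanity check (assuming n >= 2), the ix_-based sum matches the iterative np.add.outer construction from the question:
n, l = 4, 10
x = np.arange(l)
B = np.add.outer(x, x)        # iterative reference from the question
for _ in range(n - 2):
    B = np.add.outer(B, x)
print(np.array_equal(B, sum(np.ix_(*n*(x,)))))
# True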

How to effectively work on combinations along one dimension in np array

Given an n x m matrix S as a numpy array, I want to call a function f on pairs (S[i], S[j]) to calculate a particular value of interest, stored in a matrix with dimensions n x n. In my particular case the function f is commutative, so f(x,y) = f(y,x).
With all this in mind I am wondering if I can do any tricks to speed this up as much as I can, n can be fairly large.
When I time the function f, it's around a couple of microseconds, which is as expected. It's a pretty straightforward calculation. Below I show you the timings I got, compared with max() and sum() for reference.
In [19]: %timeit sum(s[11])
4.68 µs ± 56.5 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [20]: %timeit max(s[11])
3.61 µs ± 64.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [21]: %timeit f(s[11], s[21], 50, 10, 1e-5)
1.23 µs ± 7.25 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
In [22]: %timeit f(s[121], s[321], 50, 10, 1e-5)
1.26 µs ± 31.1 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
However when I time the overall processing time for a 500x50 sample data (resulting in 500 x 500 /2 = 125K comparisons), the overall time blows up significantly (into minutes). I would have expected something like 0.2-0.3 seconds (1.25E5 * 2E-6 sec/calc).
In [12]: @jit
    ...: def testf(s, n, m, p):
    ...:     tol = 1e-5
    ...:     sim = np.zeros((n,n))
    ...:     for i in range(n):
    ...:         for j in range(n):
    ...:             if i > j:
    ...:                 delta = p[i] - p[j]
    ...:                 if delta < 0:
    ...:                     res = f(s[i], s[j], m, abs(delta), tol) # <-- max(s[i])
    ...:                 else:
    ...:                     res = f(s[j], s[i], m, delta, tol) # <-- sum(s[j])
    ...:                 sim[i][j] = res
    ...:                 sim[j][i] = res
    ...:     return sim
In the code above I changed the lines where res is assigned to max() and sum() (the commented-out parts) for testing, and the code executes approximately 100 times faster, even though those functions themselves are slower than my function f().
Which brings me to my questions:
Can I avoid the double loop to speed this up? Ideally I want to be able to run this for matrices of size n = 1E5. (Comment: since the max and sum functions work considerably faster, my guess is that the for loops aren't the bottleneck here, but it's still good to know if there is a better way.)
What may be causing the severe slowdown in my function, if it's not the double for loop?
EDIT
Some comments asked about the specifics of the function f. It iterates over two arrays and counts the number of values in the two arrays that are "close enough". I removed the comments and changed some variable names, but the logic is as shown below. It was interesting to note that math.isclose(x, y, rel_tol), which is equivalent to the if-statements I have below, makes the code significantly slower, probably due to the overhead of the library call?
from numba import jit

@jit
def f(arr1, arr2, n, d, rel_tol):
    counter = 0
    i,j,k = 0,0,0
    while (i < n and j < n and k < n):
        val = arr1[j] + d
        if abs(arr1[i] - arr2[k]) < rel_tol * max(arr1[i], arr2[k]):
            counter += 1
            i += 1
            k += 1
        elif abs(val - arr2[k]) < rel_tol * max(val, arr2[k]):
            counter += 1
            j += 1
            k += 1
        else:
            # increment the index corresponding to the lightest
            if arr1[i] <= arr2[k] and arr1[i] <= val:
                if i < n:
                    i += 1
            elif val <= arr1[i] and val <= arr2[k]:
                if j < n:
                    j += 1
            else:
                k += 1
    return counter

How can I apply a vectorized function to the previous element of a numpy array?

I want to apply a function like this:
s[i] = a*x[i] + (1 - a)*s[i-1]
where s and x are both arrays of the same length.
I don't want to use a for loop as these arrays are very large (>50 mil). I have tried doing something like this
def f(a,x):
    s = [0]*len(x)
    s[i] = a*x[i] + (1 - a)*s[i-1]
    return s
but of course i isn't defined so this doesn't work.
Is there a way to do this using map or numpy.apply_along_axis or some other vectorized method?
I haven't come across a method that applies functions to current and previous elements of an array without using for loops, and that is really what I want to understand how to do here.
EDIT
To be unambiguous, here is the for loop implementation which works but that I want to avoid
s = [0]*len(x)
a = 0.45
for i in range(len(x)):
    s[i] = a*x[i] + (1-a)*s[i-1]
s[0] = x[0]  # reset value of s[0]
As I wrote in an answer to basically the same question, you can't:
There is no other way (in general) except for an explicit for loop.
This is because there is no way to parallelize this task across the
rows (since every row depends on some other row).
What makes this even harder is that you can easily generate chaotic
behavior, for example with the seemingly innocent looking
logistic map: x_{n+1} = r * x_n * (1 - x_n).
You can only find a way around this if you manage to find a closed
form, essentially eliminating the recurrence relation. But this has to
be done for each recurrence relation and I am pretty sure you are not
even guaranteed that a closed form exists...
You can avoid the loop, although rather than vectorizing it is more like "computing everything for each value".
import numpy as np
# Loop
def fun(x, a):
    s = [0] * len(x)
    for i in range(len(x)):
        s[i] = a * x[i] + (1 - a) * s[i - 1]
    s[0] = x[0]
    return s

# Vectorized
def fun_vec(x, a):
    x = np.asarray(x, dtype=np.float32)
    n = np.arange(len(x))
    p = a * (1 - a) ** n
    # Trick from here: https://stackoverflow.com/q/49532575/1782792
    pz = np.concatenate((np.zeros(len(p) - 1, dtype=p.dtype), p))
    pp = np.lib.stride_tricks.as_strided(
        pz[len(p) - 1:], (len(p), len(p)),
        (p.strides[0], -p.strides[0]), writeable=False)
    t = x[np.newaxis] * pp
    s = np.sum(t, axis=1)
    s[0] = x[0]
    return s
x = list(range(1, 11))
a = 0.45
print(np.allclose(fun(x, a), fun_vec(x, a)))
# True
This kind of strategy takes O(n^2) memory, though, and requires more computation. Depending on the case it can be faster due to parallelism (I did something similar to eliminate a tf.while_loop in TensorFlow with great success), but in this case it is actually slower:
x = list(range(1, 101))
a = 0.45
%timeit fun(x, a)
# 31 µs ± 85.2 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
%timeit fun_vec(x, a)
# 147 µs ± 2.42 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
So, there can be a non-loop version, but it is more of a curiosity than anything else.

How to generate list of unique random floats in Python

I know that there are easy ways to generate lists of unique random integers (e.g. random.sample(range(1, 100), 10)).
I wonder whether there is some better way of generating a list of unique random floats, apart from writing a function that acts like a range, but accepts floats like this:
import random
def float_range(start, stop, step):
    vals = []
    i = 0
    current_val = start
    while current_val < stop:
        vals.append(current_val)
        i += 1
        current_val = start + i * step
    return vals

unique_floats = random.sample(float_range(0, 2, 0.2), 3)
Is there a better way to do this?
Answer
One easy way is to keep a set of all random values seen so far and reselect if there is a repeat:
import random
def sample_floats(low, high, k=1):
    """ Return a k-length list of unique random floats
        in the range of low <= x <= high
    """
    result = []
    seen = set()
    for i in range(k):
        x = random.uniform(low, high)
        while x in seen:
            x = random.uniform(low, high)
        seen.add(x)
        result.append(x)
    return result
Notes
This technique is how Python's own random.sample() is implemented.
The function uses a set to track previous selections because searching a set is O(1) while searching a list is O(n).
Computing the probability of a duplicate selection is equivalent to the famous Birthday Problem.
Given 2**53 distinct possible values from random(), duplicates are infrequent.
On average, you can expect a duplicate float at about 120,000,000 samples.
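For reference, that figure follows from the usual birthday-problem approximation: the expected number of draws before the first collision among N equally likely values is roughly sqrt(pi/2 * N):
import math
math.sqrt(math.pi / 2 * 2**53)
# ~1.19e8, i.e. about 120 million samples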
Variant: Limited float range
If the population is limited to just a range of evenly spaced floats, then it is possible to use random.sample() directly. The only requirement is that the population be a Sequence:
from collections.abc import Sequence

class FRange(Sequence):
    """ Lazily evaluated floating point range of evenly spaced floats
        (inclusive at both ends)

        >>> list(FRange(low=10, high=20, num_points=5))
        [10.0, 12.5, 15.0, 17.5, 20.0]
    """
    def __init__(self, low, high, num_points):
        self.low = low
        self.high = high
        self.num_points = num_points

    def __len__(self):
        return self.num_points

    def __getitem__(self, index):
        if index < 0:
            index += len(self)
        if index < 0 or index >= len(self):
            raise IndexError('Out of range')
        p = index / (self.num_points - 1)
        return self.low * (1.0 - p) + self.high * p
Here is an example of choosing ten random samples without replacement from a range of 41 evenly spaced floats from 10.0 to 20.0.
>>> import random
>>> random.sample(FRange(low=10.0, high=20.0, num_points=41), k=10)
[13.25, 12.0, 15.25, 18.5, 19.75, 12.25, 15.75, 18.75, 13.0, 17.75]
You can easily use your list of integers to generate floats:
int_list = random.sample(range(1, 100), 10)
float_list = [x/10 for x in int_list]
Check out this Stack Overflow question about generating random floats.
If you want it to work with python2, add this import:
from __future__ import division
If you need to guarantee uniqueness, it may be more efficient to:
1. Try to generate n random floats in [lo, hi] at once.
2. If the number of unique floats is not n, generate however many floats are still needed,
and continue accordingly until you have enough, as opposed to generating them one by one in a Python-level loop that checks against a set.
If you can afford NumPy, doing so with np.random.uniform can be a huge speed-up.
import numpy as np
def gen_uniq_floats(lo, hi, n):
    out = np.empty(n)
    needed = n
    while needed != 0:
        arr = np.random.uniform(lo, hi, needed)
        uniqs = np.setdiff1d(np.unique(arr), out[:n-needed])
        out[n-needed: n-needed+uniqs.size] = uniqs
        needed -= uniqs.size
    np.random.shuffle(out)
    return out.tolist()
If you cannot use NumPy, depending on your data needs it may still be more efficient to apply the same concept: generate batches and check for duplicates afterwards, maintaining a set.
def no_depend_gen_uniq_floats(lo, hi, n):
    seen = set()
    needed = n
    while needed != 0:
        uniqs = {random.uniform(lo, hi) for _ in range(needed)}
        seen.update(uniqs)
        needed = n - len(seen)  # recompute, since uniqs may overlap with earlier batches
    return list(seen)
Rough benchmark
Extreme degenerate case
# Mitch's NumPy solution
%timeit gen_uniq_floats(0, 2**-50, 1000)
153 µs ± 3.71 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
# Mitch's Python-only solution
%timeit no_depend_gen_uniq_floats(0, 2**-50, 1000)
495 µs ± 43.9 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
# Raymond Hettinger's solution (single number generation)
%timeit sample_floats(0, 2**-50, 1000)
618 µs ± 13 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
More "normal" case (with larger sample)
# Mitch's NumPy solution
%timeit gen_uniq_floats(0, 1, 10**5)
15.6 ms ± 1.12 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)
# Mitch's Python-only solution
%timeit no_depend_gen_uniq_floats(0, 1, 10**5)
65.7 ms ± 2.31 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
# Raymond Hettinger's solution (single number generation)
%timeit sample_floats(0, 1, 10**5)
78.8 ms ± 4.22 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
You could just use random.uniform(start, stop). With double-precision floats, you can be relatively sure that they are unique if your set is small. If you want to generate a large number of random floats and need to avoid getting a number twice, check before adding them to the list.
However, if you are looking for a selection of specific numbers, this is not the solution.
import numpy

min_val = -5
max_val = 15
numpy.random.random_sample(15)*(max_val-min_val) + min_val
or use uniform
numpy.random.uniform(min_val,max_val,size=15)
As stated in the documentation, Python has the random.random() function:
import random
random.random()
Then you will get a float value such as 0.672807098390448.
So all you need to do is make a for loop and print out random.random():
>>> for i in range(10):
...     print(random.random())
more_itertools has a generic numeric_range that handles both integers and floats.
import random
import more_itertools as mit
random.sample(list(mit.numeric_range(0, 2, 0.2)), 3)
# [0.8, 1.0, 0.4]
random.sample(list(mit.numeric_range(10.0, 20.0, 0.25)), 10)
# [17.25, 12.0, 19.75, 14.25, 15.25, 12.75, 14.5, 15.75, 13.5, 18.25]
random.uniform generates float values:
import random
def get_random(low, high, length):
    lst = []
    while len(lst) < length:
        lst.append(random.uniform(low, high))
        lst = list(set(lst))
    return lst
