Array initialization in Python

I want to initialize an array with 10 values starting at X and incrementing by Y. I cannot use range() directly, because it takes the stop value rather than the number of values.
I can do this in a loop, as follows:
a = []
v = X
for i in range(10):
    a.append(v)
    v = v + Y
But I'm certain there's a cute python one liner to do this ...

>>> x = 2
>>> y = 3
>>> [i*y + x for i in range(10)]
[2, 5, 8, 11, 14, 17, 20, 23, 26, 29]

You can use this:
>>> x = 3
>>> y = 4
>>> range(x, x+10*y, y)
[3, 7, 11, 15, 19, 23, 27, 31, 35, 39]

Just another way of doing it
import itertools
Y=6
X=10
N=10
[y for x,y in zip(range(0,N),itertools.count(X,Y))]
[10, 16, 22, 28, 34, 40, 46, 52, 58, 64]
And yet another way (Python 2 only: tuple parameters in a lambda were removed in Python 3)
map(lambda (x,y):y,zip(range(0,N),itertools.count(X,Y)))
[10, 16, 22, 28, 34, 40, 46, 52, 58, 64]
And yet another way
import numpy
numpy.array(range(0,N))*Y+X
array([10, 16, 22, 28, 34, 40, 46, 52, 58, 64])
And even this (Python 2; in Python 3 use next(C) and range)
C=itertools.count(X,Y)
[C.next() for i in xrange(N)]
[10, 16, 22, 28, 34, 40, 46, 52, 58, 64]
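On Python 3 the itertools variants above need small changes, since zip and map are lazy and C.next() no longer exists. A minimal sketch, assuming Python 3; the helper name arithmetic_sequence is made up for illustration:

```python
from itertools import count, islice

def arithmetic_sequence(start, step, n):
    """Return n values beginning at start, each step apart."""
    # count(start, step) is an endless iterator; islice stops it after n items.
    return list(islice(count(start, step), n))

print(arithmetic_sequence(10, 6, 10))
# [10, 16, 22, 28, 34, 40, 46, 52, 58, 64]
```

Unlike range(x, x + n*y, y), this also works when the step is 0, because range() rejects a zero step.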

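[x+i*y for i in xrange(10)]
will do the job (note that the range must start at 0, otherwise the first value x is skipped and only 9 values are produced)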

If I understood your question correctly:
Y = 6
a = [x + Y for x in range(10)]
Edit: Oh, I see I misunderstood the question. Carry on.

Related

How to generate sequential subsets of integers?

I have the following start and end values:
start = 0
end = 54
I need to generate subsets of 4 sequential integers starting from start until end with a space of 20 between each subset. The result should be this one:
0, 1, 2, 3, 24, 25, 26, 27, 48, 49, 50, 51
In this example, we obtained 3 subsets:
0, 1, 2, 3
24, 25, 26, 27
48, 49, 50, 51
How can I do it using numpy or pandas?
If I do r = [i for i in range(0,54,4)], I get [0, 4, 8, 12, 16, 20, 24, 28, 32, 36, 40, 44, 48, 52].
This should get you what you want:
j = 20
k = 4
result = [split for i in range(0,55, j+k) for split in range(i, k+i)]
print (result)
Output:
[0, 1, 2, 3, 24, 25, 26, 27, 48, 49, 50, 51]
Maybe something like this:
r = [j for i in range(0, 54, 24) for j in range(i, i + 4)]
print(r)
[0, 1, 2, 3, 24, 25, 26, 27, 48, 49, 50, 51]
You can use numpy.arange, which returns an ndarray of evenly spaced values within a given range:
import numpy as np
r = np.arange(0, 54, 4)
print(r)
Result
[ 0  4  8 12 16 20 24 28 32 36 40 44 48 52]
Numpy approach
You can use np.arange to generate the starting points with a step of 20 + 4, where 20 is the gap between subsets and 4 is the length of each subset.
import numpy as np
start = 0
end = 54
out = np.arange(start, end, 24) # array([ 0, 24, 48]), the starting point of each subarray
step = np.tile(np.arange(4), (len(out), 1))
# [[0 1 2 3]
# [0 1 2 3]
# [0 1 2 3]]
res = out[:, None] + step
# array([[ 0, 1, 2, 3],
# [24, 25, 26, 27],
# [48, 49, 50, 51]])
This can be done with plain Python:
rangeStart = 0
rangeStop = 54
setLen = 4
step = 20
stepTot = step + setLen
a = list( list(i+s for s in range(setLen)) for i in range(rangeStart,rangeStop,stepTot))
In this case you will get the subsets as sublists in the array.
I don't think you need numpy or pandas for this. I achieved it with a simple while loop:
num = 0
end = 54
sequence = []
while num <= end:
    sequence.append(num)
    num += 1
    if num % 4 == 0:  # four numbers have been added
        num += 20
print(sequence)
# output: [0, 1, 2, 3, 24, 25, 26, 27, 48, 49, 50, 51]
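The comprehension-based answers above can be wrapped into a small reusable function. This is only a sketch; the name sequential_subsets and its parameters are illustrative and not part of any library:

```python
def sequential_subsets(start, end, size, gap):
    """Return runs of `size` consecutive integers, skipping `gap` integers between runs."""
    result = []
    # Each run begins size + gap after the previous one.
    for first in range(start, end + 1, size + gap):
        # Clip the run so it never goes past end.
        result.extend(range(first, min(first + size, end + 1)))
    return result

print(sequential_subsets(0, 54, 4, 20))
# [0, 1, 2, 3, 24, 25, 26, 27, 48, 49, 50, 51]
```

Whether a final, incomplete run should be clipped or dropped is a design choice; this version clips it at end.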

Multiprocessing of two for loops

I'm struggling with the implementation of an algorithm in python (2.7) to parallelize the computation of a physics problem. There's a parameter space over two variables (let's say a and b) over which I would like to run my written program f(a,b) which returns two other variables c and d.
Up to now, I worked with two for-loops over a and b to calculate two arrays for c and d which are then saved as txt documents. Since the parameter space is relatively large and each calculation of a point f(a,b) in it takes relatively long, it would be great to use all of my 8 CPU cores for the parameter space scan.
I've read about multithreading and multiprocessing and it seems that multiprocessing is what I'm searching for. Do you know of a good code example for this application or resources to learn about the basics of multiprocessing for my rather simple application?
Here is an example of how you might use multiprocessing with a simple function that takes two arguments and returns a tuple of two numbers, and a parameter space over which you want to do the calculation:
from itertools import product
from multiprocessing import Pool
import numpy as np
def f(a, b):
    c = a + b
    d = a * b
    return (c, d)
a_vals = [1, 2, 3, 4, 5, 6]
b_vals = [10, 11, 12, 13, 14, 15, 16, 17]
na = len(a_vals)
nb = len(b_vals)
p = Pool(8) # <== maximum number of simultaneous worker processes
answers = np.array(p.starmap(f, product(a_vals, b_vals))).reshape(na, nb, 2)
c_vals = answers[:,:,0]
d_vals = answers[:,:,1]
This gives the following:
>>> c_vals
array([[11, 12, 13, 14, 15, 16, 17, 18],
[12, 13, 14, 15, 16, 17, 18, 19],
[13, 14, 15, 16, 17, 18, 19, 20],
[14, 15, 16, 17, 18, 19, 20, 21],
[15, 16, 17, 18, 19, 20, 21, 22],
[16, 17, 18, 19, 20, 21, 22, 23]])
>>> d_vals
array([[ 10, 11, 12, 13, 14, 15, 16, 17],
[ 20, 22, 24, 26, 28, 30, 32, 34],
[ 30, 33, 36, 39, 42, 45, 48, 51],
[ 40, 44, 48, 52, 56, 60, 64, 68],
[ 50, 55, 60, 65, 70, 75, 80, 85],
[ 60, 66, 72, 78, 84, 90, 96, 102]])
The p.starmap returns a list of 2-tuples, from which the c and d values are then extracted.
This assumes that you will do your file I/O in the main program after getting back all the results.
Addendum:
If p.starmap is unavailable (Python 2), then instead you can change your function to take a single input (a 2-element tuple):
def f(inputs):
    a, b = inputs
    # ... etc as before ...
and then use p.map in place of p.starmap in the above code.
If it is not convenient to change the function (e.g. it is also called from elsewhere), then you can of course write a wrapper function:
def f_wrap(inputs):
    a, b = inputs
    return f(a, b)
and call that instead.
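Putting the wrapper approach together, a complete script might look like the sketch below. It assumes Python 3.3 or later (for using Pool as a context manager); f is a stand-in for the real calculation, and scan and its mapper parameter are hypothetical names chosen for illustration.

```python
from itertools import product
from multiprocessing import Pool

def f(a, b):
    # Stand-in for the expensive physics calculation.
    return (a + b, a * b)

def f_wrap(inputs):
    # pool.map passes a single argument, so unpack the (a, b) pair here.
    a, b = inputs
    return f(a, b)

def scan(a_vals, b_vals, mapper=map):
    """Evaluate f on every (a, b) pair; returns one row of (c, d) tuples per a value."""
    flat = list(mapper(f_wrap, product(a_vals, b_vals)))
    nb = len(b_vals)
    return [flat[i * nb:(i + 1) * nb] for i in range(len(a_vals))]

if __name__ == "__main__":
    # The guard keeps worker processes from re-running the scan when they import this module.
    with Pool(processes=8) as pool:
        grid = scan([1, 2, 3], [10, 11], mapper=pool.map)
    print(grid[0])  # results for a = 1
```

Passing the mapping function as a parameter makes it easy to compare the parallel run against the built-in map on a small grid before launching the full scan.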

Print list in specified range Python

I'm new to Python and I have this problem
I have a list of numbers like this:
n = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]
I want to print from 11 to 37, i.e. the output should be 11, 13, ..., 37.
I tried print(n[11:37]), but of course that prints [37, 41, 43, 47], because the slice bounds are indices, not values.
Any ideas, or does Python have a built-in method for this?
This should do the job...
n = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]
n.sort()
mylist = [x for x in n if x in range(11, 38)]
print(mylist)
To print that as a comma-separated string (a list has no strip method, so convert it to a string first):
print(str(mylist).strip('[]'))
This will work. (Assuming list is sorted)
print n[n.index(11): n.index(37)+1]
Output:
[11, 13, 17, 19, 23, 29, 31, 37]
Considering your list is ordered and it has no duplicates:
n = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]
print(",".join(map(str,n[n.index(11): n.index(37)+1])))
Using numpy:
import numpy as np
narr = np.array(n)
m = (narr >= 11) & (narr <= 37)
for v in narr[m]:
    print(v)
# or, to get rid of the loop:
print('\n'.join(map(str, narr[m])))
It's pretty simple. Since your list is already sorted, you can write:
my_list = [x for x in n if x in range(11, 38)]
print(*my_list)
The '*' unpacks the list into individual arguments to print (this is known as unpacking), so the values themselves are printed rather than the list.
If your data is sorted, you can use a generator expression with either a range object or chained comparisons:
n = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]
print(*(i for i in n if i in range(11, 38)), sep=', ')
print(*(i for i in n if 11 <= i <= 37), sep=', ')
If your data is unsorted and you can use indices of the first occurrences of each value, you can slice your list:
print(*n[n.index(11): n.index(37)+1], sep=', ')
Result with the data you have provided:
11, 13, 17, 19, 23, 29, 31, 37
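Note that n.index(11) raises ValueError when the bound itself is not in the list. For a sorted list, one alternative is to locate the slice boundaries with the standard bisect module. A minimal sketch, assuming the list is sorted; values_between is a hypothetical helper name:

```python
from bisect import bisect_left, bisect_right

def values_between(sorted_values, low, high):
    """Return the elements x of a sorted list with low <= x <= high."""
    lo = bisect_left(sorted_values, low)     # index of first element >= low
    hi = bisect_right(sorted_values, high)   # index just past last element <= high
    return sorted_values[lo:hi]

n = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]
print(*values_between(n, 11, 37), sep=', ')
# 11, 13, 17, 19, 23, 29, 31, 37
```

Calling values_between(n, 10, 40) returns the same elements, even though neither 10 nor 40 appears in n.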

Periodically slice a list/array

Suppose I have a = range(1,51). How can I slice a to create a new list that looks like this:
[1,2,3,11,12,13,21,22,23,31,32,33,41,42,43]
Is there a Pythonic way to do this without writing a function?
I know that [start:stop:step] periodically picks single elements, but I'm not sure whether I'm missing something obvious.
EDIT: The suggested duplicate question/answer is not the same as my question. I simply asked how to periodically slice/extract elements from a larger list/array. The suggested duplicate modifies elements of an existing array.
Another option is boolean-mask indexing, assuming a is a NumPy array (e.g. a = np.arange(1, 51)):
a[(a - 1) % 10 < 3]
# array([ 1, 2, 3, 11, 12, 13, 21, 22, 23, 31, 32, 33, 41, 42, 43])
(a - 1) % 10 gives each element's offset within its period of 10, and (a - 1) % 10 < 3 is a boolean mask that is True for the first three elements of every ten.
What you want is more complicated than a simple slice, so you're going to need some kind of (likely fairly simple) function to do it. I'd look at using zip to combine multiple slices, something like:
reduce(lambda a,b:a+b, map(list, zip(a[0::10], a[1::10], a[2::10])))
Given:
>>> li=range(1,52)
You can do:
>>> [l for sl in [li[i:i+3] for i in range(0,len(li),10)] for l in sl]
[1, 2, 3, 11, 12, 13, 21, 22, 23, 31, 32, 33, 41, 42, 43, 51]
Or, if you want only full sublists:
>>> [l for sl in [li[i:i+3] for i in range(0,len(li),10)] for l in sl if len(sl)==3]
[1, 2, 3, 11, 12, 13, 21, 22, 23, 31, 32, 33, 41, 42, 43]
Or, given:
>>> li=range(1,51)
Then you do not need to test sublists:
>>> [l for sl in [li[i:i+3] for i in range(0,len(li),10)] for l in sl]
[1, 2, 3, 11, 12, 13, 21, 22, 23, 31, 32, 33, 41, 42, 43]
Psidom's answer's index math can be adapted to a list comprehension too
a = range(1,51)
[n for n in a if (n - 1) % 10 < 3]
Out[23]: [1, 2, 3, 11, 12, 13, 21, 22, 23, 31, 32, 33, 41, 42, 43]
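The test (n - 1) % 10 < 3 works on the values, which is fine here only because each value equals its position plus one. A position-based variant works for arbitrary contents. This is a sketch; periodic_slice is a made-up helper name:

```python
def periodic_slice(seq, period, width):
    """Keep the first `width` items out of every `period` items, by position."""
    return [item for i, item in enumerate(seq) if i % period < width]

a = list(range(1, 51))
print(periodic_slice(a, 10, 3))
# [1, 2, 3, 11, 12, 13, 21, 22, 23, 31, 32, 33, 41, 42, 43]
```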

fast categorization (binning)

I have a huge number of entries, each of which is a float. These data x are accessible through an iterator. I need to classify all the entries using selections like 10<y<=20, 20<y<=50, ..., where y are data from another iterable. The number of entries is much larger than the number of selections. At the end I want a dictionary like:
{ 0: [all events with 10<x<=20],
1: [all events with 20<x<=50], ... }
or something similar. For example I'm doing:
for x, y in itertools.izip(variable_values, binning_values):
    thebin = binner_function(y)
    self.data[tuple(thebin)].append(x)
In general y is multidimensional.
This is very slow. Is there a faster solution, for example with numpy? I think the problem comes from the list.append method I'm using and not from binner_function.
A fast way to get the assignments in numpy is using np.digitize:
http://docs.scipy.org/doc/numpy/reference/generated/numpy.digitize.html
You'd still have to split the resulting assignments up into groups. If x or y is multidimensional, you will have to flatten the arrays first. You could then get the unique bin assignments and iterate over those in conjunction with np.where to split the assignments into groups. This will probably be faster if the number of bins is much smaller than the number of elements that need to be binned.
As a somewhat trivial example that you will need to tweak or elaborate on for your particular problem (but which is hopefully enough to get you started with a numpy solution):
In [1]: import numpy as np
In [2]: x = np.random.normal(size=(50,))
In [3]: b = np.linspace(-20,20,50)
In [4]: assign = np.digitize(x,b)
In [5]: assign
Out[5]:
array([23, 25, 25, 25, 24, 26, 24, 26, 23, 24, 25, 23, 26, 25, 27, 25, 25,
25, 25, 26, 26, 25, 25, 26, 24, 23, 25, 26, 26, 24, 24, 26, 27, 24,
25, 24, 23, 23, 26, 25, 24, 25, 25, 27, 26, 25, 27, 26, 26, 24])
In [6]: uid = np.unique(assign)
In [7]: adict = {}
In [8]: for ii in uid:
   ...:     adict[ii] = np.where(assign == ii)[0]
   ...:
In [9]: adict
Out[9]:
{23: array([ 0, 8, 11, 25, 36, 37]),
24: array([ 4, 6, 9, 24, 29, 30, 33, 35, 40, 49]),
25: array([ 1, 2, 3, 10, 13, 15, 16, 17, 18, 21, 22, 26, 34, 39, 41, 42, 45]),
26: array([ 5, 7, 12, 19, 20, 23, 27, 28, 31, 38, 44, 47, 48]),
27: array([14, 32, 43, 46])}
For dealing with flattening and then unflattening numpy arrays, see:
http://docs.scipy.org/doc/numpy/reference/generated/numpy.unravel_index.html
http://docs.scipy.org/doc/numpy/reference/generated/numpy.ravel_multi_index.html
np.searchsorted is your friend. As I read somewhere here in another answer to the same topic, it's currently a good bit faster than digitize, and does the same job.
http://docs.scipy.org/doc/numpy/reference/generated/numpy.searchsorted.html
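To illustrate the binary-search idea behind searchsorted without NumPy, here is a pure-Python sketch using the standard bisect module. bisect_left follows the same convention as np.searchsorted with side='left', which matches the right-closed intervals in the question. It assumes scalar y values and sorted bin edges; bin_values and its parameters are hypothetical names.

```python
from bisect import bisect_left
from collections import defaultdict

def bin_values(values, keys, edges):
    """Group values by the right-closed interval (edges[i], edges[i+1]] containing each key."""
    bins = defaultdict(list)
    for x, y in zip(values, keys):
        # bisect_left returns i with edges[i-1] < y <= edges[i].
        i = bisect_left(edges, y)
        if 0 < i < len(edges):       # skip keys outside (edges[0], edges[-1]]
            bins[i - 1].append(x)
    return dict(bins)

edges = [10, 20, 50]
print(bin_values([0.1, 0.2, 0.3, 0.4], [15, 20, 35, 60], edges))
# {0: [0.1, 0.2], 1: [0.3]}
```

Here bin 0 is 10<y<=20 and bin 1 is 20<y<=50, as in the dictionary layout the question asks for. For large inputs the vectorized NumPy call should be much faster than this loop.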
