Generating all possible combinations of a zeros and b ones - python

Is there an efficient way to generate a list (or an array) of all possible combinations of say 2 ones and 8 zeros? E.g.
[[0,0,0,0,0,0,0,0,1,1],
[0,0,0,0,0,0,0,1,0,1],
...]
This works, but there could be a better way?
import itertools
import numpy as np

result = []
for subset in itertools.combinations(range(10), 2):
    subset = list(subset)
    c = np.zeros(10)
    c[subset] = 1
    result.append(c)
Would love to have some ideas on how to optimize this code.

Well, it's not much different, but doing bulk operations on NumPy arrays is bound to have much less overhead:
import itertools
import numpy
which = numpy.array(list(itertools.combinations(range(10), 2)))
grid = numpy.zeros((len(which), 10), dtype="int8")
# Magic: the column of row indices broadcasts against `which`, so row k
# of grid gets 1s at the two positions listed in which[k]
grid[numpy.arange(len(which))[None].T, which] = 1
grid
#>>> array([[1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
#>>> [1, 0, 1, 0, 0, 0, 0, 0, 0, 0],
#>>> [1, 0, 0, 1, 0, 0, 0, 0, 0, 0],
#>>> [1, 0, 0, 0, 1, 0, 0, 0, 0, 0],
#>>> [1, 0, 0, 0, 0, 1, 0, 0, 0, 0],
#>>> ...
The bulk of the time is then spent doing numpy.array(list(itertools.combinations(range(10), 2))). I tried using numpy.fromiter but I didn't get any speed improvements. Since half the time is literally generating the tuples, the only real way to improve further is to generate the combinations in something like C or Cython.
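For reference, a sketch of what that numpy.fromiter variant might look like (this is not the exact code that was timed):
import itertools
import numpy

flat = itertools.chain.from_iterable(itertools.combinations(range(10), 2))
which = numpy.fromiter(flat, dtype="int8").reshape(-1, 2)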

Alternative using numpy.bincount:
>>> [np.bincount(xs, minlength=10) for xs in itertools.combinations(range(10), 2)]
[array([1, 1, 0, 0, 0, 0, 0, 0, 0, 0], dtype=int64),
array([1, 0, 1, 0, 0, 0, 0, 0, 0, 0], dtype=int64),
array([1, 0, 0, 1, 0, 0, 0, 0, 0, 0], dtype=int64),
array([1, 0, 0, 0, 1, 0, 0, 0, 0, 0], dtype=int64),
...]

Shouldn't we be using permutations for this? E.g.,
from itertools import permutations as perm
a, b = 6, 2
print '\n'.join(sorted([''.join(s) for s in set(t for t in perm(a*'0' + b*'1'))]))


Place mydata_array into random locations of a big array of zeros

mydata is a numpy array of shape (10,100,100), in the form (z,y,x), and I have created an empty array of shape (10,800,800). Now I need to place mydata into some random locations of the empty array, so that if I plot the output, mydata appears at random positions in the resulting (10,800,800) array.
I used np.hstack() and np.vstack(),
but they place mydata side by side. I need to place it at random locations.
How could I do this? Any suggestions please.
Regards,
Raj
Here's a demonstration of placing several copies of one array inside another, using slice indexing:
In [802]: out = np.zeros((10,10),int)
In [803]: src = np.arange(6).reshape(2,3)
In [804]: out
Out[804]:
array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
...
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])
One copy in the upper left:
In [805]: out[:2,:3] = src
In [806]: out
Out[806]:
array([[0, 1, 2, 0, 0, 0, 0, 0, 0, 0],
[3, 4, 5, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
....
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])
Several more copies:
In [808]: out[4:6, 6:9] = src
In [809]: out[1:3, 4:7] = src
In [810]: out
Out[810]:
array([[0, 1, 2, 0, 0, 0, 0, 0, 0, 0],
[3, 4, 5, 0, 0, 1, 2, 0, 0, 0],
[0, 0, 0, 0, 3, 4, 5, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 1, 2, 0],
[0, 0, 0, 0, 0, 0, 3, 4, 5, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])
Just repeat that kind of action for a selection of random locations. Make sure that the slice ranges match the src shape, and that they lie within the dimensions of the target array.
While it may be possible to insert many copies at once (some flattening may be needed), let's start with understanding how to insert one copy at a time.
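Putting the pieces together, a minimal sketch of the repeated random placement (the number of copies and the RNG choice are illustrative, not from the question):
import numpy as np

src = np.arange(6).reshape(2, 3)
out = np.zeros((10, 10), int)
n, m = src.shape
rng = np.random.default_rng()
for _ in range(3):                              # place 3 copies
    i = rng.integers(0, out.shape[0] - n + 1)   # random top-left corner
    j = rng.integers(0, out.shape[1] - m + 1)
    out[i:i+n, j:j+m] = src                     # later copies may overwrite earlier ones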
=========
@alvis' answer places the src items in shuffled order on one row of the out (or wrapped rows):
array([[2, 4, 5, 3, 0, 1, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
...
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])
===================
Looped placement of multiple blocks:
def foo1(src, idx, NM):
    out = np.zeros(NM, dtype=src.dtype)
    n, m = src.shape
    for i, j in idx:
        out[i:i+n, j:j+m] = src
    return out

idx = np.array([[0,0],[1,4],[4,4],[8,7],[7,2]])
In [940]: out1 = foo1(src, idx, (10,10))
In [941]: out1
Out[941]:
array([[0, 1, 2, 0, 0, 0, 0, 0, 0, 0],
[3, 4, 5, 0, 0, 1, 2, 0, 0, 0],
[0, 0, 0, 0, 3, 4, 5, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 1, 2, 0, 0, 0],
[0, 0, 0, 0, 3, 4, 5, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 1, 2, 0, 0, 0, 0, 0],
[0, 0, 3, 4, 5, 0, 0, 0, 1, 2],
[0, 0, 0, 0, 0, 0, 0, 3, 4, 5]])
================
Placement of a block with advanced indexing (arrays instead of slices):
In [880]: I = np.array([1,1,1,2,2,2])
In [881]: J = np.array([3,4,5,3,4,5])
In [882]: out[I,J] = src.flat
In [883]: out
Out[883]:
array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 1, 2, 0, 0, 0, 0],
[0, 0, 0, 3, 4, 5, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
...
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])
And for multiple blocks
def foo2(src, idx, NM):
    out = np.zeros(NM, dtype=src.dtype)
    n, m = src.shape
    ni = len(idx)
    IJ = [np.mgrid[i:i+n, j:j+m] for i, j in idx]
    IJ = np.concatenate(IJ, axis=1).reshape(2, -1)
    out[IJ[0, :], IJ[1, :]] = np.tile(src, (ni, 1)).flat
    return out
In this small example the alternate is considerably slower (14x). For (1000,1000) out it is still slow (6x). Most of the time is spent in generating IJ.
This handles the I,J index calculation much faster (though it needs to be generalized; see the sketch after the function), but it is still slower than the looped slicing:
def foo3(src, idx, NM):
    out = np.zeros(NM, dtype=src.dtype)
    n, m = src.shape
    ni = len(idx)
    I = np.repeat((idx[:, [0]] + np.arange(2)).flatten(), 3)
    J = np.repeat((idx[:, [1]] + np.arange(3)), 2, axis=0).flatten()
    out[I, J] = np.tile(src, (ni, 1)).flat
    return out
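A possible way to generalize foo3's hard-coded 2 and 3 to any src shape (a sketch along the same lines, not timed):
def foo3_general(src, idx, NM):
    out = np.zeros(NM, dtype=src.dtype)
    n, m = src.shape
    ni = len(idx)
    # row index of each placed element, each repeated across the m columns
    I = np.repeat((idx[:, [0]] + np.arange(n)).ravel(), m)
    # column indices of each placed row, repeated for the n rows of src
    J = np.repeat(idx[:, [1]] + np.arange(m), n, axis=0).ravel()
    out[I, J] = np.tile(src, (ni, 1)).flat
    return out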
This reminds me of work I did years ago to speed up the creation of a finite element stiffness matrix in MATLAB. There it was per-element stiffness blocks that needed to be placed in a large sparse global stiffness matrix.
==================
Regular pattern with broadcasting (see edit history)
According to your question, you don't need to preserve elements relative to the first dimension of your array. For example, if there is one non-zero element a in the (100,100) matrix at z=0, and two elements b and c in the matrix at z=1, then in your output all of a, b, and c can appear at z=0. In this case I suggest the following solution:
import numpy as np
#replace this with your input data
mydata = np.ones((10,100,100))
mydata_large = np.zeros((10,800,800))
mydata_flatten = mydata.flatten()
ind = np.array([i for i in range(len(mydata_flatten))])
np.random.shuffle(ind)
mydata_large_f = mydata_large.flatten()
np.put(mydata_large_f,ind[:len(mydata_flatten)],mydata_flatten)
mydata_large = np.reshape(mydata_large_f, (10,800,800))
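As hpaulj notes above, this scatters the values only within the first len(mydata_flatten) flat positions of the large array. A small variation (not from the answer), replacing the index generation above, that scatters over the whole target:
ind = np.arange(mydata_large.size)          # indices over the whole large array
np.random.shuffle(ind)
mydata_large_f = mydata_large.flatten()
np.put(mydata_large_f, ind[:mydata_flatten.size], mydata_flatten)
mydata_large = mydata_large_f.reshape(10, 800, 800)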

How can I manipulate a list of lists?

How can I iterate through a list of lists so that any cell containing a 1 also sets its neighbouring cells (top, top-left, top-right, bottom, bottom-left, bottom-right) to 1, as shown below, turning list_1 into list_2?
list_1 =[[0,0,0,0,0,0,0,0],
[0,0,0,0,0,0,0,0],
[0,0,0,1,0,0,0,0],
[0,0,0,0,0,0,0,0]]
list_2 =[[0,0,0,0,0,0,0,0],
[0,0,1,1,1,0,0,0],
[0,0,1,1,1,0,0,0],
[0,0,1,1,1,0,0,0]]
This is a common operation known as "dilation" in image processing. Your problem is 2-dimensional, so you would be best served using
a more appropriate 2-d data structure than a list of lists, and
an already available library function, rather than reinvent the wheel
Here is an example using a numpy ndarray and scipy's binary_dilation respectively:
>>> import numpy as np
>>> from scipy import ndimage
>>> a = np.array([[0,0,0,0,0,0,0,0],
[0,0,0,0,0,0,0,0],
[0,0,0,1,0,0,0,0],
[0,0,0,0,0,0,0,0]], dtype=int)
>>> ndimage.binary_dilation(a, structure=ndimage.generate_binary_structure(2, 2)).astype(a.dtype)
array([[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 1, 1, 1, 0, 0, 0],
[0, 0, 1, 1, 1, 0, 0, 0],
[0, 0, 1, 1, 1, 0, 0, 0]])
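If a plain list of lists is wanted back (as in the question's list_2), the result converts with .tolist() (a small follow-up, not part of the original answer):
list_2 = ndimage.binary_dilation(
    a, structure=ndimage.generate_binary_structure(2, 2)
).astype(int).tolist()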
This uses numpy, which is generally better suited than nested lists for manipulating 2-D data. If you're doing image analysis, see @wim's answer; otherwise, here is how you could manage it with numpy only.
> import numpy as np
> list_1 =[[0,0,0,0,0,0,0,0],
[0,0,0,0,0,0,0,0],
[0,0,0,1,0,0,0,0],
[0,0,0,0,0,0,0,0]]
> l = np.array(list_1) # convert the list into a numpy array
> pos = np.where(l==1) # get the position where the array is equal to one
> pos
(array([2]), array([3]))
# make a lambda function to limit the lower indexes:
get_low = lambda x: x-1 if x>0 else x
# get_high is not needed.
# slice the array around that position and set the value to one
> l[get_low(pos[0]):pos[0]+2,
get_low(pos[1]):pos[1]+2] = 1
> l
array([[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 1, 1, 1, 0, 0, 0],
[0, 0, 1, 1, 1, 0, 0, 0],
[0, 0, 1, 1, 1, 0, 0, 0]])
# the same idea with a 1 in the bottom-right corner (corner is a 4x8
# array of zeros with corner[3,7] = 1); no upper clamp is needed because
# slices past the end of the array are truncated automatically:
> corner
array([[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 1]])
> p = np.where(corner==1)
> corner[get_low(p[0]):p[0]+2,
get_low(p[1]):p[1]+2] = 1
> corner
array([[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 1, 1],
[0, 0, 0, 0, 0, 0, 1, 1]])
HTH
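For completeness, a minimal pure-Python sketch (not from either answer) that performs the same 3x3 dilation directly on the list of lists:
def dilate(grid):
    rows, cols = len(grid), len(grid[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == 1:
                # set the cell and all its in-bounds neighbours to 1
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        rr, cc = r + dr, c + dc
                        if 0 <= rr < rows and 0 <= cc < cols:
                            out[rr][cc] = 1
    return out

list_2 = dilate(list_1)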

How to use combinations

I have 6 lists, say,
a=[1,1,0,0]
b=[0,1,1,0]
c=[0,0,1,1]
d .... until f.
I want to generate the result of the sum for all possible combinations of the lists, from 2 lists up to 6 lists. For example, I want to calculate a+b, a+c, ..., a+f, then a+b+c, a+b+d, etc. I know how to compute the result for two or three lists, but I am stuck on how to generate the combinations of lists. I tried to define a list of lists and use combinations with argument 2 to generate all possible 2-combinations of 3 lists (as an example), as follows:
import itertools
alphabet = [[0,0,0],[0,0,1],[0,1,0]]
combos = itertools.combinations(alphabet, 2)
usable_combos = []
for e in combos:
    usable_combos.append(e)
But this simply does not produce anything. When I print usable_combos, I get:
[[0,0,0],[0,0,1],[0,1,0]]
My question is: using combinations, how can I produce all possible combinations (taking from 2 up to 6 lists at a time) of the 6 different lists I have?
Use range(1, len(lis)+1) to get the value for the second parameter (r) that is passed to combinations, or range(2, len(lis)+1) if you want to start from 2.
>>> from itertools import combinations
>>> lis = [[0,0,0],[0,0,1],[0,1,0]]
>>> for i in range(1, len(lis)+1):
...     for c in combinations(lis,i):
...         print c
...
([0, 0, 0],)
([0, 0, 1],)
([0, 1, 0],)
([0, 0, 0], [0, 0, 1])
([0, 0, 0], [0, 1, 0])
([0, 0, 1], [0, 1, 0])
([0, 0, 0], [0, 0, 1], [0, 1, 0])
As pointed out by @abarnert in the comments, maybe you want this:
>>> from pprint import pprint
>>> from itertools import chain, permutations
>>> flatten = chain.from_iterable
>>> ans = [list(flatten(c)) for i in range(2, len(lis)+1) for c in permutations(lis,i)]
>>> pprint(ans)
[[0, 0, 0, 0, 0, 1],
[0, 0, 0, 0, 1, 0],
[0, 0, 1, 0, 0, 0],
[0, 0, 1, 0, 1, 0],
[0, 1, 0, 0, 0, 0],
[0, 1, 0, 0, 0, 1],
[0, 0, 0, 0, 0, 1, 0, 1, 0],
[0, 0, 0, 0, 1, 0, 0, 0, 1],
[0, 0, 1, 0, 0, 0, 0, 1, 0],
[0, 0, 1, 0, 1, 0, 0, 0, 0],
[0, 1, 0, 0, 0, 0, 0, 0, 1],
[0, 1, 0, 0, 0, 1, 0, 0, 0]]
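And for the element-wise sums the question originally asked about, a small sketch (an addition, not part of the answer) that sums each position across the chosen lists:
from itertools import combinations

lists = [[1, 1, 0, 0], [0, 1, 1, 0], [0, 0, 1, 1]]      # a, b, c from the question
sums = [[sum(col) for col in zip(*combo)]               # element-wise sum of each combination
        for r in range(2, len(lists) + 1)
        for combo in combinations(lists, r)]
print(sums)
# [[1, 2, 1, 0], [1, 1, 1, 1], [0, 1, 2, 1], [1, 2, 2, 1]]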

2d array of zeros

Python has no built-in 2-d array type, but we can emulate one with lists of lists. I want a 2-d array-like structure filled with zeros. My question is: what is the difference, if any, between these two expressions:
zeros = [[0 for i in xrange(M)] for j in xrange(N)]
and
zeros = [[0]*M]*N
Will zeros be the same in both cases? Which one is better to use in terms of speed and readability?
You should use numpy.zeros. If that isn't an option, you want the first version. In the second version, if you change one value, it will be changed elsewhere in the list -- e.g.:
>>> a = [[0]*10]*10
>>> a
[[0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]]
>>> a[0][0] = 1
>>> a
[[1, 0, 0, 0, 0, 0, 0, 0, 0, 0], [1, 0, 0, 0, 0, 0, 0, 0, 0, 0], [1, 0, 0, 0, 0, 0, 0, 0, 0, 0], [1, 0, 0, 0, 0, 0, 0, 0, 0, 0], [1, 0, 0, 0, 0, 0, 0, 0, 0, 0], [1, 0, 0, 0, 0, 0, 0, 0, 0, 0], [1, 0, 0, 0, 0, 0, 0, 0, 0, 0], [1, 0, 0, 0, 0, 0, 0, 0, 0, 0], [1, 0, 0, 0, 0, 0, 0, 0, 0, 0], [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]]
This is because (as you read the expression from the inside out), you create a list of 10 zeros. You then create a list of 10 references to that initial list of 10 zeros.
Note that:
zeros = [ [0]*M for _ in range(N) ] # Use xrange if you're still stuck in the python2.x dark ages :).
will also work and it avoids the nested list comprehension. If numpy isn't on the table, this is the form I would use.
For Python 3 (no more xrange), the preferred form is
zeros = [[0] * N for _ in range(M)]
for an M x N array of zeros.
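For reference, a minimal sketch of the numpy.zeros option mentioned at the top (the concrete sizes are just example values):
import numpy as np

M, N = 4, 3
zeros = np.zeros((N, M), dtype=int)   # N rows of M zeros, no shared references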
In the second case you create a list of references to the same list. If you have code like:
[lst] * N
where the lst is a reference to a list, you will have the following list:
[lst, lst, lst, lst, ..., lst]
But because the result list contains references to the same object, if you change a value in one row it will be changed in all other rows.
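A quick way to see the shared references (a small check, not from the answer):
a = [[0] * 3] * 2
print(a[0] is a[1])                 # True: both rows are the same list object
b = [[0] * 3 for _ in range(2)]
print(b[0] is b[1])                 # False: each row is a separate list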
Zhe Hu's answer is the safer one and should have been the best answer, because if we use the accepted answer's method
a = [[0] * 2] * 2
a[0][0] = 1
print(a)
will give the answer
[[1,0],[1,0]]
So even though you just want to update the first row, first column value, all the values in that column get updated. However
a = [[0] * 2 for _ in range(2)]
a[0][0] = 1
print(a)
gives the correct answer
[[1,0],[0,0]]

Selecting specific column in each row from array

I am trying to select specific column elements for each row of a numpy array. For example, in the following example:
In [1]: a = np.random.random((3,2))
Out[1]:
array([[ 0.75670668, 0.1283942 ],
[ 0.51326555, 0.59378083],
[ 0.03219789, 0.53612603]])
I would like to select the first element of the first row, the second element of the second row, and the first element of the third row. So I tried to do the following:
In [2]: b = np.array([0,1,0])
In [3]: a[:,b]
But this produces the following output:
Out[3]:
array([[ 0.75670668, 0.1283942 , 0.75670668],
[ 0.51326555, 0.59378083, 0.51326555],
[ 0.03219789, 0.53612603, 0.03219789]])
which clearly is not what I am looking for. Is there an easy way to do what I would like to do without using loops?
You can use:
a[np.arange(3), (0,1,0)]
in your example above.
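With the b from the question, the same pattern generalizes to any number of rows (a sketch using the question's names):
import numpy as np

a = np.random.random((3, 2))
b = np.array([0, 1, 0])                 # one column index per row
picked = a[np.arange(a.shape[0]), b]    # element b[i] from row i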
OK, just to clarify here, let's do a simple example:
A = np.diag(np.arange(0, 10, 1))
gives
array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 2, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 3, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 4, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 5, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 6, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 7, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 8, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 9]])
then
A[0][0:4]
gives
array([0, 0, 0, 0])
that is, the first row, elements 0 to 3. But
A[0:4][1]
doesn't give the 2nd element of each of the first 4 rows. Instead we get
array([0, 1, 0, 0, 0, 0, 0, 0, 0, 0])
i.e. the entire 2nd row: A[0:4] first selects the first 4 rows, and [1] then picks the second of those rows (which, for this diagonal matrix, looks the same as the 2nd column).
A[0:4,1]
gives
array([0, 1, 0, 0])
I'm sure there is a very good reason for this that makes perfect sense to programmers,
but for those of us uninitiated in that great religion it can be quite confusing.
This isn't an answer so much as an attempt to document this a bit. For the answer above, we would have:
>>> import numpy as np
>>> A = np.array(range(6))
>>> A
array([0, 1, 2, 3, 4, 5])
>>> A.shape = (3,2)
>>> A
array([[0, 1],
[2, 3],
[4, 5]])
>>> A[(0,1,2),(0,1,0)]
array([0, 3, 4])
Specifying a list (or tuple) of individual row and column coordinates allows fancy indexing of the array. The first example in the comment looks similar at first, but the indices are slices. They don't extend over the whole range, and the shape of the array that is returned is different:
>>> A[0:2,0:2]
array([[0, 1],
[2, 3]])
For the second example in the comment
>>> A[[0,1],[0,1]]
array([0, 3])
So it seems that slices behave differently, but apart from that, regardless of how the indices are constructed, you can specify a tuple or list of (row indices, column indices) and recover those specific elements from the array.
