How to concatenate an empty array with Numpy.concatenate? - python

I need to create an array of a specific size (m×n) filled with 'empty' values, so that when I concatenate to that array the initial values are overwritten by the added values.
My current code:
a = numpy.empty([2,2]) # Make empty 2x2 matrix
b = numpy.array([[1,2],[3,4]]) # Make example 2x2 matrix
myArray = numpy.concatenate((a,b)) # Combine empty and example arrays
Unfortunately, I end up making a 4x2 matrix instead of a 2x2 matrix with the values of b.
Is there any way to make a truly empty array of a certain size, so that when I concatenate to it the result contains only my added values instead of the default values plus the added values?

Like Oniow said, concatenate does exactly what you saw.
If you want 'default values' that behave differently from regular scalar elements, I would suggest initializing your array with NaNs as the 'default value'. If I understand your question, you want to merge matrices so that regular scalars override your 'default value' elements.
In any case, I suggest adding the following:
import numpy as np

def get_default(size_x, size_y):
    # Returns a new matrix filled with 'default values' (NaN)
    tmp = np.empty([size_x, size_y])
    tmp.fill(np.nan)
    return tmp
And also:
def merge(a, b):
    # Take each element from a unless it is NaN; in that case take it from b
    l = np.vectorize(lambda x, y: y if np.isnan(x) else x)
    return l(a, b)  # call the vectorized function directly (map would return rows, not an array)
Note that if you merge two matrices and both values are non-'default', it will take the value of the left matrix.
Using NaN as the default value gives the expected behavior for a default: for example, any math op involving it yields 'default' again, which reflects that you don't really care about that index in the matrix.
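A quick sanity check of the two helpers above (a minimal sketch, assuming the definitions as written):
import numpy as np

a = get_default(2, 2)                   # all-NaN 'default' matrix
b = np.array([[1.0, 2.0], [3.0, 4.0]])
print(merge(a, b))                      # [[1. 2.] [3. 4.]] -- b overrides the defaults
print(np.nan + 1.0)                     # nan -- math ops propagate the 'default'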

If I understand your question correctly, concatenate is not what you are looking for. Concatenate does exactly what you saw: it joins arrays along an axis.
If you are trying to have an empty matrix that becomes the values of another you could do the following:
import numpy as np
a = np.zeros([2,2])
b = np.array([[1,2],[3,4]])
my_array = a + b
--or--
import numpy as np
my_array = np.zeros([2,2]) # you can use empty here instead in this case.
my_array[0,0] = float(input('Enter first value: ')) # However you get your data to put them into the arrays.
But I am guessing that is not what you really want, as you could just use my_array = b. If you edit your question with more info, I may be able to help more.
If you are worried about values accumulating over time in your array...
import numpy as np

my_array = b               # b is some other 2x2 matrix
''' Do stuff '''
# new_b appears with updated values
my_array = new_b           # rebind to the new values; they will not add up
# Note: my_array and new_b are now the same object, so in-place changes to
# my_array will carry over to new_b. To avoid this at the cost of some memory:
my_array = np.copy(new_b)
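A minimal sketch of the aliasing point in that last comment, using a dummy new_b:
import numpy as np

new_b = np.array([[1, 2], [3, 4]])
my_array = new_b            # both names refer to the same object
my_array[0, 0] = 99
print(new_b[0, 0])          # 99 -- the in-place change shows through new_b

safe = np.copy(new_b)       # independent copy
safe[0, 0] = 0
print(new_b[0, 0])          # still 99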

Related

Python : equivalent of Matlab ismember on rows for large arrays

I can't find an efficient way to reproduce Matlab's "ismember(a,b,'rows')" with Python, where a and b are arrays of size (ma,2) and (mb,2) respectively, and ma and mb are the numbers of couples.
The ismember module (https://pypi.org/project/ismember/) crashes because at some point, i.e. when doing np.all(a[:, None] == b, axis=2).any(axis=1), it needs to create an array of size (ma,mb,2), which is too big. Moreover, even when the function works (because the arrays are small enough), it is about 100 times slower than in Matlab. I guess that is because Matlab uses a built-in mex function. Why doesn't Python have what I would think to be such an important function? I use it countless times in my calculations...
P.S.: the solution proposed here, Python version of ismember with 'rows' and index, does not correspond to the true Matlab ismember function, since it does not work row by row, i.e. it does not verify that a couple of values of 'a' exists as a row in 'b', but only that the values of each column of 'a' exist in the corresponding column of 'b'.
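To illustrate the distinction raised in the P.S. (a minimal sketch with made-up values):
import numpy as np

a = np.array([[1, 4]])
b = np.array([[1, 2], [3, 4]])

# Column-by-column membership says True: 1 appears in b's first column
# and 4 in b's second column -- but the row [1, 4] is not a row of b.
col_wise = np.isin(a[:, 0], b[:, 0]) & np.isin(a[:, 1], b[:, 1])
print(col_wise)   # [ True]  (misleading for a row-wise ismember)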
You can use np.unique(array, axis=0) to find the identical rows of an array. With this function you can reduce your 2D problem to a 1D problem, which can then be easily solved with np.isin():
import numpy as np

# Dummy example arrays:
a = np.array([[1, 2], [3, 4]])
b = np.array([[3, 5], [2, 3], [3, 4]])

# ismember_row function: which rows of a are in b
def ismember_row(a, b):
    # Get the unique-row index for the stacked rows of b and a
    _, rev = np.unique(np.concatenate((b, a)), axis=0, return_inverse=True)
    # Split the index back into the parts belonging to a and b
    a_rev = rev[len(b):]
    b_rev = rev[:len(b)]
    # Return the result:
    return np.isin(a_rev, b_rev)

res = ismember_row(a, b)
# res = array([False,  True])
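It can help to inspect the intermediate index for the dummy arrays above (the concrete ids follow from NumPy's lexicographic ordering of the unique rows):
_, rev = np.unique(np.concatenate((b, a)), axis=0, return_inverse=True)
print(rev)   # [3 1 2 0 2]: b's rows get unique-row ids 3, 1, 2 and a's rows ids 0, 2
# a's second row shares id 2 with b's third row, hence array([False,  True])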

How can I iteratively stack rows in numpy using a for loop

I am trying to iteratively add rows to my two-dimensional np.array:
A = np.zeros((1,14), dtype=float)
for i in arr:
    A = np.vstack(fn(i))  # function returns an array
As a result I always get only the last array I stacked.
Can someone please explain how to stack all the rows and why this is not working?
You should not vstack while iterating, as it will artificially increase memory usage, as explained in this similar question but related to pandas.
Secondly, assuming fn(i) returns a new array that you want to append to A, that line should be A = np.vstack((A, fn(i))).
Considering all this, a better option would be to create and collect all your arrays into a list that you can later stack.
A = np.zeros((1, 14), dtype=float)
arrays = [A] + [fn(i) for i in arr]  # assuming `arr` is an iterable
A = np.vstack(arrays)                # vstack accepts the list directly
You can read more in the numpy.vstack docs
You must include A in the vstack call:
A = np.zeros((1,14), dtype=float)
for i in arr:
    A = np.vstack([A, fn(i)])  # function returns an array

Efficiently convert a vector of bin counts to a vector of bin indices [duplicate]

Given an array of integer counts c, how can I transform that into an array of integers inds such that np.all(np.bincount(inds) == c) is true?
For example:
>>> c = np.array([1,3,2,2])
>>> inverse_bincount(c) # <-- what I need
array([0,1,1,1,2,2,3,3])
Context: I'm trying to keep track of the location of multiple sets of data, while performing computation on all of them at once. I concatenate all the data together for batch processing, but I need an index array to extract the results back out.
Current workaround:
from itertools import chain
import numpy as np

def inverse_bincount(c):
    return np.array(list(chain.from_iterable([i] * n for i, n in enumerate(c))))
Using numpy.repeat:
np.repeat(np.arange(c.size), c)
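For the example above, this gives:
>>> import numpy as np
>>> c = np.array([1, 3, 2, 2])
>>> np.repeat(np.arange(c.size), c)
array([0, 1, 1, 1, 2, 2, 3, 3])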
No numpy needed:
from functools import reduce  # built in on Python 2; this import is needed on Python 3

c = [1, 3, 2, 2]
reduce(lambda x, y: x + [y] * c[y], range(len(c)), [])  # [0, 1, 1, 1, 2, 2, 3, 3]
The following is about twice as fast on my machine as the currently accepted answer; although I must say I am surprised by how well np.repeat does. I would expect it to suffer a lot from temporary object creation, but it does pretty well.
import numpy as np

c = np.array([1, 3, 2, 2])
p = np.cumsum(c)
i = np.zeros(p[-1], dtype=int)   # np.int is deprecated; plain int works
np.add.at(i, p[:-1], 1)          # mark the start of each new bin
print(np.cumsum(i))              # [0 1 1 1 2 2 3 3]

indexing numpy array with logical operator

I have a 2d numpy array, for instance as:
import numpy as np
a1 = np.zeros( (500,2) )
a1[:,0]=np.arange(0,500)
a1[:,1]=np.arange(0.5,1000,2)
# could be also read from txt
then I want to select the indexes corresponding to a slice that matches a criteria such as all the value a1[:,1] included in the range (l1,l2):
l1=20.0; l2=900.0; #as example
I'd like to do it in a condensed expression. However, neither:
np.where(a1[:,1]>l1 and a1[:,1]<l2)
(it gives a ValueError suggesting the use of np.any or np.all, which is not clear to me in such a case), nor:
np.intersect1d(np.where(a1[:,1]>l1), np.where(a1[:,1]<l2))
works (it gives unhashable type: 'numpy.ndarray').
My idea is then to use these indexes to map another array of size (500,n).
Is there any reasonable way to select indexes in such way? Or: is it necessary to use some mask in such case?
This should work:
np.where((a1[:,1]>l1) & (a1[:,1]<l2))
or
np.where(np.logical_and(a1[:,1]>l1, a1[:,1]<l2))
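To then map the selected indexes onto another array of size (500, n), as asked, something like this sketch should work (newarray is a placeholder name):
idx = np.where((a1[:, 1] > l1) & (a1[:, 1] < l2))[0]  # integer row indexes
subset = newarray[idx, :]   # rows of the (500, n) array matching the criteria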
Does this do what you want?
import numpy as np
a1 = np.zeros( (500,2) )
a1[:,0]=np.arange(0,500)
a1[:,1]=np.arange(0.5,1000,2)
c = (a1[:,1]>l1) * (a1[:,1]<l2)  # boolean array: True where the item meets both criteria, False otherwise
print(a1[c])  # prints all the points in a1 that match the criteria
Afterwards you can then select the points you need from the new array you make (assuming your new array has dimensions (500,n)) by doing:
print(newarray[c,:])

How to represent matrices in python

How can I represent matrices in python?
Take a look at this answer:
from numpy import matrix
from numpy import linalg

A = matrix([[1, 2, 3], [0, 1, 4], [5, 6, 0]])  # Creates a matrix (chosen invertible; the classic [[1,2,3],[11,12,13],[21,22,23]] is singular and would break A.I below).
x = matrix([[1], [2], [3]])  # Creates a matrix (like a column vector).
y = matrix([[1, 2, 3]])      # Creates a matrix (like a row vector).
print(A.T)                   # Transpose of A.
print(A * x)                 # Matrix multiplication of A and x.
print(A.I)                   # Inverse of A.
print(linalg.solve(A, x))    # Solve the linear equation system.
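Note that NumPy's own documentation now discourages numpy.matrix in favor of plain arrays; an equivalent sketch with ndarrays:
import numpy as np

A = np.array([[1, 2, 3], [0, 1, 4], [5, 6, 0]])  # invertible 3x3 array
x = np.array([[1], [2], [3]])
print(A.T)                    # transpose
print(A @ x)                  # matrix multiplication
print(np.linalg.inv(A))       # inverse
print(np.linalg.solve(A, x))  # solve A z = x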
Python doesn't have a built-in matrix type. You can use a list of lists or NumPy.
If you are not going to use the NumPy library, you can use a nested list. This is code to implement a dynamic nested list (a 2-dimensional list).
Let r be the number of rows:
r = 3
m = []
for i in range(r):
    m.append([int(x) for x in input().split()])
At any time you can append a row using:
m.append([int(x) for x in input().split()])
Above, you have to enter the matrix row-wise. To insert a column:
for i in m:
    i.append(x)  # x is the value to be added in the column
To print the matrix:
print(m)  # all in a single row
for i in m:
    print(i)  # each row on a different line
((1,2,3,4),
(5,6,7,8),
(9,0,1,2))
Using tuples instead of lists makes it marginally harder to change the data structure in unwanted ways.
If you are going to make extensive use of those, you are best off wrapping a true number array in a class, so you can define methods and properties on them. (Or you could use NumPy, SciPy, ... if you are going to do your processing with those libraries.)
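A minimal sketch of that wrapper idea (all names here are illustrative, not from any library):
class Matrix:
    def __init__(self, rows):
        self.rows = [list(r) for r in rows]  # accept tuples or lists

    @property
    def shape(self):
        return (len(self.rows), len(self.rows[0]) if self.rows else 0)

    def transpose(self):
        return Matrix(zip(*self.rows))

m = Matrix(((1, 2, 3, 4), (5, 6, 7, 8), (9, 0, 1, 2)))
print(m.shape)              # (3, 4)
print(m.transpose().rows)   # [[1, 5, 9], [2, 6, 0], [3, 7, 1], [4, 8, 2]]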
