numpy matrix from arrays - python

I have created a matrix with 4 main pieces of information about the nodes of a graph that I want to manipulate, and I'm trying to save them as an array of arrays, together with an associative array ordered so that I can iterate over a particular piece of information.
This is the matrix with my information:
nodes = [[0 for x in range(4)] for y in range(n)]
for i in range(nodeNumber + 1):
    nodes[i] = info1[i], info2[i], info3[i], i
How do I create the same matrix with NumPy?
I've tried to create an array from my 'nodes', but it behaves like an array of tuples rather than a matrix, and NumPy does not see it as one.

In [114]: n=3
In [115]: nodes = [[0 for x in range(4)] for y in range(n)]
In [116]: nodes
Out[116]: [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
So you have created a list that contains lists.
In [117]: for i in range(3):
...: nodes[i] = 1,2,3,i
...:
In [118]: nodes
Out[118]: [(1, 2, 3, 0), (1, 2, 3, 1), (1, 2, 3, 2)]
Now you have replaced each element of the nodes list with a tuple like (1, 2, 3, i). This is a complete replacement; it does not modify the sublists of nodes. So now nodes is a list of tuples.
In [119]: np.array(nodes)
Out[119]:
array([[1, 2, 3, 0],
       [1, 2, 3, 1],
       [1, 2, 3, 2]])
Passing that through np.array creates a 2d array, regardless of whether it is a list of lists or list of tuples.
If the sublists or tuples differ in length you'll get something else - a 1d array of dtype object.
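For example, here is a minimal check of both cases (note that recent NumPy versions refuse to build the ragged case unless you pass dtype=object explicitly):
# Equal-length rows -> a regular 2d array, whether the rows are lists or tuples
np.array([(1, 2, 3, 0), (1, 2, 3, 1)]).shape   # (2, 4)
# Ragged rows -> a 1d array of dtype object
ragged = np.array([(1, 2, 3), (1, 2)], dtype=object)
ragged.shape   # (2,)
ragged.dtype   # dtype('O')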
You'll have to be more specific as to what you mean by array of arrays and associative array.

Related

Get Numpy ndarray value from list of nd points

How can I obtain value of a ndarray from a list that contains coordinates of a n-D point as efficient as possible.
Here an implementation for 3D :
1 arr = np.array([[[0, 1]]])
2 points = [[0, 0, 1], [0, 0, 0]]
3 values = []
4 for point in points:
5     x, y, z = point
6     values.append(arr[x, y, z])
7 # values -> [1, 0]
If this is not possible, is there a way to generalize lines 5-6 to nD?
I am sure there is a way to achieve this using fancy indexing. Here is a way to do it without the for-loop:
arr = np.array([[[0, 1]]])
points = np.array([[0, 0, 1], [0, 0, 0]])
x,y,z = np.split(points, 3, axis=1)
arr[x,y,z]
output (values):
array([[1],
       [0]])
Alternatively, you could use tuple unpacking as suggested by the comment:
arr[(*points.T,)]
output:
array([1, 0])
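The trailing comma is what turns the unpacked rows of points.T into a tuple with one index array per axis; an equivalent, arguably more explicit spelling would be:
arr[tuple(points.T)]
# array([1, 0])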
Based on the Numpy documentation for indexing, you can easily do that, as long as you use tuples instead of lists:
arr = np.array([[[0, 1]]])
points = [(0, 0, 1), (0, 0, 0)]
values = []
for point in points:
    values.append(arr[point])
# values -> [1, 0]
This works independent of dimensionality of the Numpy array involved.
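As a quick check of that claim, the same tuple indexing works unchanged on a 2-D array (a tiny sketch with made-up values):
arr2d = np.array([[10, 11], [12, 13]])
points2d = [(0, 1), (1, 0)]
[arr2d[p] for p in points2d]   # -> [11, 12]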
Bonus: In addition to appending to a list, you can also use the Python slice function to extract ranges directly:
arr = np.array([[[0, 1]]])
points = (0, 0, slice(2) )
vals = arr[points]
# --> [0 1] (a Numpy array!)

Create a sparse matrix from a list of tuples holding the indexes within each column that are 1

Problem:
I have a list of tuples, where each tuple represents a column of a 2D array and each element of the tuple is an index within that column that holds a 1; the entries not listed in the tuple are 0.
I want to create a sparse matrix from this list of tuples in an efficient way (trying not to use for loops).
Example:
# init values
list_tuples = [
    (0, 2, 4),
    (0, 2, 3),
    (1, 3, 4)
]
n = len(list_tuples)  # number of columns
m = 5  # arbitrary, but m >= max(max(t) for t in list_tuples) + 1
# what I need is a function which accepts these tuples and the shape of the array
# (at least the row size, because the column size can be inferred from the list of tuples)
A = some_function(list_tuples, array_shape=(m, n))
Then what I expect to have is an array of the form:
[
    [1, 1, 0],
    [0, 0, 1],
    [1, 1, 0],
    [0, 1, 1],
    [1, 0, 1]
]
Your values are the indices that are required for the compressed sparse column format. You'll also need the indptr array, which for your data is the cumulative sum of the lengths of the tuples (prepended with 0). The data array would be an array of ones with the same length as the sum of the lengths of the tuples, which you can get from the last element of the cumulative sum. Here's how that looks with your example:
In [45]: from scipy.sparse import csc_matrix
In [46]: list_tuples = [
...: (0, 2, 4),
...: (0, 2, 3),
...: (1, 3, 4)
...: ]
In [47]: indices = sum(list_tuples, ()) # Flatten the tuples into one sequence.
In [48]: indptr = np.cumsum([0] + [len(t) for t in list_tuples])
In [49]: a = csc_matrix((np.ones(indptr[-1], dtype=int), indices, indptr))
In [50]: a
Out[50]:
<5x3 sparse matrix of type '<class 'numpy.int64'>'
with 9 stored elements in Compressed Sparse Column format>
In [51]: a.A
Out[51]:
array([[1, 1, 0],
       [0, 0, 1],
       [1, 1, 0],
       [0, 1, 1],
       [1, 0, 1]])
Note that csc_matrix inferred the number of rows from the maximum that it found in the indices. You can use the shape parameter to override that, e.g.
In [52]: b = csc_matrix((np.ones(indptr[-1], dtype=int), indices, indptr), shape=(7, len(list_tuples)))
In [53]: b
Out[53]:
<7x3 sparse matrix of type '<class 'numpy.int64'>'
with 9 stored elements in Compressed Sparse Column format>
In [54]: b.A
Out[54]:
array([[1, 1, 0],
       [0, 0, 1],
       [1, 1, 0],
       [0, 1, 1],
       [1, 0, 1],
       [0, 0, 0],
       [0, 0, 0]])
You can also generate a coo_matrix pretty easily. The flattened list_tuples gives the row indices, and np.repeat can be used to create the column indices:
In [63]: from scipy.sparse import coo_matrix
In [64]: i = sum(list_tuples, ()) # row indices
In [65]: j = np.repeat(range(len(list_tuples)), [len(t) for t in list_tuples])
In [66]: c = coo_matrix((np.ones(len(i), dtype=int), (i, j)))
In [67]: c
Out[67]:
<5x3 sparse matrix of type '<class 'numpy.int64'>'
with 9 stored elements in COOrdinate format>
In [68]: c.A
Out[68]:
array([[1, 1, 0],
       [0, 0, 1],
       [1, 1, 0],
       [0, 1, 1],
       [1, 0, 1]])

Convert n 2D arrays into a single array with lookup table

I have a series of n 2D arrays that are presented to a function as a 3D array of depth n. I want to generate a tuple of each set of values along the third axis, then replace each of these tuples with a single index value and a lookup table.
I'm working in Python with some large datasets, so it needs to be scalable; I will probably use NumPy, but other solutions are accepted.
Here's what I've got so far:
In [313]: arr=np.array([[[0,0,0],[1,2,2],[3,0,0]],[[0,1,0],[1,3,2],[0,0,0]]])
In [314]: stacked = np.stack((arr[0], arr[1]), axis=2)
In [315]: pairs = stacked.reshape(-1, arr.shape[0])
In [316]: pairs
Out[316]:
array([[0, 0],
       [0, 1],
       [0, 0],
       [1, 1],
       [2, 3],
       [2, 2],
       [3, 0],
       [0, 0],
       [0, 0]])
In [317]: unique = set([tuple(a) for a in pairs])
In [318]: lookup = sorted(list(unique))
In [319]: lookup
Out[319]: [(0, 0), (0, 1), (1, 1), (2, 2), (2, 3), (3, 0)]
Now, I want to create an output array, using the indexes of the values in the lookup table, so the output would be:
[0, 1, 0, 2, 4, 3, 5, 0, 0]
This example is just with two input 2D arrays, but there could be many more.
So, I've come up with a solution that produces the outputs I want, but is it the most efficient method of doing this? In particular, the lookup.index call is a bit costly. Does anyone have a better way?
def squash_array(arr):
    tuples = arr.T.reshape(-1, arr.shape[0])
    lookup = sorted(list(set([tuple(a) for a in tuples])))
    out_arr = np.array([lookup.index(tuple(a)) for a in tuples]).reshape(arr.shape[1:][::-1]).T
    return out_arr, lookup
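One way to avoid the per-row lookup.index scan is np.unique with axis=0 and return_inverse=True (available since NumPy 1.13), which builds the sorted lookup table and the index array in a single call. A hedged sketch that keeps the same reshaping as above:
import numpy as np

def squash_array_unique(arr):
    # one n-tuple per position, n = arr.shape[0] (the depth)
    tuples = arr.T.reshape(-1, arr.shape[0])
    # sorted unique rows, plus the index of each original row in that table
    lookup, inverse = np.unique(tuples, axis=0, return_inverse=True)
    out_arr = inverse.reshape(arr.shape[1:][::-1]).T
    return out_arr, [tuple(row) for row in lookup]
On the example array this gives the same lookup table, and out_arr.ravel() is the flat index list from the question, [0, 1, 0, 2, 4, 3, 5, 0, 0].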

Using a numpy array to assign values to another array

I have the following numpy array, matrix:
matrix = np.zeros((3,5), dtype=int)
array([[0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0]])
Suppose I have this numpy array indices as well
indices = np.array([[1,3], [2,4], [0,4]])
array([[1, 3],
       [2, 4],
       [0, 4]])
Question: How can I assign 1s to the elements in the matrix whose indices are specified by the indices array? A vectorized implementation is expected.
For more clarity, the output should look like:
array([[0, 1, 0, 1, 0],   # [1,3] elements are changed
       [0, 0, 1, 0, 1],   # [2,4] elements are changed
       [1, 0, 0, 0, 1]])  # [0,4] elements are changed
Here's one approach using NumPy's fancy-indexing -
matrix[np.arange(matrix.shape[0])[:,None],indices] = 1
Explanation
We create the row indices with np.arange(matrix.shape[0]) -
In [16]: idx = np.arange(matrix.shape[0])
In [17]: idx
Out[17]: array([0, 1, 2])
In [18]: idx.shape
Out[18]: (3,)
The column indices are already given as indices -
In [19]: indices
Out[19]:
array([[1, 3],
       [2, 4],
       [0, 4]])
In [20]: indices.shape
Out[20]: (3, 2)
Let's make a schematic diagram of the shapes of row and column indices, idx and indices -
idx (row) : 3
indices (col) : 3 x 2
For using the row and column indices for indexing into input array matrix, we need to make them broadcastable against each other. One way would be to introduce a new axis into idx, making it 2D by pushing the elements into the first axis and allowing a singleton dim as the last axis with idx[:,None], as shown below -
idx (row) : 3 x 1
indices (col) : 3 x 2
Internally, idx would be broadcasted, like so -
In [22]: idx[:,None]
Out[22]:
array([[0],
       [1],
       [2]])
In [23]: indices
Out[23]:
array([[1, 3],
       [2, 4],
       [0, 4]])
In [24]: np.repeat(idx[:,None],2,axis=1) # indices has length of 2 along cols
Out[24]:
array([[0, 0],   # Internally broadcasting would be like this
       [1, 1],
       [2, 2]])
Thus, the broadcasted elements from idx are used as row indices and the elements of indices as column indices for indexing into matrix and setting its elements. Since we had idx = np.arange(matrix.shape[0]), we end up with matrix[np.arange(matrix.shape[0])[:,None], indices] = 1 for setting the elements.
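Putting the pieces together on the example input (this simply runs that indexing expression end to end):
import numpy as np

matrix = np.zeros((3, 5), dtype=int)
indices = np.array([[1, 3], [2, 4], [0, 4]])
matrix[np.arange(matrix.shape[0])[:, None], indices] = 1
matrix
# array([[0, 1, 0, 1, 0],
#        [0, 0, 1, 0, 1],
#        [1, 0, 0, 0, 1]])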
An alternative that uses a loop, and hence may not be very efficient for large arrays:
for i in range(len(indices)):
    matrix[i, indices[i]] = 1
matrix
Out[73]:
array([[0, 1, 0, 1, 0],
       [0, 0, 1, 0, 1],
       [1, 0, 0, 0, 1]])

Python reshape list to ndim array

Hi, I have a list, flat, which is of length 2800; it contains 100 results for each of 28 variables. Below is an example of 4 results for 2 variables:
[0,
0,
1,
1,
2,
2,
3,
3]
I would like to reshape the list to a (2, 4) array so that the results for each variable are in a single row:
[[0, 1, 2, 3],
 [0, 1, 2, 3]]
You can think of reshaping as filling the new shape row by row (the last dimension varies fastest) from the flattened original list/array.
If you want to fill an array by column instead, an easy solution is to shape the list into an array with reversed dimensions and then transpose it:
x = np.reshape(list_data, (100, 28)).T
Above snippet results in a 28x100 array, filled column-wise.
To illustrate, here are the two options of shaping a list into a 2x4 array:
np.reshape([0, 0, 1, 1, 2, 2, 3, 3], (4, 2)).T
# array([[0, 1, 2, 3],
#        [0, 1, 2, 3]])
np.reshape([0, 0, 1, 1, 2, 2, 3, 3], (2, 4))
# array([[0, 0, 1, 1],
#        [2, 2, 3, 3]])
You can specify the interpretation order of the axes using the order parameter:
np.reshape(arr, (2, -1), order='F')
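Applied to the example data, this fills column by column and matches the transpose approach above:
np.reshape([0, 0, 1, 1, 2, 2, 3, 3], (2, -1), order='F')
# array([[0, 1, 2, 3],
#        [0, 1, 2, 3]])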
Step by step:
# import numpy library
import numpy as np
# create list
my_list = [0,0,1,1,2,2,3,3]
# convert list to numpy array
np_array=np.asarray(my_list)
# reshape array into 4 rows x 2 columns, and transpose the result
reshaped_array = np_array.reshape(4, 2).T
#check the result
reshaped_array
array([[0, 1, 2, 3],
       [0, 1, 2, 3]])
The answers above are good. Adding a case that I used: if you don't want to use numpy and prefer to keep it as a list without changing the contents, you can run a small loop to change the dimension from 1xN to Nx1.
tmp = []
for b in bus:
    tmp.append([b])
bus = tmp
It may not be efficient for very large inputs, but it works for a small set of numbers.
Thanks
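For what it's worth, the same Nx1 wrapping can also be written as a one-line list comprehension (a minor stylistic alternative, not part of the original answer):
bus = [[b] for b in bus]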
