Python Numpy array creation from multiple lists - python

I am learning more about NumPy and need help creating a NumPy array from multiple lists. Say I have 3 lists,
a = [1, 1, 1]
b = [2, 2, 2]
c = [3, 3, 3]
How can I create a new numpy array with each list as a column? Meaning that the new array would be [[1, 2, 3], [1, 2, 3], [1, 2, 3]]. I know how to do this by looping through the lists but I am not sure if there is an easier way to accomplish this. The numpy concatenate function seems to be close but I couldn't figure out how to get it to do what I'm after.
Thanks

Try with np.column_stack:
d = np.column_stack([a, b, c])

No need to use numpy. Python zip does a nice job:
In [606]: a = [1, 1, 1]
...: b = [2, 2, 2]
...: c = [3, 3, 3]
In [607]: abc = list(zip(a,b,c))
In [608]: abc
Out[608]: [(1, 2, 3), (1, 2, 3), (1, 2, 3)]
But if your heart is set on using numpy, a good way is to make a 2d array, and transpose it:
In [609]: np.array((a,b,c))
Out[609]:
array([[1, 1, 1],
[2, 2, 2],
[3, 3, 3]])
In [610]: np.array((a,b,c)).T
Out[610]:
array([[1, 2, 3],
[1, 2, 3],
[1, 2, 3]])
Others show how to do this with stack and column_stack, but underlying these is a concatenate. In one way or other they turn the lists into 2d arrays that can be joined on axis=1, e.g.
In [616]: np.concatenate([np.array(x)[:,None] for x in [a,b,c]], axis=1)
Out[616]:
array([[1, 2, 3],
[1, 2, 3],
[1, 2, 3]])
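As a quick check, the approaches mentioned in this thread can be run side by side. This is only a minimal sketch; the variable names (via_column_stack, via_transpose, via_stack) are illustrative and not from the answers above.

```python
import numpy as np

a = [1, 1, 1]
b = [2, 2, 2]
c = [3, 3, 3]

# Each list becomes one column of the result.
via_column_stack = np.column_stack([a, b, c])

# Build a 2D array with one list per row, then transpose it.
via_transpose = np.array([a, b, c]).T

# Stack the lists along a new second axis.
via_stack = np.stack([a, b, c], axis=1)

print(via_column_stack)
print(np.array_equal(via_column_stack, via_transpose))
print(np.array_equal(via_column_stack, via_stack))
```

All three should give the same 3x3 array, so the choice is mostly a matter of readability.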

Related

How do I convert the rows of a matrix to columns in Python without NumPy?

I am working on a Sudoku solver in Python, and I need to create functions for it. One of them checks the Sudoku matrix and returns the number of rows in it that contain a repeating number.
def findRepeatsInColumn(matrix):
    numRepeats = 0
    for row in matrix:
        safeChars=['[', ']', '/']
        usedChars=[]
        for char in str(row):
            if char in usedChars and char not in safeChars:
                numRepeats += 1
                break
            else:
                usedChars.append(char)
    return numRepeats
If I pass a matrix [[1, 1, 1], [2, 2, 2], [3, 3, 3]] to it, it functions fine and gives me the output 3, but for checking all columns for repeated numbers, I need to convert the rows into columns, which means I would need something like:
Input: [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
Output: [[1, 2, 3], [1, 2, 3], [1, 2, 3]]
Any thoughts on how I could do this without NumPy?
One simple way is to make use of zip and *:
>>> ip = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
>>> print(list(zip(*ip)))
[(1, 2, 3), (1, 2, 3), (1, 2, 3)]
Suppose you have a matrix named m
transposed = zip(*m)
If your input is a numpy.array, you can use the transpose:
i = np.array([[1,1,1], [2,2,2], [3,3,3]])
print(i.T)
output:
[[1 2 3]
[1 2 3]
[1 2 3]]
or you could use:
list(map(list, zip(*i)))
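For the Sudoku use case in the question, the zip(*matrix) transpose can feed the same check for rows and columns. The following is a rough sketch, not the asker's function: count_lines_with_repeats is a hypothetical helper, and it assumes every cell value counts toward repeats (no special placeholder for empty cells).

```python
def count_lines_with_repeats(lines):
    """Count how many lines (rows or columns) contain a repeated value."""
    # A line has a repeat when its set of values is smaller than the line.
    return sum(1 for line in lines if len(set(line)) < len(line))


matrix = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]

# Rows are checked directly; columns are obtained with zip(*matrix).
row_repeats = count_lines_with_repeats(matrix)
column_repeats = count_lines_with_repeats(zip(*matrix))

print(row_repeats, column_repeats)
```

Comparing set sizes avoids converting each row to a string, so separators such as commas and spaces cannot be mistaken for repeated characters.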

Can someone explain what this Numpy array property is called?

The code that I have in place goes something as follows:
import numpy as np
z = np.array([
[1, 2],
[3]
])
x = np.array([
[4, 5]
])
print(np.multiply(x,z))
The output of this program creates a list of lists. This is different than the regular broadcasting rules that apply on arrays with equal dimensions. Is there a name for this property? Also why does it explicitly mention the word list in the output?
[[list([1, 2, 1, 2, 1, 2, 1, 2]) list([3, 3, 3, 3, 3])]]
[Finished in 0.244s]
This is just normal element-by-element multiplication. Because your z array is ragged (its rows have different lengths, so it cannot form a rectangular 2-D array), NumPy interprets it as a 1-D array of two objects:
>>> z
array([[1, 2], [3]], dtype=object)
>>> z.shape
(2,)
From here you multiply normally: the first object is multiplied by 4, the second by 5:
>>> [1, 2]*4
[1, 2, 1, 2, 1, 2, 1, 2]
>>> [3]*5
[3, 3, 3, 3, 3]
This is just normal Python list multiplication, and it is the result you get. Indeed, your result is not a "list of lists". It's an array of shape (1, 2) of dtype=object, so a row of two objects (which happen to be lists):
>>> np.multiply(x,z)
array([[[1, 2, 1, 2, 1, 2, 1, 2], [3, 3, 3, 3, 3]]], dtype=object)
>>> np.multiply(x,z).shape
(1, 2)
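To see the difference from ordinary broadcasting, the ragged case can be put next to a rectangular one. This is a small sketch; z_rect is an illustrative array that is not in the question, and dtype=object is passed explicitly because newer NumPy releases refuse to build a ragged array without it.

```python
import numpy as np

# Ragged input: NumPy can only store this as a 1-D array of two list objects.
z = np.array([[1, 2], [3]], dtype=object)
x = np.array([[4, 5]])

# Each list is multiplied by an integer using Python list repetition.
ragged_result = np.multiply(x, z)
print(z.shape, ragged_result.shape, ragged_result.dtype)

# Rectangular input (illustrative): regular numeric broadcasting applies.
z_rect = np.array([[1, 2], [3, 4]])
rect_result = np.multiply(x, z_rect)
print(rect_result)
```

The word list appears in the printed output of the ragged case because the array elements are Python list objects rather than numbers.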

How to create a blocks matrix on Python?

I wanna create something like this:
import numpy as np
M=np.matrix([[1,2],[3,4]])
A=np.matrix([[M,M],[M,M]])
print(A)
But it doesn't work
It's a bit tricky, you have to construct each column separately, then combine the columns:
A = np.concatenate([np.concatenate([M, M]),
np.concatenate([M, M])], axis=1)
#matrix([[1, 2, 1, 2],
# [3, 4, 3, 4],
# [1, 2, 1, 2],
# [3, 4, 3, 4]])
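An alternative worth considering is np.block, which assembles an array from a nested list of blocks, or np.tile when every block is the same. This is a sketch that uses a plain ndarray instead of np.matrix, which is an assumption on my part and not what the question used.

```python
import numpy as np

# Plain ndarray used here instead of np.matrix (assumption for this sketch).
M = np.array([[1, 2], [3, 4]])

# Nested list of blocks: the outer list holds block rows.
A_block = np.block([[M, M], [M, M]])

# When all blocks are identical, tiling M in a 2x2 grid gives the same array.
A_tile = np.tile(M, (2, 2))

print(A_block)
```

np.block is the more general of the two, since the blocks do not have to be identical as long as their shapes line up.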

How to find missing combinations/sequences in a 2D array with finite element values

In the case of the set np.array([1, 2, 3]), there are only 9 possible combinations/sequences of its constituent elements: [1, 1], [1, 2], [1, 3], [2, 1], [2, 2], [2, 3], [3, 1], [3, 2], [3, 3].
If we have the following array:
np.array([[1, 1],
          [1, 2],
          [1, 3],
          [2, 2],
          [2, 3],
          [3, 1],
          [3, 2]])
What is the best way, with NumPy/SciPy, to determine that [2, 1] and [3, 3] are missing? Put another way, how do we find the inverse list of sequences (when we know all of the possible element values)? Manually doing this with a couple of for loops is easy to figure out, but that would negate whatever speed gains we get from using NumPy over native Python (especially with larger datasets).
You can generate a list of all possible pairs using itertools.product and collect those that are not in your array:
from itertools import product
pairs = [ [1, 1], [1, 2], [1, 3], [2, 2], [2, 3], [3, 1], [3, 2] ]
allPairs = list(map(list, product([1, 2, 3], repeat=2)))
missingPairs = [ pair for pair in allPairs if pair not in pairs ]
print(missingPairs)
Result:
[[2, 1], [3, 3]]
Note that map(list, ...) is needed to convert the tuples returned by product into lists, so that they can be compared with the entries of your list of lists. This can be simplified if your input array was already a list of tuples.
This is one way using itertools.product and set.
The trick here is to note that sets may only contain hashable types such as tuples, so each row is converted to a tuple first.
import numpy as np
from itertools import product
x = np.array([1, 2, 3])
y = np.array([[1, 1], [1, 2], [1, 3], [2, 2],
[2, 3], [3, 1], [3, 2]])
set(product(x, repeat=2)) - set(map(tuple, y))
{(2, 1), (3, 3)}
If you want to stay in NumPy instead of going back to raw Python sets, you can do it using void views (based on @Jaime's answer) and NumPy's built-in set methods like in1d:
def vview(a):
    return np.ascontiguousarray(a).view(np.dtype((np.void, a.dtype.itemsize * a.shape[1])))
x = np.array([1, 2, 3])
y = np.array([[1, 1], [1, 2], [1, 3], [2, 2],
[2, 3], [3, 1], [3, 2]])
xx = np.array([i.ravel() for i in np.meshgrid(x, x)]).T
xx[~np.in1d(vview(xx), vview(y))]
array([[2, 1],
[3, 3]])
If you want to use NumPy methods, try this.
import itertools
import numpy as np
a = np.array([1, 2, 3])
b = np.array([[1, 1],
              [1, 2],
              [1, 3],
              [2, 2],
              [2, 3],
              [3, 1],
              [3, 2]])
c = np.array(list(itertools.product(a, repeat=2)))
Compare the array being tested against the product using broadcasting
d = b == c[:,None,:]
#d.shape is (9,7,2)
Check if both elements of a pair matched
e = np.all(d, -1)
#e.shape is (9,7)
Check if any of the test items match an item of the product.
f = np.any(e, 1)
#f.shape is (9,)
Use f as a boolean index into the product to see what is missing.
>>> print(c[np.logical_not(f)])
[[2 1]
[3 3]]
>>>
Every pair corresponds to an integer in the range 0..L^2-1, where L=len(array). For example, [2, 2] => 3*(2-1)+(2-1) = 4. The -1 offsets arise because the elements start from 1, not from zero. This mapping can be viewed as a natural perfect hash for this kind of data.
If operations on integer sets in NumPy are faster than operations on pairs (for example, an integer set of known size can be represented as a bit sequence), then it is worth traversing the pair list, marking the corresponding bits in the integer set, then looking for the unset ones and retrieving the corresponding pairs.
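The integer-encoding idea can be sketched roughly as follows. This assumes the element values are exactly 1..L, as in the question; the names codes, seen, and missing are illustrative, and a boolean array stands in for the bit sequence.

```python
import numpy as np

values = np.array([1, 2, 3])
pairs = np.array([[1, 1], [1, 2], [1, 3], [2, 2],
                  [2, 3], [3, 1], [3, 2]])
L = len(values)

# Map each pair [p, q] to the integer L*(p-1) + (q-1), in 0..L*L-1.
codes = (pairs[:, 0] - 1) * L + (pairs[:, 1] - 1)

# Mark the codes that occur; a boolean mask plays the role of the bit set.
seen = np.zeros(L * L, dtype=bool)
seen[codes] = True

# Decode the unmarked codes back into pairs.
missing_codes = np.flatnonzero(~seen)
missing = np.column_stack((missing_codes // L, missing_codes % L)) + 1

print(missing)
```

If the values are not consecutive integers starting at 1, they would first have to be mapped to 0..L-1 (for example with np.searchsorted on a sorted values array) before encoding.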

how to search for unique elements by the first column of a multidimensional array

I am trying to find a way to create a new array from a multidimensional array by keeping only the rows whose first-column value has not appeared before. For example, if I have an array
[[1,2,3],
[1,2,3],
[5,2,3]]
After the operation I would like to get this output
[[1,2,3],
[5,2,3]]
Obviously the second and third columns do not need to be unique.
Thanks
Since you want to keep the first row for each unique value in the first column, you can use np.unique with its optional return_index argument, which returns the index of the first occurrence of each unique element of A[:,0], where A is the input array. That gives a vectorized solution, like so -
_,idx = np.unique(A[:,0],return_index=True)
out = A[idx]
Sample run -
In [16]: A
Out[16]:
array([[1, 2, 3],
[5, 2, 3],
[1, 4, 3]])
In [17]: _,idx = np.unique(A[:,0],return_index=True)
...: out = A[idx]
...:
In [18]: out
Out[18]:
array([[1, 2, 3],
[5, 2, 3]])
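One detail to keep in mind: np.unique returns its unique values in sorted order, so A[idx] is ordered by the first column rather than by original row position. If the original row order matters, sorting idx first is one option. This is a small sketch; the sample array and the names out_sorted and out_original_order are illustrative.

```python
import numpy as np

# Illustrative input where the first column is not already sorted.
A = np.array([[5, 2, 3],
              [1, 2, 3],
              [1, 4, 3]])

_, idx = np.unique(A[:, 0], return_index=True)

# Rows ordered by the sorted unique values of the first column.
out_sorted = A[idx]

# Rows kept in the order in which they first appear in A.
out_original_order = A[np.sort(idx)]

print(out_sorted)
print(out_original_order)
```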
main = [[1, 2, 3], [1, 3, 4], [2, 4, 5], [3, 6, 5]]
used = []
new = [[sub, used.append(sub[0])][0] for sub in main if sub[0] not in used]
print(new)
# Output: [[1, 2, 3], [2, 4, 5], [3, 6, 5]]
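The same pure-Python idea can also be written as an explicit loop with a set, which avoids relying on a side effect inside the list comprehension and makes membership tests cheaper on long inputs. This is just a sketch; seen and unique_rows are illustrative names.

```python
main = [[1, 2, 3], [1, 3, 4], [2, 4, 5], [3, 6, 5]]

seen = set()        # first-column values encountered so far
unique_rows = []    # first row found for each first-column value

for row in main:
    if row[0] not in seen:
        seen.add(row[0])
        unique_rows.append(row)

print(unique_rows)
```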
