I have 3 arrays of equal length (e.g.):
[a, b, c]
[1, 2, 3]
[i, ii, iii]
I would like to combine them into a matrix:
|a, 1, i |
|b, 2, ii |
|c, 3, iii|
The problem is that when I use functions such as dstack, hstack or concatenate, the arrays get numerically added or stacked in a way that I can't work with.
You could use zip(), which groups the elements found at the same index in several containers so that each group can be handled as a single entity:
a1 = ['a', 'b', 'c']
b1 = ['1', '2', '3']
c1 = ['i', 'ii', 'iii']
print(list(zip(a1,b1,c1)))
OUTPUT:
[('a', '1', 'i'), ('b', '2', 'ii'), ('c', '3', 'iii')]
EDIT:
Going one step further, you could flatten the zipped list and then use numpy.reshape:
res = list(zip(a1, b1, c1))
flattened_list = []
# flatten the list
for x in res:
    for y in x:
        flattened_list.append(y)
# print(flattened_list)
import numpy as np
data = np.array(flattened_list)
shape = (3, 3)
print(data.reshape(shape))
OUTPUT:
[['a' '1' 'i']
['b' '2' 'ii']
['c' '3' 'iii']]
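OR
for the one-liner fans out there:
# flatten the list (start from an empty list so the rows are not appended twice)
flattened_list = []
for x in zip(a1, b1, c1):
    for y in x:
        flattened_list.append(y)
# print(flattened_list)
print([flattened_list[i:i+3] for i in range(0, len(flattened_list), 3)])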
OUTPUT:
[['a', '1', 'i'], ['b', '2', 'ii'], ['c', '3', 'iii']]
OR
As suggested by @norok2:
print(list(zip(*zip(a1, b1, c1))))
OUTPUT:
[('a', 'b', 'c'), ('1', '2', '3'), ('i', 'ii', 'iii')]
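If a list of lists is the goal, the flatten-and-regroup step can probably be skipped. The sketch below reuses the sample lists from above: it converts each zipped tuple to a list directly and shows itertools.chain as one way to flatten, should a flat list still be needed.

```python
from itertools import chain

a1 = ['a', 'b', 'c']
b1 = ['1', '2', '3']
c1 = ['i', 'ii', 'iii']

# One row per index position; each zipped tuple is converted to a list.
rows = [list(t) for t in zip(a1, b1, c1)]
print(rows)

# Flatten the rows without writing an explicit nested loop.
flat = list(chain.from_iterable(rows))
print(flat)
```

Passing rows to np.array would then give the 3x3 array without calling reshape.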
Assuming that you have 3 numpy arrays:
>>> import numpy as np
>>> a, b, c = np.random.randint(0, 9, 9).reshape(3, 3)
>>> print(a, b, c)
[4 1 4] [5 8 5] [3 0 2]
then you can stack them vertically (i.e. along the first dimension), and then transpose the resulting matrix to get the order you need:
>>> np.vstack((a, b, c)).T
array([[4, 5, 3],
       [1, 8, 0],
       [4, 5, 2]])
A slightly more verbose example is to instead stack horizontally, but this requires that your arrays are made into 2D using reshape:
>>> np.hstack((a.reshape(3, 1), b.reshape(3, 1), c.reshape(3, 1)))
array([[4, 5, 3],
       [1, 8, 0],
       [4, 5, 2]])
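Another option that may be worth trying is np.column_stack, which treats each 1-D input as a column, so neither the transpose nor the reshape is needed. This is a minimal sketch; the fixed arrays are stand-ins for the random ones above.

```python
import numpy as np

# Fixed sample data standing in for the random arrays above.
a = np.array([4, 1, 4])
b = np.array([5, 8, 5])
c = np.array([3, 0, 2])

# Each 1-D array becomes one column of the result.
stacked = np.column_stack((a, b, c))
print(stacked)
```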
This gives you a list of tuples, which might not be what you want:
>>> list(zip([1,2,3],[4,5,6],[7,8,9]))
[(1, 4, 7), (2, 5, 8), (3, 6, 9)]
This gives you a NumPy array:
>>> from numpy import array
>>> array([[1,2,3],[4,5,6],[7,8,9]]).transpose()
array([[1, 4, 7],
       [2, 5, 8],
       [3, 6, 9]])
If you have different data types in each array, then it would make sense to use pandas for this:
# Iterative approach, using concat
import pandas as pd
my_arrays = [['a', 'b', 'c'], [1, 2, 3], ['i', 'ii', 'iii']]
df1 = pd.concat([pd.Series(array) for array in my_arrays], axis=1)
# Named arrays
array1 = ['a', 'b', 'c']
array2 = [1, 2, 3]
array3 = ['i', 'ii', 'iii']
df2 = pd.DataFrame({'col1': array1,
                    'col2': array2,
                    'col3': array3})
Now you have the structure you desired, with appropriate data types for each column:
print(df1)
#    0  1    2
# 0  a  1    i
# 1  b  2   ii
# 2  c  3  iii
print(df2)
#   col1  col2 col3
# 0    a     1    i
# 1    b     2   ii
# 2    c     3  iii
print(df1.dtypes)
# 0    object
# 1     int64
# 2    object
# dtype: object
print(df2.dtypes)
# col1    object
# col2     int64
# col3    object
# dtype: object
You can extract the numpy array with the .values attribute:
df1.values
# array([['a', 1, 'i'],
#        ['b', 2, 'ii'],
#        ['c', 3, 'iii']], dtype=object)
This works:
import pandas as pd
data = [["aa", 1, 2], ["bb", 3, 4]]
df = pd.DataFrame(data, columns=['id', 'a', 'b'])
df = df.set_index('id')
print(df)
"""
    a  b
id
aa  1  2
bb  3  4
"""
but is it possible in just one call of pd.DataFrame(...) directly with a parameter, without using set_index after?
Convert the values to a 2D array:
import numpy as np
data = [["aa", 1, 2], ["bb", 3, 4]]
arr = np.array(data)
df = pd.DataFrame(arr[:, 1:], columns=['a', 'b'], index=arr[:, 0])
print(df)
    a  b
aa  1  2
bb  3  4
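Details (note that np.array converts the mixed values to strings, so columns a and b hold strings rather than integers):
print(arr)
[['aa' '1' '2']
 ['bb' '3' '4']]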
Another solution:
data = [["aa", 1, 2], ["bb", 3, 4], ["cc", 30, 40]]
cols = ['a','b']
L = list(zip(*data))
print(L)
[('aa', 'bb', 'cc'), (1, 3, 30), (2, 4, 40)]
df = pd.DataFrame(dict(zip(cols, L[1:])), index=L[0])
print(df)
     a   b
aa   1   2
bb   3   4
cc  30  40
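A single-call alternative that may fit the question is the DataFrame.from_records classmethod, whose index parameter accepts a column name. This is a sketch using the question's own sample data. Strictly speaking it is not the pd.DataFrame(...) constructor itself, so treat it as a close substitute rather than an exact answer.

```python
import pandas as pd

data = [["aa", 1, 2], ["bb", 3, 4]]

# 'id' is named as a column and promoted to the index in the same call.
df = pd.DataFrame.from_records(data, columns=['id', 'a', 'b'], index='id')
print(df)
```

Unlike the NumPy route, this leaves columns a and b as integers.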
I have two NumPy arrays. I want to remove every value that occurs more than once in the first array (including its first occurrence) and remove the items at the matching positions in the second array.
For example:
a = [1, 2, 2, 3]
b = ['a', 'd', 'f', 'c']
Becomes:
a = [1, 3]
b = ['a', 'c']
I need to do this efficiently, without resorting to the naive solution, which is time-consuming.
Here's one with np.unique -
unq,idx,c = np.unique(a, return_index=True, return_counts=True)
unq_idx = np.sort(idx[c==1])
a_out = a[unq_idx]
b_out = b[unq_idx]
Sample run -
In [34]: a
Out[34]: array([1, 2, 2, 3])
In [35]: b
Out[35]: array(['a', 'd', 'f', 'c'], dtype='|S1')
In [36]: unq,idx,c = np.unique(a, return_index=1, return_counts=1)
...: unq_idx = idx[c==1]
...: a_out = a[unq_idx]
...: b_out = b[unq_idx]
In [37]: a_out
Out[37]: array([1, 3])
In [38]: b_out
Out[38]: array(['a', 'c'], dtype='|S1')
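A variation on the same idea builds a boolean mask with np.isin instead of collecting indices, which keeps the original order without an explicit sort. This is a sketch on the question's sample data; the name keep is just illustrative.

```python
import numpy as np

a = np.array([1, 2, 2, 3])
b = np.array(['a', 'd', 'f', 'c'])

# Values that occur exactly once in a.
values, counts = np.unique(a, return_counts=True)

# True at every position whose value is one of those singletons.
keep = np.isin(a, values[counts == 1])

a_out = a[keep]
b_out = b[keep]
print(a_out, b_out)
```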
Since you are open to NumPy, you may wish to consider Pandas, which uses NumPy internally:
import pandas as pd
a = pd.Series([1, 2, 2, 3])
b = pd.Series(['a', 'd', 'f', 'c'])
flags = ~a.duplicated(keep=False)
idx = flags[flags].index
a = a[idx].values
b = b[idx].values
Result:
print(a, b, sep='\n')
[1 3]
['a' 'c']
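If the data is in plain Python lists rather than arrays, the same filtering can be sketched with collections.Counter, assuming the values in a are hashable. It makes one pass to count and one pass to filter, so it stays linear in the input length.

```python
from collections import Counter

a = [1, 2, 2, 3]
b = ['a', 'd', 'f', 'c']

# Count how often each value occurs in a.
counts = Counter(a)

# Keep only the pairs whose a-value occurs exactly once.
kept = [(x, y) for x, y in zip(a, b) if counts[x] == 1]
a_out = [x for x, _ in kept]
b_out = [y for _, y in kept]
print(a_out, b_out)
```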
I want to sort a list of lists, i.e.:
[['Y', 'K', 'E'],
[3, 1, 2],
[6, 4, 5]]
The first row of that list should be reordered to match another supplied list (which will always contain the same letters as the first row):
['K', 'E', 'Y']
So that the final output is:
[['K', 'E', 'Y'],
[1, 2, 3],
[4, 5, 6]]
The easy way is to transpose it with zip(), sort it with a key of the first element's index in the key list, then transpose it back:
>>> data = [['Y', 'K', 'E'],
... [3, 1, 2],
... [6, 4, 5]]
>>> key = 'KEY'
>>> sorted(zip(*data), key=lambda x: key.index(x[0]))
[('K', 1, 4), ('E', 2, 5), ('Y', 3, 6)]
>>> list(zip(*sorted(zip(*data), key=lambda x: key.index(x[0]))))
[('K', 'E', 'Y'), (1, 2, 3), (4, 5, 6)]
A pretty straightforward approach using list comprehensions:
>>> L = [['Y', 'K', 'E'], [3, 1, 2], [6, 4, 5]]
>>> indexes = [L[0].index(x) for x in "KEY"]
>>> [[row[i] for i in indexes] for row in L]
[['K', 'E', 'Y'], [1, 2, 3], [4, 5, 6]]
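Both approaches above call index() once per column, which rescans the key each time. For longer keys, one possible refinement is to precompute the positions in a dict. This sketch assumes the key letters are unique; position is an illustrative name.

```python
data = [['Y', 'K', 'E'],
        [3, 1, 2],
        [6, 4, 5]]
key = ['K', 'E', 'Y']

# Map each letter to its target position once, instead of calling index() repeatedly.
position = {letter: i for i, letter in enumerate(key)}

# Transpose to columns, sort the columns by target position, then transpose back.
columns = sorted(zip(*data), key=lambda col: position[col[0]])
result = [list(row) for row in zip(*columns)]
print(result)
```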
Quick Summary:
need_to_reorder = [['a', 'b', 'c', 'd'], [1, 2, 3, 4]]
I want to reorder the need_to_reorder[0][x] values (and the matching values in the other rows) according to my sorting array
sorting_array = [1, 3, 0, 2]
Required result: need_to_reorder will equal
[['b', 'd', 'a', 'c'], [2, 4, 1, 3]]
Searching for the answer, I tried using NumPy:
import numpy as np
sorting_array = [1, 3, 0, 2]
i = np.array(sorting_array)
print i ## Results: [1 3 0 2] <-- No Commas?
need_to_reorder[:,i]
RESULTS:
TypeError: list indices must be integers, not tuple
I'm looking for a correction to the code above or an entirely different approach.
You can try a simple nested comprehension
>>> l = [['a', 'b', 'c', 'd'], [1, 2, 3, 4]]
>>> s = [1, 3, 0, 2]
>>> [[j[i] for i in s] for j in l]
[['b', 'd', 'a', 'c'], [2, 4, 1, 3]]
If you need this as a function, a very simple one would be:
def reorder(need_to_reorder, sorting_array):
    return [[j[i] for i in sorting_array] for j in need_to_reorder]
Note that this can also be solved with the map function. In this case, however, a list comprehension is preferred because the map variant would require a lambda function. The difference between map and a list comprehension is discussed at length in this answer.
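For comparison, one way to use map without writing a lambda is operator.itemgetter, which returns a callable that picks several indices at once. This is a sketch on the question's data. Note that with a single index itemgetter returns a bare item rather than a tuple, so it assumes the sorting array has at least two entries.

```python
from operator import itemgetter

need_to_reorder = [['a', 'b', 'c', 'd'], [1, 2, 3, 4]]
sorting_array = [1, 3, 0, 2]

# itemgetter(1, 3, 0, 2)(row) returns (row[1], row[3], row[0], row[2]).
pick = itemgetter(*sorting_array)

reordered = [list(row) for row in map(pick, need_to_reorder)]
print(reordered)
```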
def order_with_sort_array(arr, sort_arr):
    assert len(arr) == len(sort_arr)
    return [arr[i] for i in sort_arr]

sorting_array = [1, 3, 0, 2]
need_to_reorder = [['a', 'b', 'c', 'd'], [1, 2, 3, 4]]
# list() is needed in Python 3, where map returns a lazy iterator
after_reordered = list(map(lambda arr: order_with_sort_array(arr, sorting_array),
                           need_to_reorder))
This should work:
import numpy as np
ntr = np.array([['a', 'b', 'c', 'd'], [1, 2, 3, 4]])
sa = np.array([1, 3, 0, 2])
# index the columns of every row with the sorting array
# (the integers become strings because the array mixes str and int)
print(ntr[:, sa])
>> [['b' 'd' 'a' 'c']
 ['2' '4' '1' '3']]
Using nested lists like this:
N = [['D','C','A','B'],
     [2,3,4,5],
     [6,7,8,9]]
How could I swap two columns, for instance column C and column A?
With a for loop and a little help from this post:
Code:
N = [["D","C","A","B"],
     [2,3,4,5],
     [6,7,8,9]]

# Swap the last two columns in place
for item in N:
    item[2], item[3] = item[3], item[2]

# Or as a function (use it instead of the loop above, not in addition to it)
def swap_columns(your_list, pos1, pos2):
    for item in your_list:
        item[pos1], item[pos2] = item[pos2], item[pos1]
Output:
swap_columns(N, 2, 3)
[['D', 'C', 'B', 'A'], [2, 3, 5, 4], [6, 7, 9, 8]]
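Since the question refers to columns by their header letters, a small helper could look the positions up in the first row. The function name swap_named_columns is hypothetical, and the sketch assumes the header values are unique. It returns a new list and leaves the input untouched.

```python
def swap_named_columns(table, name1, name2):
    """Return a copy of table with the columns headed name1 and name2 swapped."""
    header = table[0]
    i, j = header.index(name1), header.index(name2)
    swapped = []
    for row in table:
        new_row = list(row)  # copy so the original row is not modified
        new_row[i], new_row[j] = new_row[j], new_row[i]
        swapped.append(new_row)
    return swapped


N = [['D', 'C', 'A', 'B'],
     [2, 3, 4, 5],
     [6, 7, 8, 9]]
print(swap_named_columns(N, 'C', 'A'))
```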
Another possibility, using zip:
In [66]: N = [['D', 'C', 'A', 'B'], [2, 3, 4, 5], [6, 7, 8, 9]]
Transpose using zip:
In [67]: M = list(zip(*N))
Swap rows 1 and 2:
In [68]: M[1], M[2] = M[2], M[1]
Transpose again:
In [69]: N2 = list(zip(*M))
In [70]: N2
Out[70]: [('D', 'A', 'C', 'B'), (2, 4, 3, 5), (6, 8, 7, 9)]
The result is a list of tuples. If you need a list of lists:
In [71]: [list(t) for t in zip(*M)]
Out[71]: [['D', 'A', 'C', 'B'], [2, 4, 3, 5], [6, 8, 7, 9]]
This doesn't make the swap in-place. For that, see @DaveTucker's answer.
>>> N = [['D','C','A','B'],
... [2,3,4,5],
... [6,7,8,9]]
>>>
>>> lineorder = 0,2,1,3
>>>
>>> [[r[x] for x in lineorder] for r in N]
[['D', 'A', 'C', 'B'], [2, 4, 3, 5], [6, 8, 7, 9]]
If you don't want the order hardcoded, you can generate it easily like this
>>> lineorder = [N[0].index(x) for x in ['D','A','C','B']]
To create S, a copy of N with two columns swapped, in one line, you could do the following:
>>> N = [['D','C','A','B'],[2,3,4,5],[6,7,8,9]]
>>> S = [[n[0],n[2],n[1],n[3]] for n in N]
>>> S
[['D', 'A', 'C', 'B'], [2, 4, 3, 5], [6, 8, 7, 9]]
This assumes that all nested lists in N have the same size (four elements each here).
l = [[1, 2]]  # a list of two-element rows
emptl = []
for item in l:
    # append a copy of the row with its two elements swapped
    emptl.append([item[1], item[0]])