Doing np.roll(a, 1, axis = 1) on:
a = np.array([
[6, 3, 9, 2, 3],
[1, 7, 8, 1, 2],
[5, 4, 2, 2, 4],
[3, 9, 7, 6, 5],
])
results in the correct:
array([
[3, 6, 3, 9, 2],
[2, 1, 7, 8, 1],
[4, 5, 4, 2, 2],
[5, 3, 9, 7, 6]
])
The documentation says:
If a tuple, then axis must be a tuple of the same size, and each of the given axes is shifted by the corresponding number.
Now I would like to roll the rows of a by different amounts, e.g. [1,2,1,3], meaning the first row is rolled by 1, the second by 2, the third by 1 and the fourth by 3. But np.roll(a, [1,2,1,3], axis=(1,1,1,1)) doesn't seem to do that. What is the correct interpretation of the sentence in the docs?
Passing tuples to np.roll lets you roll an array along several axes at once. For example, np.roll(a, (3,2), axis=(0,1)) shifts each element of a by 3 places along axis 0 and also by 2 places along axis 1. When the same axis is listed more than once, as in axis=(1,1,1,1), the shifts for that axis add up, so the whole array is rolled by 1+2+1+3 = 7 places along axis 1 (the same as 2 places for 5 columns). np.roll has no option to roll each row by a different amount. You can do it with index arithmetic instead, for example as follows:
import numpy as np
a = np.array([
[6, 3, 9, 2, 3],
[1, 7, 8, 1, 2],
[5, 4, 2, 2, 4],
[3, 9, 7, 6, 5],
])
shifts = np.c_[[1,2,1,3]]
a[np.c_[:a.shape[0]], (np.r_[:a.shape[1]] - shifts) % a.shape[1]]
It gives:
array([[3, 6, 3, 9, 2],
[1, 2, 1, 7, 8],
[4, 5, 4, 2, 2],
[7, 6, 5, 3, 9]])
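The same indexing trick can be wrapped in a small helper. This is only a sketch: the name roll_rows is made up for illustration, and it assumes a 2-D array with one non-negative or negative integer shift per row. It uses np.take_along_axis instead of the fancy-indexing expression above, which should give the same result.

```python
import numpy as np

def roll_rows(a, shifts):
    """Roll each row of a 2-D array to the right by its own shift (hypothetical helper)."""
    a = np.asarray(a)
    shifts = np.asarray(shifts).reshape(-1, 1)   # column vector, one shift per row
    n_cols = a.shape[1]
    # Output column j of a row reads from source column (j - shift) mod n_cols.
    src_cols = (np.arange(n_cols) - shifts) % n_cols
    return np.take_along_axis(a, src_cols, axis=1)

a = np.array([
    [6, 3, 9, 2, 3],
    [1, 7, 8, 1, 2],
    [5, 4, 2, 2, 4],
    [3, 9, 7, 6, 5],
])
rolled = roll_rows(a, [1, 2, 1, 3])
print(rolled)
```

Each row of rolled should match np.roll applied to that row alone with its own shift.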
I have a matrix:
m = [
[5, 1, 7, 5],
[2, 4, 9, 5],
[3, 4, 5, 5],
[3, 4, 6, 7]]
When I print the matrix, the output is:
[[5, 1, 7, 5], [2, 4, 9, 5], [3, 4, 5, 5], [3, 4, 6, 7]]
How can I print this matrix so that the output has the same layout as the initial input, like this:
[
[5, 1, 7, 5],
[2, 4, 9, 5],
[3, 4, 5, 5],
[3, 4, 6, 7]
]
Most answers I see erase the square brackets when printing. Is there a way to do this and still have the square brackets there like I did when I first defined the 2D array?
I think it will depend on your console/IDE. You could try pprint:
>>> from pprint import pprint
>>> m
[[5, 1, 7, 5], [2, 4, 9, 5], [3, 4, 5, 5], [3, 4, 6, 7]]
>>> pprint(m, width=40)
[[5, 1, 7, 5],
[2, 4, 9, 5],
[3, 4, 5, 5],
[3, 4, 6, 7]]
An attempt at a more general way of determining the width (not sure how this would fare for other nested lists, but it works here):
pprint(m, width=len(str(m))-1)
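pprint keeps the outer brackets on the first and last rows. If the exact layout from the question is wanted (outer brackets on their own lines), one option is to build the string by hand. A minimal sketch, where format_matrix is a made-up helper name and the input is assumed to be a list of flat lists:

```python
def format_matrix(rows):
    """Return one row per line, with the outer brackets on their own lines."""
    body = ",\n".join(str(row) for row in rows)
    return "[\n" + body + "\n]"

m = [
    [5, 1, 7, 5],
    [2, 4, 9, 5],
    [3, 4, 5, 5],
    [3, 4, 6, 7]]
text = format_matrix(m)
print(text)
```

The printed text is still a valid Python literal, so it can be pasted back into source code.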
I want a numpy array like this:
b = np.array([[1, 1, 1, 1, 1, 1],
[2, 2, 2, 2, 2, 2],
[3, 3, 3, 3, 3, 3],
[4, 4, 4, 4, 4, 4],
[5, 5, 5, 5, 5, 5],
[6, 6, 6, 6, 6, 6],
[7, 7, 7, 7, 7, 7],
[8, 8, 8, 8, 8, 8],
[9, 9, 9, 9, 9, 9]])
Is there a faster way to create a NumPy array like this instead of typing them manually?
You can do something like this:
>>> np.repeat(np.arange(1, 10).reshape(-1,1), 6, axis=1)
array([[1, 1, 1, 1, 1, 1],
[2, 2, 2, 2, 2, 2],
[3, 3, 3, 3, 3, 3],
[4, 4, 4, 4, 4, 4],
[5, 5, 5, 5, 5, 5],
[6, 6, 6, 6, 6, 6],
[7, 7, 7, 7, 7, 7],
[8, 8, 8, 8, 8, 8],
[9, 9, 9, 9, 9, 9]])
Explanation:
np.arange(1, 10).reshape(-1,1) creates an array
array([[1],
[2],
[3],
[4],
[5],
[6],
[7],
[8],
[9]])
np.repeat(_, 6, axis=1) repeats each element 6 times along axis 1 (the second axis, since axes are counted from zero).
Yes. There are plenty of methods. This is one:
np.repeat(np.arange(1,10),6,axis=0).reshape(9,6)
Another method is to use broadcasting:
>>> np.arange(1,10)[:,None] * np.ones(6, dtype=int)
array([[1, 1, 1, 1, 1, 1],
[2, 2, 2, 2, 2, 2],
[3, 3, 3, 3, 3, 3],
[4, 4, 4, 4, 4, 4],
[5, 5, 5, 5, 5, 5],
[6, 6, 6, 6, 6, 6],
[7, 7, 7, 7, 7, 7],
[8, 8, 8, 8, 8, 8],
[9, 9, 9, 9, 9, 9]])
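The multiplication by ones is not strictly needed to trigger broadcasting. As a sketch of two alternatives (the variable names are only illustrative): np.broadcast_to returns a read-only view without copying data, and np.tile builds a regular writable array.

```python
import numpy as np

col = np.arange(1, 10)[:, None]        # column vector, shape (9, 1)

view = np.broadcast_to(col, (9, 6))    # read-only view, no data copied
b = view.copy()                        # copy it if a writable array is needed

tiled = np.tile(col, (1, 6))           # same values, built by repeating the column
print(b)
```

The view is handy when the array is only read; writing to it raises an error, which is why the copy is shown.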
For any w*l size, build the array from a list of lists like so:
w = 6
l = 9
np.array([[1+i]*w for i in range(l)])
array([[1, 1, 1, 1, 1, 1],
[2, 2, 2, 2, 2, 2],
[3, 3, 3, 3, 3, 3],
[4, 4, 4, 4, 4, 4],
[5, 5, 5, 5, 5, 5],
[6, 6, 6, 6, 6, 6],
[7, 7, 7, 7, 7, 7],
[8, 8, 8, 8, 8, 8],
[9, 9, 9, 9, 9, 9]])
np.transpose(np.array(([np.arange(1,10)] * 6)))
np.arange(1,10) creates a NumPy array with the values 1 to 9.
[] puts the array into a list.
*6 repeats that list element 6 times.
np.array() converts the resulting structure (a list of arrays) to a 6x9 NumPy array.
np.transpose() swaps the axes to get the vertical 9x6 layout.
I want to divide an 8*8 array into 4 segments (each segment a 4*4 array) as shown below in step 2. Then I want to divide each segment again into 4 smaller subsegments (each subsegment a 2*2 array), find the mean of each subsegment, and then find the standard deviation of each segment using the 4 means of its 4 subsegments. In the end I should have a single 2*2 array, i.e. one standard deviation per segment.
import numpy as np
from skimage.util.shape import view_as_blocks
arr=np.array([[1,2,3,4,5,6,7,8],[1,2,3,4,5,6,7,8],[1,2,3,4,5,6,7,8],[1,2,3,4,5,6,7,8],[1,2,3,4,5,6,7,8],[1,2,3,4,5,6,7,8],[1,2,3,4,5,6,7,8],[1,2,3,4,5,6,7,8]])
img= view_as_blocks(arr, block_shape=(4,4))
This is as far as I have got. I could not go further because I am completely new to Python and NumPy. Kindly help me achieve my requirement.
#step1-Array
array([[1, 2, 3, 4, 5, 6, 7, 8],
[1, 2, 3, 4, 5, 6, 7, 8],
[1, 2, 3, 4, 5, 6, 7, 8],
[1, 2, 3, 4, 5, 6, 7, 8],
[1, 2, 3, 4, 5, 6, 7, 8],
[1, 2, 3, 4, 5, 6, 7, 8],
[1, 2, 3, 4, 5, 6, 7, 8],
[1, 2, 3, 4, 5, 6, 7, 8]])
#step2-segments
array([[[[1, 2, 3, 4],
[1, 2, 3, 4],
[1, 2, 3, 4],
[1, 2, 3, 4]],
[[5, 6, 7, 8],
[5, 6, 7, 8],
[5, 6, 7, 8],
[5, 6, 7, 8]]],
[[[1, 2, 3, 4],
[1, 2, 3, 4],
[1, 2, 3, 4],
[1, 2, 3, 4]],
[[5, 6, 7, 8],
[5, 6, 7, 8],
[5, 6, 7, 8],
[5, 6, 7, 8]]]])
**more steps to go to get final output**
Expected Output
([[1.0, 1.0],
[1.0, 1.0]])
It can be done using the view_as_blocks function from skimage.util.shape.
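For reference, the same two-level blocking can also be sketched with plain NumPy reshapes, without skimage. The assumptions here are an 8*8 input, 4*4 segments, 2*2 subsegments, and the population standard deviation (np.std with its default ddof=0), which is what reproduces the expected output of 1.0:

```python
import numpy as np

arr = np.tile(np.arange(1, 9), (8, 1))   # the 8x8 array from the question

# Split the row index and the column index into three levels each.
# Axis order: (seg_row, sub_row, pix_row, seg_col, sub_col, pix_col).
blocks = arr.reshape(2, 2, 2, 2, 2, 2)

# Mean over the pixels of each 2x2 subsegment.
# Result shape (2, 2, 2, 2), indexed (seg_row, sub_row, seg_col, sub_col).
sub_means = blocks.mean(axis=(2, 5))

# Standard deviation over the four subsegment means of each segment.
# Result shape (2, 2): one value per 4x4 segment.
seg_std = sub_means.std(axis=(1, 3))
print(seg_std)
```

In the top-left segment the four subsegment means are 1.5, 3.5, 1.5 and 3.5, so the standard deviation is 1.0.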
How to create pandas dataframe in the following format:
A B C D
0 [1,2,3,4] [2,3,4,5] [4,5,5,6] [6,3,4,5]
1 [2,3,5,6] [3,4,6,6] [3,4,5,7] [2,6,3,4]
2 [8,9,6,7] [5,7,9,5] [3,7,9,5] [5,7,9,8]
Basically, each cell holds a list. I am trying to classify data using machine learning. Each data point has 40 x 6 values. Is there any other format that is suitable to be fed into a classifier?
Edit:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plot
from sklearn.neighbors import KNeighborsClassifier
# Read csv data into pandas data frame
data_frame = pd.read_csv('data.csv')
extract_columns = ['LinearAccX', 'LinearAccY', 'LinearAccZ', 'Roll', 'pitch', 'compass']
# Number of sample in one shot
samples_per_shot = 40
# Calculate number of shots in dataframe
# Integer division so the result can be passed to range()
count_of_shots = len(data_frame.index) // samples_per_shot
# Initialize Empty data frame
training_index = range(count_of_shots)
training_data_list = []
# flag for backward compatibility
make_old_data_compatible_with_new = 0
if make_old_data_compatible_with_new:
    # Convert 40 shot data to 25 shot data
    # New logic takes 25 samples/shot
    # old logic takes 40 samples/shot
    start_shot_sample_index = 9
    end_shot_sample_index = 34
else:
    # Start index from 1 and continue till lets say 40
    start_shot_sample_index = 1
    end_shot_sample_index = samples_per_shot
# Extract each shot into pandas series
for shot in range(count_of_shots):
    # Extract current shot
    current_shot_data = data_frame[data_frame['shot_no']==(shot+1)]
    # Select only the following column
    selected_columns_from_shot = current_shot_data[extract_columns]
    # Select columns from selected rows
    # Find start and end row indexes
    current_shot_data_start_index = shot * samples_per_shot + start_shot_sample_index
    current_shot_data_end_index = shot * samples_per_shot + end_shot_sample_index
    # .ix was removed in pandas 1.0; .loc is the label-based equivalent for the default integer index
    selected_rows_from_shot = selected_columns_from_shot.loc[current_shot_data_start_index:current_shot_data_end_index]
    # Append to list of lists
    # Convert selected shot into multi-dimensional array
    training_data_list.append([selected_columns_from_shot[extract_columns[index]].values.tolist() for index in range(len(extract_columns))])
# Append each sliced shot into training data
training_data = pd.DataFrame(training_data_list, columns=extract_columns)
training_features = [1 for i in range(count_of_shots)]
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(training_data, training_features)
simple
pd.DataFrame(
[[[1, 2, 3, 4], [2, 3, 4, 5], [4, 5, 5, 6], [6, 3, 4, 5]],
[[2, 3, 5, 6], [3, 4, 6, 6], [3, 4, 5, 7], [2, 6, 3, 4]],
[[8, 9, 6, 7], [5, 7, 9, 5], [3, 7, 9, 5], [5, 7, 9, 8]]],
columns=list('ABCD')
)
Or
build a Series with a MultiIndex and unstack
lst = [
[1, 2, 3, 4],
[2, 3, 4, 5],
[4, 5, 5, 6],
[6, 3, 4, 5],
[2, 3, 5, 6],
[3, 4, 6, 6],
[3, 4, 5, 7],
[2, 6, 3, 4],
[8, 9, 6, 7],
[5, 7, 9, 5],
[3, 7, 9, 5],
[5, 7, 9, 8]]
pd.Series(lst, pd.MultiIndex.from_product([[0, 1, 2], list('ABCD')])).unstack()
A B C D
0 [1, 2, 3, 4] [2, 3, 4, 5] [4, 5, 5, 6] [6, 3, 4, 5]
1 [2, 3, 5, 6] [3, 4, 6, 6] [3, 4, 5, 7] [2, 6, 3, 4]
2 [8, 9, 6, 7] [5, 7, 9, 5] [3, 7, 9, 5] [5, 7, 9, 8]
You can try this:
import pandas as pd
data = [{'A': [1,2,3,4], 'B': [2,3,4,5], 'C': [4,5,5,6], 'D': [6,3,4,5]}, {'A': [2,3,5,6], 'B': [3,4,6,6], 'C': [3,4,5,7], 'D': [2,6,3,4]}, {'A': [8,9,6,7], 'B': [5,7,9,5], 'C': [3,7,9,5], 'D': [5,7,9,8]}]
df = pd.DataFrame(data)
print(df)
# Output
A B C D
0 [1, 2, 3, 4] [2, 3, 4, 5] [4, 5, 5, 6] [6, 3, 4, 5]
1 [2, 3, 5, 6] [3, 4, 6, 6] [3, 4, 5, 7] [2, 6, 3, 4]
2 [8, 9, 6, 7] [5, 7, 9, 5] [3, 7, 9, 5] [5, 7, 9, 8]
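On the second part of the question (a format suitable for a classifier): scikit-learn estimators such as KNeighborsClassifier expect a 2-D numeric array of shape (n_samples, n_features), so a DataFrame whose cells are lists is unlikely to work as input. One common alternative is to flatten each 40 x 6 shot into a single row of 240 features. A minimal sketch with NumPy only, using random placeholder data in place of the CSV and assuming the shots are stored back to back, 40 rows each:

```python
import numpy as np

n_shots, samples_per_shot, n_channels = 3, 40, 6

# Placeholder for the CSV contents: one row per sample, one column per
# sensor channel (LinearAccX, LinearAccY, ...), shots stored consecutively.
rng = np.random.default_rng(0)
samples = rng.normal(size=(n_shots * samples_per_shot, n_channels))

# Group consecutive rows into shots: shape (n_shots, 40, 6).
shots = samples.reshape(n_shots, samples_per_shot, n_channels)

# Flatten each shot into one feature vector: shape (n_shots, 240).
X = shots.reshape(n_shots, -1)
print(X.shape)
```

X, together with one label per shot, has the shape that knn.fit expects. Whether raw flattened samples are good features for this data is a separate modelling question.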