I have an array A with shape (2, 4, 1). I want to calculate the mean of A[0] and the mean of A[1] and store both in A_mean. The current and expected outputs are shown below.
import numpy as np
A = np.array([[[1.7],
               [2.8],
               [3.9],
               [5.2]],
              [[2.1],
               [8.7],
               [6.9],
               [4.9]]])
for i in range(len(A)):
    A_mean = np.mean(A[i])
print(A_mean)
The current output is
5.65
The expected output is
[3.4,5.65]
The for loop is not necessary because NumPy already knows how to operate on whole arrays. The solution is to remove the loop and pass an axis argument instead:
A_mean=np.mean(A, axis=1)
print(A_mean)
Outputs:
[[3.4 ]
[5.65]]
If you want a flat result like [3.4 5.65], you can remove the extra dimension with ravel():
print(A_mean.ravel())
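If you want to skip the separate ravel() step, np.mean also accepts a tuple of axes; a minimal sketch, assuming the same A as above:
import numpy as np

A = np.array([[[1.7], [2.8], [3.9], [5.2]],
              [[2.1], [8.7], [6.9], [4.9]]])

# averaging over axes 1 and 2 collapses each (4, 1) block to one scalar
A_mean = np.mean(A, axis=(1, 2))
print(A_mean)  # [3.4  5.65]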
Try this.
import numpy as np
A = np.array([[[1.7],
               [2.8],
               [3.9],
               [5.2]],
              [[2.1],
               [8.7],
               [6.9],
               [4.9]]])
A_mean = []
for i in range(len(A)):
    A_mean.append(np.mean(A[i]))   # collect each mean instead of overwriting
print(A_mean)
I'm trying to get my code to take times in 24-hour format (such as 0930 for 09:30 AM, or 2045 for 08:45 PM) and output them as 09:30 and 20:45, respectively.
I tried using datetime and strftime, etc., but I can't apply them to NumPy arrays.
I've also tried formatting as NumPy datetime64, but I can't seem to get the HH:MM format.
This is my array (example)
[0845 0925 1046 2042 2153]
and I would like to output this:
[08:45 09:25 10:46 20:42 21:53]
Thank you all in advance.
Although I'm not entirely sure what you are trying to accomplish, I think this is the desired output.
For parsing dates you should first use strptime to get a datetime object and then strftime to render it back into the desired string.
You say you have NumPy arrays, but the leading zeros in your example suggest it is an array with a string dtype.
Custom functions can be vectorized to work on NumPy arrays.
import numpy as np
from datetime import datetime
a = np.array(["0845", "0925", "1046", "2042", "2153"], dtype=str)

def fun(x):
    # parse the 4-digit string, then render it back as HH:MM
    x = datetime.strptime(x, "%H%M")
    return datetime.strftime(x, "%H:%M")
vfunc = np.vectorize(fun)
result = vfunc(a)
print(result)
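Note that np.vectorize is essentially a convenience wrapper around a Python-level loop, so it won't be faster than an explicit loop; it just keeps the array-in, array-out interface tidy.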
You can leverage pandas.to_datetime():
import numpy as np
import pandas as pd
x = np.array(["0845","0925","1046","2042","2153"])
y = pd.to_datetime(x, format="%H%M").to_numpy()
Outputs:
>>> x
['0845' '0925' '1046' '2042' '2153']
>>> y
['1900-01-01T08:45:00.000000000' '1900-01-01T09:25:00.000000000'
'1900-01-01T10:46:00.000000000' '1900-01-01T20:42:00.000000000'
'1900-01-01T21:53:00.000000000']
More info:
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.to_datetime.html
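If the goal is the HH:MM strings themselves rather than full timestamps, chaining strftime() onto the same call should work; a minimal sketch:
import numpy as np
import pandas as pd

x = np.array(["0845", "0925", "1046", "2042", "2153"])
# render each parsed timestamp back to an HH:MM string
y = pd.to_datetime(x, format="%H%M").strftime("%H:%M").to_numpy()
print(y)  # ['08:45' '09:25' '10:46' '20:42' '21:53']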
Assuming that the initial array includes only valid values (e.g. values like '08453' or '2500' are not valid), this is a simple solution that does not require importing anything beyond NumPy:
import numpy
arr = numpy.array(["0845", "0925", "1046", "2042", "2153"])
new_arr = []
for x in arr:
    elem = x[:2] + ":" + x[2:]
    new_arr.append(elem)
print(new_arr)
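If you want the result to stay a NumPy array, the same slicing fits in a list comprehension; a one-line variant of the loop above:
import numpy
arr = numpy.array(["0845", "0925", "1046", "2042", "2153"])
# slice hours and minutes from each 4-character string and rejoin with ":"
new_arr = numpy.array([x[:2] + ":" + x[2:] for x in arr])
print(new_arr)  # ['08:45' '09:25' '10:46' '20:42' '21:53']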
I am reading a CSV file that has two columns, RFMin and RFMax:
RFMin   RFMax
1000    3333
5125.5  5888
I want 10 numbers between RFMin and RFMax using linspace in Python.
import pandas as pd
import numpy as np

df = pd.read_csv(filePath)
RFRange = np.linspace(df['RFMin'], df['RFMax'], 10)
RFRange = RFRange.flatten()
RFarray = []
for i in RFRange:
    RFarray.append(i)
data_dict = {'RFRange': RFarray}
data = pd.DataFrame(data_dict)
data.to_csv('Output.csv', header=True, sep='\t')
I want something like this:
1000
1259.22
1518.44
1777.67
...
3333
5125.5
5210.22
5294.94
...
5888
Your problem is coming from the call to flatten. NumPy's flatten converts a 2D array into a 1D array, and it does so in row-major order by default (https://numpy.org/doc/1.18/reference/generated/numpy.ndarray.flatten.html).
In [1]: a = [1000,5125.5]
In [2]: b = [3333,5888]
In [3]: import numpy as np
In [4]: np.linspace(a,b,10)
Out[4]:
array([[1000. , 5125.5 ],
[1259.22222222, 5210.22222222],
[1518.44444444, 5294.94444444],
[1777.66666667, 5379.66666667],
[2036.88888889, 5464.38888889],
[2296.11111111, 5549.11111111],
[2555.33333333, 5633.83333333],
[2814.55555556, 5718.55555556],
[3073.77777778, 5803.27777778],
[3333. , 5888. ]])
In [5]: np.linspace(a,b,10).flatten()
Out[5]:
array([1000. , 5125.5 , 1259.22222222, 5210.22222222,
1518.44444444, 5294.94444444, 1777.66666667, 5379.66666667,
2036.88888889, 5464.38888889, 2296.11111111, 5549.11111111,
2555.33333333, 5633.83333333, 2814.55555556, 5718.55555556,
3073.77777778, 5803.27777778, 3333. , 5888. ])
As you can see, this interleaves the two ranges instead of keeping each range together, which is why the output differs from what you expect.
There are a few ways to change the order (see the sketch after this list):
1) As per https://numpy.org/doc/1.18/reference/generated/numpy.ndarray.flatten.html you can use Fortran (column-major) ordering when flattening.
2) You can transpose your data before flattening:
RFRange = RFRange.T.flatten() or RFRange = RFRange.transpose().flatten()
3) You can add a second loop when appending and append directly from the 2D array. I would suggest avoiding this method though: it is fine for 10 points, but large loops can be quite slow in Python, so it is better to use NumPy's built-in functions where possible. For example, a 1D NumPy array can easily be converted to a list with the following command:
RFArray = list(RFRange)
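A minimal sketch of options 1 and 2, assuming the same a and b as in the transcript above; both give the per-range ordering the question asks for:
import numpy as np

a = [1000, 5125.5]
b = [3333, 5888]
RFRange = np.linspace(a, b, 10)          # shape (10, 2): one column per CSV row

col_major = RFRange.flatten(order='F')   # option 1: Fortran (column-major) flattening
transposed = RFRange.T.flatten()         # option 2: transpose, then default row-major flatten

print(np.array_equal(col_major, transposed))  # True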
You want the array in ascending order? If so, just do RFarray.sort()
I am new to Python, so I'd really appreciate your help.
I have a 2-column array, d.T, and a 1-column array, result, and I want to combine them into a 3-column array. I have tried many approaches, but none worked; even np.vstack fails because the dimensions differ.
import numpy as np
import math
n = 3
m = 3
T = 4
xmin = 0; xmax = l = 4
zmin = 0; zmax = h = 2
nx = 5; nz = 5
dx = (xmax - xmin) * 1.0 / (nx - 1)
dz = (zmax - zmin) * 1.0 / (nz - 1)
dt = 0.00001
nt = 1
k_z = n * 2 * math.pi / h
k_x = m * 2 * math.pi / l
w_theo = np.zeros((nz, nx), dtype='float64')
xx = []
for i in range(nx):
    xx.append(i * dx)
zz = []
for k in range(nz):
    zz.append(k * dz)
[x, z] = np.meshgrid(xx, zz)
for i in range(nz):
    for k in range(nx):
        t = 0 + nt * dt; omega = 2 * math.pi / T
        w_theo[i, k] = round(np.sin(k_z * i * dz * 1.0) * np.sin(k_x * k * dx * 1.0 - omega * t), 10)
print(w_theo)
np.savetxt('Theoretical_result.txt', np.array(w_theo), delimiter="\t")

d = np.array([x.flatten(), z.flatten()])
result = []
for i in range(nz):
    for k in range(nx):
        result.append(w_theo[nz - 1 - i, k])
myarray = np.asarray(result)
print(myarray.shape, d.T.shape)

# data = []
# data = np.vstack((d.T, myarray))
# np.savetxt('datafile_id', data)
Try
data = np.column_stack((d.T, myarray))
No need for data = []
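A quick shape check of what column_stack does here, using placeholder zero arrays with the shapes produced by the code above (nz*nx = 25):
import numpy as np

d_T = np.zeros((25, 2))   # stands in for d.T: one (x, z) pair per grid point
myarray = np.zeros(25)    # stands in for the 1D result array

data = np.column_stack((d_T, myarray))
print(data.shape)         # (25, 3) -- the 1D array becomes one extra column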
I'm looking for a way to read this CSV into Python 2.7 and turn it into a (3, 22000) array. For some reason I haven't been able to do it, no matter which way I try: I either get a group of strings in an array that I can't convert, or the array seen below, which won't convert to floats or allow computations on them. Any help would be appreciated. Thanks.
For the record, it says the shape is (22000,), which I'm also unsure about.
In [126]: import csv
import numpy as np
with open("Data.csv") as sd:
    ri = []
    dv = []
    for row in csv.reader(sd):
        if row != ["ccx", "ccy", "ccz", "cellVolumes", "Cell Type"]:
            nrow = []
            for val in row[0:3]:
                val = float(val)
                nrow.append(val)
            ri.append(nrow)
            nrow = []
            for val in row[3:4]:
                val = float(val)
                nrow.append(val)
            dv.append(nrow)
ri = np.array(ri)
ri
Out[126]: array([[-0.179967, -0.38936, -0.46127], [-0.0633236, -0.407683, -0.542979],
       [-0.125841, -0.494202, -0.412042], ...,
       [-0.0116821, 0.764493, 0.573541], [0.630377, 0.469657, 0.442017],
       [0.248253, 0.615365, 0.354134]], dtype=object)
(from the helpful comments)
Check the length of those sublists. If they are all the same, I'd expect a 2D array; but if they differ (most 3, but some 0, 2, 4, etc.) then the best it can do is give you a 1D array of 'objects': the lists.
I would just do [len(x) for x in ri] before passing it to np.array, and maybe apply max and min. A list comprehension like that won't take long.
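A short sketch of that check, using a deliberately ragged example list (the values here are just for illustration):
import numpy as np

ri = [[-0.179967, -0.38936, -0.46127],
      [-0.0633236, -0.407683]]              # deliberately ragged: one row is short

lengths = [len(x) for x in ri]
print(min(lengths), max(lengths))           # 2 3 -> rows differ in length
print(np.array(ri, dtype=object).shape)     # (2,) -- a 1D object array, not a 2D float array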
I'm using numpy savetxt() to save the elements of a matrix to file as a single row (I need to print lots of them in order). This is the method I've found:
import numpy as np
mat = np.array([[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9]])
with open('myfile.dat', 'a') as handle:
    np.savetxt(handle, mat.reshape(1, mat.size), fmt='%+.8e')
# the with block closes the file automatically
There are 2 questions:
1) Is savetxt() the best option? I need to print 1e5 to 1e7 of these things... and I don't want i/o bottlenecking the actual computation. I'm guessing reopening the file each iteration is a bad plan, speed-wise.
2) Ideally I would have some context data printed to start each row so my output might look like:
(N foo mat):
...
6 -2.309 +1.000 +2.000 ...
7 -4.273 +1.000 +2.000 ...
8 -3.664 +1.000 +2.000 ...
...
I could do this using np.append(), but then the first number won't print as an int. Is this sort of thing doable directly in savetxt()? Or do I need a C-like fprintf() anyway?
Pandas has a good to_csv method:
import pandas as pd
import numpy as np
mat = np.array([[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9]])
df = pd.DataFrame(data=mat.astype(float))   # np.float is deprecated; plain float works
df.to_csv('myfile.dat', sep=' ', float_format='%+.8e', header=False)
By default it'll add the index (index=True); if you want different context data, you can add it to your data frame and set index=False.
$ cat myfile.dat
0 +1.00000000e+00 +2.00000000e+00 +3.00000000e+00
1 +4.00000000e+00 +5.00000000e+00 +6.00000000e+00
2 +7.00000000e+00 +8.00000000e+00 +9.00000000e+00
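A sketch of the context-data idea, reusing the foo values from the question above (the 'meta' column name is just for illustration):
import numpy as np
import pandas as pd

mat = np.array([[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9]])
df = pd.DataFrame(data=mat.astype(float))
df.insert(0, 'meta', [-2.309, -4.273, -3.664])  # hypothetical per-row context column
df.to_csv('myfile.dat', sep=' ', float_format='%+.8e', header=False)  # index supplies N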
OK. My original code for printing as an array only works if you want to print once. mat.reshape() does not return an independent copy: it returns a view that shares memory with mat, and that interaction broke the linalg routines on the next pass through my loop.
To avoid this we need to reshape a copy() of mat. I've also added a tmp variable for clarity.
import numpy as np
mat = np.array([[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9]])  # initialize mat to see the format

handle = open('myfile.dat', 'ab')
for n in range(N):
    # perform linalg calculations on mat ...
    meta = foo  # based on the current mat
    tmp = np.hstack(([[n]], [[meta]], mat.copy().reshape(1, mat.size)))
    np.savetxt(handle, tmp, fmt='%+.8e')
handle.close()
This gets the context data n and meta in this case. I can live with n being saved as a float.
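As an aside, savetxt() also accepts a sequence of per-column format strings, so n can stay an integer; a minimal sketch with example values standing in for the loop variables:
import numpy as np

mat = np.array([[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9]])
n, meta = 6, -2.309                       # example values from the output above
tmp = np.hstack(([[n]], [[meta]], mat.copy().reshape(1, mat.size)))

# one format per column: integer for n, float for everything else
fmt = ['%d'] + ['%+.8e'] * (1 + mat.size)
np.savetxt('myfile.dat', tmp, fmt=fmt)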
I did some benchmarking to check the I/O cost. I set N=100,000 for the loop and averaged the run time over 6 runs:
no i/o, just computations: 9.1 sec
as coded above: 17.2 sec
open 'myfile.dat' to append each iteration: 30.6 sec
So the I/O doubles the runtime and, as expected, constantly opening and closing a file is a bad plan.