I have a question about InsertValue
If I understand correctly, it only takes integer arguments. I was wondering if there is a way to have it take float values, or maybe some other function that does the job of InsertValue but takes float values? I know there is InsertNextValue, but I am not sure it would be efficient in my case since my array is very big (~ 100.000 by 120).
Below is my code. For now I am converting the fl values to integers to make it work, but ideally I would not have to do that.
Thanks in advance :)
import vtk
import math
from vtk import vtkStructuredGrid, vtkPoints, vtkFloatArray, vtkXMLStructuredGridWriter
import scipy.io
import numpy
import os
#loading the matlab files
mats = scipy.io.loadmat('/home/lusine/data/3DDA/donut_for_vtk/20130228_050000_3D_E=1.mat')
#x,y,z coordinate, fl flux values
xx = mats['xvect']
yy = mats['yvect']
zz = mats['zvect']
fl = mats['fluxmesh3d'] #3d matrix
nx = xx.shape[1]
ny = yy.shape[1]
nz = zz.shape[1]
fl = numpy.nan_to_num(fl)
inx = numpy.nonzero(fl)
l = len(inx[1])
grid = vtk.vtkStructuredGrid()
grid.SetDimensions(nx,ny,nz) # sets the dimensions of the grid
pts = vtk.vtkPoints() # represents 3D points, The data model for vtkPoints is an array of vx-vy-vz triplets accessible by (point or cell) id.
pts.SetNumberOfPoints(nx*ny*nz) # Specify the number of points for this object to hold.
p=0
for i in range(l):
    pts.InsertPoint(p, xx[0][inx[0][i]], yy[0][inx[1][i]], zz[0][inx[2][i]])
    p = p + 1
SetPoint()
grid.SetPoints(pts)
cdata = vtk.vtkFloatArray()
cdata.SetNumberOfComponents(1)
cdata.SetNumberOfTuples((nx-1)*(ny-1)*(nz-1))
cdata.SetName('cellData')
p=0
for i in range(l-1):
    cdata.InsertValue(p,inx[0][i]+inx[1][i]+inx[2][i])
    p = p+1
grid.GetCellData().SetScalars(cdata)
pdata = vtk.vtkFloatArray()
pdata.SetNumberOfComponents(1)
#Get the number of tuples (a component group) in the array
pdata.SetNumberOfTuples(nx*ny*nz)
#Sets the array name
pdata.SetName('pointData')
for i in range(l):
    pdata.InsertValue(int(fl[inx[0][i]][inx[1][i]][inx[2][i]]), inx[0][i]+inx[1][i]+inx[2][i])
grid.GetPointData().SetScalars(pdata)
writer = vtk.vtkXMLStructuredGridWriter()
writer.SetFileName('new_grid.vts')
#writer.SetInput(grid)
writer.SetInputData(grid)
writer.Update()
print 'end'
The first argument of InsertValue requires an integer because it's the index where the value is going to be inserted. If instead of a vtkFloatArray pdata you had a numpy array called p, this would be the equivalent of your instruction:
pdata.InsertValue(a,b) becomes p[a]=b
p[0.1] wouldn't make sense; the index a must be an integer, while the value b can be a float.
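The numpy analogy can be checked directly. This is only a sketch of the index/value distinction using a plain numpy array (not VTK itself); the array name p follows the text above and the sizes are made up.

```python
import numpy as np

# A float array, playing the role of the vtkFloatArray in the analogy.
p = np.zeros(5, dtype=np.float32)

# Integer index, float value: fine. This mirrors InsertValue(index, value).
p[2] = 3.75

# Float index: numpy refuses, just as InsertValue needs an integer id.
try:
    p[0.1] = 1.0
    float_index_ok = True
except IndexError:
    float_index_ok = False

print(p[2], float_index_ok)
```

So the float data belongs in the second argument; the first argument only says where it goes.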
But I am a bit lost on the data. What do you mean that your array is (~ 100.000 by 120)..do you have 100.000 points, and each point has a vector of 120 components? In such a case, your pdata should have 120 components, and for each point point_index you call
pdata.SetTuple(point_index, [v0, v1, ..., v119])
or
pdata.SetComponent(point_index, 0, v0)
...
pdata.SetComponent(point_index, 119, v119)
If not, are you sure that you have to index pdata by the fl values? For that to work, fl must be an integer with 0 <= fl < ntuples, and there must be no holes. Check whether you can do the same thing that you do for cdata (by the way, in your code p is always equal to i, so you can just use i).
It's also possible to copy a numpy array directly to VTK (see http://vtk.1045678.n5.nabble.com/vtk-to-numpy-how-to-get-a-vtk-array-tp1244891p1244895.html), but you have to be very careful with the structure of your data.
Hi there I am having trouble with my code for a function I am adapting based on a previous code where I manually typed in the data:
I updated my code to be:
import numpy as np
import pandas as pd
def pitch(X0,Y0,Z0,V0MPH,RPM,GyroAngle,TiltAngle,Phi,Theta,WS,WD,Temp,RH,Pressure):
  for x0 in X0:
    for y0 in Y0:
      for z0 in Z0:
        for V0 in V0MPH:
          for R in RPM:
            for a in GyroAngle:
              for b in TiltAngle:
                for phi in Phi:
                  for theta in Theta:
                    for spd in WS:
                      for dire in WD:
                        for Tf in Temp:
                          for H in RH:
                            for P in Pressure:
                              uwindfts=spd*np.sin(dire) #U Wind in ft/s
                              vwindfts=spd*np.cos(dire) #V Wind in ft/s
                              timestep=0 #time step for calculating
                              dt=0.001 #D-time
                              circ=9.125 #Circumference of ball
                              phi=phi*np.pi/180
                              theta=theta*np.pi/180
                              if phi>0: #Angle pitch is released wrt HP-2B line
                                a=1
                              else:
                                a=-1
                              Tc=(Tf-32.0)*(5/9) #F to C
                              Tk=Tc+273.15 #C to K
                              dryrho=P*100/(287*Tk) #Density
                              ep=RH*0.01*6.11*np.exp(17.625*Tc/(243.05+Tc))
                              wetrho=ep*100/(461*Tk)
                              rho=dryrho-wetrho
                              # print(rho)
                              gyrospin=R*np.sin(a*np.pi/180)
                              sidespin=a*(R-gyrospin)*np.sin(b*np.pi/180)
                              backspin=-1*(R-gyrospin)*np.cos(b*np.pi/180)
                              v=V0*1.467
                              c0=(0.07182*rho*0.06261)
                              vx=v*np.cos(theta*np.pi/180)*np.sin(phi*np.pi/180)
                              vy=v*np.cos(theta*np.pi/180)*np.cos(phi*np.pi/180)
                              vz=v*np.sin(theta*np.pi/180)
                              vwow=np.sqrt((vx-uwindfts)**2+(vy-vwindfts)**2+(vz)**2)
                              # print (womg)
                              # print (vz)
                              # print (vz+wvv)
                              # print (vz-wvv)
                              constk=np.pi/30.0
                              x=x0
                              y=y0
                              z=z0
                              while y >=(17/12):
                                visco=2.791*(10.0**-7)*(Tk**0.7355)
                                vre=v*(0.44704/1.4617)
                                Re=rho*(vre)*(circ*0.254/(np.pi))/visco
                                upper=74.1*1.467*210000.0/Re
                                lower=14.3*1.467*210000.0/Re
                                cd=0.5-(0.227/(1.0+np.exp(-1.0*(v-upper)/lower)))
                                dragx=-cd*c0*vwow*(vx-uwindfts)
                                dragy=-cd*c0*vwow*(vy-vwindfts)
                                dragz=-cd*c0*vwow*(vz)
                                omgx=constk*(backspin*np.cos(phi)-(sidespin*np.sin(theta)*np.sin(phi))+gyrospin*vx/vwow)
                                omgy=constk*(-1*backspin*np.sin(phi)-(sidespin*np.sin(theta)*np.sin(phi))+gyrospin*vy/vwow)
                                omgz=constk*(sidespin*np.cos(theta)+gyrospin*(vz/vwow))
                                omg=np.sqrt(omgx**2.0+omgy**2.0+omgz**2.0)
                                romg=omg*(circ/(2.0*np.pi))/12.0
                                S=np.exp((timestep*dt)/1000.0)*(romg/vwow)
                                cl=1.0/(2.32+(0.4/S))
                                magnusx=c0*(cl/omg)*vwow*(omgy*(vz)-omgz*(vy-vwindfts))
                                magnusy=c0*(cl/omg)*vwow*(omgz*(vx-uwindfts)-omgx*(vz))
                                magnusz=c0*(cl/omg)*vwow*(omgx*(vy-vwindfts)-omgy*(vx-uwindfts))
                                ax=magnusx+dragx
                                ay=magnusy+dragy
                                az=magnusz+dragz-32.174
                                vx=vx+ax*dt
                                vy=vy-ay*dt
                                vz=vz+az*dt
                                v=np.sqrt((vx)**2+(vy)**2+vz**2)
                                vwow=np.sqrt((vx-uwindfts)**2+(vy-vwindfts)**2+(vz)**2)
                                x=x+vx*dt+ax*(dt**2)
                                y=y+vy*dt+ay*(dt**2)
                                z=z+vz*dt+az*(dt**2)
                                vmph=v/1.467
                                timestep+=1
                              xfinal=x
                              zfinal=z
                              print (xfinal,zfinal)
                              return xfinal,zfinal
pitch(-2.5,55,6,95,2450,5,183,176.5,-2,5,0,72,55,1013.25)
in order to read arrays but the new code doesn't want to read float.
Here's how the array RPM propagates through to y. y starts as a single number, but once the first iteration is done, it is an array with the same shape as RPM (if I've read the code right).
def pitchtrack(x,y,z,RPM,v0mph,gyro,tiltangle,phi,theta,wu,wv,wvv,Tf,RH,Pmb):
    ....
    backspin=-1*(RPM-gyrospin)*np.cos(a*tiltangle*np.pi/180)
    ...
    while y>=(17/12):
        ...
        omgx=np.pi*(backspin*np.cos(phi*np.pi/180.0)
        ...
        magnusy=c0*(cl/omg)*vwow*(omgz*(vx-wu)-omgx*(vz))
        ...
        ay=magnusy+dragy
        ...
        vy=vy-ay*dt
        ...
        y=y+vy*dt
        ...
y>=(17/12) will then be a boolean array with multiple values. That's what the error is all about. Such an array CANNOT be used in a Python context that expects ONE boolean value. Examples include if, or and, as here, while.
The error message suggests using any or all to reduce that array to one value. But you have to decide which is the right one. Sometimes other things might be used, such as sum or max/min, or other means of reducing the multiple values to one.
Are you sure the author of this code intended for RPM to be an array? Maybe it was written with a scalar value in mind - and tested with such!
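A small sketch of the ambiguity, using made-up values for y (they are not taken from the question). It shows the element-wise comparison, the ValueError that a while or if would hit, and the two reductions the error message suggests.

```python
import numpy as np

# Hypothetical state: y has become an array after an array input propagated into it.
y = np.array([55.0, 1.0])
threshold = 17 / 12

# Element-wise comparison gives one boolean per element.
mask = y >= threshold

# Asking for a single truth value of a multi-element array is ambiguous.
try:
    bool(mask)          # this is what `while mask:` does internally
    raised = False
except ValueError:
    raised = True

# Reductions give one boolean; which one is right depends on the intent.
keep_going_any = mask.any()   # loop while at least one pitch is still in flight
keep_going_all = mask.all()   # loop only while every pitch is still in flight

print(mask, raised, keep_going_any, keep_going_all)
```

With any(), the loop keeps stepping elements that have already crossed the threshold, so those values would need to be frozen or masked separately.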
Following is the numpy array I have. I need to create a matrix containing zeros, for instance with np.zeros([1,1]).
newEdges =
array([['0', 'Firm'],
['1', 'Firm'],
['2', 'Firm'],
...,
['binA', 'year2017_bin'],
['binA', 'year2017_bin'],
['binA', 'year2017_bin']],
dtype='<U21')
newEdges.shape
#(63673218, 2)
newEdges.size
#127346436
However, based on the size of my matrix (as you can see above, that is, (63673218, 2)), if I run syntax to generate the zeroes matrix I get a Memory Error.
Here is the full code:
print(newEdges)
unique_Bin = np.unique(newEdges[:,0])
n_unique_Bin = len(unique_Bin)
unique_Bin
n_unique_Bin
#3351248
Q = np.zeros([n_unique_Bin,n_unique_Bin])
--------------------------------------------------------------------------
MemoryError Traceback (most recent call last)
<ipython-input-16-581dfaca2eab> in <module>()
----> 1 Q = np.zeros([n_unique_Bin,n_unique_Bin])
MemoryError:
How do I resolve this error? Or, how would I safely convert this huge matrix to a sparse matrix for further calculation done below:
for n, employer_employee in enumerate(newEdges):
    #print(employer_employee)
    #copy the array so the original stays intact
    eee = np.copy(newEdges)
    #substitute the current tuple with an empty one to avoid self comparing
    eee[n] = (None,None)
    #get the index for the current employee, the one on the y axis
    employee_index = np.where(employer_employee[0] != unique_Bin)
    #get the indexes where the employees' letters match
    eq_index = np.where(eee[:,1] == employer_employee[1])[0]
    eq_employee = eee[eq_index,0]
    #add at the final array Q by index
    for emp in eq_employee:
        #print(np.unique(emp))
        emp_index = np.where(unique_Bin == emp)
        #print(emp)
        Q[employee_index,emp_index]+= 1
        # print(Q)
print(Q)
I have 24GB left in the memory for this calculation.
Just to point this out you are trying to create an array which is 3,351,248 x 3,351,248 in size. This is 11,230,863,157,504 entries! 11.2 trillion entries! The fact that you tried to print this makes me think you hadn't realised how big it was. I don't think you will be able to do this without sparse matrices. First of all you should probably make sure you need to do this and see if there is some other way.
Otherwise you can create a sparse matrix using scipy:
import numpy as np
import scipy.sparse
Q = scipy.sparse.csr_matrix((n_unique_Bin,n_unique_Bin), dtype = np.int8)
Then go from there.
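To put numbers on this, here is a rough sketch of the memory arithmetic and of incrementing single entries in a sparse matrix. The choice of dok_matrix and int32 is an assumption on my part (dictionary-of-keys is convenient for one-at-a-time updates, and int8 would overflow at 127); the indices are made up.

```python
import numpy as np
from scipy import sparse

n = 3351248  # n_unique_Bin from the question

# A dense float64 matrix (the np.zeros default) needs 8 bytes per entry.
dense_bytes = n * n * np.dtype(np.float64).itemsize
print(dense_bytes / 1e12, "TB")   # far beyond 24 GB

# A sparse matrix only stores the entries that are actually set.
Q = sparse.dok_matrix((n, n), dtype=np.int32)
Q[0, 5] += 1
Q[0, 5] += 1

# Convert to CSR once filling is done; CSR is better suited to arithmetic.
Q_csr = Q.tocsr()
print(Q_csr.nnz, Q_csr[0, 5])
```

Whether this fits in memory still depends on how many entries end up non-zero, so it is worth estimating that count before running the full loop.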
I have a netCDF variable with 372 time-steps, I need to slice this variable to read in each individual time-step for subsequent processing.
I have used glob to read in my 12 netCDF files and then defined the variables.
NAME_files = glob.glob('RGL*nc')
NAME_files = NAME_files[0:12]
for n in (NAME_files):
    RGL = Dataset(n, mode='r')
    footprint = RGL.variables['fp'][:]
    lons = RGL.variables['lon'][:]
    lats = RGL.variables['lat'][:]
I now need to repeat the code below in a loop for each of the 372 time-steps of the variable 'footprint'.
footprint_2 = RGL.variables['fp'][:,:,1:2]
I'm new to Python and have a poor grasp of looping. Any help would be appreciated, including better explanation/description of my issue.
You need to determine both the dimensions and shape of the fp variable in order to access it properly.
I'm making assumptions here about those values.
Your code implies 3 dimensions: time,lon,lat. Again just assuming.
footprint_2 = RGL.variables['fp'][:,:,1:2]
But the code above gets all the times, all the lons, for 1 latitude. Slice 1:2 selects 1 value.
fp_dims = RGL.variables['fp'].dimensions
print(fp_dims)
# a tuple of dimension names
# (u'time', u'lon', u'lat')
fp_shape = RGL.variables['fp'].shape
# a tuple of dimension sizes or lengths
print(fp_shape)
# (372, 30, 30)
n_times = fp_shape[0]
for time_idx in range(n_times):
    # you don't say if you want a single lon,lat or all the lon,lat's for a given time step.
    test = RGL.variables['fp'][time_idx,:,:]
    # or if you really want this:
    test = RGL.variables['fp'][time_idx,:,1:2]
    # or a single lon, lat
    test = RGL.variables['fp'][time_idx,8,8]
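The same slicing can be tried without any netCDF file. In this sketch a small numpy array stands in for RGL.variables['fp'][:]; the (time, lon, lat) axis order and the tiny sizes are assumptions, so check .dimensions on the real variable first.

```python
import numpy as np

# Stand-in for the footprint variable: 4 time steps, 3 lons, 2 lats (made-up sizes).
footprint = np.arange(4 * 3 * 2, dtype=float).reshape(4, 3, 2)

per_step_totals = []
for time_idx in range(footprint.shape[0]):
    step = footprint[time_idx, :, :]      # 2-D field for this one time step
    per_step_totals.append(step.sum())    # placeholder for the real processing

# A slice such as 1:2 selects one value but keeps the axis ...
kept_axis = footprint[:, :, 1:2]
# ... whereas a plain integer index drops it.
dropped_axis = footprint[:, :, 1]

print(per_step_totals, kept_axis.shape, dropped_axis.shape)
```

If the time axis turns out to be last rather than first, the loop would index footprint[:, :, time_idx] instead.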
I am trying to parse some .out files to get a value E contained within each file, and then plot these value against theta and r as a 3d surface plot. The values of theta and r are contained in the .out file title names: H2O.r{}theta{}.out. I.e. r is given in the first {} and theta is then given in the next {}. r is given to 2 d.p and theta is given to 1 d.p. in the file names, e.g. r = 0.90, theta = 190.0.
I am having a hard time iterating through the files, and extracting this information into an array E . I have come across an error:
IndexError: only integers, slices (:), ellipsis (...), numpy.newaxis (None) and integer or boolean arrays are valid indices.
However, if I change my r array to int to get rid of this error, then all of the values in r become 0. Additionally, my code to extract E from the file will no longer work, as I will be passing in 'H2O.r0.00theta70.out', a file which doesn't exist. Does anybody have any suggestions?
from numpy import *
import matplotlib.pyplot as plt
import os
os.chdir('C:/Users/myName/ex2/all')
theta = arange(70.0, 161.0, 1, dtype = float)
r = arange(0.70, 1.95, 0.05, dtype = float)
r_2dp = [ '%.2f' % elem for elem in r ] # string array, with rounding to match the file names
E = zeros((theta.shape[0],r.shape[0]))
def extract(filename): #extract value from file
    filename = open(filename,"r")
    for line in filename:
        if 'SCF Done' in line:
            l = line.split()
            p = float(l[4])
            return p
for i in r_2dp: #create E array that will allow me to plot E vs r vs theta
    for j in theta:
        for filename in os.listdir('C:/Users/myName/ex2/all'):
            if filename.startswith('H2O'):
                filename = 'H2O.r{}theta{}.out'.format(i,j)
                E[i,j] = extract(filename)
One solution would be to associate an index with the filename, which you can accomplish using enumerate; I think you can just change your loop to
for i, fn in enumerate(r_2dp):
    for j, ti in enumerate(theta):
        for filename in os.listdir('C:/Users/myName/ex2/all'):
            if filename.startswith('H2O'):
                filename = 'H2O.r{}theta{}.out'.format(fn,ti)
                E[j,i] = extract(filename)
Please note that I changed E[i,j] to E[j,i] to get the dimensions correctly; you could also change the order of the two for-loops or initialize E the other way round...
Untested, as we cannot access your file, but the general idea should work...
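Here is a self-contained sketch of the enumerate idea that runs without the .out files. fake_extract is a hypothetical stand-in for extract(), the short r and theta arrays are made up, and the explicit '{:.1f}' format for theta is my assumption based on the file names having one decimal place.

```python
import numpy as np

theta = np.array([70.0, 71.0])
r = np.array([0.70, 0.75, 0.80])
r_2dp = ['%.2f' % elem for elem in r]   # strings matching the file names

def fake_extract(filename):
    """Hypothetical stand-in for extract(): returns a number without reading a file."""
    return float(len(filename))

E = np.zeros((theta.shape[0], len(r_2dp)))
names = {}
for i, r_label in enumerate(r_2dp):      # i is an integer index, r_label the string
    for j, t in enumerate(theta):        # j is an integer index, t the float value
        filename = 'H2O.r{}theta{:.1f}.out'.format(r_label, t)
        names[(j, i)] = filename
        E[j, i] = fake_extract(filename)  # integers index E, strings build the name

print(names[(0, 0)], E.shape)
```

The integer counters index the array while the original string and float values build the file name, so neither has to be converted into the other.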
This question may be a little specialist, but hopefully someone might be able to help. I normally use IDL, but for developing a pipeline I'm looking to use python to improve running times.
My fits file handling setup is as follows:
import numpy as numpy
from astropy.io import fits
#Directory: /Users/UCL_Astronomy/Documents/UCL/PHASG199/M33_UVOT_sum/UVOTIMSUM/M33_sum_epoch1_um2_norm.img
with fits.open('...') as ima_norm_um2:
    #Open UVOTIMSUM file once and close it after extracting the relevant values:
    ima_norm_um2_hdr = ima_norm_um2[0].header
    ima_norm_um2_data = ima_norm_um2[0].data
#Individual dimensions for number of x pixels and number of y pixels:
nxpix_um2_ext1 = ima_norm_um2_hdr['NAXIS1']
nypix_um2_ext1 = ima_norm_um2_hdr['NAXIS2']
#Compute the size of the images (you can also do this manually rather than calling these keywords from the header):
#Call the header and data from the UVOTIMSUM file with the relevant keyword extensions:
corrfact_um2_ext1 = numpy.zeros((ima_norm_um2_hdr['NAXIS2'], ima_norm_um2_hdr['NAXIS1']))
coincorr_um2_ext1 = numpy.zeros((ima_norm_um2_hdr['NAXIS2'], ima_norm_um2_hdr['NAXIS1']))
#Check that the dimensions are all the same:
print(corrfact_um2_ext1.shape)
print(coincorr_um2_ext1.shape)
print(ima_norm_um2_data.shape)
# Make a new image file to save the correction factors:
hdu_corrfact = fits.PrimaryHDU(corrfact_um2_ext1, header=ima_norm_um2_hdr)
fits.HDUList([hdu_corrfact]).writeto('.../M33_sum_epoch1_um2_corrfact.img')
# Make a new image file to save the corrected image to:
hdu_coincorr = fits.PrimaryHDU(coincorr_um2_ext1, header=ima_norm_um2_hdr)
fits.HDUList([hdu_coincorr]).writeto('.../M33_sum_epoch1_um2_coincorr.img')
I'm looking to then apply the following corrections:
# Define the variables from Poole et al. (2008) "Photometric calibration of the Swift ultraviolet/optical telescope":
alpha = 0.9842000
ft = 0.0110329
a1 = 0.0658568
a2 = -0.0907142
a3 = 0.0285951
a4 = 0.0308063
for i in range(nxpix_um2_ext1 - 1): #do begin
    for j in range(nypix_um2_ext1 - 1): #do begin
        if (numpy.less_equal(i, 4) | numpy.greater_equal(i, nxpix_um2_ext1-4) | numpy.less_equal(j, 4) | numpy.greater_equal(j, nxpix_um2_ext1-4)): #then begin
            #UVM2
            corrfact_um2_ext1[i,j] == 0
            coincorr_um2_ext1[i,j] == 0
        else:
            xpixmin = i-4
            xpixmax = i+4
            ypixmin = j-4
            ypixmax = j+4
            #UVM2
            ima_UVM2sum = total(ima_norm_um2[xpixmin:xpixmax,ypixmin:ypixmax])
            xvec_UVM2 = ft*ima_UVM2sum
            fxvec_UVM2 = 1 + (a1*xvec_UVM2) + (a2*xvec_UVM2*xvec_UVM2) + (a3*xvec_UVM2*xvec_UVM2*xvec_UVM2) + (a4*xvec_UVM2*xvec_UVM2*xvec_UVM2*xvec_UVM2)
            Ctheory_UVM2 = - alog(1-(alpha*ima_UVM2sum*ft))/(alpha*ft)
            corrfact_um2_ext1[i,j] = Ctheory_UVM2*(fxvec_UVM2/ima_UVM2sum)
            coincorr_um2_ext1[i,j] = corrfact_um2_ext1[i,j]*ima_sk_um2[i,j]
The above snippet is where it is messing up, as I have a mixture of IDL syntax and python syntax. I'm just not sure how to convert certain aspects of IDL to python. For example, the ima_UVM2sum = total(ima_norm_um2[xpixmin:xpixmax,ypixmin:ypixmax]) I'm not quite sure how to handle.
I'm also missing the part where it updates the correction factor and coincidence correction image files, I would say. If anyone has the patience to go over it with a fine-tooth comb and suggest the necessary changes, that would be excellent.
The original normalised image can be downloaded here: Replace ... in above code with this file
One very important thing about numpy is that it applies every mathematical or comparison function element-wise. So you probably don't need to loop through the arrays.
So maybe start by convolving your image with a sum filter. For 2D images this can be done with astropy.convolution.convolve or scipy.ndimage.uniform_filter.
I'm not sure what you want, but I think you want a 9x9 sum filter. uniform_filter returns the mean over the window, so multiply by the number of pixels in the window (81) to get the sum:
from scipy.ndimage import uniform_filter
ima_UVM2sum = uniform_filter(ima_norm_um2_data, size=9) * 81
Since you want to discard the pixels within 4 pixels of the border, you can simply slice them away:
ima_UVM2sum_valid = ima_UVM2sum[4:-4,4:-4]
This ignores the first and last 4 rows and the first and last 4 columns (a negative stop value counts from the end).
Now you want to calculate the corrections:
xvec_UVM2 = ft*ima_UVM2sum_valid
fxvec_UVM2 = 1 + (a1*xvec_UVM2) + (a2*xvec_UVM2**2) + (a3*xvec_UVM2**3) + (a4*xvec_UVM2**4)
Ctheory_UVM2 = - numpy.log(1-(alpha*ima_UVM2sum_valid*ft))/(alpha*ft)
These are all arrays, so you still do not need to loop.
But then you want to fill your two images. Be careful, because the correction array is smaller (we ignored the first and last rows/columns), so you have to take the same region in the correction images:
corrfact_um2_ext1[4:-4,4:-4] = Ctheory_UVM2*(fxvec_UVM2/ima_UVM2sum_valid)
coincorr_um2_ext1[4:-4,4:-4] = corrfact_um2_ext1[4:-4,4:-4] * ima_sk_um2[4:-4,4:-4]
Still no loop, just NumPy's mathematical functions. This means it is much faster (MUCH FASTER!) and does the same.
Maybe I have forgotten some slicing, which would yield a "not broadcastable" error; if so, please report back.
Just a note about your loop: Python's first axis is the second axis in FITS and the second axis is the first FITS axis. So if you need to loop over the axis bear that in mind so you don't end up with IndexErrors or unexpected results.
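The vectorised steps can be put together and checked on synthetic data. This is a sketch only: the random image stands in for the real FITS data, the same image is reused in place of ima_sk_um2, and scaling uniform_filter by 81 to turn the window mean into a window sum is my reading of the intent. The constants are the ones listed in the question.

```python
import numpy as np
from scipy.ndimage import uniform_filter

alpha, ft = 0.9842000, 0.0110329
a1, a2, a3, a4 = 0.0658568, -0.0907142, 0.0285951, 0.0308063

# Synthetic count-rate image (rows = FITS NAXIS2, columns = FITS NAXIS1).
rng = np.random.default_rng(0)
image = rng.uniform(0.01, 0.05, size=(40, 50))

# 9x9 box sum: uniform_filter gives the window mean, so scale by 9*9.
box_sum = uniform_filter(image, size=9) * 81

# Drop the 4-pixel border, where the window runs off the image.
valid = box_sum[4:-4, 4:-4]

x = ft * valid
fx = 1 + a1 * x + a2 * x**2 + a3 * x**3 + a4 * x**4
ctheory = -np.log(1 - alpha * valid * ft) / (alpha * ft)

# Border pixels stay 0, as in the original loop.
corrfact = np.zeros_like(image)
corrfact[4:-4, 4:-4] = ctheory * fx / valid
coincorr = corrfact * image

# Cross-check one pixel against an explicit 9x9 sum (note the +5 stop).
explicit = image[10 - 4:10 + 5, 20 - 4:20 + 5].sum()
print(np.isclose(box_sum[10, 20], explicit), corrfact[10, 20])
```

Note that a Python slice i-4:i+4 covers only 8 pixels, because the stop is exclusive, whereas the same range in IDL covers 9; the explicit cross-check uses i+5 for that reason.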