I am attempting to export a large array of 3D points to Excel.
import numpy as np
import pandas as pd
d = np.asarray(data)
df = pd.DataFrame(d)
df.to_csv("C:/Users/Fred/Desktop/test.csv")
This exports the data into rows as below:
3.361490011 -27.39559937 -2.934410095
4.573401244 -26.45699201 -3.845634521
.....
Each line represents the x, y, z coordinates of one point. However, for my analysis, I would like the 2nd row to be moved into the columns beside the 1st row, and so on, so that all the coordinates for one shape are on a single row of the Excel sheet. I tried turning the data into a string, but this produced the same output as above.
The reason is so that I can add some population characteristics to the row for each 3D shape. Thanks for any help that anyone can give.
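You can use x = df.to_numpy().flatten() to flatten your data and then save it to CSV using np.savetxt. Note that np.savetxt writes a 1-D array as one value per line, so reshape it to a single row first (for example x.reshape(1, -1)) if everything should end up on one line.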
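To make the flattening suggestion concrete, here is a minimal sketch. The two sample points are just the values from the question standing in for one shape, and the in-memory buffer is my own assumption so the snippet runs without touching the disk; in practice you would pass a file path to np.savetxt.

```python
import io

import numpy as np
import pandas as pd

# Stand-in for the points of one shape: two (x, y, z) rows from the question.
data = [
    [3.361490011, -27.39559937, -2.934410095],
    [4.573401244, -26.45699201, -3.845634521],
]
df = pd.DataFrame(np.asarray(data))

# Flatten to x1, y1, z1, x2, y2, z2, ... and then make it a single 2-D row,
# because np.savetxt writes a 1-D array as one value per line.
row = df.to_numpy().flatten().reshape(1, -1)

# In-memory buffer (assumption, for a self-contained sketch); pass a path
# such as "test.csv" to np.savetxt instead to write a real file.
buffer = io.StringIO()
np.savetxt(buffer, row, delimiter=",")
csv_text = buffer.getvalue()
```

With several shapes, you could probably stack one flattened row per shape into a 2-D array and save that in a single call, provided every shape has the same number of points.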
I'm trying to use the vtk Python library to write some unstructured grids to the legacy .vtk format.
I am able to create the geometry, cell data and point data no problem and write them out to vtk.
The problem I have is I want to add the functionality to take cell data and convert it to point data.
See below code:
import vtk
#dfsu data
pts = df.mesh.generate_node_vertexes()
cls = df.mesh.generate_element_vertex_indexes()
cnts = df.mesh.generate_element_vertex_count()
#vtk instantiate
points = vtk.vtkPoints()
cell = vtk.vtkCellArray()
mesh_1 = vtk.vtkUnstructuredGrid()
#loop through points
for pt in pts:
    points.InsertNextPoint(pt)
#set points
mesh_1.SetPoints(points)
#loop through cells
for cl, cnt in zip(cls, cnts):
    if cnt == 3:
        tri = vtk.vtkTriangle()
        for i,j in enumerate(cl):
            tri.GetPointIds().SetId(i, j)
        mesh_1.InsertNextCell(tri.GetCellType(), tri.GetPointIds())
    elif cnt == 4:
        quad = vtk.vtkQuad()
        for i,j in enumerate(cl):
            quad.GetPointIds().SetId(i,j)
        mesh_1.InsertNextCell(quad.GetCellType(), quad.GetPointIds())
#add cell data
arr = vtk.vtkDoubleArray()
arr.SetName('test1')
for i in range(len(cls)):
    arr.InsertNextTuple([0.5])
arr1 = vtk.vtkDoubleArray()
arr1.SetName('test2')
for i in range(len(cls)):
    arr1.InsertNextTuple([0.25])
mesh_1.GetCellData().AddArray(arr)
mesh_1.GetCellData().AddArray(arr1)
#convert to point data - THIS IS THE PART I CANT FIGURE OUT!
c2p = vtk.vtkCellDataToPointData()
c2p.SetInputData(arr)
#add point data
##here i want to add the converted point data
##
#write
writer = vtk.vtkUnstructuredGridWriter()
writer.SetFileName('test.vtk')
writer.SetInputData(mesh_1)
writer.Write()
I'm really new to VTK and it's a bit confusing. I can't figure out how to take my cell data and convert it to point data.
I've tried:
c2p.SetInputData(arr)
c2p.SetInputData(mesh_1.GetCellData().GetArray(0))
and a bunch of other random commands, but I really can't figure out how to do it.
Any suggestions are appreciated. I've seen a ton of examples, but they were all slightly different from what I am trying to do.
Figured it out. I had to pass the vtkUnstructuredGrid itself, not the data array, into the cell-to-point filter:
c2p = vtk.vtkCellDataToPointData()
c2p.SetInputData(mesh_1)
c2p.Update()
ptdata = c2p.GetOutput()
This outputs another vtkUnstructuredGrid object with the cell data converted to point data, which I can then pass to the writer.
I'm trying to plot a grid of air pollution data from a netCDF file in Python using xarray. However, I'm facing a couple of roadblocks.
To start off, here is the data that can be used to reproduce my code:
Data
When you try to import this data using xarray.open_dataset, you end up with a dataset that has no coordinates or variables, and lots of attributes:
FILE_NAME = "test2.nc"  # I changed the name to make it shorter
xr.open_dataset(FILE_NAME)
So I created variables of the data and tried to import those into xarray:
prd='PRODUCT'
metdata = "METADATA"
lat= ds.groups[prd].variables['latitude']
lon= ds.groups[prd].variables['longitude']
no2 = ds.groups[prd].variables['nitrogendioxide_tropospheric_column']
scanline = ds.groups[prd].variables['scanline']
time = ds.groups[prd].variables['time']
ground_pixel = ds.groups[prd].variables['ground_pixel']
ds = xr.DataArray(no2,
                  dims=["time","x","y"],
                  coords={
                      "lon":(["time","x", "y"], lon)
                  }
                  # coords=[("time", time), ("x", scanline),("y", ground_pixel)]
                  )
As you can see above, I tried multiple ways of creating the coordinates, but I'm still getting an error. The data in this netCDF file is on an irregular grid, and I just want to be able to plot that accurately and quickly using xarray.
Does someone know how I can do this?
Two sections of my code are giving me trouble. I am trying to create the basemap in this first section:
#Basemap
epsg = 6060; width = 2000.e3; height = 2000.e3 #epsg 3413. 6062
m=Basemap(epsg=epsg,resolution='l',width=width,height=height) #lat_ts=(90.+35.)/2.
m.drawcoastlines(color='white')
m.drawmapboundary(fill_color='#99ffff')
m.fillcontinents(color='#cc9966',lake_color='#99ffff')
m.drawparallels(np.arange(10,70,20),labels=[1,1,0,0])
m.drawmeridians(np.arange(-100,0,20),labels=[0,0,0,1])
plt.title('ICESAT2 Tracks in Greenland')
plt.figure(figsize=(20,10))
My next section is meant to read the data from a file and plot these tracks on top of the Basemap. Instead, it creates an entirely new plot. I have tried changing the second plt.scatter call to go through Basemap, such as m.scatter, m.plt, etc., but that only returns “RuntimeError: Can not put single artist in more than one figure”.
Any ideas on how to get this next section of code onto the basemap? Here is the next section, focus on the end to see where it is plotting.
icesat2_data[track] = dict() # creates a sub-dictionary, track
icesat2_data[track][year+month+day] = dict() # and one layer more for the date under the whole icesat2_data dictionary
icesat2_data[track][year+month+day] = dict.fromkeys(lasers)
for laser in lasers: # for loop, access all the gt1l, 2l, 3l
    if laser in f:
        lat = f[laser]["land_ice_segments"]["latitude"][:] # data for a particular laser's latitude.
        lon = f[laser]["land_ice_segments"]["longitude"][:] #data for a lasers longitude
        height = f[laser]["land_ice_segments"]["h_li"][:] # data for a lasers height
        quality = f[laser]["land_ice_segments"]["atl06_quality_summary"][:].astype('int')
        # Quality filter
        idx1 = quality == 0 # data dictionary to see what quality summary is
        #print('idx1', idx1)
        # Spatial filter
        idx2 = np.logical_and( np.logical_and(lat>=lat_min, lat<=lat_max), np.logical_and(lon>=lon_min, lon<=lon_max) )
        idx = np.where( np.logical_and(idx1, idx2) ) # combines index 1 and 2 from data quality filter. make sure not empty. if empty all data failed test (low quality or outside box)
        icesat2_data[track][year+month+day][laser] = dict.fromkeys(['lat','lon','height']) #store data, creates empty dictionary of lists lat, lon, hi, those strings are the keys to the dict.
        icesat2_data[track][year+month+day][laser]['lat'] = lat[idx] # grabbing only latitudes using that index of points with good data quality and within bounding box
        icesat2_data[track][year+month+day][laser]['lon'] = lon[idx]
        icesat2_data[track][year+month+day][laser]['height'] = height[idx]
        if lat[idx].any() == True and lon[idx].any() == True:
            x, y = transformer.transform(icesat2_data[track][year+month+day][laser]['lon'], \
                                         icesat2_data[track][year+month+day][laser]['lat'])
            plt.scatter(x, y, marker='o', color='#000000')
Currently, they output separately, like this:
Not sure if you're still working on this, but here's a quick example I put together that you might be able to work with (obviously I don't have the data you're working with). A couple things that might not be self-explanatory - I used m() to transform the coordinates to map coordinates. This is Basemap's built-in transformation method so you don't have to use PyProj. Also, setting a zorder in the scatter function ensures that your points are plotted above the countries layer and don't get hidden underneath.
#Basemap
epsg = 6060; width = 2000.e3; height = 2000.e3 #epsg 3413. 6062
plt.figure(figsize=(20,10))
m=Basemap(epsg=epsg,resolution='l',width=width,height=height) #lat_ts=(90.+35.)/2.
m.drawcoastlines(color='white')
m.drawmapboundary(fill_color='#99ffff')
m.fillcontinents(color='#cc9966',lake_color='#99ffff')
m.drawparallels(np.arange(10,70,20),labels=[1,1,0,0])
m.drawmeridians(np.arange(-100,0,20),labels=[0,0,0,1])
plt.title('ICESAT2 Tracks in Greenland')
for coord in [[68,-39],[70,-39]]:
    lat = coord[0]
    lon = coord[1]
    x, y = m(lon,lat)
    m.scatter(x,y,color='red',s=100,zorder=10)
plt.show()
I think you might need:
plt.figure(figsize=(20,10))
before creating the basemap, not after. As it stands, it's creating a map and then creating a new figure after that, which is why you're getting two figures.
Then your plotting line should be m.scatter() as you mentioned you tried before.
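The ordering issue can be reproduced with plain matplotlib, so here is a rough sketch that leaves Basemap out entirely. The gray line is only a placeholder for the map layer, and the Agg backend is my assumption so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt

plt.close("all")  # start from a clean state

# Create the figure first, so every later call draws on the same axes.
fig = plt.figure(figsize=(20, 10))
ax = fig.add_subplot(111)

# Placeholder for the basemap layer (a real script would build the Basemap here).
ax.plot([0, 1], [0, 1], color='gray')

# Placeholder for the track points; zorder keeps them above the base layer.
ax.scatter([0.25, 0.75], [0.25, 0.75], color='red', s=100, zorder=10)

# Only one figure should exist, since plt.figure was never called again.
open_figures = plt.get_fignums()
```

If plt.figure(...) is called again after the base layer is drawn, a second figure is opened and later plt.scatter calls land there, which matches the two separate outputs described in the question.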
I have just recently started to work with shapefiles. I have a shapefile in which each object is a polygon. I want to produce a new shapefile in which the geometry of each polygon is replaced by its centroid. Here is my code.
import geopandas as gp
from shapely.wkt import loads as load_wkt
fname = '../data_raw/bg501c_starazagora.shp'
outfile = 'try.shp'
shp = gp.GeoDataFrame.from_file(fname)
centroids = list()
index = list()
df = gp.GeoDataFrame()
for i,r in shp.iterrows():
    index.append(i)
    centroid = load_wkt(str(r['geometry'])).centroid.wkt
    centroids.append(centroid)
df['geometry'] = centroids
df['INDEX'] = index
gp.GeoDataFrame.to_file(df,outfile)
When I run the script I end up with:
raise ValueError("Geometry column cannot contain mutiple "
ValueError: Geometry column cannot contain mutiple geometry types when writing to file.
I cannot understand what is wrong. Any help?
The issue is that you're populating the geometry field with a string representation of the geometry rather than a shapely geometry object.
No need to convert to wkt. Your loop could instead be:
for i,r in shp.iterrows():
    index.append(i)
    centroid = r['geometry'].centroid
    centroids.append(centroid)
However, there's no need to loop through the geodataframe at all. You could create a new one of shapefile centroids as follows:
df=gp.GeoDataFrame(data=shp, geometry=shp['geometry'].centroid)
df.to_file(outfile)
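Here is a small sketch of the loop-free approach that runs without the original shapefile. The two squares and the INDEX values are made up for illustration, and copying the frame before replacing its geometry column is just one way to do it, not the only one.

```python
import geopandas as gp
from shapely.geometry import Polygon

# Hypothetical polygons standing in for the contents of the shapefile.
polygons = gp.GeoDataFrame(
    {"INDEX": [0, 1]},
    geometry=[
        Polygon([(0, 0), (1, 0), (1, 1), (0, 1)]),
        Polygon([(2, 0), (4, 0), (4, 2), (2, 2)]),
    ],
)

# Keep the attribute columns and swap each polygon for its centroid.
# The centroids are shapely Point objects, not WKT strings.
centroids = polygons.copy()
centroids["geometry"] = polygons.geometry.centroid
```

Since every geometry is now a Point, centroids.to_file(outfile) should no longer hit the mixed geometry types error.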