I'd like to render an ASCII art world map given this GeoJSON file.
My basic approach is to load the GeoJSON into Shapely, transform the points using pyproj to Mercator, and then do a hit test on the geometries for each character of my ASCII art grid.
It looks (edit: mostly) OK when centered on the prime meridian:
But when centered on New York City (lon_0=-74), it suddenly goes haywire:
I'm fairly sure I'm doing something wrong with the projections here. (And it would probably be more efficient to transform the ASCII map coordinates to lat/lon than transform the whole geometry, but I am not sure how.)
import functools
import json
import shutil
import sys
import pyproj
import shapely.geometry
import shapely.ops
# Load the map
with open('world-countries.json') as f:
    countries = []
    for feature in json.load(f)['features']:
        # buffer(0) is a trick for fixing polygons with overlapping coordinates
        country = shapely.geometry.shape(feature['geometry']).buffer(0)
        countries.append(country)
mapgeom = shapely.geometry.MultiPolygon(countries)
# Apply a projection
tform = functools.partial(
    pyproj.transform,
    pyproj.Proj(proj='longlat'),  # input: WGS84
    pyproj.Proj(proj='webmerc', lon_0=0),  # output: Web Mercator
)
mapgeom = shapely.ops.transform(tform, mapgeom)
# Convert to ASCII art
minx, miny, maxx, maxy = mapgeom.bounds
srcw = maxx - minx
srch = maxy - miny
dstw, dsth = shutil.get_terminal_size((80, 20))
for y in range(dsth):
    for x in range(dstw):
        pt = shapely.geometry.Point(
            (srcw*x/dstw) + minx,
            (srch*(dsth-y-1)/dsth) + miny  # flip vertically
        )
        if any(country.contains(pt) for country in mapgeom):
            sys.stdout.write('*')
        else:
            sys.stdout.write(' ')
    sys.stdout.write('\n')
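On the aside about transforming the ASCII grid coordinates to lat/lon instead of transforming the whole geometry: a minimal sketch of that inverse mapping might look like the following. It assumes a spherical Mercator with radius 6378137 m and a latitude cut-off of 85 degrees (both are assumptions, not taken from the code above), the function names are hypothetical, and it uses only the standard math module rather than pyproj.

```python
import math

R = 6378137.0  # assumed sphere radius in metres (the value Web Mercator uses)


def mercator_forward(lon_deg, lat_deg, lon_0=0.0):
    """Spherical Mercator: degrees -> projected metres, centred on lon_0."""
    # Wrap the longitude offset into [-180, 180) so recentring does not tear the map.
    dlon = (lon_deg - lon_0 + 180.0) % 360.0 - 180.0
    x = R * math.radians(dlon)
    y = R * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y


def mercator_inverse(x, y, lon_0=0.0):
    """Projected metres -> (lon, lat) in degrees."""
    lon = (math.degrees(x / R) + lon_0 + 180.0) % 360.0 - 180.0
    lat = math.degrees(2 * math.atan(math.exp(y / R)) - math.pi / 2)
    return lon, lat


def grid_to_lonlat(col, row, cols, rows, lon_0=0.0, max_lat=85.0):
    """Map the centre of character cell (col, row) to (lon, lat)."""
    half_w = R * math.pi                                  # half the map width
    _, half_h = mercator_forward(lon_0, max_lat, lon_0)   # half the map height
    x = ((col + 0.5) / cols * 2 - 1) * half_w
    y = (1 - (row + 0.5) / rows * 2) * half_h             # row 0 is the top line
    return mercator_inverse(x, y, lon_0)
```

Each (lon, lat) returned by grid_to_lonlat could then be hit-tested against the untransformed geometries, so that only one point per character is projected.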
I added an edit at the bottom after discovering a new problem (why there is no Canada, and the unreliability of Shapely and pyproj).
Even though it does not exactly solve the problem, I believe this approach has more potential than using pyproj and Shapely, and if you do more ASCII art in the future it will give you more possibilities and flexibility. First I will list the pros and cons.
PS: Initially I wanted to find the problem in your code, but I had trouble running it because pyproj kept returning an error.
PROS
1) I was able to extract all points (Canada is really missing) and rotate the image
2) The processing is very fast, so you can create animated ASCII art.
3) Printing is done all at once, without the need to loop
CONS (known issues, solvable)
1) This approach definitely does not translate geo-coordinates correctly: the result is too flat, it should look more spherical
2) I didn't take the time to work out how to fill the borders, so only the borders get '*'. This approach therefore still needs an algorithm to fill the countries. I think that shouldn't be a problem, since the JSON file stores the countries separately
3) You need 2 extra libraries besides numpy: opencv (you can use PIL instead) and Colorama, because my example is animated and I needed to 'clean' the terminal by moving the cursor to (0,0) instead of using os.system('cls')
4) I made it run only in Python 3. It works in Python 2 too, but there I get an error with sys.stdout.buffer
Change the terminal font size to the smallest setting so that the printed characters fit in the terminal. The smaller the font, the better the resolution.
The animation should look like the map is 'rotating'.
I used a little bit of your code to extract the data. The steps are described in the comments.
import json
import sys
import numpy as np
import colorama
import time
import cv2
# terminal_size is the number of characters along the X axis and along the Y axis (sorry, not a good name)
if len(sys.argv)>1:
    terminal_size = (int(sys.argv[1]),int(sys.argv[2]))
else:
    terminal_size=(230,175)
with open('world-countries.json') as f:
    countries = []
    minimal = 0 # This can be dangerous. Expecting negative values
    maximal = 0 # Expecting bigger values than 0
    for feature in json.load(f)['features']: # getting data - I pretend here that the geo-coordinates are actually indexes into my numpy array
        indexes = np.int16(np.array(feature['geometry']['coordinates'][0])*2)
        if indexes.min()<minimal:
            minimal = indexes.min()
        if indexes.max()>maximal:
            maximal = indexes.max()
        countries.append(indexes)
    countries = (np.array(countries)+np.abs(minimal)) # Transform geo-coordinates to image coordinates
correction = np.abs(minimal) # because the geo-coordinates have negative values, I need to shift them so they start at 0
colorama.init()
def move_cursor(x,y):
    print ("\x1b[{};{}H".format(y+1,x+1))
move = 0 # 'rotate' the globe
for i in range(1000):
    image = np.zeros(shape=[maximal+correction+1,maximal+correction+1]) # creating a clean image
    move -=1 # you need to rotate with negative values,
    # because numpy understands negative indexes. Positive ones will end up with an error
    for i in countries: # VERY STRANGE: when parsing the JSON, some countries have a different JSON structure
        if len(i.shape)==2:
            image[i[:,1],i[:,0]+move]=255 # indexes that once were geo-coordinates now serve to position the countries in the image
        if len(i.shape)==3:
            image[i[0][:,1],i[0][:,0]+move]=255
    cut = np.where(image==255) # Bounding box
    if move == -1: # creating the bounding box here - removing empty edges from the sides, top and bottom - we need the space. This needs to be done only once
        max_x,min_x = cut[0].max(),cut[0].min()
        max_y,min_y = cut[1].max(),cut[1].min()
    new_image = image[min_x:max_x,min_y:max_y] # the bounding box
    new_image= new_image[::-1] # reverse, because the map is upside down
    new_image = cv2.resize(new_image,terminal_size) # resize so it fits inside the terminal
    ascii = np.chararray(shape = new_image.shape).astype('|S4') # create a container for the ASCII image
    ascii[:,:]='' # chararray contains some random letters - dunno why... cleaning it
    ascii[:,-1]='\n' # because I print everything all at once, I am creating new lines at the end of the image
    new_image[:,-1]=0 # at the end of the image there can be country borders which would overwrite the '\n' created one step above
    ascii[np.where(new_image>0)]='*' # transforming the image array to a chararray: anything with a pixel value higher than 0 becomes a star in the chararray mask
    move_cursor(0,0) # 'cleaning' the terminal for the new animation frame
    sys.stdout.buffer.write(ascii) # print into the terminal
    time.sleep(0.025) # FPS
It may help to explain the main algorithm in the code. I like to use numpy wherever I can. The whole idea is that I pretend the coordinates in the image, or whatever they may be (in your case geo-coordinates), are matrix indexes. I then have 2 matrices: the real image and a chararray used as a mask. I take the indexes of the interesting pixels in the real image and, at the same indexes in the chararray mask, assign any letter I want. Thanks to this, the whole algorithm doesn't need a single loop.
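As a stripped-down illustration of that index-as-mask idea, here is a minimal sketch. The border points are hypothetical and not taken from the GeoJSON file, and a plain numpy string array stands in for the chararray.

```python
import numpy as np

# Hypothetical border points given as (x, y) pixel indexes.
border = np.array([[1, 0], [2, 1], [3, 2], [2, 3], [1, 2]])

height, width = 4, 6
image = np.zeros((height, width), dtype=np.uint8)

# Use the coordinates directly as array indexes: rows are y, columns are x.
image[border[:, 1], border[:, 0]] = 255

# Character mask with the same shape; every lit pixel becomes a star.
chars = np.full(image.shape, ' ', dtype='<U1')
chars[image > 0] = '*'

# Joining the rows is the only per-row step left, and it is needed just for printing.
ascii_art = '\n'.join(''.join(row) for row in chars)
print(ascii_art)
```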
About future possibilities
Imagine you also have information about the terrain (altitude). Let's say you somehow create a grayscale image of the world map in which the shades of gray express altitude. Such a grayscale image would have shape x,y. You prepare a 3D matrix with shape = [x,y,256]. To each of the 256 layers in the 3D matrix you assign one letter (' ....;;;;###' and so on) that expresses the shade.
Once this is prepared, you can take your grayscale image, in which every pixel effectively has 3 coordinates: x, y and the shade value. So you get 3 arrays of indexes out of your grayscale map image: x, y and shade. Your new chararray is then simply an extraction from the 3D matrix of layer letters, because:
# Preparation phase
x, y = grayscale.shape
matrix3d = np.chararray(shape=[x, y, 256])  # a Python name cannot start with a digit
table = ' ......;;;;;;;###### ...'  # needs 256 characters, one per shade
for i in range(256):
    matrix3d[:, :, i] = table[i]
x_indexes, y_indexes = np.indices((x, y))  # row and column index of every pixel
x_indexes = x_indexes.reshape(x*y)
y_indexes = y_indexes.reshape(x*y)
chararray_image = np.chararray(shape=[x, y])
# Ready to print
...
shades = grayscale.reshape(x*y)
chararray_image[:, :] = matrix3d[(x_indexes, y_indexes, shades)].reshape(x, y)
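The same shade-to-letter mapping can also be sketched with a 1-D lookup table instead of the 3D matrix. This is a swapped-in simplification, not the approach described above, and the 10-character ramp is an assumption chosen only for illustration.

```python
import numpy as np

# Assumed 10-step character ramp, from darkest to brightest.
ramp = np.array(list(' .:-=+*#%@'))


def to_ascii(grayscale):
    """Map a 2-D array of 0..255 shades to a multi-line string."""
    gray = np.asarray(grayscale, dtype=np.int64)
    levels = gray * (len(ramp) - 1) // 255   # scale 0..255 to 0..9
    chars = ramp[levels]                     # fancy indexing does the lookup
    return '\n'.join(''.join(row) for row in chars)


demo = np.array([[0, 128, 255],
                 [255, 128, 0]], dtype=np.uint8)
print(to_ascii(demo))
```

Indexing the ramp with the whole level array replaces the per-layer assignment, so memory use stays proportional to the image size rather than to 256 times the image size.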
Because there is no loop in this process and you can print the chararray all at once, you can actually print a movie into the terminal at a very high FPS.
For example, if you have footage of a rotating earth, you can make something like this (250*70 letters), render time 0.03658s
You can of course take it to the extreme and use a super-high resolution in your terminal, but the resulting frame time is not that good: 0.23157s, which is approximately 4-5 FPS. It is interesting to note that the FPS of the approach itself is enormous, but the terminal simply cannot keep up with the printing, so this low FPS is due to the limitations of the terminal and not of the calculation: calculating this high resolution took 0.00693s, which is 144 FPS.
BIG EDIT - contradicting some of the above statements
I accidentally opened the raw JSON file and found out that CANADA and RUSSIA are there, with full, correct coordinates. I made the mistake of relying on the fact that neither of us had Canada in the result, so I assumed my code was OK. Inside the JSON, the data does NOT have a unified structure. Russia and Canada have type 'MultiPolygon', so you need to iterate over it.
What does this mean? Don't rely on Shapely and pyproj. Obviously they can't extract some countries, and if they can't do that reliably, you can't expect them to do anything more complicated.
After modifying the code, everything is all right.
CODE: This is how to load the file correctly
...
with open('world-countries.json') as f:
    countries = []
    minimal = 0
    maximal = 0
    for feature in json.load(f)['features']: # getting data - I pretend here, that geo coordinates are actually indexes of my numpy array
        for k in range((len(feature['geometry']['coordinates']))):
            indexes = np.int64(np.array(feature['geometry']['coordinates'][k]))
            if indexes.min()<minimal:
                minimal = indexes.min()
            if indexes.max()>maximal:
                maximal = indexes.max()
            countries.append(indexes)
...
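To make the structural difference explicit: in GeoJSON, a 'Polygon' stores a list of rings, while a 'MultiPolygon' stores a list of polygons, each of which is a list of rings. A minimal sketch that handles both by checking the geometry type could look like this. The helper name and the miniature geometries are hypothetical.

```python
def exterior_rings(geometry):
    """Yield the outer ring of every polygon in a GeoJSON geometry dict."""
    coords = geometry['coordinates']
    if geometry['type'] == 'Polygon':
        yield coords[0]                  # first ring is the outer boundary
    elif geometry['type'] == 'MultiPolygon':
        for polygon in coords:           # one extra nesting level
            yield polygon[0]
    else:
        raise ValueError('unsupported geometry type: ' + geometry['type'])


# Hypothetical miniature geometries, only to show both shapes.
single = {'type': 'Polygon',
          'coordinates': [[[0, 0], [1, 0], [1, 1], [0, 0]]]}
multi = {'type': 'MultiPolygon',
         'coordinates': [[[[0, 0], [1, 0], [1, 1], [0, 0]]],
                         [[[5, 5], [6, 5], [6, 6], [5, 5]]]]}

print(len(list(exterior_rings(single))), len(list(exterior_rings(multi))))
```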
I am trying to create a volume in Gmsh (using Python API) by cutting some small cylinders from a bigger one.
When I do that, I expect to have one surface for each cut region; instead, I get the result in the figure. I have highlighted in red the surfaces that give me the problem (some cut regions behave as expected). As you can see, instead of one surface I get two, which sometimes aren't even equal.
gmsh creates more surfaces than expected:
So, my questions are:
Why does gmsh behave like that?
How can I fix this, as I need predictable behavior?
Below is the code I used to generate the geometry.
To work, the code requires some parameters such as core_height, core_inner_radius and core_outer_radius, the number of small cylinders and their radius.
gmsh.initialize(sys.argv)
#gmsh.initialize()
gmsh.clear()
gmsh.model.add("circle_extrusion")
inner_cyl_tag = 1
outer_cyl_tag = 2
inner_cyl = gmsh.model.occ.addCylinder(0,0,0, 0, 0, core_height, core_inner_radius, tag = inner_cyl_tag)
outer_cyl = gmsh.model.occ.addCylinder(0,0,0, 0, 0, core_height, core_outer_radius, tag = outer_cyl_tag)
core_tag = 3
cut1 = gmsh.model.occ.cut([(3,outer_cyl)],[(3,inner_cyl)], tag = core_tag)
#create a set of filled cylinders
#set position
angle_vector = np.linspace(0,2*np.pi,number_of_hp+1)
pos_x = hp_radial_position*np.cos(angle_vector)
pos_y = hp_radial_position*np.sin(angle_vector)
pos_z = 0.0
#cut one cylinder at the time and assign the new core tag
for ii in range(0,len(angle_vector)):
    old_core_tag = core_tag
    heat_pipe = gmsh.model.occ.addCylinder(pos_x[ii], pos_y[ii], pos_z, 0, 0, core_height,hp_outer_radius, tag =-1)
    core_tag = heat_pipe+1
    core = gmsh.model.occ.cut([(3,old_core_tag)],[(3,heat_pipe)], tag = core_tag)
gmsh.model.occ.synchronize()
#get volume entities and assign physical groups
volumes = gmsh.model.getEntities(dim=3)
solid_marker = 1
gmsh.model.addPhysicalGroup(volumes[0][0], [volumes[0][1]],solid_marker)
gmsh.model.setPhysicalName(volumes[0][0],solid_marker, "solid_volume")
#get surfaces entities and apply physical groups
surfaces = gmsh.model.getEntities(dim=2)
surface_markers= np.arange(1,len(surfaces)+1,1)
for ii in range(0,len(surfaces)):
    gmsh.model.addPhysicalGroup(2,[surfaces[ii][1]],tag = surface_markers[ii])
#We finally generate and save the mesh:
gmsh.model.mesh.generate(3)
gmsh.model.mesh.refine()
gmsh.model.mesh.refine()
gmsh.option.setNumber("Mesh.MshFileVersion", 2.2) #save in ASCII 2 format
gmsh.write(mesh_name+".msh")
# Launch the GUI to see the results:
#if '-nopopup' not in sys.argv:
# gmsh.fltk.run()
gmsh.finalize()
I do not think that you have additional surfaces in the sense of gmsh.model.occ surfaces. To me this looks like your volume mesh is sticking out of your surface mesh, i.e. volume and surface mesh do not fit together.
Here is what I did to check your case:
First I added the following lines at the beginning of your code to get a minimum working example:
import gmsh
import sys
import numpy as np
inner_cyl_tag = 1
outer_cyl_tag = 2
core_height = 1
core_inner_radius = 0.1
core_outer_radius = 0.2
number_of_hp = 5
hp_radial_position = 0.1
hp_outer_radius = 0.05
What I get with this code is the following:
To visualize it like this go to "Tools"-->"Options"-->"Mesh" and check "2D element faces", "3D element edges" and "3D element faces".
You can see that there are some purple triangles sticking out of the green/yellowish surfaces triangles of the inner surfaces.
You could try to visualize your case the same way and check <--> uncheck the "3D element faces" a few times.
So here is the solution for this behaviour; I did not know myself that gmsh behaves like this. It seems that when you create your mesh and then refine it, the refinement is applied to the 2D surface mesh and the 3D volume mesh separately, which means the two meshes no longer match after the refinement. What I did next was to try what happens if you create the 2D mesh only, refine it, and then create the 3D mesh, i.e.:
replace:
gmsh.model.mesh.generate(3)
gmsh.model.mesh.refine()
gmsh.model.mesh.refine()
by:
gmsh.model.mesh.generate(2)
gmsh.model.mesh.refine()
gmsh.model.mesh.refine()
gmsh.model.mesh.generate(3)
The result then looks like this:
I hope that this was actually your problem. In future it would be good if you could provide a minimum working example of code that we can copy-paste to get the same visualization you showed in your image.
#!/usr/bin/env python3
import numpy as np
from osgeo import gdal
from osgeo import osr
# Load an array with shape (197, 250, 3)
# Data with dim of 3 contain (value, longitude, latitude)
data = np.load("data.npy")
# Copy the data and coordinates
array = data[:,:,0]
lon = data[:,:,1]
lat = data[:,:,2]
nLons = array.shape[1]
nLats = array.shape[0]
# Calculate the geotransform parameters
maxLon, minLon, maxLat, minLat = [lon.max(), lon.min(), lat.max(), lat.min()]
resLon = (maxLon - minLon) / nLons
resLat = (maxLat - minLat) / nLats
# Get the transform
geotransform = (minLon, resLon, 0, maxLat, 0, -resLat)
# Create the output raster
output_raster = gdal.GetDriverByName('GTiff').Create('myRaster.tif', nLons, nLats, 1,
                                                     gdal.GDT_Int32)
# Set the geotransform
output_raster.SetGeoTransform(geotransform)
srs = osr.SpatialReference()
# Set to world projection 4326
srs.ImportFromEPSG(4326)
output_raster.SetProjection(srs.ExportToWkt())
output_raster.GetRasterBand(1).WriteArray(array)
output_raster.FlushCache()
The code above is meant to georeference a raster using GDAL but returns blank TIFF files. I have vetted the data and variables; however, I suspect the problem could be the geotransform variables. The documentation requires the geotransform to be:
top-left-x, w-e-pixel-resolution, 0,
top-left-y, 0, n-s-pixel-resolution (negative value)
I have used lats and lons, but I'm not sure which one corresponds to x and which to y. It could be something else, but I'm not quite sure.
Overall your approach looks correct to me, though it's hard to tell without seeing the data you're using. Here are some points to consider:
First, there's a difference between the output file being empty and it being in the wrong location; georeferencing relates only to the latter.
When working interactively, you should also make sure to properly close the Dataset using output_raster = None; that will also trigger flushing for you.
You could start by testing if GDAL reads the same data that you intended to write. Using something like:
ds = gdal.Open('myRaster.tif')
data_from_disk = ds.ReadAsArray()
ds = None
np.testing.assert_array_equal(data_from_disk, array)
If those are not identical, it could be an issue with the datatype, such as writing floats close to 0 as integers, which truncates them to 0 and gives the appearance of an "empty" file.
Regarding the georeferencing, the projection you use has the coordinates in degrees. If yours are in radians your output ends up close to null-island.
Your approach also assumes that the data and lat/lon arrays are on a regular grid (having a constant resolution). That might not be the case (especially if the data comes with a 2D grid of coordinates).
Often when coordinate arrays are given, they are defined as valid for the center of the pixel, whereas GDAL's geotransform is defined for the (outer) edge of the pixel. So you might need to account for that by subtracting half the resolution. This also affects your calculation of the resolution, which for the center definition should probably use / (nLons-1) and / (nLats-1). Alternatively, verify with:
# for a regular grid
resLon = lon[0,1] - lon[0,0]
resLat = lat[1,0] - lat[0,0]
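Putting the center-to-edge correction together, a minimal sketch could look like this. It assumes a regular, axis-aligned grid whose 2D coordinate arrays hold pixel centers; the function name is hypothetical and only numpy is used, so the resulting tuple would still have to be passed to SetGeoTransform.

```python
import numpy as np


def geotransform_from_centers(lon, lat):
    """Build a GDAL-style geotransform from 2-D arrays of pixel-center coordinates."""
    lon = np.asarray(lon, dtype=float)
    lat = np.asarray(lat, dtype=float)
    res_lon = lon[0, 1] - lon[0, 0]
    res_lat = lat[1, 0] - lat[0, 0]      # negative when row 0 is the northern edge
    # Refuse irregular grids, for which a single geotransform is not valid.
    if not (np.allclose(np.diff(lon, axis=1), res_lon)
            and np.allclose(np.diff(lat, axis=0), res_lat)):
        raise ValueError('grid is not regular')
    # Shift from the center of the first pixel to its outer corner.
    top_left_x = lon[0, 0] - res_lon / 2
    top_left_y = lat[0, 0] - res_lat / 2
    return (float(top_left_x), float(res_lon), 0.0,
            float(top_left_y), 0.0, float(res_lat))


lat, lon = np.mgrid[89:-90:-2, -179:180:2]
print(geotransform_from_centers(lon, lat))
```

For this 2-degree global grid the corner lands on (-180, 90), which is what an edge-based geotransform should give.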
When I run your snippet with some dummy data, it gives me a correct output (ignoring the center/edge issue mentioned above).
lat, lon = np.mgrid[89:-90:-2, -179:180:2]
array = np.sqrt(lon**2 + lat**2).astype(np.int32)
When one exports with r.out.vtk from GRASS GIS, the result is a bad surface with -99999 points instead of nulls:
I want to remove them, yet a simple clip is not enough:
import pyvista as pv
pd = pv.read('./pid1.vtk')
pd = pd.clip((0,1,1), invert=False).extract_surface()
p = pv.Plotter()
p.add_mesh(pd) #add atoms to scene
p.show()
resulting in:
So I wonder how to keep only the top (> -999) points and their connected vertices, in order to get only the top surface (it is actually curved, not flat) using pyvista?
link to example .vtk
There is an easy way to do this and there isn't...
You could use pyvista's threshold filter with all_scalars=True as long as you have only one set of scalars:
import pyvista as pv
pd = pv.read('./pid1.vtk')
pd = pd.threshold(-999, all_scalars=True)
plotter = pv.Plotter()
plotter.add_mesh(pd) #add atoms to scene
plotter.show()
Since all_scalars starts filtering based on every scalar array, this will only do what you'd expect if there are no other scalars. Unfortunately, there also seems to be a bug in pyvista (expected to be fixed in version 0.32.0) which makes the use of this keyword impossible.
What you can do in the meantime (if you don't want to use pyvista's main branch before the fix is released) is to threshold the data yourself using numpy:
import pyvista as pv
pd = pv.read('./pid1.vtk')
scalars = pd.active_scalars
keep_inds = (scalars > -999).nonzero()[0]
pd = pd.extract_points(keep_inds, adjacent_cells=False)
plotter = pv.Plotter()
plotter.add_mesh(pd) #add atoms to scene
plotter.show()
The main point of both all_scalars (in threshold) and adjacent_cells (in extract_points) is to only keep cells where every point satisfies the condition.
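To illustrate what "every point satisfies the condition" means, here is a numpy-only sketch of that rule. It is not pyvista's implementation; the function name and the two-triangle mesh are hypothetical.

```python
import numpy as np


def cells_with_all_points_valid(cells, scalars, threshold):
    """Keep only the cells whose points all have a scalar above the threshold."""
    cells = np.asarray(cells)
    keep_point = np.asarray(scalars) > threshold   # one boolean per point
    keep_cell = keep_point[cells].all(axis=1)      # every corner must pass
    return cells[keep_cell]


# Hypothetical data: four points, the last one carries the -99999 fill value.
scalars = np.array([10.0, 12.0, 11.0, -99999.0])
cells = np.array([[0, 1, 2],     # all corners valid
                  [1, 2, 3]])    # touches the fill-value point

print(cells_with_all_points_valid(cells, scalars, -999))
```

The second triangle is dropped even though two of its corners are valid, which is why the long "curtain" cells reaching down to -99999 disappear.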
With both of the above I get the following figure using your data:
Marmot is a document image dataset (http://www.icst.pku.edu.cn/cpdp/data/marmot_data.htm) in which several things are labelled, such as document body, image area, table area, table caption and so on. The dataset is used specifically for document image analysis research. The authors give all coordinates as 16-digit hexadecimal values in little-endian format. Has anyone worked with this dataset, and how can those 16-digit XY coordinates be converted to a human-readable format?
I finally found the clue after some analysis, and I am posting it here in case anyone needs to investigate this dataset. The authors do state the unit used to convert the given coordinates into pixel values, but it was difficult to track down because it is not in their manual/guideline; it is mentioned elsewhere, as an annotation.
First you have to convert their 16-character hexadecimal values as IEEE 754 doubles (the dataset describes them as little endian, but the hex strings decode to sensible values with the big-endian '!d' format used below). For example, the given coordinates for a label are,
BBox=['4074145c00000005', '4074dd95999999a9', '4080921e74bc6a80', '406fb9999999999a']
Convert using Python,
conv_pound = [struct.unpack('!d', str(t).decode('hex'))[0] for t in BBox]
This gives the values in points (the "pound" unit in the variable name), where one point is 1/72 inch:
conv_pound = [321.2724609375003, 333.8490234375009, 530.2648710937501, 253.8]
We usually work in pixels, and at 96 DPI one inch is 96 pixels. So divide each value by 72 and multiply by 96 to get the corresponding pixel values:
in_pixel = [428.36328, 445.13203, 707.01983, 338.40000]
The dataset counts pixel positions from the bottom-left corner of the document image. If you measure from the top-left corner (the usual convention), you have to subtract the 2nd and 4th values from the image height. If the image [height, width] is [1123, 793], the above coordinates can be represented as integer values:
label_boundary = [428, 678, 707, 785]
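The steps above can be combined into one Python 3 sketch. The function names are hypothetical, and the 96 DPI figure and the image height of 1123 are the assumptions stated in this answer rather than values read from the dataset.

```python
import struct


def hex_to_points(hex_value):
    """Decode a 16-character hex string as a big-endian IEEE 754 double."""
    return struct.unpack('>d', bytes.fromhex(hex_value))[0]


def bbox_to_pixels(bbox_hex, image_height, dpi=96):
    """Convert a Marmot BBox (points, origin bottom-left) to pixels (origin top-left)."""
    # Points are 1/72 inch, so scale by dpi/72 to get pixels.
    x0, y_top, x1, y_bottom = (hex_to_points(h) * dpi / 72 for h in bbox_hex)
    # Flip the two y values because image rows are counted from the top.
    return [round(x0), round(image_height - y_top),
            round(x1), round(image_height - y_bottom)]


BBox = ['4074145c00000005', '4074dd95999999a9',
        '4080921e74bc6a80', '406fb9999999999a']
print(bbox_to_pixels(BBox, image_height=1123))  # matches label_boundary above
```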
After staring at the XMLs for an hour, I've found the last missing piece in the answer by @MMReza:
You don't need to rely on the units of measure (step number 3). There is an attribute called "CropBox" on the root element "Page". Use that one to scale the coordinates.
I have something along the following lines (this also inverts the y axis):
px0, py1, px1, py0 = list(map(hex_to_double, page.get("CropBox").split()))
pw = abs(px1 - px0)
ph = abs(py1 - py0)
for table in page.findall(".//Composite[@Label='TableBody']"):
    x0p, y1m, x1p, y0m = list(map(hex_to_double, table.get("BBox").split()))
    x0 = round(imgw*(x0p - px0)/pw)
    x1 = round(imgw*(x1p - px0)/pw)
    y0 = round(imgh*(py1 - y0m)/ph)
    y1 = round(imgh*(py1 - y1m)/ph)
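The snippet relies on a hex_to_double helper that is not shown. A possible sketch of it, together with the same scaling wrapped in a function, follows. Both function bodies are assumptions: the helper decodes big-endian doubles as in the other answers, and the demo values are made up rather than taken from a Marmot file.

```python
import struct


def hex_to_double(hex_value):
    """Assumed helper: decode a 16-character hex string as a big-endian double."""
    return struct.unpack('>d', bytes.fromhex(hex_value))[0]


def scale_bbox(bbox, cropbox, imgw, imgh):
    """Scale a BBox string into image pixels using the page CropBox string."""
    px0, py1, px1, py0 = map(hex_to_double, cropbox.split())
    pw, ph = abs(px1 - px0), abs(py1 - py0)
    x0p, y1m, x1p, y0m = map(hex_to_double, bbox.split())
    x0 = round(imgw * (x0p - px0) / pw)
    x1 = round(imgw * (x1p - px0) / pw)
    y0 = round(imgh * (py1 - y0m) / ph)   # measured down from the top of the page
    y1 = round(imgh * (py1 - y1m) / ph)
    return x0, y0, x1, y1


def encode(values):
    """Build a hex string like the ones in the XML, for the demo only."""
    return ' '.join(struct.pack('>d', v).hex() for v in values)


# Made-up page of 600 x 800 points mapped onto a 600 x 800 pixel image.
demo = scale_bbox(encode([60, 600, 300, 400]), encode([0, 800, 600, 0]), 600, 800)
print(demo)
```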
In case anyone is trying to do this in Python 3 like I did, you only have to change step 2 of the other answer like this:
conv_pound = [struct.unpack('!d', bytes.fromhex(t))[0] for t in BBox]
I wanted to convert the coordinates and also verify that my conversion actually worked. So I made this script, which reads a label file and its image file, extracts the coordinates of the table body (for example) and visualizes them on the image. It can be used to extract other fields in a similar manner. The comments explain it all.
import glob
import struct
import cv2
import binascii
import re
xml_files = glob.glob("path_to_labeled_files/*.xml")
for i in xml_files:
    # Open the current file and read everything
    cur_file = open(i,"r")
    content = cur_file.read()
    # Find index of all occurrences of only needed portions (eg TableBody this case)
    residxs = [l.start() for l in re.finditer('Label="TableBody"', content)]
    # Read the image
    img = cv2.imread("path_to_images_folder/"+i.split('/')[-1][:-3]+"jpg")
    # Traverse over all occurences
    for r in residxs[:-1]:
        # List to store output points
        coords = []
        # Start index of an occurence
        sidx = r
        # Substring from whole file content
        substr = content[sidx:sidx+400]
        # Now find start index and end index of coordinates in this substring
        sidx = substr.find('BBox="')
        eidx = substr.find('" CLIDs')
        # String containing only points
        points = substr[sidx+6:eidx]
        # Make the conversion (also take care of little and big endian in unpack)
        bins = ''
        for j in points.split(' '):
            if(j == ''):
                continue
            coords.append(struct.unpack('>d', binascii.unhexlify(j))[0])
        if len(coords) != 4:
            continue
        # As suggested by MMReza
        for k in range(4):
            coords[k] = (coords[k]/72)*96
        coords[1] = img.shape[0] - coords[1]
        coords[3] = img.shape[0] - coords[3]
        # Print the extracted coordinates
        print(coords)
        # Visualize it on the image
        cv2.rectangle(img, (int(coords[0]),int(coords[1])) , (int(coords[2]),int(coords[3])), (255, 0, 0), 2)
    cv2.imshow("frame",img)
    cv2.waitKey(0)
I am trying to splice a fits array based on the latitudes provided from the Header. However, I cannot seem to do so with my knowledge of Python and the documentation of astropy. The code I have is something like this:
from astropy.io import fits
import numpy as np
Wise1 = fits.open('Image1.fits')
im1 = Wise1[0].data
im1 = np.where(im1 > *latitude1, 0, im1)
newhdu = fits.PrimaryHDU(im1)
newhdulist = fits.HDUList([newhdu])
newhdulist.writeto('1b1_Bg_Removed_2.fits')
Here latitude1 would be a value in degrees, recognized after being called from the header. So there are two things I need to accomplish:
How to call the header to recognize Galactic Latitudes?
Splice the array in such a way that it only contains values for the range of latitudes, with everything else being 0.
I think by "splice" you mean "cut out" or "crop", based on the example you've shown.
astropy.nddata has a routine for world-coordinate-system-based (i.e., lat/lon or ra/dec) cutouts
However, in the simple case you're dealing with, you just need the coordinates of each pixel. Do this by making a WCS:
from astropy import wcs
w = wcs.WCS(Wise1[0].header)
im = Wise1[0].data
yy,xx = np.indices(im.shape)  # np.indices returns (row, column), i.e. (y, x)
lon,lat = w.wcs_pix2world(xx,yy,0)
newim = im[lat > my_lowest_latitude]
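Note that boolean indexing returns a flattened array of the selected pixels. If the goal is to keep the image shape and set everything outside a latitude range to 0, as the question asks, a minimal numpy sketch could be the following. The function name and the synthetic latitude grid are hypothetical; in practice lat would come from the WCS call above.

```python
import numpy as np


def mask_outside_latitudes(image, lat, lat_min, lat_max):
    """Return a copy of image with pixels outside [lat_min, lat_max] set to 0."""
    inside = (lat >= lat_min) & (lat <= lat_max)
    return np.where(inside, image, 0)


# Synthetic stand-ins: a 3 x 4 image whose rows sit at latitudes -1, 0 and +1 degrees.
image = np.arange(12, dtype=float).reshape(3, 4)
lat = np.repeat(np.array([[-1.0], [0.0], [1.0]]), 4, axis=1)

masked = mask_outside_latitudes(image, lat, -0.5, 0.5)
print(masked)
```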
But if you want to preserve the header information, you're much better off using the cutout tool, since you then do not have to manually manage this.
from astropy.nddata import Cutout2D
from astropy import coordinates
from astropy import units as u
# example coordinate - you'll have to figure one out that's in your map
center = coordinates.SkyCoord(mylon*u.deg, mylat*u.deg, frame='fk5')
# then make an array cutout
co = Cutout2D(im, center, size=[0.1,0.2]*u.arcmin, wcs=w)
# create a new FITS HDU
hdu = fits.PrimaryHDU(data=co.data, header=co.wcs.to_header())
# write to disk
hdu.writeto('cropped_file.fits')
An example use case is in the astropy documentation.