How to dump values from a Grib1/.grb file - python

I am wondering, is there a way to dump values from a GRIB1 file? My end goal is to find values for individual messages at a latitude and longitude, or at least at a grid point. I am using a Linux system. wgrib seems to do nothing except read metadata about the messages or reconstruct the messages.
I know a bit of Python, so I can use pygrib, but I don't know how to pull the values out for a specific latitude and longitude.
Here are some .grb files for everyone to play around with.
http://nomads.ncdc.noaa.gov/data/gfs-avn-hi/201402/20140218/
Thank you for your answers,

If you are interested in data from NOMADS, I would suggest going through their THREDDS Data Server, which allows you to access data by specifying a lat/lon, and you can get that data back as a CSV file if you wish. To do so, first visit the NOMADS TDS site:
http://nomads.ncdc.noaa.gov/thredds/catalog/catalog.html
The example data files you linked to can be found here:
http://nomads.ncdc.noaa.gov/thredds/catalog/gfs-003/201402/20140218/catalog.html
Find a grid that you are interested in, say the 18Z run analysis field:
http://nomads.ncdc.noaa.gov/thredds/catalog/gfs-003/201402/20140218/catalog.html?dataset=gfs-003/201402/20140218/gfs_3_20140218_1800_000.grb
Follow the link that says "NetcdfService":
http://nomads.ncdc.noaa.gov/thredds/ncss/grid/gfs-003/201402/20140218/gfs_3_20140218_1800_000.grb/dataset.html
Near the top of that page, click "As Point Dataset":
http://nomads.ncdc.noaa.gov/thredds/ncss/grid/gfs-003/201402/20140218/gfs_3_20140218_1800_000.grb/pointDataset.html
Then check the parameters you are interested in, the lat/lon (the closest grid point to that lat/lon will be chosen), and the output format type.
This web interface basically generates an access URL. For example, if I want surface Temperature over Boulder, CO, returned as CSV, the URL looks like this:
http://nomads.ncdc.noaa.gov/thredds/ncss/grid/gfs-003/201402/20140218/gfs_3_20140218_1800_000.grb?var=Temperature_surface&latitude=40&longitude=-105&temporal=all&accept=csv&point=true
As you can see from the above URL, you can generate these pretty easily and make the request without going through all the steps above.
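For reference, here is a minimal sketch of making that request from Python (using the requests library, which is just one option; the query parameters mirror the example URL above):

    import requests

    # Base dataset URL from the example above; the parameters mirror its query string.
    url = ("http://nomads.ncdc.noaa.gov/thredds/ncss/grid/gfs-003/201402/"
           "20140218/gfs_3_20140218_1800_000.grb")
    params = {
        "var": "Temperature_surface",  # parameter checked on the "As Point Dataset" page
        "latitude": 40,                # the nearest grid point to this lat/lon is returned
        "longitude": -105,
        "temporal": "all",
        "accept": "csv",
        "point": "true",
    }

    response = requests.get(url, params=params)
    response.raise_for_status()

    # The CSV comes back as plain text; print it or feed it to the csv module / pandas.
    print(response.text)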
This access method (NetcdfSubsetService) can be combined with Python easily. For an example, check out this ipython notebook:
http://nbviewer.ipython.org/github/Unidata/tds-python-workshop/blob/master/ncss.ipynb
Specifically, the first and second cells in the notebook.
Note that you can get recent GFS data, in which an entire model run is contained within one grib file, at:
http://thredds.ucar.edu/thredds/idd/modelsNcep.html
This would allow you to make a request, like the one above, but for multiple times using one request.
Cheers,
Sean

You can use the GRIB tools, specifically grib_ls and grib_get, to get values from the 1 or 4 grid points nearest to a specified latitude and longitude. So you can use nearest-neighbour or bilinear interpolation, or whatever you like, to get your value.
Read this presentation, grib_ls starts at page 31:
http://nwmstest.ecmwf.int/services/computing/training/material/grib_api/grib_api_tools.pdf
When you install the GRIB tools, you get several command-line utilities to help you work with GRIB files.
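Since the question mentions pygrib, here is a minimal pygrib sketch of the nearest-grid-point approach (the file name and the field selection are placeholders; inspect the file to see which messages it actually contains):

    import numpy as np
    import pygrib

    grbs = pygrib.open("gfs_3_20140218_1800_000.grb")
    # Placeholder selection: pick the surface temperature message.
    grb = grbs.select(name="Temperature", typeOfLevel="surface")[0]

    values = grb.values          # 2D array of the decoded field
    lats, lons = grb.latlons()   # 2D arrays of grid latitudes/longitudes

    # Nearest-neighbour lookup: index of the grid point closest to the target.
    target_lat, target_lon = 40.0, 255.0   # note: GFS longitudes run 0-360
    dist = (lats - target_lat) ** 2 + (lons - target_lon) ** 2
    j, i = np.unravel_index(dist.argmin(), dist.shape)

    print(grb.name, grb.units, values[j, i], "at", lats[j, i], lons[j, i])
    grbs.close()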

Related

Display GTFS - Routes on a Map without Shapes

I am trying to consume some GTFS feeds and work with them.
I created a MySQL database and a Python script that downloads GTFS files and imports them into the right tables.
Now I am using the Leaflet map framework to display the stops on the map.
The next step is to display the route of a bus or tram line on the map.
There is no shapes.txt in the GTFS archive.
Is there a way to display the routes without shapes.txt?
Thanks!
Kali
You will have to generate your own shape using underlying street data or public transit lines. See the detailed post by Anton Dubrau (he is an angel for writing this blog post).
https://medium.com/transit-app/how-we-built-the-worlds-prettiest-auto-generated-transit-maps-12d0c6fa502f
Specifically:
Here’s an example. In the diagram below, we have a trip with three
stops, and no shape information whatsoever. We extract the set of
tracks the trip uses from OSM (grey lines). Our matching algorithm
then finds a trajectory (black line) that follows the OSM, while
minimizing its length and the errors to the stops (e1, e2, e3).
The only alternative to using shapes.txt would be to use the stops along the route to define shapes. The laziest way would be to pick a single trip for that route, get the set of stops from stop_times.txt, and then get the corresponding stop locations from stops.txt.
If you wanted or needed to, you could get a more complete picture by finding the unique ordered sets of stops among all of the trips on that route, and define a shape for each ordered set in the same way.
Of course, these shapes would only be rough estimates because you don't have any information about the path taken by the vehicles between stops.
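A bare-bones sketch of that lazy approach, using only the standard csv module (the route_id and file locations are placeholders):

    import csv

    route_id = "42"  # the route you want to draw (placeholder)

    # Pick a single trip on that route.
    with open("trips.txt", newline="", encoding="utf-8") as f:
        trip_id = next(row["trip_id"] for row in csv.DictReader(f)
                       if row["route_id"] == route_id)

    # Get that trip's stops in order.
    with open("stop_times.txt", newline="", encoding="utf-8") as f:
        stop_times = [row for row in csv.DictReader(f) if row["trip_id"] == trip_id]
    stop_times.sort(key=lambda row: int(row["stop_sequence"]))

    # Look up each stop's coordinates.
    with open("stops.txt", newline="", encoding="utf-8") as f:
        stops = {row["stop_id"]: (float(row["stop_lat"]), float(row["stop_lon"]))
                 for row in csv.DictReader(f)}

    # A crude polyline: feed this list of (lat, lon) pairs to L.polyline(...) in Leaflet.
    shape = [stops[st["stop_id"]] for st in stop_times]
    print(shape)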

Spatial data extraction using Python

I have a lot of UK data, and what I would like to do is extract it based upon a postcode, coordinates, grid reference, etc.
Is this possible using Python?
Yes. If you just have the postcodes, you'll first most likely need to convert them to coordinates. This can be done with third-party tools such as Google's Geocoding API or Royal Mail's postcode data (the Postcode Address File). Once you have coordinates, you can plot them however you like using tools such as Highcharts, or build your own.
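As one concrete option for the postcode-to-coordinates step (not mentioned in the answer above, just an illustration), here is a sketch using geopy's free Nominatim geocoder:

    from geopy.geocoders import Nominatim

    # The user_agent string and the postcode are placeholders.
    geolocator = Nominatim(user_agent="uk-data-extraction-example")
    location = geolocator.geocode("SW1A 1AA, United Kingdom")

    if location is not None:
        # Coordinates you can then use to filter or plot your UK data.
        print(location.latitude, location.longitude)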

Geospatial Analytics in Python

I have been doing some investigation to find a package to install and use for geospatial analytics.
The closest I got was https://github.com/harsha2010/magellan - however, it only has a Scala interface and no documentation on how to use it with Python.
I was hoping someone knows of a package I can use.
What I am trying to do is analyse Uber's data, map it to the actual postcodes/suburbs, and run it through SGD to predict the number of trips to a particular suburb.
There is already lots of info here - http://hortonworks.com/blog/magellan-geospatial-analytics-in-spark/#comment-606532 - and I am looking for ways to do it in Python.
In Python I'd take a look at GeoPandas. It provides a data structure called GeoDataFrame: it's a list of features, each one having a geometry and some optional attributes. You can join two GeoDataFrames together based on geometry intersection, and you can aggregate the numbers of rows (say, trips) within a single geometry (say, postcode).
I'm not familiar with Uber's data, but I'd try to find a way to get it into a GeoPandas GeoDataFrame.
Likewise, postcode boundaries can be downloaded from places like the U.S. Census, OpenStreetMap[1], etc., and coerced into a GeoDataFrame.
Join #1 to #2 based on geometry intersection. You want a new GeoDataFrame with one row per Uber trip, but with the postcode attached to each. Another StackOverflow post discusses how to do this, and it's currently harder than it ought to be.
Aggregate this by postcode and count the trips in each. The code will look like joined_dataframe.groupby('postcode').count().
My fear for the above process is if you have hundreds of thousands of very complex trip geometries, it could take forever on one machine. The link you posted uses Spark and you may end up wanting to parallelize this after all. You can write Python against a Spark cluster(!) but I'm not the person to help you with this component.
Finally, for the prediction component (e.g. SGD), check out scikit-learn: it's a pretty fully featured machine learning package, with a dead simple API.
[1]: There is a separate package called geopandas_osm that grabs OSM data and returns a GeoDataFrame: https://michelleful.github.io/code-blog/2015/04/27/osm-data/
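To make those steps a little more concrete, here is a minimal GeoPandas sketch; the file names, column names, and CRS handling are assumptions, not references to any real dataset:

    import geopandas as gpd
    import pandas as pd

    # 1. Trips as a GeoDataFrame of pickup points (assumed columns: lat, lon).
    trips = pd.read_csv("uber_trips.csv")
    trips = gpd.GeoDataFrame(
        trips,
        geometry=gpd.points_from_xy(trips["lon"], trips["lat"]),
        crs="EPSG:4326",
    )

    # 2. Postcode/suburb boundaries as a GeoDataFrame (assumed column: postcode).
    postcodes = gpd.read_file("postcodes.shp").to_crs("EPSG:4326")

    # 3. Attach a postcode to every trip via a spatial join on geometry.
    joined = gpd.sjoin(trips, postcodes, how="inner", predicate="within")

    # 4. Count trips per postcode; these counts become the target for the SGD model.
    trip_counts = joined.groupby("postcode").size()
    print(trip_counts.sort_values(ascending=False).head())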
I realize this is an old question, but to build on Jeff G's answer:
If you arrived at this page looking for help putting together a suite of geospatial analytics tools in Python, I would highly recommend this tutorial.
https://geohackweek.github.io/vector
It really picks up steam in the 3rd section.
It shows how to integrate
GeoPandas
PostGIS
Folium
rasterstats
Add in scikit-learn, NumPy, and SciPy and you can really accomplish a lot. You can grab information from this ndarray tutorial as well.

How to find the source data of a series chart in Excel

I have some pretty strange data I'm working with, as can be seen in the image. I can't seem to find any source data for the numbers these graphs are presenting.
Furthermore, if I search for the source, it only points to an empty cell for each graph.
Ideally I want to be able to retrieve the highlighted labels in each case using Python, and it seems finding the source is the only way to do this, so if you know of a Python module that can do that I'd be happy to use it. Otherwise, if you can help me find the source data, that would be even better :P
So far I've tried the xlrd module for Python as well as manually showing all hidden cells, but neither works.
Here's a link to the file: Here
EDIT: I ended up just converting the xlsx to a PDF using the cloudconvert.com API,
then using pdftotext to convert the data to a .txt, which contains everything including the numbers on the edge of the chart and can then be searched with an algorithm.
If a hopeless internet wanderer comes upon this thread with the same problem, you can PM me for more details :P
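For what it's worth, a minimal sketch of the pdftotext-and-search step described in the EDIT above (the cloudconvert xlsx-to-PDF step is omitted; file names and the regex are placeholders):

    import re
    import subprocess

    # Dump the PDF (already converted from xlsx) to plain text with pdftotext.
    subprocess.run(["pdftotext", "workbook.pdf", "workbook.txt"], check=True)

    with open("workbook.txt", encoding="utf-8") as f:
        text = f.read()

    # Example: pull out every number that survived in the dumped chart text.
    numbers = re.findall(r"-?\d+(?:\.\d+)?", text)
    print(numbers[:20])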

How to convert from lat lon to zipcode or state to generate choropleth map

I have a large collection (and growing) of geospatial data (lat, lon) points (stored in mongodb, if that helps).
I'd like to generate a choropleth map (http://vis.stanford.edu/protovis/ex/choropleth.html), which requires knowing the state that contains each point. Is there a database or algorithm that can do this without requiring calls to external APIs? (I'm aware of things like geopy and the Google Maps API.)
Actually, the web app you linked to contains the data you need -
If you look at http://vis.stanford.edu/protovis/ex/us_lowres.js for each state, borders[] contains a [lat,long] polyline which outlines the state. Load this data and check for point-in-polygon - http://en.wikipedia.org/wiki/Point_in_polygon
Per Reverse Geocoding Without Web Access, you can speed it up a lot by pre-calculating a bounding box for each state and only running the point-in-polygon test when the point is inside the bounding box.
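A bare-bones sketch of that bounding-box-plus-point-in-polygon approach (the state borders are assumed to be (lat, lon) lists parsed out of the us_lowres.js borders[] arrays; the Colorado bounding box in the comment is approximate):

    def in_bounding_box(lat, lon, bbox):
        min_lat, min_lon, max_lat, max_lon = bbox
        return min_lat <= lat <= max_lat and min_lon <= lon <= max_lon

    def in_polygon(lat, lon, polygon):
        # Standard ray-casting test: count edge crossings of a ray from the point.
        inside = False
        n = len(polygon)
        for k in range(n):
            lat1, lon1 = polygon[k]
            lat2, lon2 = polygon[(k + 1) % n]
            if (lat1 > lat) != (lat2 > lat):
                cross_lon = lon1 + (lat - lat1) * (lon2 - lon1) / (lat2 - lat1)
                if lon < cross_lon:
                    inside = not inside
        return inside

    def find_state(lat, lon, states):
        # states: {"CO": {"bbox": (37.0, -109.05, 41.0, -102.04), "border": [(lat, lon), ...]}, ...}
        for name, state in states.items():
            if in_bounding_box(lat, lon, state["bbox"]) and in_polygon(lat, lon, state["border"]):
                return name
        return None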
Here's how to do it in FORTRAN. Remember FORTRAN? Me neither. Anyway, it looks pretty simple, as every state has its own range.
EDIT: It's been pointed out to me that your starting point is lat-long, not the zipcode.
The algorithm for converting a lat-long to a political division is called "a map". Seriously, that's all an ordinary map is: a mapping of every point in some range to the division it belongs to. A detailed digital map of all 48 contiguous states would be a big database, and then you would need some (fairly simple) code to determine, for each state (described as a series of line segments outlining the border), whether a given point was inside it or not.
You can try using the GeoNames database. It has lat/lon as well as city, postal, and other location-type data. It is free as well.
If you need to host it locally or import it into your own database, the USGS and NGA provide a comprehensive list of cities with lat/lon. It's updated regularly, free, and reliable.
http://geonames.usgs.gov
http://earth-info.nga.mil/gns/html/index.html
Not sure about the quality of the data, but give this a shot: http://www.boutell.com/zipcodes/
If you don't mind a very crude solution, you could adapt the click-map here.
