I have a Python application that creates polygons to identify geographic areas of interest at specific times. So far I've been using GeoJSON because the handy geojson library makes writing it easy, and I put the time information in the file name. However, I now need to publish my polygons via a WMS with TIME support (probably using MapServer). Since GeoJSON doesn't appear to support a feature time and the geojson-events proposal hasn't been accepted yet, I thought I would try converting to GML, but I can't seem to locate a library that makes writing GML from Python simple. Does one exist? I tried using the geojson-events format and then ogr2ogr to convert from geojson-events to GML, but the time information gets dropped.
So looking for either:
a) an efficient way to write GML from python,
b) a way to encode datetime information into geojson such that ogr will recognize it or
c) another brilliant solution I haven't thought of.
To convert GeoJSON into GML you could use GDAL (the Geospatial Data Abstraction Library). There are numerous ways of using the library, including directly from Python via its bindings.
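On option (b), a minimal sketch of encoding the time on each feature, assuming you store it as an ISO 8601 string in the feature's properties (recent GDAL/OGR versions usually autodetect such strings as DateTime fields, so ogr2ogr can carry them through to GML) — the field name and file names here are made up:

```python
import json

# Store the timestamp as an ISO 8601 string in "properties".
# OGR's GeoJSON driver typically autodetects this as a DateTime field.
feature_collection = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "geometry": {
                "type": "Polygon",
                "coordinates": [[[0, 0], [1, 0], [1, 1], [0, 1], [0, 0]]],
            },
            "properties": {"timestamp": "2015-06-01T12:00:00Z"},
        }
    ],
}

with open("areas.geojson", "w") as f:
    json.dump(feature_collection, f)

# Then, on the command line (hypothetical file names):
#   ogr2ogr -f GML areas.gml areas.geojson
```

Whether the datetime survives the conversion is worth verifying with `ogrinfo areas.geojson` before wiring this into MapServer.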
However, since you want to set up a WMS to serve your data, you might instead set up a spatial database, for example PostgreSQL/PostGIS, import the GeoJSON directly into the database, and let MapServer do the conversion for you.
See Store a GeoJSON FeatureCollection to postgres with postgis for details of how you might do this.
Let's say I want to create maps of crime, education, traffic, etc. on a street or city level. What are the modules I need to learn, or the best ones for this?
For some data, I will be using Excel-like documents where I have street names or building numbers not linked to Google Maps directly, to be combined later through code. For some, I want to obtain data directly from Google Maps, such as names of stores or street numbers. I'm a beginner and a sociologist, and this is the main reason I want to learn programming. Maybe drawing on a map image would be a lot easier, but in the long term my aim is using Google Maps since it can obtain data by itself. Thanks in advance.
I'm a beginner and need a long-term plan and some advice. I've watched some NumPy and pandas videos and they seem OK and doable so far.
There are several Python modules that can be used to work with Google Maps data. Some of the most popular ones include:
Google Maps API: This is the official API for working with Google Maps data. It allows you to access a wide range of data, including street maps, satellite imagery, and places of interest. You can use the API to search for addresses, get directions, and even create custom maps.
gmaps: This is a Python wrapper for the Google Maps API. It makes it easy to work with the API by providing a simple, Pythonic interface. It also includes support for several popular Python libraries, such as Pandas and Numpy.
folium: This is a library for creating leaflet maps in Python. It allows you to create interactive maps, add markers and other data, and customize the appearance of your maps.
geopandas: This library allows you to work with geospatial data in Python. It is built on top of the popular Pandas library and includes support for working with shapefiles, geojson, and more.
geopy: This is a Python library for working with geocoding and distance calculations. It can be used to convert addresses to latitude and longitude coordinates, as well as to perform distance calculations between two points.
In general, it's a good idea to start with the Google Maps API, gmaps, and folium; you can bring in geopandas and geopy later when you need more advanced functionality. Try to start with simple examples and gradually increase the complexity of your projects.
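As a taste of the kind of calculation a library like geopy automates, here is a pure-Python haversine sketch of the great-circle distance between two points (the coordinates below are made-up approximate examples, not authoritative data):

```python
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points in kilometres."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371 * asin(sqrt(a))  # 6371 km: mean Earth radius

# Example: two nearby points in a city (approximate coordinates)
d = haversine_km(41.0082, 28.9784, 41.0422, 29.0067)
```

With geopy you would get the same kind of result without writing the formula yourself, plus geocoding (address to coordinates) on top.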
I've got a PDF file that I'm trying to obtain specific data from.
I've been able to parse the PDF via PyPDF2 into one long string, but searching for specific data is difficult because of - I assume - formatting in the original PDF.
What I am looking to do is retrieve specific known fields and the data that immediately follows (as formatted in the PDF), and then store these in separate variables.
The PDFs are bills and hence are all presented in the exact same way, with defined fields and images. So what I am looking to do is to extract these fields.
What would be the best way to achieve this?
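One common approach, assuming the text has already been extracted into a string (e.g. via PyPDF2) and the bills always use the same field labels: search for each known label with a regular expression and capture what follows. A minimal sketch (the field names and sample text below are made up):

```python
import re

# Hypothetical extracted text from a bill; real output from PyPDF2
# may have different spacing or line breaks.
text = "Account Number: 12345678 Billing Date: 01/02/2023 Amount Due: $42.50"

def field_after(label, text):
    """Return the token immediately following a known field label, or None."""
    m = re.search(re.escape(label) + r"\s*:?\s*(\S+)", text)
    return m.group(1) if m else None

account = field_after("Account Number", text)
amount = field_after("Amount Due", text)
```

Because extraction order and whitespace vary between PDF generators, you may need to loosen the pattern (e.g. allow newlines) once you see what PyPDF2 actually produces for your bills.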
In general, this is probably impossible (or extremely difficult), and details (that you don't mention) are very important. Study the complex PDF specification in detail. Notice that PDF is (more or less accidentally) Turing complete, so your problem is undecidable in general, since it is equivalent to the halting problem.
For example, what a human reader sees as digits in the document could be encoded as text, or as a JPEG image, etc., and in practice many PDF documents contain such data. Practically speaking, PDF is an output-only format, designed for screen display and printing, not for extracting data.
You need to understand exactly how that PDF file was generated (with what software, from what actual data). Without help, that could take a lot of time (maybe several years of full-time reverse-engineering work).
A much better approach is to contact the person or entity providing that PDF file and negotiate some way of accessing the actual data (or at least get a detailed explanation of how that particular PDF file is generated). For example, if the PDF file is computed from some database, you would be better off accessing that database directly.
Perhaps using metadata or comments in your PDF file might help in guessing how it was generated.
The source of the data might produce various kinds of PDF file. For example, my cheap scanner can produce PDF, but your program would have a hard time extracting numerical data from it (because that kind of PDF essentially wraps a pixelated image, à la JPEG) and would need image-recognition techniques (i.e. OCR) to do so.
I have some pretty strange data I'm working with, as can be seen in the image. I can't seem to find any source data for the numbers these graphs are presenting.
Furthermore if I search for the source it only points to an empty cell for each graph.
Ideally I want to be able to retrieve the highlighted labels in each case using Python, and finding the source data seems to be the only way to do this. If you know of a Python module that can do that, I'd be happy to use it; otherwise, if you can help me find the source data, that would be even better :P
So far I've tried the xlrd module for Python as well as manually unhiding all hidden cells, but neither works.
Here's a link to the file: Here
EDIT: I ended up just converting the xlsx to a PDF using the cloudconvert.com API,
then using pdftotext to convert the data to a .txt. That captures everything, including the numbers on the edge of the chart, which can then be searched with an algorithm.
If a hopeless internet wanderer comes upon this thread with the same problem, you can PM me for more details :P
I am trying to use ConceptNet with the divisi2 package. The divisi package is specifically designed for working with knowledge in semantic networks: it takes a graph as input and converts it into SVD form. The package distribution provides the basic ConceptNet data in graph format, but this data seems to be outdated. Divisi can be used as shown in Using Divisi with ConceptNet (link), but the data needs to be updated with ConceptNet 5 data; is there any way to do that? I have all the ConceptNet data set up locally as described in "Running your own copy", so I have all the data in an SQLite database, and I also have the data separately in CSV format. How can I load this data into the Divisi package? Thanks.
I'm processing some data for a research project, and I'm writing all my scripts in python. I've been using matplotlib to create graphs to present to my supervisor. However, he is a die-hard MATLAB user and he wants me to send him MATLAB .fig files rather than SVG images.
I've looked all over but can't find anything to do the job. Is there any way to either export .fig files from matplotlib, convert .svg files to .fig, or import .svg files into MATLAB?
Without access to (or experience with) MATLAB, this is going to be a bit tricky. As Amro stated, .fig files store the underlying data, not just an image, and you're going to have a hard time saving .fig files from Python. There are, however, a couple of things which might work in your favour:
numpy/scipy can read and write matlab .mat files
the matplotlib plotting commands are very similar to (and based on) the MATLAB ones, so the code to generate plots from the data is going to be nearly identical (modulo round/square brackets and 0/1-based indexing).
My approach would be to write your data out as .mat files, and then just put your plotting commands in a script and give that to your supervisor - with any luck it shouldn't be too hard for him to recreate the plots based on that information.
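A minimal sketch of that approach using scipy.io (the variable names, data, and file name here are made up for illustration):

```python
import numpy as np
from scipy.io import savemat

# Hypothetical data: x values and a measured signal.
x = np.linspace(0, 10, 101)
y = np.sin(x)

# Write a MATLAB-readable .mat file; the dict keys become
# variable names in the MATLAB workspace after load().
savemat("results.mat", {"x": x, "y": y})

# In MATLAB, the supervisor could then recreate the plot with:
#   load('results.mat'); plot(x, y);
```

The supervisor can then save the resulting figure as a .fig himself, with the full underlying data intact.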
If you had access to MATLAB to test/debug, I'm sure it would be possible to write code which automagically created the .mat files and a MATLAB .m file which would recreate the figures.
There's a neat list of matlab/scipy equivalent commands on the scipy web site.
Good luck!