I have a .geojson file (call it data.geojson) which I use to manually update a dataset on Mapbox.
Suppose that my data.geojson file is structured as follows:
{
"type": "FeatureCollection",
"features": [
{
"type": "Feature",
"properties": {
"suburb": "A",
"unemployed": 10
},
"geometry": {
"type": "Point",
"coordinates": [
0,
0
]
}
},
{
"type": "Feature",
"properties": {
"suburb": "B",
"unemployed": 20
},
"geometry": {
"type": "Point",
"coordinates": [
1,
1
]
}
}
]
}
data.geojson is stored locally, and every 12 hours the 'unemployed' property of each feature is updated by another Python script that scrapes data from the web.
Currently, to update these properties in the online dataset (stored at mapbox.com), I manually navigate to the Mapbox website and re-upload the data.geojson file. I am looking for a way to do this programmatically with Python.
Any help would be greatly appreciated!
You can set up a timer of some sort to refresh the data automatically using JavaScript functions. Here I am using a source and layer named "sti", which is just GeoJSON line data.
The function would first add the data source as well as the layer:
var STI_SOURCE = 'json/sti/STI.json'; // declare URL for data
map.addSource('sti', { type: 'geojson', data: STI_SOURCE }); // Add source using URL
// Add the actual layer using the source
map.addLayer({
"id": "sti",
"type": "line",
"source": "sti",
"layout": {
"line-join": "miter",
"line-cap": "round"
},
"paint": {
"line-color": "#fff",
"line-width": 1,
"line-dasharray": [6, 2]
}
});
Then, when you want to refresh the data, remove the layer and the source:
map.removeLayer('sti');
map.removeSource('sti');
Then you can re-add them by starting again from the beginning. There are other (and better) ways to do this, but this is one way that works. In particular, a GeoJSON source has a setData() method that replaces its data in place, so the layer does not have to be removed and re-added. Hopefully this gets you started.
My solution, in the end, was simply to point the source of the Mapbox layer at the locally stored data.geojson file rather than at the corresponding dataset stored online at mapbox.com.
I was able to edit the locally stored data.geojson using Python's 'json' package. Since the Mapbox layer source points directly at the local file, all updates to this file are reflected in the Mapbox layer. This way, there is no need to upload any data to Mapbox.
@David also posted a helpful solution if you wish to go down that route.
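Editing the local file with the json package, as described above, could look roughly like the sketch below. The function name update_unemployed and the idea of passing the scraped values as a suburb-to-count mapping are assumptions made for illustration; only the 'suburb' and 'unemployed' property names come from the question.

```python
import json


def update_unemployed(path, counts_by_suburb):
    """Overwrite the 'unemployed' property of every feature whose suburb is in the mapping.

    `counts_by_suburb` is a hypothetical dict such as {"A": 12, "B": 25},
    standing in for whatever the scraping script produces.
    """
    with open(path) as f:
        collection = json.load(f)

    for feature in collection["features"]:
        props = feature["properties"]
        if props["suburb"] in counts_by_suburb:
            props["unemployed"] = counts_by_suburb[props["suburb"]]

    # Write the updated FeatureCollection back to the same file
    with open(path, "w") as f:
        json.dump(collection, f, indent=2)
    return collection
```

Calling update_unemployed("data.geojson", {"A": 12}) would change only suburb A and leave the other features untouched.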
I have been using Python for a while and want to learn DBs now. I am trying to learn MongoDB.
Goal: to add a nested dictionary to a key in a pre-existing document.
I am using PyMongo for this.
For example, I have this:
{'name':'Ryan', 'titles':[{'title_name':'Victory Set'}]}
And I want to add another dictionary to the titles key, so that it looks like this:
{'name':'Ryan', 'titles':[{'title_name':'Victory Set'}, {'title_name':'Bronze Trial'}]}
I have seen that you can use update_one or update_many to update a pre-existing value, but I couldn't find how to add new data.
How can I achieve this?
Try this (see https://mongoplayground.net/p/P5w6bF0UtWj):
db.collection.update({
"name": "Ryan"
},
{
"$push": {
"titles": {
"$each": [
{
"title_name": "Bronze Trial"
},
{
"title_name": "Silver Trial"
}
],
}
}
})
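Since the question uses PyMongo, the same $push / $each update can be sketched in Python as follows. The helper name add_titles is an assumption for illustration; update_one is the PyMongo collection method mentioned in the question, and the connection URL, database and collection names in the usage comment are placeholders.

```python
def add_titles(collection, name, title_names):
    """Append one {'title_name': ...} sub-document per entry to the matching document's 'titles' array."""
    update = {
        "$push": {
            "titles": {"$each": [{"title_name": t} for t in title_names]}
        }
    }
    # `collection` is expected to be a pymongo Collection (anything exposing update_one works)
    return collection.update_one({"name": name}, update)


# Usage against a real server (URL, database and collection names are placeholders):
# from pymongo import MongoClient
# players = MongoClient("mongodb://localhost:27017")["mydb"]["players"]
# add_titles(players, "Ryan", ["Bronze Trial"])
```

For a single new title, passing a one-element list keeps the same code path; "$push" with a plain sub-document instead of "$each" would also work.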
I'm trying to split a JSON file into two different XML files. Example below.
I'm trying to use a Python script to do this. A Groovy script would work as well. This split function is part of a file transformation in Apache NiFi.
JSON file :
{
"Cars": {
"Car": [{
"Brand": "Volkswagon",
"Country": "Germany",
"Type": "All",
"Models":
[{
"Polo": {
"Type": "Hatchback",
"Color": "White",
"Cost": "10000"
}
}, {
"Golf": {
"Type": "Hatchback",
"Color": "White",
"Cost": "12000"
}
}
]
}
]
}
}
Split to two XML files :
XML 1 :
<VehicleEntity>
<VehicleEntity>
<GlobalBrandId>Car123</GlobalBrandId>
<Name>Random Value</Name>
<Brand>Volkswagon</Brand>
</VehicleEntity>
</VehicleEntity>
XML 2 :
<VehicleEntityDetail>
<VehicleEntityDetailsEntity>
<GlobalBrandId>Car123</GlobalBrandId>
<Brand>Volkswagon</Brand>
<Type>Hatchback</Type>
<Color>White</Color>
<Cost>10000</Cost>
</VehicleEntityDetailsEntity>
</VehicleEntityDetail>
The XML tag names are a little different to the elements in the JSON file.
I'm looking for the best possible way to achieve this, but prefer a python script due to some experience working with Python.
Any other solution for Apache NiFi is also appreciated.
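One possible starting point in plain Python is sketched below, using only the standard library. Several things here are assumptions: the function name build_xml_documents is made up, GlobalBrandId and Name do not appear in the JSON and are therefore passed in as parameters, only the first entry of "Car" is read, and one VehicleEntityDetailsEntity is emitted per model (the example XML above shows only one). How this would be wired into a NiFi processor is not covered.

```python
import json
import xml.etree.ElementTree as ET


def build_xml_documents(cars_json, global_brand_id, name):
    """Return (xml1, xml2) as strings, built from the first car in the JSON text."""
    car = json.loads(cars_json)["Cars"]["Car"][0]

    # XML 1: brand-level entity
    root1 = ET.Element("VehicleEntity")
    entity = ET.SubElement(root1, "VehicleEntity")
    ET.SubElement(entity, "GlobalBrandId").text = global_brand_id
    ET.SubElement(entity, "Name").text = name
    ET.SubElement(entity, "Brand").text = car["Brand"]

    # XML 2: one detail entity per model (assumption, see the note above)
    root2 = ET.Element("VehicleEntityDetail")
    for model in car["Models"]:
        for details in model.values():
            detail = ET.SubElement(root2, "VehicleEntityDetailsEntity")
            ET.SubElement(detail, "GlobalBrandId").text = global_brand_id
            ET.SubElement(detail, "Brand").text = car["Brand"]
            for field in ("Type", "Color", "Cost"):
                ET.SubElement(detail, field).text = details[field]

    return (
        ET.tostring(root1, encoding="unicode"),
        ET.tostring(root2, encoding="unicode"),
    )
```

The two returned strings can then be written to separate files or handed on to the next step of the flow.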
I have JSON files in S3, each containing an array of objects, as shown below.
[{
"id": "c147162a-a304-11ea-aa90-0242ac110028",
"clientId": "xxx",
"contextUUID": "1bb6b39e-b181-4a6d-b43b-4040f9d254b8",
"tags": {},
"timestamp": 1592855898
}, {
"id": "c147162a-a304-11ea-aa90-0242ac110028",
"clientId": "yyy",
"contextUUID": "1bb6b39e-b181-4a6d-b43b-4040f9d254b8",
"tags": {},
"timestamp": 1592855898
}]
I used a crawler to detect the schema and load it into the catalog. This succeeded and created a schema with a single column named array with data type array<struct<id:string,clientId:string,contextUUID:string,tags:string,timestamp:int>>.
I then tried to load the data using the glueContext.create_dynamic_frame.from_catalog function, but I could not see any data. I tried printing the schema and the data as shown below.
ds = glueContext.create_dynamic_frame.from_catalog(
database = "dbname",
table_name = "tablename")
ds.printSchema()
root
ds.schema()
StructType([], {})
ds.show()
empty
ds.toDF().show()
++
||
++
++
Any idea what I am doing wrong? I am planning to extract each object in the array and transform it to a different schema.
You can try passing a jsonPath in format_options to tell Glue how it should read the data. The following code has worked for me:
glueContext.create_dynamic_frame_from_options('s3',
{
'paths': ["s3://glue-test-bucket-12345/events/101-1.json"]
},
format="json",
format_options={"jsonPath": "$[*]"}
).toDF()
I hope it solves the problem.
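To make the effect of the "$[*]" jsonPath more concrete: it selects every element of the top-level array, so each object becomes its own record instead of the whole file being read as one array-valued column. The sketch below is plain Python, not Glue code, and the function name is made up; it only illustrates the idea.

```python
import json


def records_from_top_level_array(text):
    """Mimic the effect of the JSONPath "$[*]": one record per element of the root array."""
    root = json.loads(text)
    if not isinstance(root, list):
        raise ValueError("expected a JSON array at the top level")
    return list(root)


sample = '[{"clientId": "xxx", "timestamp": 1592855898}, {"clientId": "yyy", "timestamp": 1592855898}]'
for record in records_from_top_level_array(sample):
    print(record["clientId"], record["timestamp"])
```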
According to the GeoJSON Format Specification
"If a feature has a commonly used identifier, that identifier should be included as a member of the feature object with the name "id"."
My question is how do I add this to my GeoJSON?
If I create it as an attribute and then save it as GeoJSON in QGIS it ends up in Properties instead of in Features.
This is what I want to do:
{
"type": "FeatureCollection",
"crs": { "type": "name", "properties": { "name":"urn:ogc:def:crs:OGC:1.3:CRS84" } },
"features": [
{ "type": "Feature", "id":"1", "properties": { "Namn":.................
This is what QGIS produces:
{
"type": "FeatureCollection",
"crs": { "type": "name", "properties": { "name": "urn:ogc:def:crs:OGC:1.3:CRS84" } },
"features": [
{ "type": "Feature", "properties": { "id": 1, "Name"..................
I have also tried PyGeoj https://github.com/karimbahgat/PyGeoj. This has a function to add a unique id but it also adds it under properties.
If I open the GeoJSON and write it in by hand then it works but I don't want to do this for all of my layers, some of which contain many features.
Thought I would write here how I have solved it in case anyone else comes across the same problem.
A simple python script:
# Read the original file and write a copy in which every feature has an "id"
count = 0
with open("kommun.geojson", "r") as src, open("kommun_id.geojson", "w") as dst:
    for line in src:
        if not line.startswith('{ "type": '):
            # Lines that don't need an "id" are copied unchanged
            dst.write(line)
        else:
            # Feature lines: insert the "id" right after '{ "type": "Feature",'
            count += 1
            dst.write(line[0:20] + '"id":"' + str(count) + '",' + line[21:])
It might not work for all GeoJSON files, but it works for the ones I have created with QGIS.
You can fix this by using the preserve_fid flag in ogr2ogr. You're going to "convert" the bad GeoJSON that QGIS dumps to the format you want, with the id field exposed.
ogr2ogr -f GeoJSON fixed.geojson qgis.geojson -preserve_fid
Here qgis.geojson is the file that QGIS created and fixed.geojson is the new version with the exposed id field.
This is due to ogr2ogr not knowing which QGIS field to use as the GeoJSON ID.
Per this QGIS issue (thank you @gioman), a workaround is to manually specify the ID field in Custom Options --> Layer Settings when you do the export.
If you need to do this to GeoJSON files that have already been created, you can use the standalone ogr2ogr command-line program:
ogr2ogr output.geojson input.geojson -lco id_field=id
If you need to convert the IDs from properties inside multiple files, you can use the command-line tool jq inside a Bash for-loop:
for file in *.geojson; do jq '(.features[] | select(.properties.id != null)) |= (.id = .properties.id)' "$file" > "$file"_tmp; mv "$file"_tmp "$file"; done
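If jq is not available, the same promotion of properties.id to a top-level id can be sketched in Python with the standard json module. The function name promote_id and the file names in the usage comment are placeholders, not part of any tool mentioned above.

```python
import json


def promote_id(collection):
    """Copy properties['id'] to the feature's top-level 'id' wherever it is present."""
    for feature in collection["features"]:
        # 'properties' may be null in GeoJSON, so fall back to an empty dict
        feature_id = (feature.get("properties") or {}).get("id")
        if feature_id is not None:
            feature["id"] = feature_id
    return collection


# Usage on a file (file names are placeholders):
# with open("input.geojson") as f:
#     data = promote_id(json.load(f))
# with open("output.geojson", "w") as f:
#     json.dump(data, f)
```

Like the jq filter, this leaves features without a properties.id unchanged and keeps the original property in place.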
I had the same problem and solved it with this little Python script.
import json

with open('georef-spain-municipio.geojson') as json_file:
    data = json.load(json_file)

# Give each feature a sequential top-level id, starting at 1
for i, feature in enumerate(data['features'], start=1):
    feature['id'] = i

with open('georef-spain-municipio_id.geojson', 'w') as outfile:
    json.dump(data, outfile)
I hope it helps
Is there a python library for converting a JSON schema to a python class definition, similar to jsonschema2pojo -- https://github.com/joelittlejohn/jsonschema2pojo -- for Java?
So far the closest thing I've been able to find is warlock, which advertises this workflow:
Build your schema
>>> schema = {
'name': 'Country',
'properties': {
'name': {'type': 'string'},
'abbreviation': {'type': 'string'},
},
'additionalProperties': False,
}
Create a model
>>> import warlock
>>> Country = warlock.model_factory(schema)
Create an object using your model
>>> sweden = Country(name='Sweden', abbreviation='SE')
However, it's not quite that easy. The objects that Warlock produces lack much in the way of introspectable goodies. And if it supports nested dicts at initialization, I was unable to figure out how to make them work.
To give a little background, the problem that I was working on was how to take Chrome's JSONSchema API and produce a tree of request generators and response handlers. Warlock doesn't seem too far off the mark, the only downside is that meta-classes in Python can't really be turned into 'code'.
Other useful modules to look for:
jsonschema - (which Warlock is built on top of)
valideer - similar to jsonschema but with a worse name.
bunch - An interesting structure builder that's half-way between a dotdict and construct
If you end up finding a good one-stop solution for this, please follow up on your question - I'd love to find one. I pored over GitHub, PyPI, Google Code, SourceForge, etc., and just couldn't find anything really sexy.
For lack of any pre-made solutions, I'll probably cobble together something with Warlock myself. So if I beat you to it, I'll update my answer. :p
python-jsonschema-objects is an alternative to warlock, built on top of jsonschema.
python-jsonschema-objects provides an automatic class-based binding to JSON schemas for use in python.
Usage:
Sample Json Schema
schema = '''{
"title": "Example Schema",
"type": "object",
"properties": {
"firstName": {
"type": "string"
},
"lastName": {
"type": "string"
},
"age": {
"description": "Age in years",
"type": "integer",
"minimum": 0
},
"dogs": {
"type": "array",
"items": {"type": "string"},
"maxItems": 4
},
"gender": {
"type": "string",
"enum": ["male", "female"]
},
"deceased": {
"enum": ["yes", "no", 1, 0, "true", "false"]
}
},
"required": ["firstName", "lastName"]
} '''
Converting the schema object to class
import python_jsonschema_objects as pjs
import json
schema = json.loads(schema)
builder = pjs.ObjectBuilder(schema)
ns = builder.build_classes()
Person = ns.ExampleSchema
james = Person(firstName="James", lastName="Bond")
james.lastName
# u'Bond'
james
# <example_schema lastName=Bond age=None firstName=James>
Validation :
james.age = -2
python_jsonschema_objects.validators.ValidationError: -2 was less or equal to than 0
The problem is that it still uses Draft 4 validation, while jsonschema has moved beyond Draft 4; I filed an issue on the repo about this.
As long as you are using an older version of jsonschema, the package will work as shown.
I just created this small project to generate classes from a JSON schema. Even in Python, I think this can be useful when working on business projects:
pip install jsonschema2popo
Running the following command will generate a Python module containing the classes defined by the JSON schema (it uses jinja2 templating):
jsonschema2popo -o /path/to/output_file.py /path/to/json_schema.json
more info at: https://github.com/frx08/jsonschema2popo
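To illustrate the general idea of turning a JSON schema into Python class source, here is a deliberately small standard-library sketch. It is not how jsonschema2popo, warlock or python-jsonschema-objects are implemented; the function name class_source_from_schema is made up, it handles only a flat object schema with "title", "properties" and "required", it does no validation, and it assumes property names are valid Python identifiers.

```python
def class_source_from_schema(schema):
    """Return Python source for a plain class with one attribute per schema property."""
    class_name = "".join(word.capitalize() for word in schema["title"].split())
    properties = schema.get("properties", {})
    required = set(schema.get("required", []))

    # Required properties first, so parameters without defaults precede those with defaults
    ordered = sorted(properties, key=lambda name: name not in required)
    params = ["self"] + [n if n in required else f"{n}=None" for n in ordered]

    lines = [f"class {class_name}:", f"    def __init__({', '.join(params)}):"]
    lines += [f"        self.{n} = {n}" for n in ordered] or ["        pass"]
    return "\n".join(lines) + "\n"


example = {
    "title": "Example Schema",
    "type": "object",
    "properties": {
        "firstName": {"type": "string"},
        "lastName": {"type": "string"},
        "age": {"type": "integer", "minimum": 0},
    },
    "required": ["firstName", "lastName"],
}
print(class_source_from_schema(example))
```

The generated text can be written to a .py file, which is the main difference from the metaclass-based approach: the result is ordinary source code that can be read, imported and introspected.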