Insert and delete geojson object based on conditions in Python - python

I have a Geojson as follows:
data = {
"type": "FeatureCollection",
"name": "entities",
"features": [
{
"type": "Feature",
"properties": {
"Layer": "0",
"SubClasses": "AcDbEntity:AcDbBlockReference",
"EntityHandle": "3C8",
"area": "141.81",
"type": "p",
"Text": "area:141.81;type:p"
},
"geometry": {
"type": "Polygon",
"coordinates": [
[
[
2721.1572741014097,
1454.3223948456648
],
[
2720.121847266826,
1454.3223948456648
],
[
2720.121847266826,
1452.6092152478227
],
[
2710.5679254269344,
1452.6092152478227
],
[
2721.1572741014097,
1430.1478385206133
],
[
2721.1572741014097,
1454.3223948456648
]
]
]
}
},
{
"type": "Feature",
"properties": {
"Layer": "0",
"SubClasses": "AcDbEntity:AcDbBlockReference",
"EntityHandle": "3CE",
"area": "44.79",
"type": "h",
"Text": "area:44.79;type:h"
},
"geometry": {
"type": "Polygon",
"coordinates": [
[
[
2710.723323781393,
1450.3320226620049
],
[
2720.0654518264787,
1450.3320226620049
],
[
2720.0654518264787,
1445.537183875705
],
[
2710.723323781393,
1445.537183875705
],
[
2710.723323781393,
1450.3320226620049
]
]
]
}
},
{
"type": "Feature",
"properties": {
"Layer": "0",
"SubClasses": "AcDbEntity:AcDbBlockReference",
"EntityHandle": "610",
"name": "706",
"area": "92.28",
"type": "o",
"Text": "name:706;area:92.28;type:o"
},
"geometry": {
"type": "Polygon",
"coordinates": [
[
[
2714.603212251531,
1462.7249212430308
],
[
2711.7289360797,
1462.7249212430308
],
[
2711.7289360797,
1464.852506681824
],
[
2705.7302059101926,
1460.6840827804538
],
[
2710.567925426934,
1454.3223948456637
],
[
2710.567925426934,
1453.838838298367
],
[
2714.603212251531,
1453.838838298367
],
[
2714.603212251531,
1462.7249212430308
]
]
]
}
}
]
}
I want to insert "name": "" if name does not exist in properties, and delete the "Text" entry since it duplicates the other properties. How can I do that in Python?
Thanks a lot in advance!
Expected result:
data = {
"type": "FeatureCollection",
"name": "entities",
"features": [
{
"type": "Feature",
"properties": {
"Layer": "0",
"SubClasses": "AcDbEntity:AcDbBlockReference",
"EntityHandle": "3C8",
"name": "",
"area": "141.81",
"type": "p"
},
"geometry": {
"type": "Polygon",
"coordinates": [
[
[
2721.1572741014097,
1454.3223948456648
],
[
2720.121847266826,
1454.3223948456648
],
[
2720.121847266826,
1452.6092152478227
],
[
2710.5679254269344,
1452.6092152478227
],
[
2721.1572741014097,
1430.1478385206133
],
[
2721.1572741014097,
1454.3223948456648
]
]
]
}
},
{
"type": "Feature",
"properties": {
"Layer": "0",
"SubClasses": "AcDbEntity:AcDbBlockReference",
"EntityHandle": "3CE",
"name": "",
"area": "44.79",
"type": "h"
},
"geometry": {
"type": "Polygon",
"coordinates": [
[
[
2710.723323781393,
1450.3320226620049
],
[
2720.0654518264787,
1450.3320226620049
],
[
2720.0654518264787,
1445.537183875705
],
[
2710.723323781393,
1445.537183875705
],
[
2710.723323781393,
1450.3320226620049
]
]
]
}
},
{
"type": "Feature",
"properties": {
"Layer": "0",
"SubClasses": "AcDbEntity:AcDbBlockReference",
"EntityHandle": "610",
"name": "706",
"area": "92.28",
"type": "o"
},
"geometry": {
"type": "Polygon",
"coordinates": [
[
[
2714.603212251531,
1462.7249212430308
],
[
2711.7289360797,
1462.7249212430308
],
[
2711.7289360797,
1464.852506681824
],
[
2705.7302059101926,
1460.6840827804538
],
[
2710.567925426934,
1454.3223948456637
],
[
2710.567925426934,
1453.838838298367
],
[
2714.603212251531,
1453.838838298367
],
[
2714.603212251531,
1462.7249212430308
]
]
]
}
}
]
}
UPDATE:
My solution so far, which seems to work:
import json
features = data["features"]
for i in features:
    d = i["properties"]
    if "name" not in d:
        d["name"] = ""
    if i["properties"]["Text"] is not None:
        del i["properties"]["Text"]
I defined it as a function, but in some cases I get the following error. Does anyone know how to fix it? Thanks.
Traceback (most recent call last):
File "<ipython-input-1-8e3095f67c57>", line 138, in <module>
modify_geojson(output_file)
File "<ipython-input-1-8e3095f67c57>", line 102, in modify_geojson
if i["properties"]["Text"] is not None:
KeyError: 'Text'
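The KeyError comes from indexing i["properties"]["Text"] on a feature that has no "Text" key at all; the lookup fails before the "is not None" comparison runs. A minimal sketch of one way around it, assuming the function receives the already-loaded dict (the signature and the trimmed sample data here are hypothetical, not the asker's real function):

```python
def modify_geojson(data):
    """Add an empty "name" where it is missing and drop "Text" from every feature."""
    for feature in data.get("features", []):
        props = feature.get("properties")
        if props is None:  # GeoJSON allows "properties": null
            props = feature["properties"] = {}
        props.setdefault("name", "")  # writes only when "name" is absent
        props.pop("Text", None)       # no KeyError when "Text" is absent
    return data


# Trimmed, hypothetical sample: the second feature has no "Text" key
sample = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature",
         "properties": {"EntityHandle": "3C8", "area": "141.81",
                        "Text": "area:141.81;type:p"},
         "geometry": None},
        {"type": "Feature",
         "properties": {"EntityHandle": "610", "name": "706"},
         "geometry": None},
    ],
}
modify_geojson(sample)
print([f["properties"] for f in sample["features"]])
```

Note that setdefault appends "name" after the existing keys rather than in the position shown in the expected result; key order carries no meaning in a JSON object, so this should not matter to GeoJSON readers.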

In each properties object, 'Text' appears only once. Can you explain where it is duplicated?


Related

Convert to DataFrame Pandas to json multiple level python

Is there any way to convert a pandas DataFrame into JSON with nested levels? I have this DataFrame:
df = pd.DataFrame([
{"type":"Feature","Id":319,"Departament":"1 DE MAYO","State":"CHACO","coordinates":[[[-58.95370956800002, -26.87059472200002]]]},
{"type":"Feature","Id":320,"Departament":"12 DE OCTUBRE","State":"CHACO","coordinates":[[[-58.95370956800002, -26.87059472200002]]]},
{"type":"Feature","Id":314,"Departament":"2 DE ABRIL","State":"CHACO","coordinates":[[[-58.95370956800002, -26.87059472200002]]]},
{"type":"Feature","Id":308,"Departament":"25 DE MAYO","State":"CHACO","coordinates":[[[-58.95370956800002, -26.87059472200002]]]},
{"type":"Feature","Id":100,"Departament":"25 DE MAYO","State":"CHACO","coordinates":[[[-58.95370956800002, -26.87059472200002]]]}])
I would like output like this:
{
"features": [
{
"type": "Feature",
"properties": {
"id": 319,
"Departament": "1 DE MAYO",
"State": "CHACO"
},
"geometry": {
"coordinates": [
[
[
-58.32487869300002,
-30.838373183999977
]
]
]
}
}
]
}
Thanks for your help; I hope I was clear.
You can use:
import json
def to_json(row):
    return {'type': row.iloc[0],
            'properties': row.iloc[1:-1].to_dict(),
            'geometry': row.iloc[-1]}
data = {'features': df.apply(to_json, axis=1).to_list()}
print(json.dumps(data, indent=2))
Output:
{
"features": [
{
"type": "Feature",
"properties": {
"Id": 319,
"Departament": "1 DE MAYO",
"State": "CHACO"
},
"geometry": [
[
[
-58.95370956800002,
-26.87059472200002
]
]
]
},
{
"type": "Feature",
"properties": {
"Id": 320,
"Departament": "12 DE OCTUBRE",
"State": "CHACO"
},
"geometry": [
[
[
-58.95370956800002,
-26.87059472200002
]
]
]
},
{
"type": "Feature",
"properties": {
"Id": 314,
"Departament": "2 DE ABRIL",
"State": "CHACO"
},
"geometry": [
[
[
-58.95370956800002,
-26.87059472200002
]
]
]
},
{
"type": "Feature",
"properties": {
"Id": 308,
"Departament": "25 DE MAYO",
"State": "CHACO"
},
"geometry": [
[
[
-58.95370956800002,
-26.87059472200002
]
]
]
},
{
"type": "Feature",
"properties": {
"Id": 100,
"Departament": "25 DE MAYO",
"State": "CHACO"
},
"geometry": [
[
[
-58.95370956800002,
-26.87059472200002
]
]
]
}
]
}
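The output above places the coordinate list directly under "geometry", while the question asked for it to be nested under a "coordinates" key. A rough sketch of that variant follows. To keep it self-contained it works on plain records (a list of dicts, such as df.to_dict("records") would give) instead of the DataFrame; record_to_feature is a hypothetical helper name, and only two of the question's rows are repeated here.

```python
import json


def record_to_feature(record):
    """Split one flat record into type / properties / geometry."""
    # Everything except "type" and "coordinates" is treated as a property
    props = {k: v for k, v in record.items() if k not in ("type", "coordinates")}
    return {
        "type": record["type"],
        "properties": props,
        "geometry": {"coordinates": record["coordinates"]},
    }


records = [
    {"type": "Feature", "Id": 319, "Departament": "1 DE MAYO", "State": "CHACO",
     "coordinates": [[[-58.95370956800002, -26.87059472200002]]]},
    {"type": "Feature", "Id": 320, "Departament": "12 DE OCTUBRE", "State": "CHACO",
     "coordinates": [[[-58.95370956800002, -26.87059472200002]]]},
]
data = {"features": [record_to_feature(r) for r in records]}
print(json.dumps(data, indent=2))
```

Selecting properties by key name rather than by column position means the result does not depend on the column order of the input.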
You can use the dataframes’ .to_json() method.

how to validate properties key in a geojson

I would like to know why the GeoJSON below is invalid. I tried to visualize it at
http://geojson.io
but nothing is displayed.
GeoJSON:
{
"type": "FeatureCollection",
"features": [
{
"type": "Feature",
"geometry": {
"type": "Polygon",
"coordinates": [
[
[[7.85563468082516,49.90287230375267],[7.855636808249913,49.902782379662085],[7.855776033932631,49.902783753651605],[7.855773906766568,49.902873677746555]]
]
]
},"properties": {"areaOfCoverage":"30"}},
}
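Try this. Your version never closes the "features" array, has a trailing comma after the feature, nests the ring one level too deep for a Polygon, and does not repeat the first point at the end of the ring: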
{
"type": "FeatureCollection",
"features": [
{
"type": "Feature",
"properties": {"areaOfCoverage": "30"},
"geometry": {
"type": "Polygon",
"coordinates": [
[
[
7.85563468082516,
49.90287230375267
],
[
7.855636808249913,
49.902782379662085
],
[
7.855776033932631,
49.902783753651605
],
[
7.85563468082516,
49.90287230375267
]
]
]
}
}
]
}
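Problems like these can also be caught in Python before pasting into a viewer. The following is only a sketch: polygon_problems is a hypothetical helper that checks nesting depth, ring length and ring closure for a Polygon, not a full GeoJSON validator, and the short strings are made-up stand-ins for the real file.

```python
import json


def polygon_problems(geometry):
    """Return a list of human-readable problems found in a Polygon geometry."""
    problems = []
    for i, ring in enumerate(geometry.get("coordinates", [])):
        # A ring is a list of [x, y] positions; a list of lists of positions
        # means there is one level of nesting too many.
        if not all(isinstance(p, list) and p and isinstance(p[0], (int, float))
                   for p in ring):
            problems.append(f"ring {i}: positions are nested too deeply")
        elif len(ring) < 4:
            problems.append(f"ring {i}: needs at least 4 positions")
        elif ring[0] != ring[-1]:
            problems.append(f"ring {i}: first and last positions differ")
    return problems


# Syntax errors (trailing comma, unclosed array) are reported by json.loads
broken_text = '{"type": "FeatureCollection", "features": [{"type": "Feature"},}'
try:
    json.loads(broken_text)
    syntax_ok = True
except json.JSONDecodeError as exc:
    syntax_ok = False
    print("not valid JSON:", exc)

too_deep = {"type": "Polygon", "coordinates": [[[[0, 0], [1, 0], [1, 1], [0, 1]]]]}
open_ring = {"type": "Polygon", "coordinates": [[[0, 0], [1, 0], [1, 1], [0, 1]]]}
closed = {"type": "Polygon", "coordinates": [[[0, 0], [1, 0], [1, 1], [0, 0]]]}
print(polygon_problems(too_deep))
print(polygon_problems(open_ring))
print(polygon_problems(closed))
```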

Remove object from JSON whose values are NaN using Python?

My final output JSON file is in following format
[
{
"Type": "UPDATE",
"resource": {
"site ": "Lakeside mh041",
"name": "Total Flow",
"unit": "CubicMeters",
"device": "2160 LaserFlow Module",
"data": [
{
"timestamp": [
"1087009200"
],
"value": [
6945.68
]
},
{
"timestamp": [
"1087095600"
],
"value": [
NaN
]
},
{
"timestamp": [
"1087182000"
],
"value": [
7091.62
]
},
I want to remove the whole object if the "value" is NaN.
Expected Output
[
{
"Type": "UPDATE",
"resource": {
"site ": "Lakeside mh041",
"name": "Total Flow",
"unit": "CubicMeters",
"device": "2160 LaserFlow Module",
"data": [
{
"timestamp": [
"1087009200"
],
"value": [
6945.68
]
},
{
"timestamp": [
"1087182000"
],
"value": [
7091.62
]
},
I cannot remove the blank values from my csv file because of the format of the file.
I have tried this:
with open('Result.json' , 'r') as j:
    json_dict = json.loads(j.read())
json_dict['data'] = [item for item in json_dict['data'] if
                     len([val for val in item['value'] if isnan(val)]) == 0]
print(json_dict)
Error:
json_dict['data'] = [item for item in json_dict['data'] if len([val for val in item['value'] if isnan(val)]) == 0]
TypeError: list indices must be integers or slices, not str
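The top level of your JSON is a list, so json_dict['data'] raises the TypeError; you have to loop over the list and go through "resource" to reach "data". In case "value": [...] can hold more than one number, then: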
import json
from math import isnan
json_str = '''
[
{
"Type": "UPDATE",
"resource": {
"site ": "Lakeside mh041",
"name": "Total Flow",
"unit": "CubicMeters",
"device": "2160 LaserFlow Module",
"data": [
{
"timestamp": [
"1087009200"
],
"value": [
6945.68
]
},
{
"timestamp": [
"1087095600"
],
"value": [
NaN
]
}
]
}
}
]
'''
json_dict = json.loads(json_str)
for typeObj in json_dict:
    resource_node = typeObj['resource']
    resource_node['data'] = [
        item for item in resource_node['data']
        if len([val for val in item['value'] if isnan(val)]) == 0
    ]
print(json_dict)
To test whether a value is NaN you can use the math.isnan() function:
data = '''{"data": [
{
"timestamp": [
"1058367600"
],
"value": [
9.65
]
},
{
"timestamp": [
"1058368500"
],
"value": [
NaN
]
},
{
"timestamp": [
"1058367600"
],
"value": [
4.75
]
}
]}'''
import json
from math import isnan
data = json.loads(data)
data['data'] = [i for i in data['data'] if not isnan(i['value'][0])]
print(json.dumps(data, indent=4))
Prints:
{
"data": [
{
"timestamp": [
"1058367600"
],
"value": [
9.65
]
},
{
"timestamp": [
"1058367600"
],
"value": [
4.75
]
}
]
}
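The two answers can be combined for the structure in the question (a top-level list whose entries keep their readings under "resource" and "data"). This is a sketch under that assumption; drop_nan_readings is a hypothetical name and the inline string is a shortened copy of the question's data. Python's json module accepts and emits the bare NaN token by default even though strict JSON has no such value, so the example passes allow_nan=False when writing, which makes json.dumps raise ValueError if a NaN were still present.

```python
import json
from math import isnan


def drop_nan_readings(records):
    """Remove every data item whose "value" list contains a NaN."""
    for record in records:
        resource = record["resource"]
        resource["data"] = [
            item for item in resource["data"]
            if not any(isinstance(v, float) and isnan(v) for v in item["value"])
        ]
    return records


raw = '''[{"Type": "UPDATE", "resource": {"name": "Total Flow", "data": [
    {"timestamp": ["1087009200"], "value": [6945.68]},
    {"timestamp": ["1087095600"], "value": [NaN]},
    {"timestamp": ["1087182000"], "value": [7091.62]}
]}}]'''
records = drop_nan_readings(json.loads(raw))
# allow_nan=False: fail loudly instead of writing a non-standard NaN token
clean_text = json.dumps(records, allow_nan=False, indent=2)
print(clean_text)
```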

Update values in geojson file in Python

I have geojson file as follows:
{
"type": "FeatureCollection",
"features": [
{
"type": "Feature",
"properties": {},
"geometry": {
"type": "LineString",
"coordinates": [
[
57.45849609375,
57.36801461845934
],
[
57.10693359375,
56.31044317134597
],
[
59.205322265625,
56.20059291588374
],
[
59.4140625,
57.29091812634045
],
[
57.55737304687501,
57.36801461845934
]
]
}
},
{
"type": "Feature",
"properties": {},
"geometry": {
"type": "LineString",
"coordinates": [
[
59.40307617187499,
57.29685437021898
],
[
60.8203125,
57.314657355733274
],
[
60.74340820312499,
56.26776108757582
],
[
59.227294921875,
56.21281407174654
],
[
59.447021484375,
57.29091812634045
]
]
}
}
]
}
I want to replace LineString in "type": "LineString" with Polygon, and also replace the last point of each linestring with its first point to close it, if it has more than 3 points.
How can I do this in Python with geopandas or pandas? Thanks.
Here is the expected output:
{
"type": "FeatureCollection",
"features": [
{
"type": "Feature",
"properties": {},
"geometry": {
"type": "Polygon",
"coordinates": [
[
57.45849609375,
57.36801461845934
],
[
57.10693359375,
56.31044317134597
],
[
59.205322265625,
56.20059291588374
],
[
59.4140625,
57.29091812634045
],
[
57.45849609375,
57.36801461845934
]
]
}
},
{
"type": "Feature",
"properties": {},
"geometry": {
"type": "Polygon",
"coordinates": [
[
59.40307617187499,
57.29685437021898
],
[
60.8203125,
57.314657355733274
],
[
60.74340820312499,
56.26776108757582
],
[
59.227294921875,
56.21281407174654
],
[
59.40307617187499,
57.29685437021898
]
]
}
}
]
}
Script to get type and coordinates of first LineString:
import json
from pprint import pprint
with open('data.geojson') as f:
    data = json.load(f)
pprint(data)
data["features"][0]["geometry"]['type']
data["features"][0]["geometry"]['coordinates']
You can achieve that with the json module:
import json
file_line = 'file.json'
file_poly = 'file_poly.json'
with open(file_line, 'r') as f:
    data = json.load(f)
for feature in data['features']:
    geometry = feature['geometry']
    coords = geometry['coordinates']
    # only convert lines with more than 3 points, as asked in the question
    if geometry['type'] == 'LineString' and len(coords) > 3:
        geometry['type'] = 'Polygon'
        # overwrite the last point with a copy of the first to close the line
        coords[-1] = list(coords[0])
with open(file_poly, 'w') as f:
    json.dump(data, f, indent=2)
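One caveat about the expected output in the question: it keeps the flat list of points under "coordinates", but in the GeoJSON specification (RFC 7946) a Polygon's coordinates are a list of rings, so the closed line has to be wrapped in one more list for most readers to accept it. A small sketch of that variant, where linestring_to_polygon is a hypothetical helper and the sample is the first line from the question:

```python
import copy


def linestring_to_polygon(geometry):
    """Return a Polygon geometry built from a LineString with more than 3 points."""
    coords = geometry["coordinates"]
    if geometry["type"] != "LineString" or len(coords) <= 3:
        return geometry  # leave anything else untouched
    ring = copy.deepcopy(coords)   # do not modify the input geometry
    ring[-1] = list(ring[0])       # snap the last point onto the first
    return {"type": "Polygon", "coordinates": [ring]}  # a single exterior ring


line = {
    "type": "LineString",
    "coordinates": [
        [57.45849609375, 57.36801461845934],
        [57.10693359375, 56.31044317134597],
        [59.205322265625, 56.20059291588374],
        [59.4140625, 57.29091812634045],
        [57.55737304687501, 57.36801461845934],
    ],
}
polygon = linestring_to_polygon(line)
print(polygon["type"], len(polygon["coordinates"][0]))
```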

Python Remove element from json if value exists

I have a rather large geojson file which is converted from some National Weather Service data. I've trimmed it down to this sample here:
{
"properties": {
"name": "day1otlk"
},
"type": "FeatureCollection",
"features": [
{
"geometry": {
"type": "Polygon",
"coordinates": [
[
[
-122.71424459627099,
40.229695635383166
],
[
-122.62484780364827,
40.53410620541074
],
[
-122.71424459627099,
40.229695635383166
]
]
]
},
"properties": {
"Name": "General Thunder",
"stroke-opacity": 1,
"stroke-width": 4,
"name": "General Thunder",
"fill": "#c0e8c0",
"fill-opacity": 0.75,
"stroke": "#ffffff",
"timeSpan": {
"end": "2017-03-30T12:00:00Z",
"begin": "2017-03-29T20:00:00Z"
}
},
"type": "Feature"
},
{
"geometry": {
"type": "Polygon",
"coordinates": [
[
[
-108.65861565996833,
32.91391108773154
],
[
-108.63932601964923,
32.95521185698698
],
[
-108.65861565996833,
32.91391108773154
]
]
]
},
"properties": {
"Name": "General Thunder",
"stroke-opacity": 1,
"stroke-width": 4,
"name": "General Thunder",
"fill": "#c0e8c0",
"fill-opacity": 0.75,
"stroke": "#ffffff",
"timeSpan": {
"end": "2017-03-30T12:00:00Z",
"begin": "2017-03-29T20:00:00Z"
}
},
"type": "Feature"
},
{
"geometry": {
"type": "Polygon",
"coordinates": [
[
[
-92.67280213157608,
38.47870651780003
],
[
-92.62448390998837,
38.45534960370862
],
[
-92.59475154780039,
38.493327413824595
],
[
-92.64308574626148,
38.51669676139087
],
[
-92.67280213157608,
38.47870651780003
]
]
]
},
"properties": {
"Name": "10 %",
"stroke-opacity": 1,
"stroke-width": 4,
"name": "10 %",
"fill": "#8b4726",
"fill-opacity": 0.89,
"stroke": "#ffffff",
"timeSpan": {
"end": "2017-03-30T12:00:00Z",
"begin": "2017-03-29T20:00:00Z"
}
},
"type": "Feature"
},
{
"geometry": {
"type": "Polygon",
"coordinates": [
[
[
-92.67280213157608,
38.47870651780003
],
[
-92.62448390998837,
38.45534960370862
],
[
-92.59475154780039,
38.493327413824595
],
[
-92.64308574626148,
38.51669676139087
],
[
-92.67280213157608,
38.47870651780003
]
]
]
},
"properties": {
"Name": "10 %",
"stroke-opacity": 1,
"stroke-width": 4,
"name": "20 %",
"fill": "#8b4726",
"fill-opacity": 0.89,
"stroke": "#ffffff",
"timeSpan": {
"end": "2017-03-30T12:00:00Z",
"begin": "2017-03-29T20:00:00Z"
}
},
"type": "Feature"
},
{
"geometry": {
"type": "Polygon",
"coordinates": [
[
[
-97.09845994557838,
38.43843745045377
],
[
-97.07114801649661,
38.47751978088534
],
[
-97.09845994557838,
38.43843745045377
]
]
]
},
"properties": {
"Name": "5 %",
"stroke-opacity": 1,
"stroke-width": 4,
"name": "5 %",
"fill": "#b47f00",
"fill-opacity": 0.89,
"stroke": "#ffffff",
"timeSpan": {
"end": "2017-03-30T12:00:00Z",
"begin": "2017-03-29T20:00:00Z"
}
},
"type": "Feature"
}
]
}
I'm looking to remove the elements where name has a % in it. I don't want those coordinates or anything included.
Here's my code:
import json
with open('test.geojson') as data_file:
    data = json.load(data_file)
for element in data["features"]:
    if '%' in element["properties"]["name"]:
        del element["type"]
        del element["properties"] # Deletes the properties
        del element["geometry"] # Deletes the coords
with open('test_output.geojson', 'w') as data_file:
    data = json.dump(data, data_file)
This works well enough to remove the element's sub keys, but I'm left with output that looks like:
{}, {}, {}, {}, {}, {}, {}, {}, {}, {}, {}
I've also tried to use
for element in data["features"]:
    if '%' in element["properties"]["name"]:
        data["features"].remove(element)
but that seems to delete only the last element in the sample group, which is the 5 % group. It should be removing the 10 %, 20 % and the 5 % groups.
Is there a way to remove the element from data["features"] altogether if name has a % in it, so I'm left with clean JSON output? In this sample data, the only features I should have left are the General Thunder ones, with no empty brackets.
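Use a simple filter (wrapped in list(), since in Python 3 filter returns an iterator, which json.dump cannot serialize):
no_percent = lambda feature: '%' not in feature['properties']['name']
data['features'] = list(filter(no_percent, data['features']))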
Or as a list comprehension:
data['features'] = [feature for feature in data['features']
if '%' not in feature['properties']['name']]
The issue with using del element["type"], del element["properties"] and del element["geometry"] is that it only removes those keys from the element's dict, not the element itself from data["features"], so an empty {} is left behind.
As for your second attempt, it's not safe to modify a list while iterating over it with for element in data["features"]: (which is what data["features"].remove(element) does). After a removal the remaining items shift left by one while the loop position still advances, so the item right after each removed one is skipped. Also, list.remove() removes the first item that compares equal to its argument, which is not necessarily that exact object.
It's better to create a new list and then assign that. What you could do is:
new_features = []
for element in data["features"]:
    if '%' not in element["properties"]["name"]: # note the 'not'
        new_features.append(element) # new_features has the ones you want
# and then re-assign features to the list with the elements you want
data["features"] = new_features
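The skipping effect is easy to reproduce on a plain list of names. This is just an illustration with made-up values standing in for the features, not the asker's data:

```python
names = ["General Thunder", "10 %", "20 %", "5 %"]

# Buggy: removing from the list being iterated skips the item after each removal
buggy = list(names)
for name in buggy:
    if "%" in name:
        buggy.remove(name)

# Safe: build a new list instead of mutating the one being iterated
safe = [name for name in names if "%" not in name]

print(buggy)  # "20 %" survives because the loop stepped over it
print(safe)
```

Here "10 %" is removed at position 1, "20 %" slides into that position, and the loop moves on to position 2, so "20 %" is never examined.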
