Strange Plotly behaviour with Choropleth Mapbox - python

I want to create a choropleth map out of a GeoJSON file that looks like this:
{"type": "FeatureCollection", "features": [
{'type': 'Feature', 'geometry': {'type': 'MultiPolygon', 'coordinates': [[[[... , ...] ... [..., ...]]]]}, 'properties': {'id': 'A'},
{'type': 'Feature', 'geometry': {'type': 'MultiPolygon', 'coordinates': [[[[... , ...] ... [..., ...]]]]}, 'properties': {'id': 'B'},
with each id property being different for each feature.
I mapped each feature (by its id property) to its particular region as follows:
regions = {
'B': 'BEACH',
and then created a DataFrame to store each id and each region:
ids = []
for feature in geojson['features']:
df = pd.DataFrame(ids, columns=['id'])
df['region'] = df['id'].map(regions)
That returns a DataFrame like this:
id region
I then tried to create a choropleth map with that info:
fig = px.choropleth_mapbox(df, geojson=geojson, color="region",
locations="id", featureidkey="",
center={"lat": -9.893, "lon": -50.423},
mapbox_style="white-bg", zoom=9)
However, the call runs for an excessively long time and then crashes about a minute later, without any error message.
I wanted to check if there was something wrong with the GeoJSON file and/or with the mapping, so I assigned random numeric data to each id in df, by:
df['random_number'] = np.random.randint(0,100,size=len(df))
and re-tried the map with the following code:
fig = px.choropleth_mapbox(df, geojson=geojson, color="random_number",
locations="id", featureidkey="",
center={"lat": -9.893, "lon": -50.423},
mapbox_style="white-bg", zoom=9)
and it worked, so I am guessing there is some kind of trouble with the non-numeric values in the region column of df, which are not being properly passed to the choropleth map.
Any advice, help or solution will be much appreciated!


mapping some key-value pairs from nested json to new columns in Pandas dataframe

I spent a few hours searching for hints on how to do this, and tried a bunch of things (see below). I'm not getting anywhere, so I finally decided to post a new question.
I have a nested JSON with a dictionary data structure, like this:
for k, v in d.items():
    print(f'{k} = {v}')
First two keys:
obj1 = {'color': 'red', 'size': 'large', 'description': 'a large red ball'}
obj2 = {'color': 'blue', 'size': 'small', 'description': 'a small blue ball'}
Side question: is this actually a nested json? Each key (obj1, obj2) has a set of keys, so I think so but I'm not sure.
I then have a dataframe like this:
key id_num source
obj1 143 loc1
obj2 139 loc1
I want to map only 'size' and 'description' from my json dictionary to this dataframe, by key. And I want to do that efficiently and readably. I also want it to be robust to the presence of the key, so that if a key doesn't exist in the JSON dict, it just prints "NA" or something.
Things I've tried that got me closest (I tried to map one column at a time, and both at same time):
df['description'] = df['key'].map(d['description'])
df['description'] = df['key'].map(lambda x: d[x]['description'])
df2 = df.join(pd.DataFrame.from_dict(d, orient='index', columns=['size','description']), on='key')
It's obvious why the first one doesn't work: it raises KeyError: 'description', as expected. The second one would work, I think, but there is a key in my dataframe that doesn't exist in my JSON dict, so it raises KeyError: 'obj42' (an object in my df but not in d). The third one works, but requires creating a new dataframe, which I'd like to avoid.
How can I make Solution #2 robust to missing keys? Also, is there a way to assign both columns at the same time without creating a new df? I found a way to assign all values in the dict here, but that's not what I want. I only want a couple.
There's always a possibility that my search keywords were not quite right, so if a post exists that answers my question please do let me know and I can delete this one.
One way to go, based on your second attempt, would be as follows:
import pandas as pd
import numpy as np
d = {'obj1': {'color': 'red', 'size': 'large', 'description': 'a large red ball'},
'obj2': {'color': 'blue', 'size': 'small', 'description': 'a small blue ball'}}
# just adding `obj3` here to supply a `KeyError`
data = {'key': {0: 'obj1', 1: 'obj2', 2: 'obj3'},
'id_num': {0: 143, 1: 139, 2: 140},
'source': {0: 'loc1', 1: 'loc1', 2: 'loc1'}}
df = pd.DataFrame(data)
df[['size','description']] = df['key'].map(lambda x: [d[x]['size'], d[x]['description']] if x in d else [np.nan]*2).tolist()
key id_num source size description
0 obj1 143 loc1 large a large red ball
1 obj2 139 loc1 small a small blue ball
2 obj3 140 loc1 NaN NaN
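A variation on the same idea is to let dict.get supply the fallback, so that a missing outer key and a missing inner field are handled in one place. The following is only a minimal sketch in plain Python: pick_fields is a hypothetical helper name, and the "NA" default is an assumption taken from the wording of the question.

```python
def pick_fields(d, key, fields, default="NA"):
    """Return the requested fields for `key`, falling back to `default`.

    `pick_fields` is a hypothetical helper, not part of pandas.
    """
    record = d.get(key, {})  # empty dict when the outer key is absent
    return [record.get(field, default) for field in fields]


d = {
    "obj1": {"color": "red", "size": "large", "description": "a large red ball"},
    "obj2": {"color": "blue", "size": "small", "description": "a small blue ball"},
}

# "obj42" is deliberately absent from d
keys = ["obj1", "obj2", "obj42"]
rows = [pick_fields(d, k, ["size", "description"]) for k in keys]
```

The same helper could presumably be passed to df['key'].map(...) in place of the lambda above, followed by .tolist() as before.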
You can create a dataframe from the dictionary and then do .merge:
df = df.merge(
    pd.DataFrame(d.values(), index=d.keys())[["size", "description"]],
    left_on="key", right_index=True, how="left",
)
key id_num source size description
0 obj1 143 loc1 large a large red ball
1 obj2 139 loc1 small a small blue ball
2 obj3 140 loc1 NaN NaN
Data used:
d = {
    "obj1": {
        "color": "red",
        "size": "large",
        "description": "a large red ball",
    },
    "obj2": {
        "color": "blue",
        "size": "small",
        "description": "a small blue ball",
    },
}
data = {
    "key": {0: "obj1", 1: "obj2", 2: "obj3"},
    "id_num": {0: 143, 1: 139, 2: 140},
    "source": {0: "loc1", 1: "loc1", 2: "loc1"},
}
df = pd.DataFrame(data)

The fastest way to search for all the lines that don't have a matching ID string in pandas

I have a large dataset of points that I am trying to draw as lines using GeoJSON. A line needs at least 2 points to be drawn correctly. However, some points in my dataset have no matching ID, i.e., they cannot form a line, because I group the points by their ID, which is the last value in each row (WAYID). The error I get says: LineStrings must have at least 2 coordinate tuples
This is the dataset sample
data = '''lat=1.3240787,long=103.93576,102677,130828
This is the code I am using:
import pandas as pd
import geopandas as gpd
from shapely.geometry import LineString
import io
col = ['lat','long','pointID','WAYID']
#load csv as dataframe (replace io.StringIO(data) with the csv filename), use converters to clean up lat and long columns upon loading
df = pd.read_csv(io.StringIO(data), names=col, sep=',', engine='python', converters={'lat': lambda x: float(x.split('=')[1]), 'long': lambda x: float(x.split('=')[1])})
#input the data from the text file
#df = pd.read_csv("latlongWayID.txt", names=col, sep=',', engine='python', converters={'lat': lambda x: float(x.split('=')[1]), 'long': lambda x: float(x.split('=')[1])})
#load dataframe as geodataframe
gdf = gpd.GeoDataFrame(df, geometry=gpd.points_from_xy(df.long, df.lat))
#group by WAYID, converting each group's point geometries to a LineString
#gdf = gdf.groupby(['description'])['geometry'].apply(lambda p: LineString(zip(p.x, p.y)) if len(p) > 1 else Point(p.x, p.y))
gdf = gdf.groupby(['WAYID'])['geometry'].apply(lambda x: LineString(x.tolist())).reset_index()
jsonLoad = gdf.to_json()
Then save to a file using
import json
from geojson import Point, Feature, dump
#save the data to the file
parsed = json.loads(jsonLoad)
print(json.dumps(parsed, indent=4, sort_keys=True))
#parsed = gdf.to_json()
with open('savedMyfile.geojson', 'w') as f:
    dump(parsed, f, indent=1)
Is there a way to check through the large file and quickly exclude all those that don't have the matching ID? I wouldn't mind converting those not-matching coords into a 'Point' type and those with pairs kept as LineString using the code above.
Could someone advise on how I should go about doing it?
Thanks in advance!
This is a simple case of filtering in pandas before generating the geopandas GeoDataFrame.
(df.groupby("WAYID").size() >= 2).loc[lambda s: s].index gives the WAYID values that have at least 2 associated rows.
It is then simple to build a filter for df from that:
import pandas as pd
import geopandas as gpd
from shapely.geometry import LineString
import io
col = ["lat", "long", "pointID", "WAYID"]
df = pd.read_csv(
    io.StringIO(data), names=col, sep=",", engine="python",
    converters={"lat": lambda x: float(x.split("=")[1]),
                "long": lambda x: float(x.split("=")[1])},
)
# filter dataframe so that remaining WAYID have at least 2 co-ordinates
df = df.loc[df["WAYID"].isin((df.groupby("WAYID").size() >= 2).loc[lambda s: s].index)]
gdf = gpd.GeoDataFrame(df, geometry=gpd.points_from_xy(df.long, df.lat))
gdf = gdf.groupby(["WAYID"], as_index=False)["geometry"].apply(
    lambda x: LineString(x.tolist())
)
# check generated geojson...
{'type': 'FeatureCollection',
'features': [{'id': '0',
'type': 'Feature',
'properties': {'WAYID': 131521},
'geometry': {'type': 'LineString',
'coordinates': ((103.9371845, 1.3219643), (103.9371391, 1.3222227))},
'bbox': (103.9371391, 1.3219643, 103.9371845, 1.3222227)},
{'id': '1',
'type': 'Feature',
'properties': {'WAYID': 190573},
'geometry': {'type': 'LineString',
'coordinates': ((103.9350341, 1.3218434), (103.9351205, 1.3213905))},
'bbox': (103.9350341, 1.3213905, 103.9351205, 1.3218434)},
{'id': '2',
'type': 'Feature',
'properties': {'WAYID': 190576},
'geometry': {'type': 'LineString',
'coordinates': ((103.9351812, 1.3214117),
(103.9352676, 1.3215218),
(103.9351328, 1.3218405))},
'bbox': (103.9351328, 1.3214117, 103.9352676, 1.3218405)},
{'id': '3',
'type': 'Feature',
'properties': {'WAYID': 885346352},
'geometry': {'type': 'LineString',
'coordinates': ((103.9327891, 1.3202224),
(103.932483, 1.3217119),
(103.9322832, 1.3226904),
(103.9322084, 1.3231835))},
'bbox': (103.9322084, 1.3202224, 103.9327891, 1.3231835)}],
'bbox': (103.9322084, 1.3202224, 103.9371845, 1.3231835)}
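The question also mentions keeping the unmatched coordinates as 'Point' geometries instead of dropping them. A rough sketch of that alternative in plain Python is shown here, without pandas or shapely, building GeoJSON-style geometry dicts by hand. The function name ways_to_geometries and the point IDs 1 and 2 are made up for illustration; the coordinates are taken from the sample row and the output above.

```python
from collections import defaultdict


def ways_to_geometries(rows):
    """Group (lat, long, point_id, way_id) rows by way_id.

    Returns a dict mapping way_id to a GeoJSON-style geometry dict:
    a LineString for 2 or more points, a Point for a single point.
    """
    grouped = defaultdict(list)
    for lat, lon, _point_id, way_id in rows:
        # GeoJSON coordinate order is (longitude, latitude)
        grouped[way_id].append((lon, lat))

    geometries = {}
    for way_id, coords in grouped.items():
        if len(coords) >= 2:
            geometries[way_id] = {"type": "LineString", "coordinates": coords}
        else:
            geometries[way_id] = {"type": "Point", "coordinates": coords[0]}
    return geometries


rows = [
    (1.3219643, 103.9371845, 1, 131521),    # point IDs 1 and 2 are illustrative
    (1.3222227, 103.9371391, 2, 131521),
    (1.3240787, 103.93576, 102677, 130828),  # a way with a single point
]
geometries = ways_to_geometries(rows)
```

Whether a mixed Point/LineString output is acceptable depends on what consumes the GeoJSON file, so filtering as shown above may still be the simpler choice.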

Add multiple markers on each coordinates in flask-googlemaps

I am building a simple car rental web project using Flask, but ran into an issue adding multiple markers at coordinates in flask-googlemaps. I tried to follow the tutorial.
Below is my code for adding multiple coordinates to the Google map:
catdatas = CarsDataset.query.all()
locations = [d.serializer() for d in catdatas]
carmap = Map(
markers=[(loc['lat'], loc['lng']) for loc in locations]
Each coordinate is added successfully, but I don't know how to add multiple markers on them. Thanks in advance!
According to the Flask-GoogleMaps docs, the "markers" key accepts a list of dictionaries (objects). Example:
markers=[
    {
        'icon': '',
        'lat': 37.4419,
        'lng': -122.1419,
        'infobox': "<b>Hello World</b>"
    },
    {
        'icon': '',
        'lat': 37.4300,
        'lng': -122.1400,
        'infobox': "<b>Hello World from other place</b>"
    }
]
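Applied to the locations list from the question, the marker dictionaries could be built with a comprehension. This is a minimal sketch only: build_markers is a hypothetical helper, the 'name' field is an assumption (the question only shows 'lat' and 'lng'), and the icon value is left empty as in the example above.

```python
def build_markers(locations, icon=""):
    """Turn location dicts into marker dicts with icon, lat, lng and infobox keys.

    `build_markers` and the 'name' field are assumptions made for illustration.
    """
    return [
        {
            "icon": icon,
            "lat": loc["lat"],
            "lng": loc["lng"],
            "infobox": "<b>{}</b>".format(loc.get("name", "")),
        }
        for loc in locations
    ]


# stand-in for [d.serializer() for d in catdatas]
locations = [
    {"name": "Car A", "lat": 37.4419, "lng": -122.1419},
    {"name": "Car B", "lat": 37.4300, "lng": -122.1400},
]
markers = build_markers(locations)
```

The resulting list would then be passed as markers=markers when constructing the Map.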

xlsxwriter chart create dynamic rows

I'm trying to create charts with xlsxwriter python module.
It works fine, but I would like to not have to hard code the row amount
This example will chart 30 rows.
'name': 'SNR of old AP',
'values': '=Depart!$D$2:$D$30',
'marker': {'type': 'circle'},
'data_labels': {'value': True,'num_format':'#,##0'},
For 'values', I would like the row count to be dynamic. How do I do this?
XlsxWriter supports a list syntax in add_series() for this exact case. So your example could be written as:
'name': 'SNR of old AP',
'values': ['Depart', 1, 3, 29, 3],
'marker': {'type': 'circle'},
'data_labels': {'value': True, 'num_format':'#,##0'},
And then you can set any of the first_row, first_col, last_row, last_col parameters programmatically.
See the docs for add_series().
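To make the row count dynamic, the list can be computed from the number of data rows. The helper below is a small sketch with a hypothetical name (series_values); it assumes a single header row, so the data starts at zero-indexed row 1, and that the column index is also zero-indexed (3 for column D), matching the example above.

```python
def series_values(sheet_name, col, n_rows, first_row=1):
    """Build the [sheetname, first_row, first_col, last_row, last_col] list.

    Rows and columns are zero-indexed; `n_rows` is the number of data rows.
    `series_values` is a hypothetical helper, not part of XlsxWriter.
    """
    last_row = first_row + n_rows - 1
    return [sheet_name, first_row, col, last_row, col]


# 29 data rows in column D reproduce the hard-coded $D$2:$D$30 range
values = series_values("Depart", 3, 29)
```

The result can be used as the 'values' entry of the add_series() dictionary, with n_rows taken from, for example, the length of the data that was written to the sheet.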

Pandas DataFrame.apply: create new column with data from two columns

I have a DataFrame (df) like this:
PointID Time geojson
---- ---- ----
36F 2016-04-01T03:52:30 {'type': 'Point', 'coordinates': [3.961389, 43.123]}
36G 2016-04-01T03:52:50 {'type': 'Point', 'coordinates': [3.543234, 43.789]}
The geojson column contains data in GeoJSON format (essentially, a Python dict).
I want to create a new column in geoJSON format, which includes the time coordinate. In other words, I want to inject the time information into the geoJSON info.
For a single value, I can successfully do:
oldjson = df.iloc[0]['geojson']
newjson = [oldjson['coordinates'][0], oldjson['coordinates'][1], df.iloc[0]['Time']]
For a single parameter, I successfully used DataFrame.apply in combination with a lambda (thanks to a related SO question).
But now, I have two parameters, and I want to use it on the whole DataFrame. As I am not confident with the .apply syntax and lambda, I do not know if this is even possible. I would like to do something like this:
def inject_time(geojson, time):
    """Injects Time dimension into geoJSON coordinates. Expects a dict in geojson POINT format."""
    geojson['coordinates'] = [geojson['coordinates'][0], geojson['coordinates'][1], time]
    return geojson

df["newcolumn"] = df["geojson"].apply(lambda x: inject_time(x, df['Time']))
...but that does not work, because the function would inject the whole series.
I figured that the format of the timestamped geoJSON should be something like this:
{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "geometry": {
        "type": "LineString",
        "coordinates": [[-70,-25],[-70,35],[70,35]]
      },
      "properties": {
        "times": [1435708800000, 1435795200000, 1435881600000]
      }
    }
  ]
}
So the time element is in the properties element, but this does not change the problem much.
You need DataFrame.apply with axis=1 for processing by rows:
df['new'] = df.apply(lambda x: inject_time(x['geojson'], x['Time']), axis=1)
#temporary display long string in column
with pd.option_context('display.max_colwidth', 100):
    print(df['new'])
0 {'type': 'Point', 'coordinates': [3.961389, 43.123, '2016-04-01T03:52:30']}
1 {'type': 'Point', 'coordinates': [3.543234, 43.789, '2016-04-01T03:52:50']}
Name: new, dtype: object
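One thing to be aware of: inject_time as written in the question assigns into the dict it receives and returns that same dict, so the original geojson column ends up holding the modified dicts as well. If that is not wanted, a copying variant could look like the sketch below. The name inject_time_copy is hypothetical, and the example calls it on a single dict rather than through apply.

```python
import copy


def inject_time_copy(geojson, time):
    """Return a new Point dict with `time` appended to the coordinates.

    Unlike the in-place version, the dict passed in is left unchanged.
    """
    new_geojson = copy.deepcopy(geojson)  # work on a copy, not the caller's dict
    lon, lat = geojson["coordinates"][:2]
    new_geojson["coordinates"] = [lon, lat, time]
    return new_geojson


original = {"type": "Point", "coordinates": [3.961389, 43.123]}
stamped = inject_time_copy(original, "2016-04-01T03:52:30")
```

It can be dropped into the same apply(..., axis=1) call in place of inject_time.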

