Python plotly choropleth does not work with geoJSONs - python

I am trying to use plotly choropleth to draw a map showing, let's say, a random variable num for each feature region of Italy. However, it does not work.
I have downloaded the GeoJSON files for Italy from here. Below is the code that I use:
import random
import pandas as pd
import plotly.express as px
import plotly.io as pio
import json
pio.renderers.default='browser'
with open('it-all.geo.json') as f:
    geojson = json.load(f)
n_provinces = len(geojson['features'])
province_names = [geojson['features'][k]['properties']['name'] for k in range(n_provinces)]
# one random value per feature, so both DataFrame columns have the same length
randomlist = [random.randint(1, 30) for _ in range(n_provinces)]
datadata = pd.DataFrame({'province':province_names, 'num':randomlist})
fig = px.choropleth(datadata, geojson=geojson, color="num",
                    locations="province", featureidkey="properties.name",
                    color_continuous_scale="Viridis")
fig.show()
What I am getting is a mixed-shape map, as shown below. Can anyone please tell me what I am doing wrong? Thanks!

I tried doing the same thing with data from my country and had the same issues. I think that this data might not be readable by plotly. If you look at the website's demos for their maps, there are several JavaScript scripts running in order to create the maps. It's possible that they've put their GeoJSON into a custom format so that you have to use their JavaScript services in order to create a comprehensible map.
I later found a different set of data and was able to easily create a choropleth map with plotly, using the exact same code that didn't work with the original data. Hopefully you found a different dataset that you could use. Governments often provide open data about census areas, province/state borders, etc.
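One way to test the suspicion above before hunting for another dataset is to look at the coordinate ranges in the file. Plotly's geo maps expect GeoJSON positions as longitude/latitude in degrees, so values far outside those ranges suggest a projected or otherwise custom coordinate system. This is only a minimal sketch: the function names and the two sample collections are made up for illustration, nothing here confirms what the original file actually contains, and GeometryCollection features are skipped.

```python
def coordinate_bounds(geojson):
    """Return (min_x, min_y, max_x, max_y) over all positions in a FeatureCollection."""
    xs, ys = [], []

    def walk(node):
        # A position is a list whose first element is a number: [x, y, ...].
        if node and isinstance(node[0], (int, float)):
            xs.append(node[0])
            ys.append(node[1])
        else:
            for child in node:
                walk(child)

    for feature in geojson["features"]:
        geometry = feature.get("geometry")
        # Geometries without a "coordinates" key (e.g. GeometryCollection) are skipped.
        if geometry and "coordinates" in geometry:
            walk(geometry["coordinates"])
    if not xs:
        raise ValueError("no coordinates found")
    return min(xs), min(ys), max(xs), max(ys)


def looks_like_lon_lat(geojson):
    """True if every position fits inside the longitude/latitude ranges."""
    min_x, min_y, max_x, max_y = coordinate_bounds(geojson)
    return -180 <= min_x and max_x <= 180 and -90 <= min_y and max_y <= 90


# Hypothetical sample data: a small square in degrees, and one in projected units.
lon_lat_sample = {"features": [{"geometry": {"type": "Polygon", "coordinates": [
    [[12.4, 41.8], [12.6, 41.8], [12.6, 42.0], [12.4, 42.0], [12.4, 41.8]]]}}]}
projected_sample = {"features": [{"geometry": {"type": "Polygon", "coordinates": [
    [[4200, 7100], [4300, 7100], [4300, 7200], [4200, 7100]]]}}]}

print(looks_like_lon_lat(lon_lat_sample))
print(looks_like_lon_lat(projected_sample))
```

With a real file you would pass the dictionary returned by json.load instead of the samples. A file that passes this check can still fail for other reasons, so treat a True result as a hint rather than proof.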

Related

Can mark_geoshape () be used for Canadian Provinces/cities?

I'm looking to somehow figure out a way to insert a geographic graph of British Columbia which is a part of Canada in my data analysis.
I have made this image here explaining what tree is being planted the most in Vancouver
Now I want to make a geographic chart similar to this one: https://altair-viz.github.io/gallery/airports_count.html
It should answer the question: how does the density/distribution of planted species differ between neighbourhoods?
This is what I'm having trouble with.
So far I have:
import altair as alt
from vega_datasets import data
world_map = alt.topo_feature(data.world_110m.url, 'countries')
alt.Chart(world_map).mark_geoshape().project()
and it's giving me a world map! Great! I tried zooming into just British Columbia but it's not really working out.
Can anyone give me any direction on where to go and how I should go about answering my question? I really wanted to use geoshape
I also found this if it's helpful
https://global.mapit.mysociety.org/area/960958.html
Thank you, and I appreciate everyone's advice!
Canadian provinces are not part of the world_110m map used in the example gallery. You would need to provide your own GeoJSON or TopoJSON file that contains that information in order to work with Altair, and then follow the guidelines in "How can I make a map using GeoJSON data in Altair?".
You can also work with geopandas together with Altair, which in many ways is more flexible. We are working on integrating info on this into the docs, but in the meantime you can view this preview version to get started https://deploy-preview-1--spontaneous-sorbet-49ed10.netlify.app/user_guide/marks/geoshape.html
Looks like you got your data from here
import pandas as pd
import numpy as np
import plotly.express as px
# loading data
df = pd.read_csv('street-trees.csv', sep=';')
# extracting coords (raw string so the regex backslashes are not treated as escapes)
df['coords'] = df['Geom'].str.extract(r'\[(.*?)\]')
df['lon'] = df['coords'].str.split(',').str[0].astype(float)
df['lat'] = df['coords'].str.split(',').str[1].astype(float)
# getting neighbourhood totals; naming the count column keeps it stable across pandas versions
counts = df[['NEIGHBOURHOOD_NAME']].value_counts().reset_index(name='count')
centres = df[['NEIGHBOURHOOD_NAME', 'lon', 'lat']].groupby('NEIGHBOURHOOD_NAME').mean().reset_index()
df2 = pd.merge(counts, centres)
# drawing figure
fig = px.scatter_mapbox(df2,
                        lat='lat',
                        lon='lon',
                        color='count',
                        opacity=0.5,
                        center=dict(lon=df2['lon'].mean(),
                                    lat=df2['lat'].mean()),
                        zoom=11,
                        size='count')
fig.update_layout(mapbox_style='open-street-map')
fig.show()
I am definitely not an expert but using Joel's advice ... you can download a geojson from here:
https://data.opendatasoft.com/explore/dataset/georef-canada-province%40public/export/?disjunctive.prov_name_en
Because I downloaded it, I had to open the file rather than reference a URL as most of the examples do:
import altair as alt
import geojson  # the geojson package; the standard-library json.load should also work here
can_prov_file = 'C:/PyProjects/georef-canada-province.geojson'
with open(can_prov_file) as f:
    var_geojson = geojson.load(f)
data_geojson = alt.InlineData(values=var_geojson, format=alt.DataFormat(property='features',type='json'))
# chart object
provinces = alt.Chart(data_geojson).mark_geoshape(
).encode(
    color="properties.prov_name_en:O"
).project(
    type='identity', reflectY=True
)
Worked for me. Best of luck.

How to plot addresses (Lat/Long) from a csv on JSON map.?

I am trying to do something that seems relatively simple but is proving incredibly difficult. I have a .csv file with addresses and their corresponding latitude/longitude, and I just want to plot those in Python on a California JSON map like this one:
https://github.com/deldersveld/topojson/blob/master/countries/us-states/CA-06-california-counties.json
I've tried bubble maps, scatter maps, etc. with no luck; I keep getting all kinds of errors :(. This is the closest I've got, but it uses a world map and can't zoom in effectively like the JSON map above. I am still learning Python, so please go easy on me ><
import plotly.express as px
import pandas as pd
df = pd.read_csv(r"C:\Users\FT4\Desktop\FT Imported Data\Calimapdata.csv")
fig = px.scatter_geo(df,lat='Latitude',lon='Longitude', hover_name="lic_type", scope="usa")
fig.update_layout(title = 'World map', title_x=0.5)
fig.show()
If anyone could help me with this I would appreciate it. Thank you
Your example link is just a GeoJSON geometry definition. Are you talking about a choropleth?
If so, check out geopandas. You should be able to link your data to the polygons in the shape definition you linked to by reading it in with geopandas and then joining on the shapes with sjoin. Once you have data tied to each geometry, you can plot with geopandas's .plot method. Check out the user guide for more info.
Something along these lines should work:
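import geopandas as gpd, pandas as pd
geojson_raw_url = (
    "https://raw.githubusercontent.com/deldersveld/topojson/"
    "master/countries/us-states/CA-06-california-counties.json"
)
# let geopandas detect the file format; the engine argument only selects the I/O backend
gdf = gpd.read_file(geojson_raw_url)
df = pd.read_csv(r"C:\Users\FT4\Desktop\FT Imported Data\Calimapdata.csv")
# sjoin needs geometries on both sides, so build points from the Latitude/Longitude columns
# (assumes the CSV coordinates are in the same CRS as the county shapes)
points = gpd.GeoDataFrame(df, geometry=gpd.points_from_xy(df["Longitude"], df["Latitude"]), crs=gdf.crs)
merged = gpd.sjoin(gdf, points, how='right')
# you could plot this directly with geopandas
merged.plot("lic_type")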
Alternatively, using @r-beginners' excellent answer to another question, we can plot with Plotly Express:
fig = px.choropleth(merged, geojson=merged.geometry, locations=merged.index, color="lic_type")
fig.update_geos(fitbounds="locations", visible=False)
fig.show()

Stuck on How to Display Hover Data in Specific Format

I am testing some plotly code here.
import pandas as pd
import plotly.express as px
# df_rev_exp is my existing DataFrame (not shown here)
# find business profits
pd.options.display.float_format = '{:.2f}'.format
df_gains = df_rev_exp[df_rev_exp.ltd_spending < df_rev_exp.REV2]
df_gains.tail()
# scatter plot of losses
fig = px.scatter(df_gains, x="site_name",
                 y="gain_or_loss",
                 color="gain_or_loss",
                 size='REV2', hover_data=['site_name','REV2'])
fig.update_xaxes(tickangle=325)
fig.show()
Everything plots just fine, but the REV2 values are pretty large and therefore hard to read when I hover over the data points in the chart. I'm trying to figure out a way to show the numbers as millions. For instance, I would like to see 1.25M and not 1257789.84, which is what I am seeing now. I tried playing around with fig.update but I couldn't get anything working. How can I modify the formatting on these plotly charts?
I'm on Plotly 4.14.3 and this version shows 2.2M straight out of the box when the source is x=[10000000, 22000000, 34000000]:
import plotly.graph_objects as go
fig = go.Figure()
fig.add_traces(go.Scatter(x=[10*10**6, 22*10**6, 34*10**6],
                          y=[10,12,14]))
fig.show()
So two things come to mind:
1. Update Plotly.
2. Check that you're passing your values as numbers and not as strings.
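If neither of those helps, another option is to build the hover label yourself. The sketch below assumes a DataFrame with the question's columns (site_name, gain_or_loss, REV2); format_millions, scatter_with_millions and the REV2_label column are names invented for this example. It adds a pre-formatted text column and shows that on hover instead of the raw number. Note that it rounds, so 1257789.84 becomes 1.26M rather than the 1.25M mentioned in the question.

```python
def format_millions(value):
    """Format a number as millions with two decimals, e.g. 1257789.84 -> '1.26M'."""
    return f"{value / 1e6:.2f}M"


def scatter_with_millions(df):
    """Build the question's scatter plot with REV2 shown in millions on hover.

    Assumes df has the columns site_name, gain_or_loss and REV2.
    """
    # Imported here so that format_millions stays usable without plotly installed.
    import plotly.express as px

    # REV2_label is a hypothetical helper column holding the formatted text.
    labelled = df.assign(REV2_label=df["REV2"].map(format_millions))
    return px.scatter(
        labelled,
        x="site_name",
        y="gain_or_loss",
        color="gain_or_loss",
        size="REV2",
        # Hide the raw number on hover and show the formatted label instead.
        hover_data={"REV2": False, "REV2_label": True},
    )


print(format_millions(1257789.84))
```

Calling scatter_with_millions(df_gains).show() would then display the figure. Plotly Express also accepts a d3-style format string as a hover_data value, for example hover_data={"REV2": ":.3s"} for SI-prefixed numbers, which may be enough if you do not need full control over the text.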

Plotly in Python: show mean and variance of selected data

I am generating histograms using go.Histogram as described here. I am getting what is expected:
What I want to do is to show some statistics of the selected data, as shown in the next image (the white box I added manually in Paint):
I have tried this, and within the function selection_fn I placed the add_annotation call described here. However, it does nothing, and there are no errors either.
How can I do this?
Edit: I am using this code taken from this link
import plotly.graph_objects as go
import numpy as np
x = np.random.randn(500)
fig = go.Figure(data=[go.Histogram(x=x)])
fig.show()
with obviously another data set.
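Independently of how the selection reaches your code, the statistics text itself is easy to produce. Here is a minimal sketch of that part only: selection_summary is a made-up helper name, it uses the sample variance from the standard library, and it assumes you can obtain the selected x values as a plain list (for instance inside selection_fn). It does not address why the annotation is not drawn.

```python
import statistics


def selection_summary(values):
    """Return annotation text with the mean and sample variance of the selected values."""
    if len(values) < 2:
        # The sample variance needs at least two data points.
        return "select at least two points"
    mean = statistics.fmean(values)
    variance = statistics.variance(values)
    return f"mean = {mean:.2f}, variance = {variance:.2f}"


print(selection_summary([1, 2, 3, 4]))
```

The returned string could then be used as the text of the annotation. If you prefer the population variance, which is what np.var computes by default, statistics.pvariance is the drop-in replacement.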

Is there a way to plot many markers in Folium?

I am trying to use Folium for geographical information reading from a pandas dataframe.
The code I have is this one:
import folium
from folium import plugins
import pandas as pd
# ...operations on dataframe df...
map_1 = folium.Map(location=[51.5073219, -0.1276474],
                   zoom_start=12,
                   tiles='Stamen Terrain')
markerCluster = folium.plugins.MarkerCluster().add_to(map_1)
lat=0.
lon=0.
for index,row in df.iterrows():
    lat=row['lat']
    lon=row['lon']
    folium.Marker([lat, lon], popup=row['name']).add_to(markerCluster)
map_1
df is a dataframe with longitude, latitude and name information. Longitude and latitude are float.
I am using jupyter notebook and the map does not appear. Just a white empty box.
I have also tried to save the map:
map_1.save(outfile='map_1.html')
but also opening the file doesn't work (using Chrome, Firefox or Explorer).
I have tried reducing the number of markers displayed: below 300 markers the code works and the map is displayed correctly. With more than 300 markers the map goes back to being blank.
The size of the file is below 5 MB and should be processed correctly by Chrome.
Is there a way around it (I have more than 2000 entries in the dataframe and I would like to plot them all)? Or to set the maximum number of markers shown in folium?
Thanks
This might be too late, but I stumbled upon the same problem and found a solution that worked for me without having to remove the popups, so I figured that anybody with the same issue can try it out. Try replacing popup=row['name'] with popup=folium.Popup(row['name'], parse_html=True) and see whether it works. You can read more about it here: https://github.com/python-visualization/folium/issues/726
