Issue with creating a choropleth map in Python

I'm trying to create a choropleth map using folium in Python. I was able to get the base map running, but when I try to add a layer with neighborhood boundaries, it does not show up on the HTML page. I thought I might have to increase the line opacity, but that doesn't seem to be the problem.
This is my code:
import folium
import pandas as pd
crimeData = pd.read_csv('NYC_crime.csv')
crime2020 = crimeData[crimeData.CMPLNT_FR_DT == 2020]
nycMap = folium.Map(location=[40.693943, -73.985880],zoom_start = 10)
mapLines = 'nbhdMap.geojson.json'
folium.Choropleth(geo_data = mapLines,
data = crime2020,
fill_color = 'OrRd',
fill_opacity=0.5,
line_opacity=1.0,
key_on = 'feature.geometry.coordinates',
columns = ['Lat_Lon']
)
nycMap.save(outfile='index.html')
I'm also having trouble filling the map with data. I want each 2020 complaint documented in the CSV file to count toward showing which areas received the most calls, but I get this error:
Traceback (most recent call last):
File "/Users/kenia/Desktop/CSCI233/PRAC.py", line 10, in <module>
folium.Choropleth(geo_data = mapLines,
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/folium/features.py", line 1158, in __init__
color_data = data.set_index(columns[0])[columns[1]].to_dict()
IndexError: list index out of range
This is the neighborhood boundaries: https://data.beta.nyc/dataset/pediacities-nyc-neighborhoods/resource/35dd04fb-81b3-479b-a074-a27a37888ce7
And this is my data: https://data.cityofnewyork.us/Public-Safety/NYPD-Complaint-Data-Current-Year-To-Date-/5uac-w243
[EDIT] So I tried @r-beginners' suggestion with a simpler dataset: https://data.cityofnewyork.us/Health/Restaurants-rolled-up-/59dk-tdhz
import pandas as pd
import folium
data = pd.read_csv('nycrestaurants.csv')
data = pd.concat([data, str(data['ZIPCODE']).split(',')], axis=1)
data.columns = ['CAMIS', 'DBA', 'BORO', 'BUILDING', 'STREET', 'ZIPCODE']
resData = data.groupby(['ZIPCODE'])['DBA'].sum().reset_index()
nycMap = folium.Map(location=[40.693943, -73.985880],zoom_start = 10)
mapLines = 'zipMap.geojson.json'
folium.Choropleth(geo_data = mapLines,
data = resData,
key_on = 'feature.properties.postalCode',
columns = ['ZIPCODE', 'DBA'],
fill_color = 'OrRd',
fill_opacity=0.5,
line_opacity=1.0
).add_to(nycMap)
nycMap.save(outfile='index.html')
But now I'm getting this error message:
Traceback (most recent call last):
File "/Users/kenia/Desktop/CSCI233/PRAC.py", line 5, in <module>
data = pd.concat([data, str(data['ZIPCODE']).split(',')], axis=1)
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/pandas/core/reshape/concat.py", line 274, in concat
op = _Concatenator(
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/pandas/core/reshape/concat.py", line 359, in __init__
raise TypeError(msg)
TypeError: cannot concatenate object of type '<class 'list'>'; only Series and DataFrame objs are valid

Since you shared the complaint data in another question, I used that dataset together with GeoJSON data covering the corresponding ZIP code areas. The process is to count the number of complaints per ZIP code and tie those counts to the matching ZIP code polygons.
import pandas as pd
import folium

# Count complaints per ZIP code
df = pd.read_csv('./data/311_Noise_Complaints.csv', sep=',')
# Missing ZIP codes become 0 so the column can be cast to int
df['Incident Zip'] = df['Incident Zip'].fillna(0).astype(int)
df_zip = df['Incident Zip'].value_counts().to_frame().reset_index()
df_zip.columns = ['postal_code', 'counts']
# Cast ZIP codes to str, presumably to match the string values in the GeoJSON
df_zip['postal_code'] = df_zip['postal_code'].astype(str)

nycMap = folium.Map(location=[40.693943, -73.985880], zoom_start=10)
mapLines = './data/nyc_zip_code_tabulation_areas_polygons.geojson'
choropleth = folium.Choropleth(
    geo_data=mapLines,
    data=df_zip,
    columns=['postal_code', 'counts'],  # [key column, value column]
    key_on='feature.properties.postalcode',
    fill_color='BuPu',
    fill_opacity=0.5,
    line_opacity=1.0
).add_to(nycMap)  # without add_to, the layer is never attached to the map
# Show each area's po_name property on hover
choropleth.geojson.add_child(
    folium.features.GeoJsonTooltip(['po_name'], labels=False)
)
nycMap.save(outfile='index.html')
nycMap
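The traceback in the question points at `data.set_index(columns[0])[columns[1]].to_dict()`, which suggests that `columns` needs two entries: the key column first, then the value column. A minimal sketch of that lookup in plain pandas follows. The sample ZIP codes are made up for illustration and do not come from the real dataset, and rows without a ZIP are dropped here instead of being counted under 0.

```python
import pandas as pd

# Hypothetical sample standing in for the 'Incident Zip' column of the CSV.
df = pd.DataFrame({"Incident Zip": [10001, 10001, 11201, None, 11201, 10001]})

# Count complaints per ZIP code; rows without a ZIP are dropped here
# rather than being counted under a placeholder value.
zips = df["Incident Zip"].dropna().astype(int).astype(str)
df_zip = (
    zips.value_counts()
    .rename_axis("postal_code")
    .reset_index(name="counts")
)

# Same lookup shape as the line shown in the traceback:
# key column first, value column second.
columns = ["postal_code", "counts"]
color_data = df_zip.set_index(columns[0])[columns[1]].to_dict()
print(color_data)
```

The string cast mirrors the `astype(str)` in the answer; whether it is needed depends on how the ZIP code is stored in the GeoJSON properties.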

Related

How to solve "KeyError: 'month'"

I tried to make a dashboard using panel.
Traceback (most recent call last):
File "/home/istts/Desktop/P3/DBPANEL.py", line 70, in <module>
interDataFrame[(interDataFrame.month <= mnthSlider)]
File "/home/istts/.local/lib/python3.9/site-packages/hvplot/interactive.py", line 499, in __call__
clone = new._clone(method(*args, **kwargs), plot=new._method == 'plot')
File "/home/istts/.local/lib/python3.9/site-packages/hvplot/interactive.py", line 370, in _clone
return type(self)(self._obj, fn=self._fn, transform=transform, plot=plot, depth=depth,
File "/home/istts/.local/lib/python3.9/site-packages/hvplot/interactive.py", line 276, in __init__
self._current = self._transform.apply(ds, keep_index=True, compute=False)
File "/home/istts/.local/lib/python3.9/site-packages/holoviews/util/transform.py", line 774, in apply
data = self._apply_fn(dataset, data, fn, fn_name, args,
File "/home/istts/.local/lib/python3.9/site-packages/holoviews/util/transform.py", line 672, in _apply_fn
raise e
File "/home/istts/.local/lib/python3.9/site-packages/holoviews/util/transform.py", line 666, in _apply_fn
data = method(*args, **kwargs)
File "/home/istts/.local/lib/python3.9/site-packages/pandas/util/_decorators.py", line 331, in wrapper
return func(*args, **kwargs)
File "/home/istts/.local/lib/python3.9/site-packages/pandas/core/frame.py", line 6912, in sort_values
k = self._get_label_or_level_values(by, axis=axis)
File "/home/istts/.local/lib/python3.9/site-packages/pandas/core/generic.py", line 1850, in _get_label_or_level_values
raise KeyError(key)
KeyError: 'month'
I always get that error. I tried to use a different name but still get KeyError: 'month'. Does anyone know what causes this error?
Note: my code is below. I take the data from a local database, but it has no month column, so I create one.
import pandas as pd
import numpy as np
import panel as pn
import hvplot.pandas
import mysql.connector as connection
import datetime as dt
pn.extension()
# DATABASE VARIABLE
DBHOST = "localhost" #The Host use "localhost" or the database IP
DBUSER = "ISTTS" #The user
DBPASS = "digital123" #The user password
DBDATA = "exampledb" #The database to insert
DBTABL = "Test2" #The table to insert
# END DATABASE VARIABLE
# Device list (in case of more than 1 device)
DEVICELIST = ['1'] # LATER, TAKE IT FROM DATABASE AND MAKE IT STRING
# End Device list
# Read data
try:
    mydb = connection.connect(host=DBHOST, user=DBUSER, password=DBPASS, database=DBDATA, use_pure=True)
    query = "Select * from " + DBTABL + ";"
    dataFrame = pd.read_sql(query, mydb)
    #query = "Select Distinct DeviceID from " + DBTABL +";"
    #DEVICELIST = pd.read_sql(query, mydb)
    mydb.close()
except Exception as e:
    mydb.close()
    print(str(e))
# End of read data
## Some process that need to be made for the program
# DEVICE LIST FROM DeviceID COLUMN
# TAKE MONTH FROM Date COLUMN
dataFrame['month'] = pd.DatetimeIndex(dataFrame['Date']).month
## End of the data processing
# Make the dataframe interactive
interDataFrame = dataFrame.interactive()
# End of Process
## Widgets
# Slider to see which month to inspect
mnthSlider = pn.widgets.IntSlider(name='Month slider', start=1, end=12, step=1, value=1)
# Button to pick which data to show for top graph
topYAxisData = pn.widgets.RadioButtonGroup(
name='data select',
options=['Temperature', 'Pressure'],
button_type='primary'
)
# Selector to pick which device data to show for top graph
topDeviceSelector = pn.widgets.Select(name='Device selector', options=DEVICELIST)
# Data pipeline for plotting the top graph
topDataPipeline = (
interDataFrame[(interDataFrame.month <= mnthSlider)]
.groupby(['Date'])[topYAxisData].mean() ## Revision, groupby. plan : month
.to_frame()
.reset_index()
.sort_values(by='month')
.reset_index(drop = True)
)
topDataPlot = topDataPipeline.hvplot(x='Date', y=topYAxisData, line_witdh=2, title='Sensor Data')
# Device selector for bottom one
botDeviceSelector1 = pn.widgets.Select(name='Device selector 1', options=DEVICELIST)
botDeviceSelector2 = pn.widgets.Select(name='Device selector 2', options=DEVICELIST)
botDeviceSelector3 = pn.widgets.Select(name='Device selector 3', options=DEVICELIST)
# Button to pick which data to show for bottom one
botYAxisData1 = pn.widgets.RadioButtonGroup(
name='bot data select1',
options=['Temperature', 'Pressure'],
button_type='primary'
)
botYAxisData2 = pn.widgets.RadioButtonGroup(
name='bot data select2',
options=['Temperature', 'Pressure'],
button_type='primary'
)
botYAxisData3 = pn.widgets.RadioButtonGroup(
name='bot data select3',
options=['Temperature', 'Pressure'],
button_type='primary'
)
# Data pipeline for plotting
botDataPipeline1 = (
interDataFrame[(interDataFrame.month <= monthSlider)]
.groupby(['Date'])[botYAxisData1].mean() ## Revision, groupby. plan : month
.to_frame()
.reset_index()
.sort_values(by='month')
.reset_index(drop = True)
)
botDataPipeline2 = (
interDataFrame[(interDataFrame.month <= monthSlider)]
.groupby(['Date'])[botYAxisData2].mean() ## Revision, groupby. plan : month
.to_frame()
.reset_index()
.sort_values(by='month')
.reset_index(drop = True)
)
botDataPipeline3 = (
interDataFrame[(interDataFrame.month <= monthSlider)]
.groupby(['Date'])[botYAxisData3].mean() ## Revision, groupby. plan : month
.to_frame()
.reset_index()
.sort_values(by='month')
.reset_index(drop = True)
)
# Plotting for bot graph
botDataPlot1 = botDataPipeline1.hvplot(x='Date', y=botYAxisData1, line_witdh=2, title='Sensor Data 1')
botDataPlot2 = botDataPipeline2.hvplot(x='Date', y=botYAxisData2, line_witdh=2, title='Sensor Data 2')
botDataPlot3 = botDataPipeline3.hvplot(x='Date', y=botYAxisData3, line_witdh=2, title='Sensor Data 3')
## End Widgets
# Template
template = pn.template.FastListTemplate(
title='Temperature / Pressure Data',
sidebar=[pn.pane.markdown("# What should I write here??"),
pn.pane.markdown("### What should I write here?? AGAIN"),
pn.pane.PNG("Temperature.png", sizing_mode='scale_both'),
pn.pane.markdown('Month Selector'),
monthSlider],
main=[pn.Row(pn.Column(topYAxisData, topDataPlot.panel(), margin=(0,25))),
pn.Row(pn.Column(botYAxisData1, botDataPlot1.panel(), margin=(0,25)),
pn.Column(botYAxisData2, botDataPlot2.panel(), margin=(0,25)),
pn.Column(botYAxisData3, botDataPlot3.panel(), margin=(0,25)),
pn.Column(pn.Row(botDeviceSelector1), pn.Row(botDeviceSelector2, pn.Row(botDeviceSelector3))))],
accent_base_color="#84A9FF",
header_background="#84A9FF"
)
template.show()
template.servable();
I tried to use "astype" and "to_numeric", but I still get the same error.
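One possible reading of the traceback: the error is raised from `sort_values`, and by that point the pipeline has grouped by `Date` and selected a single column, so the frame no longer has a `month` column to sort by. A minimal sketch in plain pandas (no panel or hvplot, with made-up sample values) that reproduces this and sorts by `Date` instead:

```python
import pandas as pd

# Made-up sample standing in for the database table.
frame = pd.DataFrame({
    "Date": pd.to_datetime(["2023-02-01", "2023-01-15", "2023-01-15"]),
    "Temperature": [25.0, 18.0, 22.0],
})
frame["month"] = frame["Date"].dt.month

# Grouping by Date and selecting one column keeps only Date + that column.
grouped = (
    frame[frame["month"] <= 2]
    .groupby(["Date"])["Temperature"].mean()
    .to_frame()
    .reset_index()
)
print(list(grouped.columns))  # 'month' is not among them

# Sorting by the dropped column raises the KeyError from the question.
try:
    grouped.sort_values(by="month")
    raised = False
except KeyError:
    raised = True

# Sorting by a column that survived the groupby works.
result = grouped.sort_values(by="Date").reset_index(drop=True)
print(result)
```

If the month is still needed after aggregation, grouping by both `month` and `Date` would keep it as a column after `reset_index()`.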

Matplotlib keeps giving "out of range" error

I'm trying to make a Pokedex that stores all the names and stats of the Pokemon in a .csv file and reads from that file when called. It should also show an image of the Pokemon when called.
The code is as follows:
import pandas as pd
import matplotlib.pyplot as plt
import os
import glob
import natsort
# reading the images dataset
dir1 = r"C:\Users\yash1\Desktop\pokedex\pokemon_images"
path1 = os.path.join(dir1, '*g')
files = glob.glob(path1)
files1 = natsort.natsorted(files, reverse=False)
imag = []
for x in files1:
    img = plt.imread(x)
    imag.append(img)
# reading the details dataset
data = pd.read_csv('pokemon.csv')
print("Pokedex\n")
print("Welcome Pokemon Lovers\n")
print("Search for a pokemon\n")
df1 =input("<A>Search by pokemon name\n<B>Search by pokemon ID\n(select A or B)\n")
df1.upper()
if(df1=="A"):
    print("Enter the name of the pokemon")
    name = input()
    name.lower().strip()
    dt = data[:].where(data['pokemon']==name)
    st = dt[dt['id'].notnull()]
    idx = dt.index[dt['pokemon'] == name]
    if idx > 721:
        exit(0)
    plt.imshow(imag[idx[1]])
    plt.axis("off") # turns off axes
    plt.axis("tight") # gets rid of white border
    plt.axis("image") # square up the image instead of filling the "figure" space
    plt.show()
elif(df1=="B"):
    print("Enter the ID of the pokemon")
    ID = int(input())
    tt = data[:].where(data['id']==ID)
    idx1 = tt.index[tt['id']==ID]
    qt = tt[tt['id'].notnull()]
    for i in qt.columns:
        print(i," : ",qt[i][idx1[0]])
    if idx1>721:
        exit(0)
    #plt.imshow(imag[idx1[0]])
    plt.axis("off") # turns off axes
    plt.axis("tight") # gets rid of white border
    plt.axis("image") # square up the image instead of filling the "figure" space
    plt.show()
When I run it, it gives me an error like this:
Traceback (most recent call last):
File "G:/pythonProject/main.py", line 33, in <module>
plt.imshow(imag[idx[1]])
File "C:\python\lib\site-packages\pandas\core\indexes\base.py", line 4604, in __getitem__
return getitem(key)
IndexError: index 1 is out of bounds for axis 0 with size 1
How do I fix the out-of-bounds index? I even replaced the index in the brackets with 0 and it still didn't work.
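A hedged sketch of the name lookup, assuming `pokemon.csv` has `id` and `pokemon` columns as in the code above (the three sample rows and the helper name `find_row` are made up). It differs from the original in two ways: string methods return new strings, so their results are assigned back, and a single match sits at position 0 of the resulting index, with the empty case checked first.

```python
import pandas as pd

# Made-up rows standing in for pokemon.csv.
data = pd.DataFrame({
    "id": [1, 2, 3],
    "pokemon": ["bulbasaur", "ivysaur", "venusaur"],
})

def find_row(frame, name):
    """Return the index label of the first exact name match, or None."""
    # str methods do not modify in place; keep the returned value.
    name = name.lower().strip()
    matches = frame.index[frame["pokemon"] == name]
    if len(matches) == 0:
        return None
    # A single match is at position 0; position 1 would need two matches.
    return int(matches[0])

print(find_row(data, "  Ivysaur "))
print(find_row(data, "missingno"))
```

With the default index the returned label equals the row position, so it could be used to pick an image from the list, assuming the images are sorted in the same order as the CSV rows.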

TypeError when adding another value to GeoJsonTooltip

I'm trying to make a map with the number of noise complaints for each zipcode and everything runs fine, but I can't get the count number to appear on the map when I hover over each area. I tried making it into an int as the error suggested, but nothing seems to work.
import pandas as pd
df2020 = pd.read_csv('/Users/kenia/Desktop/CSCI 233 Seminar Project/311_Noise_Complaints.csv',sep=',', low_memory = False)
df2020=df2020[df2020['Created Date'].str[6:10] == '2020']
df2020['Incident Zip'].fillna(0, inplace=True)
df2020['Incident Zip'] = df2020['Incident Zip'].astype(int)
df2020_zip = df2020['Incident Zip'].value_counts().to_frame().reset_index()
df2020_zip.columns = ['postal_code', 'counts']
df2020_zip['postal_code'] = df2020_zip['postal_code'].astype(str)
df2020_zip['counts'] = df2020_zip['counts'].astype(int)
import folium
nycMap = folium.Map(location=[40.693943, -73.985880], zoom_start=10)
zipLines = '/Users/kenia/Desktop/CSCI 233 Seminar Project/zipMap.geojson.json'
df2020_zip['counts'] = df2020_zip['counts'].astype(int)
df2020_zip['counts'] = pd.Series(zipLines['counts'])
count_col = df2020_zip['counts']
bins = list(df2020_zip['counts'].quantile([0,0.2,0.4,0.6,0.8,1]))
choropleth = folium.Choropleth(geo_data = zipLines,
data=df2020_zip,
columns=['postal_code', 'counts'],
key_on='feature.properties.postalCode',
fill_color='OrRd',
fill_opacity=0.7,
line_opacity=1.0,
bins = bins,
highlight=True,
legend_name="Noise Frequency in 2020"
).add_to(nycMap)
folium.LayerControl().add_to(nycMap)
choropleth.geojson.add_child(
folium.features.GeoJsonTooltip(['postalCode','PO_NAME','count_col'])
)
nycMap.save(outfile='index.html')
Error:
Traceback (most recent call last):
File "/Users/kenia/Desktop/throwaway.py", line 20, in <module>
df2020_zip['counts'] = pd.Series(zipLines['counts'])
TypeError: string indices must be integers
Dataset: https://data.cityofnewyork.us/Social-Services/311-Noise-Complaints/p5f6-bkga
Zipcode GeoJson: https://data.beta.nyc/dataset/nyc-zip-code-tabulation-areas/resource/6df127b1-6d04-4bb7-b983-07402a2c3f90?view_id=b34c6552-9fdb-4f95-8810-0588ad1a4cc8
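The traceback shows that `zipLines` is still a file path string at that point, so `zipLines['counts']` indexes into a string. One possible approach, sketched below with the standard `json` module and a tiny inline GeoJSON (the two features and the count are made up), is to load the GeoJSON into a dict and copy each ZIP code's count into that feature's `properties`, so that a tooltip could refer to it by property name.

```python
import json
import pandas as pd

# Tiny made-up GeoJSON; in practice it would be loaded from the file on disk.
geojson_text = json.dumps({
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature",
         "properties": {"postalCode": "10001", "PO_NAME": "New York"},
         "geometry": None},
        {"type": "Feature",
         "properties": {"postalCode": "11201", "PO_NAME": "Brooklyn"},
         "geometry": None},
    ],
})
geo = json.loads(geojson_text)

# Made-up counts in the same shape as df2020_zip.
df_zip = pd.DataFrame({"postal_code": ["10001"], "counts": [42]})
counts = df_zip.set_index("postal_code")["counts"].to_dict()

# Attach a plain int count to every feature; ZIP codes with no rows get 0.
for feature in geo["features"]:
    props = feature["properties"]
    props["counts"] = int(counts.get(props["postalCode"], 0))

print([f["properties"] for f in geo["features"]])
```

With the counts stored as properties, the enriched dict could be passed as `geo_data`, and the tooltip field list would then name the property `'counts'` rather than the Python variable `count_col`.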

Get Geographical Coordinates from address in Python 2

I'm using PyCharm 2.2 Community and am trying to get the latitude and longitude for a set of addresses. I stored the address data in a pandas DataFrame and tried to look up its geographical coordinates. However, execution fails with:
Traceback (most recent call last):
File "C:/map.py", line 19, in <module>
geocode_result = gmaps_key.geocode(data1.iat[i, 0])
followed by "return self.obj._get_value(*key, takeable=self._takeable)".
When I hover over geocode, the tooltip says "unresolved attribute reference 'geocode' for class 'Client'...".
Below is my code. Any help would be much appreciated, thanks!
import pandas as pd
import googlemaps
data = pd.read_csv("listing.csv", usecols=['building_no', 'street_name'])
# df = pd.DataFrame({'Year': ['2014', '2015'], 'quarter': ['q1', 'q2']})
data['address'] = data[['building_no', 'street_name']].apply(lambda x: ' '.join(x), axis=1)
data1 = data['address']
# print data1
# Set google map API key
gmaps_key = googlemaps.Client(key="AIzaSyDjB0HJolcomNZCWrtq9gef70V4F2xtB_s")
# create Geocode result object
# get LON and LAT
data1["LAT"] = None
data1["LON"] = None
for i in range(0, len(data1), 1):
    geocode_result = gmaps_key.geocode(data1.iat[i, 0])
    try:
        lat = geocode_result[0]["geometry"]["location"]["lat"]
        lon = geocode_result[0]["geometry"]["location"]["lng"]
        data1.iat[i, data1.columns.get_loc("LAT")] = lat
        data1.iat[i, data1.columns.get_loc("LON")] = lon
    except:
        lat = None
        lon = None
print data1
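In the code above, `data1 = data['address']` is a Series, so it has a single axis: two-argument `data1.iat[i, 0]` and `data1.columns` do not apply to it. Below is a minimal Python 3 sketch that keeps a DataFrame and fills `LAT`/`LON` row by row. `fake_geocode` is a hypothetical stand-in that mimics the result shape the question's code reads (`result[0]["geometry"]["location"]`), so the sketch runs without an API key; the addresses and coordinates are made up.

```python
import pandas as pd

def fake_geocode(address):
    """Hypothetical stand-in for a geocoding call; returns [] when unknown."""
    known = {"1 Main St": {"lat": 1.30, "lng": 103.80}}
    if address in known:
        return [{"geometry": {"location": known[address]}}]
    return []

data = pd.DataFrame({
    "building_no": [1, 2],
    "street_name": ["Main St", "Nowhere Rd"],
})
# Cast to str before joining so numeric building numbers do not break the join.
data["address"] = data["building_no"].astype(str) + " " + data["street_name"]
data["LAT"] = None
data["LON"] = None

for i, address in enumerate(data["address"]):
    result = fake_geocode(address)
    if not result:
        continue  # leave LAT/LON as None for addresses with no match
    location = result[0]["geometry"]["location"]
    data.loc[data.index[i], "LAT"] = location["lat"]
    data.loc[data.index[i], "LON"] = location["lng"]

print(data[["address", "LAT", "LON"]])
```

Checking for an empty result explicitly avoids the bare `except`, which would also hide unrelated errors such as the indexing problem itself.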

ValueError: Must pass DataFrame with boolean values only while trying to use to_dict

import pandas as pd
dataset = "C:/Users/ashik swaroop/Desktop/anaconda/Gene Dataset/acancergenecensus.csv"
datacan = pd.read_csv(dataset)
datacan = datacan.fillna(0)
cols_to_retain = datacan[[ "Tumour_Types_Somatic","Tumour_Types_Germline","Mutation_Types","Tissue_Type"]]
cat_dict = datacan[ cols_to_retain ].to_dict( orient = 'records' )
I get an error after running this. Please help or give a suggestion:
cat_dict = datacan[ cols_to_retain ].to_dict( orient = 'records' )
Traceback (most recent call last):
File "<ipython-input-47-dde9a2c1af34>", line 1, in <module>
cat_dict = datacan[ cols_to_retain ].to_dict( orient = 'records' )
File "C:\Users\ashik swaroop\Anaconda2\lib\site-packages\pandas\core\frame.py", line 2055, in __getitem__
return self._getitem_frame(key)
File "C:\Users\ashik swaroop\Anaconda2\lib\site-packages\pandas\core\frame.py", line 2130, in _getitem_frame
raise ValueError('Must pass DataFrame with boolean values only')
ValueError: Must pass DataFrame with boolean values only
You need to change:
cols_to_retain = datacan[[ "Tumour_Types_Somatic","Tumour_Types_Germline","Mutation_Types","Tissue_Type"]]
cat_dict = datacan[ cols_to_retain ].to_dict( orient = 'records' )
to:
cols_to_retain = [ "Tumour_Types_Somatic","Tumour_Types_Germline","Mutation_Types","Tissue_Type"]
cat_dict = datacan[ cols_to_retain ].to_dict( orient = 'records' )
because selecting with double [] takes a subset and returns a filtered DataFrame, not the column names.
Another possible solution is:
df = datacan[[ "Tumour_Types_Somatic","Tumour_Types_Germline","Mutation_Types","Tissue_Type"]]
cat_dict = df.to_dict( orient = 'records' )
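To illustrate the difference described above, here is a small sketch with a made-up two-row frame (two column names borrowed from the question, all values invented): indexing with a plain list of names selects columns, while the double-bracket expression is already a DataFrame.

```python
import pandas as pd

# Made-up stand-in for the gene census data.
datacan = pd.DataFrame({
    "Mutation_Types": ["Mis", "N"],
    "Tissue_Type": ["E", "L"],
    "Other": [1, 2],
})

# A plain list of column names is what datacan[...] expects as a selector.
cols_to_retain = ["Mutation_Types", "Tissue_Type"]
cat_dict = datacan[cols_to_retain].to_dict(orient="records")
print(cat_dict)

# Double brackets return a DataFrame, not a list of names.
subset = datacan[["Mutation_Types", "Tissue_Type"]]
print(type(subset).__name__)
```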
