I get two different results when I use Bokeh's circle (or diamond_cross) function and the line function. The line plot includes negative values and the circle plot does not.
Plot with line
and a plot with diamond_cross
I want to plot the temperatures for a certain place over a timespan. I have a lot of values, so I would like to make a scatter plot with Bokeh.
I also get the same problem when I use the x function.
In my code below, you can replace diamond_cross with line and remove fill_alpha and size; you will then probably also get two different graphs.
import pandas as pd
import numpy as np
import bokeh as bk
import scipy.special
from bokeh.plotting import figure, output_file, show
from bokeh.models import ColumnDataSource
from bokeh.models.tools import HoverTool
from bokeh.models.glyphs import Quad
from bokeh.layouts import gridplot
df = pd.read_csv('KNMI2.csv', sep=';',
usecols= ['YYYYMMDD','Techt', 'YYYY', 'MM', 'D'])
jan= df[df['MM'].isin(['1'])]
source_jan = ColumnDataSource(jan)
p = figure(plot_width = 800, plot_height = 800,
x_range=(0,32), y_range=(-20,20))
p.diamond_cross(x='D', y='Techt', source=source_jan,
fill_alpha=0.2, size=2)
p.title.text = 'Temperatuur per uur vanaf 1951 tot 2019'
p.xaxis.axis_label = 'januari'
p.yaxis.axis_label = 'Temperatuur (C)'
show(p)
If the circle/diamond_cross functions worked the same way as the line function, their plots would also show negative values.
I had a similar issue where the data type of the variable I was trying to plot was string instead of int.
Try using
jan = df[df['MM'].isin(['1'])]
jan['Techt'] = jan['Techt'].astype(int)
source_jan = ColumnDataSource(jan)
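A quick way to check this diagnosis is to look at the column's dtype before plotting. The sketch below is only an illustration: the small frame is made up (only the column names come from the question), and whether the real CSV actually loads Techt as strings is an assumption. It uses pd.to_numeric instead of astype(int) so that decimal temperatures would also convert.

```python
import pandas as pd

# Hypothetical stand-in for the CSV: temperatures loaded as strings.
df = pd.DataFrame({
    "MM": ["1", "1", "1", "2"],
    "D": [1, 2, 3, 1],
    "Techt": ["-3", "5", "-12", "7"],
})

# An object (or string) dtype here means the values are text, not numbers.
print(df["Techt"].dtype)

# .copy() gives an independent frame, so the assignment below does not
# write into a slice of df (avoids SettingWithCopyWarning).
jan = df[df["MM"].isin(["1"])].copy()

# to_numeric handles ints and floats; unparseable entries become NaN.
jan["Techt"] = pd.to_numeric(jan["Techt"], errors="coerce")

print(jan["Techt"].dtype)
print(jan["Techt"].min())
```

If the dtype printed first is already numeric, the cause of the missing negative values lies elsewhere.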
Related
I tried to write this code to display a data series plot, but no data was displayed.
I don't know exactly where the problem is.
data=pd.read_csv('weather.csv')[['STA','Date','Precip','MaxTemp','MinTemp','MeanTemp','Snowfall']].dropna()
data = data[data['Precip'] != 'T']
data['Precip'].astype(float)
data['STA']=data['STA'].astype("string")
data['Date']=pd.to_datetime(data['Date'])
stations=list(set(data['STA']))
stations.sort()
select_inital=select.value
colors = list(Category20_16)
colors.sort()
subset=data[data['STA']==select_inital]
initial_values= list(set(subset['STA']))
for i, j in enumerate(initial_values):
    subset=data[data['STA']==j]
    d=subset[['Date','Precip']]
    d.sort_values('Date')
    x=d['Date']
    y=d['Precip']
    d = ColumnDataSource(d)
p = figure(plot_width=700, plot_height=700, x_range=(0,200), title='Weather Evolution',x_axis_label='Date', y_axis_label='Precip',x_axis_type='datetime')
p.line(x,y, legend_label="Evolution", line_width=2)
show(p)
This is just a guess, but I believe the problem is that you are setting limits on the x_range. Bokeh evaluates dates as milliseconds since 1970-01-01 00:00, so your x_range=(0,200) is also interpreted as milliseconds. This means the visible area is very small and starts at January 1st, 1970. You could use Bokeh's defaults instead.
Minimal example
This is your code for the figure except I removed the x_range.
import pandas as pd
from bokeh.plotting import figure, show, output_notebook
output_notebook()
x = pd.date_range('2022-12-01', '2022-12-24', freq='D')
y = list(range(1,25))
p = figure(
plot_width=700,
plot_height=700,
# x_range=(0,200),
title='Weather Evolution',
x_axis_label='Date',
y_axis_label='Precip',
x_axis_type='datetime'
)
p.line(x,y, legend_label="Evolution", line_width=2)
show(p)
Bokeh default x_range
x_range by user
Comment
If you want to set the x_range for an axis of type "datetime", you can pass timestamp objects to it.
Valid values include the following, among others (e.g. floats):
# datetime
from datetime import datetime
x_range=(datetime(2022,12, 7),datetime(2022,12, 10))
# pandas
import pandas as pd
x_range=(pd.Timestamp('2022-12-07'),pd.Timestamp('2022-12-10'))
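To make the millisecond interpretation concrete, here is a small sketch using only the standard library. The helper names to_epoch_ms and from_epoch_ms are made up for this illustration and are not part of Bokeh; they only show what a plain number on a datetime axis corresponds to.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def to_epoch_ms(dt):
    """Milliseconds since 1970-01-01 00:00 UTC (naive datetimes are treated as UTC)."""
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)
    return (dt - EPOCH) / timedelta(milliseconds=1)

def from_epoch_ms(ms):
    """Datetime (UTC) that lies ms milliseconds after the epoch."""
    return EPOCH + timedelta(milliseconds=ms)

# The upper bound of x_range=(0, 200) is only 0.2 seconds after the epoch.
range_end = from_epoch_ms(200)
print(range_end)

# A date in December 2022 is more than 1.6 trillion milliseconds after the epoch,
# far outside the (0, 200) window.
dec_7_ms = to_epoch_ms(datetime(2022, 12, 7))
print(dec_7_ms)
```

So a range of (0, 200) spans a fifth of a second in 1970, which is why data from recent years falls outside the visible area.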
I am trying to represent the data using the bokeh scatter.
Here is my code:
from bokeh.plotting import Scatter, output_file, show
import pandas
df=pandas.Dataframe(colume["X","Y"])
df["X"]=[1,2,3,4,5,6,7]
df["Y"]=[23,43,32,12,34,54,33]
p=Scatter(df,x="X",y="Y", title="Day Temperature measurement", xlabel="Tempetature", ylabel="Day")
output_file("File.html")
show(p)
The Output should look like this:
Expected Output
The error is:
ImportError                               Traceback (most recent call last)
<ipython-input-14-1730ac6ad003> in <module>
----> 1 from bokeh.plotting import Scatter, output_file, show
      2 import pandas
      3
      4 df=pandas.Dataframe(colume["X","Y"])
      5
ImportError: cannot import name 'Scatter' from 'bokeh.plotting' (C:\Users\LENOVO\Anaconda3\lib\site-packages\bokeh\plotting\__init__.py)
I also found that Scatter is no longer maintained. Is there any way to use it?
Also, which alternatives do I have for representing the data in the same way as Scatter, using other Python libraries?
Would using an older version of Bokeh resolve this issue?
Scatter (with a capital S) has never been part of bokeh.plotting. It used to be part of the old bokeh.charts API that was removed several years ago. However, it is not needed at all to create basic scatter plots, since the glyph methods in bokeh.plotting (e.g. circle, square) are all implicitly scatter-type functions to begin with:
from bokeh.plotting import figure, show
import pandas as pd
df = pd.DataFrame({"X" :[1,2,3,4,5,6,7],
"Y": [23,43,32,12,34,54,33]})
p = figure(x_axis_label="Tempetature", y_axis_label="Day",
title="Day Temperature measurement")
p.circle("X", "Y", size=15, source=df)
show(p)
Which yields:
You can also just pass the data directly to circle as in the other answer.
If you want to do fancier things, like mapping the marker type based on a column, there is also a scatter method (lower case s) on the figure:
from bokeh.plotting import figure, show
from bokeh.sampledata.iris import flowers
from bokeh.transform import factor_cmap, factor_mark
SPECIES = ['setosa', 'versicolor', 'virginica']
MARKERS = ['hex', 'circle_x', 'triangle']
p = figure(title = "Iris Morphology")
p.xaxis.axis_label = 'Petal Length'
p.yaxis.axis_label = 'Sepal Width'
p.scatter("petal_length", "sepal_width", source=flowers, legend_field="species", fill_alpha=0.4, size=12,
marker=factor_mark('species', MARKERS, SPECIES),
color=factor_cmap('species', 'Category10_3', SPECIES))
show(p)
which yields:
If you look up "scatter" in the docs, you'll find
Scatter Markers
To scatter circle markers on a plot, use the circle() method of Figure:
from bokeh.plotting import figure, output_file, show
# output to static HTML file
output_file("line.html")
p = figure(plot_width=400, plot_height=400)
# add a circle renderer with a size, color, and alpha
p.circle([1, 2, 3, 4, 5], [6, 7, 2, 4, 5], size=20, color="navy", alpha=0.5)
# show the results
show(p)
To work with dataframes, just pass in the columns like df.X and df.Y to the x and y args.
from bokeh.plotting import figure, show, output_file
import pandas as pd
df = pd.DataFrame(columns=["X","Y"])
df["X"] = [1,2,3,4,5,6,7]
df["Y"] = [23,43,32,12,34,54,33]
p = figure()
p.scatter(df.X, df.Y, marker="circle")
#from bokeh.io import output_notebook
#output_notebook()
show(p) # or output to a file...
I have a pandas dataframe with 10 columns and am trying to create a bar plot using Bokeh.
The HTML file has the complete plot when I use plot_width=10000.
However, when I increase the plot width to 30000 (so that there is space between the x-axis values), the plot does not fill beyond 2010. Here is the complete code. Please suggest a way forward.
from bokeh.palettes import Viridis6 as palette
from bokeh.transform import factor_cmap
from bokeh.models import ColumnDataSource,FactorRange,HoverTool
from bokeh.palettes import Spectral6
from flask import Flask, request, render_template, session, redirect,send_file
import numpy as np
import pandas as pd
from bokeh.plotting import figure, show, output_file,save
from bokeh.embed import components,file_html
from bokeh.resources import CDN
from bokeh.layouts import row,column
from bokeh.core.properties import value
dates = pd.date_range('20050101', periods=3900)
df = pd.DataFrame(np.random.randn(3900, 10), index=dates, columns=list('ABCDEFGHIJ'))
s = df.resample('M').mean().stack()
s.index = [s.index.get_level_values(0).strftime('%Y-%m-%d'),s.index.get_level_values(1)]
x = s.index.values
l1=list(s.index.levels[1])
counts = s.values
source = ColumnDataSource(data=dict(x=x, counts=counts))
p = figure(x_range=FactorRange(*x), plot_height=250,plot_width=30000, title='Plotting data',
toolbar_location=None, tools="")
p.vbar(x='x', top='counts', width=1, source=source, line_color="white")
p.y_range.start = s.values.min()
p.y_range.end = s.values.max()
p.x_range.range_padding = 0.01
p.y_range.range_padding = 0.01
p.xaxis.major_label_orientation = 1
p.xgrid.grid_line_color = None
output_file('test_plot.html')
save([p])
show(p)
This works fine for me with Bokeh 1.0.4 and OSX/Safari. I suspect this is a limitation/issue with the underlying HTML Canvas implementation in whatever browser you are using, in which case there is nothing we can do about it. The only suggestions I can make are to split the plot up into smaller subplots, or to use a different browser (or possibly a different version of the same browser).
I intend to create a scatter plot with a linear color mapper. The dataset is the popular Female Literacy and Birthrate dataset.
The plot would have "GDP per capita" on the x axis and "Life expectancy at birth" on the y axis. In addition (and this is where I am running into the issue), I want to vary the color of the points according to "Birth rate".
Current Code:
#DATA MANIPULATION
# import Pandas, Bokeh, etc
import numpy as np
import pandas as pd
from bokeh.io import show, output_file
from bokeh.models import ColumnDataSource
from bokeh.palettes import Viridis256 as palette
from bokeh.plotting import figure
from bokeh.sampledata.autompg import autompg as df
from bokeh.transform import linear_cmap
# load the data file
excel_file = '../factbook.xlsx'
#(removed url above since it is private)
factbook = pd.read_excel(excel_file)
source = ColumnDataSource(factbook)
colormapper = linear_cmap(field_name = factbook["Birth rate"], palette=palette, low=min(factbook["Birth rate"]), high=max(factbook["Birth rate"]))
p = figure(title = "UN Factbook Bubble Visualization",
x_axis_label = 'GDP per capita', y_axis_label = 'Life expectancy at birth')
p.circle(x = 'GDP per capita', y = 'Life expectancy at birth', source = source, color =colormapper)
output_file("file", title="Bubble Graph")
show(p)
The p.circle line has a problem consuming the colormapper. I would like help understanding how to resolve this.
The field_name parameter should be provided with the name of a column. You are supplying the entire data column itself. Since you have not provided a complete runnable example, it is impossible to test for sure, but presumably you want:
linear_cmap(field_name="Birth rate", ...)
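Since the original data file is private, the following is only a sketch with made-up rows under the same column names. It shows the column name being passed as field_name while low and high are computed from the data as plain numbers. It uses the figure's scatter method rather than circle, which is a substitution made here and not part of the original answer.

```python
import pandas as pd
from bokeh.models import ColumnDataSource
from bokeh.palettes import Viridis256
from bokeh.plotting import figure
from bokeh.transform import linear_cmap

# Made-up rows standing in for the private factbook file.
factbook = pd.DataFrame({
    "GDP per capita": [1000, 5000, 20000, 40000],
    "Life expectancy at birth": [55.0, 65.0, 75.0, 80.0],
    "Birth rate": [40.0, 25.0, 14.0, 10.0],
})
source = ColumnDataSource(factbook)

# field_name is the *name* of a column in the source, not the column itself.
colormapper = linear_cmap(
    field_name="Birth rate",
    palette=Viridis256,
    low=float(factbook["Birth rate"].min()),
    high=float(factbook["Birth rate"].max()),
)

p = figure(title="UN Factbook Bubble Visualization",
           x_axis_label="GDP per capita",
           y_axis_label="Life expectancy at birth")

# The mapper looks up "Birth rate" in the same source as x and y.
renderer = p.scatter(x="GDP per capita", y="Life expectancy at birth",
                     source=source, color=colormapper, size=10)

# Call bokeh.io.show(p) to display the plot.
```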
I am running the following code to render a plot with dates in the x axis and floats in the y axis:
import pandas as pd
from bokeh.plotting import figure, output_file, show
from bokeh.models import DatetimeTickFormatter
from bokeh.charts import Bar, Line, show
def datetime(x):
    return pd.DataFrame(x, dtype='datetime64')
openxbids = pd.read_csv('data')
openxbids.sort_values('date')
output_file("lines.html")
p = figure(width=800, height=250, x_axis_type="datetime")
p.line(datetime(openxbids['date']), openxbids['bids'], color = 'navy', alpha=0.5)
show(p)
However, when I run this, I get a graph without any data plotted. The x and y axis ranges seem to be correctly detected. What am I missing?