I tried to write this code to display a time series plot, but no data was displayed.
I don't know exactly where the problem is.
data=pd.read_csv('weather.csv')[['STA','Date','Precip','MaxTemp','MinTemp','MeanTemp','Snowfall']].dropna()
data = data[data['Precip'] != 'T']
data['Precip'].astype(float)
data['STA']=data['STA'].astype("string")
data['Date']=pd.to_datetime(data['Date'])
stations=list(set(data['STA']))
stations.sort()
select_inital=select.value
colors = list(Category20_16)
colors.sort()
subset=data[data['STA']==select_inital]
initial_values= list(set(subset['STA']))
for i, j in enumerate(initial_values):
    subset=data[data['STA']==j]
    d=subset[['Date','Precip']]
    d.sort_values('Date')
    x=d['Date']
    y=d['Precip']
    d = ColumnDataSource(d)
p = figure(plot_width=700, plot_height=700, x_range=(0,200), title='Weather Evolution',x_axis_label='Date', y_axis_label='Precip',x_axis_type='datetime')
p.line(x,y, legend_label="Evolution", line_width=2)
show(p)
This is just a guess, but I believe the problem is that you are setting limits on the x_range. Bokeh evaluates dates as milliseconds since 1970-01-01 00:00, so your x_range=(0,200) is also interpreted as milliseconds. This means the visible area is very small and starts at January 1st, 1970. You could use Bokeh's defaults instead.
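To make the scale concrete, here is a small sketch using only the standard library. It converts the two x_range limits from milliseconds to dates and compares them with the first date used in the minimal example below. The helper name ms_to_datetime is made up for illustration, and UTC is assumed for the conversion.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def ms_to_datetime(ms):
    """Interpret a number as milliseconds since the Unix epoch (UTC assumed)."""
    return EPOCH + timedelta(milliseconds=ms)

# The limits from x_range=(0, 200), read as milliseconds
start = ms_to_datetime(0)
end = ms_to_datetime(200)
print(start.isoformat(), "to", end.isoformat())

# Position of an actual data point on the same scale
first_data_ms = (datetime(2022, 12, 1, tzinfo=timezone.utc) - EPOCH).total_seconds() * 1000
print(first_data_ms)
```

The visible window is only 0.2 seconds wide, while the data sits far to the right of it, which would explain the empty plot.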
Minimal example
This is your code for the figure except I removed the x_range.
import pandas as pd
from bokeh.plotting import figure, show, output_notebook
output_notebook()
x = pd.date_range('2022-12-01', '2022-12-24', freq='D')
y = list(range(1,25))
p = figure(
    plot_width=700,
    plot_height=700,
    # x_range=(0,200),
    title='Weather Evolution',
    x_axis_label='Date',
    y_axis_label='Precip',
    x_axis_type='datetime'
)
p.line(x,y, legend_label="Evolution", line_width=2)
show(p)
Bokeh default x_range
x_range by user
Comment
If you want to set the x_range for an axis of type "datetime", you can pass timestamp objects to it.
Valid values include, among others (e.g. float):
# datetime
from datetime import datetime
x_range=(datetime(2022,12, 7),datetime(2022,12, 10))
# pandas
import pandas as pd
x_range=(pd.Timestamp('2022-12-07'),pd.Timestamp('2022-12-10'))
Related
I get two different results when I use Bokeh's circle (or diamond_cross) function and the line function. The line function includes negative values and the circle function does not.
Plot with line
and a plot with diamond_cross
I want to plot the temperatures for a certain place over a timespan. I have a lot of values, so I would like to make a scatter plot with Bokeh.
I also get the same problem when I use the x function.
In my code below, you can replace diamond_cross with line and remove fill_alpha and size; you will then probably also get two different graphs.
import pandas as pd
import numpy as np
import bokeh as bk
import scipy.special
from bokeh.plotting import figure, output_file, show
from bokeh.models import ColumnDataSource
from bokeh.models.tools import HoverTool
from bokeh.models.glyphs import Quad
from bokeh.layouts import gridplot
df = pd.read_csv('KNMI2.csv', sep=';',
                 usecols= ['YYYYMMDD','Techt', 'YYYY', 'MM', 'D'])
jan= df[df['MM'].isin(['1'])]
source_jan = ColumnDataSource(jan)
p = figure(plot_width = 800, plot_height = 800,
           x_range=(0,32), y_range=(-20,20))
p.diamond_cross(x='D', y='Techt', source=source_jan,
                fill_alpha=0.2, size=2)
p.title.text = 'Temperatuur per uur vanaf 1951 tot 2019'
p.xaxis.axis_label = 'januari'
p.yaxis.axis_label = 'Temperatuur (C)'
show(p)
If the circle/diamond_cross functions worked the same way as the line function, their plots would also show negative values.
I had a similar issue where the data type of the variable I was trying to plot was string instead of int.
Try using
jan = df[df['MM'].isin(['1'])]
jan['Techt'] = jan['Techt'].astype(int)
source_jan = ColumnDataSource(jan)
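If the column might hold decimals or stray non-numeric entries, astype(int) would raise an error. A more forgiving variant, sketched below on made-up data (the small DataFrame only stands in for KNMI2.csv, whose real contents I don't know), uses pd.to_numeric and takes a copy of the filtered frame first:

```python
import pandas as pd

# Hypothetical stand-in for the KNMI data, with every column read as strings
df = pd.DataFrame({'MM': ['1', '1', '1', '2'],
                   'D': ['1', '2', '3', '1'],
                   'Techt': ['-3', '5', '-12', '7']})

# Copy the filtered rows so the assignments below do not touch a view of df
jan = df[df['MM'].isin(['1'])].copy()

# Convert the plotted columns; values that cannot be parsed become NaN
for col in ['D', 'Techt']:
    jan[col] = pd.to_numeric(jan[col], errors='coerce')

print(jan.dtypes)
```

After this, jan can be passed to ColumnDataSource as before.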
I'm using the datetime axis of Bokeh. In the Bokeh data source, my x values are in NumPy datetime format and the others are y numbers. I'm looking for a way to show the x-axis datetime label right below each point. I want Bokeh to show the exact datetime that I provided via my data source, not some approximation! For instance, I provide 5:15:00 and it shows 5:00:00 somewhere before the related point. I plan to stream data to the chart every hour, and I want to show 5 points each time. Therefore, I need 5 datetime labels. How can I do that? I tried p.yaxis[0].ticker.desired_num_ticks = 5 but it didn't help. Bokeh still shows as many ticks as it wants! Here is my code and result:
import math
import numpy as np
from bokeh.models import DatetimeTickFormatter
from bokeh.models.sources import ColumnDataSource
from bokeh.plotting import figure
from bokeh.io import show
from bokeh.palettes import Category10
p = figure(x_axis_type="datetime", plot_width=800, plot_height=500)
data = {'x':
        [np.datetime64('2019-01-26T03:15:10'),
         np.datetime64('2019-01-26T04:15:10'),
         np.datetime64('2019-01-26T05:15:10'),
         np.datetime64('2019-01-26T06:15:10'),
         np.datetime64('2019-01-26T07:15:10')],
        'A': [10,25,15,55,40],
        'B': [60,50,80,65,120],}
source = ColumnDataSource(data=data)
cl = Category10[3][1:]
r11 = p.line(source=source, x='x', y='A', color=cl[0], line_width=3)
r12 = p.line(source=source, x='x', y='B', color=cl[1], line_width=3)
p.xaxis.formatter=DatetimeTickFormatter(
    seconds=["%H:%M:%S"],
    minsec=["%H:%M:%S"],
    minutes=["%H:%M:%S"],
    hourmin=["%H:%M:%S"],
    hours=["%H:%M:%S"],
    days=["%H:%M:%S"],
    months=["%H:%M:%S"],
    years=["%H:%M:%S"],
)
p.y_range.start = -100
p.x_range.range_padding = 0.1
p.yaxis[0].ticker.desired_num_ticks = 5
p.xaxis.major_label_orientation = math.pi/2
show(p)
and here is the result:
As stated in the docs, desired_num_ticks is only a suggestion. If you want ticks at specific locations that do not change, then you can use a FixedTicker, which can be set with a plain list as a convenience:
p.xaxis.ticker = [2, 3.5, 4]
For datetimes, you would pass the values as milliseconds since epoch.
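As a rough sketch of that conversion for the five timestamps in the question: the helper to_epoch_ms is a made-up name, and it assumes the naive datetimes should be read as UTC, so that the tick positions line up with how the data values are encoded.

```python
from datetime import datetime, timezone

def to_epoch_ms(dt):
    """Milliseconds since the Unix epoch for a naive datetime read as UTC (an assumption)."""
    return dt.replace(tzinfo=timezone.utc).timestamp() * 1000

# The five timestamps from the question's data source
tick_times = [datetime(2019, 1, 26, hour, 15, 10) for hour in range(3, 8)]
ticks_ms = [to_epoch_ms(t) for t in tick_times]
print(ticks_ms)

# With the figure p from the question, the list would then be assigned as:
# p.xaxis.ticker = ticks_ms
```

When streaming new points, the same list would have to be rebuilt and reassigned each time, since the tick locations are fixed.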
If you want a fixed number of ticks, but the locations may change (i.e. because the range may change), then there is nothing built in to do that. You could make a custom ticker extension.
If I run the following, it appears to work as expected, but the y-axis is limited to the earliest and latest times in the data. I want it to show midnight to midnight. I thought I could do that with the code that's commented out. But when I uncomment it, I get the correct y-axis, yet nothing plots. Where am I going wrong?
from datetime import datetime
import matplotlib.pyplot as plt
data = ['2018-01-01 09:28:52', '2018-01-03 13:02:44', '2018-01-03 15:30:27', '2018-01-04 11:55:09']
x = []
y = []
for i in range(0, len(data)):
    t = datetime.strptime(data[i], '%Y-%m-%d %H:%M:%S')
    x.append(t.strftime('%Y-%m-%d')) # X-axis = date
    y.append(t.strftime('%H:%M:%S')) # Y-axis = time
plt.plot(x, y, '.')
# begin = datetime.strptime('00:00:00', '%H:%M:%S').strftime('%H:%M:%S')
# end = datetime.strptime('23:59:59', '%H:%M:%S').strftime('%H:%M:%S')
# plt.ylim(begin, end)
plt.show()
Edit: I also noticed that the x-axis isn't right either. The data skips Jan 2, but I want that on the axis so the data is to scale.
This is a dramatically simplified version of code dealing with over a year's worth of data with over 2,500 entries.
If Pandas is available to you, consider this approach:
import pandas as pd
data = pd.to_datetime(data, yearfirst=True)
plt.plot(data.date, data.time)
_=plt.ylim(["00:00:00", "23:59:59"])
Update per comments
X-axis date formatting can be adjusted using the Locator and Formatter methods of the matplotlib.dates module. Locator finds the tick positions, and Formatter specifies how you want the labels to appear.
Sometimes Matplotlib/Pandas just gets it right, other times you need to call out exactly what you want using these extra methods. In this case, I'm not sure why those numbers are showing up, but this code will remove them.
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
f, ax = plt.subplots()
data = pd.to_datetime(data, yearfirst=True)
ax.plot(data.date, data.time)
ax.set_ylim(["00:00:00", "23:59:59"])
days = mdates.DayLocator()
d_fmt = mdates.DateFormatter('%m-%d')
ax.xaxis.set_major_locator(days)
ax.xaxis.set_major_formatter(d_fmt)
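A pandas-free alternative, offered only as a sketch: the strings produced by strftime in the question are, as far as I can tell, treated by Matplotlib as categories rather than positions on a continuous axis, which would explain both the missing Jan 2 and the empty plot after setting ylim. Plotting real date objects against the time of day as fractional hours avoids that. The function names below are made up for illustration.

```python
from datetime import datetime

data = ['2018-01-01 09:28:52', '2018-01-03 13:02:44',
        '2018-01-03 15:30:27', '2018-01-04 11:55:09']

def split_date_and_hours(stamps):
    """Return (dates, hours): calendar dates and time of day as fractional hours."""
    dates, hours = [], []
    for s in stamps:
        t = datetime.strptime(s, '%Y-%m-%d %H:%M:%S')
        dates.append(t.date())
        hours.append(t.hour + t.minute / 60 + t.second / 3600)
    return dates, hours

def plot_day_vs_time(dates, hours):
    """Scatter time of day against date with a midnight-to-midnight y-axis."""
    import matplotlib.pyplot as plt
    import matplotlib.dates as mdates
    fig, ax = plt.subplots()
    ax.plot(dates, hours, '.')
    ax.set_ylim(0, 24)                                 # midnight to midnight
    ax.set_yticks(range(0, 25, 3))
    ax.xaxis.set_major_locator(mdates.DayLocator())    # one tick per day, gaps included
    ax.xaxis.set_major_formatter(mdates.DateFormatter('%m-%d'))
    plt.show()

dates, hours = split_date_and_hours(data)
print(list(zip(dates, hours)))
```

Calling plot_day_vs_time(dates, hours) draws the chart. With over a year of data, a coarser locator such as mdates.MonthLocator() would probably be a better fit than DayLocator.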
I am plotting time series using pandas .plot() and want to see every month shown as an x-tick.
Here is the dataset structure
Here is the result of the .plot()
I was trying to use examples from other posts and matplotlib documentation and do something like
ax.xaxis.set_major_locator(
    dates.MonthLocator(revenue_pivot.index, bymonthday=1,interval=1))
But that removed all the ticks :(
I also tried to pass xticks = df.index, but it has not changed anything.
What would be the right way to show more ticks on the x-axis?
No need to pass any args to MonthLocator. Make sure to use x_compat in the df.plot() call, per @Rotkiv's answer.
import pandas as pd
import numpy as np
import matplotlib.pylab as plt
import matplotlib.dates as mdates
df = pd.DataFrame(np.random.rand(100,2), index=pd.date_range('1-1-2018', periods=100))
ax = df.plot(x_compat=True)
ax.xaxis.set_major_locator(mdates.MonthLocator())
plt.show()
formatted x-axis with set_major_locator
unformatted x-axis
You could also format the x-axis ticks and labels of a pandas DateTimeIndex "manually" using the attributes of a pandas Timestamp object.
I found that much easier than using locators from matplotlib.dates which work on other datetime formats than pandas (if I am not mistaken) and thus sometimes show an odd behaviour if dates are not converted accordingly.
Here's a generic example that shows the first day of each month as a label based on attributes of pandas Timestamp objects:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# data
dim = 8760
idx = pd.date_range('1/1/2000 00:00:00', freq='h', periods=dim)
df = pd.DataFrame(np.random.randn(dim, 2), index=idx)
# select tick positions based on timestamp attribute logic. see:
# https://pandas.pydata.org/pandas-docs/stable/generated/pandas.Timestamp.html
positions = [p for p in df.index
             if p.hour == 0
             and p.is_month_start
             and p.month in range(1, 13, 1)]
# for date formatting, see:
# https://docs.python.org/2/library/datetime.html#strftime-and-strptime-behavior
labels = [l.strftime('%m-%d') for l in positions]
# plot with adjusted labels
ax = df.plot(kind='line', grid=True)
ax.set_xlabel('Time (h)')
ax.set_ylabel('Foo (Bar)')
ax.set_xticks(positions)
ax.set_xticklabels(labels)
plt.show()
yields:
Hope this helps!
The right way to do that is described here:
Using the x_compat parameter, it is possible to suppress automatic tick resolution adjustment.
df.A.plot(x_compat=True)
If you want to just show more ticks, you can also dive deep into the structure of pd.plotting._converter:
dai = ax.xaxis.minor.formatter.plot_obj.date_axis_info
dai['fmt'][dai['fmt'] == b''] = b'%b'
After plotting, the formatter is a TimeSeries_DateFormatter and _set_default_format has been called, so self.plot_obj.date_axis_info is not None. You can now manipulate the structured array .date_axis_info to your liking, namely so that it contains fewer b'' and more b'%b' entries.
Remove tick labels:
ax = df.plot(x='date', y=['count'])
every_nth = 10
for n, label in enumerate(ax.xaxis.get_ticklabels()):
    if n % every_nth != 0:
        label.set_visible(False)
Lower every_nth to include more labels, raise to keep fewer.
Using Bokeh 0.8.1, how can I display a long time series but start 'zoomed in' on one part, while keeping the rest of the data available for scrolling?
For instance, considering the following time series (IBM stock price since 1980), how could I get my chart to initially display only the price since 01/01/2014?
Example code :
import pandas as pd
import bokeh.plotting as bk
from bokeh.models import ColumnDataSource
bk.output_notebook()
TOOLS="pan,wheel_zoom,box_zoom,reset,save"
# Quandl data, too lazy to generate some random data
df = pd.read_csv('https://www.quandl.com/api/v1/datasets/GOOG/NYSE_IBM.csv')
df['Date'] = pd.to_datetime(df['Date'])
df = df[['Date', 'Close']]
#Generating a bokeh source
source = ColumnDataSource()
dtest = {}
for col in df:
    dtest[col] = df[col]
source = ColumnDataSource(data=dtest)
# plotting stuff !
p = bk.figure(title='title', tools=TOOLS,x_axis_type="datetime", plot_width=600, plot_height=300)
p.line(y='Close', x='Date', source=source)
bk.show(p)
outputs :
But I want to get this (which you can achieve with the box-zoom tool, but I'd like to start like this immediately):
So, it looks like (as of 0.8.1) we need to add some more convenient ways to set ranges with datetime values. That said, although this is a bit ugly, it does currently work for me:
import time, datetime
x_range = (
    time.mktime(datetime.datetime(2014, 1, 1).timetuple())*1000,
    time.mktime(datetime.datetime(2016, 1, 1).timetuple())*1000
)
p = bk.figure(
    title='title', tools=TOOLS,x_axis_type="datetime",
    plot_width=600, plot_height=300, x_range=x_range
)