Bokeh plot time series - python

I'm trying to plot the following dataframe with Bokeh (data_frame in the code). In my example I only have two columns, 0 and 1, plus Date, which is the x-axis. My real dataset has more than 10 columns, so I'm looking for a better version than mine, which does not generalize well. (I thought of a for loop, but it doesn't seem optimal.)
import pandas as pd
from bokeh.plotting import figure, show
from bokeh.charts import TimeSeries
from bokeh.io import output_notebook
output_notebook()
data_frame = pd.DataFrame({0: [0.17, 0.189, 0.185, 0.1657], 1: [0.05, 0.0635, 0.0741, 0.0925], 'Date': [2004, 2005, 2006, 2007]})
p = figure(x_axis_label = 'date',
y_axis_label='Topics Distribution')
p.circle(data_frame.Date, data_frame.iloc[:, 0])
p.circle(data_frame.Date, data_frame.iloc[:, 1])
show(p)
I've also tried this, but it does not work, and I want points only, not lines:
p = TimeSeries(data_frame, index='Date', legend=True,
title = 'T', ylabel='topics distribution')
Thanks for your help!

Let's try a different approach and see if this makes a little more sense:
Reshape the data to be in a "tidy" data format
Use Bokeh high-level Scatter chart with color argument
Code:
chartdata = data_frame.set_index('Date').stack().reset_index().rename(columns={'level_1':'Category',0:'Value'})
print(chartdata)
Output "tidy" data format:
   Date  Category   Value
0  2004         0  0.1700
1  2004         1  0.0500
2  2005         0  0.1890
3  2005         1  0.0635
4  2006         0  0.1850
5  2006         1  0.0741
6  2007         0  0.1657
7  2007         1  0.0925
Build chart:
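# bokeh.charts ships with older Bokeh releases; newer ones moved it to the separate bkcharts package
from bokeh.charts import Scatter
p = Scatter(chartdata, x='Date', y='Value', color='Category', xlabel='date', ylabel='Topics Distribution')
show(p)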
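The tidy reshape can also be sketched with melt, which does not depend on how many value columns the dataframe has. This is an alternative to the stack chain, not what the answer used; the Category and Value names are kept from the answer, and the sort is only there to reproduce the same row order.

```python
import pandas as pd

data_frame = pd.DataFrame({
    0: [0.17, 0.189, 0.185, 0.1657],
    1: [0.05, 0.0635, 0.0741, 0.0925],
    'Date': [2004, 2005, 2006, 2007],
})

# melt keeps 'Date' as the identifier and turns every other column
# into (Category, Value) rows, however many columns there are
chartdata = (
    data_frame
    .melt(id_vars='Date', var_name='Category', value_name='Value')
    .sort_values(['Date', 'Category'])
    .reset_index(drop=True)
)
print(chartdata)
```

With more than 10 topic columns the call stays the same, since every column other than Date is treated as a value column.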

Related

How to plot groups of points on a map by associating them with the date of detection in Python

I'm trying to assess the displacement of a particular fish on the seabed according to seasonality. I would like to create a map with points colored by the month in which the detection occurred (e.g., all points from August in blue, all points from September in red, all points from October in yellow).
In my dataframe I have the coordinates of each point (LAT, LON) and the date of detection (Dates):
    LAT        LON         Dates
0   49.302005  -67.684971  2019-08-06
1   49.302031  -67.684960  2019-08-12
2   49.302039  -67.684983  2019-08-21
3   49.302039  -67.684979  2019-08-30
4   49.302041  -67.684980  2019-09-03
5   49.302041  -67.684983  2019-09-10
6   49.302042  -67.684979  2019-09-18
7   49.302043  -67.684980  2019-09-25
8   49.302045  -67.684980  2019-10-01
9   49.302045  -67.684983  2019-10-09
10  49.302048  -67.684979  2019-10-14
11  49.302049  -67.684981  2019-10-21
12  49.302049  -67.684982  2019-10-29
Would anyone know how to create this kind of map? I know how to create a simple map with all points, but I wonder how to plot points according to their date of detection.
Thank you very much
Here's one way to do it entirely with Pandas and matplotlib:
import pandas as pd
from matplotlib import pyplot as plt
# I'll just create some fake data for the example
df = pd.DataFrame(
    {
        "LAT": [49.2, 49.2, 49.3, 45.6, 46.8],
        "LON": [-67.7, -68.1, -65.2, -67.8, -67.4],
        "Dates": ["2019-08-06", "2019-08-03", "2019-07-17", "2019-06-12", "2019-05-29"],
    }
)
# add a column containing the months
df["Month"] = pd.DatetimeIndex(df["Dates"]).month
# make a scatter plot with the colour based on the month
fig, ax = plt.subplots()
ax = df.plot.scatter(x="LAT", y="LON", c="Month", ax=ax, colormap="viridis")
plt.show()
If you want the months as names rather than indexes, and a slightly more fancy plot (e.g., with a legend labelling the dates) using seaborn, you could do:
import seaborn as sns
# get month as name
df["Month"] = pd.to_datetime(df["Dates"]).dt.strftime("%b")
fig, ax = plt.subplots()
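# pass the dataframe by keyword so the call does not depend on seaborn's positional argument order
sns.scatterplot(data=df, x="LAT", y="LON", hue="Month", ax=ax)
plt.show()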
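If the exact colours from the question matter (August blue, September red, October yellow), one possible sketch is to draw each month as its own scatter call. The month_colors and month_names dictionaries are hypothetical helpers, not part of the answer, and the rows are a subset of the question's table. Longitude is put on the x-axis here, as on a map.

```python
import pandas as pd
import matplotlib.pyplot as plt

# a few rows copied from the question's table
df = pd.DataFrame({
    "LAT": [49.302005, 49.302031, 49.302041, 49.302043, 49.302045, 49.302049],
    "LON": [-67.684971, -67.684960, -67.684980, -67.684980, -67.684980, -67.684982],
    "Dates": ["2019-08-06", "2019-08-12", "2019-09-03", "2019-09-25", "2019-10-01", "2019-10-29"],
})
df["Month"] = pd.to_datetime(df["Dates"]).dt.month

# hypothetical month -> colour mapping, taken from the example in the question
month_colors = {8: "blue", 9: "red", 10: "yellow"}
month_names = {8: "Aug", 9: "Sep", 10: "Oct"}

fig, ax = plt.subplots()
# one scatter call per month gives each month its own colour and legend entry
for month, group in df.groupby("Month"):
    ax.scatter(group["LON"], group["LAT"],
               color=month_colors[month], label=month_names[month])
ax.set_xlabel("LON")
ax.set_ylabel("LAT")
ax.legend(title="Month")
```

Call plt.show() afterwards to display the figure. A month that is missing from month_colors raises a KeyError, so the mapping has to cover every month in the data.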

Pandas: Plotting / annotating from DataFrame

There is this boring dataframe with stock data I have:
date close MA100 buy sell
2022-02-14 324.95 320.12 0 0
2022-02-13 324.87 320.11 1 0
2022-02-12 327.20 321.50 0 0
2022-02-11 319.61 320.71 0 1
Then I am plotting the prices
import pandas as pd
import matplotlib.pyplot as plt
df = ...
df['close'].plot()
df['MA100'].plot()
plt.show()
So far so good...
Then I'd like to show a marker on the chart if there was a buy (green) or a sell (red) on that day.
It's just to highlight that there was a transaction on that day. The exact intraday price at which the trade happened is not important.
So the x/y coordinates could be the date and the close wherever there is a 1 in the buy (or sell) column.
I am not sure how to implement this.
Would I need a loop to iterate over all rows where buy = 1 (sell = 1) and then somehow add these matches to the plot (probably with annotate)?
I'd really appreciate it if someone could point me in the right direction!
You can query the data frame for sell/buy and scatter plot:
fig, ax = plt.subplots()
df.plot(x='date', y=['close', 'MA100'], ax=ax)
df.query("buy==1").plot.scatter(x='date', y='close', c='g', ax=ax)
df.query("sell==1").plot.scatter(x='date', y='close', c='r', ax=ax)
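The same idea can be sketched with plain matplotlib calls and boolean masks. This version assumes the date column is parsed with pd.to_datetime first, so the lines and the markers share one datetime x-axis; the marker shapes and labels are illustrative choices, and the data is the sample from the question.

```python
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({
    "date": ["2022-02-14", "2022-02-13", "2022-02-12", "2022-02-11"],
    "close": [324.95, 324.87, 327.20, 319.61],
    "MA100": [320.12, 320.11, 321.50, 320.71],
    "buy": [0, 1, 0, 0],
    "sell": [0, 0, 0, 1],
})
# real dates put the lines and the markers on the same x scale
df["date"] = pd.to_datetime(df["date"])
df = df.sort_values("date")

# boolean masks select the transaction days, no explicit loop needed
buys = df[df["buy"] == 1]
sells = df[df["sell"] == 1]

fig, ax = plt.subplots()
ax.plot(df["date"], df["close"], label="close")
ax.plot(df["date"], df["MA100"], label="MA100")
# zorder=3 draws the markers on top of the lines
ax.scatter(buys["date"], buys["close"], color="green", marker="^", label="buy", zorder=3)
ax.scatter(sells["date"], sells["close"], color="red", marker="v", label="sell", zorder=3)
ax.legend()
fig.autofmt_xdate()
```

Call plt.show() to display the result. No annotate call is needed unless text labels are wanted next to the markers.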

python stack stacked bar plot for group by values

I've clustered a Facebook dataset and labeled each row with its cluster, so the data frame looks like this
I would like to plot the data as a stacked bar chart,
so I grouped the data like this:
dfff=x_df.groupby("cluster")["page_type"].value_counts()
and the output looks like this:
cluster  page_type
0        government    5387
         company       3231
         politician    3149
         tvshow        1679
1        government     563
         company          9
         politician       2
2        company       3255
         politician    2617
         tvshow        1648
         government     930
Name: page_type, dtype: int64
So how can I plot this series as a stacked bar chart with 3 bars (0, 1, 2), one per cluster?
To produce a stacked bar plot, call .unstack() on the groupby result, dfff.
See the pandas User Guide: Visualization.
import pandas as pd
import matplotlib.pyplot as plt
# given dfff, the Series returned by the groupby
dfp = dfff.unstack()
# display(dfp)
# page_type  company  government  politician  tvshow
# cluster
# 0           3231.0      5387.0      3149.0  1679.0
# 1              9.0       563.0         2.0     NaN
# 2           3255.0       930.0      2617.0  1648.0
# plot stacked bar
dfp.plot.bar(stacked=True)
plt.legend(bbox_to_anchor=(1.05, 1), loc='upper left')
Seaborn look
import matplotlib.pyplot as plt
# set style parameter (matplotlib 3.6 and later name this style 'seaborn-v0_8')
plt.style.use('seaborn')
# plot stacked bar
dfp.plot.bar(stacked=True)
plt.legend(bbox_to_anchor=(1.05, 1), loc='upper left')
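For a version that runs without the original x_df, the counts from the question can be rebuilt by hand. The sketch below assumes a Series with a (cluster, page_type) MultiIndex, which is what the groupby returns, and uses unstack(fill_value=0) so the missing (1, tvshow) pair becomes 0 instead of NaN. The fill_value choice is an addition, not part of the answer.

```python
import pandas as pd

# counts copied from the question's output, keyed by (cluster, page_type)
counts = {
    (0, "government"): 5387, (0, "company"): 3231,
    (0, "politician"): 3149, (0, "tvshow"): 1679,
    (1, "government"): 563, (1, "company"): 9, (1, "politician"): 2,
    (2, "company"): 3255, (2, "politician"): 2617,
    (2, "tvshow"): 1648, (2, "government"): 930,
}
dfff = pd.Series(counts)                        # tuple keys become a MultiIndex
dfff.index.names = ["cluster", "page_type"]

# fill_value=0 replaces the NaN for the missing (1, tvshow) pair
# and keeps the counts as integers
dfp = dfff.unstack(fill_value=0)

# one bar per cluster, one stacked segment per page_type
ax = dfp.plot.bar(stacked=True)
ax.legend(bbox_to_anchor=(1.05, 1), loc="upper left")
```

Each row of dfp becomes one bar and each column one stacked segment, which gives the three bars (0, 1, 2) asked for in the question.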

How to turn off order of the line graph in plotly Python?

I want to create a line chart using Plotly. I have 3 variables (date, shift, runt). I want to plot runt against date, and I also want to display the shift.
Dataframe:
What I want is a line chart with both date and shift on the x-axis.
This is what I got from Excel; I want to plot the same graph in Python.
But I can't use two values for x. I tried to concatenate the date and shift into one column, but it shows all the day values first and then the night values.
import plotly.express as px
fig = px.line(df, x="Day-Shift", y="RUNT", title='Yo',template="plotly_dark")
fig.show()
Is there any way to turn off this ordering? What I want is shown in the Excel graph above.
I've created a column that combines the date and the shift and specified it on the x-axis. Does this meet the intent of your question?
import pandas as pd
import numpy as np
import io
data = '''
Date Shift RUNT
0 June-16 Day 350
1 June-16 Night 20
2 June-17 Day 350
3 June-17 Night 20
4 June-18 Day 350
5 June-18 Night 20
6 June-19 Day 350
7 June-19 Night 20
8 June-20 Day 350
9 June-20 Night 20
10 June-21 Day 350
11 June-21 Night 20
'''
# raw string so that \s is not treated as an (invalid) string escape
df = pd.read_csv(io.StringIO(data), sep=r'\s+')
df['Day-Shift'] = df['Date'].str.cat(df['Shift'], sep='-')
import plotly.express as px
fig = px.line(df, x="Day-Shift", y="RUNT", title='Yo',template="plotly_dark")
fig.show()
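The answer's data is already interleaved Day/Night per date. If the real dataframe instead lists all Day rows before the Night rows, which is one plausible reading of the question, a sketch like the following reorders the rows before building the combined column. It assumes the chart follows row order for a text x-axis, and that the Date strings look like "June-16"; the _when column is a hypothetical helper.

```python
import pandas as pd

# hypothetical input where all Day rows come before the Night rows
df = pd.DataFrame({
    "Date": ["June-16", "June-17", "June-18", "June-16", "June-17", "June-18"],
    "Shift": ["Day", "Day", "Day", "Night", "Night", "Night"],
    "RUNT": [350, 350, 350, 20, 20, 20],
})

# parse the month-day text so sorting is chronological, not alphabetical
# (the data has no year, so the parsed dates all share a default year)
df["_when"] = pd.to_datetime(df["Date"], format="%B-%d")
df = (
    df.sort_values(["_when", "Shift"])   # "Day" sorts before "Night"
      .drop(columns="_when")
      .reset_index(drop=True)
)
df["Day-Shift"] = df["Date"].str.cat(df["Shift"], sep="-")
print(df["Day-Shift"].tolist())
```

The reordered df can then be passed to px.line exactly as in the answer.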

MatPlotLib Barchart is not being produced while tried to be shown with a pie chart in same figure

I am trying to show a bar chart above a pie chart in the same figure using matplotlib. The code is as follows:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
df = pd.read_csv('stats.csv')
agg_df = df.groupby(['Area','Sex']).sum()
agg_df.reset_index(inplace=True)
piv_df = agg_df.pivot(index='Area', columns='Sex', values='Count')
plt.figure(1)
plt.subplot(211)
piv_df.plot.bar(stacked=True)
df = pd.read_csv('stats.csv', delimiter=',', encoding="utf-8-sig")
df=df.loc[df['"Year"']==2015]
agg_df = df.groupby(['Sex']).sum()
agg_df.reset_index(inplace=True)
plt.subplot(212)
plt.pie(agg_df["Count"],labels=agg_df["Sex"],autopct='%1.1f%%',startangle=90)
plt.show()
After execution, there are two problems:
The bar chart is not being produced
The bar chart is in figure 1 and the pie chart is in figure 2
If I execute the bar chart code and the pie chart code separately, they work fine.
Here is the sample dataframe:
Year Sex Area Count
2015 W Dhaka 6
2015 M Dhaka 3
2015 W Khulna 1
2015 M Khulna 8
2014 M Dhaka 13
2014 W Dhaka 20
2014 M Khulna 9
2014 W Khulna 6
2013 W Dhaka 11
2013 M Dhaka 2
2013 W Khulna 8
2013 M Khulna 5
2012 M Dhaka 12
2012 W Dhaka 4
2012 W Khulna 7
2012 M Khulna 1
and the bar chart output is as follows:
What can possibly be the problem here? Seeking help from matplotlib experts.
You have to pass the axes to the pandas plotting function with the ax parameter so that it knows where to draw. (In the snippet below I use the code from the question, but I replaced the code that computes the dataframes with the resulting dataframes hardcoded. As this question is about figures, it is not important how we obtain these dataframes, and the new version is easier to reproduce.)
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
piv_df = pd.DataFrame([[3, 6], [8, 1]],
columns=pd.Series(['M', 'W'], name='Sex'),
index=pd.Series(['Dhaka', 'Khulna'], name='Area'))
fig = plt.figure()
ax1 = fig.add_subplot(211)
piv_df.plot.bar(stacked=True, ax=ax1)
agg_df = pd.DataFrame({'Count': {0: 11, 1: 7},
'Sex': {0: 'M', 1: 'W'},
'Year': {0: 4030, 1: 4030}})
ax2 = fig.add_subplot(212)
ax2.pie(agg_df["Count"], labels=agg_df["Sex"], autopct='%1.1f%%',
        startangle=90)
plt.show()
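A variant of the same fix, sketched with plt.subplots and the 2015 rows of the question's sample data instead of hardcoded results. The names ax_bar, ax_pie and by_sex are illustrative, and pivot_table is used here as a one-step replacement for the question's groupby plus pivot.

```python
import pandas as pd
import matplotlib.pyplot as plt

# the 2015 rows from the question's sample data
df = pd.DataFrame({
    "Year": [2015, 2015, 2015, 2015],
    "Sex": ["W", "M", "W", "M"],
    "Area": ["Dhaka", "Dhaka", "Khulna", "Khulna"],
    "Count": [6, 3, 1, 8],
})

# one figure with two axes stacked vertically
fig, (ax_bar, ax_pie) = plt.subplots(2, 1, figsize=(6, 8))

# pivot_table sums Count per (Area, Sex) in one step
piv_df = df.pivot_table(index="Area", columns="Sex", values="Count", aggfunc="sum")
# ax= keeps pandas from opening a new figure for the bar chart
piv_df.plot.bar(stacked=True, ax=ax_bar)

by_sex = df.groupby("Sex")["Count"].sum()
ax_pie.pie(by_sex, labels=by_sex.index, autopct="%1.1f%%", startangle=90)
fig.tight_layout()
```

Call plt.show() to display it. The resulting numbers match the hardcoded piv_df and agg_df in the answer (M: 11, W: 7).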
