result

Year  Month  Min_days  Avg_days  Median_days  Count  MonthName-Year
2015  1      9         12.56     10           4      2015-Jan
2015  2      10        13.67     9            3      2015-Feb
...
2016  12     12        15.788    19           2      2016-Dec
...
I created a line plot of Min_days, Avg_days, Median_days and Count against month and year. The code I used for that (which works perfectly):
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib import dates as mdates
result = freq_start_month_year_to_date_1(df,'Jan','2015','Dec','2019')
# Build a real datetime column (first day of each month) for the x-axis
result['Date'] = pd.to_datetime(result[['Year', 'Month']].assign(Day=1))
# Plot the data
fig, ax = plt.subplots(figsize=(10, 2))
for col in ['Min_days','Avg_days','Median_days','Count']:
    ax.plot(result['Date'], result[col], label=col)
years = mdates.YearLocator() # only print label for the years
months = mdates.MonthLocator() # mark months as ticks
years_fmt = mdates.DateFormatter('%Y')
ax.xaxis.set_major_locator(years)
ax.xaxis.set_minor_locator(months)
ax.xaxis.set_major_formatter(years_fmt)
ax.legend(loc='center left', bbox_to_anchor=(1, 0.5))
Now I want to be able to see the data while hovering over it in the plot. Any idea how to do that?
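One possible approach, sketched here under assumptions, is to connect a handler to matplotlib's "motion_notify_event" and show an annotation when the cursor is over a line. The helper name attach_hover and the three-row stand-in for result are illustrative only and not part of the original code.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Three rows copied from the table above, standing in for the real `result`
result = pd.DataFrame({
    "Year": [2015, 2015, 2016],
    "Month": [1, 2, 12],
    "Min_days": [9, 10, 12],
    "Avg_days": [12.56, 13.67, 15.788],
    "Median_days": [10, 9, 19],
    "Count": [4, 3, 2],
})
result["Date"] = pd.to_datetime(result[["Year", "Month"]].assign(Day=1))

fig, ax = plt.subplots(figsize=(10, 2))
cols = ["Min_days", "Avg_days", "Median_days", "Count"]
# Keep the Line2D objects so the hover handler can query them
lines = [ax.plot(result["Date"], result[c], marker="o", label=c)[0] for c in cols]
ax.legend(loc="center left", bbox_to_anchor=(1, 0.5))


def attach_hover(fig, ax, lines):
    """Hypothetical helper: annotate the data point under the mouse cursor."""
    annot = ax.annotate("", xy=(0, 0), xytext=(10, 10),
                        textcoords="offset points",
                        bbox=dict(boxstyle="round", fc="w"))
    annot.set_visible(False)

    def on_move(event):
        if event.inaxes is not ax:
            return
        for line in lines:
            hit, info = line.contains(event)
            if hit:
                # Index of the point (or segment start) under the cursor
                i = info["ind"][0]
                x = line.get_xdata(orig=False)[i]  # dates as plain floats
                y = line.get_ydata(orig=False)[i]
                annot.xy = (x, y)
                annot.set_text(f"{line.get_label()}: {y:g}")
                annot.set_visible(True)
                fig.canvas.draw_idle()
                return
        # Cursor is not over any line: hide a previously shown annotation
        if annot.get_visible():
            annot.set_visible(False)
            fig.canvas.draw_idle()

    fig.canvas.mpl_connect("motion_notify_event", on_move)
    return annot


annot = attach_hover(fig, ax, lines)
```

Calling plt.show() afterwards with an interactive backend should display the values on hover. The third-party mplcursors package provides similar behaviour with less code, if adding a dependency is acceptable.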
I'm trying to assess the displacement of a particular fish on the seabed according to seasonality. To do so, I would like to create a map with points colored by the month in which the detection occurred (e.g., all points from August in blue, all points from September in red, all points from October in yellow).
In my dataframe I have the coordinates of each point (LAT, LON) and the date of detection (Dates):
    LAT        LON         Dates
0   49.302005  -67.684971  2019-08-06
1   49.302031  -67.684960  2019-08-12
2   49.302039  -67.684983  2019-08-21
3   49.302039  -67.684979  2019-08-30
4   49.302041  -67.684980  2019-09-03
5   49.302041  -67.684983  2019-09-10
6   49.302042  -67.684979  2019-09-18
7   49.302043  -67.684980  2019-09-25
8   49.302045  -67.684980  2019-10-01
9   49.302045  -67.684983  2019-10-09
10  49.302048  -67.684979  2019-10-14
11  49.302049  -67.684981  2019-10-21
12  49.302049  -67.684982  2019-10-29
Would anyone know how to create this kind of map? I know how to create a simple map with all points, but I wonder how to color the points according to their date of detection.
Thank you very much
Here's one way to do it entirely with Pandas and matplotlib:
import pandas as pd
from matplotlib import pyplot as plt
# I'll just create some fake data for the example
df = pd.DataFrame(
    {
        "LAT": [49.2, 49.2, 49.3, 45.6, 467.8],
        "LON": [-67.7, -68.1, -65.2, -67.8, -67.4],
        "Dates": ["2019-08-06", "2019-08-03", "2019-07-17", "2019-06-12", "2019-05-29"]
    }
)
# add a column containing the months
df["Month"] = pd.DatetimeIndex(df["Dates"]).month
# make a scatter plot with the colour based on the month
fig, ax = plt.subplots()
ax = df.plot.scatter(x="LAT", y="LON", c="Month", ax=ax, colormap="viridis")
fig.show()
If you want the months as names rather than numbers, and a slightly fancier plot (e.g., with a legend labelling the months) using seaborn, you could do:
import seaborn as sns
# get month as name
df["Month"] = pd.to_datetime(df["Dates"]).dt.strftime("%b")
fig, ax = plt.subplots()
# pass the dataframe by keyword so this also works on older seaborn versions
sns.scatterplot(data=df, x="LAT", y="LON", hue="Month", ax=ax)
fig.show()
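If specific colours per month are wanted, as in the question (August in blue, September in red, October in yellow), a minimal sketch could map month numbers to colours explicitly and draw one scatter call per month. The month_style dictionary is an assumption made for illustration. The three rows are rows 0, 4 and 8 of the question's table. Longitude is put on the x-axis here so that the plot reads like a map.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Rows 0, 4 and 8 from the question's table
df = pd.DataFrame({
    "LAT": [49.302005, 49.302041, 49.302045],
    "LON": [-67.684971, -67.684980, -67.684980],
    "Dates": ["2019-08-06", "2019-09-03", "2019-10-01"],
})
df["Month"] = pd.to_datetime(df["Dates"]).dt.month

# Hypothetical mapping: month number -> (legend label, colour)
month_style = {8: ("Aug", "blue"), 9: ("Sep", "red"), 10: ("Oct", "yellow")}

fig, ax = plt.subplots()
for month, group in df.groupby("Month"):
    label, colour = month_style[month]
    # One scatter call per month gives each month its own legend entry
    ax.scatter(group["LON"], group["LAT"], color=colour, label=label)
ax.set_xlabel("Longitude")
ax.set_ylabel("Latitude")
legend = ax.legend(title="Month")
```

Months missing from month_style would raise a KeyError, so the dictionary needs an entry for every month present in the data.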
I want to plot one variable against another, with each month's data drawn in its own color.
My code and expected output:
import matplotlib.pyplot as plt
df
A B
2019-01-01 10 20
2019-01-02 20 30
2019-02-01 10 15
2019-02-02 20 40
2019-03-01 12 32
2019-03-02 5 14
plt.plot(df['A'],df['B'])
plt.show()
My current plot plots all the data as usual but I am expecting something different as given below.
My expected output:
You can do something like this:
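import pandas as pd
import matplotlib.pyplot as plt
markers = 'dsxo'  # one marker per month; extend the string for more than four months
months = pd.to_datetime(df.index).to_period('M')
for i, (k, d) in enumerate(df.groupby(months)):
    plt.plot(d['A'], d['B'], label=k, marker=markers[i])
plt.legend()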
Output:
Check this code:
import matplotlib.pyplot as plt
import pandas as pd
df = pd.read_csv('data.csv')
df['Month'] = df.index.map(lambda x: x[:-3])
fig, ax = plt.subplots(1, 1, figsize = (6, 6))
for month in df['Month'].unique():
    ax.plot(df[df['Month'] == month]['A'],
            df[df['Month'] == month]['B'],
            label = month)
plt.legend()
plt.show()
that gives this graph:
I've made a dataframe that has dates and 2 values that looks like:
Date Year Level Price
2008-01-01 2008 56 11
2008-01-03 2008 10 12
2008-01-05 2008 52 13
2008-02-01 2008 66 14
2008-05-01 2008 20 10
..
2009-01-01 2009 12 11
2009-02-01 2009 70 11
2009-02-05 2009 56 12
..
2018-01-01 2018 56 10
2018-01-11 2018 10 17
..
I'm able to color the points by year by creating a year column with df['Year'] = df['Date'].dt.year, but I also want a label for each year in the legend.
My code right now for plotting by year looks like:
import matplotlib
import matplotlib.pyplot as plt
colors = ['turquoise','orange','red','mediumblue', 'orchid', 'limegreen']
fig = plt.figure(figsize=(15,10))
ax = fig.add_subplot(111)
ax.scatter(df['Price'], df['Level'], s=10, c=df['Year'], marker="o", label=df['Year'], cmap=matplotlib.colors.ListedColormap(colors))
plt.title('Title', fontsize=16)
plt.ylabel('Level', fontsize=14)
plt.xlabel('Price', fontsize=14)
plt.legend(loc='upper left', prop={'size': 12});
plt.show()
How can I adjust the labels in the legend to show the years? So far I have simply passed the Year column as the label, but that obviously just gives me results like this:
When you scatter your points, make sure that every column you access actually exists in your dataframe. Your plotting code accesses a column called 'Year', which is not in the dataframe unless you have created it first. This is the line in question:
ax.scatter(df['Price'], df['Level'], s=10, c=df['Year'], marker="o", label=df['Year'], cmap=matplotlib.colors.ListedColormap(colors))
In this line, the color argument (c) refers to that column, and the label argument does too. To solve this you need to create a column that contains the year:
Extract all the dates
Grab just the year from each date
Add this to your dataframe
Below is some code to implement these steps:
# Create an array of all the dates
dates = df.Date.values
# Create a list of all of the years using a list comprehension
# (str(d) starts with the four-digit year for date strings and datetime64 values alike)
years = [int(str(d).split('-')[0]) for d in dates]
# Add this column to your dataframe
df['Year'] = years
As well I would direct you to this course to learn more about plotting in python!
https://exlskills.com/learn-en/courses/python-data-modeling-intro-for-machine-learning-python_modeling_for_machine_learning/content
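For the legend question itself, one possible approach is to draw one scatter call per year, so that each year gets its own colour and legend entry. This is a sketch under assumptions, not the asker's or the answerer's method. The six sample rows are copied from the question's table.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Six rows copied from the question's table
df = pd.DataFrame({
    "Date": pd.to_datetime(["2008-01-01", "2008-01-03", "2009-01-01",
                            "2009-02-01", "2018-01-01", "2018-01-11"]),
    "Level": [56, 10, 12, 70, 56, 10],
    "Price": [11, 12, 11, 11, 10, 17],
})
df["Year"] = df["Date"].dt.year

colors = ['turquoise', 'orange', 'red', 'mediumblue', 'orchid', 'limegreen']
fig, ax = plt.subplots(figsize=(15, 10))
# zip stops at the shorter input, so years beyond len(colors) would be skipped
for color, (year, group) in zip(colors, df.groupby("Year")):
    ax.scatter(group["Price"], group["Level"], s=10, color=color,
               marker="o", label=str(year))
ax.set_title("Title", fontsize=16)
ax.set_xlabel("Price", fontsize=14)
ax.set_ylabel("Level", fontsize=14)
legend = ax.legend(loc="upper left", prop={"size": 12})
```

With more years than colours, the colour list would need extending, or a colormap could be sampled instead.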
I'm trying to plot the following dataframe with Bokeh (data_frame in the code). In my example I only have two data columns, 0 and 1 (plus Date, which is the x-axis), but my real dataset has more than 10, so I'm looking for something that generalizes better than my version. (I thought of a for loop, but it doesn't seem optimal.)
import pandas as pd
from bokeh.plotting import figure, show
from bokeh.charts import TimeSeries
from bokeh.io import output_notebook
output_notebook()
data_frame = pd.DataFrame({0: [0.17, 0.189, 0.185, 0.1657], 1: [0.05, 0.0635, 0.0741, 0.0925], 'Date': [2004, 2005, 2006, 2007]})
p = figure(x_axis_label='date',
           y_axis_label='Topics Distribution')
p.circle(data_frame.Date, data_frame.iloc[:, 0])
p.circle(data_frame.Date, data_frame.iloc[:, 1])
show(p)
I've tried this as well, but it does not work, and I want only points, not lines:
p = TimeSeries(data_frame, index='Date', legend=True,
               title='T', ylabel='topics distribution')
Thanks for your help!
Let's try a different approach and see if this makes a little more sense:
Reshape the data to be in a "tidy" data format
Use Bokeh high-level Scatter chart with color argument
Code:
chartdata = data_frame.set_index('Date').stack().reset_index().rename(columns={'level_1':'Category',0:'Value'})
print(chartdata)
Output "tidy" data format:
Date Category Value
0 2004 0 0.1700
1 2004 1 0.0500
2 2005 0 0.1890
3 2005 1 0.0635
4 2006 0 0.1850
5 2006 1 0.0741
6 2007 0 0.1657
7 2007 1 0.0925
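As an alternative to stack, the same tidy layout can probably be produced with DataFrame.melt, which names the new columns directly. This is only a sketch. The sort and index reset are there to reproduce the row order shown above.

```python
import pandas as pd

data_frame = pd.DataFrame({0: [0.17, 0.189, 0.185, 0.1657],
                           1: [0.05, 0.0635, 0.0741, 0.0925],
                           'Date': [2004, 2005, 2006, 2007]})

chartdata = (
    data_frame
    # every column except Date becomes a (Category, Value) pair
    .melt(id_vars='Date', var_name='Category', value_name='Value')
    # melt groups rows by category; sort to get them grouped by date instead
    .sort_values(['Date', 'Category'])
    .reset_index(drop=True)
)
print(chartdata)
```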
Build chart:
from bokeh.charts import Scatter
p = Scatter(chartdata, x='Date', y='Value', color='Category',xlabel='date', ylabel='Topics Distribution')
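To my knowledge, bokeh.charts was split out of Bokeh into the separate bkcharts package and is not available in current Bokeh releases, so the Scatter call above may not import there. A minimal sketch using only bokeh.plotting could loop over the value columns instead. It assumes a Bokeh version that supports the legend_label argument. The names value_columns and palette are illustrative only.

```python
import pandas as pd
from bokeh.palettes import Category10
from bokeh.plotting import figure

data_frame = pd.DataFrame({0: [0.17, 0.189, 0.185, 0.1657],
                           1: [0.05, 0.0635, 0.0741, 0.0925],
                           'Date': [2004, 2005, 2006, 2007]})

p = figure(x_axis_label='date', y_axis_label='Topics Distribution')

# Every column except Date is plotted as its own series of points
value_columns = [c for c in data_frame.columns if c != 'Date']
palette = Category10[10]
for i, col in enumerate(value_columns):
    p.scatter(data_frame['Date'], data_frame[col], size=8,
              color=palette[i % len(palette)],  # colours repeat after ten series
              legend_label=str(col))
```

Calling show(p) from bokeh.plotting afterwards should render the chart. The loop is short and handles any number of columns, which may be good enough even though a loop was what the question hoped to avoid.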
I am trying to show a bar chart above a pie chart in the same figure using matplotlib. The code is as follows:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
df = pd.read_csv('stats.csv')
agg_df = df.groupby(['Area','Sex']).sum()
agg_df.reset_index(inplace=True)
piv_df = agg_df.pivot(index='Area', columns='Sex', values='Count')
plt.figure(1)
plt.subplot(211)
piv_df.plot.bar(stacked=True)
df = pd.read_csv('stats.csv', delimiter=',', encoding="utf-8-sig")
df=df.loc[df['"Year"']==2015]
agg_df = df.groupby(['Sex']).sum()
agg_df.reset_index(inplace=True)
plt.subplot(212)
plt.pie(agg_df["Count"],labels=agg_df["Sex"],autopct='%1.1f%%',startangle=90)
plt.show()
After execution, there are two problems:
The bar chart is not drawn in its subplot
The bar chart is in figure 1 and the pie chart is in figure 2
If I execute the bar chart code and the pie chart code separately, they both work fine.
Here is the sample dataframe:
Year Sex Area Count
2015 W Dhaka 6
2015 M Dhaka 3
2015 W Khulna 1
2015 M Khulna 8
2014 M Dhaka 13
2014 W Dhaka 20
2014 M Khulna 9
2014 W Khulna 6
2013 W Dhaka 11
2013 M Dhaka 2
2013 W Khulna 8
2013 M Khulna 5
2012 M Dhaka 12
2012 W Dhaka 4
2012 W Khulna 7
2012 M Khulna 1
and the barchart output is as follows:
What could be the problem here? I'm seeking help from matplotlib experts.
You have to pass the target axes to the pandas plotting function through its ax parameter so that it knows where to draw. Otherwise pandas draws into a figure of its own, which matches the two separate figures you are seeing. (In the snippet below I reuse the code from the question, but I replaced the code that computes the dataframes with the resulting dataframes hardcoded. Since this question is about figures, how the dataframes are obtained is not important, and the new version is easier to reproduce.)
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
piv_df = pd.DataFrame([[3, 6], [8, 1]],
                      columns=pd.Series(['M', 'W'], name='Sex'),
                      index=pd.Series(['Dhaka', 'Khulna'], name='Area'))
fig = plt.figure()
# top subplot: stacked bar chart drawn by pandas into ax1
ax1 = fig.add_subplot(211)
piv_df.plot.bar(stacked=True, ax=ax1)
agg_df = pd.DataFrame({'Count': {0: 11, 1: 7},
                       'Sex': {0: 'M', 1: 'W'},
                       'Year': {0: 4030, 1: 4030}})
# bottom subplot: pie chart drawn by matplotlib into ax2
ax2 = fig.add_subplot(212)
ax2.pie(agg_df["Count"], labels=agg_df["Sex"], autopct='%1.1f%%',
        startangle=90)
plt.show()