I want to plot the data between two variables, with the data for each month drawn in a different color.
My code and expected output:
import matplotlib.pyplot as plt
df
A B
2019-01-01 10 20
2019-01-02 20 30
2019-02-01 10 15
2019-02-02 20 40
2019-03-01 12 32
2019-03-02 5 14
plt.plot(df['A'],df['B'])
plt.show()
My current plot plots all the data as usual but I am expecting something different as given below.
My expected output:
2019-03-01 10 20
You can do something like this:
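import pandas as pd
markers = 'dsxo'
months = pd.to_datetime(df.index).to_period('M')
for i, (k, d) in enumerate(df.groupby(months)):
    # each plt.plot call takes the next color from the default color cycle;
    # the modulo keeps the marker lookup valid when there are more than four months
    plt.plot(d['A'], d['B'], label=k, marker=markers[i % len(markers)])
plt.legend()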
Output:
Check this code:
import matplotlib.pyplot as plt
import pandas as pd
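# assumes the dates are the first column of data.csv, so they become the (string) index
df = pd.read_csv('data.csv', index_col=0)
# '2019-01-01' -> '2019-01'
df['Month'] = df.index.map(lambda x: x[:-3])
fig, ax = plt.subplots(1, 1, figsize = (6, 6))
for month in df['Month'].unique():
    ax.plot(df[df['Month'] == month]['A'],
            df[df['Month'] == month]['B'],
            label = month)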
plt.legend()
plt.show()
that gives this graph:
I'm trying to assess the displacement of a particular fish on the seabed according to seasonality. Thus, I would like to create a map with different colored points according to the month in which the detection occurred (e.g., all points from August in blue, all points from September in red, all points from October in yellow).
In my dataframe I have both coordinates for each point (LAT, LON) and the dates (Dates) of detection:
    LAT        LON         Dates
0   49.302005  -67.684971  2019-08-06
1   49.302031  -67.684960  2019-08-12
2   49.302039  -67.684983  2019-08-21
3   49.302039  -67.684979  2019-08-30
4   49.302041  -67.684980  2019-09-03
5   49.302041  -67.684983  2019-09-10
6   49.302042  -67.684979  2019-09-18
7   49.302043  -67.684980  2019-09-25
8   49.302045  -67.684980  2019-10-01
9   49.302045  -67.684983  2019-10-09
10  49.302048  -67.684979  2019-10-14
11  49.302049  -67.684981  2019-10-21
12  49.302049  -67.684982  2019-10-29
Would anyone know how to create this kind of map? I know how to create a simple map with all points, but I wonder how to color the points according to the date of detection.
Thank you very much
Here's one way to do it entirely with Pandas and matplotlib:
import pandas as pd
from matplotlib import pyplot as plt
# I'll just create some fake data for the example
df = pd.DataFrame(
    {
        "LAT": [49.2, 49.2, 49.3, 45.6, 467.8],
        "LON": [-67.7, -68.1, -65.2, -67.8, -67.4],
        "Dates": ["2019-08-06", "2019-08-03", "2019-07-17", "2019-06-12", "2019-05-29"],
    }
)
# add a column containing the months
df["Month"] = pd.DatetimeIndex(df["Dates"]).month
# make a scatter plot with the colour based on the month
fig, ax = plt.subplots()
ax = df.plot.scatter(x="LAT", y="LON", c="Month", ax=ax, colormap="viridis")
fig.show()
If you want the months as names rather than indexes, and a slightly more fancy plot (e.g., with a legend labelling the dates) using seaborn, you could do:
import seaborn as sns
# get month as name
df["Month"] = pd.to_datetime(df["Dates"]).dt.strftime("%b")
fig, ax = plt.subplots()
# passing the frame as data= also works on seaborn versions that do not take it positionally
sns.scatterplot(data=df, x="LAT", y="LON", hue="Month", ax=ax)
fig.show()
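If you need specific colors per month, as in the question (August in blue, September in red, October in yellow), one option is to group by month and draw each group yourself. This is only a minimal sketch: the three rows are copied from the question's table, and the `month_colors` dictionary is a name I made up for the illustration.

```python
import calendar

import pandas as pd
import matplotlib.pyplot as plt

# Three rows taken from the table in the question
df = pd.DataFrame({
    "LAT": [49.302005, 49.302041, 49.302045],
    "LON": [-67.684971, -67.684980, -67.684980],
    "Dates": ["2019-08-06", "2019-09-03", "2019-10-01"],
})

# Hypothetical mapping: month number -> color requested in the question
month_colors = {8: "blue", 9: "red", 10: "yellow"}

df["Month"] = pd.to_datetime(df["Dates"]).dt.month

fig, ax = plt.subplots()
for month, group in df.groupby("Month"):
    # one scatter call per month, so each month gets its own color and legend entry
    ax.scatter(group["LON"], group["LAT"],
               color=month_colors[int(month)],
               label=calendar.month_abbr[int(month)])
ax.set_xlabel("LON")
ax.set_ylabel("LAT")
ax.legend(title="Month")
# call plt.show() (or fig.show()) to display the figure
```

A month that is missing from `month_colors` raises a `KeyError`, so extend the dictionary to cover every month present in your data.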
result
year Month Min_days Avg_days Median_days Count MonthName-Year
2015 1 9 12.56 10 4 2015-Jan
2015 2 10 13.67 9 3 2015-Feb
........................................................
2016 12 12 15.788 19 2 2016-Dec
and so on...
I created a line plot of Min_days, Avg_days, Median_days and Count by month and year.
The code used for that (which works perfectly):
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib import dates as mdates
result = freq_start_month_year_to_date_1(df,'Jan','2015','Dec','2019')
result['Date'] = pd.to_datetime(result[['Year', 'Month']].assign(Day=1))
# Plot the data
fig, ax = plt.subplots(figsize=(10, 2))
for col in ['Min_days','Avg_days','Median_days','Count']:
    ax.plot(result['Date'], result[col], label=col)
years = mdates.YearLocator() # only print label for the years
months = mdates.MonthLocator() # mark months as ticks
years_fmt = mdates.DateFormatter('%Y')
ax.xaxis.set_major_locator(years)
ax.xaxis.set_minor_locator(months)
ax.xaxis.set_major_formatter(years_fmt)
ax.legend(loc='center left', bbox_to_anchor=(1, 0.5))
Now I want to be able to see the data while hovering over it in the plot. Any idea how to do that?
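One possible approach with plain matplotlib is to connect a callback to the `motion_notify_event` and move a single annotation to whatever line the cursor is over. The sketch below is not tied to your data: the `result` frame is made up (only the column names come from the question), and `annot` and `on_move` are names chosen for the illustration. It needs an interactive backend to actually show the tooltip.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib import dates as mdates

# Made-up stand-in for `result`; only the column names come from the question
result = pd.DataFrame({
    "Date": pd.date_range("2015-01-01", periods=24, freq="MS"),
    "Min_days": np.arange(24) % 7 + 5,
    "Avg_days": np.arange(24) % 5 + 10.5,
})

fig, ax = plt.subplots(figsize=(10, 4))
lines = [ax.plot(result["Date"], result[col], label=col)[0]
         for col in ["Min_days", "Avg_days"]]
ax.legend()

# One reusable annotation, hidden until the cursor is over a line
annot = ax.annotate("", xy=(0, 0), xytext=(10, 10),
                    textcoords="offset points",
                    bbox=dict(boxstyle="round", fc="w"))
annot.set_visible(False)


def on_move(event):
    """Show series name, month and value for the line under the cursor."""
    if event.inaxes is ax:
        for line in lines:
            hit, info = line.contains(event)
            if hit:
                # for a drawn line, "ind" holds the data point or the start
                # of the line segment that is close to the cursor
                i = info["ind"][0]
                # orig=False returns the x values as matplotlib date numbers
                x = line.get_xdata(orig=False)[i]
                y = line.get_ydata(orig=False)[i]
                annot.xy = (x, y)
                annot.set_text(f"{line.get_label()}\n"
                               f"{mdates.num2date(x):%Y-%m}: {y:g}")
                annot.set_visible(True)
                fig.canvas.draw_idle()
                return
    # cursor is not over any line: hide the annotation again
    if annot.get_visible():
        annot.set_visible(False)
        fig.canvas.draw_idle()


fig.canvas.mpl_connect("motion_notify_event", on_move)
# call plt.show() in an interactive session to try it out
```

Third-party packages exist that wrap this pattern, but the version above only relies on matplotlib's own event handling.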
I have a Series with more than 100,000 rows that I want to plot. I have a problem with the x-axis of my figure: since it is made of many dates, the tick labels overlap and become unreadable if all of them are drawn.
How can I show only one out of every x labels on the x-axis?
Here is an example of code that produces a plot with an ugly x-axis:
sr = pd.Series(np.array(range(15)))
sr.index = [ '2018-06-' + str(x).zfill(2) for x in range(1,16)]
Out :
2018-06-01 0
2018-06-02 1
2018-06-03 2
2018-06-04 3
2018-06-05 4
2018-06-06 5
2018-06-07 6
2018-06-08 7
2018-06-09 8
2018-06-10 9
2018-06-11 10
2018-06-12 11
2018-06-13 12
2018-06-14 13
2018-06-15 14
fig = plt.plot(sr)
plt.xlabel('Date')
plt.ylabel('Sales')
Using xticks you can achieve the desired effect:
In your example:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
sr = pd.Series(np.array(range(15)))
sr.index = [ '2018-06-' + str(x).zfill(2) for x in range(1,16)]
fig = plt.plot(sr)
plt.xlabel('Date')
plt.xticks(sr.index[::4]) #Show one in every four dates
plt.ylabel('Sales')
Output:
Also, if you want to set the number of ticks, instead, you can use locator_params:
sr.plot(xticks=sr.reset_index().index)
plt.locator_params(axis='x', nbins=5) #Show five dates
plt.ylabel('Sales')
plt.xlabel('Date')
Output:
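Since the index holds dates stored as strings, another option is to convert it to real timestamps and let matplotlib's date locators choose the tick spacing, which may scale better to a long series. This is a sketch on the same toy data; whether `pd.to_datetime` can parse your actual index is an assumption.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib import dates as mdates

sr = pd.Series(np.arange(15))
sr.index = ['2018-06-' + str(x).zfill(2) for x in range(1, 16)]

# Convert the string labels to timestamps so the x-axis is treated as dates
sr.index = pd.to_datetime(sr.index)

fig, ax = plt.subplots()
ax.plot(sr.index, sr.values)

# AutoDateLocator picks a tick spacing that suits the plotted date range
locator = mdates.AutoDateLocator()
ax.xaxis.set_major_locator(locator)
ax.xaxis.set_major_formatter(mdates.ConciseDateFormatter(locator))
ax.set_xlabel('Date')
ax.set_ylabel('Sales')
# call plt.show() to display the figure
```

With a datetime axis the ticks are recomputed when you zoom, so you do not have to pick the step by hand.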
I am trying to show a bar chart above a pie chart in the SAME FIGURE using matplotlib. The code is as follows:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
df = pd.read_csv('stats.csv')
agg_df = df.groupby(['Area','Sex']).sum()
agg_df.reset_index(inplace=True)
piv_df = agg_df.pivot(index='Area', columns='Sex', values='Count')
plt.figure(1)
plt.subplot(211)
piv_df.plot.bar(stacked=True)
df = pd.read_csv('stats.csv', delimiter=',', encoding="utf-8-sig")
df=df.loc[df['"Year"']==2015]
agg_df = df.groupby(['Sex']).sum()
agg_df.reset_index(inplace=True)
plt.subplot(212)
plt.pie(agg_df["Count"],labels=agg_df["Sex"],autopct='%1.1f%%',startangle=90)
plt.show()
After execution, there are two problems:
The bar chart is not being produced.
The bar chart is in figure 1 and the pie chart is in figure 2.
If I execute the bar chart code and the pie chart code separately, they work fine.
Here is the sample dataframe:
Year Sex Area Count
2015 W Dhaka 6
2015 M Dhaka 3
2015 W Khulna 1
2015 M Khulna 8
2014 M Dhaka 13
2014 W Dhaka 20
2014 M Khulna 9
2014 W Khulna 6
2013 W Dhaka 11
2013 M Dhaka 2
2013 W Khulna 8
2013 M Khulna 5
2012 M Dhaka 12
2012 W Dhaka 4
2012 W Khulna 7
2012 M Khulna 1
and the barchart output is as follows:
What could be the problem here? Seeking help from matplotlib experts.
You have to pass the axes to the pandas plotting function with the ax parameter so that it knows where to draw. Without it, piv_df.plot.bar creates a new figure of its own instead of using the subplot you just selected. (In the snippet below I use the code from the question, but I removed the code that computes the dataframes and replaced it with the resulting dataframes hardcoded. As this question is about figures, it is not important how these dataframes are obtained, and the new version is easier to reproduce.)
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
piv_df = pd.DataFrame([[3, 6], [8, 1]],
                      columns=pd.Series(['M', 'W'], name='Sex'),
                      index=pd.Series(['Dhaka', 'Khulna'], name='Area'))
fig = plt.figure()
# top subplot: stacked bar chart drawn on ax1
ax1 = fig.add_subplot(211)
piv_df.plot.bar(stacked=True, ax=ax1)
agg_df = pd.DataFrame({'Count': {0: 11, 1: 7},
                       'Sex': {0: 'M', 1: 'W'},
                       'Year': {0: 4030, 1: 4030}})
# bottom subplot: pie chart drawn on ax2 of the same figure
ax2 = fig.add_subplot(212)
ax2.pie(agg_df["Count"], labels=agg_df["Sex"], autopct='%1.1f%%',
        startangle=90)
plt.show()
I am plotting a simple chart and adding a number taken from a DataFrame via plot.text(). The number is plotted as intended, but details of its properties are also displayed. I would like to suppress the display of these properties and plot just the number.
The following code reproduces the issue.
import numpy as np
import pandas as pd
from pandas import *
import matplotlib.pyplot as plot
%matplotlib inline
rand = np.random.RandomState(1)
index = np.arange(8)
df = DataFrame(rand.randn(8, 1), index=index, columns=list('A'))
df['date'] = date_range('1/1/2014', periods=8)
print(df)
A date
0 1.624345 2014-01-01
1 -0.611756 2014-01-02
2 -0.528172 2014-01-03
3 -1.072969 2014-01-04
4 0.865408 2014-01-05
5 -2.301539 2014-01-06
6 1.744812 2014-01-07
7 -0.761207 2014-01-08
df2 = pd.DataFrame(index = ['1'], columns=['example'])
df2['example'] = 1.436792
print(df2)
example
1 1.436792
fig, ax = plot.subplots(figsize=(15,10))
df.plot(x='date', y='A')
plot.text(0.05, 0.95, df2['example'],
horizontalalignment='left',
verticalalignment='center',
transform = ax.transAxes)
The plot is showing index, name and dtype data along with the example number. Can anybody show how to suppress this detail and just plot the number? Any help much appreciated.
Just plot with the DataFrame values:
plot.text(0.05, 0.95, df2['example'].values,
horizontalalignment='left',
verticalalignment='center',
transform = ax.transAxes)
Or just set visible=False to hide the text entirely:
plot.text(0.05, 0.95, df2['example'],
horizontalalignment='left',
verticalalignment='center',
transform = ax.transAxes, visible=False)
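If the goal is to show only the number, a further option is to pull the scalar out of the Series and format it as a string before passing it to text(). A small sketch, assuming the value you want is the first (and only) row of df2; the names `value` and `label` are mine:

```python
import pandas as pd
import matplotlib.pyplot as plt

df2 = pd.DataFrame({'example': [1.436792]}, index=['1'])

# .iloc[0] extracts the single value as a scalar instead of a one-element Series
value = df2['example'].iloc[0]
label = f"{value:.6f}"

fig, ax = plt.subplots()
text = ax.text(0.05, 0.95, label,
               horizontalalignment='left',
               verticalalignment='center',
               transform=ax.transAxes)
# call plt.show() to display the figure
```

Formatting the value yourself also lets you control the number of decimals shown on the plot.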