How do we plot a scatter plot for time in Python?

I want to plot a scatter plot of the two columns k and s. k should be on the x axis, showing time on an hourly basis for 24 hours, and s should be on the y axis. I have already tried some code using sns.relplot but got an AttributeError.
data columns in which we want scatter plot
code which we tried with error

Try:
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
df = pd.DataFrame([['2020-05-26 06:15:07','105'], ['2020-05-26 06:15:07','41'], ['2020-05-26 06:16:51','95']], columns=["k", "s"])
df.k = pd.to_datetime(df.k, format='%Y-%m-%d %H:%M:%S')
df.s = pd.to_numeric(df.s)  # the sample values are strings; plot them as numbers
ax = sns.scatterplot(x=df.k, y=df.s)
ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y-%m-%d %H:%M:%S"))
ax.tick_params(axis="x", rotation=45)
plt.show()
Or, with k as the index and one major tick per hour:
df = pd.DataFrame([['2020-05-26 06:15:07','105'], ['2020-05-26 06:15:07','41'], ['2020-05-26 06:16:51','95']], columns=["k", "s"])
df.k = pd.to_datetime(df.k, format='%Y-%m-%d %H:%M:%S')
df.s = pd.to_numeric(df.s)
df.set_index(['k'], inplace=True)
ax = sns.scatterplot(x=df.index, y=df.s)
# ax.set(xlabel="time", ylabel="values")
ax.set_xlim(df.index[0], df.index[-1])
# hourly ticks only appear once the data spans several hours (the 3-row sample covers under two minutes)
ax.xaxis.set_major_locator(mdates.HourLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y-%m-%d %H:%M:%S"))
ax.tick_params(axis="x", rotation=45)
plt.show()
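If the goal is a fixed 24-hour axis with one tick per hour, one option is to move every timestamp onto the same day and plot with plain matplotlib. This is only a sketch: the anchor date 2000-01-01 is an arbitrary choice of mine, and the sample rows are the ones from the snippet above.

```python
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import pandas as pd

df = pd.DataFrame(
    [["2020-05-26 06:15:07", "105"],
     ["2020-05-26 06:15:07", "41"],
     ["2020-05-26 06:16:51", "95"]],
    columns=["k", "s"],
)
df["k"] = pd.to_datetime(df["k"], format="%Y-%m-%d %H:%M:%S")
df["s"] = pd.to_numeric(df["s"])

# Put every timestamp on the same (arbitrary) day so only the time of day matters
anchor = pd.Timestamp("2000-01-01")
time_of_day = anchor + (df["k"] - df["k"].dt.normalize())

fig, ax = plt.subplots(figsize=(10, 4))
ax.scatter(time_of_day, df["s"])

# Show the full day, with one major tick per hour labelled HH:MM
ax.set_xlim(anchor, anchor + pd.Timedelta(hours=24))
ax.xaxis.set_major_locator(mdates.HourLocator(interval=1))
ax.xaxis.set_major_formatter(mdates.DateFormatter("%H:%M"))
ax.tick_params(axis="x", rotation=45)
ax.set_xlabel("time of day")
ax.set_ylabel("s")
```

Call plt.show() (or fig.savefig) afterwards to see the figure. Rows from different days end up overlaid on the same 24-hour axis, which may or may not be what you want.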

Related

matplotlib to show x-axis with custom date formats and interval

Using matplotlib and mpl_finance to plot candlesticks. Data is in csv AAPL.
I want to show the x-axis as year and month only, i.e."yyyy-mmm", so:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from mpl_finance import candlestick2_ohlc
import matplotlib.dates as mdates
data = pd.read_csv('C:\\AAPL.csv', delimiter = "\t")
data = data.sort_values(['Date'], ascending=True)
data = data.tail(100)
fig = plt.figure(figsize=(6,4))
plt.ylim(60, 200)
ax1 = fig.add_subplot(111)
cl =candlestick2_ohlc(ax=ax1,opens=data['Open'],highs=data['High'],lows=data['Low'],closes=data['Close'],width=0.6)
ax1.set_xticks(np.arange(len(data)))
ax1.set_xticklabels(data['Date'], fontsize=10, rotation=90)
# every month of the year like 2008-Jan, 2008-Feb...
locator = mdates.MonthLocator()
fmt = mdates.DateFormatter('%Y-%b')
X = plt.gca().xaxis
X.set_major_locator(locator)
X.set_major_formatter(fmt)
plt.show()
It doesn't show anything.
I also tried the following, but it doesn't work either:
plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%m/%d/%Y'))
plt.gca().xaxis.set_major_locator(mdates.DayLocator())
How can I have the x-axis show only the year and month?
Thank you.
Try the following solution:
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
from mpl_finance import candlestick_ohlc
data = pd.read_csv(r'C:\AAPL.csv')  # raw string so the backslash is kept literally
data = data.sort_values(['Date'], ascending=True)
data = data.tail(100)
# candlestick_ohlc expects matplotlib date numbers in the first column
data['Date'] = mdates.date2num(pd.to_datetime(data['Date']).tolist())
fig, ax = plt.subplots(figsize=(10, 10))
# as_matrix() was removed in pandas 1.0; pass the (date, open, high, low, close) columns as an array
candlestick_ohlc(ax, data[['Date', 'Open', 'High', 'Low', 'Close']].values, width=0.6)
ax.set(xlabel='AAPL')
ax.xaxis.set_major_locator(mdates.WeekdayLocator(interval=4))  # one tick every four weeks
ax.xaxis.set_major_formatter(mdates.DateFormatter('%Y-%b'))
plt.show()
Note: I have used candlestick_ohlc instead of candlestick2_ohlc.
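The part that matters for the tick labels is that the x values are real dates (matplotlib date numbers) rather than the positions 0..N-1. A small sketch without mpl_finance, using made-up closing prices in place of the AAPL file, shows the locator and formatter working on such an axis:

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

# Synthetic stand-in for the 'Date' and 'Close' columns (assumed data, not AAPL)
dates = pd.date_range("2008-01-01", periods=100, freq="D")
close = 100 + np.cumsum(np.random.default_rng(0).normal(size=len(dates)))

fig, ax = plt.subplots(figsize=(6, 4))
ax.plot(dates, close)  # x values are dates, so date locators can work

# One tick at the start of every month, labelled like 2008-Jan
month_fmt = mdates.DateFormatter("%Y-%b")
ax.xaxis.set_major_locator(mdates.MonthLocator())
ax.xaxis.set_major_formatter(month_fmt)
fig.autofmt_xdate()
```

With candlestick2_ohlc the bars sit at integer positions, which is probably why the MonthLocator in the question found nothing to label.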

Plot timedelta in matplotlib

I'm reading in some year, time (duration) data and I want to plot a chart of year on the x axis and time (duration) on the y axis. I want the y axis to have a HH:MM:SS format. I can't figure out how to do it. Here's my code (data is synthesized, real data set is much larger).
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.dates import DateFormatter, date2num
df = pd.DataFrame({'Year':[2010, 2011, 2012],
'Time': ['2:19:15', '2:11:16', '2:20:17']})
df['Time'] = pd.to_timedelta(df['Time'])
fig, ax = plt.subplots()
myFmt = DateFormatter("%H:%M:%S")
ax.yaxis.set_major_formatter(myFmt)
ax.plot(df['Year'], df['Time'].dt.total_seconds())
plt.gcf().autofmt_xdate()
plt.show()
If I don't convert to total_seconds, I get an error. It seems like the total seconds values are being interpreted as days. I tried dividing total_seconds by 24*60*60, but that gave me a message about a 0 date. I can't persuade date2num to work for me either.
I've checked on previous similar questions, but the code no longer works.
Does anyone know how to plot Pandas timedeltas in matplotlib?
Convert the timedeltas to datetimes. pd.to_datetime did this on older pandas versions; adding the timedeltas to a fixed timestamp has the same effect and does not depend on that behaviour. Everything gets a date in 1970, but if all you want is to plot and display, that won't matter. You then need to get rid of the .dt.total_seconds() too.
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.dates import DateFormatter, date2num
df = pd.DataFrame({'Year':[2010, 2011, 2012],
'Time': ['2:19:15', '2:11:16', '2:20:17']})
df['Time'] = pd.to_timedelta(df['Time'])
df['Time'] = pd.Timestamp('1970-01-01') + df['Time']  # same result as pd.to_datetime(df['Time']) on older pandas
fig, ax = plt.subplots()
myFmt = DateFormatter("%H:%M:%S")
ax.yaxis.set_major_formatter(myFmt)
ax.plot(df['Year'], df['Time'])
plt.gcf().autofmt_xdate()
plt.show()
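An alternative that avoids the detour through dates is to keep plotting total seconds and format the tick labels yourself. This is a sketch using matplotlib.ticker.FuncFormatter; the helper name seconds_to_hms is mine, and it assumes non-negative durations.

```python
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.ticker import FuncFormatter


def seconds_to_hms(value, pos=None):
    """Format a non-negative number of seconds as H:MM:SS."""
    total = int(round(value))
    hours, rem = divmod(total, 3600)
    minutes, seconds = divmod(rem, 60)
    return f"{hours:d}:{minutes:02d}:{seconds:02d}"


df = pd.DataFrame({"Year": [2010, 2011, 2012],
                   "Time": ["2:19:15", "2:11:16", "2:20:17"]})
df["Time"] = pd.to_timedelta(df["Time"])

fig, ax = plt.subplots()
# The y values stay plain numbers (seconds); only the labels are reformatted
ax.plot(df["Year"], df["Time"].dt.total_seconds(), marker="o")
ax.yaxis.set_major_formatter(FuncFormatter(seconds_to_hms))
ax.set_xticks(df["Year"])
```

Because the axis stays numeric, durations longer than 24 hours keep counting up instead of wrapping around as a clock time would.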

matplotlib - plot wrong datetime

I'd like to plot a figure with matplotlib in which the x-axis shows timestamps as yy-mm-dd hh-mm-ss. I have the timestamps as datetime64 (a pandas Series), and to show the correct minutes and seconds as well I followed the hint in this link and used date2num. The problem is that it plots nonsensical dates:
import datetime as dt
import matplotlib.pyplot as plt
import matplotlib.dates as md
for df in dfs:
    datenums=md.date2num(df.toPandas()["timestamp"])
    plt.xticks(rotation=25)
    xfmt = md.DateFormatter('%Y-%m-%d %H:%M:%S')
    ax.xaxis.set_major_formatter(xfmt)
    plt.plot(datenums,x)
plt.show()
where df.toPandas()["timestamp"] is:
0 2015-12-15 03:53:13
Name: timestamp, dtype: datetime64[ns]
I tried converting the datetime64 values to datetime, but the result doesn't change.
If your timestamp values are in seconds, you can build a list of tick labels from them and attach the labels to the plot, assuming your data corresponds to an array of timestamps (arrayTmstmp here):
import matplotlib.pyplot as plt
import numpy as np
import datetime
OX_ticks_name = [datetime.datetime.fromtimestamp(x).strftime('%Y-%m-%d %H:%M:%S') for x in arrayTmstmp]
OX_ticks_pos = np.arange(0,len(arrayTmstmp))
fig, ax = plt.subplots(figsize=(16, 9), dpi=100)
...
ax.set_xticks(OX_ticks_pos)
ax.set_xticklabels(OX_ticks_name, rotation=40, horizontalalignment='right', fontsize=10)
plt.tight_layout()
plt.show()
Of course, the position of each tick and the name for each can be configured as you want.
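For completeness, here is a runnable version of that idea. The timestamp array and the y values are invented for illustration (the answer does not show them), and the labels are built in UTC so the output does not depend on the local time zone.

```python
import datetime as dt

import matplotlib.pyplot as plt
import numpy as np

# Hypothetical input: five Unix timestamps 30 seconds apart, plus some y values
start = dt.datetime(2015, 12, 15, 3, 53, 13, tzinfo=dt.timezone.utc).timestamp()
array_tmstmp = [start + 30 * i for i in range(5)]
values = [3, 1, 4, 1, 5]

# One label per data point, and one integer position per label
tick_labels = [
    dt.datetime.fromtimestamp(ts, tz=dt.timezone.utc).strftime("%Y-%m-%d %H:%M:%S")
    for ts in array_tmstmp
]
tick_pos = np.arange(len(array_tmstmp))

fig, ax = plt.subplots(figsize=(8, 4))
ax.plot(tick_pos, values, marker="o")
ax.set_xticks(tick_pos)
ax.set_xticklabels(tick_labels, rotation=40, horizontalalignment="right", fontsize=10)
fig.tight_layout()
```

Note that the points are spaced evenly by position, so unequal gaps between timestamps are not visible on the axis.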

Matplotlib DateFormatter for axis label not working

I'm trying to adjust the formatting of the date tick labels of the x-axis so that it only shows the Year and Month values. From what I've found online, I have to use mdates.DateFormatter, but it's not taking effect at all with my current code as is. Anyone see where the issue is? (the dates are the index of the pandas Dataframe)
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd
fig = plt.figure(figsize = (10,6))
ax = fig.add_subplot(111)
ax.xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m'))
basicDF['some_column'].plot(ax=ax, kind='bar', rot=75)
ax.xaxis_date()
Reproducible scenario code:
import numpy as np
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd
rng = pd.date_range('1/1/2014', periods=20, freq='m')
blah = pd.DataFrame(data = np.random.randn(len(rng)), index=rng)
fig = plt.figure(figsize = (10,6))
ax = fig.add_subplot(111)
ax.xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m'))
blah.plot(ax=ax, kind='bar')
ax.xaxis_date()
Still can't get just the year and month to show up.
If I set the format after .plot, I get an error like this:
ValueError: DateFormatter found a value of x=0, which is an illegal date. This usually occurs because you have not informed the axis that it is plotting dates, e.g., with ax.xaxis_date().
It's the same whether I put it before or after ax.xaxis_date().
pandas just doesn't work well with custom date-time formats.
You need to just use raw matplotlib in cases like this.
import numpy
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import pandas
N = 20
numpy.random.seed(N)
dates = pandas.date_range('1/1/2014', periods=N, freq='m')
df = pandas.DataFrame(
data=numpy.random.randn(N),
index=dates,
columns=['A']
)
fig, ax = plt.subplots(figsize=(10, 6))
ax.xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m'))
ax.bar(df.index, df['A'], width=25, align='center')
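To see why the formatter complains about x=0 with the pandas bar plot, it helps to compare the tick positions the two approaches produce. This is a small sketch with made-up data; the variable names are mine.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

dates = pd.date_range("2014-01-01", periods=6, freq="MS")
df = pd.DataFrame({"A": np.arange(6.0)}, index=dates)

fig, (ax_pandas, ax_mpl) = plt.subplots(1, 2, figsize=(10, 4))

# pandas draws the bars at positions 0, 1, 2, ... and uses the dates only as label text
df["A"].plot(kind="bar", ax=ax_pandas)

# matplotlib places the bars at the dates themselves
ax_mpl.bar(df.index, df["A"], width=25, align="center")

pandas_ticks = ax_pandas.get_xticks()
mpl_ticks = ax_mpl.get_xticks()
print(pandas_ticks)  # small integers, one per bar
print(mpl_ticks)     # matplotlib date numbers
```

A DateFormatter attached to the pandas axis is therefore asked to format 0, 1, 2 and so on as dates, which matches the error message in the question.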
Solution with pandas only
You can create nicely formatted ticks by using the DatetimeIndex and taking advantage of the datetime properties of the timestamps. Tick locators and formatters from matplotlib.dates are not necessary for a case like this unless you would want dynamic ticks when using the interactive interface of matplotlib for zooming in and out (more relevant for time ranges longer than in this example).
import numpy as np # v 1.19.2
import pandas as pd # v 1.1.3
# Create sample time series with month end frequency, plot it with a pandas bar chart
rng = np.random.default_rng(seed=1) # random number generator
dti = pd.date_range('1/1/2014', periods=20, freq='m')
df = pd.DataFrame(data=rng.normal(size=dti.size), index=dti)
ax = df.plot.bar(figsize=(10,4), legend=None)
# Set major ticks and tick labels
ax.set_xticks(range(df.index.size))
ax.set_xticklabels([ts.strftime('%b\n%Y') if ts.year != df.index[idx-1].year
else ts.strftime('%b') for idx, ts in enumerate(df.index)])
ax.figure.autofmt_xdate(rotation=0, ha='center');
The accepted answer claims that "pandas won't work well with custom date-time formats", but you can make use of pandas' to_datetime() function to use your existing datetime Series in the dataframe:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.dates import DateFormatter
import pandas as pd
rng = pd.date_range('1/1/2014', periods=20, freq='m')
blah = pd.DataFrame(data = np.random.randn(len(rng)), index=pd.to_datetime(rng))
fig, ax = plt.subplots()
ax.xaxis.set_major_formatter(DateFormatter('%m-%Y'))
ax.bar(blah.index, blah[0], width=25, align='center')
DateFormatter accepts the same format codes as Python's strftime.
I ran into the same problem and used a workaround that transforms the index from datetime format into the desired string format:
import numpy as np
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd
rng = pd.date_range('1/1/2014', periods=20, freq='m')
blah = pd.DataFrame(data = np.random.randn(len(rng)), index=rng)
fig = plt.figure(figsize = (10,6))
ax = fig.add_subplot(111)
# transform index to strings
blah_test = blah.copy()
str_index = []
for s_year,s_month in zip(blah.index.year.values,blah.index.month.values):
    # build string according to format "%Y-%m"
    string_day = '{}-{:02d}'.format(s_year,s_month)
    str_index.append(string_day)
blah_test.index = str_index
blah_test.plot(ax=ax, kind='bar', rot=45)
plt.show()
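The loop can probably be replaced by DatetimeIndex.strftime, which builds the same strings in one call. A sketch under that assumption (it uses the month-start alias "MS" instead of 'm', so the days differ from the code above but the year-month labels are the same):

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

rng = pd.date_range("2014-01-01", periods=20, freq="MS")
blah = pd.DataFrame(data=np.random.randn(len(rng)), index=rng)

# Replace the datetime index by "%Y-%m" strings in a copy, then plot as before
blah_str = blah.copy()
blah_str.index = blah.index.strftime("%Y-%m")

fig, ax = plt.subplots(figsize=(10, 6))
blah_str.plot(ax=ax, kind="bar", rot=45, legend=False)
```

As with the loop version, the x axis is now categorical text, so date locators and interactive date zooming no longer apply.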

Plotting dates on the x-axis

I am trying to plot information against dates. I have a list of dates in the format "01/02/1991".
I converted them by doing the following:
x = parser.parse(date).strftime('%Y%m%d')
which gives 19910102
Then I tried to use num2date
import matplotlib.dates as dates
new_x = dates.num2date(x)
Plotting:
plt.plot_date(new_x, other_data, fmt="bo", tz=None, xdate=True)
But I get an error. It says "ValueError: year is out of range". Any solutions?
You can do this more simply using plot() instead of plot_date().
First, convert your strings to instances of Python datetime.date:
import datetime as dt
dates = ['01/02/1991','01/03/1991','01/04/1991']
x = [dt.datetime.strptime(d,'%m/%d/%Y').date() for d in dates]
y = range(len(x)) # many thanks to Kyss Tao for setting me straight here
Then plot:
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%m/%d/%Y'))
plt.gca().xaxis.set_major_locator(mdates.DayLocator())
plt.plot(x,y)
plt.gcf().autofmt_xdate()
I don't have enough reputation to comment on @bernie's response in reply to @user1506145, but I have run into the same issue.
The fix is the locator's interval parameter:
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import numpy as np
import datetime as dt
np.random.seed(1)
N = 100
y = np.random.rand(N)
now = dt.datetime.now()
then = now + dt.timedelta(days=100)
days = mdates.drange(now,then,dt.timedelta(days=1))
plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m-%d'))
plt.gca().xaxis.set_major_locator(mdates.DayLocator(interval=5))
plt.plot(days,y)
plt.gcf().autofmt_xdate()
plt.show()
As @KyssTao has been saying, help(dates.num2date) says that x has to be a float giving the number of days since 0001-01-01 plus one. Hence, 19910102 is not 2 January 1991, because if you counted 19910102 days from 0001-01-01 you'd get something in the year 54513 or similar (divide by 365.25, the number of days in a year).
Use datestr2num instead (see help(dates.datestr2num)):
new_x = dates.datestr2num(date) # where date is '01/02/1991'
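A quick round trip shows the difference between the two inputs. This is only a sketch; the variable names are mine, and the exception types caught at the end are a guess that covers the behaviour I'd expect across matplotlib versions.

```python
import matplotlib.dates as mdates

date_str = "01/02/1991"

# Parse the string straight into a matplotlib date number (a float count of days)
num = mdates.datestr2num(date_str)

# Converting back gives a timezone-aware datetime for 2 January 1991
back = mdates.num2date(num)
print(back.date())

# The reformatted integer 19910102 is read as a day count, far beyond year 9999
try:
    mdates.num2date(19910102)
    out_of_range = False
except (ValueError, OverflowError):
    out_of_range = True
print(out_of_range)
```

The same date number can be passed directly to plt.plot once the axis has a date formatter set, as in the answers above.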
Adapting @Jacek Szałęga's answer for the use of a figure fig and corresponding axes object ax:
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import numpy as np
import datetime as dt
np.random.seed(1)
N = 100
y = np.random.rand(N)
now = dt.datetime.now()
then = now + dt.timedelta(days=100)
days = mdates.drange(now,then,dt.timedelta(days=1))
fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot(days,y)
ax.xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m-%d'))
ax.xaxis.set_major_locator(mdates.DayLocator(interval=5))
ax.tick_params(axis='x', labelrotation=45)
plt.show()
