Set x axis locator at hour intervals on matplotlib subplot - python

I am trying to create a figure with four subplots using Matplotlib's object-oriented approach. I am having trouble setting the x-axis to hourly markers on each plot. With my present code, the hourly marks are retained only on the last of the four subplots.
I have a list containing four dataframes that were read in from CSV. I used pd.to_datetime to create the index. No problem.
I can loop through the four dataframes and plot my y variable (TS_comp) against time. This works fine and I get date/time on each x-axis. But what I want is to have just hour markers on each x-axis. When I add code in the loop to set the major locator, the x-axis labels are wiped on the first three subplots. The two lines of code from the loop below are:
ax.xaxis.set_major_locator(hours)
ax.xaxis.set_major_formatter(mdates.DateFormatter('%H'))
I do not understand why this is happening, as each pass through the loop should be addressing a different axis object. Note that the x-axis time ranges differ between subplots, so this is not a simple matter of sharing the x-axis across them.
fig, ax = plt.subplots(nrows=2, ncols=2)
i = 0
hours = mdates.HourLocator(interval=1)
for ax in fig.get_axes():
    ax.plot(dfs[i].TS_comp, 'k-', markersize=0.5)
    ax.xaxis.set_major_locator(hours)
    ax.xaxis.set_major_formatter(mdates.DateFormatter('%H'))
    i = i + 1
I expected to get hourly markers on each of the subplots, but ended up with hourly markers on just the last plot.
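A likely cause, offered as an assumption rather than a confirmed diagnosis: the single hours locator is shared by all four axes, and the Matplotlib ticker documentation says a Locator instance should not be used with more than one Axis. The minimal sketch below creates a fresh locator and formatter for each subplot. The dfs list is replaced with synthetic stand-in data, since the original CSV files are not available.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-ins for the four dataframes
# (assumption: datetime index, TS_comp column, different time ranges)
starts = ["2019-01-01 00:00", "2019-01-02 06:00", "2019-01-03 12:00", "2019-01-04 18:00"]
dfs = [
    pd.DataFrame(
        {"TS_comp": np.random.rand(240)},
        index=pd.date_range(start, periods=240, freq="min"),
    )
    for start in starts
]

fig, axs = plt.subplots(nrows=2, ncols=2)
for df, ax in zip(dfs, axs.flat):
    ax.plot(df.index, df["TS_comp"], "k-", linewidth=0.5)
    # New locator and formatter for every axis instead of one shared instance
    ax.xaxis.set_major_locator(mdates.HourLocator(interval=1))
    ax.xaxis.set_major_formatter(mdates.DateFormatter("%H"))

# Draw once so the tick labels are populated, then collect them per subplot
fig.canvas.draw()
hour_labels = [[t.get_text() for t in ax.get_xticklabels()] for ax in axs.flat]
```

Iterating with zip over the dataframes and axs.flat also removes the manual counter and avoids reusing the name ax for both the array of axes and the loop variable.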

Related

Creating a single tidy seaborn plot in a 'for' loop

I'm trying to generate a plot in seaborn using a for loop to plot the contents of each dataframe column on its own row.
The number of columns that need plotting can vary between 1 and 30. However, the loop creates multiple individual plots, each with their own x-axis, which are not aligned and with a lot of wasted space between the plots. I'd like to have all the plots together with a shared x-axis without any vertical spacing between each plot that I can then save as a single image.
The code I have been using so far is below.
comp_relflux = measurements.filter(like='rel_flux_C', axis=1)  # Extracts relevant columns from larger dataframe
comp_relflux = comp_relflux.reindex(comp_relflux.mean().sort_values().index, axis=1)  # Sorts into order based on column mean.
plt.rcParams["figure.figsize"] = [12.00, 1.00]
for column in comp_relflux.columns:
    plt.figure()
    sns.scatterplot(x=bjd % 1, y=comp_relflux[column], color='b', marker='.')
This is a screenshot of the resultant plots.
I have also tried using FacetGrid, but this just seems to plot the last column's data.
p = sns.FacetGrid(comp_relflux, height=2, aspect=6, despine=False)
p.map(sns.scatterplot, x=(bjd)%1, y=comp_relflux[column])
To combine the x-axis labels and have just one instead of one per row, you can use sharex. Also, by calling plt.subplots() with as many rows as you have columns, you get a single figure with all the subplots in it. As there is no data available, I used random numbers below to demonstrate. There are 4 columns of data in my df, but I have kept as much of your code and naming convention as possible. Hope this is what you are looking for...
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

comp_relflux = pd.DataFrame(np.random.rand(100, 4))  # Random data - 4 columns
bjd = np.linspace(0, 1, 100)  # Series of 100 points - 0 to 1
rows = len(comp_relflux.columns)  # Number of columns = number of subplots
fig, ax = plt.subplots(rows, 1, sharex=True, figsize=(12, 6))  # sharex is set here; the figure size is moved here from your rcParams
for i, column in enumerate(comp_relflux.columns):
    sns.scatterplot(x=bjd % 1, y=comp_relflux[column], color='b', marker='.', ax=ax[i])
1 output plot with 4 subplots

How to plot multiple lines in subplot using python and matplotlib

I've been following the solutions provided by Merge matplotlib subplots with shared x-axis. See solution 35. In each subplot, there is one line, but I would like to have multiple lines in each subplot. For example, the top plot has the price of IBM and a 30 day moving average. The bottom plot has a 180 day and 30 day variance.
To plot multiple lines in my other python programs I used (data).plot(figsize=(10, 7)) where data is a dataframe indexed by date, but in the author's solution he uses line0, = ax0.plot(x, y, color='r') to assign the data series (x,y) to the plot. In the case of multiple lines in solution 35, how does one assign a dataframe with multiple columns to the plot?
You'll need to use (data).plot(ax=ax0) to work with pandas plotting.
For the legend you can use:
handles0, labels0 = ax0.get_legend_handles_labels()
handles1, labels1 = ax1.get_legend_handles_labels()
ax0.legend(handles=handles0 + handles1, labels=labels0 + labels1)
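As a rough illustration of the suggestion above, the sketch below draws two columns on each of two stacked axes by passing ax= to pandas plotting. The column names and the random-walk price series are invented for the example and do not come from the question.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical price series standing in for the IBM data in the question
dates = pd.date_range("2020-01-01", periods=400, freq="D")
price = pd.Series(100 + np.random.randn(400).cumsum(), index=dates)

# One dataframe per subplot; each column becomes its own line
top = pd.DataFrame({"price": price, "ma_30": price.rolling(30).mean()})
bottom = pd.DataFrame({"var_180": price.rolling(180).var(), "var_30": price.rolling(30).var()})

fig, (ax0, ax1) = plt.subplots(2, 1, sharex=True, figsize=(10, 7))
top.plot(ax=ax0)       # two lines on the top axis
bottom.plot(ax=ax1)    # two lines on the bottom axis
fig.subplots_adjust(hspace=0)  # remove the gap so the subplots appear merged
```

Because pandas labels each line with its column name, the handles and labels collected by get_legend_handles_labels() already carry usable legend text.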

Why am I getting two different plots for the same data?

I have a data file covering one day.
I want to plot a graph of temperature against time.
I tried to plot it using two different blocks of code. Both methods are standard, but I am getting two different plots.
In the first method I used time as a datetime datatype:
#method
t=pd.to_datetime(data['time'],format='%H:%M:%S.%f').dt.time
data['time']=t
data.drop('index',axis=1,inplace=True)
ax=data.plot(x='time',color=['tab:blue','tab:red'])
ax.tick_params(axis='x', rotation=45)
In the second method I used time in string format:
fig, ax = plt.subplots()
ax.plot(data['time'], data['temp'])
ax.plot(data['time'], data['temp_mean'],color='red')
fig.autofmt_xdate()
ax.tick_params(axis='x', rotation=45)
y1,y2=float(data.temp.min()),float(data.temp.max())
ax.yaxis.set_ticks(np.arange(y1, y2, 10))  # the step of 10 belongs to np.arange, not set_ticks
ax.set_xlabel('Time')
ax.set_ylabel('Temp')
plt.show()
The second plot is the expected one, but the plot I got has overlapping x ticks. I need x ticks at an hourly frequency (one per hour).
Why am I getting plots with two different patterns? How can I add x ticks at a 1-hour frequency?
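One possible explanation, stated as an assumption because the data file is not shown: in the second method the time strings are treated as categories, so every row gets its own tick, while the first method hands datetime.time objects to pandas, which applies its own tick handling. The minimal sketch below instead keeps full datetimes and asks Matplotlib for one tick per hour. The data frame is synthetic, and its column names simply mirror those in the question.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic one-day data set (assumption: 'time' holds HH:MM:SS.ffffff strings)
stamps = pd.date_range("2021-01-01", periods=24 * 60, freq="min")
data = pd.DataFrame({
    "time": stamps.strftime("%H:%M:%S.%f"),
    "temp": 20 + 5 * np.sin(np.linspace(0, 2 * np.pi, len(stamps))),
})
data["temp_mean"] = data["temp"].rolling(60, min_periods=1).mean()

# Parse to full datetimes and skip .dt.time so Matplotlib's date handling applies
t = pd.to_datetime(data["time"], format="%H:%M:%S.%f")

fig, ax = plt.subplots()
ax.plot(t, data["temp"])
ax.plot(t, data["temp_mean"], color="red")
ax.xaxis.set_major_locator(mdates.HourLocator(interval=1))   # one tick per hour
ax.xaxis.set_major_formatter(mdates.DateFormatter("%H:%M"))  # show only the time
ax.tick_params(axis="x", rotation=45)
ax.set_xlabel("Time")
ax.set_ylabel("Temp")
fig.canvas.draw()
```

With no date in the strings, pd.to_datetime fills in a default date; the formatter hides it, so only the hour and minute appear on the axis.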

Matplotlib plotting data that doesn't exist

I am trying to plot three lines on one figure. I have three years of data for three sites, and I am simply trying to plot them with the same x-axis and the same y-axis. The first two lines span all three years of data, while the third dataset is usually more sparse. Using the object-oriented axes interface in Matplotlib, when I plot my third set of data I get points at the end of the graph that are outside the range of that dataset. My third dataset is structured as tuples of dates and values, such as:
data=
[('2019-07-15', 30.6),
('2019-07-16', 20.88),
('2019-07-17', 16.94),
('2019-07-18', 11.99),
('2019-07-19', 13.76),
('2019-07-20', 16.97),
('2019-07-21', 19.9),
('2019-07-22', 25.56),
('2019-07-23', 18.59),
...
('2020-08-11', 8.33),
('2020-08-12', 10.06),
('2020-08-13', 12.21),
('2020-08-15', 6.94),
('2020-08-16', 5.51),
('2020-08-17', 6.98),
('2020-08-18', 6.17)]
The data ends in August 2020, yet the graph includes points at the end of 2020. This happens for all my sites; the first two datasets, taken from the knowndf columns plotted below, stay the same each time.
Here is the problematic graph.
And here is what I have for the plotting:
fig, ax=plt.subplots(1,1,figsize=(15,12))
fig.tight_layout(pad=6)
ax.plot(knowndf['DATE'], knowndf['Value1'],'b',alpha=0.7)
ax.plot(knowndf['DATE'], knowndf['Value2'],color='red',alpha=0.7)
ax.plot(*zip(*data), 'g*', markersize=8) #when i plot this set of data i get nonexistent points
ax.tick_params(axis='x', rotation=45) #rotating for aesthetic
ax.set_xticks(ax.get_xticks()[::30]) #only want every 30th tick instead of every daily tick
I've tried ax.twinx(), but that gives me two y-axes, which doesn't help since I want to use the same x-axis and y-axis for all three sites. I've also tried not using the axes approach, but there are axes features that I need for this plot. Please help!
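A possible cause, stated as an assumption because the dtype of the dates is not shown: if the dates are plain strings, Matplotlib treats them as categories and appends any label it has not seen before to the right-hand end of the axis instead of placing it by date. The sketch below converts the strings to datetimes before plotting. The names known_dates and known_values are hypothetical stand-ins for one knowndf series, and only a few of the tuples from the question are reused.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical dense series standing in for knowndf (daily values, 2019-2020)
known_dates = pd.date_range("2019-01-01", "2020-12-31", freq="D")
known_values = np.random.rand(len(known_dates)) * 30

# A few of the sparse (date string, value) tuples from the question
data = [("2019-07-15", 30.6), ("2019-07-16", 20.88), ("2020-08-17", 6.98), ("2020-08-18", 6.17)]
date_strings, values = zip(*data)
sparse_dates = pd.to_datetime(list(date_strings))  # strings -> real datetimes

fig, ax = plt.subplots(figsize=(15, 12))
ax.plot(known_dates, known_values, "b", alpha=0.7)
ax.plot(sparse_dates, values, "g*", markersize=8)  # now positioned by date
ax.xaxis.set_major_locator(mdates.MonthLocator())  # roughly one tick every 30 days
ax.tick_params(axis="x", rotation=45)
```

On a true date axis a month locator replaces the slicing of get_xticks(), which only makes sense when there is one categorical tick per day.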

Why doesn't a subplot using pandas show the x-axis?

When I plot single plots with pandas dataframes, I have an x-axis.
However, when I make subplots and try to share the x-axis the way I would when using numpy arrays without pandas, there are no tick numbers or labels.
I only want the numbers and label to appear on the last plot, as the plots share the same x-axis.
The data loaded and the plot produced can be found here:
https://drive.google.com/open?id=1hTmTSkIcYl-usv_CCxLl8U6bAoO6tMRh
This is for combining and plotting the data logged from two different logging devices which represent the same time period.
import pandas as pd
import matplotlib.pyplot as plt
df1=pd.read_csv('data1.csv', sep=',',header=0)
df1.columns.values
cols1 = list(df1.columns.values)
df2=pd.read_csv('data2.dat', sep='\t',header=18)
df2.columns.values
cols2 = list(df2.columns.values)
start =10000
stop = 30000
fig, axes = plt.subplots(nrows=5, ncols=1, sharex=True, figsize=(10, 10))
df1.iloc[start:stop].plot(x=cols1[0], y=cols1[1], ax=axes[0])
df1.iloc[start:stop].plot(x=cols1[0], y=cols1[2], ax=axes[0])
df1.iloc[start:stop].plot(x=cols1[0], y=cols1[3], ax=axes[2])
df1.iloc[start:stop].plot(x=cols1[0], y=cols1[4], ax=axes[2])
df2.iloc[start:stop].plot(x=cols2[0], y=cols2[3], ax=axes[3])
axes[3].set_xlabel("Time [s]")  # ax3 was never defined; axes[3] is the last axis plotted on
plt.show()
I expect there to be numbers and a label on the x-axis, but instead it only shows the pandas label "#timestamp".
UPDATE: I have found something that hints at the problem. I think it is caused by the two files not having identical time spacing. The first column of each file is time, sampled at roughly, but not exactly, one sample per second. If I remove the x=cols[x] parts, numbers appear on the x-axis, but then there is a time shift between the two plots, because they are plotted against the dataframe index rather than against time.
I am currently trying to interpolate the data so that both files have the same x-axis, but I would not have expected that to be necessary.
