Why am I getting two different plots for the same data? - python

I have a data file covering one day.
I want to plot a time vs. temp graph.
I tried to plot it using two different blocks of code. Both methods are standard, but I am getting two different plots.
In the first method, I converted the time column to datetime.time objects:
# Method 1: time as datetime.time objects
t=pd.to_datetime(data['time'],format='%H:%M:%S.%f').dt.time
data['time']=t
data.drop('index',axis=1,inplace=True)
ax=data.plot(x='time',color=['tab:blue','tab:red'])
ax.tick_params(axis='x', rotation=45)
In the second method, I kept the time as strings:
fig, ax = plt.subplots()
ax.plot(data['time'], data['temp'])
ax.plot(data['time'], data['temp_mean'],color='red')
fig.autofmt_xdate()
ax.tick_params(axis='x', rotation=45)
y1,y2=float(data.temp.min()),float(data.temp.max())
# One y tick every 10 units; the step belongs inside np.arange
ax.yaxis.set_ticks(np.arange(y1,y2,10))
ax.set_xlabel('Time')
ax.set_ylabel('Temp')
plt.show()
The second plot is the expected one, but its x ticks overlap. I need hourly x ticks (a 1-hour frequency).
Why am I getting plots with two different patterns? How do I add x ticks at a 1-hour frequency?
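A likely explanation, assuming the second block was run on the raw time strings: Matplotlib treats strings as categories, so each distinct value gets its own evenly spaced slot and its own tick, in order of first appearance, whereas time values are placed by the actual time of day. One way to get hourly ticks is to keep full datetimes (dropping .dt.time) and use matplotlib.dates.HourLocator. The sketch below is a minimal illustration, not a definitive fix: the sample data and the plot_hourly helper are made up, and only the column names time, temp and temp_mean come from the question.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def plot_hourly(data):
    """Plot temp and temp_mean against time with one major tick per hour."""
    data = data.copy()
    # Keep full datetimes (no .dt.time). With no date in the format string,
    # pandas fills in the placeholder date 1900-01-01.
    data["time"] = pd.to_datetime(data["time"], format="%H:%M:%S.%f")
    data = data.sort_values("time")

    fig, ax = plt.subplots()
    ax.plot(data["time"], data["temp"], color="tab:blue", label="temp")
    ax.plot(data["time"], data["temp_mean"], color="tab:red", label="temp_mean")
    ax.xaxis.set_major_locator(mdates.HourLocator(interval=1))   # tick every hour
    ax.xaxis.set_major_formatter(mdates.DateFormatter("%H:%M"))  # hide the date
    ax.tick_params(axis="x", rotation=45)
    ax.set_xlabel("Time")
    ax.set_ylabel("Temp")
    ax.legend()
    return fig, ax


# Made-up one-day sample (a reading every 5 minutes), only to exercise the helper
stamps = pd.date_range("2000-01-01", periods=288, freq="5min")
temp = 20 + 5 * np.sin(np.linspace(0, 2 * np.pi, len(stamps)))
data = pd.DataFrame({
    "time": stamps.strftime("%H:%M:%S.%f"),
    "temp": temp,
    "temp_mean": pd.Series(temp).rolling(12, min_periods=1).mean(),
})
fig, ax = plot_hourly(data)
```

If the labels still feel crowded, HourLocator(interval=2) would halve the number of ticks.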

Related

How to plot multiple lines in subplot using python and matplotlib

I've been following the solutions provided by Merge matplotlib subplots with shared x-axis. See solution 35. In each subplot, there is one line, but I would like to have multiple lines in each subplot. For example, the top plot has the price of IBM and a 30 day moving average. The bottom plot has a 180 day and 30 day variance.
To plot multiple lines in my other python programs I used (data).plot(figsize=(10, 7)) where data is a dataframe indexed by date, but in the author's solution he uses line0, = ax0.plot(x, y, color='r') to assign the data series (x,y) to the plot. In the case of multiple lines in solution 35, how does one assign a dataframe with multiple columns to the plot?
You'll need to use (data).plot(ax=ax0) to work with pandas plotting.
For the legend you can use:
handles0, labels0 = ax0.get_legend_handles_labels()
handles1, labels1 = ax1.get_legend_handles_labels()
ax0.legend(handles=handles0 + handles1, labels=labels0 + labels1)
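As a minimal sketch of the ax= approach, assuming two stacked subplots that share the x-axis: each DataFrame is drawn onto its own axes, and every column becomes one line. The price series and the column names (price, ma30, var30, var180) are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Made-up daily price series standing in for the real data
dates = pd.date_range("2020-01-01", periods=250, freq="D")
rng = np.random.default_rng(0)
price = pd.Series(100 + rng.normal(0, 1, len(dates)).cumsum(), index=dates)

# One DataFrame per subplot; each column becomes its own line
top = pd.DataFrame({"price": price, "ma30": price.rolling(30).mean()})
bottom = pd.DataFrame({
    "var30": price.rolling(30).var(),
    "var180": price.rolling(180).var(),
})

fig, (ax0, ax1) = plt.subplots(2, 1, sharex=True, figsize=(10, 7))
top.plot(ax=ax0)     # draws both columns of `top` on the upper axes
bottom.plot(ax=ax1)  # draws both columns of `bottom` on the lower axes
```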

How to use a 3rd dataframe column as x axis ticks/labels in matplotlib scatter

I'm struggling to wrap my head around matplotlib with dataframes today. I see lots of solutions but I'm struggling to relate them to my needs. I think I may need to start over. Let's see what you think.
I have a dataframe (ephem) with 4 columns - Time, Date, Altitude & Azimuth.
I produce a scatter for alt & az using:
chart = plt.scatter(ephem.Azimuth, ephem.Altitude, marker='x', color='black', s=8)
What's the most efficient way to set the values in the Time column as the labels/ticks on the x axis?
So:
the scale/gridlines etc all remain the same
the chart still plots alt and az
the y axis ticks/labels remain as is
only the x axis ticks/labels are changed to the Time column.
Thanks
This isn't by any means the cleanest piece of code but the following works for me:
import matplotlib.pyplot as plt
fig, ax = plt.subplots()
ax.scatter(ephem.Azimuth, ephem.Altitude, marker='x', color='black', s=8)
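labels = list(ephem.Time)
# Place one tick at each plotted azimuth so every label lines up with its point
ax.set_xticks(ephem.Azimuth)
ax.set_xticklabels(labels)
plt.show()
This explicitly sets the x tick labels to the values in the Time column of your dataframe.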
In other words, you want to change the x-axis tick labels using a list of values.
labels = ephem.Time.tolist()
# make your plot and before calling plt.show()
# insert the following three lines
ax = plt.gca()
ax.set_xticks(ephem.Azimuth)  # one tick per plotted point, so the labels line up
ax.set_xticklabels(labels)
plt.show()
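With many points, one label per point gets crowded. A possible variation, sketched below, labels only every step-th point. The helper name scatter_with_time_labels, its step parameter and the sample values are all made up for illustration; only the column names come from the question.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd


def scatter_with_time_labels(ephem, step=1):
    """Scatter Altitude vs Azimuth and label every `step`-th point with its Time."""
    fig, ax = plt.subplots()
    ax.scatter(ephem["Azimuth"], ephem["Altitude"], marker="x", color="black", s=8)
    picked = ephem.iloc[::step]
    # Ticks sit at the azimuth of the picked rows; labels show the matching Time
    ax.set_xticks(picked["Azimuth"].to_numpy())
    ax.set_xticklabels(picked["Time"].tolist(), rotation=45)
    return fig, ax


# Made-up sample data, only to exercise the helper
ephem = pd.DataFrame({
    "Time": ["20:00", "21:00", "22:00", "23:00", "00:00", "01:00"],
    "Azimuth": [90.0, 100.0, 110.0, 120.0, 130.0, 140.0],
    "Altitude": [10.0, 20.0, 28.0, 33.0, 35.0, 34.0],
})
fig, ax = scatter_with_time_labels(ephem, step=2)
```

Note that moving the ticks also moves the vertical gridlines, since gridlines follow tick positions.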

Set x axis locator at hour intervals on matplotlib subplot

I am trying to create a figure with four subplots using the Matplotlib object-based approach. I am having trouble setting the x-axis to hourly markers on each plot. With my present code, the hourly marks are retained only on the last of the four subplots.
I have a list which contains four dataframes that were read in from CSV. I used pd.to_datetime to create an index. No problem.
I can loop through the four dataframes and plot my y variable (TS_comp) against time. This works fine and I get date/time on each x-axis. But what I want is to have just hour markers on each x-axis. When I add code in the loop to set the major locator, the x-axis labels are wiped on the first three subplots. The two lines of code from the loop below are:
ax.xaxis.set_major_locator(hours)
ax.xaxis.set_major_formatter(mdates.DateFormatter('%H'))
I do not understand why this is happening as each time it goes through the loop it should be addressing a different axis object. Note x-axis time ranges are different so not a simple matter of sharing the x-axis across the subplots.
fig, ax = plt.subplots(nrows=2, ncols=2)
i=0
hours = mdates.HourLocator(interval = 1)
for ax in fig.get_axes():
    ax.plot(dfs[i].TS_comp,'k-',markersize = 0.5)
    ax.xaxis.set_major_locator(hours)
    ax.xaxis.set_major_formatter(mdates.DateFormatter('%H'))
    i=i+1
I expected to get hourly markers on each of the subplots, but ended up with hourly markers on just the last plot.
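The probable cause is that one HourLocator instance is shared by all four axes. A locator keeps a reference to the single axis it is attached to, so each set_major_locator(hours) call rebinds it and only the last subplot ends up with a working locator. A minimal sketch of the fix is to create a fresh locator (and formatter) inside the loop. The sample dataframes and the plot_with_hour_ticks helper are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def plot_with_hour_ticks(dfs):
    """Plot TS_comp from four dataframes, each subplot with its own hour locator."""
    fig, axes = plt.subplots(nrows=2, ncols=2)
    for ax, df in zip(axes.flat, dfs):
        ax.plot(df.index, df["TS_comp"], "k-")
        # New locator and formatter per axes: these objects must not be shared
        ax.xaxis.set_major_locator(mdates.HourLocator(interval=1))
        ax.xaxis.set_major_formatter(mdates.DateFormatter("%H"))
    return fig, axes


# Made-up data: four frames covering different six-hour windows
starts = ["2019-05-01 00:00", "2019-05-02 06:00",
          "2019-05-03 12:00", "2019-05-04 18:00"]
dfs = []
for start in starts:
    index = pd.date_range(start, periods=36, freq="10min")
    dfs.append(pd.DataFrame({"TS_comp": np.linspace(-60, -40, len(index))},
                            index=index))

fig, axes = plot_with_hour_ticks(dfs)
```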

matplotlib formatting x axis with timestamps from big data

I am trying to create a plot that has a lot of data on it. This one in particular has about 550 points on it, each with its own timestamp. When I plot this, there are so many timestamps that I just get a black bar. I know it is not reasonable to expect to be able to make all timestamps visible, but is there a way to format the ticks so that they represent the range of values?
Here is my code:
plt.figure(1)
plt.scatter(x_axis_input, y_axis_input, s=DOT_SIZE)
plt.xlabel('timestamp')
plt.ylabel('value')
plt.title('test')
plt.savefig('plot_test.png')
plt.close()
and here is the resulting plot:
Link to plot
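One possible approach, assuming x_axis_input holds timestamp strings: a black bar of labels is what you get when every string is treated as its own category and given a tick. Converting the values to datetimes first lets Matplotlib's date locator choose a handful of ticks that span the range. This is only a sketch: the scatter_timestamps helper, its dot_size parameter and the sample data are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def scatter_timestamps(timestamps, values, dot_size=4):
    """Scatter values against timestamps with automatically thinned date ticks."""
    x = pd.to_datetime(timestamps).to_numpy()  # strings -> datetime64 values
    fig, ax = plt.subplots()
    ax.scatter(x, values, s=dot_size)
    locator = mdates.AutoDateLocator()  # picks a sensible tick spacing
    ax.xaxis.set_major_locator(locator)
    ax.xaxis.set_major_formatter(mdates.ConciseDateFormatter(locator))
    ax.set_xlabel("timestamp")
    ax.set_ylabel("value")
    ax.set_title("test")
    return fig, ax


# Made-up input: 550 timestamps, one per minute, as strings
stamps = pd.date_range("2024-01-01", periods=550, freq="min")
x_axis_input = stamps.strftime("%Y-%m-%d %H:%M:%S").tolist()
y_axis_input = np.sin(np.linspace(0, 10, len(x_axis_input)))

fig, ax = scatter_timestamps(x_axis_input, y_axis_input)
```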

Pandas DataFrame plotting - Tick labels

This is a follow-up question on a piece of code I posted previously here.
I am plotting a DataFrame object using data_CO2_PBPROD.T.plot(marker='o', color='k', alpha=0.3, lw=2), but I get double labels on the x-axis, as you can see in this picture.
I tried to work on the set_major_formatter property of matplotlib.pyplot.axes() but then I get a separate graph with the correct tick labels - but no data displayed - along with the previous graph, unchanged.
You can use the xticks argument to set the tick values of your axis, as explained in the documentation here:
xticks = [date for _ ,date in data_CO2_PBPROD.Column1]
where Column1 is the column of your DataFrame containing the (Values, Year) tuples.
Then pass xticks as a parameter to your plot function:
data_CO2_PBPROD.T.plot(marker='o', xticks=xticks, color='k', alpha=0.3, lw=2)
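A minimal sketch of the xticks argument on a stand-in frame. The real layout of data_CO2_PBPROD is not shown in the question, so the integer year index, the column names and the values below are all assumptions made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data indexed by year
years = [1990, 1995, 2000, 2005, 2010]
df = pd.DataFrame(
    {"CO2": [1.0, 1.2, 1.5, 1.4, 1.7], "PBPROD": [0.8, 0.9, 1.1, 1.3, 1.2]},
    index=years,
)

# xticks pins the tick positions, so each year gets exactly one label
ax = df.plot(marker="o", color="k", alpha=0.3, lw=2, xticks=list(df.index))
```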
