X-axis overcrowding using matplotlib - python

I'm plotting data points for every day of the year, but the x-axis is very overcrowded as a result.
How can I reduce the number of ticks on the x-axis?
I've tried using:
ax.set_xticks(ax.get_xticks()[::2])
and
plt.locator_params(axis='x', nbins=10)
But that just seems to cut off the range of the x-ticks instead of condensing them.
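One way to condense the ticks rather than truncate them, sketched under the assumption that the x values are real datetimes (the data below is synthetic and the variable names are illustrative), is to let a date locator place one tick per month:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for the real data: one value per day of 2021
days = pd.date_range("2021-01-01", "2021-12-31", freq="D")
values = np.sin(np.arange(len(days)) / 30.0)

fig, ax = plt.subplots(figsize=(10, 4))
ax.plot(days, values)

# One major tick at the start of each month, labelled with the month abbreviation
ax.xaxis.set_major_locator(mdates.MonthLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter("%b"))

monthly_ticks = ax.get_xticks()
```

The full date range stays visible because only the tick positions change, not the axis limits.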

Related

Why I am getting two different plots for same data?

I have a data file. (duration : One day)
I want to plot time vs temp graph.
I tried to plot it using two different blocks of code. The methods are standard, but I am getting two different plots.
In the first method I converted the time column to datetime.time objects:
# method 1
t=pd.to_datetime(data['time'],format='%H:%M:%S.%f').dt.time
data['time']=t
data.drop('index',axis=1,inplace=True)
ax=data.plot(x='time',color=['tab:blue','tab:red'])
ax.tick_params(axis='x', rotation=45)
In the second method I used time in string format:
fig, ax = plt.subplots()
ax.plot(data['time'], data['temp'])
ax.plot(data['time'], data['temp_mean'],color='red')
fig.autofmt_xdate()
ax.tick_params(axis='x', rotation=45)
y1,y2=float(data.temp.min()),float(data.temp.max())
ax.yaxis.set_ticks(np.arange(y1,y2),10)
ax.set_xlabel('Time')
ax.set_ylabel('Temp')
plt.show()
The second plot has the expected shape, but its x-ticks overlap. I need the x-ticks at hourly intervals (1 hr frequency).
Why am I getting plots with two different patterns? How can I add x-ticks with a 1-hour frequency?
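A possible approach for hourly ticks, sketched on synthetic data (the column names follow the question, the values are made up): parse the time strings into full datetimes, then attach an hour locator. When only a time is given, pandas fills in a default date, which the formatter hides.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic one-day dataset with the time stored as strings
stamps = pd.date_range("2020-01-01 00:00:00", periods=24 * 60, freq="min")
data = pd.DataFrame({
    "time": stamps.strftime("%H:%M:%S.%f"),
    "temp": 20 + 5 * np.sin(np.linspace(0, 2 * np.pi, len(stamps))),
})

# Parse into datetimes rather than datetime.time objects or raw strings
t = pd.to_datetime(data["time"], format="%H:%M:%S.%f")

fig, ax = plt.subplots()
ax.plot(t, data["temp"])
ax.xaxis.set_major_locator(mdates.HourLocator(interval=1))   # one tick per hour
ax.xaxis.set_major_formatter(mdates.DateFormatter("%H:%M"))  # show time only
ax.tick_params(axis="x", rotation=45)
ax.set_xlabel("Time")
ax.set_ylabel("Temp")

hour_ticks = ax.get_xticks()
```

With strings on the x-axis matplotlib treats every distinct value as its own category, which is one plausible reason the two methods produce different tick patterns.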

How to properly display date from csv in matplotlib plot?

I have a CSV with the following columns: recorded, humidity and temperature. I want to display the recorded values (date and time) on the x-axis and the humidity on the y-axis. How can I display the dates properly? It is quite a big CSV, and my current plot shows a black band instead of readable dates. My date format looks like this: 2019-09-12T07:26:55, so each value in the CSV contains both the date and the time.
I have displayed the plot using this code:
from matplotlib import pyplot as plt
import pandas as pd
data = pd.read_csv('home_data.csv')
plt.plot(data.recorded, data.humidity)
plt.xlabel('date')
plt.ylabel('humidity')
plt.title('Visualizing date and humidity')
plt.show()
This is a print screen of the plot:
https://snipboard.io/d4hfS7.jpg
Actually, the plot is displaying every date in your dataset. There are so many that they merge into a black blob. You can downsample the x-ticks to improve readability. Do something like this:
fig, ax = plt.subplots()
ax.plot(data.recorded, data.humidity)
# some axes labelling
# Now reduce the number of ticks printed in the figure
ax.set_xticks(ax.get_xticks()[::4])
ax.tick_params(axis='x', rotation=45)
In the line ax.set_xticks(ax.get_xticks()[::4]) you are setting the x-axis ticks by picking one date in every four with a slice. This reduces the number of dates printed. You can increase the step as much as you want.
To improve readability further, you can rotate the tick labels, as in the line
ax.tick_params(axis='x', rotation=45).
Hope this helps.
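An alternative sketch, assuming the recorded column holds ISO timestamps as in the question: if the column is parsed into datetimes when the file is read, matplotlib no longer draws one tick per string and chooses date ticks on its own. The inline CSV below is a made-up stand-in for home_data.csv.

```python
import io
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import pandas as pd

# Small inline stand-in for home_data.csv (values are invented)
timestamps = pd.date_range("2019-09-12T07:26:55", periods=500, freq="10min")
csv_text = "recorded,humidity,temperature\n" + "\n".join(
    f"{ts.isoformat()},{40 + i % 10},{20 + i % 5}"
    for i, ts in enumerate(timestamps)
)

# parse_dates converts the strings to datetimes, so the axis is not categorical
data = pd.read_csv(io.StringIO(csv_text), parse_dates=["recorded"])

fig, ax = plt.subplots()
ax.plot(data["recorded"], data["humidity"])
ax.set_xlabel("date")
ax.set_ylabel("humidity")
ax.set_title("Visualizing date and humidity")
fig.autofmt_xdate()  # rotate and right-align the date labels

n_ticks = len(ax.get_xticks())
```

This avoids choosing a slicing step by hand, at the cost of an extra parsing pass over the column.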

Set x axis locator at hour intervals on matplotlib subplot

I am trying to create a figure with four subplots using the Matplotlib object-based approach. I am having trouble setting the x-axis to hourly markers on each plot. With my present code the hourly marks are retained only on the last of the four subplots.
I have a list which contains four dataframes that were read in from CSV. I used pd.to_datetime to create an index. No problem.
I can loop through the four dataframes and plot my y variable (TS_comp) against time. This works fine and I get date/time on each x-axis. But what I want is to have just hour markers on each x-axis. When I add code in the loop to set the major locator, the x-axis labels end up wiped on the first three subplots. The two lines of code from the loop below are:
ax.xaxis.set_major_locator(hours)
ax.xaxis.set_major_formatter(mdates.DateFormatter('%H'))
I do not understand why this is happening as each time it goes through the loop it should be addressing a different axis object. Note x-axis time ranges are different so not a simple matter of sharing the x-axis across the subplots.
fig, ax = plt.subplots(nrows=2, ncols=2)
i=0
hours = mdates.HourLocator(interval = 1)
for ax in fig.get_axes():
    ax.plot(dfs[i].TS_comp,'k-',markersize = 0.5)
    ax.xaxis.set_major_locator(hours)
    ax.xaxis.set_major_formatter(mdates.DateFormatter('%H'))
    i=i+1
Expect to get hourly markers on each of the subplots, ended up with hourly markers on just the last plot
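A likely cause, stated here as an assumption rather than something the question confirms: the single HourLocator instance is shared by all four axes, and a locator object keeps a reference to one axis at a time, so only the last axes it was attached to still uses it. A minimal sketch that creates a new locator and formatter inside the loop (the synthetic dataframes stand in for dfs):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Four synthetic dataframes with different time ranges, standing in for `dfs`
dfs = []
for k in range(4):
    idx = pd.date_range(f"2019-06-0{k + 1} 0{k}:00", periods=6 * 60, freq="min")
    dfs.append(pd.DataFrame({"TS_comp": np.random.rand(len(idx))}, index=idx))

fig, axes = plt.subplots(nrows=2, ncols=2)
for ax, df in zip(axes.flat, dfs):
    ax.plot(df.index, df["TS_comp"], "k-", linewidth=0.5)
    # Fresh locator and formatter for every axes; no instance is shared
    ax.xaxis.set_major_locator(mdates.HourLocator(interval=1))
    ax.xaxis.set_major_formatter(mdates.DateFormatter("%H"))

tick_counts = [len(ax.get_xticks()) for ax in axes.flat]
```

Iterating with zip over the axes and dataframes also removes the need for the manual counter i.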

X-axis tick labels are too dense when drawing plots with matplotlib

I am drawing matplotlib plots and my x axis consists of YYYYMM formatted strings of year and month like 201901 for January of 2019.
My problem is that some of the data spans a long period of time, and this makes the x-axis tick labels so dense that they pile up on each other and become unreadable.
I tried making the font smaller and I rotated the labels 90 degrees, which helped a lot, but it is still not enough for some of my data.
Here is an example of one of my x-axes which looks OK:
And here is an example of an x-axis which is too dense because the data spans a long period of time:
So I want matplotlib to skip printing a few tick labels when the tick labels start piling up on each other. For example, print the label for January, skip printing the labels for February, March, April and May, print the label for June, and skip printing the labels for July, August etc. But I don't know how to do this.
Or are there any other kind of solutions I can use to overcome this problem?
A quick and dirty solution would be the following:
ax.set_xticks(ax.get_xticks()[::2])
This would only display every second xtick. If you wanted to only display every n-th tick you would use
ax.set_xticks(ax.get_xticks()[::n])
If you don't have a handle on ax you can get one as ax = plt.gca().
Alternatively, you could specify the number of xticks to use with:
plt.locator_params(axis='x', nbins=10)
An alternate solution could be as below:
x = df['Date']
y = df['Value']
# Resize the figure (optional)
plt.figure(figsize=(20,5))
# Plot the x and y values on the graph
plt.plot(x, y)
# Here you specify the ticks you want to display
# You can also specify rotation for the tick labels in degrees or with keywords.
plt.xticks(x[::5], rotation='vertical')
# Add margins (padding) so that markers don't get clipped by the axes
plt.margins(0.2)
# Display the graph
plt.show()
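The slicing idea above can be wrapped so that the step adapts to the amount of data. This is a rough sketch: thin_xticks and max_labels are hypothetical names introduced for illustration, and the YYYYMM labels are synthetic.

```python
import math
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt

def thin_xticks(ax, labels, max_labels=12):
    """Label at most `max_labels` evenly spaced positions (hypothetical helper)."""
    step = max(1, math.ceil(len(labels) / max_labels))
    positions = list(range(0, len(labels), step))
    ax.set_xticks(positions)
    ax.set_xticklabels([labels[p] for p in positions], rotation=90)
    return positions

# Synthetic YYYYMM labels covering ten years (120 months)
months = [f"{year}{month:02d}" for year in range(2010, 2020) for month in range(1, 13)]
values = list(range(len(months)))

fig, ax = plt.subplots()
ax.plot(range(len(months)), values)  # plot against integer positions
shown = thin_xticks(ax, months, max_labels=12)
```

Short series keep every label because the step never drops below 1, while long series are thinned automatically.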

matplotlib formatting x axis with timestamps from big data

I am trying to create a plot that has a lot of data on it. This one in particular has about 550 points on it, each with its own timestamp. When I plot this, there are so many timestamps that I just get a black bar. I know it is not reasonable to expect to be able to make all timestamps visible, but is there a way to format the ticks so that they represent the range of values?
Here is my code:
plt.figure(1)
plt.scatter(x_axis_input, y_axis_input, s=DOT_SIZE)
plt.xlabel('timestamp')
plt.ylabel('value')
plt.title('test')
plt.savefig('plot_test.png')
plt.close()
and here is the resulting plot:
Link to plot
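One way to make the ticks represent the range, sketched under the assumption that x_axis_input is a list of timestamp strings (the question does not show its type, so the data below is synthetic): convert the strings to datetimes and place a handful of evenly spaced ticks between the first and last timestamp.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-ins for x_axis_input / y_axis_input: 550 timestamp strings
times = pd.date_range("2020-03-01 08:00:00", periods=550, freq="min")
x_axis_input = list(times.strftime("%Y-%m-%d %H:%M:%S"))
y_axis_input = np.random.rand(550)

x = pd.to_datetime(x_axis_input)  # strings -> datetimes

fig, ax = plt.subplots()
ax.scatter(x, y_axis_input, s=4)

# Six evenly spaced ticks from the first to the last timestamp
tick_times = pd.date_range(x.min(), x.max(), periods=6)
ax.set_xticks(mdates.date2num(tick_times.to_pydatetime()))
ax.xaxis.set_major_formatter(mdates.DateFormatter("%H:%M:%S"))
fig.autofmt_xdate()
ax.set_xlabel("timestamp")
ax.set_ylabel("value")
ax.set_title("test")
```

The tick count of six is arbitrary; any small number that fits the figure width works the same way.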
