I am trying to create a simple bar chart with ticks on the x-axis represented by strings in the format YYYY-WW i.e. Year and Week of the Year.
My x labels are too dense as shown below.
While trying to show only every other tick label, I came across this post: "X-axis tick labels are too dense when drawing plots with matplotlib". That user had exactly the same issue and was using the same format.
The answers there suggest that the best approach is to convert the strings to datetime objects, since matplotlib then takes care of the spacing automatically.
However, I have not found anything explaining how to convert this custom format into a datetime object.
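One possible way to do the conversion, as a minimal sketch: it assumes the WW part is an ISO 8601 week number and maps each label to the Monday of that week. The helper name year_week_to_date and the sample labels are made up for illustration, and date.fromisocalendar needs Python 3.8 or later.

```python
from datetime import date


def year_week_to_date(label: str) -> date:
    """Return the Monday of the ISO week named by a 'YYYY-WW' label.

    Assumption: WW is an ISO 8601 week number (weeks start on Monday).
    """
    year_text, week_text = label.split("-")
    # The final argument is the ISO weekday: 1 means Monday.
    return date.fromisocalendar(int(year_text), int(week_text), 1)


# Hypothetical labels in the same format as the question.
labels = ["2020-01", "2020-05", "2020-52"]
week_starts = [year_week_to_date(label) for label in labels]
```

Passing week_starts as the x values instead of the raw strings should let matplotlib treat the axis as dates. If the weeks were numbered with a different convention (for example strftime's %W), the mapping would need to change accordingly.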
I have a dataframe that contains several variables, such as the number of comments on each post of an Instagram page and the upload date of each post. I collected the data with the Instascrape package.
Now I want to make a scatter plot of these two variables, with the date on the x-axis and the number of comments on the y-axis, using the following commands:
plt.scatter(posts_df["upload_date"], posts_df["comments"])
plt.show()
The scatter plot is drawn, but the dates on the x-axis overlap. I believe this is because every date gets its own label, so labels for dates that are close together run into each other.
I tried to convert it using:
plt.scatter(posts_df["upload_date"], posts_df["comments"])
plt.show()
But that did not change anything. The dates look like this:
  upload_date
0 2020-12-26 01:23:57
How can I solve this?
I solved it simply by calling:
plt.gcf().autofmt_xdate()
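For context, here is a minimal sketch of that call in a complete script. The timestamps and comment counts are made-up stand-ins for the posts_df columns, and the Agg backend is selected only so the script can run without a display.

```python
from datetime import datetime

import matplotlib

matplotlib.use("Agg")  # assumption: headless run; drop this line for interactive use
import matplotlib.pyplot as plt

# Hypothetical data standing in for posts_df["upload_date"] and posts_df["comments"].
upload_dates = [
    datetime.strptime(text, "%Y-%m-%d %H:%M:%S")
    for text in ["2020-12-26 01:23:57", "2020-12-27 14:05:10", "2020-12-29 09:41:33"]
]
comments = [12, 30, 7]

fig, ax = plt.subplots()
ax.scatter(upload_dates, comments)
# Rotate the x tick labels (30 degrees by default) and right-align them.
fig.autofmt_xdate()
```

The rotation only helps if the x values are real datetime objects. If the column holds strings, it would need to be parsed first, as the list comprehension above does.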
I keep running into matplotlib bar graphs that have too many categorical values on the x-axis. The answers to "Resize a figure automatically in matplotlib" and "Python matplotlib multiple bars" do not do the trick because my x values are categorical. My idea is to split the graph into two graphs once it exceeds a certain number of data points. I cannot find anything about this in the matplotlib documentation or anywhere else.
Is there a matplotlib tool that does this, or would I need to write an algorithm that checks the length of the dataset?
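I am not aware of a built-in matplotlib option for this, so one approach is to split the data yourself before plotting. The sketch below shows only the splitting step; the function name split_into_chunks, the max_bars parameter and the sample data are all hypothetical.

```python
def split_into_chunks(categories, values, max_bars):
    """Split paired categories and values into lists of at most max_bars items."""
    if max_bars < 1:
        raise ValueError("max_bars must be at least 1")
    pairs = list(zip(categories, values))
    # Step through the pairs max_bars at a time; the last chunk may be shorter.
    return [pairs[start:start + max_bars] for start in range(0, len(pairs), max_bars)]


# Hypothetical data: five categories, at most two bars per graph.
categories = ["alpha", "beta", "gamma", "delta", "epsilon"]
values = [4, 9, 2, 7, 5]
chunks = split_into_chunks(categories, values, max_bars=2)
```

Each chunk can then be drawn on its own Axes, for example by creating len(chunks) subplots with plt.subplots and calling ax.bar once per chunk.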
I am going through a notebook that plots time series using Datashader, and I noticed that it converts the time series values to 'ms' and then uses these values for the x-axis:
https://anaconda.org/jbednar/tseries/notebook
Can I use datetime values on the x-axis when plotting time series data, or do they have to be converted to an integer or float format?
Thanks
Datashader itself supports only real-valued axes, but it is relatively simple to use HoloViews to construct a Bokeh plot of Datashader-rendered data labeled with date-time axes. You can see examples in Datashader's HoloViews_Datashader notebook.
Basically, you can provide the real (actually int in this case) values to Datashader that it understands, but then convert them to human-readable dates before you label the axes.
Bokeh's low level, foundational representation of datetime values is "floating point milliseconds since epoch". So sending that is always an option. However, Bokeh can recognize and generally convert most common datetime data types automatically: numpy datetime arrays, Pandas datetime indices and series, python datetime objects, etc. so there is usually no need to convert to ms yourself.
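If you do need to do the conversion by hand, a minimal sketch with only the standard library might look like this. The helper names to_epoch_ms and from_epoch_ms are made up, and the sketch assumes timezone-aware UTC datetimes.

```python
from datetime import datetime, timezone


def to_epoch_ms(moment: datetime) -> float:
    """Convert a timezone-aware datetime to floating point milliseconds since epoch."""
    return moment.timestamp() * 1000.0


def from_epoch_ms(milliseconds: float) -> datetime:
    """Convert milliseconds since epoch back to a UTC datetime for labeling."""
    return datetime.fromtimestamp(milliseconds / 1000.0, tz=timezone.utc)


# Hypothetical sample value; UTC is an assumption.
stamp = datetime(2020, 12, 26, 1, 23, 57, tzinfo=timezone.utc)
as_ms = to_epoch_ms(stamp)
round_trip = from_epoch_ms(as_ms)
```

The numeric values are what a real-valued axis can work with, while the reverse conversion gives back something readable for the labels.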
I'm working on a graph that represents activity over time, modelled after a chart in Sleep As Android. It's similar to a heatmap, but does not use color variation. Each column is a date, and the y axis spans the full duration of the day. Each interval of time in which an activity occurs is blocked off with a bar. Here's what I have so far:
So far I have only been able to accomplish this by manually plotting rectangles on the figure. I loop through a list of events with dates, start times, and end times, plotting them like so:
ax.add_patch(patches.Rectangle(
    (31*event_month + event_day, end_time),  # (x, y)
    0.75,                                    # width
    duration))                               # height
This date handling is clearly wrong - it's just for the purpose of demonstration.
Normally when creating a histogram, I can plot using date objects directly with something like this:
fig.autofmt_xdate()
ax.fmt_xdata = matplotlib.dates.DateFormatter('%Y-%m-%d')
I'd like to be able to somehow just use the date objects directly when plotting. Is there a way to accomplish this in pyplot, or do I need to do something like converting the x axis to use POSIX time and just calculating where I put date labels?
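One possible approach, as a minimal sketch: ax.bar accepts date objects as x values, and its bottom argument lets each bar start at an arbitrary height, so the rectangles do not have to be added by hand. The events list is made up, hours are stored as floats within a single day, and the Agg backend is chosen only so the script runs without a display.

```python
from datetime import date

import matplotlib

matplotlib.use("Agg")  # assumption: headless run; drop this line for interactive use
import matplotlib.dates as mdates
import matplotlib.pyplot as plt

# Hypothetical events: (day, start hour, end hour), hours as floats within one day.
events = [
    (date(2021, 3, 1), 23.0, 24.0),
    (date(2021, 3, 2), 0.0, 7.5),
    (date(2021, 3, 3), 22.5, 24.0),
]
days = [day for day, _, _ in events]
starts = [start for _, start, _ in events]
durations = [end - start for _, start, end in events]

fig, ax = plt.subplots()
# x values are date objects; the numeric width is interpreted in days.
bars = ax.bar(days, durations, bottom=starts, width=0.75)
ax.set_ylim(24, 0)  # put midnight at the top so the day runs downward
ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y-%m-%d"))
fig.autofmt_xdate()
```

An event that crosses midnight would have to be split into two entries, one per day, as the first two sample events are.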
matplotlib's axis-formatting options tend to fall flat when it comes to plotting and effectively labeling dense time-series data.
One problem is that tick labels are tied to ticks, so if you set axis ticks at an appropriate frequency, there are usually too many labels. This also means that if you are plotting, say, daily data over a period of several years, there is no good way to label the x-axis with each year in its natural position: centered under the year's data (i.e., under the x-axis position for July 2, or thereabouts).
The trick described in this example—set major ticks where you want them, then use invisible minor ticks to place the labels elsewhere—works, but it limits you to one visible set of axis ticks (since each axis is limited to one set of major and one set of minor ticks). You can't show, say, major ticks at the start of each year and minor ticks at the start of each month without giving up the ability to put year labels centered appropriately between the major (yearly) ticks, as you would find in publication-quality plots.
Is there a work-around that doesn't involve drawing everything fully manually?
Have you looked at the tsplot capability in scikits.timeseries? It hasn't been maintained much recently, but it works pretty well. I'll be porting that code into pandas at some point in the relatively near future.