Using matplotlib to draw rectangles with dates as coordinates - python

I'm working on a graph that represents activity over time, modelled after a chart in Sleep As Android. It's similar to a heatmap, but does not use color variation. Each column is a date, and the y axis spans the full duration of the day. Any interval during which an activity occurs is blocked off with a bar. Here's what I have so far:
So far I have only been able to accomplish this by manually plotting rectangles on the figure. I loop through a list of events with dates, start times, and end times, plotting them like so:
ax.add_patch(patches.Rectangle((31*event_month + event_day, end_time), # (x,y)
0.75, # width
duration)) # height
This date handling is clearly wrong - it's just for the purpose of demonstration.
Normally when creating a histogram, I can plot using date objects directly with something like this:
fig.autofmt_xdate()
ax.fmt_xdata = matplotlib.dates.DateFormatter('%Y-%m-%d')
I'd like to be able to somehow just use the date objects directly when plotting. Is there a way to accomplish this in pyplot, or do I need to do something like converting the x axis to use POSIX time and just calculating where I put date labels?
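One possible approach, offered as a sketch rather than a confirmed answer from this thread: matplotlib represents dates internally as floating-point day numbers, and matplotlib.dates.date2num converts datetime objects to that scale. A Rectangle can then take a converted date as its x coordinate while the axis still shows date labels. The events list below is made up for illustration, and each event is assumed to stay within a single day.

```python
import datetime as dt

import matplotlib.dates as mdates
import matplotlib.pyplot as plt
from matplotlib.patches import Rectangle

# Hypothetical events: (day, start hour, end hour), hours on a 0-24 scale
events = [
    (dt.datetime(2015, 3, 1), 22.5, 24.0),
    (dt.datetime(2015, 3, 2), 0.0, 6.5),
    (dt.datetime(2015, 3, 3), 23.0, 24.0),
]

fig, ax = plt.subplots()
bar_width = 0.75  # in days, since one x unit is one day

for day, start, end in events:
    x = mdates.date2num(day)  # datetime -> float day number
    # Center the bar on the date; height is the event duration in hours
    ax.add_patch(Rectangle((x - bar_width / 2, start), bar_width, end - start))

# Patches do not autoscale the view, so set the limits explicitly
ax.set_xlim(mdates.date2num(events[0][0]) - 1, mdates.date2num(events[-1][0]) + 1)
ax.set_ylim(0, 24)

ax.xaxis_date()  # treat x values as dates when labelling ticks
ax.fmt_xdata = mdates.DateFormatter('%Y-%m-%d')
fig.autofmt_xdate()
# Call plt.show() or fig.savefig(...) to view the result
```

An event that crosses midnight would need to be split into two rectangles, one per date column.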

Related

Combining two dataframes with different time intervals

I am doing a study in school about the effect of noise in a person's environment and his/her activity.
I have two dataframes with data I would like to compare. The data was recorded at the same time, but the time intervals between measurements are different. This makes it hard for me to overlay a plot and look at possible correlations.
The data frames look like this:
[tables for volume level and steps not preserved]
When I try to put these two dataframes in one plot with a sync timeline, the steps graph looks way smaller than the volume level graph. I have tried to plot the two graphs in multiple ways, but I keep ending up with something like this:
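How about this?
This code uses a second y axis, so each series gets its own scale and the steps graph is no longer dwarfed by the volume level graph.
import matplotlib.pyplot as plt

# Steps on the left y axis
ax = steps_Niels_1st["steps"].plot()
# Second y axis on the right, sharing the same x axis
ax1 = ax.twinx()
volume_data_Niels_1st['size'].plot(ax=ax1)
plt.show()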

Creating a scatter plot from a dataframe with dates on x-axis creates overlap

So I have a dataframe that consists of several variables, such as the number of comments on each post from an Instagram page and the upload date of each post. For this, I used the package Instascrape.
Now I want to make a scatter plot from these two variables, with the date on the x-axis and the number of comments on the y-axis, using the following command:
plt.scatter(posts_df["upload_date"], posts_df["likes"])
plt.show()
Now I face the following problem: I get the scatter plot, but the dates overlap. I believe this is because every date is shown on the x-axis, so dates that are close together run into each other.
I tried to convert it using:
plt.scatter(posts_df["upload_date"], posts_df["comments"])
plt.show()
But that did not change anything. The dates look like this:
upload_date
0 2020-12-26 01:23:57
How can I solve this?
I solved it by simply using the command:
plt.gcf().autofmt_xdate()
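For reference, a self-contained sketch of that fix. The dates and comment counts here are invented stand-ins for posts_df, and the locator and formatter lines are an optional extra (not part of the original answer) that caps the number of date ticks.

```python
import datetime as dt

import matplotlib.dates as mdates
import matplotlib.pyplot as plt

# Hypothetical data standing in for posts_df["upload_date"] and posts_df["comments"]
upload_dates = [dt.datetime(2020, 12, 1) + dt.timedelta(days=3 * i) for i in range(20)]
comments = [5 + (7 * i) % 11 for i in range(20)]

fig, ax = plt.subplots()
ax.scatter(upload_dates, comments)

# Optional: limit how many date ticks are drawn and pick their format
ax.xaxis.set_major_locator(mdates.AutoDateLocator(maxticks=8))
ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y-%m-%d"))

# Rotate and right-align the date labels so they no longer overlap
fig.autofmt_xdate()
# Call plt.show() to view the result
```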

Custom datetime ticks in matplotlib

I am trying to create a simple bar chart with ticks on the x-axis represented by strings in the format YYYY-WW i.e. Year and Week of the Year.
My x labels are too dense as shown below.
In trying to alternate every other tick I came across this post - X-axis tick labels are too dense when drawing plots with matplotlib. This user had the same exact issue as myself and was also using the same format.
The answers there suggested that the best approach is to convert the strings to datetime objects, since matplotlib then takes care of the tick spacing automatically.
However I have not found any help suggesting how to convert this custom format into a datetime object.
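A possible conversion, assuming the week numbers follow ISO 8601 (weeks start on Monday, and week 1 is the week containing the year's first Thursday). The question does not say which numbering is used, so treat that as an assumption. The helper name parse_year_week is made up for this sketch. strptime needs a weekday to pin down a date, so the sketch appends "-1" (Monday) to each string.

```python
from datetime import datetime


def parse_year_week(label):
    """Return the Monday of the ISO week named by a 'YYYY-WW' string."""
    # %G = ISO year, %V = ISO week number, %u = ISO weekday (1 = Monday)
    return datetime.strptime(label + "-1", "%G-%V-%u")


labels = ["2020-01", "2020-02", "2020-53"]
dates = [parse_year_week(s) for s in labels]
```

The resulting datetime objects can be passed to the bar chart as x values in place of the strings. If the weeks use a different numbering, the %Y, %W (or %U) and %w directives may be the ones to use instead.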

Plotting for a large number of time series data points using matplotlib

I've collected sensor data every 5 minutes for a month (30 days).
That means I have time series data with 288*30 data points in total.
I'd like to draw a scatter plot of the data (x-axis: time, y-axis: sensor value).
The following code is for testing.
import pandas as pd
from matplotlib import pyplot as plt
import numpy as np
# generate time series randomly (length: 1 month)
rng=pd.date_range("2015-11-11",periods=288*30,freq="5min")
ts=pd.Series(np.random.randn(len(rng)),rng)
nr=3
nc=1
fig=plt.figure(1)
fig.subplots_adjust(left=0.04,top=1,bottom=0.02,right=0.98,wspace=0.1,hspace=0.1)
for i in range(3):
    ctr=i+1
    ax=fig.add_subplot(nr,nc,ctr)
    ax.scatter(ts.index,ts.values)
    ax.set_xlim(ts.index.min(),ts.index.max())
plt.show()
I've generated random time series data with 288*30 observations and tried to draw it as a scatter plot. However, as you can see, it is impossible to analyze the figure.
I want to redraw it satisfying the following conditions:
I want a zoomed-in version of the figure. In other words, a part of data points of some time range (e.g., 2~3 hours) is shown at once. Then, there should be enough space between adjacent points.
I want to save the figure as a png or pdf file. Then, if I open the file, the image (or pdf) viewer has a horizontal scroll bar which enables me to explore the whole figure.
Is there anyone who can solve it?
I do not think it will be hard for a matplotlib expert, but it is quite hard for me, a beginner.
note to readers: answer changed significantly from v1 due to clarification of the question
I want a zoomed-in version of the figure. In other words, a part of data points of some time range (e.g., 2~3 hours) is shown at once. Then, there should be enough space between adjacent points.
Zooming in matplotlib is implemented with the x and y limits of the axis. So you can simply change the arguments to your call to ax.set_xlim such that the corresponding times differ by 2-3 hours or however long you want. Knowing that you have a sample every 5 minutes, since 2 hours/(5 min/sample) = 24, you could use
ax.set_xlim(ts.index[0],ts.index[24])
to get a 2-hour range.
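The same window can also be expressed as a time span instead of a sample count, which avoids depending on the sampling interval. This is a sketch using the random test data from the question; pd.Timedelta is my choice here, not something the original answer used.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Same kind of random test data as in the question: one sample every 5 minutes
rng = pd.date_range("2015-11-11", periods=288 * 30, freq="5min")
ts = pd.Series(np.random.randn(len(rng)), rng)

fig, ax = plt.subplots()
ax.scatter(ts.index, ts.values)

# Show only the first 2 hours; shift `start` to look at a different window
start = ts.index.min()
ax.set_xlim(start, start + pd.Timedelta(hours=2))
# Call plt.show() to view the result
```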
I want to save the figure as a png or pdf file. Then, if I open the file, the image (or pdf) viewer has a horizontal scroll bar which enables me to explore the whole figure.
Use savefig to save the figure to a file. Note that if you have set the axis limits using set_xlim or xlim or equivalent, this will save only the portion of the figure that is visible within the given limits. So to save the entire figure (with all data points visible), you will need to set the axis limits to the minimum and maximum values, respectively.
When you open the image/PDF file in a viewer, whether it displays a scroll bar (and how much of the figure is shown) is entirely up to the viewer. You cannot control this in Python. But you can give it some chance of showing up with a horizontal scroll bar by making the figure very large in the horizontal direction. To do so, you can pass the figsize=(width, height) keyword argument when creating the figure, or use the set_size_inches(width, height) method on an existing Figure object. The measurements are in inches in both cases. Pass a value for width that is much larger than that for height and you will get a very wide figure; for example, 40 for width and 4 for height. You'll have to experiment with these values to find which ones give your figure the proportions you want.
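As a rough sketch of that idea, again with the question's random test data: the 40 by 4 inch size comes from the paragraph above, while the dpi and marker size are arbitrary choices of mine. The image is written to an in-memory buffer here so the snippet leaves no files behind; passing a file name to savefig instead writes it to disk.

```python
import io

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

rng = pd.date_range("2015-11-11", periods=288 * 30, freq="5min")
ts = pd.Series(np.random.randn(len(rng)), rng)

# Very wide, short figure: sizes are in inches
fig, ax = plt.subplots(figsize=(40, 4))
ax.scatter(ts.index, ts.values, s=4)

# Full data range, so every point ends up in the saved file
ax.set_xlim(ts.index.min(), ts.index.max())

# Save as PNG; replace `buf` with a file name to write to disk
buf = io.BytesIO()
fig.savefig(buf, format="png", dpi=100)
```

At 100 dpi this produces an image about 4000 pixels wide, which most viewers will show with a horizontal scroll bar at full zoom.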

How do I convert (or scale) axis values and redefine the tick frequency in matplotlib?

I am displaying a jpg image (I rotate this by 90 degrees, if this is relevant) and of course the axes display the pixel coordinates. I would like to convert the axis so that instead of displaying the pixel number, it will display my unit of choice - be it radians, degrees, or in my case an astronomical coordinate. I know the conversion from pixel to (eg) degree. Here is a snippet of what my code looks like currently:
import matplotlib.pyplot as plt
from PIL import Image  # current import path for PIL/Pillow
import matplotlib
thumb = Image.open(self.image)
thumb = thumb.rotate(90)
dpi = plt.rcParams['figure.dpi']
figsize = thumb.size[0]/dpi, thumb.size[1]/dpi
fig = plt.figure(figsize=figsize)
plt.imshow(thumb, origin='lower',aspect='equal')
plt.show()
...so following on from this, can I take each value that matplotlib would print on the axis, and change/replace it with a string to output instead? I would want to do this for a specific coordinate format - eg, rather than an angle of 10.44 (degrees), I would like it to read 10 26' 24'' (ie, degrees, arcmins, arcsecs)
Finally on this theme, I'd want control over the tick frequency, on the plot. Matplotlib might print the axis value every 50 pixels, but I'd really want it every (for example) degree.
It sounds like I would like to define some kind of array with the pixel values and their converted values (degrees etc) that I want to be displayed, having control over the sampling frequency over the range xmin/xmax range.
Are there any matplotlib experts on Stack Overflow? If so, thanks very much in advance for your help! To make this a more learning experience, I'd really appreciate being prodded in the direction of tutorials etc on this kind of matplotlib problem. I've found myself getting very confused with axes, axis, figures, artists etc!
Cheers,
Dave
It looks like you're dealing with the matplotlib.pyplot interface, which means that you'll be able to bypass most of the dealing with artists, axes, and the like. You can control the values and labels of the tick marks by using the matplotlib.pyplot.xticks command, as follows:
tick_locs = [list of locations where you want your tick marks placed]
tick_lbls = [list of corresponding labels for each of the tick marks]
plt.xticks(tick_locs, tick_lbls)
For your particular example, you'll have to compute what the tick marks are relative to the units (i.e. pixels) of your original plot (since you're using imshow) - you said you know how to do this, though.
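To make that concrete, here is a sketch under made-up assumptions: a linear conversion of 100 pixels per degree with pixel 0 at 10 degrees, and a random array standing in for the image. The helper deg_to_dms is a hypothetical name, not a matplotlib function. It formats an angle the way the question asks (degrees, arcmins, arcsecs) and the ticks are placed every whole degree.

```python
import numpy as np
import matplotlib.pyplot as plt


def deg_to_dms(deg):
    """Format decimal degrees as degrees, arcminutes, arcseconds."""
    total_arcsec = int(round(abs(deg) * 3600))  # work in whole arcseconds
    d, rem = divmod(total_arcsec, 3600)
    m, s = divmod(rem, 60)
    sign = "-" if deg < 0 else ""
    return f"{sign}{d} {m:02d}' {s:02d}''"


# Assumed conversion (hypothetical): pixel 0 is at 10 degrees, 100 pixels per degree
deg_at_pixel_zero = 10.0
pixels_per_deg = 100.0

image = np.random.rand(300, 400)  # stand-in for the rotated thumbnail
plt.imshow(image, origin='lower', aspect='equal')

# One tick per whole degree across the 400-pixel width
tick_degs = np.arange(10, 15)
tick_locs = (tick_degs - deg_at_pixel_zero) * pixels_per_deg
tick_lbls = [deg_to_dms(d) for d in tick_degs]
plt.xticks(tick_locs, tick_lbls)
# Call plt.show() to view the result
```

With this helper, deg_to_dms(10.44) gives 10 26' 24'', matching the example in the question.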
I haven't dealt with images much, but you may be able to use a different plotting method (e.g. pcolor) that allows you to supply x and y information. That may give you a few more options for specifying the units of your image.
For tutorials, you would do well to look through the matplotlib gallery - find something you like, and read the code that produced it. One of the guys in our office recently bought a book on Python visualization - that may be worthwhile looking at.
The way that I generally think of all the various pieces is as follows:
A Figure is a container for all the Axes
An Axes is the space where what you draw (i.e. your plot) actually shows up
An Axis is the actual x and y axes
Artists? That's too deep in the interface for me: I've never had to worry about those yet, even though I rarely use the pyplot module in production plots.
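A small sketch of how those pieces relate in code, in case it helps: plt.subplots returns the Figure and one Axes, and each Axes carries its own xaxis and yaxis objects, which is where tick positions and labels live.

```python
import matplotlib.pyplot as plt

# One Figure (the container) holding one Axes (the plotting area)
fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 1, 4])

# Each Axes owns an x Axis and a y Axis; ticks are set on those
ax.xaxis.set_ticks([0, 1, 2])
ax.yaxis.set_ticks([0, 2, 4])
# Call plt.show() to view the result
```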
