Basically, I'm plotting a graph based on a list of times (HH:MM:SS, x-axis) and float values (y-axis) stored in a txt file like this:
15 52 27 0.00
15 52 37 0.2
15 52 50 0.00
15 53 12 2.55
15 54 21 10.00
15 55 15 13.55
I want to show the last float value (as an annotation text label) at the last available time. Using the txt above, I want to plot "13.55 mL" at the point [15 55 15, 13.55].
Here's the code to plot my graph:
datefunc = lambda x: mdates.date2num(datetime.strptime(x.decode("utf-8"), '%H %M %S'))
dates, levels = np.genfromtxt('sensor1Text.txt', # Data to be read
delimiter=8, # First column is 8 characters wide
converters={0: datefunc}, # Formatting of column 0
dtype=float, # All values are floats
unpack=True) # Unpack to several variables
# Configure x-ticks
plot_fs1.set_xticks(dates) # Tickmark + label at every plotted point
plot_fs1.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M:%S'))
plot_fs1.set_ylabel('Fluid (mL)')
plot_fs1.grid(True)
# Format the x-axis for dates (label formatting, rotation)
fs1.autofmt_xdate(rotation= 45)
plot_fs1.plot_date(dates, levels, color='orange', ls='-', marker='o')
Here's my attempt to plot the annotation label on my last plotted value:
lastxValue= len(dates)-1
lastyValue= len(levels)-1
lastValue = levels[lastyValue]
lastDate = dates[lastxValue]
plot_fs1.annotate(lastValue, (lastDate,
lastValue),xytext=(15, 15),textcoords='offset points')
fs1.tight_layout()
This is what I get:
The annotation is not completely displayed within the plot window and x-axis values tend to overlap on one another.
Any thoughts?
To avoid having an x-axis entry for every point you plot, you could use a locator to mark, for example, only every minute on your graph.
Secondly, avoid using tight_layout() and instead use subplots_adjust() to add spacing where you need it. For example:
import numpy as np
import matplotlib
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
from datetime import datetime # needed by datefunc below
datefunc = lambda x: mdates.date2num(datetime.strptime(x.decode("utf-8"), '%H %M %S'))
dates, levels = np.genfromtxt('sensor1Text.txt', # Data to be read
delimiter=8, # First column is 8 characters wide
converters={0: datefunc}, # Formatting of column 0
dtype=float, # All values are floats
unpack=True) # Unpack to several variables
plot_fs1 = plt.gca()
fig = plt.gcf()
plot_fs1.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))
plot_fs1.xaxis.set_major_locator(matplotlib.dates.MinuteLocator()) # One tick per minute instead of one per plotted point
plot_fs1.set_ylabel('Fluid (mL)')
plot_fs1.grid(True)
fig.autofmt_xdate(rotation= 45)
plot_fs1.plot_date(dates, levels, color='orange', ls='-', marker='o')
lastValue = levels[-1] # Last level read from the file
lastDate = dates[-1] # Time of the last reading
plot_fs1.annotate("{} mL".format(lastValue), (lastDate, lastValue), xytext=(15, 15), textcoords='offset points')
fig.subplots_adjust(bottom=0.15, right=0.85) # Add space at bottom and right
plt.show()
This would give you a graph looking like this:
Another option is to use an approach similar to this question: use independent, linearly spaced xticks, and use the times only as the labels of those ticks.
That question uses bars, but the idea carries over. Compute the elapsed time in seconds for each point, work out the total time covered, and choose a constant tick step that covers the whole span. The unevenly spaced time differences are then needed only to position your points. The xticks look nicer with regular spacing, and this also makes it easy to add the extra space you need for your text.
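A minimal sketch of that idea, assuming the six rows from the question and a tick step of 30 seconds (the step, the variable names and the headless "Agg" backend are my own choices, not part of the original post):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; remove to view interactively
import matplotlib.pyplot as plt
import numpy as np
from datetime import datetime, timedelta

# Sample rows copied from the question's text file
rows = ["15 52 27 0.00", "15 52 37 0.2", "15 52 50 0.00",
        "15 53 12 2.55", "15 54 21 10.00", "15 55 15 13.55"]
times = [datetime.strptime(r[:8], "%H %M %S") for r in rows]
levels = [float(r[8:]) for r in rows]

# x positions: seconds elapsed since the first reading (unevenly spaced)
elapsed = [(t - times[0]).total_seconds() for t in times]

step = 30  # assumed tick spacing in seconds
# Evenly spaced ticks covering the whole span, plus extra room on the right for the label
tick_pos = np.arange(0, elapsed[-1] + 2 * step, step)
tick_lab = [(times[0] + timedelta(seconds=float(s))).strftime("%H:%M:%S")
            for s in tick_pos]

fig, ax = plt.subplots()
ax.plot(elapsed, levels, color="orange", marker="o")
ax.set_xticks(tick_pos)                       # regular positions
ax.set_xticklabels(tick_lab, rotation=45, ha="right")  # times used only as names
ax.set_xlim(tick_pos[0] - step / 2, tick_pos[-1])
ax.margins(y=0.2)                             # headroom for the annotation
ax.set_ylabel("Fluid (mL)")
ax.annotate("{} mL".format(levels[-1]), (elapsed[-1], levels[-1]),
            xytext=(15, 15), textcoords="offset points")
fig.tight_layout()
```

Because the tick positions and the tick labels are built from the same array, they always have the same length, so the labels cannot drift away from their ticks.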
Related
I want to plot a dataset on one x-axis and two y-axes (eV and nm). The two y-axes are linked together with the equation: nm = 1239.8/eV.
As you can see from my picture output, the values are not in the correct positions. For instance, at eV = 0.5 I need to have nm = 2479.6, at eV = 2.9, nm = 423, etc…
How can I fix this?
My data.txt:
number eV nm
1 2.573 481.9
2 2.925 423.9
3 3.174 390.7
4 3.242 382.4
5 3.387 366.1
The code I am using:
#!/usr/bin/env python3
import matplotlib.pyplot as plt
import pandas as pd
import matplotlib.ticker as tck
# data handling
file = "data.txt"
df = pd.read_csv(file, delimiter=" ") # generate a DataFrame with data
no = df[df.columns[0]]
eV = df[df.columns[1]].round(2) # first y-axis
nm = df[df.columns[2]].round(1) # second y-axis
# generate a subplot 1x1
fig, ax1 = plt.subplots(1,1)
# first Axes object, main plot (lollipop plot)
ax1.stem(no, eV, markerfmt=' ', basefmt=" ", linefmt='blue', label="Gas")
ax1.set_ylim(0.5,4)
ax1.yaxis.set_minor_locator(tck.MultipleLocator(0.5))
ax1.set_xlabel('Aggregation', labelpad=12)
ax1.set_ylabel('Transition energy [eV]', labelpad=12)
# adding second y-axis
ax2 = ax1.twinx()
ax2.set_ylim(2680,350) # set the corresponding ymax and ymin,
# but the values are not correct anyway
ax2.set_yticklabels(nm)
ax2.set_ylabel('Wavelength [nm]', labelpad=12)
# save
plt.tight_layout(pad=1.5)
plt.show()
The resulting plot is the following. I would just like to obtain the second axis from the first one via nm = 1239.8/eV, and I don't know what else to look for!
You can use ax.secondary_yaxis, as described in this example. See the code below for an implementation for your problem. I have only included the part of the code relevant to the second y-axis.
# adding second y-axis
def eV_to_nm(eV):
    return 1239.8 / eV
def nm_to_eV(nm):
    return 1239.8 / nm
ax2 = ax1.secondary_yaxis('right', functions=(eV_to_nm, nm_to_eV))
ax2.set_yticks(nm)
ax2.set_ylabel('Wavelength [nm]', labelpad=12)
Note that I am also using set_yticks instead of set_yticklabels. Furthermore, if you remove set_yticks, matplotlib will automatically determine y tick positions assuming a linear distribution of y ticks. However, because nm is inversely proportional to eV, this will lead to a (most likely) undesirable distribution of y ticks. You can manually change these using a different set of values in set_yticks.
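As a hedged, self-contained illustration of that last point, the sketch below places the secondary ticks at a hand-picked set of round wavelengths. The tick values, the zero-division guard and the headless "Agg" backend are assumptions of mine; the data are the eV values from data.txt.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove to view interactively
import matplotlib.pyplot as plt
import numpy as np

def eV_to_nm(eV):
    # Convert to an array and silence divide-by-zero warnings (assumed guard)
    eV = np.asarray(eV, dtype=float)
    with np.errstate(divide="ignore"):
        return 1239.8 / eV

def nm_to_eV(nm):
    nm = np.asarray(nm, dtype=float)
    with np.errstate(divide="ignore"):
        return 1239.8 / nm

# Values taken from data.txt in the question
no = [1, 2, 3, 4, 5]
eV = [2.573, 2.925, 3.174, 3.242, 3.387]

fig, ax1 = plt.subplots()
ax1.stem(no, eV, linefmt="b-", markerfmt=" ", basefmt=" ")
ax1.set_ylim(0.5, 4)
ax1.set_ylabel("Transition energy [eV]")

# Secondary axis linked to ax1 through the two conversion functions
ax2 = ax1.secondary_yaxis("right", functions=(eV_to_nm, nm_to_eV))
nm_ticks = [350, 400, 500, 700, 1000, 2000]  # hand-picked round values (assumption)
ax2.set_yticks(nm_ticks)
ax2.set_ylabel("Wavelength [nm]")
fig.canvas.draw()
```

The ticks end up unevenly spaced on the page, which is expected: equal steps in nm are not equal steps in eV.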
I figured out how to solve this problem (source of the hint here).
So, for anyone who needs to have one dataset with one x-axis but two y-axes (one mathematically related to the other), a working solution is reported below. Basically, the problem is to have the same ticks as the main y-axis, but to relabel them according to their mathematical relationship (that is, in this case, nm = 1239.8/eV). The following code has been tested and it is working.
This method of course also works if you have two x-axes and one shared y-axis, etc.
Important note: you must define a y-range (or x-range if you want the opposite result), otherwise you might get some scaling problems.
#!/usr/bin/env python3
import matplotlib.pyplot as plt
import pandas as pd
import matplotlib.ticker as tck
from matplotlib.text import Text
# data
file = "data.txt"
df = pd.read_csv(file, delimiter=" ") # generate a DataFrame with data
no = df[df.columns[0]]
eV = df[df.columns[1]].round(2) # first y-axis
nm = df[df.columns[2]].round(1) # second y-axis
# generate a subplot 1x1
fig, ax1 = plt.subplots(1,1)
# first Axes object, main plot (lollipop plot)
ax1.stem(no, eV, markerfmt=' ', basefmt=" ", linefmt='blue', label="Gas")
ax1.set_ylim(0.5,4)
ax1.yaxis.set_minor_locator(tck.MultipleLocator(0.5))
ax1.set_xlabel('Aggregation', labelpad=12)
ax1.set_ylabel('Transition energy [eV]', labelpad=12)
# function that correlates the two y-axes
def eV_to_nm(eV):
    return 1239.8 / eV
# adding a second y-axis
ax2 = ax1.twinx() # share x axis
ax2.set_ylim(ax1.get_ylim()) # set the same range over y
ax2.set_yticks(ax1.get_yticks()) # put the same ticks as ax1
ax2.set_ylabel('Wavelength [nm]', labelpad=12)
# change the labels of the second axis by applying the mathematical
# function that relates the two axes to each tick of the first
# axis, and then convert the result to text
# This way you get the same tick positions as y1, but with converted labels
ax2.set_yticklabels([Text(0, yval, f'{eV_to_nm(yval):.1f}')
                     for yval in ax1.get_yticks()])
# show the plot
plt.tight_layout(pad=1.5)
plt.show()
data.txt is the same as above:
number eV nm
1 2.573 481.9
2 2.925 423.9
3 3.174 390.7
4 3.242 382.4
5 3.387 366.1
While plotting time-series data, I'm trying to plot the number of data points per hour:
fig, ax = plt.subplots()
ax.hist(x = df.index.hour,
bins = 24, # draw one bar per hour
align = 'mid', # this is where i need help
rwidth = 0.6, # adding a bit of space between each bar
)
I want one bar per hour, each hour labeled, so we set:
ax.set_xticks(ticks = np.arange(0, 24))
ax.set_xticklabels(labels = [str(x) for x in np.arange(0, 24)])
The x-axis ticks are shown and labelled correctly, yet the bars themselves are not correctly aligned above the ticks. The bars are drawn towards the center: on the left side they sit to the right of their ticks, and on the right side they sit to the left of their ticks.
The align = 'mid' option lets us shift the bars to 'left' / 'right', yet neither of those helps with the problem at hand.
Is there a way of setting the bars exactly above the corresponding ticks in a histogram?
To not skip details, here are a few parameters set for better visibility against the black background at imgur:
fig.patch.set_facecolor('xkcd:mint green')
ax.set_xlabel('hour of the day')
ax.set_ylim(0, 800)
ax.grid()
plt.show()
When you put bins=24, you don't get one bin per hour. Supposing your hours are integers from 0 up to 23, bins=24 will create 24 bins, dividing the range from 0.0 to 23.0 into 24 equal parts. So, the regions will be 0-0.958, 0.958-1.917, 1.917-2.875, ... 22.042-23. Weirder things will happen in case the values don't contain 0 or 23, as the ranges will be created between the lowest and highest value encountered.
As your data is discrete, it is highly recommended to set the bin edges explicitly, for example -0.5 - 0.5, 0.5 - 1.5, ... .
import matplotlib.pyplot as plt
import numpy as np
fig, ax = plt.subplots()
ax.hist(x=np.random.randint(0, 24, 500),
bins=np.arange(-0.5, 24), # one bin per hour
rwidth=0.6, # adding a bit of space between each bar
)
ax.set_xticks(ticks=np.arange(0, 24)) # the default tick labels will be these same numbers
ax.margins(x=0.02) # less padding left and right
plt.show()
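To make the bin arithmetic described above concrete, here is a small NumPy-only sketch comparing the edges produced by bins=24 with the explicit half-integer edges. The variable names are mine, and the data are simply one sample per hour.

```python
import numpy as np

hours = np.arange(24)  # one sample for each hour 0..23 (illustrative data)

# bins=24 splits the range [min, max] = [0, 23] into 24 equal parts
auto_edges = np.histogram_bin_edges(hours, bins=24)

# Explicit edges centred on the integers: -0.5, 0.5, ..., 23.5
explicit_edges = np.arange(-0.5, 24)

print(np.round(auto_edges[:4], 3))   # first few automatic edges
print(explicit_edges[:4])            # first few explicit edges

# With explicit edges every hour falls in the middle of its own bin
counts, _ = np.histogram(hours, bins=explicit_edges)
print(counts)
```

Both variants have 25 edges and therefore 24 bins; the difference is that the automatic bins are 23/24 wide, so they drift relative to the integer tick positions.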
I'm having trouble with the xticks on my plot.
I have hh:mm:ss data in my x vector, but the xtick labels are just eating up space on my x-axis.
I'm trying to use only major xticks, which would show the x labels at 5-minute intervals,
but the labels are not showing correctly.
Right now this is the code that I wrote:
# -*- coding: utf-8 -*-
from os import listdir
from os.path import isfile, join
import pandas as pd
from Common import common as comm
from matplotlib.font_manager import FontProperties
import matplotlib.pyplot as plt
fp = FontProperties(fname="../templates/fonts/msgothic.ttc")
config = comm.configRead()
commonConf = comm.getCommonConfig(config)
peopleBhvConf = comm.getPeopleBhvConf(config)
files = [f for f in listdir(commonConf['resultFilePath']) if isfile(join(commonConf['resultFilePath'], f))]
waitTimeGraphInput = [s for s in files if peopleBhvConf['resultFileName'] in s]
waitTimeGraphFile = commonConf['inputFilePath'] + waitTimeGraphInput[0]
waitTimeGraph = pd.read_csv(waitTimeGraphFile)
# Create data
N = len(waitTimeGraph.index)
x = waitTimeGraph['ホール入時間']
y = waitTimeGraph['滞留時間(出-入sec)']
xTicks = pd.date_range(min(x), max(x), freq="5min")
fig, ax = plt.subplots()
ax.scatter(x, y)
ax.set_xticklabels(xTicks, rotation='vertical')
plt.axhline(y=100, xmin=min(x), xmax=max(x), linewidth=2, color = 'red')
plt.setp(ax.get_xticklabels(), visible=True, rotation=30, ha='right')
plt.savefig(commonConf['resultFilePath'] + '1人1人の待ち時間分布.png')
plt.show()
and this is the result:
As you can see, the labels are still printed only at the start of my plot.
I expected them to be printed only at my major xtick positions.
The problem
If I understand correctly what is going on, the xTicks array is shorter than x, am I right? If so, this is the issue.
I don't see where in your code you set the tick positions, but I guess you are showing all of them, one for each element of x. Since you set the tick labels manually with ax.set_xticklabels(xTicks, rotation='vertical'), matplotlib has no way of knowing which ticks those labels belong to. It therefore fills the first available ticks, and any remaining ticks are left without labels.
If you were able to read the labels, you would see that the written dates do not correspond to the labelled positions on the axis.
How to fix it
The general rule: when you set tick labels manually, make sure that the array containing the labels has the same length as the array of ticks. Add empty strings for the ticks where you do not want a label.
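As a rough sketch of that rule (with made-up integer data rather than your dates, and a headless "Agg" backend as my own assumption), fix the tick positions first and then pass a label list of exactly the same length:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove to view interactively
import matplotlib.pyplot as plt

x = list(range(12))        # hypothetical data
y = [v * v for v in x]

fig, ax = plt.subplots()
ax.scatter(x, y)

# 1) Fix the tick positions explicitly
ax.set_xticks(x)

# 2) Build one label per tick; empty strings hide the labels we do not want
labels = [str(v) if v % 5 == 0 else "" for v in x]
ax.set_xticklabels(labels)
```

Here only the ticks at 0, 5 and 10 carry text, but every tick still has its own entry in the label list, so nothing gets shifted.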
However, since you spoke of major and minor ticks, I'll show you how to set them in your case, where you have dates on the x axis.
Drop xTicks; it is not needed. Don't set the tick labels manually, i.e. don't use ax.set_xticklabels().
Your code should be:
fig, ax = plt.subplots()
ax.scatter(x, y)
ax.axhline(y=100, linewidth=2, color = 'red') # xmin/xmax are axes fractions (0 to 1), so the defaults already span the full width
ax.xaxis.set_major_locator(MinuteLocator(interval=5))
ax.xaxis.set_minor_locator(MinuteLocator(interval=1))
ax.xaxis.set_major_formatter(DateFormatter('%H:%M:%S'))
plt.setp(ax.get_xticklabels(), visible=True, rotation=30, ha='right')
plt.savefig(commonConf['resultFilePath'] + '1人1人の待ち時間分布.png')
Remember to import the locator and formatter:
from matplotlib.dates import MinuteLocator, DateFormatter
A brief explanation: MinuteLocator finds each minute interval on your x axis and places a tick there. The interval parameter lets you place a tick every N minutes. So in the above code a major tick is placed every 5 minutes and a minor tick every minute.
DateFormatter simply formats the date according to the given string (here I chose the format hour, minute, second). Note that no formatter has been set for minor ticks, so by default matplotlib uses the null formatter (no labels for minor ticks).
Here is the documentation on the dates module of matplotlib.
To give you an idea of the result, here is an image I created using the code above with random data (just look at the x axis).
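For reference, a self-contained version along those lines could look like the sketch below. The random data, the start time and the headless "Agg" backend are assumptions of mine, since the original file is not available.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove to view interactively
import matplotlib.pyplot as plt
import numpy as np
from datetime import datetime, timedelta
from matplotlib.dates import MinuteLocator, DateFormatter

# Random stand-in data: 60 entry times spread over 30 minutes
rng = np.random.default_rng(0)
start = datetime(2024, 1, 1, 9, 0, 0)  # arbitrary start time
offsets = np.sort(rng.integers(0, 30 * 60, 60))
x = [start + timedelta(seconds=int(s)) for s in offsets]
y = rng.integers(20, 200, 60)          # stand-in waiting times in seconds

fig, ax = plt.subplots()
ax.scatter(x, y)
ax.axhline(y=100, linewidth=2, color="red")

ax.xaxis.set_major_locator(MinuteLocator(interval=5))    # labelled tick every 5 minutes
ax.xaxis.set_minor_locator(MinuteLocator(interval=1))    # unlabelled tick every minute
ax.xaxis.set_major_formatter(DateFormatter("%H:%M:%S"))
plt.setp(ax.get_xticklabels(), rotation=30, ha="right")
fig.tight_layout()

major = ax.xaxis.get_majorticklocs()  # positions in Matplotlib date units (days)
```

The major tick positions come out 5 minutes apart regardless of how many points are plotted, which is the whole point of using a locator instead of manual labels.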
Groups Counts
1 0-9 38
3 10-19 41
5 20-29 77
7 30-39 73
9 40-49 34
I want to create a bar graph using the matplotlib.pyplot library with Groups on the x-axis and Counts on the y-axis. I tried it out using the following code
fig, ax = plt.subplots()
rects1 = ax.bar(survived_df["Groups"], survived_df["Counts"], color='r')
plt.show()
but I'm getting the following error
invalid literal for float(): 0-9
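In the matplotlib version that raises this error, the first array given to the plt.bar function must be numbers corresponding to the x coordinates of the left sides of the bars, so ['0-9', '10-19', ...] is not recognized as a valid argument. (Since matplotlib 2.0 bars are centered on their x coordinates by default, and since 2.1 strings are accepted as categories.)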
You can however make the bar plot using the index of your DataFrame, then defining the position of your x-ticks (where you want your label to be positioned on the x axis) and then changing the labels of your x ticks with your Groups name.
fig,ax = plt.subplots()
ax.bar(survived_df.index, survived_df.Counts, width=0.8, color='r')
ax.set_xticks(survived_df.index+0.4) # x ticks at the middle of each bar, since the bars are left-aligned and 0.8 wide (with centered bars, use survived_df.index instead)
ax.set_xticklabels(survived_df.Groups) #replace the name of the x ticks with your Groups name
plt.show()
Note that you can also use the Pandas plotting capabilities directly with a one liner:
survived_df.plot('Groups', 'Counts', kind='bar', color='r')
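On matplotlib 2.1 or newer, a shorter route should also work, because string values are treated as categories. The sketch below uses plain lists standing in for survived_df["Groups"] and survived_df["Counts"] (the list names and the headless "Agg" backend are my own assumptions):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove to view interactively
import matplotlib.pyplot as plt

# Stand-ins for survived_df["Groups"] and survived_df["Counts"]
groups = ["0-9", "10-19", "20-29", "30-39", "40-49"]
counts = [38, 41, 77, 73, 34]

fig, ax = plt.subplots()
# Strings are mapped to the positions 0, 1, 2, ... and used as tick labels
bars = ax.bar(groups, counts, color="r")
ax.set_xlabel("Groups")
ax.set_ylabel("Counts")
```

In this variant there is no need to call set_xticks or set_xticklabels, since the categorical axis places and labels the ticks itself.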
I want to plot a series of values against a date range in matplotlib. I changed the tick base parameter to 7, to get one tick at the beginning of every week (plticker.IndexLocator, base = 7). The problem is that the set_xticklabels function does not accept a base parameter. As a result, the second tick (representing day 8 on the beginning of week 2) is labelled with day 2 from my date range list, and not with day 8 as it should be (see picture).
How can I give set_xticklabels a base parameter?
Here is the code:
my_data = pd.read_csv("%r_filename_%s_%s_%d_%d.csv" % (num1, num2, num3, num4, num5), dayfirst=True)
my_data.plot(ax=ax1, color='r', lw=2.)
loc = plticker.IndexLocator(base=7, offset = 0) # this locator puts ticks at regular intervals
ax1.set_xticklabels(my_data.Date, rotation=45, rotation_mode='anchor', ha='right') # this defines the tick labels
ax1.xaxis.set_major_locator(loc)
Here is the plot:
Many thanks, your solution works perfectly. In case other people run into the same issue in the future: I have implemented the suggested solution, but also added some code so that the tick labels keep the desired rotation and align (with their left end) to the respective tick. It may not be pythonic or best practice, but it works:
x_fmt = mpl.ticker.IndexFormatter(x)
ax.set_xticklabels(my_data.Date, rotation=-45)
ax.tick_params(axis='x', pad=10)
ax.xaxis.set_major_formatter(x_fmt)
labels = my_data.Date
for tick in ax.xaxis.get_majorticklabels():
    tick.set_horizontalalignment("left")
The reason your ticklabels went bad is that setting manual ticklabels decouples the labels from your data. The proper approach is to use a Formatter according to your needs. Since you have a list of ticklabels for each data point, you can use an IndexFormatter. It seems to be undocumented online, but it has a help:
class IndexFormatter(Formatter)
| format the position x to the nearest i-th label where i=int(x+0.5)
| ...
| __init__(self, labels)
| ...
So you just have to pass your list of dates to IndexFormatter. With a minimal, pandas-independent example (with numpy only for generating dummy data):
import numpy as np
import matplotlib.pyplot as plt
import matplotlib as mpl
# create dummy data
x = ['str{}'.format(k) for k in range(20)]
y = np.random.rand(len(x))
# create an IndexFormatter with labels x
x_fmt = mpl.ticker.IndexFormatter(x)
fig,ax = plt.subplots()
ax.plot(y)
# set our IndexFormatter to be responsible for major ticks
ax.xaxis.set_major_formatter(x_fmt)
This should keep your data and labels paired even when tick positions change:
I noticed you also set the rotation of the ticklabels in the call to set_xticklabels; you would lose this now. I suggest using fig.autofmt_xdate instead, which seems to be designed exactly for this purpose and does not interfere with your ticklabel data.
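One caveat for readers on newer versions: IndexFormatter was deprecated in Matplotlib 3.3 and is not available in recent releases. A FuncFormatter can play the same role. The sketch below is my own substitute rather than the original answer's code; the helper name index_label and the headless "Agg" backend are assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove to view interactively
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.ticker import FuncFormatter

# Dummy data, as in the example above
labels = ["str{}".format(k) for k in range(20)]
y = np.random.rand(len(labels))

def index_label(pos, _tick_number):
    """Return the label nearest to position pos, or '' outside the data."""
    i = int(round(pos))
    return labels[i] if 0 <= i < len(labels) else ""

fig, ax = plt.subplots()
ax.plot(y)
# The formatter looks up the label from the tick position on every draw
ax.xaxis.set_major_formatter(FuncFormatter(index_label))
ax.tick_params(axis="x", labelrotation=45)  # rotation without touching the label text
```

As with IndexFormatter, the labels are looked up from the tick positions, so they stay paired with the data when a locator such as IndexLocator(base=7, offset=0) moves the ticks.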