Plot spectroscopic data from pandas dataframe in 3D with different array length

Is it possible to get something like this plot
from a pandas DataFrame, in a similar fashion to how I would make 2D plots (df.plot())?
More precisely:
I have data that I read from CSV files into pandas DataFrames with the following structure:
1st level header A B C D E F
2nd level header 2.0 1.0 0.2 0.4 0.6 0.8
Index
126.4348 -467048 -814795 301388 298430 -187654 -1903170
126.4310 -468329 -810060 304366 305343 -192035 -1881625
126.4272 -469209 -804697 305795 312472 -197013 -1854848
126.4234 -469685 -799604 305647 318936 -200957 -1827665
126.4195 -469795 -795708 304101 323922 -202192 -1805153
126.4157 -469610 -793795 301497 326780 -199323 -1791743
126.4119 -469213 -794362 298257 327092 -191547 -1790418
126.4081 -468687 -797499 294817 324717 -178875 -1802122
126.4043 -468097 -802853 291546 319800 -162225 -1825540
126.4005 -467486 -809663 288700 312745 -143334 -1857270
126.3967 -466863 -816878 286401 304170 -124505 -1892389
126.3929 -466210 -823335 284645 294827 -108228 -1925312
126.3890 -465485 -827966 283331 285520 -96733 -1950795
126.3852 -464637 -829997 282315 277018 -91559 -1964894
126.3814 -463617 -829104 281457 269965 -93242 -1965702
126.3776 -462399 -825487 280670 264824 -101170 -1953728
126.3738 -460982 -819857 279942 261819 -113660 -1931820
126.3700 -459408 -813317 279344 260927 -128242 -1904669
126.3662 -457757 -807177 279009 261885 -142112 -1877955
126.3624 -456143 -802715 279090 264233 -152667 -1857303
126.3585 -454700 -800940 279722 267380 -158023 -1847241
126.3547 -453566 -802397 280969 270692 -157406 -1850358
126.3509 -452862 -807050 282792 273579 -151350 -1866803
126.3471 -452672 -814262 285033 275591 -141627 -1894249
126.3433 -453030 -822898 287426 276486 -130942 -1928303
126.3395 -453910 -831501 289627 276273 -122426 -1963297
126.3357 -455223 -838544 291266 275222 -119021 -1993312
126.3319 -456834 -842695 292004 273824 -122882 -2013246
126.3280 -458571 -843048 291599 272725 -134907 -2019718
126.3242 -460252 -839292 289952 272620 -154497 -2011656
... ... ... ... ... ... ...
What I would like to do with that
I would like to plot each of these columns (they are NMR spectra) against the index.
In a 2D overlay, this is simple usage of the pandas wrapper around matplotlib.
However, I would like to plot each spectrum on its own line, along a third axis that has the second-level headers as ticks.
I tried to use matplotlib's 3D plotting functionality, but it seems to be suitable only if you actually have three arrays of equal length,
which does not make sense for my data, because each spectrum is recorded for a single value from the second-level header.
Am I overcomplicating this by trying to make a 3D plot?
Is the figure I have in mind perhaps not an actual 3D plot but rather some special version of overlaid 2D plots?
How I would prefer to do it
Bonus points for:
Using only python
Using only pandas and matplotlib
Already implemented functionality
If there is no obvious Python way to do it, I would also be happy with libraries from other languages that can do the same, such as
R or Octave. I am just not as familiar with these, so I would probably not be able to adapt more hacky solutions in these languages to suit my requirements.
This question might be very similar, but as I understand it, it does not necessarily extend to software other than python and doesn't have an example of what the result should look like, so I am not sure if answers to that question might actually be helpful for this specific purpose.
What is wrong with matplotlib's gallery examples
As lanery pointed out, polygon3D from the matplotlib gallery gets close to what I want.
However, it has some drawbacks, some of which are unacceptable for most scientific publications:
With negative values, the whole plot gets shifted to what I would
call "the middle of the screen", which looks kind of ugly, makes
it hard to extract information from the figure and makes it different
from the provided examples
You get that interactive plot window, which requires you to find an
angle from which you can see everything you need to see. That
might be good for some data exploration tasks, but if you use
scripts for your visualization and a minor change to the graphic
would force you to do some manual work again, this decreases the
advantage you expect from scripting
If you have values that differ strongly and are not linear, something
like [0,1,1.7,2.5,6.2], for your third dimension i.e. the second
level header in this case, the 2d plots have very different distances
from another, which is unacceptable, at least for any
non-programming audience reading the publications
It is quite long and technical for a quite common plotting operation
in spectroscopy. The amount of code would be fine if I wanted to
build software that can make 3D plots in some context. For science it
would be preferable to be able to accomplish something like this
with a low amount of code.

Here is an example that plots each column against the continuous x values and hard-codes the z position of each curve from your second-level header.
from mpl_toolkits.mplot3d import Axes3D  # registers the 3D projection on older matplotlib
import matplotlib.pyplot as plt
import pandas as pd
# %matplotlib inline  (IPython magic; only needed inside a Jupyter notebook)
# A raw string keeps the backslashes in the Windows path from being read as escapes.
df = pd.read_csv(r"C:\Users\User\SkyDrive\Documents\import_data.tcsv.txt", header=None)
fig = plt.figure()
# fig.gca(projection='3d') is no longer accepted by recent matplotlib versions.
ax = fig.add_subplot(111, projection='3d')
# Column 0 holds the x values; columns 1 to 6 hold the spectra A to F.
x = df[0]
ax.plot(x, df[1], zs=2, zdir='z', label='A')
ax.plot(x, df[2], zs=1, zdir='z', label='B')
ax.plot(x, df[3], zs=0.2, zdir='z', label='C')
ax.plot(x, df[4], zs=0.4, zdir='z', label='D')
ax.plot(x, df[5], zs=0.6, zdir='z', label='E')
ax.plot(x, df[6], zs=0.8, zdir='z', label='F')
# Customize the view angle so that all curves are visible.
ax.view_init(elev=-150., azim=40)
plt.show()
You're going to have to play with the options of view_init to rotate the plot and get the axes where you want them. I'm not entirely clear on what your end goal is, but this is the resulting plot.
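The hard-coded calls above generalize to a loop. The following is a minimal sketch, not the answerer's code: it assumes a frame with two-level column headers as in the question, builds a synthetic stand-in for the spectra (the sine curves are made up), and places the curves at evenly spaced positions that are only labelled with the second-level values, so unevenly spaced values do not bunch up. A fixed view_init and a non-interactive backend keep the figure reproducible from a script.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend: no plot window is opened
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # noqa: F401, needed on older matplotlib
import numpy as np
import pandas as pd

# Synthetic stand-in for the NMR data (assumption: same layout as in the question).
shift = np.linspace(126.43, 126.32, 200)
headers = [("A", 2.0), ("B", 1.0), ("C", 0.2), ("D", 0.4), ("E", 0.6), ("F", 0.8)]
values = np.column_stack([np.sin(40 * shift + i) * 1e5 for i in range(len(headers))])
df = pd.DataFrame(values, index=shift, columns=pd.MultiIndex.from_tuples(headers))

# Sort the spectra by their second-level value and give each one an integer slot.
order = sorted(df.columns, key=lambda col: col[1])

fig = plt.figure()
ax = fig.add_subplot(111, projection="3d")
for pos, col in enumerate(order):
    # zdir="y" stacks the 2D curves along the y axis; intensity goes on the z axis.
    ax.plot(df.index, df[col], zs=pos, zdir="y", label=col[0])

# Evenly spaced ticks, labelled with the actual second-level header values.
ax.set_yticks(range(len(order)))
ax.set_yticklabels([str(col[1]) for col in order])
ax.view_init(elev=30, azim=-60)  # fixed camera, so reruns give the same figure
fig.canvas.draw()  # replace with fig.savefig("spectra.png") to write a file
```

Whether evenly spaced slots are acceptable depends on the audience; passing zs=col[1] instead of zs=pos would restore the true spacing.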

Related

How can I calculate the time lag between two similar time series?

I'm trying to compute/visualize the time lag between 2 time series (I want to know the time lag between the humidity progression of outside and inside a room).
Each data point of my series was taken hourly. Plotting the 2 series together, I can clearly see a shift between them (sorry for hiding the axis).
Here are a part of my time series data. I will pack them in 2 arrays:
inside_humidity =
[11.77961297, 11.59755268, 12.28761522, 11.88797553, 11.78122077, 11.5694668,
11.70421932, 11.78122077, 11.74272005, 11.78122077, 11.69438733, 11.54126933,
11.28460592, 11.05624965, 10.9611012, 11.07527934, 11.25417308, 11.56040908,
11.6657186, 11.51171572, 11.49246536, 11.78594142, 11.22968373, 11.26840678,
11.26840678, 11.29447992, 11.25553344, 11.19711371, 11.17764047, 11.11922075,
11.04132778, 10.86996123, 10.67410607, 10.63493504, 10.74922916, 10.74922916,
10.6294765, 10.61011497, 10.59075345, 10.80373021, 11.07479154, 11.15223764,
11.19711371, 11.17764047, 11.15816723, 11.22250051, 11.22250051, 11.202915,
11.18332948, 11.16374396, 11.14415845, 11.12457293, 11.10498742, 11.14926578,
11.16896413, 11.16896413, 11.14926578, 10.8307902, 10.51742195, 10.28187137,
10.12608544, 9.98977276, 9.62267727, 9.31289289, 8.96438546, 8.77077022,
8.69332413, 8.51907042, 8.30609366, 8.38353975, 8.4513867, 8.47085994,
8.50980642, 8.52927966, 8.50980642, 8.55887037, 8.51969934, 8.48052831,
8.30425867, 8.2177078, 7.98402891, 7.92560918, 7.89950166, 7.83489682,
7.75789537, 7.5984808, 7.28426807, 7.39778913, 7.71943214, 8.01149931,
8.18276652, 8.23009255, 8.16215295, 7.93822471, 8.00350215, 7.93843482,
7.85072729, 7.49778011, 7.31782649, 7.29862668, 7.60162032, 8.29665484,
8.58797834, 8.50011383, 8.86757784, 8.76600556, 8.60491125, 8.4222628,
8.24923231, 8.14470714, 8.17351638, 8.52530093, 8.72220151, 9.26745883,
9.1580007, 8.61762692, 8.22187405, 8.43693644, 8.32414835, 8.32463974,
8.46833012, 8.55865487, 8.72647164, 9.04112806, 9.35578449, 9.59465974,
10.47339785, 11.07218093, 10.54091351, 10.56138918, 10.46099958, 10.38129168,
10.16434831, 10.10612612, 10.009246, 10.53502351, 10.8307902, 11.13420052,
11.64337309, 11.18958511, 10.49630791, 10.60856932, 10.37029108, 9.86281478,
9.64699826, 9.95341012, 10.24329812, 10.6848196, 11.47604231, 11.30505352,
10.72194974, 10.30058448, 10.05022037, 10.06318411, 9.90118897, 9.68530059,
9.47790657, 9.48585784, 9.61639418, 9.86244265, 10.29009361, 10.28297229,
10.32073088, 10.65389513, 11.09656351, 11.20188562, 11.24124169, 10.40503955,
9.74632512, 9.07606098, 8.85145589, 9.37080152, 9.65082743, 10.0707891,
10.68776091, 11.25879751, 11.0416348, 10.89558456, 10.7908258, 10.66539685,
10.7297755, 10.77571398, 10.9268264, 11.16021492, 11.60961709, 11.43827534,
11.96155427, 12.16116437, 12.80412266, 12.52540805, 11.96752965, 11.58099292]
outside_humidity =
[10.17449206, 10.4823292, 11.06818167, 10.82768699, 11.27582592, 11.4196233,
10.99393027, 11.4122507, 11.18192837, 10.87247831, 10.68664321, 10.37949651,
9.57155882, 10.86611665, 11.62547196, 11.32004266, 11.75537602, 11.51292063,
11.03107569, 10.7297755, 10.4345622, 10.61271497, 9.49271162, 10.15594248,
9.99053828, 9.80915398, 9.6452438, 10.06900573, 11.18075689, 11.8289847,
11.83334752, 11.27480708, 11.14370467, 10.88149985, 10.73930381, 10.7236597,
10.26210496, 11.01260226, 11.05428228, 11.58321342, 12.70523808, 12.5181118,
11.90023799, 11.67756426, 11.28859471, 10.86878222, 9.73984486, 10.18253902,
9.80915398, 10.50980784, 11.38673459, 11.22751685, 10.94171823, 10.56484228,
10.38220753, 10.05388847, 9.96147203, 9.90698862, 9.7732203, 9.85262125,
8.7412938, 8.88281702, 8.07919545, 8.02883587, 8.32341424, 8.07357711,
7.27302616, 6.73660684, 6.66722819, 7.29408637, 7.00046542, 6.46322019,
6.07150988, 6.00207234, 5.8818402, 6.82443881, 7.20212882, 7.52167696,
7.88857771, 8.351627, 8.36547023, 8.24802846, 8.18520693, 7.92420816,
7.64926024, 7.87944972, 7.82118727, 8.02091833, 7.93071882, 7.75789457,
7.5416447, 6.94430133, 6.65907535, 6.67454591, 7.25493614, 7.76939457,
7.55357806, 6.61479472, 7.17641357, 7.24664082, 8.62732387, 8.66913548,
8.70925667, 9.0477017, 8.24558224, 8.4330502, 8.44366397, 8.17995798,
8.1875752, 9.33296518, 9.66567041, 9.88581085, 8.95449382, 8.3587624,
9.20584448, 8.90605388, 8.87494884, 9.12694892, 8.35055177, 7.91879933,
7.78867253, 8.22800878, 9.03685287, 12.49630018, 11.11819755, 10.98869374,
10.65897176, 10.36444573, 10.052609, 10.87627021, 10.07379564, 10.02233847,
9.62022856, 11.21575473, 10.85483543, 11.67324627, 11.89234248, 11.10068132,
10.06942096, 8.50405894, 8.13168561, 8.83616476, 8.35675085, 8.33616802,
8.35675085, 9.02209801, 9.5530404, 9.44738836, 10.89645958, 11.44771721,
11.79943601, 10.7765335, 11.1453622, 10.74874776, 10.55195175, 10.34494483,
9.83813522, 11.26931785, 11.20641798, 10.51555027, 10.90808954, 11.80923545,
11.68300879, 11.60313809, 7.95163365, 7.77213815, 7.54209557, 7.30603673,
7.17842173, 8.25899805, 8.56494995, 10.44245578, 11.08542758, 11.74129079,
11.67979686, 12.94362214, 11.96285343, 11.8289847, 11.01388413, 10.6793698,
11.20662595, 11.97684701, 12.46383177, 11.34178655, 12.12477078, 12.48698059,
12.89325064, 12.07470295, 12.6777319, 10.91689448, 10.7676326, 10.66710434]
I know cross-correlation is the right term to use, but after a while I still don't understand how to use scipy.signal.correlate and numpy.correlate, because all I get is an array full of NaNs. So clearly I need some more knowledge in this area.
What I hope to achieve is a plot like those in the answers to the thread "How to make a correlation plot with a certain lag of two time series", where I can see at how many hours the time lag is most likely.
Thank you a lot in advance!

With the given data, you can use numpy and matplotlib to plot one series against the other and fit a line through the points.
You can do something like this:
import numpy as np
from matplotlib import pyplot as plt
x = np.array(inside_humidity)
y = np.array(outside_humidity)
fig = plt.figure()
# fit a curve of your choice
a, b = np.polyfit(inside_humidity, outside_humidity, 1)
y_fit = a * x + b
# scatter plot, and fitted plot (best fit used)
plt.scatter(inside_humidity, outside_humidity)
plt.plot(x, y_fit)
plt.show()
which gives this:
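The scatter plot shows how the two series relate, but it does not by itself give a lag in hours. A minimal sketch of a cross-correlation estimate with numpy.correlate follows; the function name estimate_lag and the synthetic demo series are illustrative assumptions, not part of the original answer. Note that NaNs in either input propagate into the result, so missing values need to be dropped or interpolated first.

```python
import numpy as np

def estimate_lag(reference, delayed):
    """Return (best_lag, lags, corr); a positive best_lag means `delayed` trails `reference`."""
    reference = np.asarray(reference, dtype=float)
    delayed = np.asarray(delayed, dtype=float)
    # Standardize both series so the mean level does not dominate the correlation.
    reference = (reference - reference.mean()) / reference.std()
    delayed = (delayed - delayed.mean()) / delayed.std()
    # corr[j] is the sum over n of delayed[n + lags[j]] * reference[n].
    corr = np.correlate(delayed, reference, mode="full")
    lags = np.arange(-(len(reference) - 1), len(delayed))
    return int(lags[np.argmax(corr)]), lags, corr

# Synthetic check: the "inside" series is the "outside" series delayed by 5 samples.
rng = np.random.default_rng(0)
outside_demo = rng.normal(size=300)
inside_demo = np.roll(outside_demo, 5)
best_lag, lags, corr = estimate_lag(outside_demo, inside_demo)
print(best_lag)
```

With the real data, the call would be estimate_lag(outside_humidity, inside_humidity); since the samples are hourly, the returned lag is in hours, and plt.plot(lags, corr) gives the kind of lag plot the question asks about.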

plotnine geom_boxplot ignores required aesthetic and requires unnecessary aesthetic

I have data that looks like:
Scenario ymin lower middle upper ymax
One 16362.586379 20911.338893 27121.693254 35219.449009 46406.087619
Two 19779.003240 25390.096116 33108.174561 43545.202225 58464.277060
Rather than use all 50 k data points for every Scenario (there are many more than One and Two), I've computed the positions I need for the box and whiskers.
I try to plot this via
import pandas
import plotnine as p9
df = pandas.read_excel('boxplot_data.xlsx', sheet_name='Sheet1')
gg = p9.ggplot()
gg += p9.geoms.geom_boxplot(mapping=p9.aes(x='Scenario', ymin='ymin', lower='lower', middle='middle', upper='upper', ymax='ymax'), data=df, color='k', show_legend=False, inherit_aes=False)
gg += p9.themes.theme_seaborn()
gg += p9.labels.xlab('Scenario')
gg.save(filename='scenario_boxplot.png', dpi=300)
The documentation at https://plotnine.readthedocs.io/en/stable/generated/plotnine.geoms.geom_boxplot.html#plotnine.geoms.geom_boxplot indicates that the geom_boxplot line of code supplies the required aesthetic parameters to define the box and whiskers.
Running this, however, gives
plotnine.exceptions.PlotnineError: 'stat_boxplot requires the
following missing aesthetics: y'
Why is stat_boxplot being called, with its required aesthetics, not geom_boxplot?
And more importantly, does anybody know how to correct this?

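By default, geom_boxplot uses stat_boxplot, which computes the box and whisker positions from a y aesthetic. Since you supply precomputed positions, use stat_identity instead: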
geom_boxplot(stat='identity', ...)

pymc3 multivariate traceplot color coding

I am new to working with pymc3 and I am having trouble generating an easy-to-read traceplot.
I'm fitting a mixture of 4 multivariate gaussians to some (x, y) points in a dataset. The model runs fine. My question is with regard to manipulating the pm.traceplot() command to make the output more user-friendly.
Here's my code:
import matplotlib.pyplot as plt
import numpy as np
# imports assumed from the pm/tt aliases used below; not shown in the original post
import pymc3 as pm
import theano.tensor as tt
# `points` is the array of observed (x, y) points
model = pm.Model()
N_CLUSTERS = 4
with model:
    # cluster prior
    w = pm.Dirichlet('w', np.ones(N_CLUSTERS))
    # latent cluster of each observation
    category = pm.Categorical('category', p=w, shape=len(points))
    # make sure each cluster has some values:
    w_min_potential = pm.Potential('w_min_potential', tt.switch(tt.min(w) < 0.1, -np.inf, 0))
    # multivariate normal means
    mu = pm.MvNormal('mu', [0,0], cov=[[1,0],[0,1]], shape = (N_CLUSTERS,2) )
    # break symmetry
    pm.Potential('order_mu_potential', tt.switch(
        tt.all(
            [mu[i, 0] < mu[i+1, 0] for i in range(N_CLUSTERS - 1)]), -np.inf, 0))
    # multivariate centers
    data = pm.MvNormal('data', mu =mu[category], cov=[[1,0],[0,1]], observed=points)
with model:
    trace = pm.sample(1000)
A call to pm.traceplot(trace, ['w', 'mu']) produces this image:
As you can see, it is ambiguous which mean peak corresponds to an x or y value, and which ones are paired together. I have managed a workaround as follows:
from cycler import cycler
# plot the x-means and y-means of our data!
fig, (ax0, ax1) = plt.subplots(nrows=2)
plt.xlabel(r'$\mu$')
plt.ylabel('frequency')
for i in range(4):
    ax0.hist(trace['mu'][:,i,0], bins=100, label='x{}'.format(i), alpha=0.6)
    ax1.hist(trace['mu'][:,i,1], bins=100, label='y{}'.format(i), alpha=0.6)
ax0.set_prop_cycle(cycler('color', ['c', 'm', 'y', 'k']))
ax1.set_prop_cycle(cycler('color', ['c', 'm', 'y', 'k']))
ax0.legend()
ax1.legend()
This produces the following, much more legible plot:
I have looked through the pymc3 documentation and recent questions here, but to no avail. My question is this: is it possible to do what I have done here with matplotlib via builtin methods in pymc3, and if so, how?

Better differentiation between multidimensional variables and the different chains was recently added to ArviZ (the library PyMC3 relies on for plotting).
In the latest version of ArviZ, you should be able to do:
az.plot_trace(trace, compact=True, legend=True)
to get the different dimensions of each variable distinguished by color and the different chains distinguished by linestyle. The default setting is using matplotlib's default color cycle and 4 different linestyles, solid, dashed, dotted and dash-dotted. Both properties can be set to custom aesthetics and custom values by using compact_prop to customize dimension representation and chain_prop to customize chain representation. In addition, if using compact, it may also be a good idea to use combined=True to reduce the clutter in the first column. As an example:
az.plot_trace(trace, compact=True, combined=True, legend=True, chain_prop=("ls", "-"))
would plot the KDEs in the first column using the data from all chains, and would plot all chains using a solid linestyle (due to combined arg, only relevant for the second column). Two legends will be shown, one for the chain info and another for the compact info.

At least in recent versions, you can use compact=True, as in:
pm.traceplot(trace, var_names = ['parameters'], compact=True)
to get one graph with all your parameters combined.
Docs in: https://arviz-devs.github.io/arviz/_modules/arviz/plots/traceplot.html
However, I haven't been able to get the colors to differ between lines

Using python and matplotlib, fill between two lines not giving expected output

I am trying to plot a linear line with associated error.
I calculated values for the slope (a) and the intercept (b), along with the error associated with each value. I then drew the line given by the usual formula below.
y=ax+b
However, in addition to the line, I also want to draw the associated error. I came up with the idea to draw the lines associated with these formulas and color the space between the lines gray.
y=(a+a_sd)x+(b+b_sd)
y=(a-a_sd)x+(b-b_sd)
Using the following piece of code, I am able to color part of the surface between the lines, but not the whole span (see included output).
I think this may be due to the fact that "distance" is not sorted, and fill_between is using distance[0] and distance[-1] as begin and end for the span, respectively.
As always, any help would be highly appreciated!
import matplotlib.pyplot as plt
distance=[0.35645334340084989, 0.55406894241607718, 0.10201413273193734, 0.13401365724625941, 0.71918808865838735, 0.14151335417722818]
time=[2.4004984846346171, 2.4909766335028447, 1.9852064018125195, 1.9083156734132103, 2.6380396934372863, 1.9114505780323543]
time_SD=[0.062393810960652669, 0.056945715242838917, 0.073960838867327183, 0.084111239062664475, 0.026912957190265499, 0.08595664694840538]
distance_SD=[0.035160608598240162, 0.032976715460514235, 0.02782911002465227, 0.035465701695038584, 0.043009444687382707, 0.038387585107200854]
a=1.17887019041
b=1.83339229489
a_sd=0.159771527859
b_sd=0.0762509747218
plt.errorbar(distance,time,yerr=time_SD, xerr=distance_SD, linestyle="None")
abline_values = [(a)*i + (b) for i in distance]
abline_values_plus = [(a+a_sd)*i + (b+b_sd) for i in distance]
abline_values_minus = [(a-a_sd)*i + (b-b_sd) for i in distance]
plt.plot(distance, abline_values,"r")
plt.fill_between(distance,abline_values_minus,abline_values_plus,facecolor='lightgrey', interpolate=True, edgecolors="None")
leg = plt.legend(loc="lower right", frameon=False, handlelength=0, handletextpad=0)
for item in leg.legendHandles:
    item.set_visible(False)
plt.show()

For pyplot.fill_between() to shade the whole band, the list of horizontal coordinates should be sorted. Passing unsorted x values is possible, but the polygon is then drawn in the order the points are given, which can lead to undesired results.
A list can be sorted with sorted(list).
import matplotlib.pyplot as plt
distance=[0.35645334340084989, 0.55406894241607718, 0.10201413273193734, 0.13401365724625941, 0.71918808865838735, 0.14151335417722818]
time=[2.4004984846346171, 2.4909766335028447, 1.9852064018125195, 1.9083156734132103, 2.6380396934372863, 1.9114505780323543]
time_SD=[0.062393810960652669, 0.056945715242838917, 0.073960838867327183, 0.084111239062664475, 0.026912957190265499, 0.08595664694840538]
distance_SD=[0.035160608598240162, 0.032976715460514235, 0.02782911002465227, 0.035465701695038584, 0.043009444687382707, 0.038387585107200854]
a=1.17887019041
b=1.83339229489
a_sd=0.159771527859
b_sd=0.0762509747218
distance_sorted = sorted(distance)
plt.errorbar(distance,time,yerr=time_SD, xerr=distance_SD, linestyle="None")
abline_values = [(a)*i + (b) for i in distance_sorted]
abline_values_plus = [(a+a_sd)*i + (b+b_sd) for i in distance_sorted]
abline_values_minus = [(a-a_sd)*i + (b-b_sd) for i in distance_sorted]
plt.plot(distance_sorted, abline_values,"r")
plt.fill_between(distance_sorted,abline_values_minus,abline_values_plus, facecolor='lightgrey', edgecolors="None")
plt.show()
The documentation does not mention the requirement of x values being sorted. The reason is probably that fill_between actually works even with unsorted lists, just not the way one might expect. Maybe the following animation gives a more intuitive understanding on the issue:

You are right, fill_between seems to expect the values to be sorted. The documentation is not clear about this behaviour, though. The following example shows the same effect:
import matplotlib.pyplot as plt
from numpy import random, array
#x = random.randn(20) #does not work
x = array(sorted(random.randn(20))) #works
a = 2
d = .5
y_h = x*(a+d)
y_l = x*(a-d)
plt.fill_between(x,y_h, y_l)
plt.show()
As a workaround just sort your values before calculating your errorlines using sorted.
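When the upper and lower curves are already computed, or do not come from a simple formula, sorting x alone would break the pairing between x and y. A minimal sketch of one way around this, assuming NumPy arrays and using rounded values from the question for illustration: numpy.argsort gives a single ordering that is applied to every array.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend: no plot window is opened
import matplotlib.pyplot as plt
import numpy as np

# Rounded, illustrative versions of the question's values (x is deliberately unsorted).
x = np.array([0.356, 0.554, 0.102, 0.134, 0.719, 0.142])
a, b = 1.179, 1.833
a_sd, b_sd = 0.160, 0.076

# Band edges computed in the original, unsorted order.
upper = (a + a_sd) * x + (b + b_sd)
lower = (a - a_sd) * x + (b - b_sd)

# One index array reorders x and both band edges consistently.
order = np.argsort(x)
x_sorted, upper, lower = x[order], upper[order], lower[order]

fig, ax = plt.subplots()
ax.plot(x_sorted, a * x_sorted + b, "r")
ax.fill_between(x_sorted, lower, upper, facecolor="lightgrey", edgecolor="none")
fig.canvas.draw()  # replace with plt.show() or fig.savefig(...) as needed
```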

matplotlib; fractional powers of ten; scientific notation

I deal with simulation data and have been using matplotlib a lot lately and have been encountering something (a bug?) that's annoying.
I have been allowing matplotlib to automatically set the tick labels and their type (scientific, etc) and with some data I get weird scientific ticker labels.
In searching for a resolution to this, I found that you can call set_powerlimits((n,m)) to set the limits of data that will be displayed using scientific notation. But I have encountered this problem (if I remember correctly) with data spanning several orders of magnitude. My data is also all over the place, so I need a programmatic solution of some sort, not a hard-coded one.
see: http://matplotlib.org/api/ticker_api.html
Below I have included example data, code, and a screenshot.
#! /usr/bin/env python
from matplotlib import pyplot as plt
data = [
[1.83186088e-08,0.03275],
[1.07139009e-07,0.03275],
[2.06376627e-07,0.03275],
[3.03918517e-07,0.03275],
[4.06032883e-07,0.03275],
[5.01194017e-07,0.03275],
[6.02195723e-07,0.03275],
[7.03536925e-07,0.03275],
[8.04625154e-07,0.03275],
[9.06401951e-07,0.03275],
[1.00041895e-06,0.03275],
[1.10230745e-06,0.03275],
[1.2042525e-06,0.03275],
[1.30647822e-06,0.03275],
[1.40109887e-06,0.03275],
[1.50380097e-06,0.03275],
[1.60683242e-06,0.03275],
[1.70208505e-06,0.03275],
[1.80545692e-06,0.03275],
[1.90090648e-06,0.03275],
[2.00453092e-06,0.03275],
[2.10018627e-06,0.03275],
[2.20401747e-06,0.03275],
[2.30009359e-06,0.03275],
[2.4043033e-06,0.03275],
[2.50066449e-06,0.03275],
[2.60513728e-06,0.03275],
[2.70165405e-06,0.03275],
[2.80635938e-06,0.03275],
[2.90331342e-06,0.03275],
[3.00021199e-06,0.03275],
[3.10546819e-06,0.03275],
[3.20257899e-06,0.03275],
[3.30032923e-06,0.0327499999],
[3.40612833e-06,0.0327499999],
[3.50401732e-06,0.0327499997],
[3.60153069e-06,0.0327499996],
[3.70700708e-06,0.0327499993],
[3.80456907e-06,0.0327499988],
[3.90259984e-06,0.0327499982],
[4.00084149e-06,0.0327499973],
[4.10700266e-06,0.0327499959],
[4.2047462e-06,0.0327499942],
[4.30209468e-06,0.0327499918],
[4.40018204e-06,0.0327499886],
[4.50712875e-06,0.032749984],
[4.60630591e-06,0.0327499785],
[4.70519881e-06,0.0327499715],
[4.80398305e-06,0.0327499628],
[4.90251297e-06,0.0327499521],
[5.00182752e-06,0.032749939],
[5.10157551e-06,0.0327499232],
[5.20157575e-06,0.0327499043],
[5.30145192e-06,0.0327498822],
[5.40127044e-06,0.0327498565],
[5.500537e-06,0.0327498272],
[5.60773155e-06,0.0327497911],
[5.70660709e-06,0.0327497534],
[5.80610521e-06,0.0327497112],
[5.90651786e-06,0.0327496642],
[6.00749437e-06,0.0327496124],
[6.10822094e-06,0.0327495566],
[6.20042255e-06,0.0327495018],
[6.30049028e-06,0.0327494386],
[6.40035803e-06,0.0327493715],
[6.50035477e-06,0.0327493004],
[6.60056805e-06,0.0327492251],
[6.70029936e-06,0.0327491461],
[6.80054193e-06,0.0327490625],
[6.90130872e-06,0.0327489743],
[7.00202598e-06,0.0327488818],
[7.10217348e-06,0.0327487855],
[7.20243015e-06,0.0327486847],
[7.30199609e-06,0.0327485801],
[7.40193254e-06,0.0327484707],
[7.50188319e-06,0.0327483567],
[7.60306205e-06,0.0327482367],
[7.70357184e-06,0.0327481129],
[7.80343389e-06,0.0327479853],
[7.90330165e-06,0.0327478532],
[8.00348513e-06,0.0327477162],
[8.10167039e-06,0.0327475777],
[8.206328e-06,0.0327474253],
[8.3020567e-06,0.0327472819],
[8.40527826e-06,0.0327471228],
[8.50095898e-06,0.0327469714],
[8.60536828e-06,0.0327468019],
[8.70106059e-06,0.0327466426],
[8.80396558e-06,0.032746467],
[8.90727378e-06,0.0327462865],
[9.00225164e-06,0.0327461166],
[9.10359892e-06,0.0327459311],
[9.20470894e-06,0.0327457418],
[9.30582982e-06,0.0327455481],
[9.40750123e-06,0.0327453488],
[9.50134495e-06,0.0327451608],
[9.60358199e-06,0.0327449513],
[9.70705637e-06,0.0327447344],
[9.80377546e-06,0.0327445269],
[9.90091941e-06,0.032744314],
]
times=[]
vals=[]
for elem in data:
    times.append(elem[0])
    vals.append(elem[1])
plt.plot(times,vals)
plt.show()
[screenshot of the resulting plot]

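You might try using the Engineering Formatter:
import matplotlib.ticker
times=[]
vals=[]
for elem in data:
    times.append(elem[0])
    vals.append(elem[1])
plt.plot(times,vals)
formatter = matplotlib.ticker.EngFormatter(unit='S', places=3)
# use a plain 'u' instead of the micro sign for the 1e-6 prefix
formatter.ENG_PREFIXES[-6] = 'u'
# set the formatter on the current axes before showing the figure
plt.gca().yaxis.set_major_formatter(formatter)
plt.show()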
Which will look like this:

This is a known problem. You'd be better off analysing the data for its limits manually, as you have done in the screenshot, and calling ax.set_ylim(min, max) yourself after plotting. You can also turn off the offset with:
import matplotlib.ticker as mticker
# plot some stuff
# ...
y_formatter = mticker.ScalarFormatter(useOffset=False)
ax.yaxis.set_major_formatter(y_formatter)
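A self-contained version of the snippet above can be sketched as follows. The data here is synthetic (a small variation on top of a 0.03275 baseline, chosen to resemble the question's values), and the final lines only inspect the formatter to confirm that no offset text is produced.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend: no plot window is opened
import matplotlib.pyplot as plt
import matplotlib.ticker as mticker
import numpy as np

# Synthetic data: a tiny change on top of a comparatively large baseline.
times = np.linspace(1e-8, 1e-5, 100)
vals = 0.03275 - 6e-6 * (times / times.max()) ** 3

fig, ax = plt.subplots()
ax.plot(times, vals)

# Disable the additive offset so the tick labels show the full values.
y_formatter = mticker.ScalarFormatter(useOffset=False)
ax.yaxis.set_major_formatter(y_formatter)

fig.canvas.draw()  # forces tick computation; use plt.show() interactively
offset_text = y_formatter.get_offset()  # empty string when no offset is applied
```

The tick labels become longer with the offset disabled, so a wider left margin or fewer ticks may be needed.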

I think that your best option is to use a logarithmic axis, but if you need to create the graphic with a linear axis, you must set the power limits yourself. You can compute the power limits using math.log10:
import math
from matplotlib import ticker
# Compute the span of the data
pow_min = math.floor(math.log10(min(vals)))
pow_max = math.ceil(math.log10(max(vals)))
# Create a scalar formatter without offset, in order to have
# the right exponent over the yaxis
fmt = ticker.ScalarFormatter(useOffset=False)
fmt.set_powerlimits((pow_min, pow_max))
fig = plt.figure()
ax1 = fig.add_subplot(1, 1, 1)
ax1.plot(times, vals)
ax1.yaxis.set_major_formatter(fmt) # Set the formatter
