Display X,Y values on data points (Python, pandas, matplotlib)

I have a large data set (with a growing number of x-axis values); the sample shown at the bottom is enough to reproduce my case. I'm trying to plot it using pandas and matplotlib. [I'm very new to Python, so apologies in advance for any mistakes.]
import os
import sys
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv('test.csv', delimiter=',', index_col='Header')
df.plot(marker='s', figsize=(18,9))
plt.title("test")
plt.ylabel('y axis')
plt.xlabel('x axis')
plt.show()
sys.exit()
When I use the code above, it plots the graph, but I have a few issues.
The y-axis values look as if they have been scaled (which I did not ask for).
A few x-axis data points are missing from the axis. I understand it may not be possible to display all x values on the axis, which is fine. I was wondering whether it would be possible to display them on the actual data point when I move the mouse over it.
The latter is the crucial feature I'm after. My script is intended to track some data, and when there is a visible bump in the plot, I need to know which x values actually caused the bump. Suggestions for achieving a similar effect are much appreciated.
Header,header1,header2,header3,header4,header5,header6
x1,2115211,2223666,13332666,8448144,655564,366361
x2,2115213,2223666,13332666,8448144,655564,366361
x3,2115213,2223666,13332666,8448144,655564,366361
x4,2115213,2223666,13332666,8448144,655564,366361
x5,2115262,2229973,13330187,8448756,655523,366379
x6,2115262,2229973,13330187,8448756,655523,366379
x7,2115262,2229973,13330187,8448756,655523,366379
x8,2115277,2228613,13335478,8448221,655556,366362
x9,2115277,2228613,13335478,8448221,655556,366362
x10,2115211,2223666,13332666,8448144,655564,366361
x11,2115213,2223666,13332666,8448144,655564,366361
x12,2115213,2223666,13332666,8448144,655564,366361
x13,2115213,2223666,13332666,8448144,655564,366361
x14,2115213,2223666,13332666,8448144,655564,366361
x15,2115262,2229973,13330187,8448756,655523,366379
x16,2115262,2229973,13330187,8448756,655523,366379
x17,2115262,2229973,13330187,8448756,655523,366379
x18,2115277,2228613,13335478,8448221,655556,366362
x19,2115277,2228613,13335478,8448221,655556,366362
Any help is much appreciated.
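One possible way to get the hover behaviour asked for above is matplotlib's event handling. The following is only a minimal sketch, not a definitive solution: it uses a shortened copy of the sample data, and the names `lines`, `note`, `describe` and `on_move` are made up for illustration. It plots each column against integer positions, labels the ticks with the index, and shows an annotation when the mouse is over a marker.

```python
import io

import matplotlib.pyplot as plt
import pandas as pd

# Shortened copy of the sample data (assumption: same layout as test.csv).
csv_text = """Header,header1,header2
x1,2115211,2223666
x2,2115213,2223666
x3,2115262,2229973
"""
df = pd.read_csv(io.StringIO(csv_text), index_col="Header")

fig, ax = plt.subplots()
positions = list(range(len(df)))

# One line per column, plotted against integer positions 0..n-1.
lines = {col: ax.plot(positions, df[col], marker="s", label=col)[0]
         for col in df.columns}
ax.set_xticks(positions)
ax.set_xticklabels(df.index)
ax.legend()

# A single annotation that is moved and shown while hovering over a point.
note = ax.annotate("", xy=(0, 0), xytext=(10, 10),
                   textcoords="offset points",
                   bbox=dict(boxstyle="round", fc="w"))
note.set_visible(False)


def describe(col, i):
    """Text shown for the i-th point of column `col`."""
    return "{}: {} = {}".format(df.index[i], col, df[col].iloc[i])


def on_move(event):
    """Show the annotation when the mouse is over a data point."""
    for col, line in lines.items():
        hit, info = line.contains(event)
        if hit:
            i = info["ind"][0]
            note.xy = (i, df[col].iloc[i])
            note.set_text(describe(col, i))
            note.set_visible(True)
            fig.canvas.draw_idle()
            return
    if note.get_visible():
        note.set_visible(False)
        fig.canvas.draw_idle()


fig.canvas.mpl_connect("motion_notify_event", on_move)
```

Calling `plt.show()` afterwards opens the window; hovering only works with an interactive backend.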

Related

Matplotlib - Y axis change my values automatically

I'm building a Python application that tracks BTC values over time in a graph that updates in real time; the x axis shows time and the y axis shows the corresponding BTC value. My problem is that at the beginning the BTC values on the y axis are correct, as in the first figure, but after some data has been received the graph decides to "zoom" and expresses all the data in a different notation, as in the second figure (open the imgur link).
https://imgur.com/a/spogs9G
I tried these two lines of code but without success:
plt.autoscale(enable=False, axis='y')
ax.get_yaxis().get_major_formatter().set_scientific(False)
If it helps, I am using:
import matplotlib.pyplot as plt
import matplotlib.animation as animation
If you would like to see all or part of the code, please ask.
Thank you in advance.
Fix the y-axis by using (as an example):
ax.set_ylim(0,1000)
and define the lower and upper bounds according to your problem:
ax.set_ylim(lower_bound,upper_bound)
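If the "different notation" is the offset or scientific notation that matplotlib's default tick formatter applies to large, closely spaced values, fixed limits can be combined with `ticklabel_format`. This is only a sketch under that assumption; the sample values are taken from the first question's data and the limits are arbitrary.

```python
import matplotlib.pyplot as plt

# Large values that differ only in the last digits (illustrative sample).
values = [2115211, 2115213, 2115262, 2115277]

fig, ax = plt.subplots()
ax.plot(values, marker="s")

# Fix the visible range explicitly (bounds chosen for this sample only).
ax.set_ylim(2115200, 2115300)

# Ask the default ScalarFormatter for plain numbers: no offset, no exponent.
ax.ticklabel_format(axis="y", style="plain", useOffset=False)

fig.canvas.draw()  # forces tick computation so the formatter state is current
offset_text = ax.yaxis.get_major_formatter().get_offset()
print(ax.get_ylim(), repr(offset_text))
```

An empty offset string means the tick labels show the full values rather than differences from a common offset.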

Python using custom color in plot

I'm having a problem that (I think) should have a fairly simple solution. I'm still a relative novice in Python, so apologies if I'm doing something obviously wrong. I'm just trying to create a simple plot with multiple lines, where each line is colored by its own specific, user-defined color. When I run the following code as a test for one of the colors it ends up giving me a blank plot. What am I missing here? Thank you very much!
import numpy as np
import matplotlib.pyplot as plt
from colour import Color
dbz53 = Color('#DD3044')
*a bunch of arrays of data, two of which are called x and mpt1*
fig, ax = plt.subplots()
ax.plot(x, mpt1, color='dbz53', label='53 dBz')
ax.set_yscale('log')
ax.set_xlabel('Diameter (mm)')
ax.set_ylabel('$N(D) (m^-4)$')
ax.set_title('N(D) vs. D')
#ax.legend(loc='upper right')
plt.show()
The statement
ax.plot(x, mpt1, color='dbz53', label='53 dBz')
is wrong: 'dbz53' in quotes is a string literal, not your variable, and matplotlib does not recognise it as a colour name.
You can simply put
color='#DD3044'
and it will work.
Or you can try
color=dbz53.get_hex()
without quotes if you want to use the colour module you imported.
You can pass hex colours directly to the plot command. A simpler way to beautify your plot is to use matplotlib styles. For instance, before any plot call, just write
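For reference, here is a small self-contained sketch of the hex-string variant. The arrays `x` and `mpt1` are placeholders invented for this example, since the original data is not shown.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import to_hex

# Placeholder data standing in for the asker's arrays (assumption).
x = np.linspace(0.1, 5.0, 50)
mpt1 = 1e4 * np.exp(-2.0 * x)

dbz53 = "#DD3044"  # the colour as a plain hex string

fig, ax = plt.subplots()
# Pass the variable itself, not the quoted name 'dbz53'.
(line,) = ax.plot(x, mpt1, color=dbz53, label="53 dBz")
ax.set_yscale("log")
ax.set_xlabel("Diameter (mm)")
ax.set_title("N(D) vs. D")
ax.legend(loc="upper right")

print(to_hex(line.get_color()))
```

Keeping the colour in a variable makes it easy to reuse the same value for several lines or for the legend.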
plt.style.use('ggplot')

Python 2.7 time series non numeric values

I am using Python 2.7 and need to draw a time series using matplotlib library. My y axis data is numeric and everything is ok with it.
The problem is my x-axis data, which is not numeric, and matplotlib does not cooperate in this case. It does not draw the time series, even though this should not affect the correctness of the plot: the x-axis data is arranged in a given order anyway, and its order does not affect anything logically.
For example let's say the x data is ["i","like","python"] and the y axis data is [1,2,3].
I did not add my code because I've found that the code is ok, it works if I change the data to all numeric data.
Please explain how I can use matplotlib to draw the time series without having to convert the x values to numbers.
I've based my matplotlib code on following answers: How to plot Time Series using matplotlib Python, Time Series Plot Python.
Matplotlib requires some way of positioning those labels. See the following example:
import matplotlib.pyplot as plt
x = ["i","like","python"]
y = [1,2,3]
plt.plot(y, y)    # y, y because both are numeric (you could also create xt = [1, 2, 3])
plt.xticks(y, x)  # same here; the second argument is the list of labels
plt.show()
This results in a plot whose x-axis ticks carry the three string labels.
Notice how I've put the labels there but had to somehow say where they are supposed to be.
I also think you should put a part of your code so that it's easier for other people to suggest upon.
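A variant of the same idea, sketched here with a separate list of integer positions so that the x positions do not depend on the y data. The name `positions` is made up for this example.

```python
import matplotlib.pyplot as plt

x = ["i", "like", "python"]
y = [1, 2, 3]

# Integer positions 0, 1, 2 tell matplotlib where each label belongs.
positions = list(range(len(x)))

fig, ax = plt.subplots()
ax.plot(positions, y, marker="o")
ax.set_xticks(positions)   # where the ticks go
ax.set_xticklabels(x)      # what each tick says

tick_texts = [t.get_text() for t in ax.get_xticklabels()]
print(tick_texts)
```

The order of `x` decides the order on the axis, which matches the requirement that the data is already arranged in a given order.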

matplotlib data accessible outside of xlim range

Consider the following code
import matplotlib.pyplot as plt
import numpy as np
time=np.arange(-100,100,1)
val =np.sin(time/10.)
time=-1.0*time
plt.figure()
plt.plot(time,val)
plt.xlim([70,-70])
plt.savefig('test.pdf')
When I open the PDF in Inkscape, I can select (with F2) the entire data set; it is just invisible outside the specified xlim interval.
The problem seems to be the line
time=-1.0*time
If I omit this line, everything works perfectly. I have no idea why this is. I often need such transformations because I deal with paleo-climate data, which are sometimes given in years B.C. and sometimes in years A.D.
The problem I see with this behavior is that someone could in principle get the data outside the range which I want to show.
Does anyone have a clue how to solve this problem (other than slicing the arrays before plotting)?
I use matplotlib 1.1.1rc2
You can mask your array when plotting according to the limits you choose. Yes, this also requires changes to the code, but maybe not as extensive as you might fear. Here's an updated version of your example:
import matplotlib.pyplot as plt
import numpy as np
time=np.arange(-100,100,1)
val =np.sin(time/10.)
time=-1.0*time
plt.figure()
# store the x-limits in variables for easy reuse
XMIN = -70.0
XMAX = 70.0
plt.plot(np.ma.masked_outside(time,XMIN,XMAX),val)
plt.xlim([XMIN,XMAX])
plt.savefig('test.pdf')
The key change is using np.ma.masked_outside for your x-axis value (note: the order of XMIN and XMAX in the mask-command is not important).
That way, you don't have to change the array time if you wanted to use other parts of it later on.
When I checked with inkscape, no data outside of the plot was highlighted.
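To see what the mask does to the array itself, here is a small NumPy-only sketch based on the example above. The names `masked_time`, `n_visible` and `n_hidden` are introduced only for illustration.

```python
import numpy as np

time = -1.0 * np.arange(-100, 100, 1)   # 100.0 down to -99.0
val = np.sin(time / 10.0)

XMIN, XMAX = -70.0, 70.0

# Values strictly outside [XMIN, XMAX] are masked; the endpoints are kept.
masked_time = np.ma.masked_outside(time, XMIN, XMAX)

n_visible = masked_time.count()             # unmasked entries
n_hidden = np.ma.count_masked(masked_time)  # masked entries
print(n_visible, n_hidden)

# The original arrays are untouched and can still be used in full.
print(time.shape, val.shape)
```

Because `masked_outside` returns a new masked array, `time` keeps all of its values for later use.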

weird range value in the colorbar, matplotlib

I am new to Python and matplotlib. This might be a simple question, but I searched the internet for hours and couldn't find a solution.
I am plotting precipitation data that is in NetCDF format. What I find weird is that the data doesn't have any negative values in it (I checked many times, just to make sure), but the colorbar starts with a negative value (like -0.0000312). It doesn't make sense, because I don't manipulate the data in any way other than selecting a part of it from the big file and plotting it.
There isn't much to my code. Here it is:
from mpl_toolkits.basemap import Basemap
import numpy as np
import matplotlib.pyplot as plt
from netCDF4 import Dataset
# cd progs  (IPython shell command, not Python: change to the data directory first)
f=Dataset('V21_GPCP.1979-2009.nc')
lats=f.variables['lat'][:]
lons=f.variables['lon'][:]
prec=f.variables['PREC'][:]
la=lats[31:52]
lo=lons[18:83]
pre=prec[0,31:52,18:83]
m = Basemap(width=06.e6,height=05.e6,projection='gnom',lat_0=15.,lon_0=80.)
x, y = m(*np.meshgrid(lo,la))
m.drawcoastlines()
m.drawmapboundary(fill_color='lightblue')
m.drawparallels(np.arange(-90.,120.,5.),labels=[1,0,0,0])
m.drawmeridians(np.arange(0.,420.,5.),labels=[0,0,0,1])
cs=m.contourf(x,y,pre,50,cmap=plt.cm.jet)
plt.colorbar()
The output of that code was a beautiful plot, with the colorbar starting from the value -0.00001893 and the rest being positive values, which I believe are correct. It's just the minimum value that's bugging me.
I would like to know:
Is there anything wrong in my code? I know that the data is right.
Is there a way to manually change the value to 0?
Is it right for the values in the colorbar to change every time I run the code? For the same data, the next time I run the code the values read "-0.00001893, 2.00000000, 4.00000000, 6.00000000 etc".
I want to customize them to "0.0, 2.0, 4.0, 6.0 etc".
Thanks,
Vaishu
Yes, you can manually format everything about the colorbar. See this:
import matplotlib.colors as mc
import matplotlib.pyplot as plt
plt.imshow(X, norm=mc.Normalize(vmin=0))
plt.colorbar(ticks=[0,2,4,6], format='%0.2f')
Many plotting functions including imshow, contourf, and others include a norm argument that takes a Normalize object. You can set the vmin or vmax attributes of that object to adjust the corresponding values of the colorbar.
colorbar takes the ticks and format arguments to adjust which ticks to display and how to display them.
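A self-contained sketch of the same calls, using a made-up non-negative array in place of the precipitation data. The array `data` and the choice of `'%0.1f'` are assumptions for this example.

```python
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.colors as mc

# Made-up non-negative field standing in for the precipitation data.
data = np.linspace(0.0, 7.0, 48).reshape(6, 8)

fig, ax = plt.subplots()

# vmin=0 pins the lower end of the colour scale; vmax is taken from the data.
im = ax.imshow(data, norm=mc.Normalize(vmin=0), cmap="jet")

# Explicit ticks and a format string control what the colorbar displays.
cbar = fig.colorbar(im, ax=ax, ticks=[0, 2, 4, 6], format="%0.1f")

print(im.norm.vmin, im.norm.vmax)
```

With `format='%0.1f'` the labels read 0.0, 2.0, 4.0, 6.0, which is the style asked for in the question.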
