Multiple labels in Matplotlib - python

I've created a plot and I want the first item that I plot to have a label that is partly a string and partly element 0 of array "t". I tried setting the variable the_initial_state equal to a string:
the_initial_state = str('the initial state')
And the plotting as follows:
plt.figure(5)
fig = plt.figure(figsize=(6,6), dpi=1000)
plt.rc("font", size=10)
plt.title("Time-Dependent Probability Density Function")
plt.xlabel("x")
plt.xlim(-10,10)
plt.ylim(0,0.8)
plt.plot(x,U,'k--')
**plt.plot(x,Pd[0],'r',label= the_initial_state, label =t[0])**
plt.plot(x,Pd[1],'m',label=t[1])
plt.plot(x,Pd[50],'g',label=t[50])
plt.plot(x,Pd[100],'c',label=t[100])
plt.legend(title = "time", bbox_to_anchor=(1.05, 0.9), loc=2, borderaxespad=0.)
But I receive an "invalid syntax" error for the line that is indicated by ** **.
Is there any way to have a label that contains a string and an element of an array to a fixed number of decimal places?

'the initial state' is already a string, so you do not need to wrap it in str().
The syntax error comes from passing label twice in the same call: Python reports a repeated keyword argument as a SyntaxError, so the label has to be built as a single value.
Concatenating a string and a float in Python can be done, for example, with the format method.
the_initial_state = 'the initial state {}'.format(t[0])
plt.plot(x,Pd[0],'r',label= the_initial_state)
should work.
The format syntax is explained in detail on external reference pages. For example, to format the float 2.345672 to 2 decimal places, use
"{:.2f}".format(2.345672)
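As a small sketch of how the two ideas combine, the labels for all curves could be built up front with a fixed number of decimal places. The values in t below are made up for illustration; only the format calls come from the answer above.

```python
# Hypothetical time values standing in for the array t from the question.
t = [0.0, 0.0123456, 0.617283, 1.234567]

# A string combined with element 0 of t, shown to 2 decimal places.
the_initial_state = "the initial state {:.2f}".format(t[0])

# The same formatting for the other curves; an f-string is equivalent to format().
other_labels = [f"{value:.2f}" for value in t[1:]]

print(the_initial_state)
print(other_labels)
```

Each label is then passed exactly once per call, for example plt.plot(x, Pd[0], 'r', label=the_initial_state).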

Related

Create a scatter plot from an ndarray using the position in the row as the x-axis value and the value in the array for the y-axis

Much like the title says, I am trying to create a graph that shows 1-6 on the x-axis (the value's position in the row of the array) and its value on the y-axis. A snippet from the array is shown below, with each column representing a coefficient number from 1-6.
[0.99105 0.96213 0.96864 0.96833 0.96698 0.97381]
[0.99957 0.99709 0.9957 0.9927 0.98492 0.98864]
[0.9967 0.98796 0.9887 0.98613 0.98592 0.99125]
[0.9982 0.99347 0.98943 0.96873 0.91424 0.83831]
[0.9985 0.99585 0.99209 0.98399 0.97253 0.97942]
It's already set up as a numpy array. I think it's relatively straightforward, just drawing a complete mental blank.
Any ideas?
Do you want something like this?
import numpy as np
import matplotlib.pyplot as plt
a = np.array([[0.99105, 0.96213, 0.96864, 0.96833, 0.96698, 0.97381],
              [0.99957, 0.99709, 0.9957, 0.9927, 0.98492, 0.98864],
              [0.9967, 0.98796, 0.9887, 0.98613, 0.98592, 0.99125],
              [0.9982, 0.99347, 0.98943, 0.96873, 0.91424, 0.83831],
              [0.9985, 0.99585, 0.99209, 0.98399, 0.97253, 0.97942]])
# x repeats the column positions 1..6 once per row, matching the row-major order of a
plt.scatter(x=np.tile(np.arange(a.shape[1]), a.shape[0])+1, y=a)
Note that you can get the same picture, with one color per row of a, using:
plt.plot(a.T, marker='o', ls='')
x = np.arange(a.shape[1])
plt.xticks(x, x+1)
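The pairing of positions and values can also be written out explicitly, which may make the idea easier to follow. This is a sketch on the first two rows of the array only; the axis labels and the Agg backend are assumptions added so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no window needed
import matplotlib.pyplot as plt
import numpy as np

a = np.array([[0.99105, 0.96213, 0.96864, 0.96833, 0.96698, 0.97381],
              [0.99957, 0.99709, 0.9957, 0.9927, 0.98492, 0.98864]])

n_rows, n_cols = a.shape
# Column positions 1..n_cols, repeated once per row, in the same order as a.ravel().
x = np.tile(np.arange(1, n_cols + 1), n_rows)
y = a.ravel()

fig, ax = plt.subplots()
ax.scatter(x, y)
ax.set_xticks(np.arange(1, n_cols + 1))
ax.set_xlabel("coefficient number")  # hypothetical axis labels
ax.set_ylabel("value")
```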

Matplotlib - Several lines on the same plot

I am converting some old Python 2.7 code to 3.6.
My routine plots the first line OK but subsequent lines seem to start where the previous line left off. (Running on-line at www.pythonanywhere.com)
My code:
import matplotlib
from matplotlib import pyplot
k = 0
while k < len(Stations):
    # Draw the graph
    fig.patch.set_facecolor('black') # Outside border
    pyplot.rcParams['axes.facecolor'] = 'black' # Graph background
    pyplot.rcParams['axes.edgecolor'] = 'red'
    pyplot.tick_params(axis='x', colors='yellow')
    pyplot.tick_params(axis='y', colors='yellow')
    pyplot.ylim(float(BtmLimit),float(TopLimit))
    pyplot.ylabel("Percent of normal range.", size=10, color = "yellow")
    pyplot.xticks([]) # Hide X axis
    pyplot.title("Plotted at %sGMT, %s %s %s" % (thour, tday, tdate, tmonth), color = "yellow")
    if Error == 'False': pyplot.plot(Epoch, Scaled, color = (Color), linewidth=1.9)
    pyplot.plot(Epoch, Top, color = [0,0.5,0]) # Green lines
    pyplot.plot(Epoch, Btm, color = [0,0.5,0])
    k = k + 1
pyplot.savefig(SD+'RiverLevels.png', facecolor='black', bbox_inches='tight')
pyplot.show()
pyplot.close()
The data looks like this:
Epoch
['1638046800', '1638047700', '1638048600', '1638049500', '1638050400', '1638051300', '1638052200', '1638053100', '1638054000', '1638054900', '1638
055800', '1638056700', '1638057600', '1638058500', '1638059400', '1638060300', '1638061200', '1638062100', '1638063000', '1638063900', '1638064800
', '1638065700', '1638066600', '1638067500', '1638068400', '1638069300', '1638070200', '1638071100', '1638072000', '1638072900', '1638073800', '16
38074700', '1638075600', '1638076500', '1638077400', '1638078300', '1638079200', '1638080100', '1638081000', '1638081900', '1638082800', '16380837
00', '1638084600', '1638085500', '1638086400', '1638087300', '1638088200', '1638089100', '1638090000', '1638090900', '1638091800', '1638092700', '
1638093600', '1638094500', '1638095400']
Scaled
['32.475247524752476', '33.069306930693074', '33.76237623762376', '33.56435643564357', '33.56435643564357', '33.86138613861387', '34.1584158415841
6', '34.35643564356436', '34.554455445544555', '34.554455445544555', '34.75247524752476', '34.95049504950495', '35.049504950495056', '35.148514851
48515', '35.049504950495056', '35.14851485148515', '35.44554455445545', '35.54455445544555', '35.54455445544555', '35.34653465346535', '35.5445544
5544555', '35.64356435643565', '35.84158415841585', '35.742574257425744', '35.54455445544555', '35.44554455445545', '35.44554455445545', '35.34653
465346535', '35.24752475247525', '35.049504950495056', '34.95049504950495', '34.95049504950495', '34.851485148514854', '34.65346534653466', '34.35
643564356436', '34.15841584158416', '34.35643564356436', '34.35643564356436', '34.25742574257426', '34.05940594059406', '33.86138613861387', '33.6
63366336633665', '33.86138613861387', '33.663366336633665', '33.663366336633665', '33.46534653465347', '33.366336633663366', '33.56435643564357',
'33.663366336633665', '33.663366336633665', '33.663366336633665', '33.663366336633665', '33.960396039603964', '34.05940594059406', '34.05940594059
406']
Output image
This is most likely due to using strings instead of numbers. When you pass strings, Matplotlib treats the values as categories and places them in the order in which they first appear, not in numeric order (a category that is repeated reuses its existing position). I understand that the snippet is not complete, but the values of Epoch and Scaled actually change on each iteration.
After the first set of data is plotted, any values not present in that set are positioned after it (i.e. to the right of the first set's last point in x, and above its last point in y). When the second set of data is plotted, its first x values have not appeared in the previous set, so they are placed after it (the beginning of the light blue line in the plot), regardless of their numeric value. The final values are the same as those that appeared in the first set, so the line goes back to the left of the figure.
You can try using [float(x) for x in Epoch] and [float(y) for y in Scaled] in the plots. Since there appear to be spaces in the strings representing the numbers, you could use a function like this:
def flist_from_slist(data):
    return [float(x.replace(' ', '')) for x in data]
And replace the pyplot.plot call by:
pyplot.plot(flist_from_slist(Epoch), flist_from_slist(Scaled), linewidth=1.9)
Moreover, there is a lot of code inside the loop that could be moved outside (setting the ticks, labels, etc).
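A rough sketch of both suggestions together, converting to floats and doing the styling once before the loop, might look like the following. The stations dictionary and its values are invented for illustration, and the Agg backend is an assumption so the sketch runs without a display; the real script would loop over its own Stations data.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no window needed
import matplotlib.pyplot as plt


def flist_from_slist(data):
    """Convert a list of numeric strings to floats, dropping stray spaces."""
    return [float(x.replace(' ', '')) for x in data]


# Hypothetical per-station data: (Epoch, Scaled), both as lists of strings.
stations = {
    "A": (['1638046800', '1638047700', '1638048600'], ['32.47', '33.06', '33.76']),
    "B": (['1638047700', '1638048600', '1638049500'], ['51.20', '50.80', '50.10']),
}

fig, ax = plt.subplots()

# One-time styling, moved out of the loop.
fig.patch.set_facecolor('black')
ax.set_facecolor('black')
ax.tick_params(axis='both', colors='yellow')
ax.set_ylabel("Percent of normal range.", size=10, color="yellow")
ax.set_xticks([])  # hide x ticks

# Only the plotting itself stays inside the loop.
for name, (epoch, scaled) in stations.items():
    ax.plot(flist_from_slist(epoch), flist_from_slist(scaled),
            linewidth=1.9, label=name)
```

Because the x values are now numbers, the second station's points land at their numeric positions instead of being appended as new categories.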

Giving imshow a custom list of yaxis labels

I'm trying to give pyplot.imshow() a list of values that aren't necessarily linear to use as y-axis labels. The truncated list is:
run_numbers = array([815676, 815766, 815767, 815768, 815769, 815770, 815771, 815772,
815773, 815774, 815775, 815776, 815777, 815778, 815779, 815780,
815781, 815783, 815784, 815785, 815786, 815789, 815790, 815792,
815793, 815794, 815795, 815796, 815797, 815798, 815799, 815800,
815801, 815802, 815803, 815804, 815805, 815806, 815807, 815808,
815809, 815811, 815812, 815813, 815814, 815815, 815816, 815817,
815818, 815819, 815820, 815821, 815822, 815823, 815824, 815825,
815826, 815827, 815829, 815830, 815831, 815832, 815833, 815834,
815835, 815836, 815837, 815838, 815839, 815841, 815842, 815843,
815844, 815845, 815846, 815847, 815848, 815849, 815851, 815852,
815853, 815854, 815855, 815856, 815857, 815858, 815859, 815860,
815861, 815863, 815864, 815865, 815866, 815867, 815869, 815870,
815871, 815872, 815873, 815874, 815875, 815876, 815877, 815878])
My image looks like this:
At first glance, this seems fine. But imshow isn't using the values from the list, instead it's using a linear range from 815676 to the max value of the list. I've tried a few different things:
plt.imshow(np.array(profiles), aspect='auto', vmin=-5, vmax=20, extent=[0,500,max(run_numbers), min(run_numbers)])
The above code gave the image above, which makes sense given what I put in extent.
Is there a way to tell imshow to use the values in the list as the yaxis label? I've tried ax.yticklabels and ax.ytick, but those also give a linear progression of numbers instead of the list values.
Please let me know how I can clarify my question if there's any confusion. I can also provide an example data set if my question isn't clear.
Let the extent run from 0 to the number of labels minus 1, then use plt.yticks(range(N), run_numbers) to set the labels. As there are about 100 labels, this would look very crowded, which could be mitigated by setting a large figure size and a small font. Alternatively, the labels could be set at regular intervals, e.g. every 10th row:
import numpy as np
import matplotlib.pyplot as plt
run_numbers = np.array([815676, 815766, 815767, 815768, 815769, 815770, 815771, 815772, 815773, 815774, 815775, 815776, 815777, 815778, 815779, 815780, 815781, 815783, 815784, 815785, 815786, 815789, 815790, 815792, 815793, 815794, 815795, 815796, 815797, 815798, 815799, 815800, 815801, 815802, 815803, 815804, 815805, 815806, 815807, 815808, 815809, 815811, 815812, 815813, 815814, 815815, 815816, 815817, 815818, 815819, 815820, 815821, 815822, 815823, 815824, 815825, 815826, 815827, 815829, 815830, 815831, 815832, 815833, 815834, 815835, 815836, 815837, 815838, 815839, 815841, 815842, 815843, 815844, 815845, 815846, 815847, 815848, 815849, 815851, 815852, 815853, 815854, 815855, 815856, 815857, 815858, 815859, 815860, 815861, 815863, 815864, 815865, 815866, 815867, 815869, 815870, 815871, 815872, 815873, 815874, 815875, 815876, 815877, 815878])
N = len(run_numbers)
profiles = np.random.randn(N, 501).cumsum(axis=1)
plt.imshow(profiles, aspect='auto', extent=[0, 500, N-1, 0])
plt.yticks(range(0, N, 10), run_numbers[::10])
plt.show()
To show every label, use a tall figure and a small font:
plt.figure(figsize=(10, 16))
plt.imshow(profiles, aspect='auto', extent=[0, 500, N-1, 0])
plt.yticks(range(N), run_numbers, fontsize=8)
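One detail worth noting: with extent=[0, 500, N-1, 0] the N rows share a height of N-1, so integer tick positions drift slightly away from the row centers toward the bottom of the image. A possible variant, sketched here with a shortened run_numbers list and random data as stand-ins, places the vertical edges at -0.5 and N-0.5 so that row i is centered exactly on y = i.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no window needed
import matplotlib.pyplot as plt
import numpy as np

# Shortened stand-in for the full list of run numbers.
run_numbers = np.array([815676, 815766, 815767, 815783, 815789, 815792])
N = len(run_numbers)

# Random profiles, as in the answer above, just to have something to draw.
rng = np.random.default_rng(0)
profiles = rng.normal(size=(N, 501)).cumsum(axis=1)

fig, ax = plt.subplots()
# Edges at -0.5 and N-0.5: each row is 1 unit tall and centered on its index.
ax.imshow(profiles, aspect='auto', extent=[0, 500, N - 0.5, -0.5])
ax.set_yticks(np.arange(N))
ax.set_yticklabels(run_numbers)
```

The labels are still only text attached to evenly spaced rows; the axis itself stays linear in the row index, which is what makes the non-linear run numbers possible.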

How do the x and y parameters in the Label object work for bokeh?

I've read the documentation for the Label class in Bokeh but the x and y parameters are quite confusing. Their behavior seems to change if you pass something to the x_units and y_units parameters but I don't understand what the units are supposed to be by default.
More specifically, I have a list of strings that I'm using for my x-axis:
xlab = [
'COREPCE2',
'COREPCE3',
'COREPCE4',
'COREPCE5',
'COREPCE6',
'',
'T5YIE'
]
p = figure(..., y_range = (0,.04), x_range = xlab)
If I wanted to draw basically anything else on the plot, I could just use those strings. For example I drew some lines like this:
p.line(['COREPCE2', 'T5YIE'], [.02,.02], color = 'black', line_dash = 'dashed')
p.line(['', ''], [0,.04], color = 'black')
And that works fine, this is the full chart.
Here's the issue though. I want to put a text label on the "COREPCE4" location of the x axis. If I try just passing the string for the x parameter in the Label class it just doesn't work:
section = Label(x = 'COREPCE4', y = .03, text = 'Survey of Professional Forecasters: August 9, 2019')
p.add_layout(section)
It throws an error: ValueError: expected a value of type Real, got COREPCE4 of type str. I don't really know what units it's expecting. Is there a way to make Bokeh recognize that I want to use the x-axis label as my x parameter in the same way I've done with the other glyphs?
The properties x_units and y_units refer to screen (pixel) vs data-space (axis) units. As of Bokeh 1.3.4 the x and y properties of Label can only be set from floating point numbers, so they cannot be used directly with categorical coordinates. For now you should use LabelSet, even if you are only showing a single label, since it can work with categorical coordinates.
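A minimal sketch of the LabelSet approach, reusing the names from the question, could look like this. The one-row ColumnDataSource and its column names (x, y, text) are choices made for this example, not something the question prescribes.

```python
from bokeh.models import ColumnDataSource, LabelSet
from bokeh.plotting import figure

xlab = ['COREPCE2', 'COREPCE3', 'COREPCE4', 'COREPCE5', 'COREPCE6', '', 'T5YIE']
p = figure(y_range=(0, .04), x_range=xlab)

# LabelSet reads coordinates and text from columns, so a single label
# becomes a data source with one row. The x value is a categorical factor.
source = ColumnDataSource(data=dict(
    x=['COREPCE4'],
    y=[.03],
    text=['Survey of Professional Forecasters: August 9, 2019'],
))

section = LabelSet(x='x', y='y', text='text', source=source)
p.add_layout(section)
```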

Struggling to correctly utilize the str.find() function

I'm trying to use str.find() and it keeps raising an error. What am I doing wrong?
I have a matrix where the 1st column is numbers and the 2nd is an abbreviation assigned to those numbers. The abbreviations are either ED, LI or NA. I'm trying to find the positions that correspond to those abbreviations so that I can plot a scatter graph that is colour coded to match those 3 groups.
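# imports assumed from the aliases used below
import numpy as np
import pandas as pd
import scipy.io as sio
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
from sklearn.decomposition import PCA
mat=sio.loadmat('PBMC_extract.mat') # loading the data file into Python
data=mat['spectra']
data_name=mat['name'] # reading in the variable
data_name = pd.DataFrame(data_name) # converting into a readable matrix
pca=PCA(n_components=20) # performs PCA on data with 20 components
pca.fit(data) # fits to the data set
datatrans=pca.transform(data) # transforms data using PCA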
# plotting the graph that accounts for majority of data and noise
plt.plot(np.cumsum(pca.explained_variance_ratio_))
plt.xlabel('Number of components')
plt.ylabel('Cumulative explained variance')
fig = plt.figure()
ax1 = Axes3D(fig)
#str.find to find individual positions of anticoagulants
str.find(data_name,'ED')
#renaming data for easiness
x_data=datatrans[0:35,0]
y_data=datatrans[0:35,1]
z_data=datatrans[0:35,2]
x2_data=datatrans[36:82,0]
y2_data=datatrans[36:82,1]
z2_data=datatrans[36:82,2]
x3_data=datatrans[83:97,0]
y3_data=datatrans[83:97,1]
z3_data=datatrans[83:97,2]
# scatter plot of score of PC1,2,3
ax1.scatter(x_data, y_data, z_data,c='b', marker="^")
ax1.scatter(x2_data, y2_data, z2_data,c='r', marker="o")
ax1.scatter(x3_data, y3_data, z3_data,c='g', marker="s")
ax1.set_xlabel('PC 1')
ax1.set_ylabel('PC 2')
ax1.set_zlabel('PC 3')
plt.show()
The error that keeps showing up is the following:
File "/Users/emma/Desktop/Final year project /working example of colouring data", line 49, in <module>
str.find(data_name,'ED')
TypeError: descriptor 'find' requires a 'str' object but received a 'DataFrame'
The error is because the find method expects a str object instead of a DataFrame object. As PiRK mentioned the problem is you're replacing the data_name variable here:
data_name = pd.DataFrame(data_name)
I believe it should be:
data = pd.DataFrame(data_name)
Also, although str.find(data_name,'ED') works when data_name is a str, the usual way is to call the method on the string and pass only the search term, like this:
data_name.find('ED')
The proper syntax would be
data_name.find('ED')
Look at the examples here:
https://www.programiz.com/python-programming/methods/string/find
EDIT 1
I just noticed that data_name is a pandas DataFrame, so that won't work. What exactly are you trying to do?
The result of your function call isn't assigned to a variable either, so it's hard to answer your question.
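If the goal is to find which rows belong to each abbreviation, one option is the vectorised string methods that pandas exposes on a Series through .str. This is a sketch under the assumption that the names are plain strings containing the abbreviation; the sample names below are invented, and with a DataFrame the same calls would apply to one of its columns.

```python
import pandas as pd

# Hypothetical sample names; the real ones would come from mat['name'].
names = pd.Series(["ED_01", "ED_02", "LI_01", "NA_01", "LI_02"])

# Vectorised str.find: position of the substring in each element, -1 if absent.
positions = names.str.find("ED")

# Row indices for each abbreviation, usable to select rows of the PCA scores.
groups = {
    code: names.index[names.str.contains(code, regex=False)].tolist()
    for code in ("ED", "LI", "NA")
}

print(positions.tolist())
print(groups)
```

The index lists in groups could then replace the hard-coded slices, for example datatrans[groups["ED"], 0] for the first component of the ED samples.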
