Changing plot marker for every element inside the plot - python

I have an 800x2 matrix that I am visualizing with a scatter plot. I am trying to change the marker type every 100 elements: for instance, elements 0 to 99 would use 'x', elements 100 to 199 would use 'o', and so forth.
However, I get the following error:
TypeError: only integer scalar arrays can be converted to a scalar index
This is my actual code:
from matplotlib.pyplot import figure
import numpy as np
color=['b','r']
markers = ['x', 'o', '1', '.', '2', '>', 'D', 'v']
X_lda_colors= [ color[i] for i in list(np.array(y)%8) ]
X_lda_markers= [ markers[i] for i in list(np.array(y)%2) ]
plt.xlabel('1-eigenvector')
plt.ylabel('2-eigenvector')
for i in range(X_lda.shape[0]):
    plt.scatter(
        X_lda[i,0],
        X_lda[i,1],
        c=X_lda_colors[i],
        marker=X_lda_markers[i],
        cmap='rainbow',
        alpha=0.7,
        edgecolors='w')
plt.show()
My goal is to basically use any sort of marker to differentiate between every 100th element inside my x_lda[i, 1] label that are clusters being plotted. This code used to work following this question: Plotting different clusters markers for every class in scatter plot.
But for my case, it gives me the error described above.
Here's a reproducible example:
X_lda = np.asarray([([1, 2], [1,5], [2, 3],[3, 5], [3, 4], [6, 9], [7, 9], [7, 8], [7, 10], [7, 12], [13, 14], [15, 16], [12, 14], [13, 15], [12, 14], [14, 14], [13, 4], [12, 5], [13, 4], [13, 3], [12, 6])]).reshape(21, 2)
a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]
from matplotlib.pyplot import figure
plt.xlabel('LD1')
plt.ylabel('LD2')
plt.scatter(
X_lda[:,0],
X_lda[:,1],
c=['red', 'red', 'red', 'red', 'red', 'red', 'red', 'red', 'red', 'red', 'green', 'green', 'green', 'green', 'green', 'green', 'green', 'green', 'green', 'green', 'green'],
cmap='rainbow',
alpha=0.7,
edgecolors='w'
)
For this 21x2 array, I'd like to change the first 7 elements to 'x', the next 7 elements to 'o', and the last 7 elements to '>', for instance.

I think you may be looking for something like this, replacing lines 5 and 6 of your code above:
X_lda_colors= [ color[i%2] for i in range(X_lda.shape[0]) ]
X_lda_markers= [ markers[i%8] for i in range(X_lda.shape[0]) ]
However, you should not loop over all 800 of your points and create one plot for each. A workaround would be something like this:
# plot each point in blue
plt.scatter(
    X_lda[:,0], X_lda[:,1],
    c = "b",
    ...
)
# plot again using every 100th element in red
plt.scatter(
    X_lda[::100,0], X_lda[::100,1],
    c = "r",
    ...
)
This will overprint every 100th element with a second dot in red. You end up with only two plot objects, holding 800 and 8 points respectively.
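If the goal is one marker per block of consecutive rows, as in the 21x2 example from the question, a minimal sketch is to issue one scatter call per block, since a single scatter call draws all of its points with one marker. The block size of 7 and the marker list are taken from the question's example; the variable names `block` and `chunk` are only illustrative.

```python
import matplotlib.pyplot as plt
import numpy as np

# The 21x2 array from the question's reproducible example.
X_lda = np.array([[1, 2], [1, 5], [2, 3], [3, 5], [3, 4], [6, 9], [7, 9],
                  [7, 8], [7, 10], [7, 12], [13, 14], [15, 16], [12, 14],
                  [13, 15], [12, 14], [14, 14], [13, 4], [12, 5], [13, 4],
                  [13, 3], [12, 6]])

markers = ['x', 'o', '>']
block = 7  # assumed block size; use 100 for the 800x2 case

fig, ax = plt.subplots()
for start in range(0, len(X_lda), block):
    chunk = X_lda[start:start + block]
    # Integer division gives the block number; the modulo wraps around
    # if there are more blocks than markers.
    marker = markers[(start // block) % len(markers)]
    ax.scatter(chunk[:, 0], chunk[:, 1], marker=marker, alpha=0.7)
ax.set_xlabel('LD1')
ax.set_ylabel('LD2')
# Call plt.show() to display the figure.
```

For the 800x2 matrix this produces 8 scatter calls instead of 800, and each call also picks up the next color from the default color cycle.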


"nticks" not being respected for xaxis in plotly scene

Let's begin with the code:
import pandas as pd
import plotly.express as px
b_min = [-3, 6, 0]
b_max = [24, 24, 9]
n_vertices = [10, 10, 10]
p_a = [-1, 11, 4.5]
p_b = [11, 9, 9]
p_c = [6, 8, 6]
p_d = [-1, 7, 10]
points = [p_a, p_b, p_c, p_d]
df = pd.DataFrame(points, columns=['x', 'y', 'z'])
figure = px.scatter_3d(df, x='x', y='y', z='z')
figure.update_layout(scene={
    'xaxis': {'nticks': n_vertices[0], 'range': [b_min[0], b_max[0]]},
    'yaxis': {'nticks': n_vertices[1], 'range': [b_min[1], b_max[1]]},
    'zaxis': {'nticks': n_vertices[2], 'range': [b_min[2], b_max[2]]}
})
figure.show()
I expect there to be 10 ticks per axis. This is true for axis y and z, but not for x. Why?
According to the documentation for scatter3d traces, nticks is the maximum number of ticks for an axis, and the actual number of ticks is less than or equal to nticks.
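If an exact number of ticks is needed, one option is to stop relying on nticks and give each axis explicit tick positions through 'tickmode': 'array' and 'tickvals'. The sketch below only builds the scene dictionary in plain Python, so it runs without plotly; the helper name `explicit_ticks` is hypothetical, and it assumes evenly spaced ticks that include both ends of the range.

```python
b_min = [-3, 6, 0]
b_max = [24, 24, 9]
n_vertices = [10, 10, 10]

def explicit_ticks(lo, hi, n):
    """Return n evenly spaced tick positions from lo to hi inclusive."""
    step = (hi - lo) / (n - 1)
    return [lo + k * step for k in range(n)]

scene = {}
for name, lo, hi, n in zip(['xaxis', 'yaxis', 'zaxis'], b_min, b_max, n_vertices):
    scene[name] = {
        'range': [lo, hi],
        'tickmode': 'array',  # use the listed positions instead of automatic ticks
        'tickvals': explicit_ticks(lo, hi, n),
    }

# With the figure from the question, this would be applied as:
#   figure.update_layout(scene=scene)
```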

how to plot a histogram by given points in python 3

I have 60 numbers divided into 8 intervals:
[[534, 540.0, 3], [540.0, 546.0, 3], [546.0, 552.0, 14], [552.0, 558.0, 8], [558.0, 564.0, 14], [564.0, 570.0, 9], [570.0, 576.0, 6], [576.0, 582.0, 3]]
The number of numbers in each interval is divided by 6:
[0.5, 0.5, 2.33, 1.33, 2.33, 1.5, 1.0, 0.5]
How do I create a histogram so that the bar heights correspond to the obtained values, with the x-axis labeled according to my intervals? The result should be something like this
(I do not have enough reputation to post images.)
Running F Blanchet's code generates the following graph in my IPython console:
That doesn't really look like your image. I think you're looking for something more like this, where the x-ticks are between the bars:
This is the code I used to generate the above plot:
import matplotlib.pyplot as plt
# Include one more value for final x-tick.
intervals = list(range(534, 583, 6))
# Include one more bar height that == 0.
bar_height = [0.5, 0.5, 2.33, 1.33, 2.33, 1.5, 1.0, 0.5, 0]
plt.bar(intervals,
        bar_height,
        width = [6] * 8 + [0],   # Set width of 0 bar to 0.
        align = "edge",          # Align ticks at edge of bars.
        tick_label = intervals)  # Make tick labels explicit.
You can use matplotlib:
import matplotlib.pyplot as plt
data = [[534, 540.0, 3], [540.0, 546.0, 3], [546.0, 552.0, 14], [552.0, 558.0, 8], [558.0, 564.0, 14], [564.0, 570.0, 9], [570.0, 576.0, 6], [576.0, 582.0, 3]]
x = [element[0]+3 for element in data]
y = [element[2]/6 for element in data]
width = 6
plt.bar(x, y, width, color="blue")
plt.show()
More documentation here

Plot multiple circles with numbers in them in Python with loop (blank figure returned)

Similar to this question but for many circles with numbers in them. I don't know why but the figure that is generated comes out blank. I would like a figure with 9 circles (having 1 of 3 colors), with the "job_id" printed in the circle.
import matplotlib.pyplot as plt
import pandas as pd
d = {'job_id': [1, 2, 3, 4, 5, 6, 7, 8, 9],
'hub': ['ZH1', 'ZH1', 'ZH1', 'ZH2', 'ZH2', 'ZH3', 'ZH3', 'ZH3', 'ZH3'],
'alerts': [18, 35, 45, 8, 22, 34, 29, 20, 30],
'color': ['orange', 'orange', 'orange', 'green', 'green', 'lightblue', 'lightblue', 'lightblue', 'lightblue']}
df=pd.DataFrame(data=d)
ax=plt.subplot(111)
for index, row in df.iterrows():
    print(row)
    ax.text(index,row['alerts'],str(row['job_id']), transform=plt.gcf().transFigure,
            bbox={"boxstyle" : "circle", "color":row['color']})
plt.show()
Two problems.
The transform is set to the figure transform. This would take numbers between 0 and 1 in both directions. However your data ranges much above 1. Since it seems you want to show the circles in data coordinates anyways, remove the transform=... part.
Text elements cannot be used to autoscale the axes. You would hence need to set the limits manually.
Complete code:
import matplotlib.pyplot as plt
import pandas as pd
d = {'job_id': [1, 2, 3, 4, 5, 6, 7, 8, 9],
'hub': ['ZH1', 'ZH1', 'ZH1', 'ZH2', 'ZH2', 'ZH3', 'ZH3', 'ZH3', 'ZH3'],
'alerts': [18, 35, 45, 8, 22, 34, 29, 20, 30],
'color': ['orange', 'orange', 'orange', 'green', 'green', 'lightblue', 'lightblue', 'lightblue', 'lightblue']}
df=pd.DataFrame(data=d)
ax=plt.subplot(111)
for index, row in df.iterrows():
    ax.text(index, row['alerts'],str(row['job_id']),
            bbox={"boxstyle" : "circle", "color":row['color']})
ax.set(xlim=(-1,len(df)), ylim=(df["alerts"].min()-5, df["alerts"].max()+5))
plt.show()
You need to map the x-y coordinates within 0-1 range. To do so I divide the x and y by the maximum value in the DataFrame. Later, I adjust the x- and y-limits accordingly and label the axes to display the actual values.
You also had only two 'green' but four 'lightblue' in your dictionary. I corrected it. I also replaced index by row['job_id'], because index starts at 0 but you would want to plot circle 1 at x=1.
import numpy as np
for index, row in df.iterrows():
    ax.text(row['job_id']/max(d['job_id']),row['alerts']/max(d['alerts']),str(row['job_id']),
            bbox={"boxstyle" : "circle", "color":row['color']})
plt.xlim(0, 1.1)
plt.ylim(0, 1.1)
plt.xticks(np.linspace(0,1,10), range(10))
plt.yticks(np.linspace(0,1,10), range(0,50,5))

Plot specific values on y axis instead of increasing scale from dataframe

When plotting 2 columns from a dataframe into a line plot, is it possible to, instead of a consistently increasing scale, have fixed values on your y axis (and keep the distances between the numbers on the axis constant)? For example, instead of 0, 100, 200, 300, ... to have 0, 21, 53, 124, 287, depending on the values from your dataset? So basically to have on the axis all your possible values fixed instead of an increasing scale?
Yes, you can use: ax.set_yticks()
Example:
import matplotlib.pyplot as plt
import pandas as pd
df = pd.DataFrame([[13, 1], [14, 1.5], [15, 1.8], [16, 2], [17, 2], [18, 3 ], [19, 3.6]], columns = ['A','B'])
fig, ax = plt.subplots()
x = df['A']
y = df['B']
ax.plot(x, y, 'g-')
ax.set_yticks(y)
plt.show()
Or, if the values are very far apart from each other, you can use ax.set_yscale('log').
Example:
df = pd.DataFrame([[13, 1], [14, 1.5], [15, 1.8], [16, 2], [17, 2], [18, 3 ], [19, 3.6], [20, 300]], columns = ['A','B'])
fig, ax = plt.subplots()
x = df['A']
y = df['B']
ax.plot(x, y, 'g-')
ax.set_yscale('log', base=2)  # matplotlib >= 3.3; older versions use basey=2
ax.yaxis.set_ticks(y)
ax.yaxis.set_ticklabels(y)
plt.show()
What you need to do is:
get all distinct y values and sort them
set their y position on the plot according to their place on the ordered list
set the y labels according to distinct ordered values
The code below does this:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
df = pd.DataFrame([[13, 1], [14, 1.8], [16, 2], [15, 1.5], [17, 2], [18, 3 ],
[19, 200],[20, 3.6], ], columns = ['A','B'])
x = df['A']
y = df['B']
y_keys = np.sort(y.unique())
y_values = range(len(y_keys))
y_dict = dict(zip(y_keys,y_values))
fig, ax = plt.subplots()
ax.plot(x,[y_dict[k] for k in y],'o-')
ax.set_yticks(y_values)
ax.set_yticklabels(y_keys)
plt.show()

deleting line from figure in bokeh

I am new to Bokeh. I made a widget where clicking a checkbox should add or delete a line in a Bokeh figure. I have 20 such checkboxes, and I don't want to replot the whole figure, just delete one line when its checkbox is unchecked.
This is done through a callback, where I have access to the figure object. I would imagine there is a way to do something like this:
F=figure()
F.line('x', 'y', source=source, name='line1')
F.line('x', 'z', source=source, name='line2')
# in callback
selected_line_name = 'line1' # this would be determined by checkbox
selected_line = F.children[selected_line_name]
delete(selected_line)
However, I am unable to figure out how to
1) access a glyph from its parent object
2) delete a glyph
I tried setting the datasource 'y'=[], but since all column data sources have to be the same size, this removes all the plots...
There are several ways:
# Keep the glyphs in a variable:
line2 = F.line('x', 'z', source=source, name='line2')
# or get the glyph from the Figure:
line2 = F.select_one({'name': 'line2'})
# in callback:
line2.visible = False
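To connect this to a group of checkboxes, the callback can set each line's visible flag from the list of checked indices. The following is a minimal sketch under stated assumptions: the `lines` list uses stand-in objects so that it runs without Bokeh installed, and the function name `update_visibility` is hypothetical. In a real app the list would hold the renderers returned by F.line(...).

```python
from types import SimpleNamespace

# Stand-ins for the renderers returned by F.line(...); hypothetical, used only
# so the sketch runs without Bokeh installed.
lines = [SimpleNamespace(name='line1', visible=True),
         SimpleNamespace(name='line2', visible=True)]

def update_visibility(attr, old, new):
    """Show only the lines whose checkbox index appears in `new`."""
    for i, line in enumerate(lines):
        line.visible = i in new

# In a Bokeh app, with a CheckboxGroup named checkbox_group (assumed name),
# this would be registered as:
#   checkbox_group.on_change('active', update_visibility)
# Here the callback is called directly to simulate unchecking the first box.
update_visibility('active', [0, 1], [1])
```

The `active` property of a CheckboxGroup is the list of indices of the checked boxes, which is why the callback tests `i in new`.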
This will work to maintain a shared 'x' data source column if glyphs are assigned as a variable and given a name attribute. The remove function fills the appropriate 'y' columns with nans, and the restore function replaces nans with the original values.
The functions require numpy and bokeh GlyphRenderer imports. I'm not sure that this method is worthwhile given the simple visible on/off option, but I am posting it anyway just in case this helps in some other use case.
Glyphs to remove or restore are referenced by glyph name(s), contained within a list.
src_dict = source.data.copy()
def remove_glyphs(figure, glyph_name_list):
    renderers = figure.select(dict(type=GlyphRenderer))
    for r in renderers:
        if r.name in glyph_name_list:
            col = r.glyph.y
            r.data_source.data[col] = [np.nan] * len(r.data_source.data[col])
def restore_glyphs(figure, src_dict, glyph_name_list):
    renderers = figure.select(dict(type=GlyphRenderer))
    for r in renderers:
        if r.name in glyph_name_list:
            col = r.glyph.y
            r.data_source.data[col] = src_dict[col]
Example:
from bokeh.plotting import figure, show
from bokeh.io import output_notebook
from bokeh.models import Range1d, ColumnDataSource
from bokeh.models.renderers import GlyphRenderer
import numpy as np
output_notebook()
p = figure(plot_width=200, plot_height=150,
           x_range=Range1d(0, 6),
           y_range=Range1d(0, 10),
           toolbar_location=None)
source = ColumnDataSource(data=dict(x=[1, 3, 5],
                                    y1=[1, 1, 2],
                                    y2=[1, 2, 6],
                                    y3=[1, 3, 9]))
src_dict = source.data.copy()
line1 = p.line('x', 'y1',
               source=source,
               color='blue',
               name='g1',
               line_width=3)
line2 = p.line('x', 'y2',
               source=source,
               color='red',
               name='g2',
               line_width=3)
line3 = p.line('x', 'y3',
               source=source,
               color='green',
               name='g3',
               line_width=3)
print(source.data)
show(p)
out:
{'x': [1, 3, 5], 'y1': [1, 1, 2], 'y2': [1, 2, 6], 'y3': [1, 3, 9]}
remove_glyphs(p, ['g1', 'g2'])
print(source.data)
show(p)
out:
{'x': [1, 3, 5], 'y1': [nan, nan, nan], 'y2': [nan, nan, nan], 'y3': [1, 3, 9]}
restore_glyphs(p, src_dict, ['g1', 'g3'])
print(source.data)
show(p)
('g3' was already on the plot, and is not affected)
out:
{'x': [1, 3, 5], 'y1': [1, 1, 2], 'y2': [nan, nan, nan], 'y3': [1, 3, 9]}
restore_glyphs(p, src_dict, ['g2'])
print(source.data)
show(p)
out:
{'x': [1, 3, 5], 'y1': [1, 1, 2], 'y2': [1, 2, 6], 'y3': [1, 3, 9]}
