Pyplot contourf don't fill in "0" level - python

I'm plotting precipitation data from weather model output. I'm contouring the data I have, using contourf. However, I don't want it to fill in the "0" level with color (only the values >0). Is there a good way to do this? I've tried messing around with the levels.
Here's the code I'm using to plot:
m = Basemap(projection='stere', lon_0=centlon, lat_0=centlat,
lat_ts=centlat, width=width, height=height)
m.drawcoastlines()
m.drawstates()
m.drawcountries()
parallels = np.arange(0., 90, 10.)
m.drawparallels(parallels, labels=[1, 0, 0, 0], fontsize=10)
meridians = np.arange(180., 360, 10.)
m.drawmeridians(meridians, labels=[0, 0, 0, 1], fontsize=10)
lons, lats = m.makegrid(nx, ny)
x, y = m(lons, lats)
cs = m.contourf(x, y, snowfall)
cbar = plt.colorbar(cs)
cbar.ax.set_ylabel("Accumulated Snow (kg/m^2)")
plt.show()
And here's the image I'm getting.
An example snowfall dataset would look something like:
0 0 0 0 0 0
0 0 1 1 1 0
0 1 2 2 1 0
0 2 3 2 1 0
0 1 0 1 2 0
0 0 0 0 0 0

This can also be achieved by passing a locator: MaxNLocator(prune='lower') from the matplotlib.ticker module. See docs.
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
a = np.array([
[0, 0, 0, 0, 0, 0],
[0, 0, 1, 1, 1, 0],
[0, 1, 2, 2, 1, 0],
[0, 2, 3, 2, 1, 0],
[0, 1, 0, 1, 2, 0],
[0, 0, 0, 0, 0, 0]
])
fig, ax = plt.subplots(1)
p = ax.contourf(a, locator = ticker.MaxNLocator(prune = 'lower'))
fig.colorbar(p)
plt.show()
The nbins parameter of MaxNLocator can be used to control the number of intervals (levels):
p = ax.contourf(a, locator=ticker.MaxNLocator(nbins=5, prune='lower'))

If you don't include 0 in your levels, you won't plot a contour at the 0 level.
For example:
import numpy as np
import matplotlib.pyplot as plt
a = np.array([
[0, 0, 0, 0, 0, 0],
[0, 0, 1, 1, 1, 0],
[0, 1, 2, 2, 1, 0],
[0, 2, 3, 2, 1, 0],
[0, 1, 0, 1, 2, 0],
[0, 0, 0, 0, 0, 0]
])
fig, ax = plt.subplots(1)
p = ax.contourf(a, levels=np.linspace(0.5, 3.0, 11))
fig.colorbar(p)
plt.show()
yields:
An alternative is to mask any datapoints which are 0:
p = ax.contourf(np.ma.masked_array(a, mask=(a==0)),
levels=np.linspace(0.0, 3.0, 13))
fig.colorbar(p)
Which looks like:
I suppose it's up to you which of those best matches your desired plot.

I was able to figure things out myself. I found two ways of solving this problem:
1. Mask out all data < 0.01 from the data set using
np.ma.masked_less(snowfall, 0.01)
2. Set the levels of the plot to run from 0.01 up to whatever the maximum value is
levels = np.linspace(0.01, 10, 100)
then
cs = m.contourf(x, y, snowfall, levels)
I found that option 1 worked best for me.
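As a minimal sketch of option 1, the example below applies np.ma.masked_less to the small sample grid from the question and contours it on plain Matplotlib axes. Basemap is left out so the snippet stays self-contained, and the Agg backend and the variable names are assumptions made only for illustration.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop this line for interactive use
import matplotlib.pyplot as plt

# Sample snowfall grid taken from the question
snowfall = np.array([
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 0],
    [0, 1, 2, 2, 1, 0],
    [0, 2, 3, 2, 1, 0],
    [0, 1, 0, 1, 2, 0],
    [0, 0, 0, 0, 0, 0],
], dtype=float)

# Mask every cell below the threshold so contourf leaves it unfilled
masked = np.ma.masked_less(snowfall, 0.01)

fig, ax = plt.subplots()
cs = ax.contourf(masked)
fig.colorbar(cs)
fig.canvas.draw()
```

With Basemap, the same masked array would be passed to m.contourf(x, y, masked) in place of snowfall.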

Related

Plotting by ignoring missing data in matplotlib

I have been trying to make a program that plots the frequency of usage of a word during Whatsapp chats between 2 people. The word night, for example, has been used a couple of times on a few days and 0 times on most days. The graph I have is as follows
Here is the code
word_occurances = [0 for i in range(len(just_dates))]
for i in range(len(just_dates)):
    for j in range(len(df_word)):
        if just_dates[i].date() == word_date[j].date():
            word_occurances[i] += 1
title = person2.rstrip(':') + ' with ' + person1.rstrip(':') + ' usage of the word - ' + word
plt.plot(just_dates, word_occurances, color = 'purple')
plt.gcf().autofmt_xdate()
plt.xlabel('Time')
plt.ylabel('number of times used')
plt.title(title)
plt.savefig('Graphs/Words/' + title + '.jpg', dpi = 200)
plt.show()
word_occurances is a list
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 3, 0, 0, 2, 0, 0, 0, 1, 0, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
What I want is for the graph to only connect the points where the word has been used, while still showing the entire timeline on the x axis. I don't want the graph to touch 0. How can I do this? I have searched and found similar answers, but none have worked the way I want them to.
You simply have to find the indices of word_occurances on which the corresponding value is greater than zero. With this you can index just_dates to get the corresponding dates.
word_counts = [] # Only word counts > 0
dates = [] # Date of > 0 word count
for i, val in enumerate(word_occurances):
    if val > 0:
        word_counts.append(val)
        dates.append(just_dates[i])
You may want to plot with an underlying bar plot in order to maintain the original scale.
plt.bar(just_dates, word_occurances)
plt.plot(dates, word_counts, 'r--')
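The same filtering can also be sketched with a NumPy boolean mask instead of an explicit loop. The short word_occurances list and the generated just_dates below are made-up stand-ins for the asker's data, used only so the snippet runs on its own.

```python
import numpy as np

# Hypothetical stand-in data: eight consecutive days and their word counts
just_dates = np.arange('2021-01-01', '2021-01-09', dtype='datetime64[D]')
word_occurances = [0, 0, 1, 0, 3, 0, 0, 2]

counts = np.asarray(word_occurances)
used = counts > 0            # True where the word was used at least once

dates = just_dates[used]     # dates with a non-zero count
word_counts = counts[used]   # the matching non-zero counts
```

dates and word_counts can then be passed to plt.plot in the same way as the lists built by the loop.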
One way to address this is to plot only data that contain entries but label all dates where a conversation took place to indicate the zero values in your graph:
from matplotlib import pyplot as plt
import matplotlib.dates as mdates
from matplotlib.ticker import FixedLocator
#fake data generation, this block just imitates your unknown data and can be deleted
import numpy as np
import pandas as pd
np.random.seed(12345)
n = 30
just_dates = pd.to_datetime(np.random.randint(1, 100, n)+18500, unit="D").sort_values().to_list()
word_occurances = [0]*n
for i in range(10):
    word_occurances[np.random.randint(n)] = np.random.randint(1, 10)
fig, ax = plt.subplots(figsize=(15,5))
#generate data to plot by filtering out zero values
plot_data = [(just_dates[i], word_occurances[i]) for i, num in enumerate(word_occurances) if num > 0]
#plot these data with marker to indicate each point
#think 1-1-1-1-1 would only be visible as two points with lines only
ax.plot(*zip(*plot_data), color = 'purple', marker="o")
#label all dates where conversations took place
ax.xaxis.set_major_locator(FixedLocator(mdates.date2num(just_dates)))
#prevent that matplotlib autoscales the y-axis
ax.set_ylim(0, )
ax.tick_params(axis="x", labelrotation= 90)
plt.xlabel('Time')
plt.ylabel('number of times used')
plt.title("Conversations at night")
plt.tight_layout()
plt.show()
Sample output:
This can get quite busy soon with all these date labels (and might or might not work with your datetime objects in just_dates, which might differ in structure from my sample data). Another way would be to indicate each conversation with vlines:
...
fig, ax = plt.subplots(figsize=(15,5))
plot_data = [(just_dates[i], word_occurances[i]) for i, num in enumerate(word_occurances) if num > 0]
ax.plot(*zip(*plot_data), color = 'purple', marker="o")
ax.vlines((just_dates), 0, max(word_occurances), color="red", ls="--")
ax.set_ylim(0, )
plt.gcf().autofmt_xdate()
plt.xlabel('Time')
plt.ylabel('number of times used')
plt.title("Conversations at night")
plt.tight_layout()
plt.show()
Sample output:

Discrete data plots in matplotlib

I have two data arrays and I am looking to plot them in a single plot using matplotlib
The data arrays are:
date_array=['2018-03-26', '2018-03-27', '2018-03-28', '2018-03-29', '2018-04-02', '2018-04-03', '2018-04-04', '2018-04-05', '2018-04-06', '2018-04-09', '2018-04-10', '2018-04-11', '2018-04-12', '2018-04-13', '2018-04-16', '2018-04-17', '2018-04-18', '2018-04-19', '2018-04-20', '2018-04-23', '2018-04-24', '2018-04-25', '2018-04-26', '2018-04-27', '2018-04-30', '2018-05-01', '2018-05-02', '2018-05-03', '2018-05-04', '2018-05-07', '2018-05-08', '2018-05-09', '2018-05-10', '2018-05-11', '2018-05-14', '2018-05-15', '2018-05-16', '2018-05-17', '2018-05-18', '2018-05-21', '2018-05-22', '2018-05-23', '2018-05-24', '2018-05-25', '2018-05-29', '2018-05-30', '2018-05-31', '2018-06-01', '2018-06-04', '2018-06-05', '2018-06-06', '2018-06-07', '2018-06-08', '2018-06-11', '2018-06-12', '2018-06-13', '2018-06-14', '2018-06-15', '2018-06-18', '2018-06-19', '2018-06-20', '2018-06-21', '2018-06-22', '2018-06-25', '2018-06-26', '2018-06-27', '2018-06-28', '2018-06-29', '2018-07-02', '2018-07-03', '2018-07-05', '2018-07-06', '2018-07-09', '2018-07-10', '2018-07-11', '2018-07-12', '2018-07-13', '2018-07-16', '2018-07-17', '2018-07-18', '2018-07-19', '2018-07-20', '2018-07-23', '2018-07-24', '2018-07-25', '2018-07-26', '2018-07-27', '2018-07-30', '2018-07-31', '2018-08-01', '2018-08-02', '2018-08-03', '2018-08-06', '2018-08-07', '2018-08-08', '2018-08-09', '2018-08-10', '2018-08-13', '2018-08-14', '2018-08-15']
value_1 = [45.27, 44.53, 44.68, 45.29, 44.43, 44.88, 45.85, 45.7, 44.76, 44.22, 44.81, 44.54, 44.13, 44.0, 43.41, 43.68, 43.29, 42.33, 42.18, 41.8, 41.78, 42.46, 43.67, 43.92, 44.75, 44.33, 44.41, 45.7, 43.8, 44.16, 44.9, 45.07, 46.24, 48.3, 49.21, 49.84, 50.34, 50.4, 49.98, 50.7, 49.15, 48.5, 48.53, 47.65, 48.52, 47.36, 46.13, 46.01, 47.27, 48.04, 49.48, 49.96, 50.48, 51.3, 52.29, 51.86, 50.2, 49.42, 50.0, 52.42, 52.32, 52.62, 52.13, 51.13, 50.24, 48.66, 48.99, 48.05, 48.33, 49.22, 50.62, 51.39, 51.87, 47.37, 49.53, 49.54, 51.82, 51.65, 52.98, 52.09, 54.24, 53.98, 52.72, 51.09, 49.99, 48.55, 47.98, 48.67, 48.87, 48.45, 48.65, 50.06, 52.64, 54.6, 56.61, 55.77, 55.59, 56.5, 56.31, 54.0]
value_2 = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 95.39398869716304, 95.39398869716304, 0, 0, 95.39398869716304, 95.39398869716304, 0, 0, 0, 0, 0, 0, 0, 95.39398869716304]
The thing is that I have data points available for value_1 for all dates in date_array but not for value_2, so wherever I don't have a value available I have filled in a zero (that is one of my questions, as you'll see later).
When I plot it using this code:
x = date_array
y1 = value_1
y2 = value_2
fig = plt.figure()
ax1 = fig.add_subplot(111)
ax1.scatter(x, y1, s=10, c='b', marker="s", label='fig 1')
ax1.scatter(x,y2, s=10, c='r', marker="o", label='fig 2')
plt.legend(loc='upper left');
plt.show()
I get this:
My questions:
How do I work my way around the fact that I don't have all values available for value_2 and still get the plot? I don't want the red dots that have value 0 to show in the plot, but I am not sure how to do that. Note: an entry in value_2 can't have a value of 0, so if it is 0 that means it's not present.
How do I fix the messed-up data labels on the x-axis? It would look neater if there were only 10-12 markers on the x-axis.
Thanks!
You can convert the zeros to NaN and they won't be plotted:
value_2 = [np.nan if x==0 else x for x in value_2]
For the second question, I would convert the strings to datetime objects so that the tick spacing is adjusted automatically, and then rotate the labels:
from datetime import datetime
date_array = [datetime.strptime(i, '%Y-%m-%d').date() for i in date_array]
plt.xticks(rotation=70)
Complete code:
import matplotlib.pyplot as plt
import numpy as np
from datetime import datetime
date_array = [datetime.strptime(i, '%Y-%m-%d').date() for i in date_array]
value_2 = [np.nan if x==0 else x for x in value_2]
x = date_array
y1 = value_1
y2 = value_2
fig = plt.figure()
ax1 = fig.add_subplot(111)
ax1.plot_date(x, y1, c='b', label='fig 1')
ax1.plot_date(x, y2, c='r', label='fig 2')
plt.legend(loc='upper left')
plt.xticks(rotation=70)
plt.show()
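As a rough sketch of both fixes together, the snippet below replaces zeros with NaN via a NumPy array and caps the number of date ticks with AutoDateLocator(maxticks=12). The five-point sample data is made up for illustration, and the choice of AutoDateLocator and the Agg backend are assumptions rather than part of the answer above.

```python
from datetime import datetime

import numpy as np
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

# Made-up, shortened sample in the same shape as the question's data
date_strings = ['2018-07-26', '2018-07-27', '2018-07-30', '2018-07-31', '2018-08-01']
value_1 = [48.5, 48.0, 48.7, 48.9, 48.4]
value_2 = [0, 95.4, 95.4, 0, 0]

dates = [datetime.strptime(s, '%Y-%m-%d') for s in date_strings]

# Zeros mean "no value", so turn them into NaN; NaN points are not drawn
y2 = np.array(value_2, dtype=float)
y2[y2 == 0] = np.nan

fig, ax = plt.subplots()
ax.scatter(dates, value_1, s=10, c='b', marker='s', label='fig 1')
ax.scatter(dates, y2, s=10, c='r', marker='o', label='fig 2')

# Let Matplotlib pick date ticks, but never more than 12 of them
ax.xaxis.set_major_locator(mdates.AutoDateLocator(maxticks=12))
fig.autofmt_xdate()
ax.legend(loc='upper left')
fig.canvas.draw()
```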

How do I color background based on 1 or 0 in Python

I want the background of the graph of x to be grey when y=1 and white when y=0
#some random data
x = np.random.random(12)
#0's and 1's
y = [0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1]
plt.plot(np.linspace(0, 12, 12), x);
So it looks something like this instead of this
You can try manually drawing the rectangles using a loop:
import matplotlib.pyplot as plt
import matplotlib.patches as patches
import numpy as np
idx = np.linspace(0, 12, 12)
x = np.random.random(12)
y = [0, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1, 1]
fig, ax = plt.subplots(1)
ax.plot(idx, x)
rect_height = np.max(x)
rect_width = 1
for i, draw_rect in enumerate(y):
    if draw_rect:
        rect = patches.Rectangle(
            (i, 0),
            rect_width,
            rect_height,
            linewidth=1,
            edgecolor='grey',
            facecolor='grey',
            fill=True
        )
        ax.add_patch(rect)
plt.show()
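A possible alternative to hand-built rectangles is ax.axvspan, which shades a vertical band across the full height of the axes. The sketch below assumes the points sit at integer x positions 0 to 11 and shades half a unit either side of each point where y is 1; those positions, the alpha value, and the Agg backend are illustrative choices, not part of the question.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop this line for interactive use
import matplotlib.pyplot as plt

x = np.random.random(12)                     # some random data
y = [0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1]     # 0's and 1's from the question

fig, ax = plt.subplots()
ax.plot(np.arange(12), x)

# Shade a grey band around every index where y is 1
spans = [
    ax.axvspan(i - 0.5, i + 0.5, facecolor='grey', edgecolor='none', alpha=0.4)
    for i, flag in enumerate(y)
    if flag
]
fig.canvas.draw()
```

Because axvspan spans the whole y-range, the bands stay full height even if the y-limits change later.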

How to annotate a stacked bar plot and add legend labels

In short:
Height of bars does not match the numbers.
Labels seem to be placed on the wrong height. (should be right in the middle of each bar)
On the very bottom I also see the '0' labels which I really don't want to see in the graph.
Explained:
I'm trying to make a stacked bar chart and label each bar with its appropriate value. But for some reason the height of the bars is completely wrong. For the first week, for example, the green bar should be 20 points long but it is only 10, and the red bar should be 10 points long but it is only 8 or so. Week 17 should have multiple bars in it but instead has only one (the white one).
I am guessing that because of the wrong bar heights the labels are misplaced too. I have no idea why the 0's on the very bottom are also showing but that's a problem too.
I don't know if these are all separate questions and should be asked in separate posts, but I feel like they are all connected and that there is an answer that solves them all.
import matplotlib.pyplot as plt
import numpy as np
newYearWeek =[201613, 201614, 201615, 201616, 201617, 201618, 201619, 201620, 201621, 201622]
uniqueNames = ['Word1', 'Word2', 'Word3', 'Word4', 'Word5', 'Word6',
'Word7', 'Word8', 'Word9', 'Word10', 'Word11']
#Each column in the multiarray from top to bottom represents 1 week
#Each row from left to right represents the values of that word.
#So that makes 11 rows and 10 columns.
#And yes the multidimensional array have to be like this with the 0's in it.
keywordsMuliarray = [
[20, 3, 1, 0, 0, 1, 6, 3, 1, 2],
[10, 1, 0, 0, 3, 1, 3, 1, 0, 2],
[2, 2, 5, 3, 5, 4, 5, 4, 3, 2],
[0, 4, 3, 3, 1, 0, 2, 7, 1, 2],
[0, 0, 2, 0, 1, 1, 1, 0, 1, 3],
[0, 0, 3, 2, 0, 0, 0, 1, 0, 0],
[1, 0, 1, 0, 1, 0, 0, 0, 1, 1],
[0, 0, 0, 1, 0, 1, 0, 0, 0, 0],
[0, 1, 0, 0, 7, 6, 0, 1, 0, 0],
[0, 0, 0, 0, 0, 0, 1, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 2, 0, 1]]
fig = plt.figure(figsize=(8.5, 5.5))
ax = fig.add_subplot(111)
fig.subplots_adjust(top=0.85)
N = len(newYearWeek)
ind = np.arange(N) # the x locations for the groups
width = 0.35 # the width of the bars: can also be len(x) sequence
colors = ['seagreen', 'indianred', 'steelblue', 'darkmagenta', 'wheat',
'orange', 'mediumslateblue', 'silver',
'whitesmoke', 'black', 'darkkhaki', 'dodgerblue', 'crimson',
'sage', 'navy', 'plum', 'darkviolet', 'lightpink']
def autolabel(rects, values):
    # Attach some text labels.
    for (rect, value) in zip(rects, values):
        ax.text(rect.get_x() + rect.get_width() / 2.,
                rect.get_y() + rect.get_height() / 2.,
                '%d'%value,
                ha = 'center',
                va = 'center')
left = np.zeros(len(uniqueNames)) # left alignment of data starts at zero
helpingNumber = 0
for i in range(0, len(newYearWeek)):
    rects1 = plt.bar(ind, keywordsMuliarray[helpingNumber][:],width, color=colors[helpingNumber], label=uniqueNames[helpingNumber])
    autolabel(rects1, keywordsMuliarray[helpingNumber][:])
    helpingNumber = helpingNumber+1
# Shrink current axis by 20%
box = ax.get_position()
ax.set_position([box.x0, box.y0, box.width * 1, box.height])
# Put a legend to the right of the current axis
ax.legend(loc='center left', fontsize=9, bbox_to_anchor=(1, 0.5))
#plt.ylabel('Scores')
plt.xticks(ind + width/2., newYearWeek, fontsize=8)
#plt.yticks(np.arange(0, 81, 10))
plt.margins(x=0.02)
plt.tight_layout(rect=[0,0,0.8,1])
plt.show()
This is how the graph looks now:
To get what you want, you have to sum the heights of all previous bars in the current column (the list bot_heights), like this:
import matplotlib.pyplot as plt
import numpy as np
newYearWeek =[201613, 201614, 201615, 201616, 201617, 201618, 201619, 201620, 201621, 201622]
uniqueNames = ['Word1', 'Word2', 'Word3', 'Word4', 'Word5', 'Word6',
'Word7', 'Word8', 'Word9', 'Word10', 'Word11']
#Each column in the multiarray from top to bottom represents 1 week
#Each row from left to right represents the values of that word.
#So that makes 11 rows and 10 columns.
#And yes the multidimensional array have to be like this with the 0's in it.
keywordsMuliarray = [
[20, 3, 1, 0, 0, 1, 6, 3, 1, 2],
[10, 1, 0, 0, 3, 1, 3, 1, 0, 2],
[2, 2, 5, 3, 5, 4, 5, 4, 3, 2],
[0, 4, 3, 3, 1, 0, 2, 7, 1, 2],
[0, 0, 2, 0, 1, 1, 1, 0, 1, 3],
[0, 0, 3, 2, 0, 0, 0, 1, 0, 0],
[1, 0, 1, 0, 1, 0, 0, 0, 1, 1],
[0, 0, 0, 1, 0, 1, 0, 0, 0, 0],
[0, 1, 0, 0, 7, 6, 0, 1, 0, 0],
[0, 0, 0, 0, 0, 0, 1, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 2, 0, 1]]
fig = plt.figure(figsize=(8.5, 5.5))
ax = fig.add_subplot(111)
fig.subplots_adjust(top=0.85)
N = len(newYearWeek)
ind = np.arange(N) # the x locations for the groups
width = 0.35 # the width of the bars: can also be len(x) sequence
colors = ['seagreen', 'indianred', 'steelblue', 'darkmagenta', 'wheat',
'orange', 'mediumslateblue', 'silver',
'whitesmoke', 'black', 'darkkhaki', 'dodgerblue', 'crimson',
'sage', 'navy', 'plum', 'darkviolet', 'lightpink']
def autolabel(rects, values):
    # Attach some text labels
    for (rect, value) in zip(rects, values):
        if value > 0:
            ax.text(rect.get_x() + rect.get_width() / 2.,
                    rect.get_y() + rect.get_height() / 2.,
                    '%d'%value, ha = 'center', va = 'center', size = 9)
left = np.zeros(len(uniqueNames)) # left alignment of data starts at zero
# plot the first bars
rects1 = plt.bar(ind, keywordsMuliarray[0][:],width,
                 color=colors[0], label=uniqueNames[0])
autolabel(rects1, keywordsMuliarray[0][:])
# put other bars on previous
bot_heights = [0.] * len(keywordsMuliarray[0][:])
# range replaces Python 2's xrange; N is the number of weeks (10),
# so this loop stops before the 11th row ('Word11')
for i in range(1,N):
    bot_heights = [bot_heights[j] + keywordsMuliarray[i-1][j] for j in range(len(bot_heights))]
    rects1 = plt.bar(ind, keywordsMuliarray[i][:],width,
                     color=colors[i], label=uniqueNames[i],
                     bottom=bot_heights)
    autolabel(rects1, keywordsMuliarray[i][:])
# Shrink current axis by 20%
box = ax.get_position()
ax.set_position([box.x0, box.y0, box.width * 1, box.height])
# Put a legend to the right of the current axis
ax.legend(loc='center left', fontsize=9, bbox_to_anchor=(1, 0.5))
#plt.ylabel('Scores')
plt.xticks(ind + width/2., newYearWeek, fontsize=8)
plt.yticks(np.arange(0, 41, 5))
plt.margins(x=0.02)
plt.tight_layout(rect=[0,0,0.8,1])
plt.show()
To prevent bar labels from overlapping, I recommend not adding a label when a value is zero (see the modified autolabel function). As a result I get:
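The running sum of bar bottoms can also be sketched with np.cumsum, which gives the starting height of every row in one step and loops over all 11 words. This is an illustrative variant under stated assumptions (default colors, the Agg backend, ticks placed at ind), not the code of the answer above.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop this line for interactive use
import matplotlib.pyplot as plt

newYearWeek = [201613, 201614, 201615, 201616, 201617,
               201618, 201619, 201620, 201621, 201622]
uniqueNames = ['Word%d' % i for i in range(1, 12)]
keywordsMuliarray = np.array([
    [20, 3, 1, 0, 0, 1, 6, 3, 1, 2],
    [10, 1, 0, 0, 3, 1, 3, 1, 0, 2],
    [2, 2, 5, 3, 5, 4, 5, 4, 3, 2],
    [0, 4, 3, 3, 1, 0, 2, 7, 1, 2],
    [0, 0, 2, 0, 1, 1, 1, 0, 1, 3],
    [0, 0, 3, 2, 0, 0, 0, 1, 0, 0],
    [1, 0, 1, 0, 1, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1, 0, 0, 0, 0],
    [0, 1, 0, 0, 7, 6, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 2, 0, 1]])

# bottoms[i] is the summed height of rows 0..i-1, i.e. where row i starts
bottoms = np.cumsum(keywordsMuliarray, axis=0) - keywordsMuliarray

fig, ax = plt.subplots(figsize=(8.5, 5.5))
ind = np.arange(len(newYearWeek))
for name, heights, bottom in zip(uniqueNames, keywordsMuliarray, bottoms):
    ax.bar(ind, heights, 0.35, bottom=bottom, label=name)

ax.set_xticks(ind)
ax.set_xticklabels(newYearWeek, fontsize=8)
ax.legend(loc='center left', fontsize=9, bbox_to_anchor=(1, 0.5))
fig.tight_layout(rect=[0, 0, 0.8, 1])
fig.canvas.draw()
```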
The other answer doesn't plot data for 'Word11'
Lists and arrays of data can most easily be plotted by loading them into pandas
Plot the dataframe with pandas.DataFrame.plot and kind='bar'
When plotting data from pandas, the index values become the axis tick labels and the column names are the segment labels
matplotlib.pyplot.bar_label can be used to add annotations
See Adding value labels on a matplotlib bar chart for more options using .bar_label.
Tested in pandas 1.3.1, python 3.8 [1], and matplotlib 3.4.2 [1].
[1] Minimum version required.
On Python versions before 3.8, use labels = [f'{v.get_height():0.0f}' if v.get_height() > 0 else '' for v in c ], which avoids the assignment expression (:=).
import pandas as pd
import matplotlib.pyplot as plt
# create a dataframe from the data in the OP and transpose it with .T
df = pd.DataFrame(data=keywordsMuliarray, index=uniqueNames, columns=newYearWeek).T
# display(df.head())
        Word1  Word2  Word3  Word4  Word5  Word6  Word7  Word8  Word9  Word10  Word11
201613     20     10      2      0      0      0      1      0      0       0       0
201614      3      1      2      4      0      0      0      0      1       0       0
201615      1      0      5      3      2      3      1      0      0       0       0
201616      0      0      3      3      0      2      0      1      0       0       0
201617      0      3      5      1      1      0      1      0      7       0       0
colors = ['seagreen', 'indianred', 'steelblue', 'darkmagenta', 'wheat', 'orange', 'mediumslateblue', 'silver', 'whitesmoke', 'black', 'darkkhaki']
# plot the dataframe
ax = df.plot(kind='bar', stacked=True, figsize=(9, 6), color=colors, rot=0, ec='k')
# Put a legend to the right of the current axis
ax.legend(loc='center left', fontsize=9, bbox_to_anchor=(1, 0.5))
# add annotations
for c in ax.containers:
# customize the label to account for cases when there might not be a bar section
labels = [f'{h:0.0f}' if (h := v.get_height()) > 0 else '' for v in c ]
# set the bar label
ax.bar_label(c, labels=labels, label_type='center', fontsize=8)
plt.show()

Plot labelled and unlabeled data matplotlib

I have three lists, X, Y and Z:
X = [[0.67910803031180977, 0.1443997264255876], [0.57, 0.87], [0.545, 0.854], [0.645, 0.1254], [0.645, 0.1354], [0.62, 0.83], [0.6945, 0.144], [0.9945, 0.45244], [0.235, 0.7754], [0.7, 0.85]]
Y = [0, 1, -1, -1, -1, 1, -1, -1, -1, 1]
Z = [0, 1, 1, 0, 0, 1, 0, 1, 1, 1]
Where,
X is the dataset,
Y is the label set, where 0 means "Normal", 1 means "LL" and -1 means "Unlabelled",
Z is the output set, in which the labels from Y have been propagated to the unlabelled points.
Now, I am trying to plot a figure where one subplot shows the dataset as clusters according to the Y label each point belongs to, and another subplot shows the dataset according to Z.
I tried code from this example but i am not able to do it.
Please help.
I'm guessing at what you want, but here's an example of plotting the X values with colors determined by the Y and Z lists respectively. It's using a lot of default behavior -- color values between 0 and 1 get plotted into a default colorbar, iirc -- but you could make a more complicated function and pass a list of (rgb) or (rgba) values instead.
import matplotlib.pyplot as plt
from numpy import array
X = array([[0.67910803031180977, 0.1443997264255876], [0.57, 0.87],
[0.545, 0.854], [0.645, 0.1254], [0.645, 0.1354], [0.62, 0.83],
[0.6945, 0.144], [0.9945, 0.45244], [0.235, 0.7754], [0.7, 0.85]])
Y = [0, 1, -1, -1, -1, 1, -1, -1, -1, 1]
Z = [0, 1, 1, 0, 0, 1, 0, 1, 1, 1]
# for readability mostly
Xx = X.T[0]
Xy = X.T[1]
fig = plt.figure()
ax1 = fig.add_subplot(121)
ax1.scatter(Xx, Xy, c=[0.3 * c + 0.5 for c in Y], s=50, alpha=0.75)
ax1.set_xlabel('Y labels')
ax2 = fig.add_subplot(122)
ax2.scatter(Xx, Xy, c=[0.3 * c + 0.5 for c in Z], s=50, alpha=0.75)
ax2.set_xlabel('Z labels')
plt.show()
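If explicit colors per label are preferred over the default colormap, one possible sketch is to look each label up in a dictionary. The label_colors mapping and its color names are hypothetical choices for illustration, as is the Agg backend.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop this line for interactive use
import matplotlib.pyplot as plt

X = np.array([[0.67910803031180977, 0.1443997264255876], [0.57, 0.87],
              [0.545, 0.854], [0.645, 0.1254], [0.645, 0.1354], [0.62, 0.83],
              [0.6945, 0.144], [0.9945, 0.45244], [0.235, 0.7754], [0.7, 0.85]])
Y = [0, 1, -1, -1, -1, 1, -1, -1, -1, 1]
Z = [0, 1, 1, 0, 0, 1, 0, 1, 1, 1]

# Hypothetical mapping: 0 = "Normal", 1 = "LL", -1 = "Unlabelled"
label_colors = {0: 'tab:blue', 1: 'tab:red', -1: 'lightgrey'}
colors_y = [label_colors[label] for label in Y]
colors_z = [label_colors[label] for label in Z]

fig, (ax1, ax2) = plt.subplots(1, 2, sharex=True, sharey=True)
ax1.scatter(X[:, 0], X[:, 1], c=colors_y, s=50, alpha=0.75)
ax1.set_xlabel('Y labels')
ax2.scatter(X[:, 0], X[:, 1], c=colors_z, s=50, alpha=0.75)
ax2.set_xlabel('Z labels')
fig.canvas.draw()
```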
