I have a dataframe created using
results = df2[df2['R'] > 100].sort_values(by='ZR', ascending=False)
I would like to do
plt.plot(results['ZR'], marker='o')
except I would like the points where results['I'] == foo to be in red and the points where results['I'] != foo to be in blue.
I tried
firstset = results.ZR[results.I.str.contains('foo')]
secondset = results.ZR[~results.I.str.contains('foo')]
plt.plot(firstset, marker='o', color='red')
plt.plot(secondset, marker='o', color='blue')
but this plots both halves starting from x = 0, which is not what I need.
I would instead just like the original graph but with some of the points in red and some in blue. That is, no new points and no points with changed positions. Here is the original graph.
First of all, when no x argument is given, plt.plot uses consecutive integers starting from 0 as the x values for a plain sequence. To find the correct x positions using logical operations, do this:
firstIndex = results.index[results.I.str.contains('foo')]
secondIndex = results.index[~results.I.str.contains('foo')]
If your original index is complex (for example, no longer in order after sorting), create a dummy pandas DataFrame to get a fresh integer index.
newDf = pd.DataFrame(range(len(results)))
firstIndex = newDf.index[results.I.str.contains('foo')]
secondIndex = newDf.index[~results.I.str.contains('foo')]
This is guaranteed to create two indices that are subsets of 0 to len(results) - 1.
Next, pass a format string to plt.plot to specify color and marker type.
plt.plot(results['ZR'])
plt.plot(firstIndex, firstset, "ro")
plt.plot(secondIndex, secondset, "bo")
This will plot the entire underlying data as a line, with no markers for the points. It will then overlay the two sets of points in their respective colors (red for the 'foo' rows, blue for the rest).
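For reference, here is a minimal self-contained sketch of the same idea. The small `results` frame is made-up stand-in data (not the asker's), and it uses positional x values from `np.arange` instead of a dummy DataFrame, which is just one way to get positions 0 to len(results) - 1.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-in for `results`; the index is deliberately out of order,
# as it would be after sorting by ZR.
results = pd.DataFrame(
    {"ZR": [9.0, 7.5, 6.0, 4.5, 3.0],
     "I": ["foo", "bar", "foo1", "baz", "bar"]},
    index=[42, 7, 13, 99, 0],
)

positions = np.arange(len(results))                      # x = 0 .. len(results) - 1
is_foo = results["I"].str.contains("foo").to_numpy()     # boolean mask per row
zr = results["ZR"].to_numpy()

fig, ax = plt.subplots()
ax.plot(positions, zr, color="gray")                     # original line, no markers
ax.plot(positions[is_foo], zr[is_foo], "ro")             # 'foo' rows in red
ax.plot(positions[~is_foo], zr[~is_foo], "bo")           # everything else in blue
```

Because both overlays reuse the same `positions`, no point moves and no point is added; only the marker colors differ.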
I'm trying to compare two sets of London Airbnb data. I want an elegant way to plot the London shapefile on two subplots, and then overlay the different data as points on each map. My shapefile is from here:
londonshp = gpd.read_file(r"statistical-gis-boundaries london\ESRI\London_Borough_Excluding_MHW.shp")
londonshp = londonshp.to_crs(4326)
This is the code to plot the maps:
fig, axes = plt.subplots(ncols=2, figsize = (12,16))
#entire home/apt on left
axes[0].set_aspect('equal')
londonshp.plot(ax = axes[0],
               color = '#e0e1dd',
               edgecolor = '#1c1c1c')
axes[0].scatter(entirehomedf.longitude,
                entirehomedf.latitude,
                s = 1,
                c = '#2ec4b6',
                marker = '.')
axes[0].set_yticklabels([])
axes[0].set_xticklabels([])
axes[0].set_title("Entire Homes/Apts")
#private room on right
axes[1].set_aspect('equal')
londonshp.plot(ax = axes[1],
               color = '#e0e1dd',
               edgecolor = '#1c1c1c')
axes[1].scatter(privateroomdf.longitude,
                privateroomdf.latitude,
                s = 1,
                c = '#ff9f1c')
axes[1].set_yticklabels([])
axes[1].set_xticklabels([])
axes[1].set_title("Private Rooms")
Result:
The code I have works fine, but it seems inelegant.
Manually plotting the shapefile on each subplot is ok for just two subplots, but not ideal for larger numbers of subplots. I imagine there's a quicker way to do it automatically (e.g. a loop?)
Some scatterplot features (like marker shape/size) are the same on each subplot. I'm sure there's a better way to set these features for the whole figure, and then edit features which are individual to each subplot (like colour) separately.
I won't code it out, but I can give you some tips:
Yes, you can use a loop to plot multiple subplots. All you have to do is iterate over lists of the things that change between subplots (e.g. colour and data) and use them inside the loop.
Inside the loop you can easily access all the variables you need, including the features of each graph, e.g.:
c = ["blue", "red", "yellow"]
for x in range(3):
    plt.plot(..., color=c[x])
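To make the tips a bit more concrete, here is a rough sketch of such a loop applied to the two-panel layout from the question. The listings are random stand-in data produced by a hypothetical `fake_listings` helper, and the geopandas basemap call is left as a comment so the sketch runs without the shapefile. Settings shared by every panel live in one dict; the per-panel title, data and colour come from a list.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

def fake_listings(n):
    """Hypothetical stand-in for the Airbnb dataframes (random points)."""
    return pd.DataFrame({
        "longitude": rng.uniform(-0.5, 0.3, n),
        "latitude": rng.uniform(51.3, 51.7, n),
    })

entirehomedf = fake_listings(200)
privateroomdf = fake_listings(150)

# Features shared by every subplot
shared_style = {"s": 1, "marker": "."}

# Features that differ per subplot: (title, data, colour)
panels = [
    ("Entire Homes/Apts", entirehomedf, "#2ec4b6"),
    ("Private Rooms", privateroomdf, "#ff9f1c"),
]

fig, axes = plt.subplots(ncols=len(panels), figsize=(12, 16))
for ax, (title, df, colour) in zip(axes, panels):
    ax.set_aspect("equal")
    # With the shapefile loaded, the basemap would be drawn here:
    # londonshp.plot(ax=ax, color="#e0e1dd", edgecolor="#1c1c1c")
    ax.scatter(df["longitude"], df["latitude"], c=colour, **shared_style)
    ax.tick_params(labelbottom=False, labelleft=False)  # hide tick labels
    ax.set_title(title)
```

Adding a third map then only means appending one tuple to `panels`.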
I have a seemingly simple problem of standardizing and labeling my axis on a series of graphs I am creating from a DataFrame. This dataframe contains a column with a sort of ID and each row contains a value for x and a value for y. I am generating a separate graph for each ID; however, I would like a standard axis across all of these graphs. Here is my code:
groups = data.groupby('Pedigree')
for Pedigree,group in groups:
    group.plot(x='EnvironmentalIndex',y='GrainYield',marker='o',linestyle='',color ='white',label=Pedigree)
    plt.plot([0,250],[0,250],linestyle = 'dashed',color='black')
    x = group.EnvironmentalIndex
    y = group.GrainYield
    z = np.polyfit(x,y,1)
    p = np.poly1d(z)
    q = sum(y)/len(y)
    plt.plot(x,p(x),color='green')
    plt.text(25,220,'Stability=%.6f'%(z[0]))
    plt.text(25,205,'Mean Yield=%.6f'%(q))
I know there is an axes function in Matplotlib, but I can't get the formatting right so that it plays well with the for loop. I have tried inserting a
group.axes()
inside of the for loop but I get the error that the list object is not callable.
If by "standard" you mean having the same ticks, there are different ways of doing this. One, if you don't have a lot of plots, is to create subplots that share the same x-axis:
no_rows = len(data.groupby('Pedigree'))
no_columns = 1
fig, ax = plt.subplots(no_rows, no_columns, sharex = True)
ax = ax.reshape(-1)
count = 0
for Pedigree,group in groups:
    ...
    q = sum(y)/len(y)
    ax[count].plot(x,p(x),color='green')
    ax[count].text(25,220,'Stability=%.6f'%(z[0]))
    ax[count].text(25,205,'Mean Yield=%.6f'%(q))
    count+=1
Only the bottom plot will show x tick labels. You can also use a different number of columns, but make sure no_rows * no_columns >= number of plots.
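A fuller sketch of the shared-axis approach might look like the following. The `data` frame is invented for illustration, and the 0 to 250 limits and axis labels are assumptions taken from the dashed reference line and column names in the question. Here both axes are shared and the limits are set explicitly, so every panel ends up with identical axes.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-in data with the column names from the question
data = pd.DataFrame({
    "Pedigree": ["A", "A", "A", "B", "B", "B"],
    "EnvironmentalIndex": [50, 120, 200, 60, 130, 210],
    "GrainYield": [55, 110, 190, 80, 125, 170],
})

groups = data.groupby("Pedigree")
fig, axes = plt.subplots(groups.ngroups, 1, sharex=True, sharey=True,
                         squeeze=False)  # squeeze=False: always a 2D array

for ax, (pedigree, group) in zip(axes.ravel(), groups):
    x = group["EnvironmentalIndex"].to_numpy()
    y = group["GrainYield"].to_numpy()
    slope, intercept = np.polyfit(x, y, 1)               # linear fit

    ax.plot(x, y, "o", label=pedigree)                   # raw points
    ax.plot([0, 250], [0, 250], linestyle="dashed", color="black")
    ax.plot(x, slope * x + intercept, color="green")     # fitted line
    ax.text(25, 220, "Stability=%.6f" % slope)
    ax.text(25, 205, "Mean Yield=%.6f" % y.mean())
    ax.set_xlim(0, 250)                                  # same limits everywhere
    ax.set_ylim(0, 250)
    ax.set_ylabel("GrainYield")
    ax.legend(loc="lower right")

axes.ravel()[-1].set_xlabel("EnvironmentalIndex")        # label bottom plot only
```

If separate figures per ID are preferred over subplots, calling `set_xlim` and `set_ylim` with the same values on each figure's axes gives the same standardization.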
In the following code I want to plot scatter points after comparing two numpy arrays a and b. When a value in a is low, the corresponding value in b should get a bright color. E.g. when a is zero, the value 2 (in b) should get a bright color on the final graph. I have never plotted data with colors after such a comparison. How can I do that?
a = np.array([6,2,7,0,1])
b= np.array([-3,-2,0,2,3])
c=np.array([1/3,1/3,1/3,1/3,1/3])
print("lengths:",len(a),len(b),len(c))
fig=plt.figure()
ax= fig.add_subplot(111)
ax.scatter(b,c,marker='.')
ax.set_xlim(-3,3)
ax.set_ylim(-1/2,1/2)
plt.savefig("./Colormap")
You could create a color array using np.where and supply your conditions. Something like this:
low_thresh = 1
colors = np.where(a < low_thresh, 'r', 'g')
plt.scatter(b, c, c=colors)
If I understood you correctly, you could do it this way:
for el_a, el_b, el_c in zip(a,b,c):
    if el_a < el_b:
        ax.scatter(el_b, el_c , c='lightblue', marker='.')
    else:
        ax.scatter(el_b, el_c , c='blue', marker='.')
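Another option, sketched here under the assumption that "bright" can mean the bright end of a colormap, is to pass a itself as the color values and let matplotlib map them. The reversed colormap "viridis_r" is just one choice that puts low values at the bright (yellow) end.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

a = np.array([6, 2, 7, 0, 1])
b = np.array([-3, -2, 0, 2, 3])
c = np.full(len(b), 1 / 3)

fig, ax = plt.subplots()
# Colour each (b, c) point by the matching value in a;
# "viridis_r" maps low values to bright yellow and high values to dark purple.
sc = ax.scatter(b, c, c=a, cmap="viridis_r")
fig.colorbar(sc, ax=ax, label="value in a")
ax.set_xlim(-3.5, 3.5)
ax.set_ylim(-0.5, 0.5)

# The point that receives the brightest colour is the one where a is smallest
brightest_b = b[np.argmin(a)]
```

With these arrays the smallest value in a is 0, so the point at b = 2 gets the brightest color, which matches the example in the question.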
I have a 4D plot plotted using matplotlib with the 4th dimension being color.
Let's assume the range of all 3 axes is 0 to 5. The array for the plot looks something like this: [0,1,2,50],[1,2,3,40],[5,5,5,80]. So I will see 3 points at co-ordinates [0,1,2],[1,2,3],[5,5,5] in the graph when plotted (assuming a scatter plot).
My question is: is there any way to find the combinations of co-ordinates on the plot which do not have any points? In the above case, for example, there are no points at co-ordinates [1,2,4],[1,2,5],[0,0,0],[1,1,1],[4,1,2],[5,3,1] and so on. I can individually check whether there is an element (between 0 and 5) which is not covered on a particular axis (using 'is in'). But how do I get the missing combinations of co-ordinates and then plot them in the graph with some dummy value? Any leads or solution would be appreciated.
#create an empty list
list1 = []
#append elements to the list
list1.append([0,1,2,50])
list1.append([1,2,3,40])
list1.append([5,5,5,80])
using the scatter plot
cmap = LinearSegmentedColormap.from_list('mycmap', ['green', 'yellow', 'red'])
fig = plt.figure()
ax = fig.add_subplot(projection = '3d')
ax.set_xlim(0,5)
ax.set_ylim(0,5)
ax.set_zlim(0,5)
#convert list into array for slicing
array1 = np.array(list1)
ax.scatter(array1[:,0],array1[:,1],array1[:,2],c=array1[:,3], cmap = cmap)
plt.show()
For the above I need to add zeroes to the remaining co-ordinates.
When I plot, the co-ordinates (0,1,2) get a color corresponding to 50, (1,2,3) get a color corresponding to 40, and similarly for co-ordinates (5,5,5). So the remaining co-ordinates should be appended with zeroes, like (1,4,1,0), (1,3,5,0), (2,4,1,0) and so on.
This is how I would tackle this problem. Instead of checking where no coordinates are available, I'd initialize a matrix with the dummy value and then set all available coordinates to the values they are supposed to have. It looks like this:
myData = [[0,1,2,50], [1,2,3,40], [5,5,5,80]]
# 0 to 5 in each dimension; -1 is an example for the dummy value
fullMatrix = np.full((6,6,6), -1, dtype=np.int8)
for i in myData:
    fullMatrix[i[0],i[1],i[2]] = i[3]
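From such a matrix, the missing coordinates and a full set of points for the scatter plot can be recovered with plain numpy. The following is a sketch only; the names `DUMMY`, `full` and `missing` are made up here, and -1 is kept as the example dummy value (0 would work the same way as long as no real point carries it).

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.colors import LinearSegmentedColormap

my_data = np.array([[0, 1, 2, 50], [1, 2, 3, 40], [5, 5, 5, 80]])

DUMMY = -1                                        # example dummy value
full = np.full((6, 6, 6), DUMMY, dtype=np.int16)  # 0 to 5 in each dimension
full[my_data[:, 0], my_data[:, 1], my_data[:, 2]] = my_data[:, 3]

# Coordinates that have no real data point, one (x, y, z) row each
missing = np.argwhere(full == DUMMY)

# Flatten the grid into coordinate and value arrays for plotting every point
xs, ys, zs = np.indices(full.shape).reshape(3, -1)
values = full.ravel()

cmap = LinearSegmentedColormap.from_list("mycmap", ["green", "yellow", "red"])
fig = plt.figure()
ax = fig.add_subplot(projection="3d")
ax.scatter(xs, ys, zs, c=values, cmap=cmap)
ax.set_xlim(0, 5)
ax.set_ylim(0, 5)
ax.set_zlim(0, 5)
```

Here `missing` has one row per empty grid cell, so it can also be printed or saved on its own if only the list of uncovered combinations is needed.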
How can I change the data on one axis?
I'm doing some spectrum analysis on some data and my x-axis is the index of a matrix. I'd like to change it so that the x-axis shows the data itself.
I'm using imshow() to plot the data (I have a matrix whose elements are intensities, the y-axis is the corresponding detector-source pair and the x-axis should be the frequency).
The code for it is written down here:
def pltspec(dOD, self):
    idx = 0
    b = plt.psd(dOD[:,idx],Fs=self.fs,NFFT=512)
    B = np.zeros((2*len(self.Chan),len(b[0])))
    for idx in range(2*len(self.Chan)):
        b = plt.psd(dOD[:,idx],Fs=self.fs,NFFT=512)
        B[idx,:] = 20*log10(b[0])
    fig = plt.figure()
    ax = fig.add_subplot(111)
    plt.imshow(B, origin = 'lower')
    plt.colorbar()
    locs, labels = xticks(find(b[1]), b[1])
    plt.axis('tight')
    ax.xaxis.set_major_locator(MaxNLocator(5))
I think if there's a way of interchanging the index of some array with its value, my problem would be solved.
I've managed to use the line locs, labels = xticks(find(b[1]), b[1]). But with it on my graph my axis interval just isn't right... I think it has something to do with the MaxNLocator (which I used to decrease the number of ticks).
And if I use the xlim, I can set the figure to be what I want, but the x axis is still the same (on that xlim I had to use the original data to set it right).
What am I doing wrong?
Yes, you can use the xticks method exemplified in this example.
There are also more sophisticated ways of doing it. See ticker.
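As a rough illustration of the tick-label approach: choose a handful of column indices and label them with the frequency at each index. In this sketch `freqs` stands in for the frequency array returned by plt.psd (b[1] in the question), `B` is random placeholder data, and the sampling rate `fs` is an assumed value.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

fs = 10.0      # assumed sampling rate in Hz
nfft = 512
# Stand-in for b[1]: one-sided frequency bins from 0 to fs/2
freqs = np.fft.rfftfreq(nfft, d=1 / fs)

rng = np.random.default_rng(0)
B = rng.random((8, freqs.size))   # placeholder for the spectrum matrix

fig, ax = plt.subplots()
im = ax.imshow(B, origin="lower", aspect="auto")
fig.colorbar(im, ax=ax)

# Put 5 ticks at evenly spaced column indices and label them with frequencies
tick_idx = np.linspace(0, freqs.size - 1, 5).astype(int)
labels = ["%.2f" % f for f in freqs[tick_idx]]
ax.set_xticks(tick_idx)
ax.set_xticklabels(labels)
ax.set_xlabel("Frequency (Hz)")
```

Setting the tick positions and labels together avoids the mismatch that can appear when a locator such as MaxNLocator later moves the ticks while the fixed labels stay in their original order.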