I am writing a basic program using matplotlib that plots a large number of points and calculates a value used to colour each one. My issue is that as the number of points gets very large, plotting each point individually in a for loop becomes very slow. Is there any way I can use a single plot statement and pass a list of colours, one for each point? As an example,
Current method:
colours = [(1,0,0),(0,1,0),(0,1,1)]  # these lists usually contain thousands of items
x = [0,1,2]
y = [2,1,0]
for i in range(len(colours)):
    plot([x[i]],[y[i]],'o', color = colours[i])
Whereas what I would like to use would be something more like:
plot(x,y,'o', color=colours)
which would apply each colour to the corresponding point. Is there a better way to approach this than a for loop?
You do not want to use plot, but scatter, whose c argument accepts one colour per point.
import matplotlib.pyplot as plt
colours = [(1,0,0),(0,1,0),(0,1,1)]
x = [0,1,2]
y = [2,1,0]
plt.scatter(x,y, c=colours)
plt.show()
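If the colours come from a computed value, as the question mentions, scatter can also take the raw numbers and map them through a colormap. Here is a minimal sketch; the random data and the distance-from-origin value are purely illustrative and do not come from the question:

```python
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical data: 5000 random points (not from the original question).
rng = np.random.default_rng(0)
x = rng.random(5000)
y = rng.random(5000)

# One scalar per point; here simply the distance from the origin.
values = np.hypot(x, y)

fig, ax = plt.subplots()
# c receives the scalars and cmap converts them to colours in one call.
points = ax.scatter(x, y, c=values, cmap="viridis", s=5)
# The colorbar shows which colour corresponds to which value.
fig.colorbar(points, ax=ax, label="value")
```

Call plt.show() afterwards to display the figure.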
I have a list of values, to which I am applying a function.
I want to be able to plot the results of each iteration separately on a scatterplot.
To complicate things somewhat, the results list is not the same length for each iteration.
I've tried playing around with colourmap, but it's not even printing a blank chart.
import matplotlib as mpl
import matplotlib.pyplot as plt
import numpy as np
import matplotlib.cm as cm
cmap = cm.get_cmap('Set1')
def scatter_plot(list):
    x = []
    y = []
    for i in list:
        x.append(i[0])
        y.append(i[1])
        c = cmap(i[2])
    plt.figure(figsize=(8,8))
    plt.scatter(x,y, color=c)
    plt.show()
In the function funky_function I have:
return(my_list, a_value)
my_list contains the x and y values for the plot, and a_value is the value that should determine the colour, so that each different result gets a separate colour. The scatter_plot function is picking out the x and y fine for a single value.
To produce the results:
pointlist = funky_function(a_value)
value_list = [1,2,3,4]
for a_value in value_list:
    funky_function(a_value)
    scatter_plot(pointlist)
It's printing the results fine, but not plotting them. I want it to be able to just add new results to the plot if I add new items to the value list, hence trying to set the colour to be a dynamic input rather than plot1=color1, plot2=color2.
I had a look at Add colour scale to plot as 3rd variable, but I need the colour to match to a specific item in the list. (I agree with that poster that the info available on colormap isn't very clear.)
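One possible approach, sketched under assumptions: create the figure once, call scatter once per value inside the loop, and pick the colour by the value's position in the list, so new values get a colour automatically. Here make_points is a hypothetical stand-in for funky_function (whose body is not shown above) that returns a different number of points for each value:

```python
import matplotlib.pyplot as plt


def make_points(a_value):
    """Hypothetical stand-in for funky_function: returns a_value + 2 (x, y) pairs."""
    return [(i, a_value * i) for i in range(a_value + 2)]


value_list = [1, 2, 3, 4]
cmap = plt.get_cmap("Set1")  # qualitative colormap with 9 distinct colours

# Create the figure once, outside the loop, so all results share one plot.
fig, ax = plt.subplots(figsize=(8, 8))
for index, a_value in enumerate(value_list):
    points = make_points(a_value)
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    # An integer argument selects the colour at that position in the colormap.
    colour = cmap(index % cmap.N)
    ax.scatter(xs, ys, color=[colour], label=f"value {a_value}")
ax.legend()
```

Call plt.show() once after the loop. Because the colour depends only on the position in value_list, the result lists may have different lengths.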
I've generated a lot of numbers using Python, saving them in a list, and now I want to plot them in a scatter graph and a bar graph to see the patterns of the function that I created.
The numbers range from 0 to 99,999,999 at most. I tried to plot them, but I failed.
In the bar graph, the y axis should show how many times a number repeated itself in the range of the numbers generated, and the x axis the number itself.
I tried to use collections.Counter, but it keeps returning a dict with all the numbers that appeared at least once in the list, instead of only the ones that repeated. With the data for the numbers that repeated, I think I could plot the graph properly.
(Image: the data that I get from the function)
What would you like to plot in the scatter graph? Matplotlib has a built-in histogram plotter.
import random
import matplotlib.pyplot as plt
random.seed(0)
ITER = 20
LOW = 0
HIGH = 10
RND_NUMS = []
for _ in range(ITER):
    RND_NUMS.append(random.randint(LOW, HIGH))
# randint includes HIGH, so use one bin per integer from LOW to HIGH inclusive
plt.hist(RND_NUMS, bins=range(LOW, HIGH + 2))
plt.show()
This produces a histogram showing how many times each number occurs.
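As for collections.Counter returning every number rather than only the repeated ones: the counts can be filtered afterwards. A minimal sketch, using a short made-up list in place of the generated numbers:

```python
from collections import Counter

# Hypothetical sample data standing in for the generated numbers.
numbers = [3, 7, 3, 1, 7, 3, 9]

# Counter maps every distinct number to how often it occurs.
counts = Counter(numbers)

# Keep only the numbers that occur more than once.
repeated = {number: count for number, count in counts.items() if count > 1}
print(repeated)  # {3: 3, 7: 2}
```

The keys and values of repeated can then be passed to plt.bar as the x positions and bar heights.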
I have two numpy arrays, x and y, with 7000 elements each. I want to make a scatter plot of them giving each point a different color depending on these conditions:
- BLACK if x[i]<10
- RED if x[i]>=10 and y[i]<=-0.5
- BLUE if x[i]>=10 and y[i]>-0.5
I tried creating a list of the same length as the data with the color I want to assign to each point and then plot the data with a loop, but it takes me a long time to run it. Here's my code:
import numpy as np
import matplotlib.pyplot as plt
#color list with same length as the data
col=[]
for i in range(0,len(x)):
    if x[i]<10:
        col.append('k')
    elif x[i]>=10 and y[i]<=-0.5:
        col.append('r')
    else:
        col.append('b')
#scatter plot
for i in range(len(x)):
    plt.scatter(x[i],y[i],c=col[i],s=5, linewidth=0)
#add horizontal line and invert y-axis
plt.gca().invert_yaxis()
plt.axhline(y=-0.5,linewidth=2,c='k')
Before that, I tried creating the same color list in the same way, but plotting the data without the loop:
#scatter plot
plt.scatter(x,y,c=col,s=5, linewidth=0)
Even though this plots the data much, much faster than the for loop, some of the scattered points appear with the wrong color. Why does plotting the data without a loop give some points an incorrect color?
I also tried defining three sets of data, one for each color, and adding them to the plot separately. But this is not what I am looking for.
Is there a way to pass scatter the list of colors I want to use for each point, so that I do not have to use the for loop?
PS: This is the plot I get when I don't use the for loop (wrong one):
And this one when I use the for loop (correct):
This can be done using numpy.where. Since I do not know your exact x and y values, I will have to use some fake data:
import numpy as np
import matplotlib.pyplot as plt
#generate some fake data
x = np.random.random(10000)*10
y = np.random.random(10000)*10
col = np.where(x<1,'k',np.where(y<5,'b','r'))
plt.scatter(x, y, c=col, s=5, linewidth=0)
plt.show()
This produces the plot below:
The line col = np.where(x<1,'k',np.where(y<5,'b','r')) is the important one. It produces an array of the same size as x and y, filled with 'k', 'b' or 'r' according to the conditions: where x is less than 1 the entry is 'k'; otherwise, where y is less than 5 it is 'b'; and where neither condition is met it is 'r'. This way, you do not have to use a loop to plot your graph.
For your specific data you will have to change the values in the conditions of np.where.
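For illustration, here is how the three conditions from the question might be written. This is only a sketch: the uniform random arrays are an assumption standing in for the real 7000-element x and y.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical data standing in for the asker's 7000-element arrays.
rng = np.random.default_rng(0)
x = rng.uniform(0, 20, 7000)
y = rng.uniform(-1, 0, 7000)

# Black where x < 10; otherwise red where y <= -0.5 and blue where y > -0.5.
col = np.where(x < 10, 'k', np.where(y <= -0.5, 'r', 'b'))

fig, ax = plt.subplots()
ax.scatter(x, y, c=col, s=5, linewidth=0)  # one call, one colour per point
# Same decorations as in the question: inverted y-axis and a line at y = -0.5.
ax.invert_yaxis()
ax.axhline(y=-0.5, linewidth=2, c='k')
```

The inner np.where is only consulted where x >= 10, so the three cases match the BLACK, RED and BLUE rules in the question.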
I have a pair of lists of numbers representing points in a 2-D space, and I want to represent the y/x ratios for these points as a 1-dimensional heatmap, with a diverging color map centered around 1, or the logs of my ratios, with a diverging color map centered around 0.
How do I do that?
My current attempt (borrowing somewhat from Heatmap in matplotlib with pcolor?):
import numpy as np
import matplotlib.pyplot as plt
# There must be a better way to generate arrays of random values
x_values = [np.random.random() for _ in range(10)]
y_values = [np.random.random() for _ in range(10)]
labels = list("abcdefghij")
ratios = np.asarray(y_values) / np.asarray(x_values)
axis = plt.gca()
# I transpose the array to get the points arranged vertically
heatmap = axis.pcolor(np.log2([ratios]).T, cmap=plt.cm.PuOr)
# Put labels left of the colour cells
axis.set_yticks(np.arange(len(labels)) + 0.5, minor=False)
# (Not sure I get the label order correct...)
axis.set_yticklabels(labels)
# I don't want ticks on the x-axis: this has no meaning here
axis.set_xticks([])
plt.show()
Some points I'm not satisfied with:
The coloured cells I obtain are horizontally-elongated rectangles. I would like to control the width of these cells and obtain a column of cells.
I would like to add a legend for the color map. heatmap.colorbar = plt.colorbar() fails with RuntimeError: No mappable was found to use for colorbar creation. First define a mappable such as an image (with imshow) or a contour set (with contourf).
One important point:
matplotlib/pyplot always leaves me confused: there seem to be a lot of ways to do things and I get lost in the documentation. I never know what would be the "clean" way to do what I want: I welcome suggestions of reading material that would help me clarify my very approximate understanding of these things.
Just 2 more lines:
axis.set_aspect('equal') # X scale matches Y scale
plt.colorbar(mappable=heatmap) # Tells plt where it should find the color info.
I can't answer your final question very well. Part of the confusion arises because matplotlib has two ways of doing things: the axis way (axis.do_something...) and the MATLAB clone way (plt.some_plot_method). Unfortunately we can't change that, and the MATLAB-style interface is a good feature for people migrating to matplotlib. As far as the "clean" way is concerned, I prefer to use whatever produces the shorter code. I guess that is in line with Python's mottos: "Simple is better than complex" and "Readability counts".
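Putting the pieces together, a complete version might look like the sketch below. The symmetric vmin/vmax limits are my own addition (not part of the two-line fix) to centre the diverging colormap on 0, and the random data is only a placeholder:

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical random data, generated as arrays in one call each.
rng = np.random.default_rng(0)
x_values = rng.random(10)
y_values = rng.random(10)
labels = list("abcdefghij")

log_ratios = np.log2(y_values / x_values)
# Symmetric colour limits put 0 at the centre of the diverging colormap.
limit = np.abs(log_ratios).max()

fig, ax = plt.subplots()
# Reshape to a single column so the cells are stacked vertically.
heatmap = ax.pcolor(log_ratios.reshape(-1, 1), cmap="PuOr",
                    vmin=-limit, vmax=limit)
ax.set_aspect("equal")  # square cells
ax.set_yticks(np.arange(len(labels)) + 0.5)
ax.set_yticklabels(labels)
ax.set_xticks([])  # the x-axis has no meaning here
# Pass the mappable explicitly so the colorbar knows where the colours come from.
fig.colorbar(heatmap, ax=ax, label="log2(y / x)")
```

Call plt.show() to display it. This sticks to the axis way throughout, which is one way to avoid mixing the two styles.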
I have a list of numbers. The list looks like [0,0,1,0,1 .... ]. At present it contains only binary digits, but later on it may contain decimal values as well. I want to plot a histogram of the sequence in this list.
When I use the standard hist function of the matplotlib library, I get only two bars: it counts all the zeros and all the ones and shows me a histogram with two bars. But I want to plot it in a different way.
I want the number of bars = length of the list
and
height of each bar = value in the list at (position = bar #).
Here is the code:
def plot_histogram(self,li_input,):
    binseq = numpy.arange(len(li_input))
    tupl = matplotlib.pyplot.hist(li_input,bins=binseq)
    matplotlib.pyplot.show()
li_input is the list discussed above.
I can do it in a nasty way, like this:
li_input_mod = []
for x in range(len(li_input)):
    li_input_mod += [x]*li_input[x]
and then plot it, but I want something better.
The behavior you describe is the way a histogram works; it shows you the distribution of values. It sounds to me like you want to create a bar chart:
import matplotlib.pyplot as plt
x = [0,0,1,0,1,1,0,1,1,0,0,0,1]
plt.bar(range(len(x)), x, align='center')
which would produce: