Matplotlib cmap only showing grey - python

I'm plotting with matplotlib in Python using the 'tab20' colormap with the following code:
colors=[str(float(year-1980)/(2017-1980)) for i in years];
fig,ax = plt.subplots()
ax.scatter(Topic[:,0],Topic[:,1],c=colors,cmap='tab20')
but the plot I get is completely grey. What could be the reason?

By passing a list of strings to c in your ax.scatter call, you're telling matplotlib to treat them as color format strings. Since the strings look like they represent floats, it treats them as grayscale values. If you pass a list of floats instead, it should use the colormap correctly:
colors = [float(year - 1980) / (2017 - 1980) for year in years]
See the docs for more details, in particular:
cmap : Colormap, optional, default: None
A Colormap instance or registered name. cmap is only used if c is an
array of floats.
(Also, you don't need the ; after your first line.)
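A minimal sketch of the float-based approach, assuming years holds integer years between 1980 and 2017 and Topic is an (N, 2) array. The data below is made up purely for illustration:

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical stand-ins for the question's data
rng = np.random.default_rng(0)
years = rng.integers(1980, 2018, size=50)
Topic = rng.random((50, 2))

# Floats in [0, 1] rather than strings, so that cmap is actually applied
colors = (years - 1980) / (2017 - 1980)

fig, ax = plt.subplots()
points = ax.scatter(Topic[:, 0], Topic[:, 1], c=colors, cmap='tab20')
fig.colorbar(points, ax=ax, label='normalized year')
```

Calling plt.show() afterwards would display the figure with one colormap color per point.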

Solved by using
colors=[cm.RdYlBu(float(year-1980)/(2017-1980)) for year in years]
because I needed to convert each float to a color from the colormap I wanted to use.
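For reference, calling a colormap object with a float in [0, 1] returns an RGBA tuple, which is why explicit colors work without the cmap argument. A small sketch, assuming a made-up list of years and using plt.get_cmap to look the colormap up by name:

```python
import matplotlib.pyplot as plt

years = [1980, 1995, 2017]  # hypothetical sample
cmap = plt.get_cmap('RdYlBu')

# Each entry is an (r, g, b, a) tuple of floats in [0, 1]
colors = [cmap((year - 1980) / (2017 - 1980)) for year in years]
```

A list like this can be passed as c to ax.scatter, since each entry is already a full color specification.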

Related

Countplot error: min() arg is an empty sequence

I have a problem where sns.countplot won't work. I got the name of the most popular color in each year, and with that I'm trying to plot a countplot that shows the count of each of those colors. Something like .value_counts() but in a graph.
Here is the code that I've written:
most_popular_color = df_merged_full.groupby('year')[['name_cr_invp_inv']].agg({lambda color_name: color_name.value_counts().idxmax()}).reset_index()
and it returns this (example not full file):
Now when I try to do the countplot:
sns.countplot(most_popular_color['name_cr_invp_inv'],
palette={color: color for color in most_popular_color['name_cr_invp_inv'].drop_duplicates()})
it returns an error: min() arg is an empty sequence.
Where is the problem? I can't find it.
From the question, it looks like you are trying to plot the number of entries with each color and map the color to the bar. For this, you just need to provide a dictionary with mapping of each color to the column value (which will be the same in this case) and use that as the palette. I have used the data you provided above and created this. As white is one of the colors, I have added a border so that you can see the bar. Hope this is what you are looking for...
## Create dictionary with mapping of colors to the various unique entries in data
cmap = dict(zip(df_merged_full.name_cr_invp_inv.unique(), df_merged_full.name_cr_invp_inv.unique()))
fig, ax = plt.subplots() ## To add border, we will need ax
ax = sns.countplot(x=df_merged_full.name_cr_invp_inv, palette=cmap) ## Plot with palette=cmap
plt.setp(ax.patches, linewidth=1, edgecolor='black') ## Add border

Matplotlib 'cmap' vs 'c' issue

I want to plot some impedance values; the task and the code are both simple. xhertz_df is a pandas dataframe, and after conversion to a numpy array xhertz[0] is the real part, xhertz[1] the imaginary part, and xhertz[3] represents the time between measurements.
def xhertz_plot(xhertz_df):
    ax = plt.gca()
    xhertz = xhertz_df.T.to_numpy()
    ax.plot(xhertz[3], xhertz[0], 'green')
    ax.plot(xhertz[3], xhertz[1], 'blue')
    ax.scatter(xhertz[3], xhertz[0], cmap ='green')
    ax.scatter(xhertz[3], xhertz[1], cmap ='blue')
    ax.set_xlabel('Time Passed (in Minutes)')
    plt.show()
I'm confused as to what can go wrong with this code as it seems so simple. Yet I get this result:
The upper line and points are a mix of blue and green even though they should be just green. The lower line, which should be only blue, has orange (?!) points. What is going on here?
Edit:
I found the problem: I used cmap instead of just c for the scatter plot. But to someone with expertise in both concepts: Why did I get the result shown above? E.g. where did the orange come from?
As stated in the docs for Axes.scatter:
A Colormap instance or registered colormap name. cmap is only used if c is an array of floats.
Since you did not provide a list of floats for the arg c, matplotlib ignored your cmap and instead used the first and second default colors (blue, then orange).
If you just want a single color, note the docs for the c argument:
If you wish to specify a single color for all points prefer the color keyword argument.
Alternatively, you can just use Axes.plot with o for the marker style, instead of scatter, e.g. ax.plot(x, y, 'o', color='green') or equivalently ax.plot(x, y, 'og'). This is more typical for simple plots; you can use - or o to explicitly set a line plot or marker plot.
Note that cmap is generally intended to be used if you want a different color for each point, like if you wanted to color the points to represent another dimension of data. In that case c would represent that third dimension of data, norm would scale the data, and cmap would be what colors are mapped to that data. See the scatter demo 2 from matplotlib for an example of how that argument is usually used.
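A rough sketch of the two cases described above, with made-up data: one fixed color for all points via the color keyword, and one color per point via c, norm, and cmap. The variable names and the value range 0 to 100 are assumptions for illustration only:

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import Normalize

# Made-up data: x, y, and a third quantity z to encode as color
rng = np.random.default_rng(1)
x = rng.random(30)
y = rng.random(30)
z = rng.uniform(0, 100, size=30)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 3))

# One color for every point: use the color keyword, no cmap involved
single = ax1.scatter(x, y, color='green')

# One color per point: c holds the data, norm scales it, cmap maps it to colors
mapped = ax2.scatter(x, y, c=z, norm=Normalize(vmin=0, vmax=100), cmap='viridis')
fig.colorbar(mapped, ax=ax2, label='z')
```

In the first call there is no data array attached to the points, so a cmap would have nothing to map.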

What do c and s mean as parameters to matplotlib's plot function?

I have the following code from a Jupyter notebook:
housing.plot(kind="scatter", x="longitude", y="latitude",
             s=housing["population"]/100, alpha=0.4, label="population", figsize=(10,7),
             c="median_house_value", cmap=plt.get_cmap("jet"), colorbar=True,
             sharex=False)
I can't seem to find what is meant by the parameters s and c anywhere in the documentation. Can someone please explain?
housing.plot with kind='scatter' is a pandas function which passes most of its parameters to matplotlib's scatter plot. When a parameter is given as a string (e.g. "median_house_value"), pandas interprets this string as a pandas column name, and the values of that column are passed to matplotlib.
So, c="median_house_value" gives the values of that column as a list to the c= parameter of matplotlib's scatter, which sets the marker colors. When c is a list of numbers, matplotlib first normalizes the values to the range 0 to 1 and then looks each one up in its colormap.
The s=housing["population"]/100 gives a list of each value of the "population" column divided by 100 to matplotlib's s= parameter. This defines the size of the markers, where the size is interpreted as the area of the marker, not its diameter.
Note the awkward **kwargs in the documentation. These are additional keyword parameters that are passed on to deeper functions, e.g. to the function that plots lines.
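To see what ends up in matplotlib, here is a small sketch with a made-up four-row frame standing in for the housing data. The values are invented; only the column names follow the question:

```python
import pandas as pd
import matplotlib.pyplot as plt

# Small made-up frame standing in for the housing data
housing = pd.DataFrame({
    "longitude": [-122.2, -121.9, -118.3, -117.1],
    "latitude": [37.8, 37.3, 34.1, 32.7],
    "population": [1200, 5000, 800, 3000],
    "median_house_value": [450000, 380000, 520000, 300000],
})

ax = housing.plot(
    kind="scatter", x="longitude", y="latitude",
    s=housing["population"] / 100,   # marker areas
    c="median_house_value",          # column name; its values are color-mapped
    cmap=plt.get_cmap("jet"), colorbar=True, alpha=0.4,
)

# The scatter points are stored as the first collection on the axes
points = ax.collections[0]
```

Inspecting points.get_sizes() and points.get_array() shows the per-marker sizes and the values that were mapped through the colormap.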

How to get color of most recent plotted line in pandas df.plot()

I would like to get the color of my last plot:
ax = df.plot()
df2.plot(ax=ax)
# how to get the color of this last plot,
#the plot is a single timeseries, there is therefore a single color.
I know how to do it in matplotlib.pyplot, for those interested see for instance here but I can't find a way to do it in pandas. Is there something acting like get_color() in pandas?
You cannot do the same with DataFrame.plot because it doesn't return a list of Line2D objects as pyplot.plot does. But ax.get_lines() will return a list of the lines plotted in the axes so you can look at the color of the last plotted line:
ax.get_lines()[-1].get_color()
Notice (don't know if it was implicit in the answer by Goyo) that calls to pandas objects' .plot() precisely return the ax you're looking for, as in:
plt1 = pd.Series(range(2)).plot()
color = plt1.lines[-1].get_color()
pd.Series(range(2, 4)).plot(color=color)
This is not much nicer, but might allow you to avoid importing matplotlib explicitly.
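Putting the ax.get_lines() approach together with the question's two-frame setup, a minimal sketch with made-up single-column frames:

```python
import pandas as pd
import matplotlib.pyplot as plt

# Two made-up single-column frames
df = pd.DataFrame({"a": [1, 2, 3]})
df2 = pd.DataFrame({"b": [3, 2, 1]})

ax = df.plot()
df2.plot(ax=ax)

# Lines are stored in plotting order, so the last entry belongs to df2
last_color = ax.get_lines()[-1].get_color()
```

The value in last_color can then be passed as color= to a later plot call to reuse it.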

Using a Colormap with lineplot in matplotlib

I'm very new to python and matplotlib, and I want to create a plot with different colored lines. I know I have to use a colormap, but I'm not sure how. So I have a for loop:
for i in range(len(params)):
    centers,fN = graph.createHistogram(values = NHI[i])
    for j in range(len(centers)):
        if params[i]!=fidVal:
            vals[j] = (np.log10(origfNHI[j]/fN[j]))/(fidVal-params[i])
    plt.plot(centers,vals)
I want to give each line different colors based on the difference between the value of params[i] and fidVal. If fidVal - params[i] is a negative large number, I want the line to be very red, and if it is a negative small number, I want it to be not as red. Similarly if fidVal - params[i] is positive, I want it to be blue based on that value. Finally, I want the colors to be mapped on a colorbar which would be displayed on the plot.
Alternatively, is there a way I can specify the rgb color of a line when I use plt.plot()? Like, could I say plt.plot(centers,vals,Color(0,0,0))?
What code should I use to solve this problem?
I will answer about the colormap. You can use the color keyword argument to specify an RGB color as a tuple. It's well explained in the documentation.
"In addition, you can specify colors in many weird and wonderful ways, including full names ('green'), hex strings ('#008000'), RGB or RGBA tuples ((0,1,0,1)) or grayscale intensities as a string ('0.8'). Of these, the string specifications can be used in place of a fmt group, but the tuple forms can be used only as kwargs."
Here you have a very simple example:
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(0,1,1000)
n=50
for i in range(1,n):
    y = i/float(n)*x**2
    plt.plot(x,y,color=(i/float(n),(i/float(n))**4,(i/float(n))**2))
ax = plt.gca()
ax.xaxis.set_visible(False)
ax.yaxis.set_visible(False)
plt.show()
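For the colormap and colorbar part of the question, one possible sketch is to normalize fidVal - params[i] symmetrically around zero, look each line's color up in a diverging colormap, and build the colorbar from a ScalarMappable. The params values and the plotted curves below are placeholders, not the asker's data:

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.cm import ScalarMappable
from matplotlib.colors import Normalize

# Hypothetical stand-ins for the question's params and fidVal
params = np.array([0.2, 0.6, 1.0, 1.4, 1.8])
fidVal = 1.0
diffs = fidVal - params

# Symmetric limits so that a zero difference sits at the middle of the colormap
limit = np.abs(diffs).max()
norm = Normalize(vmin=-limit, vmax=limit)
cmap = plt.get_cmap('RdBu')  # red for negative values, blue for positive

x = np.linspace(0, 1, 100)
fig, ax = plt.subplots()
for p, d in zip(params, diffs):
    # Placeholder curve; the color is looked up from the normalized difference
    ax.plot(x, p * x**2, color=cmap(norm(d)))

# A ScalarMappable carries norm and cmap so the colorbar has something to draw
sm = ScalarMappable(norm=norm, cmap=cmap)
sm.set_array([])  # needed by some older matplotlib versions
fig.colorbar(sm, ax=ax, label='fidVal - params[i]')
```

Using the same norm for both the line colors and the colorbar keeps the two consistent.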
