I saw a post on assigning the same colors across multiple pie plots in Matplotlib.
But there's something I don't understand about indexing the axis object.
Here's the code:
import numpy as np
import matplotlib.pyplot as plt
def mypie(slices,labels,colors):
    # map each label to a color so a given label always gets the same color
    colordict={}
    for l,c in zip(labels,colors):
        print(l,c)
        colordict[l]=c
    fig = plt.figure(figsize=[10, 10])
    ax = fig.add_subplot(111)
    pie_wedge_collection = ax.pie(slices, labels=labels, labeldistance=1.05)#, autopct=make_autopct(slices))
    for pie_wedge in pie_wedge_collection[0]:
        pie_wedge.set_edgecolor('white')
        pie_wedge.set_facecolor(colordict[pie_wedge.get_label()])
    titlestring = 'Issues'
    ax.set_title(titlestring)
    return fig,ax,pie_wedge_collection
slices = [37, 39, 39, 38, 62, 21, 15, 9, 6, 7, 6, 5, 4, 3]
cmap = plt.cm.prism
colors = cmap(np.linspace(0., 1., len(slices)))
labels = [u'TI', u'Con', u'FR', u'TraI', u'Bug', u'Data', u'Int', u'KB', u'Other', u'Dep', u'PW', u'Uns', u'Perf', u'Dep']
fig,ax,pie_wedge_collection = mypie(slices,labels,colors)
plt.show()
In the line for pie_wedge in pie_wedge_collection[0], what does the index [0] do? The code doesn't work if I leave it out or use pie_wedge_collection[1] instead.
Doesn't the ax object only have one plot here? I don't understand what the index is doing.
According to the Matplotlib documentation, pie() returns two or three lists:
A list of matplotlib.patches.Wedge
A list of matplotlib.text.Text labels
(conditionally) A list of matplotlib.text.Text data labels
Your code needs to manipulate the edge and face colors of the Wedge objects returned by pie(), which are in the first list (zero index) in the return value, pie_wedge_collection.
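To make the return value more concrete, here is a minimal sketch that unpacks it instead of indexing it. The data and label names are made up for illustration, and the Agg backend line is only there so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in an interactive session
import matplotlib.pyplot as plt

values = [3, 2, 1]          # made-up data
names = ["a", "b", "c"]     # made-up labels

# Without autopct, pie() returns two lists: the wedges and the label texts.
fig, ax = plt.subplots()
wedges, texts = ax.pie(values, labels=names)

# With autopct, a third list holds the percentage texts drawn on the wedges.
fig2, ax2 = plt.subplots()
wedges2, texts2, autotexts2 = ax2.pie(values, labels=names, autopct="%1.0f%%")

# The wedges are the patches whose colors the loop in the question changes.
for wedge in wedges:
    wedge.set_edgecolor("white")
```

Unpacking into named variables does the same job as pie_wedge_collection[0] and makes it clearer which list you are looping over.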
Context: I'd like to plot multiple subplots (separated by legend entry) based on patterns in the column names of a dataframe. However, I'm not able to split each subplot into another set of subplots.
This is what I have:
import matplotlib.pyplot as plt
col_patterns = ['pattern1','pattern2']
# define subplot grid
fig, axs = plt.subplots(nrows=len(col_patterns), ncols=1, figsize=(30, 80))
plt.subplots_adjust()
fig.suptitle("Title", fontsize=18, y=0.95)
for col_pat,ax in zip(col_patterns,axs.ravel()):
    col_pat_columns = [col for col in df.columns if col_pat in col]
    df[col_pat_columns].plot(x='Week',ax=ax)
    # chart formatting
    ax.set_title(col_pat.upper())
    ax.set_xlabel("")
Which results in something like this:
How could I make it so that each one of those subplots turns into another 6 subplots, all laid out horizontally? (i.e. each legend entry would get its own subplot)
Thank you!
In your example, you're defining a 2x1 subplot and only looping through two axes objects that get created. In each of the two loops, when you call df[col_pat_columns].plot(x='Week',ax=ax), since col_pat_columns is a list and you're passing it to df, you're just plotting multiple columns from your dataframe. That's why it's multiple series on a single plot.
@fdireito is correct: you just need to set the ncols argument of plt.subplots() to the number of columns you need, and adjust your loops to match.
If you want to stay in matplotlib, then here's a basic example. I had to take some guesses as to how your dataframe was structured and so on.
# import matplotlib and pandas
import matplotlib.pyplot as plt
import pandas as pd
# create some fake data
x = [1, 2, 3, 4, 5]
df = pd.DataFrame({
    'a':[1, 1, 1, 1, 1], # horizontal line
    'b':[3, 6, 9, 6, 3], # pyramid
    'c':[4, 8, 12, 16, 20], # steep line
    'd':[1, 10, 3, 13, 5] # zig-zag
})
# a list of lists, where each inner list is a set of
# columns we want in the same row of subplots
col_patterns = [['a', 'b', 'c'], ['b', 'c', 'd']]
The following is a simplified example of what your code ends up doing.
fig, axes = plt.subplots(len(col_patterns), 1)
for pat, ax in zip(col_patterns, axes):
    ax.plot(x, df[pat])
2x1 subplot (what you have right now)
I use enumerate() with col_patterns to iterate through the subplot rows, and then use enumerate() with each column name in a given pattern to iterate through the subplot columns.
# the following will size your subplots according to
# - number of different column patterns you want matched (rows)
# - largest number of columns in a given column pattern (columns)
subplot_rows = len(col_patterns)
subplot_cols = max(len(pat) for pat in col_patterns)
fig, axes = plt.subplots(subplot_rows, subplot_cols)
for nrow, pat in enumerate(col_patterns):
    for ncol, col in enumerate(pat):
        axes[nrow][ncol].plot(x, df[col])
Correctly sized subplot
Here's all the code, with a couple additions I omitted from the code above for simplicity's sake.
import matplotlib.pyplot as plt
import pandas as pd
x = [1, 2, 3, 4, 5]
df = pd.DataFrame({
    'a':[1, 1, 1, 1, 1], # horizontal line
    'b':[3, 6, 9, 6, 3], # pyramid
    'c':[4, 8, 12, 16, 20], # steep line
    'd':[1, 10, 3, 13, 5] # zig-zag
})
col_patterns = [['a', 'b', 'c'], ['b', 'c', 'd']]
# what you have now
fig, axes = plt.subplots(len(col_patterns), 1, figsize=(12, 8))
for pat, ax in zip(col_patterns, axes):
    ax.plot(x, df[pat])
    ax.legend(pat, loc='upper left')
# what I think you want
subplot_rows = len(col_patterns)
subplot_cols = max(len(pat) for pat in col_patterns)
fig, axes = plt.subplots(subplot_rows, subplot_cols, figsize=(16, 8), sharex=True, sharey=True, tight_layout=True)
for nrow, pat in enumerate(col_patterns):
    for ncol, col in enumerate(pat):
        axes[nrow][ncol].plot(x, df[col], label=col)
        axes[nrow][ncol].legend(loc='upper left')
Another option you can consider is ditching matplotlib and using Seaborn relplots. There are several examples on that page that should help. If you have your dataframe set up correctly (long or "tidy" format), then to achieve the same as above, your one-liner would look something like this:
# import seaborn as sns
sns.relplot(data=df, kind='line', x=x_vals, y=y_vals, row=col_pattern, col=num_weeks_rolling)
I have a problem with seaborn's lineplot and scatterplot. Basically, I'm trying to connect the dots of a scatterplot with a line that joins the closest mapped points. Somehow the lineplot changes width when it encounters points with the same x-axis value. I want the lineplot to be the same solid line all the way.
The code:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
data = {'X': [13, 13, 13, 12, 11], 'Y':[14, 11, 13, 15, 20], 'NumberOfPlanets':[2, 5, 2, 1, 2]}
cts = pd.DataFrame(data=data)
plt.figure(figsize=(10,10))
sns.scatterplot(data=cts, x='X', y='Y', size='NumberOfPlanets', sizes=(50,500), legend=False)
sns.lineplot(data=cts, x='X', y='Y',estimator='max', color='red')
plt.show()
The outcome:
Any ideas?
EDIT:
If I try using pyplot it doesn't work either:
Code:
plt.plot(cts['X'], cts['Y'])
Outcome:
I need one line, which connects closest points (basically what is presented on image one but with same solid line).
OK, I finally figured it out. The lineplot was so messy because the data was not properly sorted. Once I sorted the dataframe by its 'Y' values, the outcome was satisfactory.
data = {'X': [13, 13, 13, 12, 11], 'Y':[14, 11, 13, 15, 20], 'NumberOfPlanets':[2, 5, 2, 1, 2]}
cts = pd.DataFrame(data=data)
cts = cts.sort_values('Y')
plt.figure(figsize=(10,10))
plt.scatter(cts['X'], cts['Y'], zorder=1)
plt.plot(cts['X'], cts['Y'], zorder=2)
plt.show()
Now it works. Tested it also on other similar scatter points. Everything is fine :)
Thanks!
I have a problem updating the limits of the y-axis.
My idea is to read a CSV file and plot some graphs.
When I set limits for the y-axis, they don't show up on the plot.
It always shows the values from the file.
I'm new to Python.
import matplotlib.pyplot as plt
import csv
import numpy as np
x = []
y = []
chamber_temperature = []
with open(r"C:\Users\mm02058\Documents\test.txt", 'r') as file:
    reader = csv.reader(file, delimiter = '\t')
    for row in (reader):
        x.append(row[0])
        chamber_temperature.append(row[1])
        y.append(row[10])
x.pop(0)
y.pop(0)
chamber_temperature.pop(0)
#print(chamber_temperature)
arr = np.array(chamber_temperature)
n_lines = len(arr)
time = np.arange(0,n_lines,1)
time_sec = time * 30
time_min = time_sec / 60
time_hour = time_min / 60
time_day = time_hour / 24
Fig_1 = plt.figure(figsize=(10,8), dpi=100)
plt.suptitle("Powered Thermal Cycle", fontsize=14, x=0.56, y= 0.91)
plt.subplot(311, xlim=(0, 30), ylim=(-45,90), xticks=(0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30), yticks=( -40, -30, -20, -10, 0, 10, 20, 30, 40, 50, 60, 70, 80, 90), ylabel=("Temperature [°C]"))
plt.plot(time_hour, chamber_temperature, 'k', label='Temperature')
plt.gca().invert_yaxis()
plt.grid()
plt.legend(shadow=True, fontsize=('small'), loc = 'center right', bbox_to_anchor=(1.13, 0.5))
plt.show()
Your code looks suspicious, because I cannot see a conversion from strings (which is what csv.reader produces) to floating point numbers.
Your plot also looks suspicious, because the y tick labels are not sorted!
I decided to check if, by chance, Matplotlib tries to be smarter than it should...
import numpy as np
import matplotlib.pyplot as plt
# let's plot an array of strings, as I suppose you did,
# and see if Matplotlib doesn't like it, or ...
np.random.seed(20210719)
arr_of_floats = 80+10*np.random.rand(10)
arr_of_strings = np.array(["x = %6.3f"%round(x, 2) for x in arr_of_floats])
plt.plot(range(10), arr_of_strings)
plt.show()
Now, let's see what happens if we perform the conversion to floats
# for you it's simply: array(chamber_temperature, dtype=float)
arr_of_floats = np.array([s[4:] for s in arr_of_strings], dtype=float)
plt.plot(range(10), arr_of_floats)
plt.show()
Finally, do not change the axes' limits (etc.) BEFORE plotting. Instead:
first, organize your figure if needed (figure size, subplots, etc.),
second, plot your data,
third, adjust the details of the graph, and
fourth and last, commit your work using plt.show().
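Put together, the conversion and the ordering could look roughly like the sketch below. The list of strings is a made-up stand-in for the column read from your file, and the 30-second sampling interval is taken from your question.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in an interactive session
import numpy as np
import matplotlib.pyplot as plt

# Made-up stand-in for the strings that csv.reader hands back.
chamber_temperature = ["25.0", "-40.0", "85.0", "-40.0", "25.0"]

# 1. organize the figure
fig, ax = plt.subplots(figsize=(10, 4))

# 2. convert the strings to floats, then plot
temps = np.array(chamber_temperature, dtype=float)
time_hour = np.arange(len(temps)) * 30 / 3600  # one sample every 30 s
ax.plot(time_hour, temps, 'k', label='Temperature')

# 3. adjust limits, ticks and labels after plotting
ax.set_ylim(-45, 90)
ax.set_yticks(range(-40, 91, 10))
ax.set_ylabel("Temperature [°C]")
ax.grid()
ax.legend()

# 4. call plt.show() here in an interactive session
```

With numeric data, the y-axis is a real number line, so the limits and ticks behave as expected.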
Use
plt.ylim([bottom limit, top limit]) #like plt.ylim(84,86)
before your
plt.show()
that should work!
You are assigning your x and y limits with an equals sign.
You need to call them as functions instead (no equals sign).
I want this to print out two histograms (of the first two columns), but this instead stacks the histograms within the same plot. How do I get it to output two separate histograms?
dataobj = pd.DataFrame([[1,2,3],[3,4,5],[6,7,8]])
for i in [0,1]:
    a = np.array(dataobj.iloc[:,i])
    plt.hist(a,bins = np.linspace(0,10,11))
Even better would be a solution where I can save the plots into an array which I could later call to display them.
Working in Jupyter
dataobj = pd.DataFrame([[1, 2, 3], [3, 4, 5], [6, 7, 8]])
# pass figsize here: changing rcParams after the figure exists does not resize it
fig, axes = plt.subplots(3, 1, figsize=(12, 12))
for i in range(3):
    a = np.array(dataobj.iloc[:, i])
    axes[i].hist(a, bins=np.linspace(0, 10, 11))
plt.show()
You need to use axes.
Just add plt.show() inside the for loop; there's no need for subplots and axes. Like this:
dataobj = pd.DataFrame([[1,2,3],[3,4,5],[6,7,8]])
for i in [0,1]:
    a = np.array(dataobj.iloc[:,i])
    plt.hist(a,bins = np.linspace(0,10,11))
    plt.show()
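For the second part of the question (keeping the plots around to display later), one possible approach is to create a separate figure per column and collect the figure objects in a list. This is only a sketch; the list name figures and the titles are made up.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in Jupyter
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

dataobj = pd.DataFrame([[1, 2, 3], [3, 4, 5], [6, 7, 8]])
bins = np.linspace(0, 10, 11)

figures = []  # one Figure object per column
for i in [0, 1]:
    fig, ax = plt.subplots()          # a new, independent figure each time
    ax.hist(dataobj.iloc[:, i], bins=bins)
    ax.set_title(f"Column {i}")
    figures.append(fig)
```

In Jupyter, putting figures[0] on the last line of a cell should render that histogram again.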
How do I bring the other line to the front or show both the graphs together?
plot_yield_df.plot(figsize=(20,20))
If the plotted data overlaps, one way to see both lines is to increase the linewidth of one of them and make it partly transparent, as shown:
plt.plot(np.arange(5), [5, 8, 6, 9, 4], label='Original', linewidth=5, alpha=0.5)
plt.plot(np.arange(5), [5, 8, 6, 9, 4], label='Predicted')
plt.legend()
Subplots are another good option.
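A minimal sketch of the subplot variant, reusing the same toy numbers as above (the variable names are made up):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in an interactive session
import numpy as np
import matplotlib.pyplot as plt

x = np.arange(5)
original = [5, 8, 6, 9, 4]
predicted = [5, 8, 6, 9, 4]  # identical on purpose, so the lines would overlap

# Two stacked axes sharing the x-axis, one line in each.
fig, (ax_top, ax_bottom) = plt.subplots(2, 1, sharex=True)
ax_top.plot(x, original, label='Original')
ax_bottom.plot(x, predicted, label='Predicted', color='tab:orange')
ax_top.legend()
ax_bottom.legend()
```

This trades a direct overlay for a guaranteed view of each series, which may or may not suit your comparison.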
Problem
The lines are plotted in the order their columns appear in the dataframe. So for example
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
a = np.random.rand(400)*0.9
b = np.random.rand(400)+1
a = np.c_[a,-a].flatten()
b = np.c_[b,-b].flatten()
df = pd.DataFrame({"A" : a, "B" : b})
df.plot()
plt.show()
Here the values of "B" hide those from "A".
Solution 1: Reverse column order
A solution is to reverse their order
df[df.columns[::-1]].plot()
That has also changed the order in the legend and the color coding.
Solution 2: Reverse z-order
So if that is not desired, you can instead play with the zorder.
ax = df.plot()
lines = ax.get_lines()
for line, j in zip(lines, list(range(len(lines)))[::-1]):
    line.set_zorder(j)