I am attempting to graph battery cycling data. Each cycle's worth of data points should be one line on the graph. At first the code I wrote simply treated the dataframe as one continuous series. I then inserted a for loop meant to plot one line for one cycle's worth of data and then move on to the next cycle, but currently it fails and does not show any graph. Debugging seems to show an issue once it loops past cycle 1. The cycles do not all have the same number of data points.
EDIT: I now suspect that, when looping, the headers of the data are causing an issue. I think making a dictionary would solve it.
df2 = pd.read_excel(r'C:\Users\####\- ##### - ####\2-7_7.xlsx',\
sheet_name='record', usecols="A:N")
df2['Capacity(mAh)'] = df2['Capacity(mAh)'].apply(lambda x: x*1000) #A fix for unit error in the data
df2.set_index('Cycle ID',inplace = True) #Set the index to the Cycle number
for cycle in df2.index:
    chosen_cyclex = df2.loc[cycle, 'Capacity(mAh)']
    chosen_cycley = df2.loc[cycle,'Voltage(V)']
    plt.plot(chosen_cyclex.iloc[1],chosen_cycley.iloc[1])
    #print(chosen_cyclex[1],chosen_cycley[1])
plt.show()
I ended up using this method, where the equivalents were selected.
for cycle in cyclearray:
    plt.plot(df2[df2.index == cycle]['Capacity(mAh)'],df2[df2.index == cycle]['Voltage(V)'],cycle)
For other battery testers who show up here: if you need to 'cut' the voltage curves, use
plt.xlim([xmin,xmax])
plt.ylim([ymin+0.1,ymax-0.1])
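The loop above relies on `cyclearray`, which is not defined in the post. Here is a minimal sketch, assuming it is meant to hold each cycle number exactly once (taken here from `df2.index.unique()`). The data is made up, with cycles of unequal length:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import pandas as pd

# Made-up data: cycle 1 has three points, cycle 2 has two.
df2 = pd.DataFrame({
    "Cycle ID": [1, 1, 1, 2, 2],
    "Capacity(mAh)": [0.0, 50.0, 100.0, 0.0, 90.0],
    "Voltage(V)": [4.2, 3.7, 3.0, 4.1, 3.1],
}).set_index("Cycle ID")

# Assumption: cyclearray is meant to contain each cycle number once.
cyclearray = df2.index.unique()

fig, ax = plt.subplots()
for cycle in cyclearray:
    # A list indexer always returns a DataFrame, even for a one-row cycle.
    one_cycle = df2.loc[[cycle]]
    ax.plot(one_cycle["Capacity(mAh)"], one_cycle["Voltage(V)"],
            label=f"cycle {cycle}")
ax.legend()
```

Iterating over `df2.index` directly visits every row, so the same cycle is selected once per data point, and `.iloc[1]` picks a single value rather than a whole curve.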
You need to specify an ax when plotting. Here are some examples:
# reproducible (but unimaginative) setup
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
n = 100
cycles = 4
df2 = pd.DataFrame({
    'ID': np.repeat(np.arange(cycles), n),
    'Capacity(mAh)': np.tile(np.arange(n), cycles),
    'Voltage(V)': (np.arange(n)**0.8 * np.linspace(5, 3, cycles)[:, None]).ravel(),
})
Example 1: using groupby.plot, then fiddle around to adjust labels
fig, ax = plt.subplots()
df2.groupby('ID').plot(x='Capacity(mAh)', y='Voltage(V)', ax=ax)
# now customize the labels
lines, labels = ax.get_legend_handles_labels()
for ith, line in zip('1st 2nd 3rd 4th'.split(), lines):
    line.set_label(f'{ith} discharge')
ax.legend()
Example 2: groupby used as an iterator
fig, ax = plt.subplots()
ld = {1: 'st', 2: 'nd', 3: 'rd'}
for cycle, g in df2.groupby('ID'):
    label = f'{cycle + 1}{ld.get(cycle + 1, "th")} discharge'
    g.plot(x='Capacity(mAh)', y='Voltage(V)', label=label, ax=ax)
Same plot as above.
Example 3: using ax.plot instead of df.plot or similar
fig, ax = plt.subplots()
ld = {1: 'st', 2: 'nd', 3: 'rd'}
for cycle, g in df2.groupby('ID'):
    label = f'{cycle + 1}{ld.get(cycle + 1, "th")} discharge'
    ax.plot(g['Capacity(mAh)'], g['Voltage(V)'], label=label)
ax.legend()
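The `ld` lookup in Examples 2 and 3 falls back to "th" for every number above 3, so a 21st cycle would be labelled "21th". For data with many cycles, a small helper could build the suffix instead. This is only a sketch; `ordinal` is a hypothetical name, not part of pandas or matplotlib:

```python
def ordinal(n: int) -> str:
    """Return n with its English ordinal suffix, e.g. 1 -> '1st'."""
    # 11th, 12th and 13th (and 111th, 112th, ...) are exceptions to the
    # last-digit rule, so handle the teens first.
    if 10 <= n % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"


# Build legend labels for the first few cycles (0-based IDs, as in the setup).
labels = [f"{ordinal(cycle + 1)} discharge" for cycle in range(4)]
```

Inside the loop, `label = f'{ordinal(cycle + 1)} discharge'` would then replace the `ld.get` expression.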
I have a large subplot-based figure to produce in Python using matplotlib. In total the figure has in excess of 500 individual plots, each with 1000s of data points. This can be plotted using a for-loop-based approach modelled on the minimal example given below:
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec
# define main plot names and subplot names
mains = ['A','B','C','D']
subs = list(range(9))
# generate mimic data in pd dataframe
col = [letter+str(number) for letter in mains for number in subs]
col.insert(0,'Time')
df = pd.DataFrame(columns=col)
for title in df.columns:
    df[title] = [i for i in range(100)]
# although alphabet and mains are the same in this minimal example this may not always be true
alphabet = ['A', 'B', 'C', 'D']
column_names = [column for column in df.columns if column != 'Time']
# define figure size and main gridshape
fig = plt.figure(figsize=(15, 15))
outer = gridspec.GridSpec(2, 2, wspace=0.2, hspace=0.2)
for i, letter in enumerate(alphabet):
    # define inner grid size and shape
    inner = gridspec.GridSpecFromSubplotSpec(3, 3,
                    subplot_spec=outer[i], wspace=0.1, hspace=0.1)
    # select only columns with correct letter
    plot_array = [col for col in column_names if col.startswith(letter)]
    # set title for each letter plot
    ax = plt.Subplot(fig, outer[i])
    ax.set_title(f'Letter {letter}')
    ax.axis('off')
    fig.add_subplot(ax)
    # create each subplot
    for j, col in enumerate(plot_array):
        ax = plt.Subplot(fig, inner[j])
        X = df['Time']
        Y = df[col]
        # plot waveform
        ax.plot(X, Y)
        # hide all axis ticks
        ax.axis('off')
        # set y_axis limits so all plots share same y_axis
        ax.set_ylim(df[column_names].min().min(),df[column_names].max().max())
        fig.add_subplot(ax)
However, this is slow, requiring minutes to plot the figure. Is there a more efficient (potentially for-loop-free) method to achieve the same result?
The issue with the loop is not the plotting but the setting of the axis limits with df[column_names].min().min() and df[column_names].max().max().
Testing with 6 main plots, 64 subplots and 375,000 data points, the plotting section of the example takes approx. 360 seconds to complete when the axis limits are set by searching df for the min and max values on every loop iteration. However, by moving the search for min and max outside the loops, e.g.
# set y_lims
y_upper = df[column_names].max().max()
y_lower = df[column_names].min().min()
and changing
ax.set_ylim(df[column_names].min().min(),df[column_names].max().max())
to
ax.set_ylim(y_lower,y_upper)
the plotting time is reduced to approx 24 seconds.
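As a rough sketch of the same idea on a much smaller, made-up frame (the column names and grid size here are assumptions, not the question's real data), the limits are computed once and then reused for every subplot:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import pandas as pd

# Small stand-in for the question's data: a Time column plus A0..B3.
column_names = [f"{letter}{n}" for letter in "AB" for n in range(4)]
df = pd.DataFrame(
    {name: list(range(k, k + 100)) for k, name in enumerate(column_names)}
)
df.insert(0, "Time", list(range(100)))

# Scan the whole frame once, before any plotting.
y_lower = df[column_names].min().min()
y_upper = df[column_names].max().max()

fig, axes = plt.subplots(2, 4, figsize=(8, 4))
for ax, name in zip(axes.flat, column_names):
    ax.plot(df["Time"], df[name])
    ax.axis("off")
    ax.set_ylim(y_lower, y_upper)  # reuse the precomputed limits
```

The saving comes from the fact that `df[column_names].min().min()` scans every data column each time it is called, so calling it inside the inner loop repeats that full scan once per subplot.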
My data & code are as below
import matplotlib.pyplot as plt
import seaborn as sns
w = [1,2,3,4,5,6,7,8,9,10]
vals = [[1,2,3,4,5,6,7,8,9,10],[2,4,6,8,8,8,8,8,7,1],[1,4,2,4,8,9,8,8,7,2]]
def plot_compare(*id_nums):
    fig = plt.figure(figsize=(10, 5))
    leg=[]
    for id_num in id_nums:
        rel = vals[id_num]
        sns.lineplot(x=w, y=rel)
        leg.append(id_num)
    fig.legend(labels=[leg],loc=5,);
plot_compare(0,2)
The idea was to get multiple line plots with just one function (in my actual data I have a lot of values that need to be plotted).
When I run the code as above, I get the plot as below.
Line plots are exactly as I want, but the legend is just one item instead of 2 items (since I have plotted 2 line graphs).
I have tried moving the legend line inside the for loop, but to no avail. I want as many legend entries as there are line plots.
Can anyone help?
You are passing the legend labels as a list of lists. Instead use fig.legend(labels=leg,loc=5)
Use:
import matplotlib.pyplot as plt
import seaborn as sns
w = [1,2,3,4,5,6,7,8,9,10]
vals = [[1,2,3,4,5,6,7,8,9,10],[2,4,6,8,8,8,8,8,7,1],[1,4,2,4,8,9,8,8,7,2]]
def plot_compare(*id_nums):
    fig = plt.figure(figsize=(10, 5))
    leg=[]
    for id_num in id_nums:
        rel = vals[id_num]
        sns.lineplot(x=w, y=rel)
        leg.append(id_num)
    fig.legend(labels=leg,loc=5)
    plt.show()
plot_compare(0,2)
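An alternative that avoids keeping a separate `leg` list is to label each line as it is drawn and let the axes build the legend. The sketch below swaps `sns.lineplot` for plain `ax.plot` so that it only needs matplotlib; `sns.lineplot` should accept the same `label` argument, but that is an assumption here rather than something taken from the answer:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt

w = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
vals = [
    [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    [2, 4, 6, 8, 8, 8, 8, 8, 7, 1],
    [1, 4, 2, 4, 8, 9, 8, 8, 7, 2],
]


def plot_compare(*id_nums):
    """Plot the selected rows of vals, one labelled line per id."""
    fig, ax = plt.subplots(figsize=(10, 5))
    for id_num in id_nums:
        # The label travels with the line, so the legend stays in sync.
        ax.plot(w, vals[id_num], label=str(id_num))
    ax.legend(loc="center right")
    return fig, ax


fig, ax = plot_compare(0, 2)
```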
The graphs that are output from two distinct nx.draw_networkx commands seem to overlap in the console. How does one properly separate different graphs?
This feels like a silly question, but I have yet to find any solution on the web.
def desenho(dados):
    edges,weights = zip(*nx.get_edge_attributes(dados,'weight').items())
    pos = nx.spring_layout(dados)
    print(nx.draw_networkx(dados, pos, node_color='purple', edgelist=edges, edge_color=weights, width=5.0, edge_cmap=plt.cm.jet), '\n')
graph_1 = desenho(data_1)
graph_2 = desenho(data_2)
I'd expect that each output would process and match with the empty string I put in to create some distance between them, but that isn't happening. What am I doing wrong here?
I'd also appreciate suggestions on how to make the color_map a bit less extreme in its gradient.
You need to create a subplot for each graph that you want to plot. For example:
import matplotlib.pyplot as plt
import networkx as nx
def desenho(dados):
    edges,weights = zip(*nx.get_edge_attributes(dados,'weight').items())
    pos = nx.spring_layout(dados)
    nx.draw_networkx(dados, pos, node_color='purple', edgelist=edges, edge_color=weights, width=5.0, edge_cmap=plt.cm.jet)
fig = plt.figure()
ax = fig.add_subplot(2,1,1) # 2,1,1 means: 2: two rows, 1: one column, 1: first plot
graph_1 = desenho(data_1)
ax2 = fig.add_subplot(2,1,2) # 2,1,2 means: 2: two rows, 1: one column, 2: second plot
graph_2 = desenho(data_2)
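The version above relies on each drawing landing on whichever subplot is current. A slightly more explicit sketch passes the target axes through the `ax` argument of `nx.draw_networkx`. The two small graphs, the `draw_weighted` name and the `seed` value are made up for illustration, and `plt.cm.viridis` is just one option for a gentler gradient than `jet`:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import networkx as nx


def draw_weighted(graph, ax, seed=0):
    """Draw graph on the given axes, colouring edges by their weight."""
    edges, weights = zip(*nx.get_edge_attributes(graph, "weight").items())
    pos = nx.spring_layout(graph, seed=seed)  # seed makes the layout repeatable
    nx.draw_networkx(
        graph, pos, ax=ax, node_color="purple",
        edgelist=list(edges), edge_color=list(weights),
        width=5.0, edge_cmap=plt.cm.viridis,
    )


# Made-up stand-ins for data_1 and data_2.
data_1 = nx.Graph()
data_1.add_weighted_edges_from([(1, 2, 0.5), (2, 3, 1.5), (3, 1, 1.0)])
data_2 = nx.Graph()
data_2.add_weighted_edges_from([("a", "b", 2.0), ("b", "c", 0.2)])

fig, (ax1, ax2) = plt.subplots(2, 1)  # two rows, one column
draw_weighted(data_1, ax1)
draw_weighted(data_2, ax2)
```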
hold = np.zeros((96,30))  # assumed: defined like secondHold, since it is filled below
secondHold = np.zeros((96,30))
channel = ['channel' for x in range(96)]
for i in range (96):
    BlankBinsx = bins[blankposition,0:30,i]
    StimBinsx = bins[NonBlankPositions,0:30,i]
    meanx = BlankBinsx.mean(axis=0);
    stimmeanx = StimBinsx.mean(axis=0);
    for j in range(30):
        hold[i][j] = meanx[j];
        secondHold[i][j] = stimmeanx[j];
    plt.subplots(1, 1, sharex='all', sharey='all')
    plt.plot(hold[i], label='stimulus')
    plt.plot(secondHold[i], label='Blank Stimulus')
    plt.title('Channel x')
    plt.xlabel('time (ms)')
    plt.ylabel('Avg Spike Rate')
    plt.legend()
    plt.show()
I am creating 96 different graphs through a for loop and I want it to also label the graphs (i.e., the first graph would be 'Channel 1', graph two 'Channel 2' and so on). I tried ax.set_title but couldn't figure out how to make it work with the string and numbers.
I'd also like the graphs to appear as a 6x16 grid of subplots instead of 96 graphs in a column.
You are creating a new figure each time in your for loop, which is why you get 96 figures. I don't have your data so I can't provide a final figure, but the following should work for you. The idea here is:
Define a figure and an array of axes containing 6x16 subplots.
Use enumerate on axes.flatten to iterate through the subfigures ax row wise and use i as the index to access the data.
Use the field specifier %d to label the subplots iteratively.
Put plt.show() outside the for loop
hold = np.zeros((96,30))  # assumed: defined like secondHold, since it is filled below
secondHold = np.zeros((96,30))
channel = ['channel' for x in range(96)]
fig, axes = plt.subplots(nrows=6, ncols=16, sharex='all', sharey='all')
for i, ax in enumerate(axes.flatten()):
    BlankBinsx = bins[blankposition,0:30,i]
    StimBinsx = bins[NonBlankPositions,0:30,i]
    meanx = BlankBinsx.mean(axis=0);
    stimmeanx = StimBinsx.mean(axis=0);
    for j in range(30):
        hold[i][j] = meanx[j];
        secondHold[i][j] = stimmeanx[j];
    ax.plot(hold[i], label='stimulus')
    ax.plot(secondHold[i], label='Blank Stimulus')
    ax.set_title('Channel %d' % (i + 1))  # i is 0-based, so add 1 to start at 'Channel 1'
    ax.set_xlabel('time (ms)')
    ax.set_ylabel('Avg Spike Rate')
    ax.legend()
plt.show()
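Since the original `bins` array is not available, here is a self-contained sketch of the same layout on made-up data. The array names, the random values and the smaller 2x3 grid are all assumptions. It shows the two pieces the question asked about: one shared grid of axes, and titles numbered from 1:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

# Made-up per-channel means, shape (channels, time bins).
n_rows, n_cols, n_bins = 2, 3, 30
rng = np.random.default_rng(0)
blank_mean = rng.random((n_rows * n_cols, n_bins))
stim_mean = rng.random((n_rows * n_cols, n_bins))

# One figure holding the whole grid, created once outside the loop.
fig, axes = plt.subplots(nrows=n_rows, ncols=n_cols, sharex="all", sharey="all")
for i, ax in enumerate(axes.flatten()):  # row-wise order
    ax.plot(blank_mean[i], label="blank")
    ax.plot(stim_mean[i], label="stimulus")
    ax.set_title(f"Channel {i + 1}")  # i starts at 0, titles start at 1

titles = [ax.get_title() for ax in axes.flatten()]
```

For the real data, `nrows=6, ncols=16` would give the 96 panels. With that many subplots, a per-axes legend may crowd the figure, so a single figure-level legend could be worth considering.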
The code below achieves what I want to do, but does so in a very roundabout way. I have looked around for a succinct way to produce a single legend for a figure that includes multiple subplots that takes into account their labels, to no avail. plt.figlegend() requires you to pass in labels and lines, and plt.legend() requires only handles (slightly better).
My example below illustrates what I want. I have 9 vectors, each belonging to one of 3 categories. I want to plot each vector on a separate subplot, label it, and plot a legend which indicates (using colour) what the label means; this is the automatic behaviour on a single plot.
Do you know of a better way of achieving the plot below?
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
nr_lines = 9
nr_cats = 3
np.random.seed(1337)
# Data
X = np.random.randn(nr_lines, 100)
labels = ['Category {}'.format(ii) for ii in range(nr_cats)]
y = np.random.choice(labels, nr_lines)
# Ideally wouldn't have to manually pick colours
clrs = matplotlib.rcParams['axes.prop_cycle'].by_key()['color']
clrs = [clrs[ii] for ii in range(nr_cats)]
lab_clr = {k: v for k, v in zip(labels, clrs)}
fig, ax = plt.subplots(3, 3)
ax = ax.flatten()
for ii in range(nr_lines):
    ax[ii].plot(X[ii,:], label=y[ii], color=lab_clr[y[ii]])
lines = [a.lines[0] for a in ax]
l_labels = [l.get_label() for l in lines]
# the hack - get a single occurrence of each label
idx_list = [l_labels.index(lab) for lab in labels]
lines_ = [lines[idx] for idx in idx_list]
#l_labels_ = [l_labels[idx] for idx in idx_list]
plt.legend(handles=lines_, bbox_to_anchor=[2, 2.5])
plt.tight_layout()
plt.savefig('/home/james/Downloads/stack_figlegend_example.png',
bbox_inches='tight')
You could use a dictionary to collect them using the label as a key. For example:
handles = {}
for ii in range(nr_lines):
    l1, = ax[ii].plot(X[ii,:], label=y[ii], color=lab_clr[y[ii]])
    if y[ii] not in handles:
        handles[y[ii]] = l1
plt.legend(handles=handles.values(), bbox_to_anchor=[2, 2.5])
You only add a handle to the dictionary if the category isn't already present.
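A variant of the same dictionary idea, sketched on made-up data: plot first, then gather the handles afterwards from each axes with `get_legend_handles_labels()` and hand the de-duplicated set to `fig.legend`. The category assignment is made deterministic here (an assumption, unlike the random choice in the question) so the result is predictable:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

nr_lines, nr_cats = 9, 3
rng = np.random.default_rng(1337)
X = rng.standard_normal((nr_lines, 100))
labels = [f"Category {ii}" for ii in range(nr_cats)]
# Assumption: cycle through the categories instead of drawing them at random.
y = [labels[ii % nr_cats] for ii in range(nr_lines)]

colours = plt.rcParams["axes.prop_cycle"].by_key()["color"][:nr_cats]
lab_clr = dict(zip(labels, colours))

fig, axes = plt.subplots(3, 3)
for ax, row, cat in zip(axes.flat, X, y):
    ax.plot(row, label=cat, color=lab_clr[cat])

# Keep only the first handle seen for each label, across all subplots.
unique = {}
for ax in axes.flat:
    for handle, label in zip(*ax.get_legend_handles_labels()):
        unique.setdefault(label, handle)

legend = fig.legend(list(unique.values()), list(unique.keys()),
                    loc="center right")
legend_texts = [t.get_text() for t in legend.get_texts()]
```

Using `fig.legend` places the legend relative to the figure rather than the last subplot, which may make the `bbox_to_anchor` offsets unnecessary.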