Openpyxl Bar graph Error bars - python

I was wondering whether it is possible to set an error bar series from a cell reference. I would like to reference cells that already contain the calculated error values and have them displayed on top of the bars in the bar chart.
I found a similar question where they seem to have done a version of what I am trying, but when I adapt their example I get an error.
Any advice would be much appreciated. Thank you for your time.
Example of what I want the graph and error bars to look like
Error:
TypeError: expected class 'openpyxl.chart.error_bar.ErrorBars'
Similar question:
openpyxl chart error bar styles
My current code:
from openpyxl.chart import BarChart, Reference, Series
from openpyxl.chart.series_factory import SeriesFactory

# ws, col and name are defined earlier in the script
chart1 = BarChart()
chart1.type = "col"
chart1.height = 10
chart1.width = col + 7
chart1.title = name
data = Reference(ws, min_col=4, min_row=23, max_col=17)
cats = Reference(ws, min_col=4, min_row=29, max_col=17)
eBars = Reference(ws, min_col=4, min_row=26, max_col=17)
s = Series(data)
series = SeriesFactory(data, title="y direction error")
# This is the line that raises the TypeError: eBars is a Reference, not an ErrorBars object
series.errBars = eBars
chart1.append(s)
chart1.append(series)
chart1.legend = None
chart1.set_categories(cats)
chart1.x_axis.tickLblPos = "low"
#chart1.x_axis.tickLblSkip = 0
chart1.shape = 10
ws.add_chart(chart1, "C3")

Comment: ... setting a reference to the plus and minus
I see your point; replace numLit with numRef:
NumDataSource / NumRef
class openpyxl.chart.data_source.NumDataSource(numRef=None, numLit=None)
`numLit` values must be of type <class 'openpyxl.chart.data_source.NumData'>
`numRef` values must be of type <class 'openpyxl.chart.data_source.NumRef'>
eBarsNumDataSource = NumDataSource(numRef=NumRef(f=eBars))
series.errBars = ErrorBars(errDir='y', errValType='cust', plus=eBarsNumDataSource, minus=eBarsNumDataSource)
Question: TypeError: expected class 'openpyxl.chart.error_bar.ErrorBars'
Your eBars is of type Reference, but series.errBars needs an object of type openpyxl.chart.error_bar.ErrorBars.
class openpyxl.chart.error_bar.ErrorBars
class openpyxl.chart.error_bar.ErrorBars(
    errDir=None,
    errBarType='both',
    errValType='fixedVal',
    noEndCap=None, plus=None, minus=None, val=None, spPr=None, extLst=None)
You need at least the following parameters:
ErrorBars(
    errDir=<one of {'x', 'y'}>,
    plus=<object of type openpyxl.chart.data_source.NumDataSource>,
    minus=<object of type openpyxl.chart.data_source.NumDataSource>,
)
Follow def list2errorbars(... in the linked similar question.
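Putting the two parts of the answer together, a minimal sketch might look like the following. The workbook contents, cell positions, series title and chart anchor are made-up sample values, not the asker's sheet; only the NumRef / NumDataSource / ErrorBars wiring comes from the answer above.

```python
from openpyxl import Workbook
from openpyxl.chart import BarChart, Reference, Series
from openpyxl.chart.data_source import NumDataSource, NumRef
from openpyxl.chart.error_bar import ErrorBars

# Hypothetical sample data: one row each of categories, values and error sizes.
wb = Workbook()
ws = wb.active
ws.append(["A", "B", "C"])    # row 1: category labels
ws.append([10, 20, 15])       # row 2: bar heights
ws.append([1.0, 2.5, 0.5])    # row 3: precomputed error values

cats = Reference(ws, min_col=1, max_col=3, min_row=1)
data = Reference(ws, min_col=1, max_col=3, min_row=2)
errs = Reference(ws, min_col=1, max_col=3, min_row=3)

series = Series(data, title="measured value")

# Wrap the cell range as Reference -> NumRef -> NumDataSource;
# assigning the bare Reference to errBars is what raises the TypeError.
err_source = NumDataSource(numRef=NumRef(f=errs))
series.errBars = ErrorBars(
    errDir="y",
    errValType="cust",   # custom values taken from plus/minus
    plus=err_source,
    minus=err_source,
)

chart = BarChart()
chart.type = "col"
chart.append(series)
chart.set_categories(cats)
ws.add_chart(chart, "E2")
```

Calling wb.save(...) with a file name of your choice afterwards writes the workbook; using the same source for plus and minus gives symmetric bars, and two different references could be used for asymmetric ones.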

Related

__init__() got multiple values for argument 'use_technical_indicator' - error

I can't figure out why I am getting this error. If you can figure it out, I'd appreciate it. If you can provide specific instruction, I'd appreciate it. This code is in one module; there are 7 modules total.
Python 3.7, Mac OS, code from www.finrl.org
# Perform Feature Engineering:
df = FeatureEngineer(df.copy(),
                     use_technical_indicator=True,
                     use_turbulence=False).preprocess_data()
# add covariance matrix as states
df = df.sort_values(['date', 'tic'], ignore_index=True)
df.index = df.date.factorize()[0]
cov_list = []
# look back is one year
lookback = 252
for i in range(lookback, len(df.index.unique())):
    data_lookback = df.loc[i-lookback:i, :]
    price_lookback = data_lookback.pivot_table(index='date', columns='tic', values='close')
    return_lookback = price_lookback.pct_change().dropna()
    covs = return_lookback.cov().values
    cov_list.append(covs)
df_cov = pd.DataFrame({'date': df.date.unique()[lookback:], 'cov_list': cov_list})
df = df.merge(df_cov, on='date')
df = df.sort_values(['date', 'tic']).reset_index(drop=True)
df.head()
The function definition statement for FeatureEngineer.__init__ is:
def __init__(
    self,
    use_technical_indicator=True,
    tech_indicator_list=config.TECHNICAL_INDICATORS_LIST,
    use_turbulence=False,
    user_defined_feature=False,
):
As you can see, there is no parameter (other than self, which you should not pass) before use_technical_indicator. The positional df.copy() is therefore bound to use_technical_indicator, and the keyword use_technical_indicator=True then supplies the same parameter a second time, which is what the error reports. Remove the df.copy() from the FeatureEngineer(...) call on the second line of your code.
In the current FeatureEngineer class, you must pass df.copy() to the preprocess_data() method instead.
So your code has to look like this:
# Perform Feature Engineering:
df = FeatureEngineer(use_technical_indicator=True,
                     tech_indicator_list=config.TECHNICAL_INDICATORS_LIST,
                     use_turbulence=True,
                     user_defined_feature=False).preprocess_data(df.copy())
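The mechanism can be reproduced without FinRL. The sketch below uses a hypothetical DemoEngineer class that only mirrors the shape of the constructor shown above (it is not the real FeatureEngineer), with a plain list standing in for the DataFrame.

```python
class DemoEngineer:
    """Hypothetical stand-in that mirrors the shape of FeatureEngineer.__init__."""

    def __init__(self, use_technical_indicator=True, use_turbulence=False):
        self.use_technical_indicator = use_technical_indicator
        self.use_turbulence = use_turbulence

    def preprocess_data(self, df):
        # The data is handed over here, not to the constructor.
        return df


rows = [{"tic": "AAA", "close": 1.0}]  # placeholder for the real DataFrame

# Wrong: the positional argument fills use_technical_indicator,
# and the keyword then tries to fill it a second time.
try:
    DemoEngineer(rows, use_technical_indicator=True)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Right: keyword arguments to the constructor, data to preprocess_data().
result = DemoEngineer(use_technical_indicator=True,
                      use_turbulence=False).preprocess_data(rows)
```

Printing error_message shows the same "got multiple values for argument" wording as in the question title.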

Spyder charts in the code are not working. What is w?

I am new to Spyder and am working with the KDD1999 data. I am trying to create charts based on the dataset such as total amounts of srv_error rates. However when I try to create these charts errors pop up and I have a few I can't solve. I have commented the code. Does anyone know what is wrong with the code?
#Import all the packages and/or libraries you will be using
#pandas (pd) loads the data and creates the data table (DataFrame)
import pandas as pd
####Section for loading data
#If the data file has the extension xlsx, the read_excel function should be used. If csv, read_csv should be used
#As this is stored in the same location, the path can remain unchanged
df = pd.read_csv('kddcupdata1.csv')
#Pulls specific details
#Pulls first five rows
df.head()
#Pulls first three rows
df.head(3)
#Setting column names
df.columns = ['duration', 'protocol_type', 'service', 'flag', 'src_bytes', 'dst_bytes', 'land', 'wrong_fragment', 'urgent', 'hot', 'num_failed_logins', 'logged_in', 'lnum_compromised', 'lroot_shell', 'lsu_attempted', 'lnum_root', 'lnum_file_creations', 'lnum_shells', 'lnum_access_files', 'lnum_outbound_cmds', 'is_host_login', 'is_guest_login', 'count', 'srv_count', 'serror_rate', 'srv_serror_rate', 'rerror_rate', 'srv_rerror_rate', 'same_srv_rate', 'diff_srv_rate', 'srv_diff_host_rate', 'dst_host_count', 'dst_host_srv_count', 'dst_host_same_srv_rate', 'dst_host_diff_srv_rate', 'dst_host_same_src_port_rate', 'dst_host_srv_diff_host_rate', 'dst_host_serror_rate', 'dst_host_srv_serror_rate', 'dst_host_rerror_rate', 'dst_host_srv_rerror_rate', 'label']
#Scatter graph for number of failed logins caused by srv serror rate
df.plot(kind='scatter',x='num_failed_logins',y='srv_serror_rate',color='red')
#This works
#Total num_failed_logins caused by srv_error_rate
# making a dict of lists
info = {'Attack': ['dst_host_same_srv_rate', 'dst_host_srv_rerror_rate'],
        'Num': [0, 1]}
otd = pd.DataFrame(info)
# sum of the 'Num' column stored in 'total'
otd['total'] = otd['Num'].sum()
print(otd)
##################################################################################
#Charts that do not work
import matplotlib.pyplot as plt
#1 ERROR MESSAGE - AttributeError: 'list' object has no attribute 'lsu_attempted'
#Bar chart showing total lsu_attempted
df['lsu_attempted'] = df['lsu_attempted'].astype(int)
df = ({'lsu_attempted':[1]})
df['lsu_attempted'].lsu_attempted(sort=0).plot.bar()
ax = df.plot.bar(x='super user attempts', y='Total of super user attempts', rot=0)
df.from_dict('all super user attempts', orient='index')
df.transpose()
#2 ERROR MESSAGE - TypeError: plot got an unexpected keyword argument 'x'
#A simple line plot
plt.plot(kind='bar',x='protocol_type',y='lsu_attempted')
#3 ERROR MESSAGE - TypeError: 'set' object is not subscriptable
df['lsu_attempted'] = df['lsu_attempted'].astype(int)
df = ({'lsu_attempted'})
df['lsu_attempted'].lsu_attempted(sort=0).plot.bar()
ax = df.plot.bar(x='protocol_type', y='lsu_attempted', rot=0)
df.from_dict('all super user attempts', orient='index')
df.transpose()
#5 ERROR MESSAGE - TypeError: 'dict' object is not callable
#Bar chart showing total of chosen protocols used
Data = {'protocol_types': ['tcp','icmp'],
        'number of protocols used': [10,20,30]
        }
bar = df(Data,columns=['protocol_types','number of protocols used'])
bar.plot(x ='protocol_types', y='number of protocols used', kind = 'bar')
df.show()
Note: if anyone has a clear explanation of what is going on, that would also be helpful. Please link sources if possible.
Your first error is in this snippet:
df['lsu_attempted'] = df['lsu_attempted'].astype(int)
df = ({'lsu_attempted':[1]})
df['lsu_attempted'].lsu_attempted(sort=0).plot.bar()
ax = df.plot.bar(x='super user attempts', y='Total of super user attempts', rot=0)
df.from_dict('all super user attempts', orient='index')
df.transpose()
The error you get, AttributeError: 'list' object has no attribute 'lsu_attempted', is a result of line 2 above.
Initially df is a pandas data frame (line 1 above), but after line 2, df = ({'lsu_attempted':[1]}), df is a dictionary with one key, 'lsu_attempted', whose value is a list with one element.
So in line 3, when you evaluate df['lsu_attempted'] (the first part of that statement), you get that single-element list, and a list doesn't have an lsu_attempted attribute.
I have no idea what you were trying to achieve, but it is my strong guess that you did not intend to replace your data frame with a single-key dictionary.
Your 2nd error is easy: you are calling plt.plot incorrectly. x is not a keyword argument there; x and y are positional arguments (see matplotlib.pyplot.plot in the Matplotlib 3.2.1 documentation). The kind=, x= and y= keywords belong to the pandas DataFrame.plot method, not to plt.plot.
The 'dict' object is not callable error results from the code snippet above: you made df a dictionary, and you can't call dictionaries, which is what df(Data, ...) tries to do. The 'set' object is not subscriptable error has the same kind of cause: df = ({'lsu_attempted'}) rebinds df to a set, and a set cannot be indexed with df['lsu_attempted'].
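As a rough sketch of what the charts could look like once df is left alone, the example below keeps the DataFrame under its own name and stores derived results under new names. The four-row frame is invented sample data standing in for the KDD file, and the choice of "total lsu_attempted per protocol_type" is only a guess at the intended chart.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere; drop this line in Spyder
import matplotlib.pyplot as plt
import pandas as pd

# Invented stand-in for the KDD data, reduced to two columns.
df = pd.DataFrame({
    "protocol_type": ["tcp", "icmp", "tcp", "udp"],
    "lsu_attempted": [1, 0, 2, 0],
})

# Keep df as the DataFrame; put derived results in a new variable.
totals = df.groupby("protocol_type")["lsu_attempted"].sum()

# pandas plotting understands kind=...; the Series index supplies the x labels.
ax = totals.plot(kind="bar", rot=0)
ax.set_ylabel("total lsu_attempted")

# The pyplot equivalent takes the data positionally.
fig, ax2 = plt.subplots()
ax2.bar(totals.index, totals.values)
```

In the same spirit, a frame built from a dictionary would use pd.DataFrame(Data, columns=[...]) rather than df(Data, ...), and every list in that dictionary needs the same length.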

Bokeh Plotting 'Out of range float values are not JSON compliant' Issue

I am trying to build a heat map using Bokeh, but I keep getting the same error. I'll include both my code and the error below; please help me out!
I assumed that the error is mainly about NaNs in my data, so I added the necessary if statements to the code to make sure that this issue is addressed. I even tried to fill any possible NaNs with zero in the following lists: 'user', 'module', 'ratio', 'color', and 'alpha'. However, none of these changes helped.
colors = ['#ff0000','#ff1919','#ff4c4c','#ff7f7f','#99cc99','#7fbf7f','#4ca64c','#329932','#008000']
sorted_userlist = list(total_checks_sorted.index)
user = []
module = []
ratio = []
color = []
alpha = []
for m_id in ol_module_ids:
    pset = m_id.split('/')[-1]
    col_name1 = m_id + '_ratio'
    col_name2 = m_id + '_total'
    min_checks = min(check_matrix[col_name2].values)
    max_checks = max(check_matrix[col_name2].values)
    for i, u in enumerate(sorted_userlist):
        module.append(pset)
        user.append(str(i+1))
        ratio_value = check_matrix[col_name1][u]
        ratio.append(ratio_value)
        al = math.sqrt((check_matrix[col_name2][u]-min_checks+0.0001)/float(max_checks))
        if ratio_value > 0.16:
            al = min(al*100, 1)
        alpha.append(al)
        if np.isnan(ratio_value):
            color.append(colors[0])
        else:
            color.append(colors[int(ratio_value*8)])
#fill NAs in source lists with zeroes
pd.Series(ratio).fillna(0).tolist()
col_source = ColumnDataSource(data=dict(module=module, user=user, color=color, alpha=alpha, ratio=ratio))
#source = source.fillna('')
#TOOLS = "resize,hover,save,pan,box_zoom,wheel_zoom"
TOOLS = "reset,hover,save,pan,box_zoom,wheel_zoom"
p = figure(title="Ratio of Correct Checks Each Student Each Online Homework Problem",
           x_range=pset,
           #y_range = list(reversed(sorted_userlist)),
           y_range=list(reversed(list(map(str, range(1,475))))),
           x_axis_location="above", plot_width=900, plot_height=4000,
           toolbar_location="left", tools=TOOLS)
#axis_line_color = None)
#outline_line_color = None)#
p.rect("module", "user", 1, 1, source=col_source,
       color="color", alpha='alpha', line_color=None)
show(p)
NaN values are not JSON serializable (this is a glaring deficiency in the JSON standard). You mentioned there are NaN values in the ratio list, which you are putting in the ColumnDataSource here:
col_source = ColumnDataSource(data=dict(..., ratio=ratio))
Since it is in the CDS, Bokeh will try to serialize it, resulting in the error. You have two options:
If you don't actually need the numeric ratio values in the plot for some reason (e.g. to drive a hover tool or custom mapper or something), then just leave it out of the data source.
If you do need to send the ratio values, then you must put the data into a NumPy array. Bokeh serializes NumPy arrays using a different, non-JSON approach, so it is then possible to send NaNs successfully.
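The message in the title is the same one Python's own json module raises when NaN is disallowed, so the problem can be illustrated without Bokeh. The sketch below uses invented ratio values; the final comment shows where the array would go in the asker's ColumnDataSource call, on the assumption (taken from the answer above) that Bokeh handles NumPy arrays separately from plain lists.

```python
import json

import numpy as np

ratio = [0.25, float("nan"), 0.9]  # invented values; the NaN marks a missing ratio

# Strict JSON encoding rejects NaN because the JSON standard has no NaN literal.
try:
    json.dumps(ratio, allow_nan=False)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# Second option from the answer: hold the column in a NumPy float array.
ratio_array = np.asarray(ratio, dtype=float)

# With Bokeh installed, the array would replace the list, for example:
# ColumnDataSource(data=dict(module=module, user=user, color=color,
#                            alpha=alpha, ratio=ratio_array))
```

Note that a statement such as pd.Series(ratio).fillna(0).tolist() returns a new list; unless the result is assigned back to ratio, the original list still contains the NaN.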

Setting a column header in a table model

I'm not able to set the header of a new column created in a table view.
This is the code:
def addColumn(self):
    if self.tableView.selectionModel().hasSelection():
        indexes = self.tableView.selectionModel().selectedColumns()
        for index in sorted(indexes):
            print('Adding column %d...' % index.column())
            self.QSModel.insertColumn(index.column()+1)
            self.QSModel.setHorizontalHeaderItem(index.column()+1,'XXX')
    else:
        print('No col selected!')
The error I get is:
self.QSModel.setHorizontalHeaderItem(index.column()+1,'XXX')
TypeError: setHorizontalHeaderItem(self, int, QStandardItem): argument 2 has unexpected type 'str'
How can I solve it?
As the error and the docs point out, the second parameter is expected to be a QStandardItem, not a string. In your case it should be:
self.QSModel.setHorizontalHeaderItem(index.column() + 1, QStandardItem('XXX'))

Heatmap plot of a pandas dataframe - TypeError

I have two pandas dataframes that on inspection look identical. One was created using the Pandas builtin:
df.corr(method='pearson')
While the other was created with a custom function:
def cor_matrix(dataframe, method):
    coeffmat = pd.DataFrame(index=dataframe.columns,
                            columns=dataframe.columns)
    pvalmat = pd.DataFrame(index=dataframe.columns, columns=dataframe.columns)
    for i in range(dataframe.shape[1]):
        for j in range(dataframe.shape[1]):
            x = np.array(dataframe[dataframe.columns[i]])
            y = np.array(dataframe[dataframe.columns[j]])
            bad = ~np.logical_or(np.isnan(x), np.isnan(y))
            if method == 'spearman':
                corrtest = spearmanr(np.compress(bad,x), np.compress(bad,y))
            if method == 'pearson':
                corrtest = pearsonr(np.compress(bad,x), np.compress(bad,y))
            coeffmat.iloc[i,j] = corrtest[0]
            pvalmat.iloc[i,j] = corrtest[1]
    return (coeffmat, pvalmat)
Both look identical and have same type (pandas.core.frame.DataFrame) and their entries are also of same type (numpy.float64)
However when I try to plot these using:
import matplotlib.pyplot as plt
plt.imshow((df))
Only the dataframe created with the pandas builtin function works. For the other dataframe I receive the error: TypeError: Image data cannot be converted to float. Can anyone explain what is going on, how the two dataframes are different and what can be done to address the error?
Edit - It looks as though there is one difference, when I convert the dataframes to a numpy array, the one that doesn't work has dtype = object at the end. Is there a way to remove this?
Amending the function to specify the dataframe as float fixed the issue:
def cor_matrix(dataframe, method):
    coeffmat = pd.DataFrame(index=dataframe.columns, columns=dataframe.columns)
    pvalmat = pd.DataFrame(index=dataframe.columns, columns=dataframe.columns)
    for i in range(dataframe.shape[1]):
        for j in range(dataframe.shape[1]):
            x = np.array(dataframe[dataframe.columns[i]])
            y = np.array(dataframe[dataframe.columns[j]])
            bad = ~np.logical_or(np.isnan(x), np.isnan(y))
            if method == 'spearman':
                corrtest = spearmanr(np.compress(bad,x), np.compress(bad,y))
            if method == 'pearson':
                corrtest = pearsonr(np.compress(bad,x), np.compress(bad,y))
            coeffmat.iloc[i,j] = corrtest[0]
            pvalmat.iloc[i,j] = corrtest[1]
    #This is to convert to float type otherwise can cause problems when e.g. plotting
    coeffmat=coeffmat.apply(pd.to_numeric, errors='ignore')
    pvalmat=pvalmat.apply(pd.to_numeric, errors='ignore')
    return (coeffmat, pvalmat)
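The dtype difference can be seen on a tiny frame. The sketch below uses invented column names and values, and shows two possible ways to end up with a float frame; astype(float) and pre-allocating with NaN are swapped in here for illustration and are not the to_numeric call used in the answer above.

```python
import numpy as np
import pandas as pd

cols = ["a", "b"]  # invented column names

# No dtype given: an empty frame like this typically starts with object-dtype
# columns, and filling it cell by cell does not change that.
coeffmat = pd.DataFrame(index=cols, columns=cols)
for i in range(len(cols)):
    for j in range(len(cols)):
        coeffmat.iloc[i, j] = 1.0 if i == j else 0.5

# Option A: convert once the frame is filled.
as_float = coeffmat.astype(float)

# Option B: pre-allocate a float frame, so cell assignment keeps the dtype.
prealloc = pd.DataFrame(np.nan, index=cols, columns=cols)
for i in range(len(cols)):
    for j in range(len(cols)):
        prealloc.iloc[i, j] = 1.0 if i == j else 0.5
```

Checking frame.dtypes (or frame.to_numpy().dtype) before plotting makes this kind of mismatch visible; either float frame can then be passed to plt.imshow.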
