I am new to Spyder and am working with the KDD1999 data. I am trying to create charts based on the dataset, such as total amounts of srv_error rates. However, when I try to create these charts, errors pop up, and there are a few I can't solve. I have commented the code. Does anyone know what is wrong with the code?
#Used to import all packages and/or libraries you will be using
#pd loads and creates the data table or dataframe
import pandas as pd
####Section for loading data
#If the data file extension is xlsx then the read_excel function should be used. If csv then read_csv should be used
#As this is stored in the same area the absolute path can remain unchanged
df = pd.read_csv('kddcupdata1.csv')
#Pulls specific details
#Pulls first five rows
df.head()
#Pulls first three rows
df.head(3)
#Setting column names
df.columns = ['duration', 'protocol_type', 'service', 'flag', 'src_bytes', 'dst_bytes', 'land', 'wrong_fragment', 'urgent', 'hot', 'num_failed_logins', 'logged_in', 'lnum_compromised', 'lroot_shell', 'lsu_attempted', 'lnum_root', 'lnum_file_creations', 'lnum_shells', 'lnum_access_files', 'lnum_outbound_cmds', 'is_host_login', 'is_guest_login', 'count', 'srv_count', 'serror_rate', 'srv_serror_rate', 'rerror_rate', 'srv_rerror_rate', 'same_srv_rate', 'diff_srv_rate', 'srv_diff_host_rate', 'dst_host_count', 'dst_host_srv_count', 'dst_host_same_srv_rate', 'dst_host_diff_srv_rate', 'dst_host_same_src_port_rate', 'dst_host_srv_diff_host_rate', 'dst_host_serror_rate', 'dst_host_srv_serror_rate', 'dst_host_rerror_rate', 'dst_host_srv_rerror_rate', 'label']
#Scatter graph for number of failed logins caused by srv serror rate
df.plot(kind='scatter',x='num_failed_logins',y='srv_serror_rate',color='red')
#This works
#Total num_failed_logins caused by srv_error_rate
# making a dict of list
info = {'Attack': ['dst_host_same_srv_rate', 'dst_host_srv_rerror_rate'],
'Num' : [0, 1]}
otd = pd.DataFrame(info)
# sum of all salary stored in 'total'
otd['total'] = otd['Num'].sum()
print(otd)
##################################################################################
#Charts that do not work
import matplotlib.pyplot as plt
#1 ERROR MESSAGE - AttributeError: 'list' object has no attribute 'lsu_attempted'
#Bar chart showing total 1su attempts
df['lsu_attempted'] = df['lsu_attempted'].astype(int)
df = ({'lsu_attempted':[1]})
df['lsu_attempted'].lsu_attempted(sort=0).plot.bar()
ax = df.plot.bar(x='super user attempts', y='Total of super user attempts', rot=0)
df.from_dict('all super user attempts', orient='index')
df.transpose()
#2 ERROR MESSAGE - TypeError: plot got an unexpected keyword argument 'x'
#A simple line plot
plt.plot(kind='bar',x='protocol_type',y='lsu_attempted')
#3 ERROR MESSAGE - TypeError: 'set' object is not subscriptable
df['lsu_attempted'] = df['lsu_attempted'].astype(int)
df = ({'lsu_attempted'})
df['lsu_attempted'].lsu_attempted(sort=0).plot.bar()
ax = df.plot.bar(x='protocol_type', y='lsu_attempted', rot=0)
df.from_dict('all super user attempts', orient='index')
df.transpose()
#5 ERROR MESSAGE - TypeError: 'dict' object is not callable
#Bar chart showing total of chosen protocols used
Data = {'protocol_types': ['tcp','icmp'],
'number of protocols used': [10,20,30]
}
bar = df(Data,columns=['protocol_types','number of protocols used'])
bar.plot(x ='protocol_types', y='number of protocols used', kind = 'bar')
df.show()
Note: Also, if anyone has a clear explanation of what this is about, that would be helpful too. Please link sources if possible.
Your first error in this snippet :
df['lsu_attempted'] = df['lsu_attempted'].astype(int)
df = ({'lsu_attempted':[1]})
df['lsu_attempted'].lsu_attempted(sort=0).plot.bar()
ax = df.plot.bar(x='super user attempts', y='Total of super user attempts', rot=0)
df.from_dict('all super user attempts', orient='index')
df.transpose()
The error you get, AttributeError: 'list' object has no attribute 'lsu_attempted', is a result of line 2 above.
Initially df is a pandas data frame (line 1 above), but from line 2, df = ({'lsu_attempted':[1]}), df is a dictionary with one key, 'lsu_attempted', whose value is a list with one element.
So in line 3, when you do df['lsu_attempted'] (the first part of that statement), it evaluates to that single-element list, and a list doesn't have an lsu_attempted attribute.
I have no idea what you were trying to achieve but it is my strong guess that you did not intend to replace your data frame with a single key dictionary.
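Your 2nd error is easy: you are calling plt.plot incorrectly. x is not a keyword argument of plt.plot (see matplotlib.pyplot.plot - Matplotlib 3.2.1 documentation); x and y are positional arguments. The kind=, x= and y= keywords belong to the pandas DataFrame.plot method, not to plt.plot.
Your last error message (the one labelled #5, TypeError: 'dict' object is not callable) results from the code snippet above: you made df a dictionary, and df(Data, columns=...) tries to call it, but you can't call dictionaries. The error labelled #3 follows the same pattern: df = ({'lsu_attempted'}) makes df a set, and a set can't be indexed with df['lsu_attempted'].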
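As a rough sketch of how the bar chart could be built without rebinding df: the small data frame below is made-up stand-in data rather than the KDD file, and summing lsu_attempted per protocol_type is only an assumption about what the chart is meant to show.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; remove it to get a window in Spyder
import matplotlib.pyplot as plt
import pandas as pd

# Made-up stand-in for the KDD data (assumption: only two of its columns)
df = pd.DataFrame({
    "protocol_type": ["tcp", "tcp", "udp", "icmp", "tcp"],
    "lsu_attempted": [1, 0, 0, 0, 1],
})

# Keep df as a DataFrame and store derived results under a new name
totals = df.groupby("protocol_type")["lsu_attempted"].sum()

# kind= is part of the pandas plotting API, not of plt.plot
ax = totals.plot(kind="bar", rot=0)
ax.set_xlabel("protocol_type")
ax.set_ylabel("total lsu_attempted")
plt.show()
```

The point is that df stays a data frame throughout; anything computed from it (here totals) gets its own name, so later lines such as df['lsu_attempted'] still work.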
I am stuck. I have looked up others' solutions to this but I don't quite understand them. In my code I have a giant matrix in a csv file, and I want to iterate over the data in my 4th column only. It is called 'MovementTime'. I thought that by calling it the way shown below I could iterate over my data and therefore sort it. I am getting the error
'str' object has no attribute 'values'
Can someone explain to me why I'm getting this error?
Thank you!
bigdata = pd.read_csv(r'Assetslog_912021_11.csv')
data = pd.DataFrame(bigdata)
#create a function to analyze data
def analytics(data):
data.columns = ['Time', 'Fixed Delta', 'Movement Time', 'MovementNumber', 'Rest Flag', 'DistortionDigit', 'RobotForceX','RobotForceY','RobotForceZ', 'PrevPositionX','PrevPositionY','PrevPositionZ', 'TargetPosZ', 'TargetPosY', 'TargetPosZ', 'PlayerPosX', 'PlayerPosX', 'PlayerPosY', 'PlayerPosZ', 'RobotVelX','RobotVelY','RobotVelZ', 'LocalPosX', 'LocalPosY', 'LocalPosZ', 'PerpError', 'ExtError']
i = np.iterable(data.columns)
for i in set(data['MovementNumber'.]):
print("Plot for Movement Number " + str(i))
data2 = data.loc[['MovementNumber'] == i]
ax = plt.axes(projection = '3d')
xdata = data2['PlayerPosX'].values
ydata = data2['PlayerPosY'].values
zdata = data2['PlayerPosZ'].values
plot1 =ax.scatter3D(xdata,ydata,zdata, c=zdata)
plt.show(plot1)
This line is not right:
data2 = data.loc[['MovementNumber'] == i]
That's going to compare a list containing a string to an integer, which will always be false. I believe you want
data2 = data[data['MovementNumber'] == i]
That assigns to data2 all the rows where MovementNumber is i.
And, by the way, your indentation is wrong. I assume you want one plot per movement number, so all the lines starting with ax = ... need to be indented, so they are inside the loop.
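To make the difference concrete, here is a small sketch on made-up data (the column values are invented; only the column names come from the question). It shows why comparing the list gives a single False, and how the column comparison produces a per-row mask. The groupby loop at the end is an alternative I am suggesting, not something from the original code.

```python
import pandas as pd

# Made-up stand-in data (assumption: three of the log's columns)
data = pd.DataFrame({
    "MovementNumber": [1, 1, 2, 2, 2],
    "PlayerPosX": [0.0, 0.1, 1.0, 1.1, 1.2],
    "PlayerPosY": [0.0, 0.2, 2.0, 2.1, 2.2],
})

# A list compared with an int is one plain bool, not a row mask
list_comparison = ["MovementNumber"] == 1
print(list_comparison)  # False

# Comparing the column gives one True/False per row, which can index the frame
mask = data["MovementNumber"] == 1
data2 = data[mask]

# groupby hands out the same per-movement subsets without building masks by hand
rows_per_movement = {}
for number, group in data.groupby("MovementNumber"):
    rows_per_movement[number] = len(group)
```

Inside the groupby loop, group plays the role of data2, so the plotting lines would go in the loop body.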
I'm having an issue with this:
I'm getting the error "bar() missing 1 required positional argument: 'self'"
I've tried fiddling with classes (using them and not using them) and with the self variable, but I've got nothing. The function bar() comes from the pandas library I've imported, as does the dataframe (df) object. I've attached the main function of my code and the function in which the error is occurring.
def createDataframe(assessments):
    df = pd.DataFrame
    for review in assessments:
        for skills in review.skillList:
            for skill in skills:
                tmp = pd.DataFrame({str(skill[:2]): [skill[3:]]})
                df.merge(tmp, how = 'right', right=tmp)
    return df
def plotData(df):
    ax = df.plot.bar(x='1.')
    plt.show()
def main():
    # Ensure proper CMD Line Arg
    if len(sys.argv) > 3:
        print("Error!")
        return 1
    assessments = dataParse()
    df = createDataframe(assessments)
    plotData(df)
Any help is welcome! Let me know!
EDIT:
As tdy said in a comment below, I needed to add parentheses to create an instance. Now I get no errors, but I am left with nothing when printing df, and nothing shows when plotting the information.
Pandas data frames do not have an option for an in-place merge. In your code, when you merge df, assign the result back to df like so (tmp is already passed as the first argument, so the extra right=tmp is dropped here):
df = df.merge(tmp, how='right')
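A minimal sketch of that behaviour, using invented column names and values (nothing here comes from the asker's data): the first merge result is thrown away, so df keeps its two rows; only the reassigned merge changes it.

```python
import pandas as pd

# Invented example frames that share the same columns
df = pd.DataFrame({"skill": ["1.", "2."], "score": [3, 4]})
tmp = pd.DataFrame({"skill": ["3."], "score": [5]})

# merge returns a new DataFrame; without assignment the result is discarded
df.merge(tmp, how="outer")
rows_without_assignment = len(df)

# Assigning the result back is what actually updates df
df = df.merge(tmp, how="outer")
rows_with_assignment = len(df)

print(rows_without_assignment, rows_with_assignment)  # 2 3
```

The same applies to most pandas methods that return a frame: unless the result is bound to a name, the original object is unchanged.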
I am trying to predict the stock price of Facebook on the 1664th row of the .csv file. I am encountering an error when it comes to appending a np.array. Here's my code:
##predicts price of facebook stock for one day
from sklearn.svm import SVR
import numpy as np
import pandas as pd
##store and show data
df = pd.read_csv (r'fb.csv')
##get and print last row of data
actual_price = df.tail(1)
#print(actual_price)
##prepare and print svr models
##get all of the data except for last row
df = df.head(len(df)-1)
ind = (np.arange((len(df.index))))
df["index"] = ind
##create empty list to store dependent and independent data
# days1 =
days = np.array([])
adj_close_prices = np.array([])
##get the date and adjusted close prices
df_days = df.loc[:, 'index']
df_adj_close = df.loc[:, 'Adj Close']
##create the independent dataset ### this part to specify
for day in df_days:
    days = np.append(float(day))
And the error which keeps occurring is the following:
days = np.append(float(day))
File "<__array_function__ internals>", line 4, in append
TypeError: _append_dispatcher() missing 1 required positional argument: 'values'
I have very basic level of Python knowledge and have been using YouTube and online resources to come up with what I have already.
I haven't checked your whole code, but the append problem isn't too hard to fix. Check the numpy.append() documentation and you will notice that it takes two required parameters (and a third, optional one), in the form numpy.append(array_to_append_to, value_to_append). So in your code it should look like days = np.append(days, float(day)).
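A short sketch of the corrected call, with three made-up day indices standing in for the column from the CSV. The vectorised line and the reshape at the end are my additions: the reshape assumes the array is headed for a scikit-learn estimator such as SVR, which expects a 2-D feature array.

```python
import numpy as np

# np.append returns a new array, so the result has to be assigned back
days = np.array([])
for day in [0, 1, 2]:  # made-up stand-in for the values in df_days
    days = np.append(days, float(day))

# The same array without a Python loop
days_vectorised = np.arange(3, dtype=float)

# Assumption: the data is meant for an estimator that wants shape (n_samples, n_features)
X = days.reshape(-1, 1)
```

Because np.append copies the whole array on every call, building the array in one step is usually preferable once the data is already in a column.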
I was wondering if it is possible to set an error bar series to a reference. What I would like to do is reference cells with the error values already calculated and then have them displayed on top of the bar graphs.
I found a similar question where they seem to have done a version of what I am trying, but when I edit their example I get an error.
Any advice would be much appreciated. Thank you for your time.
Example of what I want the graph and error bars to look like
Error:
TypeError: expected class 'openpyxl.chart.error_bar.ErrorBars'
similar Question
openpyxl chart error bar styles
My Current Code
chart1 = BarChart()
chart1.type = "col"
chart1.height = 10
chart1.width = col + 7
chart1.title = name
data = Reference(ws, min_col=4, min_row=23, max_col=17)
cats = Reference(ws, min_col=4, min_row=29, max_col = 17)
eBars = Reference(ws, min_col=4, min_row=26, max_col=17)
s= Series(data)
series = SeriesFactory(data, title="y direction error")
series.errBars = eBars
chart1.append(s)
chart1.append(series)
chart1.legend = None
chart1.set_categories(cats)
chart1.x_axis.tickLblPos = "low"
#chart1.x_axis.tickLblSkip = 0
chart1.shape = 10
ws.add_chart(chart1, "C3")
Comment: ... setting a reference to the plus and minus
I see your point. Replace numLit with numRef:
NumDataSource / NumRef
class openpyxl.chart.data_source.NumDataSource(numRef=None, numLit=None)
`numLit` Values must be of type <class ‘openpyxl.chart.data_source.NumData’>
`numRef` Values must be of type <class ‘openpyxl.chart.data_source.NumRef’>
eBarsNumDataSource = NumDataSource(NumRef(eBars))
series.errBars = ErrorBars(errDir='y', errValType='cust', plus=eBarsNumDataSource, minus=eBarsNumDataSource)
Question: TypeError: expected class 'openpyxl.chart.error_bar.ErrorBars'
Your eBars is of Type Reference but you need Type openpyxl.chart.error_bar.ErrorBars.
class openpyxl.chart.error_bar.ErrorBars
class openpyxl.chart.error_bar.ErrorBars(
errDir=None,
errBarType='both',
errValType='fixedVal',
noEndCap=None, plus=None, minus=None, val=None, spPr=None, extLst=None)
You need at least the following Parameters:
ErrorBars(errDir=Value must be one of {‘x’, ‘y’},
plus=Values must be of type <class ‘openpyxl.chart.data_source.NumDataSource’>,
minus=Values must be of type <class ‘openpyxl.chart.data_source.NumDataSource’>,
)
Follow def list2errorbars(... in the linked similar Question.
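Putting the pieces above together, a minimal sketch might look like the following. The worksheet layout (labels in row 1, values in row 2, error magnitudes in row 3) and all the numbers are invented for illustration; the question's own sheet uses different rows and columns.

```python
from openpyxl import Workbook
from openpyxl.chart import BarChart, Reference, Series
from openpyxl.chart.data_source import NumDataSource, NumRef
from openpyxl.chart.error_bar import ErrorBars

wb = Workbook()
ws = wb.active
ws.append(["A", "B", "C"])  # row 1: category labels (made up)
ws.append([10, 20, 30])     # row 2: bar heights (made up)
ws.append([1, 2, 3])        # row 3: pre-calculated error values (made up)

values = Reference(ws, min_col=1, min_row=2, max_col=3)
cats = Reference(ws, min_col=1, min_row=1, max_col=3)
errors = Reference(ws, min_col=1, min_row=3, max_col=3)

series = Series(values, title="y direction error")

# Wrap the cell reference so ErrorBars receives a NumDataSource, not a Reference
error_source = NumDataSource(numRef=NumRef(f=errors))
series.errBars = ErrorBars(errDir="y", errValType="cust",
                           plus=error_source, minus=error_source)

chart = BarChart()
chart.type = "col"
chart.append(series)
chart.set_categories(cats)
ws.add_chart(chart, "E1")
```

Here the same cells are used for the plus and minus directions; separate references could be wrapped the same way if the two differ.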
I'm interested in creating Choropleth map with Python on a county level. When I run my code without trying to bind data to it I get the county lines drawn in beautifully. However whenever I try to bind my data I get KeyError: None.
From my searching it appeared as though this is due to values in the GeoJson not matching up with the values in the data file... but I went in manually and checked and have already edited the data so there are the exact same number of rows and exact same values... still getting the same error. Very frustrating :(
My code:
import folium
from folium import plugins
from folium.plugins import Fullscreen
import pandas as pd
county_geo = 'Desktop\counties.json'
county_data = 'Desktop\fips.csv'
# Read into Dataframe, cast to string for consistency.
df = pd.read_csv(county_data, na_values=[' '])
df['FIPS'] = df['FIPS'].astype(str)
m = folium.Map(location=[48, -102], zoom_start=3)
m.choropleth(geo_path=county_geo,
data=df,
columns=['FIPS', 'Value'],
key_on='feature.properties.id',
fill_color='PuBu')
Fullscreen().add_to(m)
m
And my error:
KeyError: None
Out[32]:
folium.folium.Map at 0x10231748
Any advice or example code/files that are working for you on a county level would be much appreciated!
EDIT:
I found my own error.
key_on='feature.properties.id',
Should be:
key_on='feature.id',
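To illustrate why the path matters, here is a small sketch with a made-up GeoJSON feature whose id sits at the top level of the feature rather than under properties. The resolve_key_on helper is hypothetical and is not folium's implementation; it only mimics following a dotted path, which is consistent with a lookup that ends in None.

```python
from functools import reduce

# Made-up feature: the id is a sibling of "properties", not inside it
feature = {
    "type": "Feature",
    "id": "01001",
    "properties": {"name": "Autauga"},
    "geometry": None,
}

def resolve_key_on(feature, key_on):
    """Hypothetical helper: follow a dotted key_on path inside one feature."""
    path = key_on.split(".")[1:]  # drop the leading "feature"
    def step(node, key):
        # Stop descending as soon as the path leaves the dictionary structure
        return node.get(key) if isinstance(node, dict) else None
    return reduce(step, path, feature)

print(resolve_key_on(feature, "feature.properties.id"))  # None
print(resolve_key_on(feature, "feature.id"))             # 01001
```

Opening the GeoJSON file and looking at where id appears in the first feature is a quick way to decide which path to pass to key_on.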
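import json
# Collect the feature ids present in the GeoJSON file
with open(r'Desktop\counties.json') as f:
    keys = [k['id'] for k in json.load(f)['features']]
# Ids that are in the GeoJSON but not in the DataFrame (df from the question)
missing_keys = set(keys) - set(df['FIPS'])
# One zero-valued row per missing id
dicts = [{'FIPS': k, 'Value': 0} for k in missing_keys]
# DataFrame.append was removed in pandas 2.0, so pd.concat is used instead
mapdata = pd.concat([df, pd.DataFrame(dicts)], ignore_index=True)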
This finds the keys that are in the GeoJSON but missing from the DataFrame and adds a row with a value of 0 for each of them.
This might resolve your KeyError problem.