AttributeError: 'str' object has no attribute 'search_nodes'

I've built a tree using the ete2 package. Now I'm trying to write code that takes data from the tree and a CSV file and does some data analysis through the function fre.
Here is an example of the csv file I've used:
PID Code Value
1 A1... 6
1 A2... 5
2 A.... 4
2 D.... 1
2 A1... 2
3 D.... 5
3 D1... 3
3 D2... 5
Here is a simplified version of the code
from ete2 import Tree
import pandas as pd
t= Tree("((A1...,A2...)A...., (D1..., D2...)D....).....;", format=1)
data= pd.read_csv('/data_2.csv', names=['PID','Code', 'Value'])
code_count = data.groupby('Code').sum()
total_patients= len(list (set(data['PID'])))
del code_count['PID']
############
def fre(code1,code2):
    code1_ancestors=[]
    code2_ancestors=[]
    for i in t.search_nodes(name=code1)[0].get_ancestors():
        code1_ancestors.append(i.name)
    for i in t.search_nodes(name=code2)[0].get_ancestors():
        code2_ancestors.append(i.name)
    common_ancestors = []
    for i in code1_ancestors:
        for j in code2_ancestors:
            if i==j:
                common_ancestors.append(i)
    print common_ancestors
####
for i in patients_list:
    a= list (data.Code[data.PID==patients_list[i-1]])
    #print a
    for j in patients_list:
        b= list (data.Code[data.PID==patients_list[j-1]])
        for k in a:
            for t in b:
                fre (k,t)
However, this raises the following error:
AttributeError Traceback (most recent call last)
<ipython-input-12-f9b47fcec010> in <module>()
38 for k in a:
39 for t in b:
---> 40 fre (k,t)
<ipython-input-12-f9b47fcec010> in fre(code1, code2)
12 code1_ancestors=[]
13 code2_ancestors=[]
---> 14 for i in t.search_nodes(name=code1)[0].get_ancestors():
15 code1_ancestors.append(i.name)
16 for i in t.search_nodes(name=code2)[0].get_ancestors():
AttributeError: 'str' object has no attribute 'search_nodes'
I've tried manually passing all possible values to the function, and it works! However, when I use the last section of the code, it raises the error.

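You're rebinding your global variable 't' in your innermost for loop (for t in b:). As soon as that loop starts, t no longer refers to the tree but to a string from b, so when fre calls t.search_nodes(...), it fails with this AttributeError.
If you print t before each call to your function, you will see that it has become a string. Renaming the loop variable so that it does not collide with the tree fixes the problem.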
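The effect can be reproduced without ete2. The following is a minimal sketch, not the asker's actual code: FakeTree is a hypothetical stand-in for the real tree, and code_a / code_b are illustrative loop names. It first shows how a module-level loop rebinds t, then runs the nested loops with names that leave t alone.

```python
# Hypothetical stand-in for the ete2 tree; only the name shadowing matters here.
class FakeTree(object):
    def search_nodes(self, name):
        return [name]


t = FakeTree()


def fre(code1, code2):
    # `t` is looked up in the global scope each time fre is called.
    return t.search_nodes(name=code1), t.search_nodes(name=code2)


a = ["A1...", "A2..."]
b = ["A....", "D...."]

# Buggy pattern: a module-level `for t in b` rebinds the global `t`.
tree = t
for t in b:
    pass
shadowed_type = type(t).__name__  # `t` is now the last string in b
t = tree  # restore the tree for the fixed version below

# Fixed pattern: loop names that do not collide with the tree.
results = []
for code_a in a:
    for code_b in b:
        results.append(fre(code_a, code_b))
```

Passing the tree to fre as an explicit argument instead of reading a global would also make this kind of collision harder to introduce.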

Related

TypeError: 'numpy.float64' object cannot be interpreted as an integer fake news detection

I am getting this error and have not been able to resolve it or find a solution online.
TypeError: 'numpy.float64' object cannot be interpreted as an integer
TypeError Traceback (most recent call last)
<ipython-input-10-33f2a17ec582> in <module>
20 print("Saving New CSV file")
21 if __name__=='__main__':
---> 22 dataSetExtraction()
<ipython-input-10-33f2a17ec582> in dataSetExtraction()
6 dfReal=processRealNewsDataFrame(dfReal)
7 dfCombine=[]
----> 8 for d in extractTopRealResultsForCrawling(dfReal):
9 print('len of datadrame :',d['URL'].size)
10 #d=d[:100]
<ipython-input-6-9dbfd3f21499> in extractTopRealResultsForCrawling(dfReal)
6 listOfIndex=[]
7 df=[]
----> 8 for i in range(0,loop):
9 listOfIndex.append(dfReal[i*10000:(i+1)*10000])
10 df+=[dfReal[i*10000:(i+1)*10000]]
TypeError: 'numpy.float64' object cannot be interpreted as an integer
This is the code giving the error. I have not been able to fix it. Please help me.
def extractTopRealResultsForCrawling(dfReal):
    print("Retrieve top 20000 Real news data")
    num=dfReal.size
    loop=num/10000
    listOfIndex=[]
    df=[]
    for i in range(0,loop):
        listOfIndex.append(dfReal[i*10000:(i+1)*10000])
        df+=[dfReal[i*10000:(i+1)*10000]]
    #print "length of dataframe array retrieved:",len(df[0])
    return df[:LEN]
The range function accepts only integer arguments. Here loop=num/10000 is a float, because / always performs true division in Python 3.
Here is a minimal example that reproduces (more or less) the problem:
>>> a = 2.0
>>> [i for i in range(a)]
Traceback (most recent call last):
File "<pyshell#15>", line 1, in <module>
[i for i in range(a)]
TypeError: 'float' object cannot be interpreted as an integer
You need to convert the value to an integer:
>>> [i for i in range(int(a))]
[0, 1]
In your code you should use:
for i in range(int(loop)):
Alternatively, you could do:
for i in range(0, num, 10000):
    listOfIndex.append(dfReal[i:i+10000])
    df+=[dfReal[i:i+10000]]
which avoids the division.
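As a self-contained illustration of both options, here is a small sketch that uses a plain list in place of the DataFrame. The helper name split_into_chunks and the sample sizes are made up for the example. Slicing a DataFrame by row position with dfReal[i:i+10000] follows the same pattern.

```python
def split_into_chunks(rows, chunk_size):
    """Return consecutive slices of `rows`, each at most `chunk_size` long."""
    # Stepping range() by chunk_size needs no division at all.
    return [rows[start:start + chunk_size]
            for start in range(0, len(rows), chunk_size)]


rows = list(range(25))
chunks = split_into_chunks(rows, 10)

# If the number of full chunks is needed, floor division keeps it an int.
full_chunks = len(rows) // 10
```

Note that the stepped range also keeps the final, shorter chunk, whereas looping over int(num/10000) full chunks would drop it.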

Getting the length of text in a dataframe in python

I have this DataFrame:
Text target
#Coronavirus is a cover for something else. #5... D
Crush the One Belt One Road !! \r\n#onebeltonf... B
RT #nickmyer: It seems to be, #COVID-19 aka #c... B
#Jerusalem_Post All he knows is how to destroy... B
#newscomauHQ Its gonna show us all. We will al... B
The Text column contains tweets. I am trying to get the length of each string in the Text column and store it in a new column of the DataFrame. I have tried this:
d = pd.read_csv('5gCoronaFinal.csv')
d['textlength'] = [len(t) for t in d['Text']]
But it keeps giving me this error:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-42-dabcab1de7b2> in <module>
----> 1 d['textlength'] = [len(t) for t in d['Text']]
<ipython-input-42-dabcab1de7b2> in <listcomp>(.0)
----> 1 d['textlength'] = [len(t) for t in d['Text']]
TypeError: object of type 'float' has no len()
I've tried converting t to integer like so:
d['textlength'] = [len(int(t)) for t in d['Text']]
but then it gives me this error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-43-9ae56e5f7912> in <module>
----> 1 d['textlength'] = [len(int(t)) for t in d['Text']]
<ipython-input-43-9ae56e5f7912> in <listcomp>(.0)
----> 1 d['textlength'] = [len(int(t)) for t in d['Text']]
ValueError: invalid literal for int() with base 10: '#Coronavirus is a cover for something else. #5g is being rolled out and they are expecting lots to...what? Die from #60ghz +. They look like they are to keep the cold in? #socialdistancing #covid19 #
I need some help thanks!
You can use the str accessor for vectorised string operations. In this case you can use str.split and str.len to count the words in each tweet (use str.len on its own if you want the number of characters):
df['Text_length'] = df.Text.str.split().str.len()
print(df)
Text target Text_length
0 #Coronavirus is a cover for something else. #5... D 8
1 Crush the One Belt One Road !! \r\n#onebeltonf... B 8
2 RT #nickmyer: It seems to be, #COVID-19 aka # B 9
3 #Jerusalem_Post All he knows is how to destroy B 8
4 #newscomauHQ Its gonna show us all. We will al B 9
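The original TypeError is what you would see if the Text column contains missing values, since pandas represents an empty cell as a float NaN. That is an assumption about the data, which is not shown in full. The sketch below reproduces the situation with a plain list and a hypothetical helper safe_len that guards against non-string values.

```python
# Sample values standing in for the Text column; the NaN mimics an empty cell.
texts = ["#Coronavirus is a cover", float("nan"), "Crush the One Belt One Road !!"]


def safe_len(value):
    """Character count for strings; 0 for anything else (such as float NaN)."""
    return len(value) if isinstance(value, str) else 0


# Character counts, tolerating the missing value.
char_counts = [safe_len(t) for t in texts]

# Word counts, with the same guard.
word_counts = [len(t.split()) if isinstance(t, str) else 0 for t in texts]
```

With the str accessor this guard is not needed, because missing values are passed through as NaN instead of raising.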

I don't know why I'm getting this error or index out of range. I'm using Python 3.0 in jupyter notebook

import random
from IPython.display import clear_output
dictionary = open("words_50000.txt","r")
dict_5000 = dictionary.readlines()
guess = random.choice(dict_5000).lower().strip('\n')
no_of_letters = len(guess)
game_str = ['-']*no_of_letters
only_length=[]
def word_guesser():
    only_length_words()
    print(dict_5000)
def only_length_words():
    for i in range(len(dict_5000)):
        if len(dict_5000[i].strip('\n'))!=no_of_letters:
            dict_5000.pop(i)
word_guesser()
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
in ()
20 dict_5000.pop(i)
21
---> 22 word_guesser()
in word_guesser()
11
12 def word_guesser():
---> 13 only_length_words()
14 print(dict_5000)
15
in only_length_words()
17 def only_length_words():
18 for i in range(len(dict_5000)):
---> 19 if len(dict_5000[i].strip('\n'))!=no_of_letters:
20 dict_5000.pop(i)
21
IndexError: list index out of range
The problem is that you are using pop(), which mutates the list, while you are also iterating over its indices. range(len(dict_5000)) is evaluated once, before any element is removed. Each pop() makes the list shorter than it was originally, but the for loop still runs up to the original length, and this causes the IndexError. Popping also shifts the remaining elements to the left, so the element that follows a removed one is skipped.
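A common way around this is to build a new list instead of removing items during the loop. This is a minimal sketch with a made-up word list in place of words_50000.txt:

```python
# Sample lines as readlines() would return them, newline included.
words = ["apple\n", "kiwi\n", "grape\n", "fig\n", "mango\n"]
no_of_letters = 5

# Keep only the words of the wanted length; nothing is removed mid-iteration.
same_length = [w for w in words if len(w.strip('\n')) == no_of_letters]

# To update the original list object in place, assign to a full slice.
words[:] = same_length
```

The slice assignment matters when other names refer to the same list, as with the global dict_5000 in the question.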

Error while calculating SSE: 'numpy.float64' object is not iterable

I am trying to calculate the sum of squared errors (SSE) with the code below:
def SSEadver(tv_train,radio_train,newsppr_train,y_train):
    y_train_predct = []
    sse_train = 0
    y_train_predct_srs = 0
    # Calculating the predicted sales values on training data
    for i in range(0,len(tv_train)):
        y_train_predct.append(2.8769666223179353 + (0.04656457* tv_train.iloc[i])+ (0.17915812*(radio_train.iloc[i])) + (0.00345046*(newsppr_train.iloc[i])))
    # ** Here I Convert y_train_predct's type List to Series, but still it is showing type as list**
    y_train_predct_srs = pd.Series(y_train_predct)
    # *** Due above converting not working here y_train_predct_srs.iloc[j]) is not working***
    # Now calculate SSE (sum of Squared Errors)
    for j in range (len(y_train)):
        sse_train += sum((y_train.iloc[j] - y_train_predct_srs.iloc[j])**2)
    return y_train_predct, y_train_predct_srs
sse_train = SSEadver(tv_train,radio_train,newsppr_train, y_train)
When I run this code I get this error:
TypeError Traceback (most recent call last)
<ipython-input-446-576e1af02461> in <module>()
20 return y_train_predct, y_train_predct_srs
21
---> 22 sse_train = SSEadver(tv_train,radio_train,newsppr_train, y_train)
23
<ipython-input-446-576e1af02461> in SSEadver(tv_train, radio_train, newsppr_train, y_train)
14 # Now calculate SSE (sum of Squared Errors)
15 for j in range (len(y_train)):
---> 16 sse_train += sum((y_train.iloc[j] - y_train_predct_srs.iloc[j])**2)
17
18
TypeError: 'numpy.float64' object is not iterable
Why am I getting this error? I am using Python 3.X.X
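I can't see all your code, but it looks like
y_train.iloc[j] and y_train_predct_srs.iloc[j]
are not lists but single numpy.float64 values, so their squared difference is a single value as well. The built-in sum() expects an iterable, which is why it raises this error. Check the types, and if they are single values, add the squared difference to sse_train directly, without calling sum().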
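To show the calculation on its own, here is a small sketch using plain Python lists. The function name sum_squared_errors and the sample numbers are invented for the example. Here sum() is applied once to the whole sequence of squared differences, not to each individual value.

```python
def sum_squared_errors(actual, predicted):
    """Sum of squared differences between two equally long sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    # sum() receives a generator, which is iterable; each term is a scalar.
    return sum((y - p) ** 2 for y, p in zip(actual, predicted))


actual = [3.0, 5.0, 7.0]
predicted = [2.5, 5.0, 8.0]
sse = sum_squared_errors(actual, predicted)
```

With two pandas Series that share the same index, the equivalent vectorised form would be ((y_train - y_train_predct_srs) ** 2).sum().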

Getting an error "The suffix tree string must not contain terminal symbol!"

I want to build a generalized suffix tree. I am using http://www.daimi.au.dk/~mailund/suffix_tree.html for this implementation. My code is as follows:
from suffix_tree import GeneralisedSuffixTree
s1 = u'abcd'
x = 36
listing = []
for i in range(x):
    listing.append(s1)

stree = GeneralisedSuffixTree(listing)
For x = 35 the code works fine, but for x = 36 or more I get the following error:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-36-e7b83eadb302> in <module>()
6 listing.append(s1)
7
----> 8 stree = GeneralisedSuffixTree(listing)
9
10 count = []
/home/darshan/anaconda/lib/python2.7/site-packages/suffix_tree.pyc in __init__(self, sequences)
113 self.sequences += [u'']
114
--> 115 SuffixTree.__init__(self,concatString)
116 self._annotateNodes()
117
/home/darshan/anaconda/lib/python2.7/site-packages/suffix_tree.pyc in __init__(self, s, t)
60 must not contain the special symbol $.'''
61 if t in s:
---> 62 raise "The suffix tree string must not contain terminal symbol!"
63 _suffix_tree.SuffixTree.__init__(self,s,t)
64
TypeError: exceptions must be old-style classes or derived from BaseException, not str
Exceptions are from this file https://github.com/Yacoby/suffix-tree-unicode/blob/master/suffix_tree.py
I don't understand why it is working for values x < 36 but not for other values.
Please help me to understand what is going on here.
