nltk python FrenchStemmer - python

I am trying to initialize FrenchStemmer:
stemmer = nltk.stem.FrenchStemmer('french')
and the error is:
AttributeError: 'module' object has no attribute 'FrenchStemmer'
Has anyone seen this error before?

That's because nltk.stem exposes no attribute named FrenchStemmer at the top level.
The French stemmer is provided through SnowballStemmer, and you can access it with
import nltk
stemmer = nltk.stem.SnowballStemmer('french')
or with
import nltk
stemmer = nltk.stem.snowball.FrenchStemmer()
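As a quick sanity check (assuming NLTK is installed; the Snowball stemmers need no corpus downloads), both spellings produce the same stemmer:

```python
import nltk

# The generic wrapper and the language-specific class are equivalent:
# SnowballStemmer('french') delegates to snowball.FrenchStemmer internally.
wrapper = nltk.stem.SnowballStemmer('french')
direct = nltk.stem.snowball.FrenchStemmer()

word = 'continuation'
print(wrapper.stem(word), direct.stem(word))  # same stem from both
```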

Related

AttributeError: module 'tensorflow' has no attribute 'read_file'

I am trying to run this code:
from pandas.io.parsers.readers import read_table
from tensorflow.python.ops.gen_io_ops import read_file
t_x, t_y = next(valid_gen)
But I always get this error:
AttributeError: module 'tensorflow' has no attribute 'read_file'.
I found many comments mentioning that the function has been moved into the tensorflow.io module. Can anyone help me resolve this issue?
Thanks
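For what it's worth, in TensorFlow 2.x the function is exposed as tf.io.read_file (tf.read_file was the 1.x name). A minimal sketch, assuming TensorFlow 2.x with eager execution and a throwaway temp file:

```python
import os
import tempfile

import tensorflow as tf

# tf.io.read_file reads a whole file into a scalar string (bytes) tensor.
path = os.path.join(tempfile.mkdtemp(), 'sample.txt')
with open(path, 'wb') as f:
    f.write(b'hello')

contents = tf.io.read_file(path)
print(contents.numpy())  # b'hello'
```

The import in the snippet above (from tensorflow.python.ops.gen_io_ops) reaches into private generated modules and is best avoided; the public tf.io path is the stable one.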

AttributeError: module 'databricks.koalas' has no attribute 'DateOffset'

I am working on replacing the pandas library with the Koalas library in my Python repo in VS Code. But the Koalas module does not seem to have a DateOffset() function like the one pandas has.
I tried this :
import databricks.koalas as ks
kdf["date_col_2"] = kdf["date_col_1"] - ks.DateOffset(months=cycle_info_gap)
It results in the error below:
AttributeError: module 'databricks.koalas' has no attribute 'DateOffset'
Is there any alternative for this in Koalas?

AttributeError: 'English' object has no attribute 'vocal'

I'm running this code with en_core_web_sm 2.2.5:
>>> import spacy
>>> nlp = spacy.load('en_core_web_sm', parser=False)
>>> print(nlp.vocal[u'fun'].similarity(nlp.vocal[u'humour']))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'English' object has no attribute 'vocal'
First of all, I think you meant vocab instead of vocal.
Second of all, you are trying to compute word-vector similarity, and that requires the model to ship with word vectors.
Finally, you are using the en_core_web_sm model, which doesn't include word vectors according to the official spaCy documentation here.
My suggestion is to use en_core_web_md instead. You can download it using the following command:
python -m spacy download en_core_web_md
And you can change your code to be:
>>> import spacy
>>> nlp = spacy.load('en_core_web_md', parser=False)
>>> nlp(u'fun').similarity(nlp(u'humour'))
0.43595678034197044

Problem loading a tfidf object with pickle

I have an issue with pickle:
in a previous job, I created an sklearn TfidfVectorizer object and saved it with pickle.
Here is the code I used to do it:
import pickle
import string
from nltk.stem import WordNetLemmatizer
from nltk.tokenize import word_tokenize
from nltk.corpus import stopwords
from sklearn.feature_extraction.text import TfidfVectorizer

def lemma_tokenizer(text):
    lemmatizer = WordNetLemmatizer()
    return [lemmatizer.lemmatize(token) for token in
            word_tokenize(text.replace("'", " "))]

punctuation = list(string.punctuation)
stop_words = set(stopwords.words("english") + punctuation + ['``', "''"] + ['doe', 'ha', 'wa'])
tfidf = TfidfVectorizer(input='content', tokenizer=lemma_tokenizer, stop_words=stop_words)
pickle.dump(tfidf, open(filename_tfidf, 'wb'))
I saw that if I want to load this tfidf object with pickle, I need to define the function "lemma_tokenizer" first.
So I created the following Python script, named 'loadtfidf.py', to load the tfidf object:
import pickle
from nltk.stem import WordNetLemmatizer
from nltk.tokenize import word_tokenize
def lemma_tokenizer(text):
    lemmatizer = WordNetLemmatizer()
    return [lemmatizer.lemmatize(token) for token in word_tokenize(text.replace("'", " "))]
tfidf = pickle.load(open('tfidf.sav', 'rb'))
If I run this script, the object loads correctly and everything goes well.
BUT then I created another Python script, named 'test.py', in the same directory as 'loadtfidf.py', where I simply try to import loadtfidf:
import loadtfidf
And when I run this one line, I get the following error:
AttributeError: Can't get attribute 'lemma_tokenizer' on <module '__main__' from '.../test.py'>
I really don't understand why... I don't even know what to try to fix this error... Can you help me fix it?
Thank you in advance for your help !
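This error is pickle's function-by-reference behaviour. pickle does not store the body of lemma_tokenizer inside tfidf.sav; it records only where to find it, as module + name. Because the vectorizer was dumped from a script run directly, the function was recorded as living in __main__. When test.py is the entry point, test.py is __main__, it has no lemma_tokenizer, and the lookup fails while loadtfidf is being imported. A minimal stdlib sketch of the by-reference behaviour:

```python
import os.path
import pickle

# pickle serializes a function by reference: it records the defining
# module and the qualified name, and re-imports both at load time.
payload = pickle.dumps(os.path.basename)
restored = pickle.loads(payload)
assert restored is os.path.basename  # same object, found by re-import

# The module name is baked into the pickled bytes; a function defined
# in a directly-run script is recorded under '__main__', which is a
# different module for every entry-point script.
print(os.path.basename.__module__)  # the module pickle will look in
```

The fix, then, is to move lemma_tokenizer into its own module (e.g. a tokenizers.py next to your scripts; the name is just an illustration), import it from that module in both the dumping script and loadtfidf.py, and re-dump the vectorizer, since the existing tfidf.sav already has __main__ baked into it.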

'module' object has no attribute 'word_tokenize'

This is my first time playing with Python. I am trying to copy and run some code available online to understand how it works, but some of it is not working. Here is an example of a problem I faced when I tried to copy a program that does part-of-speech tagging in Python using NLTK.
import nltk
import re
import time
exampleArray = ['The incredibly intimidating NLP scares people who are sissies.']
def processLanguage():
    try:
        for item in exampleArray:
            tokenized = nltk.word_tokenized(item)
            tagged = nltk.pos_tag(tokenized)
    except Exception, e:
        print str(e)

processLanguage()
The problem is that when I run this code, I get these messages:
'module' object has no attribute 'word_tokenize'
'module' object has no attribute 'word_tokenized'
Can you please tell me how to solve this?
