I am trying to translate words from a Pandas dataframe column and get an error from the googletrans.Translator() class. It works normally with single words or phrases. Could it be an environment issue?
Any help or suggestions would be much appreciated.
import pandas as pd
from googletrans import Translator
translator = Translator()
df = pd.DataFrame({'Spanish':['piso','cama']})
df['English'] = df['Spanish'].apply(translator.translate, src='es', dest='en').apply(getattr, args=('text',))
Output:
AttributeError: 'Translator' object has no attribute 'raise_Exception'
Hi, this error occurred because an exception was raised at runtime. To see the underlying error, insert the line below before translating:
translator.raise_Exception = True
If you then get an error like the one below
Exception: Unexpected status code "429" from ['translate.google.com']
it means "Too many requests". Hopefully you won't get this error; if you do, you have to upgrade your account. To avoid the error, please refer to this answer:
Source 1
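If you do hit the 429, a common workaround is to throttle the requests yourself. A minimal sketch reusing the dataframe from the question (the one-second delay is an illustrative guess, not an official limit):

import time
import pandas as pd
from googletrans import Translator

translator = Translator()
df = pd.DataFrame({'Spanish': ['piso', 'cama']})

def translate_slowly(word):
    # pause between requests to reduce the chance of a 429 rate-limit response
    time.sleep(1)
    return translator.translate(word, src='es', dest='en').text

df['English'] = df['Spanish'].apply(translate_slowly)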
I have a data frame with tweets that I want to translate entirely into English. The problem is that the source column contains both English and Dutch tweets. I used the following code to try to translate this column:
from googletrans import Translator
translator = Translator()
df_posts['text_en'] = df_posts['text'].apply(lambda x: translator.translate(x, dest='en').text)
I tried some other code from Stack Overflow as well, but nothing worked. I did update the translator package, so I no longer get the common NoneType error. The error I get looks like this:
[screenshot: error that needs to be solved]
The source data in the "text" column from the df_posts data frame looks like this (note that this part only shows the English text that doesn't need to be translated):
[screenshot: data that needs translating]
Thanks for your time.
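For a column that mixes languages, one defensive pattern is to translate row by row and catch per-row failures, so a single bad tweet doesn't abort the whole column. A minimal sketch, assuming df_posts exists as in the question (falling back to the original text on failure is an assumption of mine, not part of the question):

from googletrans import Translator

translator = Translator()

def safe_translate(text):
    # translate one tweet; keep the original text if the call fails
    try:
        return translator.translate(text, dest='en').text
    except Exception:
        return text

df_posts['text_en'] = df_posts['text'].apply(safe_translate)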
I have a dataframe in pyspark in Databricks that reads JSON. The data from the source does not always have the same structure; sometimes the 'emailAddress' field is missing, which causes the error "org.apache.spark.sql.AnalysisException: cannot resolve ...".
I have tried to solve it by applying a try/except block like this:
try:
    df_json = df_json.select("responseID", "surveyID", "surveyName", "timestamp", "customVariables.Id_Cliente", "timestamp", "responseSet", "emailAddress")
except ValueError:
    None
But it does not work; it returns the same error that I mentioned.
I have also tried another alternative, but without results:
if 'Id_Cliente' in s_fields:
    try:
        df_json = df_json.select("responseID", "surveyID", "surveyName", "timestamp", "customVariables.Id_Cliente", "timestamp", "responseSet", "emailAddress")
    except ValueError:
        df_json = df_json.select("responseID", "surveyID", "surveyName", "timestamp", "customVariables.Id_Cliente", "timestamp", "responseSet")
Could someone give me an idea for handling this situation? I need the notebook run to stop when the field is not found in the structure, and otherwise (when it does find the emailAddress field) to continue processing.
Thanks in advance.
Regards.
You're catching ValueError, but the exception being raised is AnalysisException; that's why it doesn't work.
from pyspark.sql.utils import AnalysisException

try:
    df.select('xyz')
except AnalysisException:
    print(123)
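Since the goal is to branch on whether the field exists rather than recover from a failure, another option is to inspect the schema up front. A minimal sketch, assuming emailAddress is a top-level field of df_json as in the question:

# check the top-level columns before selecting, so the missing field never raises
if 'emailAddress' in df_json.columns:
    df_json = df_json.select("responseID", "surveyID", "surveyName", "timestamp",
                             "customVariables.Id_Cliente", "responseSet", "emailAddress")
else:
    # stop the notebook run when the field is absent, as the asker wants
    raise ValueError("emailAddress field not found in the incoming JSON schema")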
I am running the following Python code using pandas' parse_dates but get a syntax error. Since I am still struggling to find a proper package for my Atom editor to help with syntax-error detection, I would appreciate any help I can get on this.
marketing = pd.read_csv.('/Users/name/Folder/marketing.csv', parse_dates=['date_served', 'date_subscribed', 'date_caneled'])
Take out the dot before the first paren.
marketing = pd.read_csv('/Users/name/Folder/marketing.csv', parse_dates=['date_served', 'date_subscribed', 'date_caneled'])
1) I tried the code from the official book on the nltk package, 'Natural Language Processing with Python', but it gives an error:
import nltk
dt = nltk.DiscourseTester(['A student dances', 'Every student is a person'])
print(dt.readings())
I get the error
NLTK was unable to find the mace4 file!
Use software specific configuration paramaters or set the PROVER9 environment variable.
2) I tried another piece of code from the book:
import nltk
from nltk import load_parser
parser = load_parser('drt.fcfg', logic_parser=nltk.DrtParser())
trees = parser.parse('Angus owns a dog'.split())
print(trees[0].node['sem'].simplify())
I got the error
AttributeError: module 'nltk' has no attribute 'DrtParser'
3) I tried the code below:
from nltk.sem import cooper_storage as cs
sentence = 'every girl chases a dog'
trees = cs.parse_with_bindops(sentence, grammar='storage.fcfg')
semrep = trees[0].label()
cs_semrep = cs.CooperStore(semrep)
print(cs_semrep.core)
for bo in cs_semrep.store:
    print(bo)
cs_semrep.s_retrieve(trace=True)
for reading in cs_semrep.readings:
    print(reading)
It partly worked, but it still gave the error below:
AttributeError: 'CooperStore' object has no attribute 'core'
4) I tried another piece of code from the book:
from nltk import load_parser
parser = load_parser('simple-sem.fcfg', trace=0)
sentence = 'Angus gives a bone to every dog'
tokens = sentence.split()
trees = parser.parse(tokens)
for tree in trees:
    print(tree.node['SEM'])
I got the below error:
NotImplementedError: Use label() to access a node label.
Please let me know what to do. Are these features deprecated? I have heard that many features of nltk are. Please suggest a way forward for all the features mentioned above.
I found the answer: I was following the code from the printed book instead of NLTK's online version of the book, which is kept up to date. Following the updated version solved the problems.
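For example, the NotImplementedError in snippet 4 comes from an API change: NLTK 3 replaced Tree.node with the Tree.label() method, exactly as the error message hints. A sketch of the updated loop, assuming the rest of snippet 4 stays the same:

from nltk import load_parser

parser = load_parser('simple-sem.fcfg', trace=0)
tokens = 'Angus gives a bone to every dog'.split()
for tree in parser.parse(tokens):
    # NLTK 3: tree.label() replaces the removed tree.node attribute
    print(tree.label()['SEM'])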
Recently I encountered the following problem:
I have an array of strings:
name in ['Mueller', 'Meier', 'Schulze', 'Schmidt']
I face problems with its encoding in Python 3:
name.encode('cp1252')
Here is the full snippet:
from numpy.testing import assert_array_equal

target_name = [name.encode('cp1252')
               for name in ['Mueller', 'Meier', 'Schulze', 'Schmidt']]
assert_array_equal(arr['surname'], target_name)
And here is the point where I also get the error. The error states:
Fail in test..... dtype='|S7'
I've been searching for a solution for some time; what I found so far is that I need to change the encoding. I applied:
name = np.char.encode('cp1252')
However, I get another type of error with it.
Could someone help me track down this error?
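For what it's worth, np.char.encode expects the array as its first positional argument and the encoding second, so calling it with only the encoding string will fail. A minimal sketch with a hypothetical stand-in for arr['surname']:

import numpy as np
from numpy.testing import assert_array_equal

# hypothetical stand-in for arr['surname'] from the question
surnames = np.array(['Mueller', 'Meier', 'Schulze', 'Schmidt'])

# np.char.encode(array, encoding) encodes every element to bytes (dtype '|S...')
encoded = np.char.encode(surnames, 'cp1252')

target_name = [name.encode('cp1252')
               for name in ['Mueller', 'Meier', 'Schulze', 'Schmidt']]
assert_array_equal(encoded, np.array(target_name))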