LanguageDetectionError in transliterate library | Can't detect English language? - python

I have a small problem; let me try to explain it.
I use the transliterate library in my Django project. A user can enter English (Latin) or Russian (Cyrillic) letters in a field. If the user enters Russian words, they are converted to Latin letters, but if the user enters English words I get the following error:
LanguageDetectionError: Can't detect language for the text "document" given.
I use this code:
transliterate.translit(field_value, reversed=True)
I also noticed that this library does not seem able to detect English at all, is that right?
transliterate.detect_language(field_value) returns None when the user enters an English word.
My aim is to transliterate only if the user wrote a Russian word, and to leave the value untouched if the user wrote an English word. What would you advise?
I have just found a library that could help me detect the language: https://pypi.python.org/pypi/langdetect
Has anyone worked with this library?

Could you try detecting English first and, if the text is not English, assume it is Russian? I ran some Russian news articles through the Python code linked below, and it clearly detected that they are not English.
The code is pretty simple and can be easily applied.
isEnglish github
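A minimal sketch of the "transliterate only when the input is Russian" idea, assuming that the presence of any Cyrillic character is a good enough signal for this field. The names contains_cyrillic, to_latin_if_cyrillic and transliterate_fn are hypothetical and are not part of the transliterate library; the real transliteration call is passed in by the caller.

```python
import re

# Matches any character in the basic Cyrillic Unicode block (U+0400-U+04FF).
CYRILLIC_RE = re.compile(r"[\u0400-\u04FF]")


def contains_cyrillic(text):
    """Return True if the text has at least one Cyrillic character."""
    return bool(CYRILLIC_RE.search(text))


def to_latin_if_cyrillic(text, transliterate_fn):
    """Apply transliterate_fn only to text that contains Cyrillic letters.

    transliterate_fn is a hypothetical callable supplied by the caller,
    for example a thin wrapper around transliterate.translit.
    """
    if contains_cyrillic(text):
        return transliterate_fn(text)
    return text  # Latin-only input is passed through untouched
```

In the Django project, transliterate_fn could be something like lambda s: transliterate.translit(s, "ru", reversed=True). As far as I know, passing the language code explicitly should also skip the library's auto-detection step, which is where LanguageDetectionError is raised, but that is worth verifying against the installed version.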

Related

Hindi english code mix language translation to english or hindi

I want to transliterate Hindi-English code-mixed text (popularly known as Hinglish) to English. I tried the Google Transliterate API, but it does not give correct results. Is there any alternative? For example: "kya haal hai?" should become "how are you?"

How to check if some words from several text files are written in Hindi, and what files are written in English?

I have several files written in Hindi or in English, and I have to check which ones are in English and which ones are in Hindi. I cannot open them one by one because there are more than 2,000 text files.
So, is it possible for Python to show me which files are written in Hindi and which are written in English?
I just want to know if it's possible and how to do it. I'll write the code myself if anyone gives me an idea of how this can be done.
For example, I was thinking of putting a few basic words in a dictionary and writing a function that, if it finds 3 such words, concludes that the whole text is in English.
pattern = r'<h3\x20.*(\b(the|you|which|view|because|here|have)\b.*){5,}.*</h3>'
Does anyone have a better idea?
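One alternative to a word list, sketched under the assumption that the Hindi files are written in Devanagari script and saved as UTF-8: count the letters of each script and pick the majority. The function names classify_script and classify_files are hypothetical, and romanized Hindi (Hindi typed in Latin letters) would be labelled English by this approach.

```python
from pathlib import Path


def classify_script(text):
    """Label text 'hindi', 'english' or 'unknown' by counting letters per script."""
    # Devanagari occupies the Unicode block U+0900-U+097F.
    devanagari = sum(1 for ch in text if "\u0900" <= ch <= "\u097F")
    # ASCII letters stand in for English here.
    latin = sum(1 for ch in text if ch.isascii() and ch.isalpha())
    if devanagari == 0 and latin == 0:
        return "unknown"
    return "hindi" if devanagari > latin else "english"


def classify_files(folder, pattern="*.txt"):
    """Map each matching file name in folder to its detected script label."""
    return {
        path.name: classify_script(path.read_text(encoding="utf-8", errors="ignore"))
        for path in Path(folder).glob(pattern)
    }
```

Calling classify_files on the folder that holds the text files returns a dictionary of file names and labels, so there is no need to open the 2,000 files by hand.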

Google Translate API in Python

I know about the Google Translate API in Python. However, in my data frame only a few entries are in Hindi. How do I recognize the language of these records and then translate them to English?
Basically, I want to do the following:
if the entry is not Hindi, continue; otherwise translate it from Hindi to English.
I am using this - https://pypi.org/project/googletrans/
I then realized that the default target language is English: the translate command automatically converts other languages to English, and text that is already in English stays as it is.

Check if input is a proper noun

I am making my own MadLibs game and have come across a problem. When a user is asked to enter a noun, for example, but enters a verb or adverb instead, I want my program to pick up on this and ask them to enter a different word, since the word does not match the criteria. How do I do this? This is what I have so far:
while True:
    name1 = input("A male name (protagonist): ")
    if name1.endswith(('ly', 's')):
        print("Sorry mate, this doesn't seem to be a proper noun. Try it again.")
        continue
    break
But I would like it to come out along the lines of this:
A male name (protagonist): sandwich
Sorry mate, this doesn't seem to be a proper noun. Try it again.
A male name (protagonist): Bob
How do I make it recognise nouns, adverbs etc. without me manually typing it in?
What you are looking for is natural language processing (NLP), mate. You have to identify which part of speech (POS) the word is, and then you can tag it. NLP is a vast and complex field, so try researching it on your own and you might come up with a solution. I don't think there is a direct, built-in way to do this in Python, since Python is a general-purpose programming language. However, you can use tools that tag parts of speech, such as TreeTagger, and integrate them into your application.

Can some one explain how transliteration works?

I am new to programming, and I am trying to understand transliteration - like the Google Input Tools that will allow the user to type from one language to another language.
How does transliteration work? Specifically, if I am translating from English to Hindi or English to Russian, do I need to incorporate a dictionary of words for the English, Hindi and Russian languages?
Does anyone know of any tutorials showing how to write the code for transliteration? I have tried searching, but no luck.
Also, does the code have to be in JavaScript/jQuery (client-side code)? My project is Python/Django. Can I write the transliteration code in Python/Django?
Thanks.
Direct dictionary-to-dictionary automatic translation produces poor results because of differences in grammar and the presence of idiomatic sentences. The starting point in Python, in my experience, should be the NLTK (Natural Language Toolkit) libraries and tutorials.
For a working example, you can start from these:
Machine Translation using babelize_shell() in NLTK
Translating human languages in Python
Google is your friend
Bing is your friend
Whether you use JavaScript/jQuery depends on the UI you are planning. Maybe you want to trigger an automatic translation after a few key presses, or on the onblur or onchange event of an input tag, but that is not relevant to the translation itself.
Translating is also really resource-consuming, so I would discourage you from doing it inside a Django view. My suggestion is not to reinvent the wheel and to use an existing API such as Google's or Bing's.
I found that a better search term is "Input Method Editor" (IME) rather than "transliteration".
There is a project on GitHub that deals with IMEs and transliteration: https://github.com/wikimedia/jquery.ime
I hope this helps someone.
The typical way of implementing transliteration is to use a mapping dictionary. An example of this can be seen in the mapping.py file for the CyrTranslit Python package.
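To make the mapping-dictionary idea concrete, here is a small sketch. The table below is a deliberately incomplete, illustrative subset of a Russian-to-Latin mapping and is not copied from CyrTranslit; the names CYRILLIC_TO_LATIN and transliterate_text are hypothetical.

```python
# Illustrative subset of a Russian Cyrillic -> Latin table; a real table
# covers the whole alphabet and follows a chosen romanization standard.
CYRILLIC_TO_LATIN = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "e",
    "ж": "zh", "з": "z", "и": "i", "к": "k", "л": "l", "м": "m",
    "н": "n", "о": "o", "п": "p", "р": "r", "с": "s", "т": "t",
    "у": "u", "ф": "f", "х": "kh", "ц": "ts", "ч": "ch", "ш": "sh",
}


def transliterate_text(text, mapping=CYRILLIC_TO_LATIN):
    """Replace each mapped character; leave unmapped characters unchanged."""
    out = []
    for ch in text:
        lower = ch.lower()
        if lower in mapping:
            latin = mapping[lower]
            # Preserve the capitalization of the source letter.
            out.append(latin.capitalize() if ch.isupper() else latin)
        else:
            out.append(ch)
    return "".join(out)
```

No word dictionary is involved: the lookup is per character, which is why transliteration is much cheaper than translation and can run in Python on the server side. Text that is already in Latin letters passes through unchanged because none of its characters are in the table.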
Word translation uses a database to convert an English word into a Hindi word.
Some apps are based on this concept like:
English to Hindi Dictionary
