How do you code a translator with language user can choose


I'm creating a web app that takes an image, reads the text inside it, and translates that text into another language. However, right now you can't change the target language on the website. How would I do this?
Right now the source language is English and the target language is German (which is working). However, I've created a dictionary of all the languages and language codes supported by Google Translate, and I've put it in a dropdown form. How do I link the input from this back to Python and make it the target language?
PYTHON
def Lang_target():
    language_targ = {
        'af': 'Afrikaans', 'sq': 'Albanian', 'ar': 'Arabic', 'az': 'Azerbaijani',
        'be': 'Belarusian', 'bn': 'Bengali', 'ca': 'Catalan', 'zh-CN': 'Chinese Simplified',
        'zh-TW': 'Chinese Traditional', 'hr': 'Croatian', 'cs': 'Czech', 'da': 'Danish',
        'nl': 'Dutch', 'en': 'English', 'eo': 'Esperanto', 'et': 'Estonian',
        'tl': 'Filipino', 'fi': 'Finnish', 'fr': 'French', 'gl': 'Galician',
        'ka': 'Georgian', 'de': 'German', 'el': 'Greek', 'gu': 'Gujarati',
        'ht': 'Haitian Creole', 'iw': 'Hebrew', 'hi': 'Hindi', 'hu': 'Hungarian',
        'is': 'Icelandic', 'ga': 'Irish', 'it': 'Italian', 'id': 'Indonesian',
        'ja': 'Japanese', 'kn': 'Kannada', 'ko': 'Korean', 'la': 'Latin',
        'lv': 'Latvian', 'lt': 'Lithuanian', 'mk': 'Macedonian', 'ms': 'Malay',
        'mt': 'Maltese', 'no': 'Norwegian', 'fa': 'Persian', 'pl': 'Polish',
        'pt': 'Portuguese', 'ro': 'Romanian', 'ru': 'Russian', 'sr': 'Serbian',
        'sk': 'Slovak', 'es': 'Spanish', 'sl': 'Slovenian', 'sw': 'Swahili',
        'sv': 'Swedish', 'ta': 'Tamil', 'te': 'Telugu', 'th': 'Thai',
        'tr': 'Turkish', 'uk': 'Ukrainian', 'ur': 'Urdu', 'vi': 'Vietnamese',
        'cy': 'Welsh', 'yi': 'Yiddish',
    }
    return language_targ

@app.route('/selectImage')
def selectImage():
    fn = image_name()
    language_target = Lang_target()
    return render_template("selectImage.html", image_name=image_name, fn=fn, language_target=language_target)

@app.route('/getfileHelper', methods=['GET', 'POST'])
def getfileHelper():
    if request.method == 'POST':
        file = request.files['imgfile']
        filename = secure_filename(file.filename)  # from werkzeug import secure_filename
        # no file selected: go back to the selectImage.html page
        if file.filename == '':
            flash("No file selected. Please select an image file")
            return render_template('selectImage.html')
        texts = detect_text('static/images/' + filename)
        text_translations = []  # empty list for dictionaries of original text and translation
        for text in texts:
            translate_client = translate.Client()
            translate_text = text.description
            source = 'en'
            target = 'de'
            translation = translate_client.translate(translate_text, source_language=source, target_language=target)
            text_translations.append({'text': translate_text, 'translation': translation['translatedText']})
            db_append(filename, translate_text, translation['translatedText'])
        return render_template('home.html', filename=filename, text_translations=text_translations)
HTML
<form>
  <select>
    {% for x in language_target %}
    <option>{{ language_target[x] }}</option>
    {% endfor %}
  </select>
  <input type="submit" value="Submit">
</form>

If you add a name attribute to the select (for example, name="lang_target"), you can retrieve the value of the dropdown from the request in request.args["lang_target"] for GET (as you did not specify POST). I am not sure which application route performs the translations, but you should direct the request to that route.
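As a rough sketch of the Python side, assuming the select is given name="lang_target" and each option also gets value="{{ x }}" so that the language code (not the display name) is submitted: the helper below, pick_target, is a hypothetical name, and the shortened SUPPORTED dictionary stands in for the full one returned by Lang_target(). It checks the submitted code against the supported languages and falls back to a default.

```python
# Shortened stand-in for the dictionary returned by Lang_target()
SUPPORTED = {'de': 'German', 'fr': 'French', 'es': 'Spanish'}


def pick_target(params, supported=SUPPORTED, default='de'):
    """Return the submitted language code if it is supported, else the default."""
    code = params.get('lang_target', default)
    return code if code in supported else default


# In a Flask view this would be called with request.args (GET form)
# or request.form (POST form); plain dicts stand in for them here.
chosen = pick_target({'lang_target': 'fr'})
unknown = pick_target({'lang_target': 'xx'})
absent = pick_target({})
print(chosen, unknown, absent)
```

The returned code could then replace the hard-coded target = 'de' in the route that performs the translation.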

Related

Trying to return all nested json dictionary values that contain a substring python

I want to return a list of all of the text that comes after the nested 'en' key and before the comma that separates that key from the rest of the dictionary
dictionary = {'title': 'a',
'labels': {'label0': {'en': 'Statement',
'ca': 'dog1',
'bd': 'ေမးခြန္း ၁၀'},
'label1': {'en': 'Hello how are you',
'ca': 'cat6979309',
'bd': 'turkey89'},
'option0': {'en': 'No',
'bd': 'turkey232',
'ca': 'dog2'},
'option1': {'en': 'Absoluelty not',
'bd': 'turkey3',
'ca': 'dog3'},
'option2': {'en': 'Neutral ', 'bd': 'snake3', 'ca':'bat1'},
'option3': {'en': 'Somewhat Disagree',
'bd': 'turkey4',
'ca': 'dog4'},
'option4': {'en': 'For Sure',
'bd': 'turkey5',
'ca': 'dog5'}}}
I have tried this, which isn't working. The goal is to have the function return something like sentences = ['Statement', 'Hello how are you', 'No', 'Absolutely not','Neutral'...]
def json(dic):
    sentences = []
    for (key, value) in dic.items():
        if "en" in value:
            sentences.append(value)
    return sentences

You will need to recurse through the nested structure. Here is a generator function that does so:
def get_key(data, key):
    if key in data:
        yield data[key]
    for k, v in data.items():
        if isinstance(v, dict):
            yield from get_key(v, key)
>>> list(get_key(dictionary, "en"))
['Statement', 'Hello how are you', 'No', 'Absoluelty not', 'Neutral ',
'Somewhat Disagree', 'For Sure']
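If the JSON also contains lists of dictionaries, the same idea can be extended. The following is only a sketch under that assumption: iter_key is a hypothetical name, and the sample data is made up for illustration rather than taken from the question.

```python
def iter_key(data, key):
    """Yield every value stored under `key` in nested dicts and lists."""
    if isinstance(data, dict):
        if key in data:
            yield data[key]
        # descend into every value, whatever its type
        for value in data.values():
            yield from iter_key(value, key)
    elif isinstance(data, list):
        for item in data:
            yield from iter_key(item, key)
    # any other type (str, int, ...) has nothing to search


# Made-up sample mixing nested dicts and a list of dicts
sample = {
    'labels': {'label0': {'en': 'Statement', 'ca': 'dog1'}},
    'options': [{'en': 'No', 'ca': 'dog2'}, {'en': 'For Sure'}],
}
found = list(iter_key(sample, 'en'))
print(found)
```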

Nested Python Object to CSV

I looked up "nested dict" and "nested list", but neither method worked.
I have a python object with the following structure:
[{
'id': 'productID1', 'name': 'productname A',
'option': {
'size': {
'type': 'list',
'name': 'size',
'choices': [
{'value': 'M'},
]}},
'variant': [{
'id': 'variantID1',
'choices':
{'size': 'M'},
'attributes':
{'currency': 'USD', 'price': 1}}]
}]
What I need to output is a CSV file in the following flattened structure:
id, productname, variantid, size, currency, price
productID1, productname A, variantID1, M, USD, 1
productID1, productname A, variantID2, L, USD, 2
productID2, productname A, variantID3, XL, USD, 3
I tried this solution: Python: Writing Nested Dictionary to CSV
or this one: From Nested Dictionary to CSV File
I got rid of the [] around and within the data and, for example, used the code snippet from the second one, adapted to my needs. In reality I can't get rid of the [] because that's simply the format I get when calling the API.
with open('productdata.csv', 'w', newline='', encoding='utf-8') as output:
    writer = csv.writer(output, delimiter=';', quotechar='"', quoting=csv.QUOTE_NONNUMERIC)
    for key in sorted(data):
        value = data[key]
        if len(value) > 0:
            writer.writerow([key, value])
        else:
            for i in value:
                writer.writerow([key, i, value])
but the output is like this:
"id";"productID1"
"name";"productname A"
"option";"{'size': {'type': 'list', 'name': 'size', 'choices': {'value': 'M'}}}"
"variant";"{'id': 'variantID1', 'choices': {'size': 'M'}, 'attributes': {'currency': 'USD', 'price': 1}}"
Can anyone help me out, please?
Thanks in advance.

list indices must be integers not strings
The following presents a visual example of a python list:
0 carrot.
1 broccoli.
2 asparagus.
3 cauliflower.
4 corn.
5 cucumber.
6 eggplant.
7 bell pepper
0, 1, 2 are all "indices".
"carrot", "broccoli", etc... are all said to be "values"
Essentially, a python list is a machine which has integer inputs and arbitrary outputs.
Think of a python list as a black-box:
A number, such as 5, goes into the box.
you turn a crank handle attached to the box.
Maybe the string "cucumber" comes out of the box
You got an error: TypeError: list indices must be integers or slices, not str
There are various solutions.
Convert Strings into Integers
Convert the string into an integer.
listy_the_list = ["carrot", "broccoli", "asparagus", "cauliflower"]
string_index = "2"
integer_index = int(string_index)
element = listy_the_list[integer_index]
So yeah, that works as long as your string indices look like whole numbers (e.g. "456" or "7").
The integer class constructor, int(), is not very smart.
For example, x = int("3.0") or x = int("three") will raise a ValueError.
Leading and trailing white-space is fine, though: int("3 ") returns 3, so calling string.strip() first is not needed.
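A minimal sketch of that conversion with error handling, assuming the index arrives as a string: to_index is a hypothetical helper name, and the list is the one from the example above.

```python
def to_index(text):
    """Convert a string such as "2" to an int, or return None if int() rejects it."""
    try:
        return int(text)
    except ValueError:
        return None


listy_the_list = ["carrot", "broccoli", "asparagus", "cauliflower"]
for raw in ["2", "two", "2.5"]:
    index = to_index(raw)
    if index is None:
        print(f"{raw!r} is not a usable index")
    else:
        print(listy_the_list[index])
```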
Use a Container which Allows Keys to be Strings
Long ago, before electronic computers existed, there were various types of containers in the world:
cookie jars
muffin tins
cardboard boxes
glass jars
steel cans.
back-packs
duffel bags
closets/wardrobes
brief-cases
In computer programming there are also various types of "containers"
You do not have to use a list as your container, if you do not want to.
There are containers where the keys (AKA indices) are allowed to be strings, instead of integers.
In python, the standard container which is like a list, but where the keys/indices can be strings, is a dictionary
thisdict = {
"make": "Ford",
"model": "Mustang",
"year": 1964
}
thisdict["make"] == "Ford"
If you want to index into a container using strings, instead of integers, then use a dict, instead of a list
The following is an example of a python dict which has state names as input and state abbreviations as output:
us_state_abbrev = {
'Alabama': 'AL',
'Alaska': 'AK',
'American Samoa': 'AS',
'Arizona': 'AZ',
'Arkansas': 'AR',
'California': 'CA',
'Colorado': 'CO',
'Connecticut': 'CT',
'Delaware': 'DE',
'District of Columbia': 'DC',
'Florida': 'FL',
'Georgia': 'GA',
'Guam': 'GU',
'Hawaii': 'HI',
'Idaho': 'ID',
'Illinois': 'IL',
'Indiana': 'IN',
'Iowa': 'IA',
'Kansas': 'KS',
'Kentucky': 'KY',
'Louisiana': 'LA',
'Maine': 'ME',
'Maryland': 'MD',
'Massachusetts': 'MA',
'Michigan': 'MI',
'Minnesota': 'MN',
'Mississippi': 'MS',
'Missouri': 'MO',
'Montana': 'MT',
'Nebraska': 'NE',
'Nevada': 'NV',
'New Hampshire': 'NH',
'New Jersey': 'NJ',
'New Mexico': 'NM',
'New York': 'NY',
'North Carolina': 'NC',
'North Dakota': 'ND',
'Northern Mariana Islands':'MP',
'Ohio': 'OH',
'Oklahoma': 'OK',
'Oregon': 'OR',
'Pennsylvania': 'PA',
'Puerto Rico': 'PR',
'Rhode Island': 'RI',
'South Carolina': 'SC',
'South Dakota': 'SD',
'Tennessee': 'TN',
'Texas': 'TX',
'Utah': 'UT',
'Vermont': 'VT',
'Virgin Islands': 'VI',
'Virginia': 'VA',
'Washington': 'WA',
'West Virginia': 'WV',
'Wisconsin': 'WI',
'Wyoming': 'WY'
}
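Applied to a list of dictionaries, such as the product data in the question, the difference looks roughly like this. The sketch below uses a shortened, made-up version of that data: indexing the list with a string raises the error, while indexing the list with an integer first and then the dictionary with a string works.

```python
# Shortened stand-in for the list-of-dicts structure in the question
products = [{'id': 'productID1', 'name': 'productname A'}]

try:
    products['id']          # a list cannot be indexed with a string
except TypeError as error:
    message = str(error)
    print(message)

first_id = products[0]['id']  # integer index into the list, string key into the dict
print(first_id)
```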

I could actually iterate over this list and create my own sublist, e.g. a list of variants
data = [{
'id': 'productID1', 'name': 'productname A',
'option': {
'size': {
'type': 'list',
'name': 'size',
'choices': [
{'value': 'M'},
]}},
'variant': [{
'id': 'variantID1',
'choices':
{'size': 'M'},
'attributes':
{'currency': 'USD', 'price': 1}}]
},
{'id': 'productID2', 'name': 'productname B',
'option': {
'size': {
'type': 'list',
'name': 'size',
'choices': [
{'value': 'XL', 'salue':'XXL'},
]}},
'variant': [{
'id': 'variantID2',
'choices':
{'size': 'XL', 'size2':'XXL'},
'attributes':
{'currency': 'USD', 'price': 2}}]
}
]
new_list = {}
for item in data:
    new_list.update(id=item['id'])
    new_list.update(name=item['name'])
    for variant in item['variant']:
        new_list.update(varid=variant['id'])
        for vchoice in variant['choices']:
            new_list.update(vsize=variant['choices'][vchoice])
        for attribute in variant['attributes']:
            new_list.update(vprice=variant['attributes'][attribute])
    for option in item['option']['size']['choices']:
        new_list.update(osize=option['value'])
print(new_list)
But the output is always the last item of the iteration, because I always overwrite new_list with update().
{'id': 'productID2', 'name': 'productname B', 'varid': 'variantID2', 'vsize': 'XXL', 'vprice': 2, 'osize': 'XL'}

here's the final solution which worked for me:
data = [{
'id': 'productID1', 'name': 'productname A',
'variant': [{
'id': 'variantID1',
'choices':
{'size': 'M'},
'attributes':
{'currency': 'USD', 'price': 1}},
{'id':'variantID2',
'choices':
{'size': 'L'},
'attributes':
{'currency':'USD', 'price':2}}
]
},
{
'id': 'productID2', 'name': 'productname B',
'variant': [{
'id': 'variantID3',
'choices':
{'size': 'XL'},
'attributes':
{'currency': 'USD', 'price': 3}},
{'id':'variantID4',
'choices':
{'size': 'XXL'},
'attributes':
{'currency':'USD', 'price':4}}
]
}
]
import csv

products = []  # one flat dict per variant
for item in data:
    for variant in item['variant']:
        dic = {}
        dic.update(ProductID=item['id'])
        dic.update(Name=item['name'].title())
        dic.update(ID=variant['id'])
        dic.update(size=variant['choices']['size'])
        dic.update(Price=variant['attributes']['price'])
        products.append(dic)
keys = products[0].keys()
with open('productdata.csv', 'w', newline='', encoding='utf-8') as output_file:
    dict_writer = csv.DictWriter(output_file, keys, delimiter=';', quotechar='"', quoting=csv.QUOTE_NONNUMERIC)
    dict_writer.writeheader()
    dict_writer.writerows(products)
with the following output:
"ProductID";"Name";"ID";"size";"Price"
"productID1";"Productname A";"variantID1";"M";1
"productID1";"Productname A";"variantID2";"L";2
"productID2";"Productname B";"variantID3";"XL";3
"productID2";"Productname B";"variantID4";"XXL";4
which is exactly what i wanted.

googletrans stopped working with detecting all languages as English

The problem I have here is googletrans API suddenly stopped working, just like this:
result = translator.translate('祝您新年快乐', src='zh-cn', dest='en')
result.text
Output:
'祝您新年快乐'
It should return English, but it just printed the original text. Then I checked what was going wrong. I found that googletrans detects all languages as English, like this:
print(translator.detect('이 문장은 한글로 쓰여졌습니다.'))
print(translator.detect('祝您新年快乐'))
Output:
Detected(lang=en, confidence=1)
Detected(lang=en, confidence=1)
Finally, I checked whether those languages are available in the library. They are.
print(googletrans.LANGUAGES)
output:
{'af': 'afrikaans', 'sq': 'albanian', 'am': 'amharic', 'ar': 'arabic', 'hy': 'armenian', 'az': 'azerbaijani', 'eu': 'basque', 'be': 'belarusian', 'bn': 'bengali', 'bs': 'bosnian', 'bg': 'bulgarian', 'ca': 'catalan', 'ceb': 'cebuano', 'ny': 'chichewa', 'zh-cn': 'chinese (simplified)', 'zh-tw': 'chinese (traditional)', 'co': 'corsican', 'hr': 'croatian', 'cs': 'czech', 'da': 'danish', 'nl': 'dutch', 'en': 'english', 'eo': 'esperanto', 'et': 'estonian', 'tl': 'filipino', 'fi': 'finnish', 'fr': 'french', 'fy': 'frisian', 'gl': 'galician', 'ka': 'georgian', 'de': 'german', 'el': 'greek', 'gu': 'gujarati', 'ht': 'haitian creole', 'ha': 'hausa', 'haw': 'hawaiian', 'iw': 'hebrew', 'he': 'hebrew', 'hi': 'hindi', 'hmn': 'hmong', 'hu': 'hungarian', 'is': 'icelandic', 'ig': 'igbo', 'id': 'indonesian', 'ga': 'irish', 'it': 'italian', 'ja': 'japanese', 'jw': 'javanese', 'kn': 'kannada', 'kk': 'kazakh', 'km': 'khmer', 'ko': 'korean', 'ku': 'kurdish (kurmanji)', 'ky': 'kyrgyz', 'lo': 'lao', 'la': 'latin', 'lv': 'latvian', 'lt': 'lithuanian', 'lb': 'luxembourgish', 'mk': 'macedonian', 'mg': 'malagasy', 'ms': 'malay', 'ml': 'malayalam', 'mt': 'maltese', 'mi': 'maori', 'mr': 'marathi', 'mn': 'mongolian', 'my': 'myanmar (burmese)', 'ne': 'nepali', 'no': 'norwegian', 'or': 'odia', 'ps': 'pashto', 'fa': 'persian', 'pl': 'polish', 'pt': 'portuguese', 'pa': 'punjabi', 'ro': 'romanian', 'ru': 'russian', 'sm': 'samoan', 'gd': 'scots gaelic', 'sr': 'serbian', 'st': 'sesotho', 'sn': 'shona', 'sd': 'sindhi', 'si': 'sinhala', 'sk': 'slovak', 'sl': 'slovenian', 'so': 'somali', 'es': 'spanish', 'su': 'sundanese', 'sw': 'swahili', 'sv': 'swedish', 'tg': 'tajik', 'ta': 'tamil', 'te': 'telugu', 'th': 'thai', 'tr': 'turkish', 'uk': 'ukrainian', 'ur': 'urdu', 'ug': 'uyghur', 'uz': 'uzbek', 'vi': 'vietnamese', 'cy': 'welsh', 'xh': 'xhosa', 'yi': 'yiddish', 'yo': 'yoruba', 'zu': 'zulu'}
Can someone help by explaining why this problem happened all of a sudden? It worked just 30 minutes ago. It's weird that it stopped working without anything being changed.

According to the documentation googletrans, https://pypi.org/project/googletrans/, "is an unofficial library using the web API of translate.google.com".
They specifically state:
Due to limitations of the web version of google translate, this API does not guarantee that the library would work properly at all times (so please use this library if you don’t care about stability)
and suggest using the official Google Translate API instead.
For further reading I highly suggest the following sources:
GoogleTrans Python not translating
https://pypi.org/project/googletrans/
https://py-googletrans.readthedocs.io/en/latest/
If you decide to switch to the official API check out: https://cloud.google.com/translate/docs

Array into json formatting

I am trying to format a list of cities into json to put it on a firebase database.
I am really new to coding and very lost. Working in python but just trying to get this text formatted.
My list of cities
cities = ['Abu Dhabi', 'Albuquerque', 'Amsterdam', 'Anchorage', 'Antalya', 'Aspen', 'Athens', 'Atlanta', 'Austin', 'Bali', 'Baltimore', 'Bangalore', 'Bangkok', 'Barcelona', 'Beijing', 'Berlin', 'Berlin', 'Bogota', 'Bora Bora', 'Boston', 'Brisbane', 'Brussels', 'Buffalo', 'Burbank', 'Cairo', 'Cancun', 'Cape Town', 'Changcha', 'Charlotte', 'Chengdu', 'Chicago', 'Chongqing', 'Cincinnati']
I need to format them like this
},
"Seattle" : {
"city_name" : "Seattle"
},
"Houston" : {
"city_name" : "Houston"
}
What is the best way to go about doing this?

You can use a simple dict comprehension:
cities = ['Chicago', 'Charlotte', 'Barcelona']
print({city: {'city_name': city} for city in cities})
Which prints:
{'Chicago': {'city_name': 'Chicago'}, 'Charlotte': {'city_name': 'Charlotte'}, 'Barcelona': {'city_name': 'Barcelona'}}
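Note that print shows a Python dict with single quotes, which is not JSON text. To get the double-quoted format shown in the question, the dict can be passed through the standard json module. A short sketch, using the two city names from the question's example:

```python
import json

cities = ['Seattle', 'Houston']

# Build the nested dict, then serialise it as indented JSON text
payload = {city: {'city_name': city} for city in cities}
text = json.dumps(payload, indent=2)
print(text)
```

Because dict keys are unique, a city that appears twice in the list (such as 'Berlin' in the question) ends up as a single entry.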

Problem with nested dictionaries that changes certain values for different adverts

I'm trying to scrape links on a website. When I follow the link it can be either a motor advert or an ordinary advert.
The keys that I need to scrape for both types of adverts are the same:
For the Motor adverts - data = dict_keys(['header', 'description', 'currency', 'price', 'wanted', 'id', 'photos', 'section', 'age', 'spotlight', 'year', 'state', 'friendlyUrl', 'keyInfo', 'seller', 'displayAttributes', 'countyTown', 'breadcrumbs'])
For the Ordinary adverts - data = dict_keys(['header', 'description', 'currency', 'price', 'wanted', 'id', 'photos', 'section', 'age', 'spotlight', 'year', 'state', 'friendlyUrl', 'keyInfo', 'seller', 'displayAttributes', 'countyTown', 'breadcrumbs'])
In the Motor adverts data the 'breadcrumbs' key gives me
[{'name': 'motor',
'displayName': 'Cars & Motor',
'id': 1003,
'title': 'Cars Motorbikes Trucks Caravans and More',
'subdomain': 'www',
'containsSubsections': True,
'xtn2': 101},
{'name': 'cars',
'displayName': 'Cars',
'id': 11,
'title': 'Cars',
'subdomain': 'cars',
'containsSubsections': False,
'xtn2': 142}]
while in the Ordinary adverts 'breadcrumbs' gives me
[{'name': 'all',
'displayName': 'All Sections',
'id': 2066,
'title': 'See Everything For Sale',
'subdomain': 'www',
'containsSubsections': True,
'xtn2': 100},
{'name': 'household',
'displayName': 'House & DIY',
'id': 1001,
'title': 'House & DIY',
'subdomain': 'www',
'containsSubsections': True,
'xtn2': 105},
{'name': 'furniture',
'displayName': 'Furniture & Interiors',
'id': 3,
'title': 'Furniture',
'subdomain': 'www',
'containsSubsections': True,
'xtn2': 105},
{'name': 'kitchenappliances',
'displayName': 'Kitchen Appliances',
'id': 1089,
'title': 'Kitchen Appliances',
'subdomain': 'www',
'containsSubsections': False,
'xtn2': 105}]
I have tried to get the Motor data by calling the 'xtn2' key and value with data['breadcrumbs'][0]['xtn2'] == 101: and giving it a name 'motordata'
if data['breadcrumbs'][0]['xtn2'] == 101:
    motordata = data
    if motordata:
        motors = motordata['breadcrumbs'][0]['name']
        views = motordata['views']
        title = motordata['header']
        Adcounty = motordata['county']
        itemId = motordata['id']
        sellerId = motordata['seller']['id']
        sellerName = motordata['seller']['name']
        adCount = motordata['seller']['adCount']
        lifetimeAds = motordata['seller']['adCountStats']['lifetimeAdView']['value']
        currency = motordata['currency']
        price = motordata['price']
        adUrl = motordata['friendlyUrl']
        adAge = motordata['age']
        spotlight = motordata['spotlight']
and the Ordinary data with elif data['breadcrumbs'][0]['xtn2'] == 100: with a name 'Allotherads'
elif data['breadcrumbs'][0]['xtn2'] == 100:
    Allotherads = alldata
    if Allotherads:
        views = Allotherads['views']
        title = Allotherads['header']
        itemId = Allotherads['id']
        Adcounty = Allotherads['county']
        # Adtown = alldata['countyTown']
        sellerId = Allotherads['seller']['id']
        sellerName = Allotherads['seller']['name']
        adCount = Allotherads['seller']['adCount']
        lifetimeAds = Allotherads['seller']['adCountStats']['lifetimeAdView']['value']
        currency = Allotherads['currency']
        price = Allotherads['price']
        adUrl = Allotherads['friendlyUrl']
        adAge = Allotherads['age']
        spotlight = Allotherads['spotlight']
        topSectionName = Allotherads['xitiAdData']['topSectionName']
        xtn2 = Allotherads['breadcrumbs'][2]['xtn2']
        subSection = Allotherads['breadcrumbs'][2]['displayName']
but it doesn't work. It just scrapes the Ordinary adverts but not the Motor adverts.
Where am I going wrong?

Can't you just do (if multiple motor dicts are possible):
motordata = [x for x in data.get('breadcrumbs') if x.get('name') == "motor"]
or (if only one motordata is possible):
motordata = next(iter([x for x in data.get('breadcrumbs') if x.get('name') == "motor"]))
next(iter(...)) returns the first match here, the same as [0] at the end; both raise an error if nothing matches.
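A variation that may be worth considering, sketched here with a made-up, shortened data dict: passing a generator expression and a default to next() stops at the first match and returns None instead of raising when no breadcrumb has the requested name. The helper name first_breadcrumb is hypothetical.

```python
# Shortened stand-in for one scraped advert
data = {'breadcrumbs': [
    {'name': 'motor', 'xtn2': 101},
    {'name': 'cars', 'xtn2': 142},
]}


def first_breadcrumb(data, name):
    """Return the first breadcrumb dict with the given name, or None if absent."""
    crumbs = data.get('breadcrumbs', [])
    return next((b for b in crumbs if b.get('name') == name), None)


motor = first_breadcrumb(data, 'motor')
missing = first_breadcrumb(data, 'household')
print(motor)
print(missing)
```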
