My list comprehension returns the error "List index out of range" if the value does not exist in the list.
Objective:
Check a three-letter code for a country given by a variable (country) and transform the code into a two-letter code by looking up a list of tuples (COUNTRIES)
Constant:
# Two letter code, three letter code, full name
COUNTRIES = [
('US', 'USA', 'United States'),
('DE', 'DEU', 'Germany'),
....
]
Code:
country = 'EUR'
# Check that country is not None and has 3 characters (I have multiple checks for two letter codes too)
if country is not None and len(country) == 3:
    country = [code2 for code2, code3, name in COUNTRIES if code3 == country][0]
If I only include a list with three letter codes USA and DEU, the code works fine. If I set the variable "country" to the fictitious code "EUR", which is not a valid country code, then I get the "List index out of range" error.
How can I return None instead of breaking the program? The variable country will be used later on again.
I don't think List comprehensions are a good choice here. They are good when you want to turn one list into another list, which you don't really want to do here. A better approach would be a regular for loop with a return here.
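A minimal sketch of the loop-with-return idea, assuming the lookup is wrapped in a function. The name to_code2 is hypothetical, and COUNTRIES holds only the two entries shown in the question:

```python
# Two letter code, three letter code, full name (only the entries shown in the question)
COUNTRIES = [
    ('US', 'USA', 'United States'),
    ('DE', 'DEU', 'Germany'),
]


def to_code2(country):
    """Return the two-letter code for a three-letter code, or None if unknown.

    Hypothetical helper name, not part of the original code.
    """
    for code2, code3, name in COUNTRIES:
        if code3 == country:
            return code2
    return None  # no entry matched


print(to_code2('USA'))  # US
print(to_code2('EUR'))  # None
```

If you would rather keep it to one expression, next((code2 for code2, code3, name in COUNTRIES if code3 == country), None) behaves the same way.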
However, my personal approach would be to transform the list lookups into dict lookups instead:
COUNTRIES_LUT = {}
for code2, code3, name in COUNTRIES:
    COUNTRIES_LUT[code2] = name
    COUNTRIES_LUT[code3] = name
At the end of that, you can just use COUNTRIES_LUT[your_str] as expected, or COUNTRIES_LUT.get(your_str) if you want None instead of a KeyError for an unknown code.
If you generate this lookup table once at the start, it also has the bonus of being faster, since you don't need to loop through every element of the list on every lookup.
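Since the question asks for the two-letter code, and for None when the code is unknown, here is a sketch of the same dict idea adapted to that. The name CODE3_TO_CODE2 is an assumption, and COUNTRIES holds only the two entries shown in the question:

```python
# Two letter code, three letter code, full name (only the entries shown in the question)
COUNTRIES = [
    ('US', 'USA', 'United States'),
    ('DE', 'DEU', 'Germany'),
]

# Build the lookup table once: three-letter code -> two-letter code.
# CODE3_TO_CODE2 is a hypothetical name.
CODE3_TO_CODE2 = {code3: code2 for code2, code3, name in COUNTRIES}

country = 'EUR'
# dict.get returns None (its default) for a missing key instead of raising KeyError.
country = CODE3_TO_CODE2.get(country)
print(country)  # None
```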
I am new to python and am looking to analyze the S&P500 by sector. I have assigned symbols to all 11 sectors in the S&P with the first two looking like:
Financials = ['AFL', 'AIG', .... 'ZION']
Energy = ['APA', 'BKR', ... 'SLB']
I then create a new list (of lists) which might look like:
sectors_to_analyze = [Financials, Energy] or [Materials, ConsumerStaples]
My analysis is working perfectly, but I want to retrieve the names "Financials" and "Energy" to attach to the data produced, and I cannot figure out how to do it other than making the name part of the list (Financials = ['Financials','AFL', 'AIG', .... 'ZION']).
Can someone please point me in the right direction? Thank you.
Perhaps you could use a dictionary
sectors = {
'Financials':['AFL', ...],
# rest of your lists
}
Then you can iterate over the whole dict and access both names and data associated with those names
for key, value in sectors.items():
    print(f'Sector name: {key}, List: {value}')
I think you want to use a dictionary instead of a "list of lists" (also called a two dimensional list). You could then loop over the dictionary almost the same way. Here's some example code:
Financials = ['AFL', 'AIG', 'ZION']
Energy = ['APA', 'BKR', 'SLB']
sectors = {"Financials": Financials, "Energy": Energy}
# in this loop, sector is the sector's name, and symbols is the sector's
# list
for sector in sectors:
    symbols = sectors[sector]
    # ...
    # do some analysis
    # ...
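To analyze only some sectors, one option is to keep a list of sector names rather than a list of lists. A small sketch, assuming the symbol lists contain only the tickers shown in the question and using a placeholder "analysis" that just counts symbols:

```python
# Only the symbols shown in the question; the real lists are longer.
sectors = {
    "Financials": ['AFL', 'AIG', 'ZION'],
    "Energy": ['APA', 'BKR', 'SLB'],
}

# Choose sectors by name instead of by list object.
sectors_to_analyze = ["Financials", "Energy"]

results = {}
for sector in sectors_to_analyze:
    symbols = sectors[sector]
    # Placeholder analysis (an assumption): count the symbols in the sector.
    results[sector] = len(symbols)

print(results)  # {'Financials': 3, 'Energy': 3}
```

Because the name is the dictionary key, it is available wherever the data is used, without storing it inside the symbol list.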
By using the function I made, I want to print the records of all the employees using only one loop.
Record = [["ali", "Mazen", "Fida", "Nader", "Majd"], ["Tutor", "IT", "Manager", "PR", "Clerk"]]
def Netsal(salary):
    bonus=0
    if salary<2000:
        bonus = 300
    elif salary>=4000:
        bonus = 100
    else:
        bonus = 200
    return salary + bonus
Sal=[2200,1600,4000,2000,1400]
holder=[]
for base_salary in Sal:
    total_salary=Netsal(base_salary)
    holder.append(total_salary)
for total_salary in Netsal(base_salary):
    print(Record, total_salary)
I'm kinda lost when it comes to printing because I need to have an output like:
Mazen - IT : 1900
So the two sublists in the record need to also be mapped to the new salary list with the bonus.
Plus I need to loop over the lists and print all the values in the format I showed above.
Before you run the last loop to print, you have three lists:
Record[0] is a list containing names
Record[1] is a list containing departments
holder is a list containing net salaries.
zip() can take any number of iterables (a list is an iterable), and return tuples containing one element from each iterable. The for statement can unpack this tuple into the same number of variables.
for person, department, salary in zip(Record[0], Record[1], holder):
    print(f"{person} - {department} : {salary}")
Your code has an unnecessary loop though. You could remove the loop that populates holder and incorporate that logic in the zip loop I showed above like so, using the Sal list in zip():
for person, department, base_salary in zip(Record[0], Record[1], Sal):
    net_salary = Netsal(base_salary)
    print(f"{person} - {department} : {net_salary}")
In the print() statements, I use f-strings for string interpolation.
My teacher gave us an exercise on processing a list. It involves ordering names alphabetically, but he never taught us this, so I have no clue how to do it.
The problem is creating a loop that will output only the names that come before "Thor" in the alphabet from the names list. This is what I tried:
names = ["Peter", "Bruce", "Steve", "Tony", "Natasha", "Clint", "Wanda", "Hope",
         "Danny", "Carol"]
thor = []
index = 1
for i in names:
    if names <= "Thor":
        thor.append ()
        index +=1
print(thor)
Python is very well suited for this kind of thing. for loops in Python, for example, do not need a numeric index to traverse a list. So, for what you are trying to do, it could be as simple as:
names = ["Peter", "Bruce", "Steve", "Tony", "Natasha", "Clint", "Wanda", "Hope", "Danny", "Carol"]
# Go through the list of names one at a time
for one_name in names:
    # Check to see if the current name is "less than" Thor
    if one_name < "Thor":
        # If you found a name like that, print it out
        print(one_name)
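If you want to collect the names rather than print them one by one, the same comparison fits in a list comprehension. This is only a sketch; before_thor is a made-up variable name:

```python
names = ["Peter", "Bruce", "Steve", "Tony", "Natasha", "Clint", "Wanda", "Hope", "Danny", "Carol"]

# Keep every name that sorts before "Thor". String comparison is
# character by character, so "Tony" is excluded ('o' comes after 'h').
before_thor = [name for name in names if name < "Thor"]

print(before_thor)
# ['Peter', 'Bruce', 'Steve', 'Natasha', 'Clint', 'Hope', 'Danny', 'Carol']

# sorted() returns a new list in alphabetical order.
print(sorted(before_thor))
# ['Bruce', 'Carol', 'Clint', 'Danny', 'Hope', 'Natasha', 'Peter', 'Steve']
```

Note that the comparison is case-sensitive (uppercase letters sort before lowercase ones), so compare name.lower() with "thor" if the list could contain mixed case.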
I am new to Python, so I wanted to know whether the code I wrote for printing items inside a nested dictionary in sorted alphabetical order is optimal, especially for checking whether a key exists. Let me know if there is a better solution.
# Code
import operator
locations = {'North America': {'USA': ['Mountain View']}}
locations['Asia'] = {'India':['Bangalore']}
locations['North America']['USA'].append('Atlanta')
locations['Africa'] = {'Egypt':['Cairo']}
locations['Asia']['China'] = ['Shanghai']
# TODO: Print a list of all cities in the USA in alphabetic order.
if 'North America' in locations:
    for key,value in locations['North America'].items():
        if 'USA' in key:
            for item in sorted(value):
                print(f"{item}")
# TODO: Print all cities in Asia, in alphabetic order, next to the name of the country
if 'Asia' in locations:
    for key,value in sorted(locations['Asia'].items(),key=operator.itemgetter(1)):
        print(f"{value[0]} {key}")
Make these two lines your code:
print('\n'.join(sorted([x for i in locations.get('North America', {}).values() for x in i])))
print('\n'.join(sorted([x + ' ' + k for k,v in locations.get('Asia', {}).items() for x in v])))
Which outputs:
Atlanta
Mountain View
Bangalore India
Shanghai China
Dictionaries in Python are not sorted by key (since Python 3.7 they preserve insertion order, but that is not alphabetical order). Given that, I will try to help solve for your actual problem of checking for a key in a dictionary.
locations = {'North America': {'USA': ['Mountain View']}}
locations['Asia'] = {'India':['Bangalore']}
locations['North America']['USA'].append('Atlanta')
locations['Africa'] = {'Egypt':['Cairo']}
locations['Asia']['China'] = ['Shanghai']
# First we clean up all the loops.
# You are just checking if the key is in the dictionary with all the loops
if 'North America' in locations and 'USA' in locations['North America']:
    for item in sorted(locations['North America']['USA']):
        print(f"{item}")
if 'Asia' in locations:
    # Since dictionaries are not sorted by key, we will make a list of the countries to order
    countries = []
    for k in locations['Asia'].keys():
        countries.append(k)
    # Using a similar loop to the one to print cities
    for country in sorted(countries):
        # Adding a dimension for cities.
        for city in sorted(locations['Asia'][country]):
            print(f"{country}:{city}")
The Asia branch loops through each country in alphabetical order and prints each country with its cities, also in alphabetical order.
Dictionaries are used because they give direct lookup of any specific key. For testing existence, you don't need to search. The downside is they are not sorted.
You iterate through all countries in North America when you already know you want USA, so ... don't do that.
print(sorted(locations['North America']['USA']))
This is better because the second-level lookup is O(1), whereas your loop is O(n), where n is the number of nations in that particular continent. Admittedly n isn't large here, which is why they say don't optimize if you don't need to. But maybe you have a lot more data and the geography sample data was just filler.
To test for existence of a key, use "in" or write a try-except for KeyError. Python is one of the few languages where it's often better to just handle the exception.
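Both styles side by side, as a rough sketch. The 'Europe' and 'France' keys are deliberately absent from the sample data, and the variable names are made up for illustration:

```python
locations = {
    'North America': {'USA': ['Mountain View', 'Atlanta']},
    'Asia': {'India': ['Bangalore'], 'China': ['Shanghai']},
}

# Option 1: test with "in" before indexing.
if 'North America' in locations and 'USA' in locations['North America']:
    usa_cities = sorted(locations['North America']['USA'])
else:
    usa_cities = []

# Option 2: just attempt the lookup and handle the KeyError.
try:
    europe_cities = sorted(locations['Europe']['France'])
except KeyError:
    # Either 'Europe' or 'France' is missing; fall back to an empty list.
    europe_cities = []

print(usa_cities)     # ['Atlanta', 'Mountain View']
print(europe_cities)  # []
```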
To print all the cities in Asia, you will have to combine all the lists in Asia and sort the result (see the question "Combining two sorted lists in Python").
You can do better by maintaining the city lists in sorted order all the time, using the bisect module. Inserting or removing in a sorted list is less work than sorting it each time, assuming you look at the list more often than you add and remove cities.
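A short sketch of keeping one city list sorted with bisect.insort. 'Seattle' is an invented extra entry, added only to show an insertion in the middle of the list:

```python
import bisect

cities = []
for city in ['Mountain View', 'Atlanta', 'Seattle']:
    # insort finds the position by binary search and inserts there,
    # so the list stays sorted after every insertion.
    bisect.insort(cities, city)

print(cities)  # ['Atlanta', 'Mountain View', 'Seattle']

# Removal: locate the item by binary search, then delete it if present.
index = bisect.bisect_left(cities, 'Mountain View')
if index < len(cities) and cities[index] == 'Mountain View':
    del cities[index]

print(cities)  # ['Atlanta', 'Seattle']
```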
If you maintain sorted lists, you can efficiently get the sorted merge with heapq.merge (https://docs.python.org/3.0/library/heapq.html#heapq.merge), although sadly you lose the nation name doing that.
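One possible way to keep the country name through the merge, offered as an assumption rather than something the answer proposes, is to merge (city, country) tuples instead of bare city names. This relies on each per-country list already being sorted:

```python
import heapq

asia = {'India': ['Bangalore'], 'China': ['Shanghai']}

# Pair each city with its country. Each inner list must already be sorted
# by city, which heapq.merge requires of its inputs.
per_country = [
    [(city, country) for city in cities]
    for country, cities in asia.items()
]

# heapq.merge lazily yields the tuples in sorted order across all inputs.
for city, country in heapq.merge(*per_country):
    print(f"{city} {country}")
# Bangalore India
# Shanghai China
```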
I have three lists: (1) treatments, (2) medicine name and (3) medicine code symbol. I am trying to identify the respective medicine code symbol for each of 14,700 treatments. My current approach is to identify if any name in (2) is "in" (1), and then return the corresponding (3). However, I get back an arbitrary list (of the correct length) of medicine code symbols corresponding to the 14,700 treatments. Code for the method I've written is below:
import pandas
codes = pandas.read_csv('Codes.csv', dtype=str)
codes_list = codes.values.tolist()
names = pandas.read_csv('Names.csv', dtype=str)
names_list = names.values.tolist()
treatments = pandas.read_csv('Treatments.csv', dtype=str)
treatments_list = treatments.values.tolist()
matched_codes_list = list(range(len(treatments_list)))
for i in range(len(treatments_list)):
    for j in range(len(names_list)):
        if names_list[j] in treatments_list[i]:
            matched_codes_list[i]=codes_list[j]
print(matched_codes_list)
Any suggestions for where I am going wrong would be much appreciated!
I can't tell what you are expecting. You should replace the xxx_list code with examples instead, since you don't seem to have any problems with the csv reading.
Let's suppose you did that, and your result looks like this.
codes_list = ['shark', 'panda', 'horse']
names_list = ['fin', 'paw', 'hoof']
assert len(codes_list) == len(names_list)
treatments_list = ['tape up fin', 'reverse paw', 'stand on one hoof', 'pawn affinity maneuver', 'alert wing patrol']
It sounds like you are trying to determine the 'code' for each 'treatment', assuming that the number of codes and names are the same (and indicate some mapping). You plan to use the presence of the name to determine the code.
We can zip together the names and codes lists to avoid using indexes there, and we can iterate over the treatment list directly instead of using indexes, for Pythonic readability:
matched_codes_list = []
for treatment in treatments_list:
    matched_codes = []
    for name, code in zip(names_list, codes_list):
        if name in treatment:
            matched_codes.append(code)
    matched_codes_list.append(matched_codes)
this would give something like
assert matched_codes_list == [
['shark'], # 'tape up fin'
['panda'], # 'reverse paw'
['horse'], # 'stand on one hoof'
['shark', 'panda'], # 'pawn affinity maneuver'
[], # 'alert wing patrol'
]
Note that the method used to do this is quite slow (and will probably give false positives, see the 4th entry): you traverse the text of every treatment description once for each name/code pair.
You can use a dictionary like lookup = dict(zip(names_list, codes_list)), or itertools.izip on Python 2 (in Python 3 the built-in zip is already lazy), for minor gains. Otherwise something more clever might be needed, perhaps splitting treatments into a set containing words, or mapping words into multiple codes.
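A rough sketch of the word-splitting idea, using the made-up example data from above. It assumes names are single words, treatments are separated by whitespace, and case matches exactly; real data may need more normalization:

```python
codes_list = ['shark', 'panda', 'horse']
names_list = ['fin', 'paw', 'hoof']
treatments_list = ['tape up fin', 'reverse paw', 'stand on one hoof',
                   'pawn affinity maneuver', 'alert wing patrol']

# name -> code, built once so each word is a single dictionary lookup.
lookup = dict(zip(names_list, codes_list))

matched_codes_list = []
for treatment in treatments_list:
    # Match whole words only, so 'pawn' no longer matches 'paw'
    # and 'affinity' no longer matches 'fin'.
    words = treatment.split()
    matched_codes_list.append([lookup[word] for word in words if word in lookup])

print(matched_codes_list)
# [['shark'], ['panda'], ['horse'], [], []]
```

Each treatment is now scanned once, regardless of how many names there are, and the substring false positives in the 4th entry disappear.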