Comparing list items and tuples

Comparing list items and tuples - python

In Python 3, I'm trying to create a program which takes input from a user as 3 digit codes and converts them into items in a list. It then compares these items with the first(the 3 digit code) part of a tuple in a list of tuples and prints the whole tuple.
import shares
portfolio_str=input("Please list portfolio: ")
portfolio_str= portfolio_str.replace(' ','')
portfolio_str= portfolio_str.upper()
portfolio_list= portfolio_str.split(',')
print(portfolio_list)
print()
print('{:<6} {:<20} {:>8}'.format('Code', 'Name', 'Price'))
data=shares.EXCHANGE_DATA
for (code, name, share_value) in data:
if code == i in portfolio_list:
print('{:<6} {:<20} {:>8.2f}'.format(code, name, share_value))
else:
print("Failure")
As you can see I'm using a module called shares containing a list of tuples called EXCHANGE_DATA which is set out like this:
EXCHANGE_DATA = [('AIA', 'Auckair', 1.50),
('AIR', 'Airnz', 5.60),
('AMP', 'Amp',3.22),
('ANZ', 'Anzbankgrp', 26.25),
('ARG', 'Argosy', 12.22),
('CEN', 'Contact', 11.22),
('CNU', 'Chorus',3.01),
('DIL', 'Diligent', 5.3),
('DNZ', 'Dnz Property', 2.33),
('EBO', 'Ebos', 1.1),
An exemplar input would be:
AIA, AMP, ANZ
The corresponding output would be:
Code Name Price
AIA Auckair 1.50
AMP Amp 3.22
ANZ Anzbankgrp 26.25
I'm just stuck on the for and/or if statements which I think I need.

Your issue is this here:
if code == i in portfolio_list:
This doesn't make sense in Python. in checks if a given value is contained in the list, so this checks if i is in portfolio_list, then checks if code is equal to True or False (whatever i in portfolio_list returned. What you want is simply:
if code in portfolio_list:
Note that if portfolio_list could be long, it might be worth making it a set, as checking for membership in a set is significantly more efficient for large amounts of data.
Your syntax appears to be a mashup of different methodologies. You might have meant:
if any(code == i for i in portfolio_list):
However, as this is directly equivalent to code in portfolio_list, but more verbose and inefficient, it's not a good solution.

Related

Masking the Zip Codes

I'm taking a course and I need to solve the following assignment:
"In this part, you should write a for loop, updating the df_users dataframe.
Go through each user, and update their zip code, to Safe Harbor specifications:
If the user is from a zip code for the which the “Geographic Subdivision” is less than equal to 20,000, change the zip code in df_users to ‘0’ (as a string)
Otherwise, zip should be only the first 3 numbers of the full zip code
Do all this by directly updating the zip column of the df_users DataFrame
Hints:
This will be several lines of code, looping through the DataFrame, getting each zip code, checking the geographic subdivision with the population in zip_dict, and setting the zip_code accordingly.
Be very aware of your variable types when working with zip codes here."
Here you can find all the data necessary to understand the context:
https://raw.githubusercontent.com/DataScienceInPractice/Data/master/
assignment: 'A4'
data_files: user_dat.csv, zip_pop.csv
After cleaning the data from user_dat.csv leaving only the columns: 'age', 'zip' and 'gender', and creating a dictionary from zip_pop.csv that contains the population of the first 3 digits from all the zipcodes; I wrote this code:
# Loop through the dataframe's to get each zipcode
for zipcode in df_users['zip']:
# check if the zipcode's 3 first numbers from the dataframe, correspond to a population of more or less than 20.000 people
if zip_dict[zipcode[:len(zipcode) - 2]] <= 20000:
# if less, change zipcode value to string zero.
df_users.loc[df_users['zip'] == zipcode, 'zip'] = '0'
else:
# If more, preserve only the first 3 digits of the zipcode.
df_users.loc[df_users['zip'] == zipcode, 'zip'] = zipcode[:len(zipcode) - 2]
This code works halfways and I don't understand why.
It changes the zipcode to 0 if the population is less than 20.000 people, and also changes the first zipcodes (up until the ones that start with '078') but then it returns this error message:
KeyError Traceback (most recent call last)
/var/folders/95/4vh4zhc1273fgmfs4wyntxn00000gn/T/ipykernel_44758/1429192050.py in < module >
1 for zipcode in df_users['zip']:
----> 2 if zip_dict[zipcode[:len(zipcode) - 2]] <= 20000:
3 df_users.loc[df_users['zip'] == zipcode, 'zip'] = '0'
4 else:
5 df_users.loc[df_users['zip'] == zipcode, 'zip'] = str(zipcode[:len(zipcode) - 2])
KeyError: '0'
I get that the problem is in the last line of code, because I've been doing every line at a time and each of them worked, until I put that last one. And if I just print the zipcodes instead of that last line, it also works!
Can anyone can help me understand why my code is wrong?

You're modifying a collection of values (i.e. df_users['zip']) whilst you're iterating over it. This is a common anti pattern. If a loop is absolutely required, then you could consider iterating over df_users['zip'].unique() instead. That creates a copy of all the unique zip codes, solving your current error, and it means that you aren't redoing work when you encounter a duplicate zipcode.
If a loop is not required, then there are better (more pandas style) ways to go about your problem. I would suggest something like (untested):
zip_start = df_users['zip'].str[:-2]
df_users['zip'] = zip_start.where(zip_start.map(zip_dict) > 20000, other="0")

Google Kickstart 2014 Round D Sort a scrambled itinerary - Do I need to bring the input in a ready-to-use array format?

Problem:
Once upon a day, Mary bought a one-way ticket from somewhere to somewhere with some flight transfers.
For example: SFO->DFW DFW->JFK JFK->MIA MIA->ORD.
Obviously, transfer flights at a city twice or more doesn't make any sense. So Mary will not do that.
Unfortunately, after she received the tickets, she messed up the tickets and she forgot the order of the ticket.
Help Mary rearrange the tickets to make the tickets in correct order.
Input:
The first line contains the number of test cases T, after which T cases follow.
For each case, it starts with an integer N. There are N flight tickets follow.
Each of the next 2 lines contains the source and destination of a flight ticket.
Output:
For each test case, output one line containing "Case #x: itinerary", where x is the test case number (starting from 1) and the itinerary is a sorted list of flight tickets that represent the actual itinerary.
Each flight segment in the itinerary should be outputted as pair of source-destination airport codes.
Sample Input: Sample Output:
2 Case #1: SFO-DFW
1 Case #2: SFO-DFW DFW-JFK JFK-MIA MIA-ORD
SFO
DFW
4
MIA
ORD
DFW
JFK
SFO
DFW
JFK
MIA
My question:
I am a beginner in the field of competitive programming. My question is how to interpret the given input in this case. How did Googlers program this input? When I write a function with a Python array as its argument, will this argument be in a ready-to-use array format or will I need to deal with the above mentioned T and N numbers in the input and then arrange airport strings in an array format to make it ready to be passed in the function's argument?
I have looked up at the following Google Kickstart's official Python solution to this problem and was confused how they simply pass the ticket_list argument in the function. Don't they need to clear the input from the numbers T and N and then arrange the airport strings into an array, as I have explained above?
Also, I could not understand how could the methods first and second simply appear if no Class has been initialized? But I think this should be another question...
def print_itinerary(ticket_list):
arrival_map = {}
destination_map = {}
for ticket in ticket_list:
arrival_map[ticket.second] += 1
destination_map[ticket.first] += 1
current = FindStart(arrival_map)
while current in destination_map:
next = destination_map[current]
print current + "-" + next
current = next

You need to implement it yourself to read data from standard input and write results to standard output.
Sample code for reading from standard input and writing to standard output can be found in the coding section of the FAQ on the KickStart Web site.
If you write the solution to this problem in python, you can get T and N as follows.
T = int(input())
for t in range(1, T + 1):
N = int(input())
...
Then if you want to get the source and destination of the flight ticket as a list, you can use the same input method to get them in the list.
ticket_list = [[input(), input()] for _ in range(N)]
# [['MIA', 'ORD'], ['DFW', 'JFK'], ['SFO', 'DFW'], ['JFK', 'MIA']]
If you want to use first and second, try a namedtuple.
Pair = namedtuple('Pair', ['first', 'second'])
ticket_list = [Pair(input(), input()) for _ in range(N)]

How to match input with elements in list/dictionary in Python3

I'm very new at coding, and I'm trying to create a shop list with items and prices on it.
That is, once typed in all the items, the function should calculate the sum and stop the moment you exceed the budget.
So I wrote something like:
def shoplist():
list={"apple":30, "orange":20, "milk":60......}
buy=str(input("What do you want to purchase?")
If buy in list:
While sum<=budget:
sum=sum+??
shoplist ()
I really don't know how to match the input of an item with the price in the list...
My first thought is to use 'if', but it's kinda impractical when you have more than 10 items on the list and random inputs.
I'm in desperate need of help....So any suggestions would be nice!! (or if you have a better solution and think me writing it this way is complete garbage... PLEASE let me know what those better solutions are😭😭😭

The code you post will not run in python. list is a builtin and should not be used for a variable name, and is doubly confusing since it refers to a dict object here. input() already returns a str so the cast has no effect. if and while should be lowercase, and there is no indentation, so we have no way of knowing the limits of those statements.
There are so many things wrong, take a look at this:
def shoplist(budget):
prices = {"apple":30, "orange":20, "milk":60}
# Initialise sum
sum = 0
while sum <= budget:
buy = input("What do you want to purchase?")
# Break out of the loop if the user hts <RETURN>
if not buy: break
if buy in prices:
sum += prices[buy] # This gets the price
else:
print("Invalid item", buy)
shoplist(142)
So what have I changed? The budget has to come from somewhere, so I pass it in as a parameter (142, I made that up). I initialise the sum to zero, and I moved the while loop to the outside.
Notice as well lots of whitespace - it makes the code easier to read and has no effect on performance.
Lots of improvements to make. The user should be shown a list of possible items and prices and also how much budget there is left for each purchase. Note as well that it is possible to go over budget since we might only have 30 in the budget but we can still buy milk (which is 60) - we need another check (if statement) in there!
I'll leave the improvements to you. Have fun!

Take a look at this as an example:
# this is a dictionary not a list
# be careful not using python reserved names as variable names
groceries = {
"apple":30,
"orange":20,
"milk":60
}
expenses = 0
budget = 100
cart = []
# while statements, as well as if statements are in lower letter
while expenses < budget:
# input always returns str, no need to cast
user_input = input("What do you want to purchase?")
if user_input not in groceries.keys():
print(f'{user_input} is not available!')
continue
if groceries[user_input] > budget - expenses:
print('You do not have enough budget to buy this')
user_input = input("Are you done shopping?Type 'y' if you are.")
if user_input == 'y':
break
continue
cart.append(user_input)
# this is how you add a number to anotherone
expenses += groceries[user_input]
print("Shopping cart full. You bought {} items and have {} left in your budget.".format(len(cart), budget-expenses))

I've made some changes to your code to make it work, with explanation including using comments indicated by the # symbol.
The two most important things are that all parentheses need to be closed:
fun((x, y) # broken
fun((x, y)) # not broken
and keywords in Python are all lowercase:
if, while, for, not # will work
If, While, For, Not # won't work
You might be confused by True and False, which probably should be lowercase. They've been that way so long that it's too late to change them now.
budget = 100 # You need to initialize variables before using them.
def shoplist():
prices = { # I re-named the price list from list to prices
'apple' : 30, # because list is a reserved keyword. You should only
'orange' : 20, # use the list keyword to initialize list objects.
'milk' : 60, # This type of object is called a dictionary.
} # The dots .... would have caused an error.
# In most programming languages, you need to close all braces ().
# I've renamed buy to item to make it clearer what that variable represents.
item = input('What do you want to purchase? ')
# Also, you don't need to cast the value of input to str;
# it's already a str.
if item in prices:
# If you need an int, you do have to cast from string to int.
count = int(input('How many? '))
cost = count*prices[item] # Access dictionary items using [].
if cost > budget:
print('You can\'t afford that many!')
else:
# You can put data into strings using the % symbol like so:
print('That\'ll be %i.' % cost) # Here %i indicates an int.
else:
print('We don\'t have %s in stock.' % item) # Here %s means str.
shoplist()
A lot of beginners post broken code on StackOverflow without saying that they're getting errors or what those errors are. It's always helpful to post the error messages. Let me know if you have more questions.

Python- Function which returns the average mark

I am trying to create a function which returns the average of a student's module. The student's data is stored in a list which contains the following, so Dave has: ('Dave', 0, 'none', 'M106', ['50'])
and then Ollie has: ('Ollie', 'M104', 0, 'none', ['60']). I can't get my head around how to get the average from the two averages.
def moduleAverage(self):
if student.getAverage is > 0:
return self.moduleAverage

Ok first I have to say: Correct your indentation.
So I'll make a little example now:
studentDetails = []
studentDetails.append(('Peter', ['40']))
studentDetails.append(('Frank', ['100']))
studentDetails.append(('Ernest', ['40']))
def moduleAverage(inList):
total = 0.0
for i in xrange(0, len(inList)):
total += float(inList[i][1][0]) # the score is at the index 1 in the tuple (('Peter', ['40']))
return (total / float(len(inList)))
print moduleAverage(studentDetails)
Pay attention: You have the single mark ['40'] as a string, so you have to convert it to float.
Also remain the same order in the student tuple.
I don't know your whole structure so I just made a simple example of the algorithm with tuples as you mentioned in your question.

Schwartzian sort example in "Text Processing in Python"

I was browsing through "Text Processing in Python" and tried its example about Schwartzian sort.
I used following structure for sample data which also contains empty lines. I sorted this data by fifth column:
383230 -49 -78 1 100034 '06 text' 9562 'text' 720 'text' 867
335067 -152 -18 3 100030 'text' 2400 'text' 2342 'text' 696
136592 21 230 3 100035 '03. text' 10368 'text' 1838 'text' 977
Code used for Schwartzian sorting:
for n in range(len(lines)): # Create the transform
lst = string.split(lines[n])
if len(lst) >= 4: # Tuple w/ sort info first
lines[n] = (lst[4], lines[n])
else: # Short lines to end
lines[n] = (['\377'], lines[n])
lines.sort() # Native sort
for n in range(len(lines)): # Restore original lines
lines[n] = lines[n][1]
open('tmp.schwartzian','w').writelines(lines)
I don't get how the author intended that short or empty lines should go to end of file by using this code. Lines are sorted after the if-else structure, thus raising empty lines to top of file. Short lines of course work as supposed with the custom sort (fourth_word function) as implemented in the example.
This is now bugging me, so any ideas? If I'm correct about this then how would you ensure that short lines actually stay at end of file?
EDIT: I noticed the square brackets around '\377'. This messed up sort() so I removed those brackets and output started working.
else: # Short lines to end
lines[n] = (['\377'], lines[n])
print type(lines[n][0])
>>> (type 'list')
I accepted nosklo's answer for good clarification about the meaning of '\377' and for his improved algorithm. Many thanks for the other answers also!
If curious, I used 2 MB sample file which took 0.95 secs with the custom sort and 0.09 with the Schwartzian sort while creating identical output files. It works!

Not directly related to the question, but note that in recent versions of python (since 2.3 or 2.4 I think), the transform and untransform can be performed automatically using the key argument to sort() or sorted(). eg:
def key_func(line):
lst = string.split(line)
if len(lst) >= 4:
return lst[4]
else:
return '\377'
lines.sort(key=key_func)

I don't know what is the question, so I'll try to clarify things in a general way.
This algorithm sorts lines by getting the 4th field and placing it in front of the lines. Then built-in sort() will use this field to sort. Later the original line is restored.
The lines empty or shorter than 5 fields fall into the else part of this structure:
if len(lst) >= 4: # Tuple w/ sort info first
lines[n] = (lst[4], lines[n])
else: # Short lines to end
lines[n] = (['\377'], lines[n])
It adds a ['\377'] into the first field of the list to sort. The algorithm does that in hope that '\377' (the last char in ascii table) will be bigger than any string found in the 5th field. So the original line should go to bottom when doing the sort.
I hope that clarifies the question. If not, perhaps you should indicate exaclty what is it that you want to know.
A better, generic version of the same algorithm:
sort_by_field(list_of_str, field_number, separator=' ', defaultvalue='\xFF')
# decorates each value:
for i, line in enumerate(list_of_str)):
fields = line.split(separator)
try:
# places original line as second item:
list_of_str[i] = (fields[field_number], line)
except IndexError:
list_of_str[i] = (defaultvalue, line)
list_of_str.sort() # sorts list, in place
# undecorates values:
for i, group in enumerate(list_of_str))
list_of_str[i] = group[1] # the second item is original line
The algorithm you provided is equivalent to this one.

An empty line won't pass the test
if len(lst) >= 4:
so it will have ['\377'] as its sort key, not the 5th column of your data, which is lst[4] ( lst[0] is the first column).

Well, it will sort short lines almost at the end, but not quite always.
Actually, both the "naive" and the schwartzian version are flawed (in different ways). Nosklo and wbg already explained the algorithm, and you probably learn more if you try to find the error in the schwartzian version yourself, therefore I will give you only a hint for now:
Long lines that contain certain text
in the fourth column will sort later
than short lines.
Add a comment if you need more help.

Although the used of the Schwartzian transform is pretty outdated for Python it is worth mentioning that you could have written the code this way to avoid the possibility of a line with line[4] starting with \377 being sorted into the wrong place
for n in range(len(lines)):
lst = lines[n].split()
if len(lst)>4:
lines[n] = ((0, lst[4]), lines[n])
else:
lines[n] = ((1,), lines[n])
Since tuples are compared elementwise, the tuples starting with 1 will always be sorted to the bottom.
Also note that the test should be len(list)>4 instead of >=
The same logic applies when using the modern equivalent AKA the key= function
def key_func(line):
lst = line.split()
if len(lst)>4:
return 0, lst[4]
else:
return 1,
lines.sort(key=key_func)

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.