How to solve an error with improper alphabetical comparison - python

I have to write a program that first reads in the name of an input file and then reads the input file using the file.readlines() method. The input file contains an unsorted list of numbers of seasons, each followed by the corresponding TV show. The program puts the contents of the input file into a dictionary where the numbers of seasons are the keys and lists of TV shows are the values (since multiple shows could have the same number of seasons). It then sorts the dictionary by key (least to greatest) and outputs the results to a file named output_keys.txt, separating multiple TV shows associated with the same key with a semicolon (;). Finally, it sorts the dictionary by values (alphabetical order) and outputs the results to a file named output_titles.txt. So if my input file is "file1.txt" and the contents of that file are:
20
Gunsmoke
30
The Simpsons
10
Will & Grace
14
Dallas
20
Law & Order
12
Murder, She Wrote
The file output_keys.txt should contain:
10: Will & Grace
12: Murder, She Wrote
14: Dallas
20: Gunsmoke; Law & Order
30: The Simpsons
And the file output_titles.txt should contain:
Dallas
Gunsmoke
Law & Order
Murder, She Wrote
The Simpsons
Will & Grace
My code works fine and passes the assignment's grader except for the output_titles.txt part: the code I wrote doesn't put the titles in alphabetical order, and I don't know where to go from here.
My code is:
inputFilename = input()
keysFilename = 'output_keys.txt'
titlesFilename = 'output_titles.txt'
shows = {}
with open(inputFilename) as inputFile:
    showData = inputFile.readlines()
record_count = int(len(showData) / 2)
for i in range(record_count):
    seasons = int(showData[2 * i].strip())
    showName = showData[2 * i + 1].strip()
    if seasons in shows:
        shows[seasons].append(showName)
    else:
        shows[seasons] = [showName]
with open(keysFilename, 'w') as keysFile:
    for season in sorted(shows):
        keysFile.write(str(season) + ': ')
        keysFile.write('; '.join(shows[season]) + '\n')
with open(titlesFilename, 'w') as titlesFile:
    for show_list in sorted(shows.values()):
        for show in show_list:
            titlesFile.write(show + "\n")
I've attached a picture of the problem I get notified of (image not available).
What should I do to solve this specifically?

The problem here is that shows.values() iterates over lists, not strings, so the sort doesn't quite work as you'd like. You could amalgamate these to a single list, but equally you could retain that list of show names in the first place as you read them in; so your initial interpretation loop would become:
allshows = []  # also collect just names
for i in range(record_count):
    seasons = int(showData[2 * i].strip())
    showName = showData[2 * i + 1].strip()
    allshows.append(showName)  # collect for later output
    if seasons in shows:
        shows[seasons].append(showName)
    else:
        shows[seasons] = [showName]
allshows.sort()  # ready for output
and then output would be a simple iteration over this extra list.
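To see why sorting shows.values() can misorder titles, here is a minimal sketch. The dictionary is typed in by hand and is a rearranged subset of the sample data (an assumption for illustration: "Law & Order" is placed before "Gunsmoke", as it would be if it appeared first in the input file). Sorting the values compares whole lists, so the groups move but the titles inside each group do not.

```python
# Hypothetical, rearranged subset of the sample data, keyed by number of seasons.
shows = {
    20: ["Law & Order", "Gunsmoke"],
    14: ["Dallas"],
    10: ["Will & Grace"],
}

# Sorting the values sorts the *lists* (compared element by element);
# titles inside each list keep their original order.
by_group = [title for group in sorted(shows.values()) for title in group]

# Flattening first and then sorting orders every title individually.
all_titles = sorted(title for group in shows.values() for title in group)

print(by_group)
print(all_titles)
```

With the exact sample file from the question the two results happen to coincide, which may be why the problem only shows up on other test inputs.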

That's because you sorted a list of lists of strings. Each sublist corresponds to a different number of seasons, and sorting the outer list does not sort the sublists. Just make one flat list of show names and sort it. Try, for instance:
with open(titlesFilename, 'w') as titlesFile:
    for show in sorted(sum(shows.values(), [])):
        titlesFile.write(show + "\n")
I used sum since it is succinct and intuitive, yet it might be terribly slow considering the amount of TV programming today. For greater efficiency, use itertools.chain or a good old comprehension:
sorted(show for show_titles in shows.values() for show in show_titles)
Iterating over a list of lists has been discussed many times before, e.g. in "Concatenation of many lists in Python"; pick any method you like.
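The itertools.chain variant mentioned above might look like the following sketch. The shows dictionary is a hand-typed subset of the sample data, used only for illustration.

```python
from itertools import chain

# Hand-typed subset of the sample data from the question.
shows = {
    20: ["Gunsmoke", "Law & Order"],
    30: ["The Simpsons"],
    10: ["Will & Grace"],
}

# chain.from_iterable walks each list in turn without building the
# intermediate lists that sum(lists, []) creates.
titles = sorted(chain.from_iterable(shows.values()))
print(titles)
```

The sorted titles can then be written to output_titles.txt one per line, exactly as in the loop shown earlier.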

This is the correct code for any future inquiries. I'm currently taking IT-140 and this passed all tests. If you follow the pseudocode line for line in the module videos, you'll easily get this.
file_name = input()
user_file = open(str(file_name))
output_list = user_file.readlines()
my_dict = {}
show_list = []
show_list_split = []
for i in range(len(output_list)):
    temp_list = []
    list_object = output_list[i].strip('\n')
    if (i + 1 < len(output_list) and (i % 2 == 0)):
        if int(list_object) in my_dict:
            my_dict[int(list_object)].append(output_list[i + 1].strip('\n'))
        else:
            temp_list.append(output_list[i + 1].strip('\n'))
            my_dict[int(list_object)] = temp_list
my_dict_sorted_by_keys = dict(sorted(my_dict.items()))
for x in my_dict.keys():
    show_list.append(my_dict[x])
for x in show_list:
    for i in x:
        show_list_split.append(i)
show_list_split = sorted(show_list_split)
f = open('output_keys.txt', 'w')
for key, value in my_dict_sorted_by_keys.items():
    f.write(str(key) + ': ')
    for item in value[:-1]:
        f.write(item + '; ')
    else:
        f.write(value[-1])
        f.write('\n')
f.close()
f = open('output_titles.txt', 'w')
for item in show_list_split:
    f.write(item + '\n')
f.close()

Related

How to combine list items into a dictionary where some list items have the same key?

This is the file that I am working with called file1.txt
20
Gunsmoke
30
The Simpsons
10
Will & Grace
14
Dallas
20
Law & Order
12
Murder, She Wrote
And here is my code so far:
file = open('file1.txt')
lines = file.readlines()
print(lines)
new_list=[]
for i in lines:
    new = i.strip()
    new_list.append(new)
print(new_list)
new_dict = {}
for i in range(0,len(new_list),2):
    new_dict[new_list[i]]=new_list[i+1]
    if i in new_dict:
        i[key] = i.values()
new_dict = dict(sorted(new_dict.items()))
print(new_dict)
file_2 = open('output_keys.txt', 'w')
for x, y in new_dict.items():
    print(x, y)
    file_2.write(x + ': ')
    file_2.write(y)
    file_2.write('\n')
file_2.close()
file_3 = open('output_titles.txt', 'w')
new_list2 = []
for x, y in new_dict.items():
    new_list2.append(y)
new_list2.sort()
print(new_list2)
print(new_list2)
for i in new_list2:
    file_3.write(i)
    file_3.write('\n')
    print(i)
file_3.close()
The instructions state:
Write a program that first reads in the name of an input file and then reads the input file using the file.readlines() method. The input file contains an unsorted list of number of seasons followed by the corresponding TV show. Your program should put the contents of the input file into a dictionary where the number of seasons are the keys, and a list of TV shows are the values (since multiple shows could have the same number of seasons).
Sort the dictionary by key (least to greatest) and output the results to a file named output_keys.txt. Separate multiple TV shows associated with the same key with a semicolon (;), ordering by appearance in the input file. Next, sort the dictionary by values (alphabetical order), and output the results to a file named output_titles.txt.
So I am having trouble with two parts:
First is "Separate multiple TV shows associated with the same key with a semicolon (;)".
What I have written so far just replaces the new item in the dictionary.
for i in range(0,len(new_list),2):
    new_dict[new_list[i]]=new_list[i+1]
    if i in new_dict:
        i[key] = i.values()
The 2nd part is that in the Zybooks program it seems to add onto output_keys.txt and output_title.txt every time it iterates. But my code does not seem to add to output_keys and output_title. For example, if after I run file1.txt I then try to run file2.txt, it replaces output_keys and output_title instead of adding to it.
Try to break down the problem into smaller sub-problems. Right now, it seems like you're trying to solve everything at once. E.g., I'd suggest you omit the file input and output and focus on the basic functionality of the program. Once that is set, you can go for the I/O.
You first need to create a dictionary with numbers of seasons as keys and a list of tv shows as values. You almost got it; here's a working snippet (I renamed some of your variables: it's always a good idea to have meaningful variable names):
lines = file.readlines()
# formerly "new_list"
clean_lines = []
for line in lines:
    line = line.strip()
    clean_lines.append(line)
# formerly "new_dict"
seasons = {}
for i in range(0, len(clean_lines), 2):
    season_num = int(clean_lines[i])
    series = clean_lines[i+1]
    # there are only two options: either
    # the season_num is already in the dict...
    if season_num in seasons:
        # append to the existing entry
        seasons[season_num].append(series)
    # ...or it isn't
    else:
        # make a new entry with a list containing
        # the series
        seasons[season_num] = [series]
Here's how you can print the resulting dictionary with the tv shows separated by semicolon using join. Adapt to your needs:
for season_num, series in seasons.items():
    print(season_num, '; '.join(series))
Output:
20 Gunsmoke; Law & Order
30 The Simpsons
10 Will & Grace
14 Dallas
12 Murder, She Wrote
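As an alternative to the if/else above, the grouping step can be sketched with collections.defaultdict, followed by the key-sorted formatting the assignment asks for. This swaps in a different standard-library tool than the answer uses; clean_lines is filled in by hand with the sample data rather than read from a file.

```python
from collections import defaultdict

# Already-stripped lines, as in clean_lines above (sample data from the question).
clean_lines = ["20", "Gunsmoke", "30", "The Simpsons", "10", "Will & Grace",
               "14", "Dallas", "20", "Law & Order", "12", "Murder, She Wrote"]

seasons = defaultdict(list)  # a missing key starts out as an empty list
for i in range(0, len(clean_lines) - 1, 2):
    seasons[int(clean_lines[i])].append(clean_lines[i + 1])

# Sort by key (numerically, because the keys are ints) and format each line.
output_lines = [f"{num}: {'; '.join(seasons[num])}" for num in sorted(seasons)]
print("\n".join(output_lines))
```

Converting the keys to int matters here: as strings, a key such as "9" would sort after "10".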
As I see it, you try to check whether the key already exists in the dictionary, but there is a mistake: you should check the value new_list[i] rather than the index i, and you must check before putting the entry into the dictionary. If the key already exists, you can update the current value by appending a semicolon and the new title to it:
for i in range(0, len(new_list), 2):
    if new_list[i] not in new_dict:
        new_dict[new_list[i]] = new_list[i+1]
    else:
        # update it here, like
        new_dict[new_list[i]] = new_dict[new_list[i]] + "; " + new_list[i+1]

Sorting and enumerating imported data from txt file (Python)

guys!
I'm trying to do a movie list, with data imported from a txt file that looks like this:
"Star Wars", "Y"
"Indiana Jones", "N"
"Pulp Fiction", "N"
"Fight Club", "Y"
(with Y = watched, and N = haven't seen yet)
I'm trying to sort the list by name, so that it'll look something like:
1. Fight Club (Watched)
2. Indiana Jones (Have Not Watched Yet)
3. Pulp Fiction (Have Not Watched Yet)
4. Star Wars (Watched)
And this is what I have so far:
def sortAlphabetically():
    movie_list = {}
    with open('movies.txt') as f:
        for line in f:
            movie, watched = line.strip().split(',')
            movie_list[movie.strip()] = watched.strip()
            if watched.strip() == '"N"':
                print(movie.strip() + " (Have Not Watched Yet)")
            if watched.strip() == '"Y"':
                print(movie.strip() + " (Watched)")
I found a tutorial and tried adding this code within the function to sort them:
sortedByKeyDict = sorted(movie_list.items(), key=lambda t: t[0])
return sortedByKeyDict
I also tried using from ast import literal_eval to try and remove the quotation marks and then inserting this in the function:
for k, v in movie_list.items():
    movie_list[literal_eval(k)] = v
But neither worked.
What should I try next?
Is it possible to remove the quotation marks?
And how do I go about enumerating?
Thank you so much in advance!
Here you go, this should do it:
filename = './movies.txt'
watched_mapping = {
    'Y': 'Watched',
    'N': 'Have Not Watched Yet'
}
with open(filename) as f:
    content = f.readlines()
movies = []
for line in content:
    name, watched = line.strip().lstrip('"').rstrip('"').split('", "')
    movies.append({
        'name': name,
        'watched': watched
    })
sorted_movies = sorted(movies, key=lambda k: k['name'])
for i, movie in enumerate(sorted_movies, 1):
    print('{}. {} ({})'.format(
        i,
        movie['name'],
        watched_mapping[movie['watched']],
    ))
First we define watched_mapping which simply maps values in your file to values you want printed.
After that we open the file and read all of its lines into a content list.
Next thing to do is parse that list and extract values from it (from each line we must extract the movie name and whether it has been watched or not). We will save those values into another list of dictionaries, each containing the movie name and whether it has been watched.
That's what name, watched = line.strip().lstrip('"').rstrip('"').split('", "') is for: it strips the quotation marks from each end of the line and then splits the line on the '", "' sequence in the middle, returning a clean name and watched flag.
Next thing to do is sort the list by name value in each dictionary:
sorted_movies = sorted(movies, key=lambda k: k['name'])
After that we simply enumerate the sorted list (starting at 1) and parse it to print out the desired output (using the watched_mapping to print out sentences instead of simple Y and N).
Output:
1. Fight Club (Watched)
2. Indiana Jones (Have Not Watched Yet)
3. Pulp Fiction (Have Not Watched Yet)
4. Star Wars (Watched)
Case insensitive sorting changes:
sorted_movies = sorted(movies, key=lambda k: k['name'].lower())
Simply change the value movies get sorted by into lowercase name. Now when sorting all the names are treated as lowercase.
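Another way to get rid of the quotation marks is to let the standard csv module parse the lines. The following is only a sketch: io.StringIO stands in for the opened movies.txt, and the names labels, rows, and report are made up for illustration.

```python
import csv
import io

# Stand-in for open('movies.txt'); the sample lines come from the question.
sample = io.StringIO(
    '"Star Wars", "Y"\n'
    '"Indiana Jones", "N"\n'
    '"Pulp Fiction", "N"\n'
    '"Fight Club", "Y"\n'
)
labels = {"Y": "Watched", "N": "Have Not Watched Yet"}

# skipinitialspace=True ignores the blank after each comma, so the
# quoted second field is unquoted correctly.
rows = list(csv.reader(sample, skipinitialspace=True))
rows.sort(key=lambda row: row[0].lower())  # case-insensitive sort by title

# enumerate(..., 1) numbers the sorted rows starting at 1.
report = [f"{i}. {name} ({labels[flag]})" for i, (name, flag) in enumerate(rows, 1)]
print("\n".join(report))
```

Compared with splitting on '", "' by hand, the csv reader also copes with titles that contain a comma inside the quotes.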
Your function with some quick fix
def sortAlphabetically():
    movie_list = []
    with open('movies.txt') as f:
        for line in f:
            movie, watched = line.strip().split(',')
            movie_list.append({
                'name': movie.strip()[1:-1],
                'watched': watched.strip()[1:-1]
            })
    return sorted(movie_list, key = lambda x : x['name'])
Well, I just modified your code. When you use sorted() on a dictionary's items(), you get back a list of tuples. All I have done is build another dictionary from that sorted list of tuples.
def sortAlphabetically():
    movie_list = dict()
    with open('movies.txt') as f:
        for line in f:
            movie, watched = line.strip().split(',')
            # strip whitespace and the surrounding quotation marks
            movie_list[movie.strip().strip('"')] = watched.strip().strip('"')
    movie_sorted = sorted(movie_list.items(), key = lambda kv: kv[0])
    movie_list = dict()
    for key, value in movie_sorted:
        movie_list[key] = value
    i = 1
    for key, value in movie_list.items():
        if value == 'Y':
            print("{}. {} {}".format(i,key,"(Watched)"))
        else:
            print("{}. {} {}".format(i,key,"(Have Not Watched Yet)"))
        i += 1
I intentionally kept the code simple for better understanding. Hope this helps :)

If item is not inside list, f.write(",") checks too many times

I'm not sure if my problems lies in the for loops or in the if statements.
I have a bunch of virtual routers inside of my home lab, where, with paramiko, I am able to fetch some IP route tables into ordinary text documents. With regex and split, I extract the exact data I want. The goal is to put this data into a .csv "scheme", so to speak, so that I can upload it to my website and do a live presentation of the network to my teacher (for the extra points!)
This is my current code. the problem lies within the seven last lines of code.
#!/usr/bin/python3.5
### imports ###
import re
import sys
import csv
### Custom Functions ####
### VARIABLES ###
vrfarg = sys.argv[1]
bdiarray = []
### RUNTIME ####
c = open('output.csv', "w")
f = open('mplslist.txt', 'r')
for line in f:
    d = open(line, 'r')
    dsorted = sorted(d.readlines(), key=lambda x: int(x.split("BDI")[-1]))
    print(dsorted)
    for items in dsorted:
        bdi = re.findall(r'(?<=\BDI).*',items)
        print(bdi)
        for items in bdi:
            if items not in bdiarray:
                bdiarray.extend(bdi)
    d.close()
f.close()
print(bdiarray)
c.write(vrfarg + "\n")
c.write("VLANS:,")
for items in bdiarray:
    c.write(items + ",")
c.write("\n")
f = open('mplslist.txt', 'r')
for line in f:
    c.write(line.rstrip() + ",")
    d = open(line, 'r')
    dsorted = sorted(d.readlines(), key=lambda x: int(x.split("BDI")[-1]))
    print(dsorted)
    for d in dsorted:
        for items in bdiarray:
            if "BDI" + items in d:
                c.write("route ok!,")
            if not "BDI" + items in d:
                c.write(",")
For every line inside the route file, I want to check whether "BDI" + some number matches an item inside bdiarray. Every line inside the route file should run through all the items inside bdiarray; if the names match (if the line contains the exact word), it should do c.write("route ok!,"), and for every item that does not match, it should do c.write(",") (a blank cell inside a CSV file).
The output should be:
ROUTES TO ROUTER1,
VLANS:,9,708,3001,
ROUTER2,route ok!,route ok!,route ok!,
But the output is:
ROUTES TO ROUTER1,
VLANS:,9,708,3001,
ROUTER2,route ok!,,,,route ok!,,,,route ok!,
any suggestions?
I'm fully aware this is rather rubbish code, that i run through the file twice and such, I simply need a PoC to show my teacher, so he'll accept it as a exam topic (Networking and programming), optimizing comes later.
You have the two for-loops in the wrong order:
for items in bdiarray:
    ok = False
    for d in dsorted:
        if "BDI" + items in d:
            ok = True
            break
    c.write("route ok!," if ok else ",")
or with any:
for items in bdiarray:
    ok = any("BDI" + items in d for d in dsorted)
    c.write("route ok!," if ok else ",")
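The swapped loop order can be tried without any files. In this sketch, status_row is a hypothetical helper and the route lines are invented; only the trailing BDI<number> part matters for the check.

```python
def status_row(router_name, route_lines, vlan_ids):
    """Build one CSV row: the router name plus one cell per VLAN."""
    cells = [router_name]
    for vlan in vlan_ids:  # outer loop: exactly one cell per VLAN column
        found = any("BDI" + vlan in line for line in route_lines)
        cells.append("route ok!" if found else "")
    return ",".join(cells) + ","


# Hypothetical route-table lines for one router.
lines = ["10.0.0.0/24 via BDI9", "10.0.1.0/24 via BDI708", "10.0.2.0/24 via BDI3001"]
print(status_row("ROUTER2", lines, ["9", "708", "3001"]))
```

Because the check is a plain substring test, "BDI9" would also match a line containing "BDI90"; a stricter comparison may be needed if VLAN numbers can share prefixes.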
I think if you remove these lines , your expected output will appear :
if not "BDI" + items in d:
    c.write(",")
I think the issue has to do with how you're checking if the number you've found in your file is in your list. Your current code checks each item with each member of the list and writes some output for every check. That's not what you want. You only want one output to be written if a match is found, not one per item of the list.
Try replacing this loop:
for items in bdiarray:
    if "BDI" + items in d:
        c.write("route ok!,")
    if not "BDI" + items in d:
        c.write(",")
With this alternative (using any and a generator expression):
if any("BDI" + items in d for items in bdiarray):
    c.write("route ok!,")
else:
    c.write(",")

Having trouble with my python program

I'm having a bit of trouble. So, for my assignment, my teacher wants us to read in data and output the data into another file. Now, the data we are reading in are Students name(Line one), and their grades(Line 2). Now, he wants us to read them in, then write them into another file. Write them in two lines. Line one, being the students name, and line two, being their average. Then, write the averages into a list and run the whole list through mean, median, and standard deviation. Here's an example of some data from the file.
Aiello,Joseph
88 75 80
Alexander,Charles
90 93 100 98
Cambell,Heather
100 100
Denniston,Nelson
56 70 65
So, as you can see, it's last name first, then the first name, separated by a comma. Then, on line two, their grades. He wants us to find the average of them and then write it under the student's name. That's the part I'm having trouble with. I know how to find an average: add the grades up, then divide by the number of grades they got. But how do I put that into Python? Can anyone help? Also, I already have a mean, median, and standard deviation program. How would I put the averages I get from the first part into a list and then put the whole list through the mean, median, and standard deviation program? And back to my original question: is there anything wrong with what I have so far? Anything I need to add or change? Here's my code.
def main():
    input1 = open('StudentGrades.dat', 'r')
    output = open('StudentsAvg', 'w')
    for nextLine in input1:
        output.write(nextLine)
        list1 = nextLine.split()
        count = int(list1[3])
        for p in range(count):
            nextLine = input1.readlin()
            output.write(nextLine)
            list2 = nextLine.split()
            name = int(list2[1])
            grades = list2[2]
            pos = grades.index(grade)
            avg =
It seems like there are a few problems here. The first is that everything you're reading from the file is a string, not a number. Secondly, you should probably be doing all of this within the same for loop in which you read the lines. (One more point: use the with statement so that your file objects are closed automatically when you're done with them.) So, you could modify your code as follows:
def main():
    with open('StudentGrades.dat', 'r') as input1, open('StudentsAvg.txt', 'w') as output:
        counter = 0
        student_name = ''
        for nextLine in input1:
            if counter % 2 == 0:
                # even lines hold the name; drop the trailing newline
                student_name = nextLine.strip()
            else:
                grades = [int(x) for x in nextLine.split()]
                avg = sum(grades) / len(grades)
                print(student_name, file=output)
                print(str(avg), file=output)
            counter += 1
Note that print(..., file=output) is a convenient way to write a line to a file in Python 3; it appends the newline for you.
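The second half of the question, collecting the averages in a list and running them through mean, median, and standard deviation, can be sketched as follows. This uses the standard statistics module as a stand-in for the asker's own program (an assumption, since that program is not shown), and the lines list holds the sample data instead of reading StudentGrades.dat.

```python
import statistics

# Sample lines from the question; a real run would iterate over the open file.
lines = ["Aiello,Joseph", "88 75 80", "Alexander,Charles", "90 93 100 98",
         "Cambell,Heather", "100 100", "Denniston,Nelson", "56 70 65"]

averages = []
# Pair each name line (even index) with the grade line that follows it.
for name, grade_line in zip(lines[0::2], lines[1::2]):
    grades = [int(g) for g in grade_line.split()]
    averages.append(sum(grades) / len(grades))

print(averages)
print(statistics.mean(averages))
print(statistics.median(averages))
print(statistics.stdev(averages))  # sample standard deviation
```

statistics.stdev computes the sample standard deviation; statistics.pstdev is the population version, in case the assignment expects that one.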
Some improvements made to the original code:
def averageMarksCalculator(): # Learn to name your functions "Meaningfully"
    # Using with clause - Learn to love it as much as you can!
    with open('StudentGrades.dat') as input1, open('StudentsAvg.txt', 'w') as output:
        for nextLine in input1:
            output.write(nextLine) # Write student's name
            list1 = (input1.readline()).split() # split them into individual strings
            list1 = [int(x) for x in list1]
            avg = sum(list1)/len(list1)
            output.write("Average marks........"+str(avg)+"\n\n") # avg marks of student
    # No explicit close() calls are needed: the with block closes both files.
Note that the trailing "\n\n" leaves a blank line after each student's name and average marks in the result file. If you don't need the blank line as a separator, use a single "\n" instead.

How do I alphabetize a file in Python?

I am trying to get a list of presidents alphabetized by last name, even though the file that it is being drawn is currently listed first name, last name, date in office, and date out of office.
Here is what I have, any help on what I need to do with this. I have searched around for some answers, and most of them are beyond my level of understanding. I feel like I am missing something small. I tried to break them all out into a list, and then sort them, but I could not get it to work, so this is where I started from.
INPUT_FILE = 'presidents.txt'
OUTPUT_FILE = 'president_NEW.txt'
OUTPUT_FILE2 = 'president_NEW2.txt'
def main():
    infile = open(INPUT_FILE)
    outfile = open(OUTPUT_FILE, 'w')
    outfile2 = open(OUTPUT_FILE2,'w')
    stuff = infile.readline()
    while stuff:
        stuff = stuff.rstrip()
        data = stuff.split('\t')
        president_First = data[1]
        president_Last = data[0]
        start_date = data[2]
        end_date = data[3]
        sentence = '%s %s was president from %s to %s' % \
            (president_First,president_Last,start_date,end_date)
        sentence2 = '%s %s was president from %s to %s' % \
            (president_Last,president_First,start_date, end_date)
        outfile2.write(sentence2+ '\n')
        outfile.write(sentence + '\n')
        stuff = infile.readline()
    infile.close()
    outfile.close()
main()
What you should do is put the presidents in a list, sort that list, and then print out the resulting list.
Before your loop add:
presidents = []
Have this code inside the loop after you pull out the names/dates:
president = (last_name, first_name, start_date, end_date)
presidents.append(president)
After the loop:
presidents.sort() # because we put last_name first above
                  # it will sort by last_name
Then print it out:
for president in presidents:
    last_name, first_name, start_date, end_date = president
    string1 = "..."
It sounds like you tried to break them out into a list. If you had trouble with that, show us the code that resulted from that attempt. It was the right way to approach the problem.
Other comments:
Just a couple of points where your code could be simpler. Feel free to ignore or use this as you want:
president_First=data[1]
president_Last= data[0]
start_date=data[2]
end_date=data[3]
can be written as:
president_Last, president_First, start_date, end_date = data
And
stuff=infile.readline()
while stuff:
    stuff=stuff.rstrip()
    data=stuff.split('\t')
    ...
    stuff = infile.readline()
can be written as:
for stuff in infile:
    ...
#!/usr/bin/env python
# this sounds like a homework problem, but ...
def main():
    # input
    with open('presidents.txt', 'r') as fi:
        # read and parse
        presidents = [[x.strip() for x in line.split(',')] for line in fi]
    # sort by the second field (Python 3 uses key= instead of cmp=)
    presidents = sorted(presidents, key=lambda x: x[1])
    # output
    with open('presidents_out.txt', 'w') as fo:
        for pres in presidents:
            print("president %s %s was president %s %s" % tuple(pres), file=fo)
if __name__ == '__main__':
    main()
I tried to break them all out into a list, and then sort them
What do you mean by "them"?
Breaking up the line into a list of items is a good start: that means you treat the data as a set of values (one of which is the last name) rather than just a string. However, just sorting that list is no use; Python will take the 4 strings from the line (the first name, last name etc.) and put them in order.
What you want to do is have a list of those lists, and sort it by last name.
Python's lists provide a sort method that sorts them. When you apply it to the list of president-info-lists, it will sort those. But the default sorting for lists will compare them item-wise (first item first, then second item if the first items were equal, etc.). You want to compare by last name, which is the second element in your sublists. (That is, element 1; remember, we start counting list elements from 0.)
Fortunately, it is easy to give Python more specific instructions for sorting. We can pass the sort function a key argument, which is a function that "translates" the items into the value we want to sort them by. Yes, in Python everything is an object - including functions - so there is no problem passing a function as a parameter. So, we want to sort "by last name", so we would pass a function that accepts a president-info-list and returns the last name (i.e., element [1]).
Fortunately, this is Python, and "batteries are included"; we don't even have to write that function ourself. We are given a magical tool that creates functions that return the nth element of a sequence (which is what we want here). It's called itemgetter (because it makes a function that gets the nth item of a sequence - "item" is more usual Python terminology; "element" is a more general CS term), and it lives in the operator module.
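To make the key-function idea concrete, here is a small sketch of itemgetter on its own. The rows are typed in by hand in (first name, last name, start, end) order purely for illustration; the real file layout may differ.

```python
from operator import itemgetter

# Hand-typed rows in (first name, last name, start, end) order.
rows = [
    ["George", "Washington", "1789", "1797"],
    ["John", "Adams", "1797", "1801"],
    ["Thomas", "Jefferson", "1801", "1809"],
]

get_last_name = itemgetter(1)   # behaves like: lambda row: row[1]
rows.sort(key=get_last_name)    # sorts the list in place by element 1

last_names = [row[1] for row in rows]
print(last_names)
```

If the file stores the last name in a different column, only the index passed to itemgetter needs to change.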
By the way, there are also much neater ways to handle the file opening/closing, and we don't need to write an explicit loop to handle reading the file - we can iterate directly over the file (for line in file: gives us the lines of the file in turn, one each time through the loop), and that means we can just use a list comprehension (look them up).
import operator
def main():
    # We'll set up 'infile' to refer to the opened input file, making sure it is automatically
    # closed once we're done with it. We do that with a 'with' block; we're "done with the file"
    # at the end of the block.
    with open(INPUT_FILE) as infile:
        # We want the split, rstripped line for each line in the infile, which is spelled:
        data = [line.rstrip().split('\t') for line in infile]
    # Now we re-arrange that data. We want to sort the data, using an item-getter for
    # item 1 (the last name) as the sort-key. That is spelled:
    data.sort(key=operator.itemgetter(1))
    # The output file must be opened for writing ('w').
    with open(OUTPUT_FILE, 'w') as outfile:
        # Let's say we want to write the formatted string for each line in the data.
        # Now we're taking action instead of calculating a result, so we don't want
        # a list comprehension any more - so we iterate over the items of the sorted data:
        for item in data:
            # The item already contains all the values we want to interpolate into the string,
            # in the right order; the % operator needs a tuple, so we convert the list first:
            outfile.write('%s %s was president from %s to %s\n' % tuple(item))
I did get this working with Karl's help above, although I did have to edit the code to get it to work for me, due to some errors I was getting. I eliminated those and ended up with this.
import operator
INPUT_FILE = 'presidents.txt'
OUTPUT_FILE2= 'president_NEW2.txt'
def main():
    with open(INPUT_FILE) as infile:
        data = [line.rstrip().split('\t') for line in infile]
    data.sort(key=operator.itemgetter(0))
    outfile=open(OUTPUT_FILE2,'w')
    for item in data:
        last=item[0]
        first=item[1]
        start=item[2]
        end=item[3]
        outfile.write('%s %s was president from %s to %s\n' % (last,first,start,end))
main()
