I am very new to Python (a week in), so I hope I am asking this question properly.
I have imported a grade sheet in csv format into Python 2.7. The first column is the name of the student and the column titles are the names of the assignments. So the data looks something like this:
Name Test1 Test2 Test3
Robin 89 78 100
...
Rick 72 100 98
I want to be able to do (or have someone else do) 3 things just by typing in the name of the person and the assignment.
1. Get the score for that person for that assignment
2. Get the average score for that assignment
3. Get that person's average score
But for some reason I get lost figuring out how to get Python to recognize the field I am trying to look up. This is what I have so far (the only part that works is reading in the file):
data = csv.DictReader(open("C:\file.csv"))
for row in data:
    print row
def grade()
    student= input ("Enter a student name: ")
    assignment= input("Enter a assignment: ")
    for row in data:
        task_grade= data.get(int(row["student"], int(row["assignment"])) # specific grade
        task_total= sum(int(row['assignment'])) #assignment total
        student_total= #student assignments total-- no clue how to do this
    task_average= task_total/11
    average_score= student_total/9
You can access the individual "columns" of your csv this way:
import csv

def parse_csv():
    csv_file = open('data.csv', 'r')
    r = csv.reader(csv_file)
    grade_averages = {}
    for row in r:
        # Skip the header row.
        if row[0].startswith('Name'):
            continue
        #print "Student: ", row[0]
        grades = []
        for column in row[1:]:
            #print "Grade: ", column
            grades.append(int(column.strip()))
        grade_total = 0
        for i in grades:
            grade_total += i
        # float() keeps Python 2 from truncating the average to an integer.
        grade_averages[row[0]] = grade_total / float(len(grades))
    csv_file.close()
    #print "grade_averages: ", grade_averages
    return grade_averages

def get_grade(student_name):
    grade_averages = parse_csv()
    return grade_averages[student_name]

print "Rick: ", get_grade('Rick')
print "Robin: ", get_grade('Robin')
What you are trying to do is not meant for Python because you have keys and values. However...
If you know that your columns are always the same, no need to use keywords, you can use positions:
Here is the easy, inefficient* way to do 1 and 3:
students_name = ...
number = ...
# The raw string keeps "\f" from being read as a form-feed escape.
for line in open(r"C:\file.csv"):
    items = line.split()
    num_assignments = len(items) - 1
    name = items[0]
    if name == students_name:
        print("assignment score: {0}".format(items[number]))
        asum = 0
        for k in range(0, num_assignments):
            # The scores are read as text, so convert them before adding.
            asum += int(items[k + 1])
        print("their average: {0}".format(asum / float(num_assignments)))
To do 2, you should precompute the averages and return them, because the average for each assignment is the same for every user query.
I say easy but *inefficient because you search the text file again each time a name is entered. To do it properly, you should probably build a dictionary of all names and their information. But that solution is more complicated, and you are only a week in! Moreover, it's longer, and you should give it a try yourself. Look up dict.
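As a rough sketch of the dictionary approach mentioned above, the file can be read once into a nested dict and then queried as often as needed. This is written in Python 3 syntax, the helper names load_grades, student_average and assignment_average are made up for illustration, and the sample data is inlined with io.StringIO (assuming comma-separated values) instead of being read from the real file:

```python
import csv
import io

# Inline stand-in for the grade sheet; a real file would be opened with open(...).
SAMPLE = """Name,Test1,Test2,Test3
Robin,89,78,100
Rick,72,100,98
"""

def load_grades(csv_file):
    """Return {student: {assignment: score}} built from one pass over the file."""
    grades = {}
    for row in csv.DictReader(csv_file):
        name = row.pop("Name")
        grades[name] = {task: int(score) for task, score in row.items()}
    return grades

def student_average(grades, student):
    scores = grades[student].values()
    return sum(scores) / len(scores)

def assignment_average(grades, assignment):
    scores = [tasks[assignment] for tasks in grades.values()]
    return sum(scores) / len(scores)

grades = load_grades(io.StringIO(SAMPLE))
print(grades["Robin"]["Test2"])             # score for one student and assignment
print(assignment_average(grades, "Test1"))  # average for one assignment
print(student_average(grades, "Rick"))      # one student's average
```

Because the dict is built once, each later lookup is a plain key access rather than another scan of the file.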
I believe the reason you are not seeing the field the second time around is because the iterator returned by csv.DictReader() is a one-time iterator. That is to say, once you've reached the last row of the csv file, it will not reset to the first position.
So, by doing this:
data = csv.DictReader(open("C:\file.csv"))
for row in data:
    print row
you are exhausting the iterator. Try commenting out those lines and see if that helps.
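A small sketch of that behaviour, in Python 3 syntax and with made-up inline data in place of the real file: the second loop over the same reader sees nothing, while reading the rows into a list once lets you loop as often as you like.

```python
import csv
import io

# Inline sample standing in for the grade sheet.
SAMPLE = "Name,Test1\nRobin,89\nRick,72\n"

reader = csv.DictReader(io.StringIO(SAMPLE))
first_pass = [row["Name"] for row in reader]
second_pass = [row["Name"] for row in reader]  # the reader is already exhausted

print(first_pass)   # ['Robin', 'Rick']
print(second_pass)  # []

# One way around it: read the rows into a list once and loop over the list.
rows = list(csv.DictReader(io.StringIO(SAMPLE)))
names_again = [row["Name"] for row in rows]
names_once_more = [row["Name"] for row in rows]
```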
Write a script high_scores.py that will read in a CSV file of users' scores and display the
highest score for each person. The file you will read in is named scores.csv. You should store the high scores as values
in a dictionary with the associated names as dictionary keys. This way, as you read in
each row of data, if the name already has a score associated with it in the dictionary, you
can compare these two scores and decide whether or not to replace the "current" high
score in the dictionary.
Use the sorted() function on the dictionary's keys in order to display an ordered list of
high scores, which should match this output:
Empiro 23
L33tH4x 42
LLCoolDave 27
MaxxT 25
Misha46 25
O_O 22
johnsmith 30
red 12
tom123 26
scores.csv :
LLCoolDave,23
LLCoolDave,27
red,12
LLCoolDave,26
tom123,26
O_O,7
Misha46,24
O_O,14
Empiro,18
Empiro,18
MaxxT,25
L33tH4x,42
Misha46,25
johnsmith,30
Empiro,23
O_O,22
MaxxT,25
Misha46,24
I got stuck on how to check whether I need to replace the score for a given name:
import csv
dic = {}
with open("scores.csv", "r") as my_file:
    my_file_reader = csv.reader(my_file)
    for i in my_file_reader:
        dic[i[0]] = i[1]
If you run your code on the csv, you'll see that LLCoolDave's score is 26 instead of 27. This is because you update your dictionary every time a new entry is seen, and effectively, you're keeping the most recent scores -- not the highest. To fix this, you can try something like:
import csv
dic = {}
with open("scores.csv", "r") as my_file:
    my_file_reader = csv.reader(my_file)
    for row in my_file_reader:
        name = row[0]
        # Convert to int: as strings, "7" would compare greater than "14".
        score = int(row[1])
        if name in dic:
            dic[name] = max(dic[name], score)
        else:
            dic[name] = score
print(dic)
Essentially, we are first checking whether an entry exists for the current user. If yes, his best score is the maximum of the new score and the previous best score. Otherwise, his best score is just whatever the new score is.
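To finish the exercise, the names still have to be printed in sorted order. Here is one possible sketch, with the contents of scores.csv from the question inlined through io.StringIO so that it runs on its own; the function name high_scores is just an illustrative choice:

```python
import csv
import io

# Contents of scores.csv from the question, inlined for a self-contained run.
SCORES_CSV = """LLCoolDave,23
LLCoolDave,27
red,12
LLCoolDave,26
tom123,26
O_O,7
Misha46,24
O_O,14
Empiro,18
Empiro,18
MaxxT,25
L33tH4x,42
Misha46,25
johnsmith,30
Empiro,23
O_O,22
MaxxT,25
Misha46,24
"""

def high_scores(csv_file):
    """Return {name: highest score} for every name in the file."""
    best = {}
    for name, score in csv.reader(csv_file):
        score = int(score)
        # best.get(name, score) falls back to the new score for unseen names.
        best[name] = max(best.get(name, score), score)
    return best

best = high_scores(io.StringIO(SCORES_CSV))
for name in sorted(best):
    print(name, best[name])
```

sorted() orders uppercase names before lowercase ones, which is why johnsmith, red and tom123 come last in the expected output.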
Firstly, I apologize for my bad English, as it is not my native language, but I will try my best to explain everything as well as I can. I am trying to find the students with the highest and lowest credits in a .csv file.
Here is what my csv looks like:
And here is my code so far:
I appended the first names to a first_names array (and did the same with the last names and credits):
def arrays(i):
    import csv
    with open('FCredits.csv','r+') as f_data:
        csv_reader = csv.reader(f_data, delimiter=',')
        first_names = []
        last_names = []
        f_credits = []
        for row in csv_reader:
            csv_reader = csv.reader(f_data, delimiter=',')
            first_name = row[0]
            last_name = row[1]
            f_credit = row[2]
            first_names.append(first_name)
            last_names.append(last_name)
            f_credits.append(f_credit)
    find_min_max(first_names,last_names,f_credits)
But then I got stuck on the next part:
def find_min_max(first_names,last_names,f_credits):
    minVal, maxVal = [],[]
    for i in f_credits:
        minVal.append(f_credits)
        maxVal.append(f_credits)
    print(min(minVal))
    print(max(minVal))
Basically, what I wanted to do in the second part was to print out the students with the lowest and highest number of credits and write them to a new csv file, but I gave up halfway.
There are a few things that I have noted in your question:
The highest or lowest mark can be scored by multiple people.
That means the output may or may not be single and so a list is required.
Please see my code below:
def get_min_max(first_names,last_names,f_credits):
    # f_credits should hold numbers; strings read from a csv compare alphabetically.
    max_value = max(f_credits)
    min_value = min(f_credits)
    minVal = []
    maxVal = []
    for element in zip(first_names, last_names, f_credits):
        if element[-1] == max_value:
            maxVal.append(element)
        elif element[-1] == min_value:
            minVal.append(element)
    print(len(maxVal), maxVal)
    print(len(minVal), minVal)
    return maxVal, minVal
For the if/elif part, I suggest that you use a list comprehension to make the code concise and faster. I have written it out this way so that you can understand it better.
You may also want to read about the min, max and zip functions in the official Python documentation.
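For reference, a list-comprehension version might look roughly like the sketch below. It assumes the credits arrive as strings (as they do from csv.reader) and converts them to numbers first; the sample names and values are made up purely for illustration.

```python
def get_min_max(first_names, last_names, f_credits):
    # Values read from a csv are strings, so convert before comparing.
    credits = [float(c) for c in f_credits]
    students = list(zip(first_names, last_names, credits))
    max_value = max(credits)
    min_value = min(credits)
    # Several students can share the highest or the lowest value.
    max_students = [s for s in students if s[-1] == max_value]
    min_students = [s for s in students if s[-1] == min_value]
    return max_students, min_students

# Made-up sample data, only for illustration.
highest, lowest = get_min_max(
    ["Ann", "Bob", "Cy", "Dee"],
    ["Lee", "Ray", "Poe", "Kim"],
    ["12", "9", "120", "9"],
)
print(highest)
print(lowest)
```

Using two independent comprehensions also covers the corner case where every student has the same credits, in which case each student appears in both lists.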
Input (new.csv):
student Jack
Choice Phy
Choice Chem
Choice Maths
Choice Biology
student Jill
Choice Phy
Choice Biology
Choice Maths
Expected Output (out.csv)
Student Phy Chem Maths Biology
Jack Yes Yes Yes Yes
Jill Yes No Yes Yes
I am parsing new.csv and writing the result to out.csv. For each student name, I write Yes if a subject is among that student's choices and No if it is not (the subjects become the new header in out.csv).
Here I have used nested ifs to get the desired output. Please help me find a better, more Pythonic way to write this code.
I am a newbie to Python and eager to learn better ways of coding.
P.S.: The choices of subjects are not always in the same order.
import csv
la =[]
l2=[]
with open("new.csv","r",newline='\n') as k:
    k=csv.reader(k, delimiter=',', quotechar='_', quoting=csv.QUOTE_ALL)
    counter = 0
    for col in k :
        # number of rows in csv is 600
        if counter<=600:
            if col[0] =='student':
                la.append("\n "+col[1])
                a=next(k)
                if a[1] == 'Phy':
                    la.append('yes')
                    a = next(k)
                else:
                    la.append('no')
                if a[1] == 'Chem':
                    la.append('yes')
                    a = next(k)
                else:
                    la.append('no')
                if a[1] == 'Maths':
                    la.append('yes')
                    a = next(k)
                else:
                    la.append('no')
                if a[1] == 'Biology':
                    la.append('yes')
                    a = next(k)
                    counter += 1
                else:
                    la.append('no')
                    counter += 1
l2=",".join(la)
with open("out.csv","w") as w:
    w.writelines(l2)
IMHO, it is time to learn how to debug simple programs. Some IDEs come with nice debuggers, but you can still use the good old pdb or simply add print traces to your code to easily understand what happens.
The first and most evident problem is here:
tot = sum(1 for col in k)
It is pretty useless because for col in k would be enough, but it also consumes the whole k iterator, so the next line, for col in k:, tries to access an iterator that has already reached its end, and the loop stops immediately.
That is not all:
first line contains Student with an upper case S while you test for student with a lower case s: they are different strings... This case problem exists in all the other comparisons.
when you find student, you set a to the line following it... and never change it. So even if you fix your case errors, you will consistently use only that line for the student!
If you are a beginner, the rule is Keep It Simple, Stupid. So start from something you can control and then start to add other features:
read the input file with the csv module and just print the list for every row. Do not go further until this gives what you want! That would have saved you from the tot = sum(1 for col in k) error...
identify every student. Just print it first, then store its name in a list and print the list after the loop
identify the subjects. Just print them first, then feed a dictionary with the subjects
wonder how you can get that at the end of the loop...
just realize that you could store the student name in that dictionary, and put the full dictionary in the list (feel free to ask a new question if you are stuck there...)
print the list of dictionaries at the end of the loop
build one row per student that could feed the csv writer, or, as you already have a list of dicts, consider using a DictWriter.
Good luck in practicing Python!
Here is a possible way for the read part:
import csv
la = {} # use a dict to use the student name as index
with open("new.csv","r",newline='\n') as k:
    k=csv.reader(k, delimiter=',', quotechar='_', quoting=csv.QUOTE_ALL)
    # counter = 0 # pretty useless...
    for col in k :
        if col[0] =='student':
            l2 = set() # initialize a set to store subjects
            la[col[1]] = l2 # reference it in la indexed by the student's name
        else: # it should be a subject line
            l2.add(col[1]) # note the subject
# Ok la is a dict with student names as keys, and a set containing subjects for that student as value
print(la)
For the write part, you should:
build a union of all sets to get all the possible subjects (unless you already know them)
for each item (name, subjects) from la, build a list storing yes or no for each of the possible subject
write that list to the output csv file
...left as an exercise...
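For readers who want to check their attempt, the three steps above could be sketched as follows. The dict la is hard-coded here with the values the read part would produce for the sample input, the function name write_choices is made up, and an io.StringIO buffer stands in for the real out.csv. The subjects are sorted alphabetically, so the column order differs from the expected output in the question; pass an explicit list of subjects if a particular order matters.

```python
import csv
import io

# Assumed result of the read step: student name -> set of chosen subjects.
la = {
    "Jack": {"Phy", "Chem", "Maths", "Biology"},
    "Jill": {"Phy", "Biology", "Maths"},
}

def write_choices(la, out_file):
    # The union of all sets gives every subject that appears anywhere.
    subjects = sorted(set().union(*la.values()))
    writer = csv.writer(out_file)
    writer.writerow(["Student"] + subjects)
    for name, chosen in la.items():
        # One Yes/No cell per possible subject, in header order.
        writer.writerow([name] + ["Yes" if s in chosen else "No" for s in subjects])
    return subjects

buffer = io.StringIO()  # stands in for open("out.csv", "w", newline="")
write_choices(la, buffer)
print(buffer.getvalue())
```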
I am trying to create an attendance logger where I create a dictionary which I fill with student names. The names will be lists where I append their class attendance data (whether they attended class or not). The code I have so far is displayed below:
#! /bin/python3
#add student to dict
def add_student(_dict):
    student=input('Add student :')
    _dict[student]=[]
    return _dict
#collect outcomes
def collector(student,_dict, outcome):
    _dict[student].append(outcome)
    return _dict
#counts target
def count(_dict,target):
    for i in _dict:
        # records total attendance names
        attendance_stat = len(_dict[i])
        # records total instances absent
        freq_of_absence=_dict[i].count(target)
        # records percentage of absence
        perc_absence = float((freq_of_absence/attendance_stat)*100)
        print(i,'DAYS ABSENT =',freq_of_absence)
        print('TOTAL DAYS: ', i, attendance_stat)
        print('PERCENTAGE OF ABSENCE:', i, str(round(perc_absence, 2))+'%')
#main function
def main():
    #date=input('DATE: ')
    outcomes=['Y','N']
    student_names = {}
    try:
        totalstudents = int(input('NO. OF STUDENTS: '))
    except ValueError:
        print('input an integer')
        totalstudents = int(input('NO. OF STUDENTS: '))
    while len(student_names) < totalstudents:
        add_student(student_names)
    print(student_names)
    i = 0
    while i < totalstudents:
        i = i + 1
        target='Y'
        student=str(input('student :'))
        outcome=str(input('outcome :'))
        collector(student,student_names,outcome)
    count(student_names,target)
if __name__=='__main__':
    main()
The code works well so far, but when the number of students is large, entering the data takes so long that it cuts into class time. Since the number of absentees is usually smaller than the number present, is it possible to select the absent students from the dictionary, appending Y for each selected absentee and N to the lists of all the remaining students?
This isn't exactly what you're asking for, but I think it will help. Instead of asking the user to input a name each time for the second part, why not just print the name yourself, and only ask for the outcome? Your last while loop would then become a for loop instead, like this:
for student_name in student_names:
    outcome = input("Outcome for {}: ".format(student_name))
    collector(student_name, student_names, outcome)
You could also add some logic to check if outcome is an empty string, and if so, set it to 'N'. This would just allow you to hit enter for most of the names, and only have to type in 'Y' for the certain ones that are absent. That would look like this:
for student_name in student_names:
    outcome = input("Outcome for {}: ".format(student_name))
    if outcome == "":
        outcome = "N"
    collector(student_name, student_names, outcome)
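If you would rather type only the absentees, as the question asks, something along these lines might work. This is a sketch, not a drop-in replacement: mark_attendance is a made-up helper, the register contents are invented, and the list of absentees is passed in directly rather than read with input().

```python
def mark_attendance(records, absentees):
    """Append 'Y' (absent) for each listed name and 'N' for everyone else."""
    absent = set(absentees)
    unknown = absent - set(records)
    if unknown:
        # Catch misspelled names instead of silently ignoring them.
        raise KeyError("not in register: {}".format(sorted(unknown)))
    for name, history in records.items():
        history.append("Y" if name in absent else "N")
    return records

# Made-up register for illustration.
register = {"Ann": [], "Bob": ["N"], "Cy": []}
# In the real program the names could come from something like:
#   absentees = [n.strip() for n in input("Absent (comma separated): ").split(",") if n.strip()]
mark_attendance(register, ["Bob"])
print(register)
```

With this shape, a day on which nobody is absent needs no typing beyond an empty list.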
Right now I am trying to make a simple program to separate the javascript links of a website but I'm running into issues with a while loop.
Here would be an example of an input:
001_usa_wool.jpg
002_china_silk.jpg
003_canada_cotton.jpg
004_france_wool.jpg
done
A simplified version of my code with just 3 parts is the following:
def ParseData(input):
    data = input.split('_')
    d = {}
    d['sku'] = data[0]
    d['country'] = data[1].capitalize()
    d['material'] = data[2].capitalize()
    return d
def Sku():
    myData = ParseData(input)
    sku = myData['sku']
    return sku
def Country():
    myData = ParseData(input)
    country = myData['country']
    return country
def Material():
    myData = ParseData(input)
    material = myData['material']
    return material
def Output():
    print (Sku()+'\t'+
           Country()+'\t'+
           Material()+'\t'+
           '\n')
Now here is how I tried to read it line by line:
def CleanInput(input):
    clean = input.split('.jpg')
    count = 0
    while (clean[count] != 'done'):
        ParseData(clean[count])
        Output()
        count = count+1
input = input('Enter your data: ')
CleanInput(input)
I believe I am not implementing the while loop correctly, since my output is similar to:
001 Usa Wool
001 Usa Wool
001 Usa Wool
The issue is not exactly in your while loop, but in your functions Output(), Sku(), Material() and Country().
In the function Output(), you are printing the values by directly calling Sku(), etc.
In each of these functions (I will take Sku() as an example), you are calling ParseData on input. (This is a very bad name, by the way; please use a better one. With this name you are overwriting the built-in input function, and later on you cannot call input() to take input from the user.)
input always contains the entire string you entered, and hence all the .jpg names; when ParseData goes through it, it always picks up only the first one.
Instead of using input in each function, we should parameterize the functions and pass in the value that needs to be printed, as you are already doing for ParseData. Example code:
def Sku(toprint):
    myData = ParseData(toprint)
    sku = myData['sku']
    return sku
.
.
.
def Output(toprint):
    print (Sku(toprint)+'\t'+
           Country(toprint)+'\t'+
           Material(toprint)+'\t'+
           '\n')
And in the while loop send in the current value to print as parameter to Output() -
def CleanInput(input):
    clean = input.split('.jpg')
    count = 0
    while (clean[count] != 'done'):
        ParseData(clean[count])
        Output(clean[count])
        count = count+1
Also, please do not use input as a variable name; as I said earlier, it overwrites the built-in input function and can cause issues.
Personally, I would make it more pythonic:
def CleanInput(input):
    clean = input.split('.jpg')
    for count, elem in enumerate(clean):
        if elem == 'done':
            break
        ParseData(elem)
        Output(elem)  # assumes Output() takes the item to print as a parameter
    return count
input_data = input('Enter your data: ')
how_many = CleanInput(input_data)
Assuming you really need count. By the way: you aren't using the return value of ParseData.
You have too many functions that call each other and take on vague requirements. It's hard to see what returns something, what prints, and so on. For example, your CleanInput calls ParseData and Output, but Output calls Sku, Country, and Material, each of which also calls ParseData. Oh, and capitalized variables should be reserved for classes - use snake_case for functions.
>>> s = "001_usa_wool.jpg002_china_silk.jpg003_canada_cotton.jpg004_france_wool.jpgdone"
>>> print(*("{}\t{}\t{}".format(*map(str.capitalize, item.split('_')))
... for item in s.split('.jpg') if item != 'done'), sep='\n')
001 Usa Wool
002 China Silk
003 Canada Cotton
004 France Wool
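If the one-liner is too dense, roughly the same idea can be spread over two small snake_case functions, one that parses a single name and one that walks the whole input. The names parse_item and parse_all are illustrative only, and the sketch assumes every name has exactly three underscore-separated parts.

```python
def parse_item(item):
    """Split '001_usa_wool' into a dict with sku, country and material."""
    sku, country, material = item.split('_')
    return {
        'sku': sku,
        'country': country.capitalize(),
        'material': material.capitalize(),
    }

def parse_all(raw):
    """Parse every '.jpg' name in raw, stopping at the 'done' marker."""
    items = []
    for name in raw.split('.jpg'):
        name = name.strip()
        if name == 'done':
            break
        if name:
            items.append(parse_item(name))
    return items

raw = "001_usa_wool.jpg002_china_silk.jpg003_canada_cotton.jpg004_france_wool.jpgdone"
for d in parse_all(raw):
    print('\t'.join([d['sku'], d['country'], d['material']]))
```

Each function does one thing and returns its result, so the parsing is done once per item and printing is kept separate from it.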