First, I apologize for my English; it is not my native language, but I will try to explain everything as clearly as I can. I am trying to find the students with the highest and lowest credits in a .csv file.
Here is what my CSV looks like:
And here is my code so far:
I append the first names to a first_names list (and do the same with the last names and credits).
def arrays(i):
    import csv
    with open('FCredits.csv','r+') as f_data:
        csv_reader = csv.reader(f_data, delimiter=',')
        first_names = []
        last_names = []
        f_credits = []
        for row in csv_reader:
            # one reader is enough; it does not need to be re-created per row
            first_name = row[0]
            last_name = row[1]
            f_credit = row[2]
            first_names.append(first_name)
            last_names.append(last_name)
            f_credits.append(f_credit)
    find_min_max(first_names,last_names,f_credits)
But then I got stuck on the next part:
def find_min_max(first_names,last_names,f_credits):
    minVal, maxVal = [],[]
    for i in f_credits:
        minVal.append(f_credits)
        maxVal.append(f_credits)
    print(min(minVal))
    print(max(minVal))
Basically, what I wanted to do in the second part was to print the students with the lowest and highest number of credits and write them to a new CSV file, but I gave up halfway.
There are a few things I noticed in your question:
The highest or lowest number of credits can be shared by several students.
That means each result may contain more than one student, so each one should be a list.
Please see my code below:
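def get_min_max(first_names,last_names,f_credits):
    # note: values read by csv.reader are strings, and max/min compare strings
    # as text; convert the credits to int first if you need numeric ordering
    max_value = max(f_credits)
    min_value = min(f_credits)
    minVal = []
    maxVal = []
    for element in zip(first_names, last_names, f_credits):
        if element[-1] == max_value:
            maxVal.append(element)
        # elif: if every student has the same credits, they all land in maxVal only
        elif element[-1] == min_value:
            minVal.append(element)
    print(len(maxVal), maxVal)
    print(len(minVal), minVal)
    return maxVal, minVal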
For the if/elif part, I suggest using a list comprehension to make the code more concise and usually a little faster. I have written it as an explicit loop so that it is easier to follow.
You may also want to read about the min, max and zip functions in the official Python documentation.
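The list-comprehension variant mentioned above could look roughly like the sketch below. It is written for Python 3 and assumes the credits arrive as strings (as csv.reader yields them) and can be converted with int; the sample names and values are made up for illustration and are not from the asker's file.

```python
def get_min_max(first_names, last_names, f_credits):
    # Assumption: credits are numeric strings, so convert them to int;
    # otherwise "9" would compare greater than "10".
    credits = [int(c) for c in f_credits]
    max_value = max(credits)
    min_value = min(credits)
    students = list(zip(first_names, last_names, credits))
    # One comprehension per result, so ties are kept in both lists.
    max_val = [s for s in students if s[-1] == max_value]
    min_val = [s for s in students if s[-1] == min_value]
    return max_val, min_val


# Hypothetical sample data.
highest, lowest = get_min_max(
    ["Ann", "Bob", "Cid"], ["Lee", "Ray", "Fox"], ["9", "10", "9"]
)
print(highest)
print(lowest)
```

Because the two comprehensions are independent, a student who is both the minimum and the maximum (for example, when everyone has the same credits) appears in both results.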
I'm getting this issue in quite a few different places throughout my code, but I'll only post the simplest code where it occurs so I can learn from it.
My code works sometimes, and other times it doesn't. When it doesn't work, I get IndexError: list index out of range. It's in a class called Students; data refers to a .txt file that has about 800 students in it.
def SearchStudent(self, data):
    students = []
    with open(data, "r") as datafile:
        for line in datafile:
            datum = line.split()
            students.append(datum)
    searchFirstName = input('Enter students first name: ')
    for datum in students:
        if datum[1] == searchFirstName:
            print(datum)
The error seems to hit the if datum[1] == searchFirstName: line when it happens, but I'm struggling to wrap my head around why.
Revise it as below to add a basic check:
for datum in students:
    if len(datum) > 1 and datum[1] == searchFirstName:
        # index 1 exists only when the list has at least two items, hence len(datum) > 1
        print(datum)
It could be because a line in the data file is empty. For an empty line, the following statement produces an empty list:
    datum = line.split()
so datum[1] in your if condition raises the IndexError.
You can add another condition, if len(datum) > 1:, before the condition if datum[1] == searchFirstName:
i.e.
    if len(datum) > 1:
        if datum[1] == searchFirstName:
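As a rough illustration of both suggestions, the sketch below drops blank lines while loading and still guards against rows with a single field while searching. The function names load_students and search_first_name are hypothetical, the sample lines stand in for the .txt file, and the assumption that the first name is the second field comes from the question's use of datum[1].

```python
def load_students(lines):
    """Split each line into fields, dropping lines that are blank."""
    students = []
    for line in lines:
        datum = line.split()
        if datum:  # an empty list means the line held only whitespace
            students.append(datum)
    return students


def search_first_name(students, first_name):
    # len(d) > 1 guards rows that contain only one field.
    return [d for d in students if len(d) > 1 and d[1] == first_name]


# Hypothetical rows standing in for the data file.
sample = ["101 Ann Lee\n", "\n", "102 Bob Ray\n", "103\n"]
matches = search_first_name(load_students(sample), "Bob")
print(matches)
```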
Suppose I create a dictionary
d = {i:False for i in range(0,100)}
Then I make a list
l = [d[12], d[10], d[70]]
and then change the dictionary:
d[12] = True...
the list doesn't change, is this behavior expected? If so, how can I add the values as references?
I'm doing this in a more complicated context, but this is the first thing I wanted to investigate (of many potential issues, I just wrote this).
Here's the full code:
import csv
# create a bingo num flyweight --> we don't have to update every board
b = {i:False for i in range(0,100)}
boards = []
with open("04.csv") as f:
    reader = csv.reader(f)
    # get the numbers to be drawn
    draw = list(map(int, next(reader)))
    # skip the first emptyline before boards start
    next(reader)
    # we want to store each board as a list of rows and columns
    # i.e. 5x5 board is a 5x10 list, this will make checking easier
    board = []
    for row in reader:
        print(row)
        if len(row) < 1:
            boards.append(board[:] + list(map(list, list(zip(*board)))))
            board = []
        else:
            board.append([b[int(i)] for i in filter(None,row[0].split(" "))])
for i in range(0, 100):
    b[i] = True
print(boards[0])
I suspect the way that I append board to boards could be the culprit...
You might recognize this as an 'advent of code' problem, I promise I can solve it with a more rudimentary technique, I just don't want to give up on this method... thank you!
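For what it's worth, the behavior described in this question is what Python is expected to do: d[12] evaluates to the object False, the list stores that object, and d[12] = True later rebinds the dictionary key to a different object without touching the list. One possible way to share state, sketched below for Python 3, is to store a small mutable object per number and mutate it in place. The Cell class and its marked attribute are hypothetical names introduced only for this illustration.

```python
# Plain values: the list keeps the old object after the key is rebound.
d = {i: False for i in range(100)}
plain = [d[12], d[10], d[70]]
d[12] = True
print(plain)


class Cell:
    """Hypothetical mutable wrapper for one number's marked state."""

    def __init__(self):
        self.marked = False


cells = {i: Cell() for i in range(100)}

# The list holds the very same Cell objects that the dictionary holds.
row = [cells[12], cells[10], cells[70]]

# Mutate the object in place instead of rebinding the dictionary key.
cells[12].marked = True
shared = [c.marked for c in row]
print(shared)
```

The same idea would apply to the boards: each board entry would reference a Cell, and a row or column would be complete when every Cell in it is marked.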
For an assignment, I have to take a large number of "lottery" values from an Excel sheet (individually, a little over 2,300 integers, but they're grouped in six) and create an algorithm to try and predict the most likely next six numbers that will be drawn. So far, I have split up 396 six-number/six-slot combinations into arrays.
My plan was to count how many times each number (possible values, 1-75) shows up in each slot and divide it by 396, but I'm not exactly sure how to go through each slot array, count the matches, and store the percentages for each possible value.
The only way I have been able to count the matches was a super simple but absolutely humongous block of code (I created a variable for each possible value from 1 to 75 and wrote a for loop with nested if statements that simply incremented the matching variable). I couldn't figure out how to condense it, and it is horrible to use for completing the algorithm. I still have it, but I'm not going to post that part. The rest of my code is below:
# Open the Lottery_Data.csv and return the raw data.
def open_csv_file():
    file = open("Lottery_Data.csv", "r")
    csv_data = file.read()
    csv_data = csv_data.split("\n")
    file.close()
    return csv_data

# Main Function
def main():
    # Copy raw data from CSV file into a variable.
    csv_data = open_csv_file()
    # Create a separate array for the date (day and specific date), and the six lottery number-slots.
    lottery_date_day = []
    lottery_date_specific = []
    lottery_col_1 = []
    lottery_col_2 = []
    lottery_col_3 = []
    lottery_col_4 = []
    lottery_col_5 = []
    lottery_col_6 = []
    # Split the raw data from csv_data and spread them into the arrays
    for i in range(len(csv_data)-2):
        line = csv_data[i].split(",")
        l_d_d = line[0]
        l_d_s = line[1]
        l1 = line[2]
        l2 = line[3]
        l3 = line[4]
        l4 = line[5]
        l5 = line[6]
        l6 = line[7]
        lottery_date_day.append(l_d_d)
        lottery_date_specific.append(l_d_s)
        lottery_col_1.append(l1)
        lottery_col_2.append(l2)
        lottery_col_3.append(l3)
        lottery_col_4.append(l4)
        lottery_col_5.append(l5)
        lottery_col_6.append(l6)
    # Go into lottery arrays 1-6 and find how many matches for numbers 1-75 there are
    # Divide the total matches from the total entries for each number's percentage

main()
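One way the counting step described in the last two comments might be condensed is with collections.Counter, one counter per slot. The sketch below is for Python 3 and assumes each draw has already been converted to a list of six integers (the code above keeps them as strings); the function name slot_percentages and the three sample draws are made up for illustration.

```python
from collections import Counter


def slot_percentages(draws, low=1, high=75):
    """Return one dict per slot mapping each number to its share of draws."""
    total = len(draws)
    result = []
    for slot in zip(*draws):  # transpose: one tuple of values per slot
        counts = Counter(slot)  # missing numbers count as 0
        result.append({n: counts[n] / total for n in range(low, high + 1)})
    return result


# Hypothetical draws (three instead of 396), already converted to int.
draws = [
    [1, 12, 23, 34, 45, 56],
    [1, 15, 23, 40, 45, 60],
    [7, 12, 23, 34, 50, 56],
]
percentages = slot_percentages(draws)
print(percentages[0][1])
print(percentages[2][23])
```

Here percentages[s][n] is the fraction of draws in which number n appeared in slot s (counting slots from 0), which replaces the 75 separate counter variables.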
Input (new.csv):
student Jack
Choice Phy
Choice Chem
Choice Maths
Choice Biology
student Jill
Choice Phy
Choice Biology
Choice Maths
Expected Output (out.csv):
Student  Phy  Chem  Maths  Biology
Jack     Yes  Yes   Yes    Yes
Jill     Yes  No    Yes    Yes
I am parsing new.csv and writing the result to out.csv. For each student name, I write Yes if a subject is among the student's choices and No if it is not (the subjects become the new header in out.csv).
I have used nested ifs to get the desired output. Please help me with a more Pythonic way to write this.
I am a newbie to Python and eager to learn better ways of coding.
P.S.: The subject choices are not always in the same order.
import csv
la =[]
l2=[]
with open("new.csv","r",newline='\n') as k:
    k=csv.reader(k, delimiter=',', quotechar='_', quoting=csv.QUOTE_ALL)
    counter = 0
    for col in k :
        # number of rows in csv is 600
        if counter<=600:
            if col[0] =='student':
                la.append("\n "+col[1])
                a=next(k)
                if a[1] == 'Phy':
                    la.append('yes')
                    a = next(k)
                else:
                    la.append('no')
                if a[1] == 'Chem':
                    la.append('yes')
                    a = next(k)
                else:
                    la.append('no')
                if a[1] == 'Maths':
                    la.append('yes')
                    a = next(k)
                else:
                    la.append('no')
                if a[1] == 'Biology':
                    la.append('yes')
                    a = next(k)
                    counter += 1
                else:
                    la.append('no')
                    counter += 1
l2=",".join(la)
with open("out.csv","w") as w:
    w.writelines(l2)
IMHO, it is time to learn how to debug simple programs. Some IDEs come with nice debuggers, but you can still use the good old pdb or simply add print traces to your code to easily understand what happens.
Here, the first and most evident problem is this line:
tot = sum(1 for col in k)
It is pretty useless because for col in k would be enough, and it consumes the whole k iterator, so the next line, for col in k:, tries to read from an iterator that has already reached its end and the loop stops immediately.
That is not all:
The first line contains Student with an upper case S while you test for student with a lower case s: they are different strings. The same case problem exists in all the other comparisons.
When you find student, you set a to the line following it and never change it. So even if you fix your case errors, you will consistently use only that line for the student!
If you are a beginner, the rule is Keep It Simple, Stupid. So start from something you can control and then add other features:
1. Read the input file with the csv module and just print the list for every row. Do not go further until this gives what you want! That would have saved you from the tot = sum(1 for col in k) error.
2. Identify every student. Just print the name first, then store it in a list and print the list after the loop.
3. Identify the subjects. Just print them first, then feed a dictionary with the subjects.
4. Wonder how you can get that dictionary at the end of the loop...
5. Realize that you could store the student name in that dictionary, and put the full dictionary in the list (feel free to ask a new question if you are stuck there).
6. Print the list of dictionaries at the end of the loop.
7. Build one row per student that could feed the csv writer or, as you already have a list of dicts, consider using a DictWriter.
Good luck in practicing Python!
Here is a possible way for the read part:
import csv
la = {} # use a dict to use the student name as index
with open("new.csv","r",newline='\n') as k:
    k=csv.reader(k, delimiter=',', quotechar='_', quoting=csv.QUOTE_ALL)
    # counter = 0 # pretty useless...
    for col in k :
        if col[0] =='student':
            l2 = set() # initialize a set to store subjects
            la[col[1]] = l2 # reference it in la indexed by the student's name
        else: # it should be a subject line
            l2.add(col[1]) # note the subject
# OK, la is a dict with student names as keys, and a set containing the subjects for that student as value
print(la)
For the write part, you should:
1. build a union of all the sets to get all the possible subjects (unless you already know them)
2. for each item (name, subjects) from la, build a list storing yes or no for each of the possible subjects
3. write that list to the output csv file
...left as an exercise...
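For readers who want to check their own attempt, the three steps above could be sketched as follows (Python 3). This is only one possible shape, not the answerer's solution: build_rows is a hypothetical helper, the la dict is typed in by hand to mirror the sample input, and io.StringIO stands in for the real out.csv file.

```python
import csv
import io


def build_rows(la, subjects=None):
    """Turn {name: set_of_subjects} into rows for csv.writer."""
    if subjects is None:
        # Step 1: union of all sets; sorted only to get a stable column order.
        subjects = sorted(set().union(*la.values()))
    rows = [["Student"] + list(subjects)]
    for name, chosen in la.items():
        # Step 2: one Yes/No entry per possible subject.
        rows.append([name] + ["Yes" if s in chosen else "No" for s in subjects])
    return rows


# Same shape as the dict built by the read part.
la = {
    "Jack": {"Phy", "Chem", "Maths", "Biology"},
    "Jill": {"Phy", "Biology", "Maths"},
}
rows = build_rows(la, subjects=["Phy", "Chem", "Maths", "Biology"])

# Step 3: io.StringIO stands in for open("out.csv", "w", newline="").
out = io.StringIO()
csv.writer(out).writerows(rows)
print(out.getvalue())
```

Passing the subject list explicitly keeps the header in the order of the expected output; leaving it out falls back to the alphabetical union of everything that was seen.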
I am so new to python (a week in) so I hope I ask this question properly.
I have imported a grade sheet in csv format into python 2.7. The first column is the name of the student and the column titles are the name of the assignments. So the data looks something like this:
Name   Test1  Test2  Test3
Robin  89     78     100
...
Rick   72     100    98
I want to be able to do (or have someone else do) 3 things just by typing in the name of the person and the assignment.
1. Get the score for that person for that assignment
2. Get the average score for that assignment
3. Get that person's average score
But for some reason I get lost figuring out how to get Python to recognize the field I am trying to look up. This is what I have so far (the only part that works is reading the file):
data = csv.DictReader(open("C:\file.csv"))
for row in data:
    print row
def grade()
    student= input ("Enter a student name: ")
    assignment= input("Enter a assignment: ")
    for row in data:
        task_grade= data.get(int(row["student"], int(row["assignment"])) # specific grade
        task_total= sum(int(row['assignment'])) #assignment total
        student_total= #student assignments total-- no clue how to do this
        task_average= task_total/11
        average_score= student_total/9
You can access the individual "columns" of your csv this way:
import csv

def parse_csv():
    csv_file = open('data.csv', 'r')
    r = csv.reader(csv_file)
    grade_averages = {}
    for row in r:
        if row[0].startswith('Name'):
            continue
        #print "Student: ", row[0]
        grades = []
        for column in row[1:]:
            #print "Grade: ", column
            grades.append(int(column.strip()))
        grade_total = 0
        for i in grades:
            grade_total += i
        grade_averages[row[0]] = grade_total / len(grades)
    #print "grade_averages: ", grade_averages
    return grade_averages

def get_grade(student_name):
    grade_averages = parse_csv()
    return grade_averages[student_name]

print "Rick: ", get_grade('Rick')
print "Robin: ", get_grade('Robin')
What you are trying to do is not meant for Python because you have keys and values. However...
If you know that your columns are always the same, no need to use keywords, you can use positions:
Here is the easy, inefficient* way to do 1 and 3:
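students_name = ...
number = ...
# raw string, so that \f in the path is not read as a form-feed escape
for line in open(r"C:\file.csv").readlines():
    # assumes comma-separated fields, since the file is a csv
    items = line.strip().split(",")
    num_assignments = len(items)-1
    name = items[0]
    if name == students_name:
        print("assignment score: {0}".format(items[number]))
        asum = 0
        for k in range(0,num_assignments):
            # the fields are strings, so convert before adding
            asum += int(items[k+1])
        print("their average: {0}".format(asum / float(num_assignments)))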
To do 2, you should precompute the averages and return them, because the average for each assignment is the same for every user query.
I say easy and *inefficient because you search the text file again each time a name is entered. To do it properly, you should probably build a dictionary of all names and their information. But that solution is more complicated, and you are only a week in! Moreover, it's longer, and you should give it a try yourself. Look up dict.
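A rough sketch of that dictionary-based approach is shown below. It is written for Python 3 rather than the asker's 2.7, and it assumes the file is comma-separated with a Name header column; the helper names are hypothetical, and io.StringIO with the two rows shown in the question stands in for the real file.

```python
import csv
import io


def load_grades(f):
    """Map each student name to a dict of {assignment: score}."""
    return {
        row["Name"]: {k: int(v) for k, v in row.items() if k != "Name"}
        for row in csv.DictReader(f)
    }


def student_average(grades, student):
    scores = grades[student].values()
    return sum(scores) / len(scores)


def assignment_average(grades, assignment):
    scores = [g[assignment] for g in grades.values()]
    return sum(scores) / len(scores)


# Hypothetical comma-separated data using the two rows from the question.
sample = io.StringIO("Name,Test1,Test2,Test3\nRobin,89,78,100\nRick,72,100,98\n")
grades = load_grades(sample)
print(grades["Rick"]["Test2"])              # 1. one student's score
print(assignment_average(grades, "Test1"))  # 2. average for an assignment
print(student_average(grades, "Rick"))      # 3. one student's average
```

The file is read once into grades, so each later query is a dictionary lookup instead of another pass over the file.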
I believe the reason you are not seeing the field the second time around is because the iterator returned by csv.DictReader() is a one-time iterator. That is to say, once you've reached the last row of the csv file, it will not reset to the first position.
So, by doing this:
data = csv.DictReader(open("C:\file.csv"))
for row in data:
    print row
you use the iterator up before the rest of your code gets to read it. Try commenting out those lines and see if that helps.
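The exhaustion effect, and one common way around it (keeping the rows in a list), can be seen in the small Python 3 sketch below. The io.StringIO object and its contents are hypothetical stand-ins for the real file.

```python
import csv
import io

# io.StringIO stands in for the real file; the contents are made up.
f = io.StringIO("Name,Test1\nRobin,89\nRick,72\n")

reader = csv.DictReader(f)
first_pass = [row["Name"] for row in reader]
second_pass = [row["Name"] for row in reader]  # the reader is already exhausted

f.seek(0)
rows = list(csv.DictReader(f))  # keep the rows so they can be reused
names_once = [row["Name"] for row in rows]
names_twice = [row["Name"] for row in rows]
print(first_pass, second_pass, names_twice)
```

Once the rows are in a list, they can be looped over as many times as needed without reopening the file.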