Sum of rows from CSV

I have the following code:
with open("expenses.csv") as read_exp:
    reader = csv.reader(read_exp, delimiter=',')
    header = next(reader)
    if header != None:
        for row in reader:
            month_str = row[0]
            month_dt = datetime.strptime(month_str, '%d/%m/%Y').month
            if month_dt == month1:
                sum1 = sum((map(int,row[2:7])))
                print(sum1)
This gives me the sum of each individual row that is from the month I am looking for.
Output:
Enter selected month number: 7
Selected Month is: July
15
26
7
23
21
19
30
Is there a way to combine the individual sums into one total sum?
My csv is as below:
Date,Budget,Groceries,Transport,Food,Bills,Others
12/7/2021,30,1,0,4,2,8
13/7/2021,30,9,3,5,7,2
14/7/2021,30,3,3,0,0,1
15/7/2021,30,1,0,10,7,5
16/7/2021,30,9,9,0,2,1
17/7/2021,30,0,6,4,1,8
18/7/2021,30,0,9,9,8,4
16/8/2021,30,7,10,7,10,1
17/8/2021,30,5,6,10,9,1
18/8/2021,30,6,1,9,10,5
19/8/2021,30,0,8,8,3,5
20/8/2021,30,4,0,6,9,4
21/8/2021,30,6,2,1,1,5
22/8/2021,30,3,3,1,1,10
13/9/2021,30,8,2,9,4,6
14/9/2021,30,10,7,10,5,7
15/9/2021,30,5,5,6,9,6
16/9/2021,30,5,7,4,6,2
17/9/2021,30,3,7,10,5,7
18/9/2021,30,8,9,6,8,1
19/9/2021,30,5,3,1,9,5

I assume you want to print the total for the whole month, correct?
If so, keep a running total in a variable such as total_sum: initialise it to 0 before the loop and add each row's sum1 to it, like this:
import csv
from datetime import datetime
total_sum = 0  # running total for the selected month
with open("expenses.csv") as read_exp:
    reader = csv.reader(read_exp, delimiter=',')
    header = next(reader)
    if header is not None:
        for row in reader:
            month_str = row[0]
            month_dt = datetime.strptime(month_str, '%d/%m/%Y').month
            if month_dt == month1:  # month1 is the month number entered by the user
                sum1 = sum(map(int, row[2:7]))
                print(sum1)
                total_sum += sum1
print(total_sum)
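As a self-contained sketch of the running-total idea, the helper below takes any file-like object, so it can be tried on an in-memory sample of the CSV from the question. The name sum_month and its parameters are hypothetical and not part of the original post; it assumes dates in the first column use the %d/%m/%Y format.

```python
import csv
import io
from datetime import datetime


def sum_month(csv_file, month, first_col=2, last_col=7):
    """Return the total of columns first_col..last_col-1 for rows in `month`.

    Hypothetical helper: assumes the first column holds %d/%m/%Y dates,
    as in the question's expenses.csv.
    """
    reader = csv.reader(csv_file)
    next(reader, None)  # skip the header row if there is one
    total = 0
    for row in reader:
        if not row:
            continue  # ignore blank lines
        if datetime.strptime(row[0], "%d/%m/%Y").month == month:
            total += sum(map(int, row[first_col:last_col]))
    return total


sample = io.StringIO(
    "Date,Budget,Groceries,Transport,Food,Bills,Others\n"
    "12/7/2021,30,1,0,4,2,8\n"
    "13/7/2021,30,9,3,5,7,2\n"
    "16/8/2021,30,7,10,7,10,1\n"
)
july_total = sum_month(sample, 7)
print(july_total)  # 15 + 26 = 41
```

Returning the total instead of printing it inside the loop keeps the function reusable; with a real file you would pass the object returned by open("expenses.csv").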

Related

Ignoring a specific letter grade when calculating average student grade

I'm trying to calculate the average numeric grade for students in section 14224 (the value in row[0]). How do I tell my program to ignore any grades that are 'W'?
import sys
import csv
def findnumericgrade(grade):
    if grade == 'A':
        return 4.0
    elif grade == 'B':
        return 3.0
    else:
        return 2.0
def loaddata(filename, course):
    count = 0
    total = 0.0
    with open(filename, 'r') as f:
        lines = csv.reader(f)
        next(lines)
        for row in lines:
            if course in row[0]:
                get_grade = findnumericgrade(row[3])
                total += float(get_grade)
                count += 1
    avg = total / count
    print(f"The {course} average is: {round(avg, 2)}")
loaddata('studentdata.csv', sys.argv[1])
#example of studentdata.csv:

There are a number of ways to do this. The easiest is probably to check for the 'W' string and skip that row, using the continue statement to move on to the next iteration of the loop.
def loaddata(filename, course):
    count = 0
    total = 0.0
    with open(filename, 'r') as f:
        lines = csv.reader(f)
        next(lines)
        for row in lines:
            if row[3] == 'W':
                continue  # Go to next iteration in loop
            if course in row[0]:
                get_grade = findnumericgrade(row[3])
                total += float(get_grade)
                count += 1
    avg = total / count
    print(f"The {course} average is: {round(avg, 2)}")
You can also do this by adding an and condition to your if statement, so that it also checks that Course_Grade is not 'W'.
def loaddata(filename, course):
    count = 0
    total = 0.0
    with open(filename, 'r') as f:
        lines = csv.reader(f)
        next(lines)
        for row in lines:
            if course in row[0] and row[3] != 'W':
                get_grade = findnumericgrade(row[3])
                total += float(get_grade)
                count += 1
    avg = total / count
    print(f"The {course} average is: {round(avg, 2)}")
The above solutions are probably most practical, since this looks like some sort of utility script, but depending on how large you expect your dataset to be, you could use something like pandas. Then you'd have access to all of the data manipulation and analysis tools it offers.
import sys
import pandas as pd
def find_numeric_grade(grade):
    if grade == 'A':
        return 4.0
    elif grade == 'B':
        return 3.0
    else:
        return 2.0
df = pd.read_csv('studentdata.csv')
section_number = int(sys.argv[1])
# Keep only the requested section and drop withdrawn ('W') grades
mask = (df['Section_Number'] == section_number) & (df['Course_Grade'] != 'W')
print(df[mask]['Course_Grade'].apply(find_numeric_grade).mean())
*Solutions tested with the following data in studentdata.csv
Section_Number,Prof_ID,Student_ID,Course_Grade,Student_Name,Course_ID
14224,5,109,B,John Smith,IT1130
14224,5,110,B,Jennifer Johnson,IT1130
14224,5,111,W,Kristen Hawkins,IT1130
14224,5,112,A,Tom Brady,IT1130
14224,5,113,C,Cam Newton,IT1130
14224,5,114,C,Tim Tebow,IT1130
14225,5,115,A,Peyton Manning,IT1130
14225,5,116,B,Maria Sharapova,IT1130
14225,5,117,W,Brian McCoy,IT1130
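The csv-based approaches above can also be sketched as a small function that returns the average instead of printing it. This is an illustration rather than the answerer's code: GRADE_POINTS and average_grade are hypothetical names, the section is compared for equality instead of with in, and returning None when no graded rows match is an added assumption that avoids a ZeroDivisionError.

```python
import csv
import io

# Hypothetical lookup table; letters not listed fall back to 2.0,
# mirroring the else branch of findnumericgrade above.
GRADE_POINTS = {"A": 4.0, "B": 3.0}


def average_grade(csv_file, section):
    """Average numeric grade for `section`, ignoring 'W' rows.

    Returns None when no graded rows match (an assumption of this sketch).
    """
    reader = csv.DictReader(csv_file)
    points = [
        GRADE_POINTS.get(row["Course_Grade"], 2.0)
        for row in reader
        if row["Section_Number"] == section and row["Course_Grade"] != "W"
    ]
    return sum(points) / len(points) if points else None


sample = io.StringIO(
    "Section_Number,Prof_ID,Student_ID,Course_Grade,Student_Name,Course_ID\n"
    "14224,5,109,B,John Smith,IT1130\n"
    "14224,5,111,W,Kristen Hawkins,IT1130\n"
    "14224,5,112,A,Tom Brady,IT1130\n"
    "14225,5,115,A,Peyton Manning,IT1130\n"
)
result = average_grade(sample, "14224")
print(result)  # (3.0 + 4.0) / 2 = 3.5
```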

        if course in row[0]:
            if row[3] != 'W':
                get_grade = findnumericgrade(row[3])
                total += float(get_grade)
                count += 1
avg = total / count

openpyxl start writing from particular column/cell

I have the following code:
ws = wb.worksheets[1]
print(ws)
with open('out.txt', 'r+') as data:
    reader = csv.reader(data, delimiter='\t')
    for row in reader:
        print(row)
        ws.append(row)
wb.save('test.xlsx')
By default the data is written to the xlsx file starting from A1.
Is there a more convenient way to start appending data from, say, C2?
Or is setting xxx.cell(row=xx, column=yy).value = zz for each cell the only option, like this?
i = 2
j = 3
with open('out.txt', 'r+') as data:
    reader = list(csv.reader(data, delimiter='\t'))
    for row in reader:
        for element in row:
            ws.cell(row=i, column=j).value = element
            j += 1
        j = 3
        i += 1

Just pad the rows with Nones:
ws.append([])  # move to row 2
for row in reader:
    row = [None] * 2 + row  # two empty cells, so the data starts in column C
    ws.append(row)
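The padding idea can be generalised to any starting cell. The sketch below is pure Python and makes no openpyxl calls itself; pad_rows is a hypothetical helper name, and it assumes the rows are appended in order to an empty worksheet, where each append (including an empty one, as in the answer above) moves down one row.

```python
def pad_rows(rows, start_row=2, start_col=3):
    """Yield rows padded so that appending them in order to an empty
    sheet places the first value at (start_row, start_col), 1-indexed."""
    for _ in range(start_row - 1):
        yield []  # empty rows push the data down
    for row in rows:
        yield [None] * (start_col - 1) + list(row)


data = [["a", "b"], ["c", "d"]]
padded = list(pad_rows(data))
print(padded)
# Each item could then be passed to ws.append(item) on an empty worksheet.
```

With the defaults the data would start at C2, matching the question.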

How do I put all my looped output in a variable (for generating an output file)? (CSV related)

I am quite new to Python, so I hope you can help me out here. I have to write a program that opens a CSV file, reads it, and lets you select the columns you want by entering their numbers. Those columns have to be written to a new file. The problem: after I enter the columns I want and type "X" to start the main part, the program generates exactly what I want, but it does so by printing inside a loop rather than by printing a variable that contains the result. For the CSV writer I need a variable containing it. Any ideas? My code is below; feel free to ask questions. The CSV file looks like this:
john, smith, 37, blue, michigan
tom, miller, 25, orange, new york
jack, o'neill, 40, green, Colorado Springs
...etc
Code is:
import csv
with open("test.csv","r") as t:
    t_read = csv.reader(t, delimiter=",")
    t_list = []
    max_row = 0
    for row in t_read:
        if len(row) != 0:
            if max_row < len(row):
                max_row = len(row)
            t_list = t_list + [row]
            print([row], sep = "\n")
    twrite = csv.writer(t, delimiter = ",")
    tout = []
    counter = 0
    matrix = []
    for i in range(len(t_list)):
        matrix.append([])
    print(len(t_list), max_row, len(matrix), "Rows / Columns / Matrix Dimension")
    eingabe = input("Enter column number you need or X to start generating output: ")
    nr = int(eingabe)
    while type(nr) == int:
        colNr = nr-1
        if max_row > colNr and colNr >= 0:
            nr = int(nr)
            # print (type(nr))
            for i in range(len(t_list)):
                row_A=t_list[i]
                matrix[i].append(row_A[int(colNr)])
                print(row_A[int(colNr)])
            counter = counter +1
            matrix.append([])
        else:
            print("ERROR")
        nr = input("Enter column number you need or X to start generating output: ")
        if nr == "x":
            print("\n"+"Generating Output... " + "\n")
            for row in matrix:
                # Loop over columns.
                for column in row:
                    print(column + " ", end="")
                print(end="\n")
        else:
            nr = int(nr)
    print("\n")
t.close()

Well, you have everything you need in matrix, apart from one erroneous line that adds an unneeded row:
    counter = counter +1
    matrix.append([])  # <= remove this line
else:
    print("ERROR")
You can then simply do:
if nr == "x":
    print("\n"+"Generating Output... " + "\n")
    # newline="" stops the csv module from writing blank lines on Windows
    with open("testout.csv", "w", newline="") as out:
        wr = csv.writer(out, delimiter=",")
        wr.writerows(matrix)
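To see the whole flow in one place, here is a minimal sketch that selects columns into a list of rows and hands that list to writerows. It is an illustration under assumptions, not the asker's program: select_columns is a hypothetical helper, the column numbers are hard-coded instead of read with input(), the sample has no spaces after the commas, and io.StringIO stands in for the real input and output files.

```python
import csv
import io


def select_columns(rows, column_numbers):
    """Return a new list of rows holding only the chosen 1-based columns."""
    return [[row[n - 1] for n in column_numbers] for row in rows]


source = io.StringIO(
    "john,smith,37,blue,michigan\n"
    "tom,miller,25,orange,new york\n"
)
rows = [row for row in csv.reader(source) if row]  # skip empty lines
matrix = select_columns(rows, [1, 3])

out = io.StringIO()  # stands in for open("testout.csv", "w", newline="")
csv.writer(out).writerows(matrix)
print(out.getvalue())
```

Because matrix is an ordinary list of lists, the same variable can be printed, inspected, or written to a file without repeating the selection loop.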

Calculate value difference between two different CSV files python

I have two different csv files:
outputnovember.csv
symbol,name,amount
A,john,2
D,mary,6
E,bob,9
m,liz,-8
p,peter,-2
A total 2,Positive total 17,Negative total -10
outputdecember.csv
symbol,name,amount
A,john,2
D,mary,26
m,liz,-1
p,peter,-2
A total 2,Positive total 26,Negative total -3
How do I calculate the difference between the calculated totals of the two files, so that the following is appended to outdecember: A total 0, Positive total 9, Negative total -17
Here's my code so far:
import csv
f = open('outputnovember.csv')
csv_f = csv.reader(f)
with open('input.csv', 'r') as f_input, open('outdecember.csv', 'w') as f_output:
    csv_input = csv.reader(f_input)
    csv_output = csv.writer(f_output)
    header = next(csv_input)
    csv_output.writerow(header)
    sum_positive = sum_negative = sum_a = 0
    for cols in csv_input:
        csv_output.writerow(cols)
        value = int(cols[2])
        if cols[0] == 'A':
            sum_a += value
        if value >= 0:
            sum_positive += value
        else:
            sum_negative += value
    # Write the three totals as a single row, as in the files shown above
    csv_output.writerow(["A total {}".format(sum_a),
                         "Positive total {}".format(sum_positive),
                         "Negative total {}".format(sum_negative)])
This is where I'm stuck: how do I retrieve the values from outputnovember.csv and find the difference from outputdecember.csv?
Thanks all
B

TypeError: 'float' object is not iterable

import csv
csvfile = open(r"C:\Users\Administrator\Downloads\canberra_2011_2012.csv")
header = csvfile.readline()
csv_f = csv.reader(csvfile)
for row in csv_f:
    first_value = float(row[5])
total = sum(first_value)
length = len(first_value)
average = total/length
print("average = ",average)
When I run this code, I get:
TypeError: 'float' object is not iterable
But when I change the first_value line to
first_value = [float(row[5]) for row in csv_f]
then it works. This confuses me; can anyone help?

The other answer is much more elegant than mine, but the following is closer to the spirit of your original code. It may make your errors more obvious. I apologize for the crappy formatting. I'm new to this site.
import csv
csvfile = open(r"C:\Users\Administrator\Downloads\canberra_2011_2012.csv")
header = csvfile.readline()
csv_f = csv.reader(csvfile)
length = 0
total = 0.0
for row in csv_f:
    first_value = float(row[5])
    total = total + first_value
    length += 1
if length > 0:
    average = total/length
    print("average = ",average)

I think you want to collect all the first_values and then do some calculations. To do that, you must step through every row of the CSV file and collect the values first; otherwise you are calling sum on a single value, and that's the source of your error.
Try this version:
import csv
with open(r"C:\Users\Administrator\Downloads\canberra_2011_2012.csv") as f:
    reader = csv.reader(f)
    next(reader)  # skip the header row, as the original code does
    values = [float(line[5]) for line in reader]
# Now you can do your calculations:
total = sum(values)
length = len(values)
# etc.

You are getting this error at this line of your code:
total = sum(first_value)
The error is raised because sum expects an iterable, but in your code first_value is a single float, so you cannot call sum on it. When you use a list comprehension instead,
first_value = [float(row[5]) for row in csv_f]
first_value is a list containing the float values of row[5], so you can pass it to sum without raising an error.
Apart from a list comprehension, you can also append the values to a list in your for loop and calculate the sum and length after the loop:
first_values = []
for row in csv_f:
    first_value = float(row[5])
    first_values.append(first_value)
total = sum(first_values)
length = len(first_values)
average = total/length
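The difference between the failing and the working call can be reproduced on a small in-memory sample. The data below is made up for illustration (it is not from canberra_2011_2012.csv), and io.StringIO stands in for the real file.

```python
import csv
import io

# Made-up sample with a header row and a numeric sixth column
sample = io.StringIO(
    "c0,c1,c2,c3,c4,c5\n"
    "a,b,c,d,e,1.5\n"
    "a,b,c,d,e,2.5\n"
)
reader = csv.reader(sample)
next(reader)  # skip the header row

values = [float(row[5]) for row in reader]

try:
    sum(values[0])  # a single float is not iterable
except TypeError as exc:
    message = str(exc)

# sum and len both work on the list; guard against an empty file
average = sum(values) / len(values) if values else None
print(message)
print(average)
```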
