I have an assignment in which I need to input random grades for different students into a CSV file using Python 3 and compute each student's average. I already know how to generate the random grades and calculate the average; what I don't know is how to write the grades into specific columns and rows (the highlighted ones).
The highlighted area is the space in which I need to write random grades:
Is there any way this can be done? I'm fairly new to programming and Python 3, and as far as I've read, specific cells can't be changed by normal means.
The csv module doesn't have functions to modify specific cells.
You can read the rows from the original file, append the grades, and write the modified rows to a new file:
import random
import csv

# newline='' lets the csv module handle line endings itself (Python 3)
inputFile = open('grades.csv', 'r', newline='')
outputFile = open('grades_out.csv', 'w', newline='')
reader = csv.reader(inputFile)
writer = csv.writer(outputFile)

for row in reader:
    grades = row.copy()
    # append five random grades between 1 and 5 to the row
    for i in range(5):
        grades.append(random.randint(1, 5))
    writer.writerow(grades)

inputFile.close()
outputFile.close()
Then you can delete the original file and rename the new one. This is better than reading the whole original file into a variable, closing it, reopening it in write mode, and writing the data back, because the file may be large.
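The delete-and-rename step can be sketched roughly as follows. This is only an illustration under some assumptions: the function name `append_random_grades` and the `.tmp` suffix are made up for the example, and every row (including a header row, if there is one) gets grades appended. `os.replace` overwrites the original with the finished temporary file in a single call.

```python
import csv
import os
import random


def append_random_grades(path, n_grades=5, low=1, high=5):
    """Append n_grades random integers to every row of the CSV at path."""
    tmp_path = path + ".tmp"  # hypothetical temporary file name
    # newline="" lets the csv module control line endings (Python 3).
    with open(path, newline="") as src, open(tmp_path, "w", newline="") as dst:
        writer = csv.writer(dst)
        for row in csv.reader(src):
            grades = [random.randint(low, high) for _ in range(n_grades)]
            writer.writerow(row + grades)
    # Swap the new file in only after it has been written completely.
    os.replace(tmp_path, path)
```

Because rows are streamed one at a time, memory use stays small even for a large file.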
Related
I'm currently working with CSV files of around 100k-500k rows (the rows contain text, and the files are up to 500 MB).
Since I need to process the text data in every line, my goal is to open each file, iterate over the rows, append two new variables to each row, and write the rows to a new file (each initial CSV file gets its own output file).
My first idea was not to read all rows into memory and write them out afterwards, but to do it all in one pass:
import glob
import csv
from datetime import datetime

# create_new_path, some_header and process_row are defined elsewhere (not shown)
all_initial_files = glob.glob('/path/to/my/files/*.csv')

for file in all_initial_files:
    output_file = create_new_path(file)
    with open(output_file, 'w') as w:
        writer = csv.DictWriter(w, some_header)
        writer.writeheader()
    with open(output_file, 'a') as f_o:
        writer = csv.writer(f_o)
        with open(file, 'r') as f_i:
            data = csv.reader(f_i)
            for i, row in enumerate(data):
                new_var1, new_var2 = process_row(row)
                # list.extend() modifies the list in place and returns None
                row.extend([new_var1, new_var2])
                writer.writerow(row)
                print(datetime.now().strftime('%d.%m.%Y, %H:%M:%S:'), f'{file}: Processed row #{i}.')
However, I noticed that the script slows down the more rows of a file it has processed (from around row 30k onwards, single rows take noticeably longer).
My question therefore is: what is the best/fastest way in Python to read a CSV file, process/add information, and write to a new CSV file? Maybe there is a solution involving pandas that works faster?
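For comparison, a single-pass version using only the csv module might look like the sketch below. It makes several assumptions that are not in the question: the first input row is treated as a header, `process_row` is passed in as a callable returning two values, and the names `process_file` and `extra_header` are invented for the example. It opens the output file once and prints progress only every 10,000 rows. Whether this removes the slowdown depends on what `process_row` itself does, which is not shown.

```python
import csv


def process_file(in_path, out_path, process_row, extra_header=("new_var1", "new_var2")):
    """Stream rows from in_path to out_path, appending process_row's two results."""
    with open(in_path, newline="") as f_in, open(out_path, "w", newline="") as f_out:
        reader = csv.reader(f_in)
        writer = csv.writer(f_out)
        header = next(reader, None)  # assumption: the first row is a header
        if header is not None:
            writer.writerow(header + list(extra_header))
        for i, row in enumerate(reader, start=1):
            new_var1, new_var2 = process_row(row)
            writer.writerow(row + [new_var1, new_var2])
            if i % 10000 == 0:  # report progress occasionally, not on every row
                print(f"{in_path}: processed {i} rows")
```

Only one row is held in memory at a time, so the file size should not matter much to the loop itself.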
This is my code that adds the data to the CSV file called studentScores.csv:
myfile = open("studentScores.csv", "a+")
newRecord = Score, Name, Gender, FormGroup, Percentage
myfile.write(str(newRecord))
myfile.write("\n")
myfile.close()
As part of my task, I need to alphabetise the data in the CSV file. I have searched and searched for a solution, but I have been unable to find one that works for me. I am pretty new to Python, so the simplest solution would be appreciated.
import csv
from operator import itemgetter

with open('studentScores.csv', 'r') as f:
    data = [line for line in csv.reader(f)]

newRecord = [Score, Name, Gender, FormGroup, Percentage]
data.append(newRecord)
data.sort(key=itemgetter(1))  # 1 being the column number

with open('studentScores.csv', 'w') as f:
    csv.writer(f).writerows(data)
First of all, this uses functions from the csv module for properly parsing and creating CSV syntax. Secondly, it reads all existing entries into data, appends the new record, sorts all records, then dumps them back to the file.
If you're using a header row in your CSV file to name the columns, look at DictReader and DictWriter, which let you handle columns by name rather than by number (e.g. in the sorting step).
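A minimal sketch of the DictReader/DictWriter variant, assuming the file has a header row. The function name `sort_csv_by_column` is made up for the example; it takes already-open file objects so it can be tried on in-memory data. Note that values are compared as strings, so numeric columns would need a conversion in the sort key.

```python
import csv


def sort_csv_by_column(in_file, out_file, column):
    """Copy a CSV with a header row from in_file to out_file, sorted by the named column."""
    reader = csv.DictReader(in_file)
    fieldnames = reader.fieldnames  # column names taken from the header row
    rows = sorted(reader, key=lambda r: r[column])
    writer = csv.DictWriter(out_file, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
```

With a header such as `Score,Name,...`, calling it with `column="Name"` sorts by the name column without relying on its position.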
I am in the process of creating a simple random number generator in python for a school project. This is what I have so far:
import random
amnt = input('Please enter the amount of numbers you would like:')
for i in range(0,amnt):
    x = random.randint(0,100000000)
    print x
This has the desired result: it generates a set amount of random numbers based on the user input. The problem I need to solve now is how to export the generated numbers to a single CSV file so that they can be analysed. I believe the csv module needs to be imported and used, but I am not sure how to do this. I am trying to analyse the effectiveness of the random module in order to write an essay, so being able to use Excel to sort and filter the numbers would be very helpful. Any changes or modifications to the code would also be very much appreciated.
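If the numbers are stored in a pandas DataFrame or Series (called x here), one line of code is enough to write them to a CSV file. Note that to_csv is a pandas method; a plain Python integer or list does not have it.
x.to_csv('file_name.csv')
Let me know if this does not work. If the code works for you, please rate the answer.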
No, you don't really need the csv module for a case this simple. You just need to create a text file in which the values are separated by commas (hence the name: Comma-Separated Values, CSV).
Try this:
import random
amnt = int(raw_input('Please enter the amount of numbers you would like:'))
data = (random.randint(0,100000000) for _ in range(amnt))
data = (str(datum) for datum in data)
data = ','.join(data) + '\n'
with open("random.csv", "w") as fp:
    fp.write(data)
import random
import csv
amnt = input('Please enter the amount of numbers you would like:')
ofile = open('ttest.csv', "wb")
writer = csv.writer(ofile, delimiter=',')
for i in range(0,amnt):
    x = random.randint(0,100000000)
    writer.writerow([x])
ofile.close()
This might be a quick solution to your problem. writerow writes one row to your CSV file. Since you want to open it in Excel, I wrote one number per row, so you can sort by that column.
However, you could also sort the numbers programmatically without having to use Excel. As some have already mentioned, CSV is aimed especially at storing data structures.
More info can be found in the csv module documentation.
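For readers on Python 3, the same idea (one number per row, optionally sorted before writing) could look roughly like this. The function name `write_random_numbers` and its parameters are invented for the example; in Python 3 the file is opened in text mode with `newline=""` rather than `"wb"`.

```python
import csv
import random


def write_random_numbers(path, amount, upper=100000000, sort=False):
    """Write `amount` random integers in [0, upper] to path, one per row."""
    numbers = [random.randint(0, upper) for _ in range(amount)]
    if sort:
        numbers.sort()  # sort in Python instead of in a spreadsheet
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        for n in numbers:
            writer.writerow([n])  # each row is a list with a single value
    return numbers
```

The amount could be read from the user with `int(input(...))` before calling the function.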
The following script plots the normal distribution fitted to a given set of data.
import numpy as np
import scipy.stats as stats
import pylab as pl
h = sorted ([0.9, 0.6, 0.5, 0.73788,...]) #Data that I would like to change
fit = stats.norm.pdf(h, np.mean(h), np.std(h))
pl.plot(h,fit,'-o')
pl.show()
I would like to find out how to plot data taken from a .csv file instead of having to enter it manually. Suppose the data I want is in the 2nd column of a given .csv file. The only way I know to isolate that data is to create an intermediate file, but maybe this is not even necessary.
with open('infile.csv','rb') as inf, open('outfile.csv','wb') as outf:
    incsv = csv.reader(inf, delimiter=',')
    outcsv = csv.writer(outf, delimiter=',')
    outcsv.writerows(row[1] in incsv)
Anyway, basically my two questions here are,
- Would I be writing correctly the second column of a .csv into a new .csv file?
- How could I merge those two scripts so that I can substitute the static data in the first one for the data in a column of a .csv file?
It seems very roundabout to write the data back out to a file, presumably to read it back in again later. Why not create a list of the data?
import csv

def import_data(filename):
    """Import data in the second column of the supplied filename as floats."""
    # 'rb' suits the Python 2 csv module; on Python 3 use open(filename, newline='')
    with open(filename, 'rb') as inf:
        return [float(row[1]) for row in csv.reader(inf)]
You can then call this function to get the data you want to plot
h = sorted(import_data('infile.csv'))
As to your question "Would I be writing correctly the second column of a .csv into a new .csv file?", the answer is: test it and find out.
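If an intermediate file is still wanted, one thing such a test would show is that `row[1] in incsv` is a membership test rather than a loop, so it does not produce rows. A hedged Python 3 sketch of what the question seems to intend is below; the function name `copy_second_column` is made up, and it takes open file objects so it can be tried on in-memory data.

```python
import csv


def copy_second_column(inf, outf):
    """Write the second column of the CSV read from inf to outf, one value per row."""
    incsv = csv.reader(inf, delimiter=",")
    outcsv = csv.writer(outf, delimiter=",")
    # writerows expects an iterable of rows, and each row is itself a sequence,
    # so the single value is wrapped in a list.
    outcsv.writerows([row[1]] for row in incsv)
```

Passing a bare string instead of `[row[1]]` would make the writer treat each character as a separate field.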
I have a large CSV file, and I'd like to do some calculations on several fields and output the results to another CSV file.
Let's imagine that I have 12 fields in my file1.csv.
Here is my sample code:
import csv
file1 = csv.reader(open('file1.csv', 'rb'), delimiter=';') #traffic
for record in file1:
    print record[0], int(record[1]) * int(record[4])
Now I would like to save these rows in a new CSV file, but I got stuck there.
The writerow() method only accepts a whole row, not a pattern like the one I've put in my for loop.
Any suggestions?
writerow takes an iterable. You can easily compose a new row by creating a list
# new_csv_writer = open file for writing
for record in file1:
    new_csv_writer.writerow([record[0], int(record[1]) * int(record[4])])
The above will write a row with 2 columns
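Put together as a self-contained Python 3 sketch, this might look like the following. The function name `write_products` is invented for the example, and using `;` as the output delimiter is an assumption (the question only specifies it for the input file).

```python
import csv


def write_products(in_file, out_file):
    """For each input record, write field 0 and the product of fields 1 and 4."""
    reader = csv.reader(in_file, delimiter=";")
    writer = csv.writer(out_file, delimiter=";")  # assumption: same delimiter on output
    for record in reader:
        # The new row is just a list built from the fields that are needed.
        writer.writerow([record[0], int(record[1]) * int(record[4])])
```

It could be called with two files opened via `open('file1.csv', newline='')` and `open('file2.csv', 'w', newline='')`, where `file2.csv` is a placeholder name.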