Python getting exact cell from csv file - python

import csv
filename = str(input("Give the file name: "))
file = open(filename, "r")
with file as f:
    size = sum(1 for _ in f)
print("File", filename, "has been read, and it has", size, "lines.", size - 1, "rows has been analyzed.")
Basically, I type in the path of the csv file to analyze and then do different things with it.
First question: how can I print an exact cell from the CSV file? I have tried different methods, but I can't seem to get it working.
For example, I want to print the info of those two cells.
The other question is: can I automate it to print the very first cell (1 A) and the first cell of the very last row (1099 A), without needing to type the cell locations?
Thank you
Example of the data (a small portion):
Time      Solar Carport  Solar Fixed  SolarFlatroof  Solar Single
1.1.2016  317            1715         6548           2131
2.1.2016  6443           1223         1213           23121
3.1.2016  0              12213        0              122

You import csv at the very top but then never use it. I wonder why, as it seems to be just what you need here. So after a brief peek at the official documentation, I got this:
import csv
data = []
# newline='' is what the csv documentation recommends when opening the file
with open('../Downloads/htviope2016.csv', newline='') as csvfile:
    spamreader = csv.reader(csvfile, delimiter=';')
    for row in spamreader:
        data.append(row)
print("File has been read, and it has ", len(data), " lines.")
That is all you need to read in the entire file. You don't need to – for some operations, it is sufficient to process one line at a time – but with the full data loaded and ready in memory, you can play around with it.
print (f'First row length: {len(data[0])}')
The number of cells per row. Note that this first row contains the header, and you probably don't have any use for it. Let's ditch it.
print ('Discarding 1st row NOW. Please wait.')
data.pop(0)
Done. A plain pop() removes the last item, but you can also pass an index. Alternatively, you could use the more Pythonic slicing form data = data[1:], which builds a new list and therefore copies the reference to every remaining row.
print ('First 10 rows are ...')
for i in range(10):
    print ('\t'.join(data[i])+'(end)')
Look, there is data in memory! I pasted on the (end) because of the following:
print (f'First row, first cell contains "{data[0][0]}"')
print (f'First row, last cell contains "{data[0][-1]}"')
which shows
First row, first cell contains "2016-01-01 00:00:00"
First row, last cell contains ""
because each line ends with a ;. This empty 'cell' can trivially be removed during reading (ideally), or afterwards (as we still have it in memory):
data = [row[:-1] for row in data]
and then you get
First row, last cell contains "0"
and now you can use data[row][column] to address any cell that you want (in valid ranges only, of course).
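For the second question, negative indices avoid typing the row count by hand. The following is only a minimal sketch, assuming a semicolon-delimited file with one header row as in this answer; `first_and_last_cell` is a hypothetical helper name and the sample text is made up so the snippet runs without the real file.

```python
import csv
import io


def first_and_last_cell(csvfile, delimiter=';'):
    """Return the first cell of the first and of the last data row."""
    rows = list(csv.reader(csvfile, delimiter=delimiter))
    data = rows[1:]  # skip the header row
    # data[0] is the first data row and data[-1] the last one,
    # whatever the number of rows in the file.
    return data[0][0], data[-1][0]


# Made-up sample standing in for the real file.
sample = io.StringIO(
    "Time;Solar Carport;Solar Fixed\n"
    "1.1.2016;317;1715\n"
    "2.1.2016;6443;1223\n"
    "3.1.2016;0;12213\n"
)
first, last = first_and_last_cell(sample)
print(first, last)
```

With a real file, the same function could be called as `first_and_last_cell(open(filename, newline=''))`.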
Disclaimer: this is my very first look at the csv module. Some operations could possibly be done more efficiently. Practically all examples are taken verbatim from the official documentation, which proves it's always worth taking a look there first.

Related

Code won't print several lists but is showing no errors

I'm trying to make a program that lets a user enter a 5-digit product number and then searches the included csv file for that number; once it finds it, it prints the corresponding name and price but not the number. To get to this point I decided to create a list for each field, filled from every row of the file, and then print them for troubleshooting. None of them had issues printing their lists individually, but when I tried to print all 5 at once it printed the first list and then showed 4 empty brackets for the others. No errors are shown at all and I'm not sure how to fix it.
import csv
f = open('products.csv')
csv_f = csv.reader(f)
next(f)
pNumber = []
pName = []
pDescription = []
pCategory = []
pPrice = []
for row in csv_f:
    pNumber.append(row[0])
for row in csv_f:
    pName.append(row[1])
for row in csv_f:
    pDescription.append(row[2])
for row in csv_f:
    pCategory.append(row[3])
for row in csv_f:
    pPrice.append(row[4])
print(pNumber)
print(pName)
print(pDescription)
print(pCategory)
print(pPrice)
The products csv file looks like this
Product #,Name,Description,Category,Price
38500,Backpacking Tent,"2-Person Backpacking Tent - 20D Ripstop Nylon",Outdoor,205.99
27840,Sit-Stand Desk,"Sit-Stand Compact Workstation Desk Converter, 37in",Household,139.99
37992,Mouse,"Dark Matter by Monoprice Rover Optical Gaming Mouse - 6200DPI",Office,19.99
24458,Subwoofer,"15in THX Ultra Certified 1000 Watt Powered Subwoofer",Audio,1280.07
38323,USB Cable,"USB 2.0 Type-C to Type-A Charge & Sync Kevlar-Reinforced Nylon-Braid Cable, 6ft, purple",Office,7.55
Your 2nd-5th lists are empty because the first loop read all of the data in the file; there's nothing left to read. If you want to iterate through the entire file again, you need to reset the cursor position in the file object.
f.seek(0)
is often the simplest way to do it.
Better yet, store all of your data fields within one loop, rather than reading the entire file for one column at a time.
Even better than that, simply read the file straight to a data frame.
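The single-loop suggestion can be sketched as follows. This is only an illustration: it reuses the list names from the question and inlines two rows of the sample file through `io.StringIO` so that it runs without `products.csv`.

```python
import csv
import io

# Two rows of the products file from the question, inlined so the sketch runs as-is.
sample = io.StringIO(
    'Product #,Name,Description,Category,Price\n'
    '38500,Backpacking Tent,"2-Person Backpacking Tent - 20D Ripstop Nylon",Outdoor,205.99\n'
    '37992,Mouse,"Dark Matter by Monoprice Rover Optical Gaming Mouse - 6200DPI",Office,19.99\n'
)

pNumber, pName, pDescription, pCategory, pPrice = [], [], [], [], []

reader = csv.reader(sample)
next(reader)  # skip the header row
for row in reader:
    # Fill every list during the same single pass over the file.
    pNumber.append(row[0])
    pName.append(row[1])
    pDescription.append(row[2])
    pCategory.append(row[3])
    pPrice.append(row[4])

print(pNumber)
print(pName)
print(pPrice)
```

Because the reader is consumed only once, no `f.seek(0)` is needed, and the header row is skipped exactly once.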

All of my data from columns of one file go into one column in my output file. How to keep it the same?

I'm trying to delete some number of data rows from a file, essentially just because there are too many data points. I can easily print them to IDLE, but when I try to write the lines to a file, all of the data from one row goes into one column. I'm definitely a noob, but it seems like this should be "trivial".
I've tried it with writerow and writerows, zip(), with and without [], I've changed the delimiter and line terminator.
import csv
filename = "velocity_result.csv"
with open(filename, "r") as source:
    for i, line in enumerate(source):
        if i % 2 == 0:
            with open ("result.csv", "ab") as result:
                result_writer = csv.writer(result, quoting=csv.QUOTE_ALL, delimiter=',', lineterminator='\n')
                result_writer.writerow([line])
This is what happens:
input  = |a|b|c|d|   <row
         |e|f|g|h|
output = |abcd|      <every other row deleted
         (just one column)
My expectation is:
input  = |a|b|c|d|   <row
         |e|f|g|h|
output = |a|b|c|d|   <every other row deleted
Once you've read the line, it becomes a single item as far as Python is concerned. Sure, maybe it is a string which has comma-separated values in it, but it is still a single item. So [line] is a list of 1 item, no matter how it is formatted.
If you want to make sure the line is recognized as a list of separate values, you need to make it such, perhaps with split:
result_writer.writerow(line.split('<input file delimiter here>'))
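Now the line becomes a list of 4 items, so it makes sense for the csv writer to write them as 4 separate values in the file. Note that a line read directly from a file still ends with its newline character, so strip it (for example with line.rstrip('\n')) before splitting.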
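An alternative to splitting by hand is to let `csv.reader` parse the input, since it already yields each row as a list of cells. The sketch below assumes a comma-delimited input; `keep_every_other_row` is a hypothetical helper name, and in-memory text stands in for the real files.

```python
import csv
import io


def keep_every_other_row(source, result):
    """Copy rows 0, 2, 4, ... from source to result, one cell per column."""
    reader = csv.reader(source)
    writer = csv.writer(result, lineterminator='\n')
    for i, row in enumerate(reader):
        if i % 2 == 0:
            # row is already a list of cells, so it is passed as-is
            # instead of being wrapped in another list.
            writer.writerow(row)


source = io.StringIO("a,b,c,d\ne,f,g,h\ni,j,k,l\n")
result = io.StringIO()
keep_every_other_row(source, result)
print(result.getvalue())
```

With real files in Python 3, both would be opened in text mode, for example `open("result.csv", "w", newline="")`, and only once, outside the loop.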

Compare 2 csv files with the same header and output a third csv with some calculations

I want to compare 2 csv files and store the results in a new csv file.
I have 2 csv (old.csv and new.csv) with the same headers.
How can I compare the values of each and do calculations based on those?
with open('new.csv') as new_csv, open('old.csv') as old_csv:
    reader_old = csv.DictReader(old_csv)
    reader_new = csv.DictReader(new_csv)
    for row_o in reader_old:
        for row_n in reader_new:
            if row_n['Account'] == row_o['Account']:
                amt_diff = float(row_n['Number']) - float(row_o['Number'])
                print(amt_diff)
Python has a module called csv that lets you do all sorts of reading and writing of csv files, without the tedious task of manually writing code to take strings, break them up along commas, etc. For example, you can use csv.DictReader() to read each line into a dictionary whose keys are your column names:
import csv
with open('new.csv') as csvfile:
    reader = csv.DictReader(csvfile)
    for row in reader:
        ranking = row['Ranking']
        percentage = row['Percentage']
        print("The percentage in this row is", percentage)
After extracting what you need and doing the calculations, you can use csv.DictWriter to write data to your new, third csv file. A search on the web for python csv module should give you a number of examples.
EDIT: I read your comment and saw your updated code. Let's look at what your nested loop is written to do, as far as I can tell:
1. Take the first line of the old CSV data.
2. Take the first line of the new CSV data.
3. Compare their values for "Account". If they're the same, then print their difference (which should be zero if the two numbers are the same, right?).
4. Do the same with line #1 of the old and line #2 of the new.
5. Do the same with line #1 of the old and line #3 of the new.
6. Continue until you compare line #1 of the old and the last line of the new.
7. Repeat all of the above with line #2 of the old and line #1 of the new, then line #2 of the old and line #2 of the new, line #2 of the old and line #3 of the new, etc.
Note that reader_new can only be iterated once: after the first pass of the inner loop it is exhausted, so step 7 never actually compares anything unless new.csv is reopened or its rows are stored in a list first.
Is that what you want? Or are you just trying to compare them line by line and write the differences?
EDIT #2:
I don't know if this will make a difference, but try this instead:
reader_old = csv.DictReader(open("old.csv"))
reader_new = csv.DictReader(open("new.csv"))
for row_o in reader_old:
    for row_n in reader_new:
        amt_diff = float(row_n['Number']) - float(row_o['Number'])
        print(amt_diff)
If you want to write this to a new file instead of just printing the results, see csv.DictWriter().
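One way to match rows by account without a nested loop is to index the old file in a dictionary first. This is a sketch under assumptions, not the asker's final code: the 'Account' and 'Number' columns come from the question, while the 'Difference' output column, the helper name `account_differences` and the sample data are made up for illustration.

```python
import csv
import io


def account_differences(old_file, new_file, out_file):
    """Write new 'Number' minus old 'Number' for every account found in both files."""
    # Index the old rows by account so that each file is read only once.
    old_numbers = {row['Account']: float(row['Number'])
                   for row in csv.DictReader(old_file)}

    writer = csv.DictWriter(out_file, fieldnames=['Account', 'Difference'],
                            lineterminator='\n')
    writer.writeheader()
    for row in csv.DictReader(new_file):
        account = row['Account']
        if account in old_numbers:
            diff = float(row['Number']) - old_numbers[account]
            writer.writerow({'Account': account, 'Difference': diff})


# Made-up sample data; account A3 exists only in the new file and is skipped.
old_csv = io.StringIO("Account,Number\nA1,10\nA2,20.5\n")
new_csv = io.StringIO("Account,Number\nA2,25\nA1,12\nA3,7\n")
out_csv = io.StringIO()
account_differences(old_csv, new_csv, out_csv)
print(out_csv.getvalue())
```

The dictionary lookup also means the two files do not have to list the accounts in the same order.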

Deleting/rearranging/adding in very large tsv files Python

I have a very large tsv file (1.2GB, 5 columns, 38m lines). I want to delete a column, add a column of ID's (1 to 38m), and rearrange the column order. How can I do this without using a ridiculous amount of memory?
Language of choice is Python, though open to other solutions.
You can read, manipulate, and write one row at a time. Because the entire file is never loaded into memory, this has a very low memory footprint.
import csv
with open(fileinpath, 'rb') as fin, open(fileoutpath, 'wb') as fout:
    freader = csv.reader(fin, delimiter = '\t')
    fwriter = csv.writer(fout, delimiter = '\t')
    idx = 1
    for line in freader:
        line[4], line[0] = line[0], line[4] #switches position between first and last column
        del line[3] #delete fourth column
        line.insert(0, idx)
        fwriter.writerow(line)
        idx += 1
(This is written in python2.7, and deletes the fourth column for an example)
Regarding rearranging the order - I assume it's the order of columns - this could be done in the manipulation part. There's an example of switching the order of the first and last column.
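For Python 3, the same row-at-a-time idea could look like the sketch below. It is an adaptation rather than the answerer's code: `rewrite_rows` is a hypothetical helper name, it takes already-open text files, and a two-line in-memory sample replaces the 1.2GB file.

```python
import csv
import io


def rewrite_rows(fin, fout):
    """Stream rows from fin to fout, holding one row in memory at a time."""
    freader = csv.reader(fin, delimiter='\t')
    fwriter = csv.writer(fout, delimiter='\t', lineterminator='\n')
    for idx, line in enumerate(freader, start=1):
        line[4], line[0] = line[0], line[4]  # swap first and last column
        del line[3]                          # delete the fourth column
        line.insert(0, idx)                  # prepend the running ID
        fwriter.writerow(line)


# Tiny made-up sample with 5 tab-separated columns.
fin = io.StringIO("a\tb\tc\td\te\nf\tg\th\ti\tj\n")
fout = io.StringIO()
rewrite_rows(fin, fout)
print(fout.getvalue())
```

With real files in Python 3, open them in text mode with `newline=''` (for example `open(fileinpath, newline='')` and `open(fileoutpath, 'w', newline='')`) instead of the 'rb' and 'wb' modes used for Python 2.7.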
You can use awk to do this. I would not say 1.2GB will take a huge amount of memory.
If you want to delete c3:
awk -F"\t" 'BEGIN{OFS="\t"}{print $1,$2,$4,$5,NR}' input.txt > output.txt
The raw output is:
c1 c2 c4 c5 columnId(1 to 38m)
$1 is column 1, $2 is column 2, and so on. NR is the number of the current line.
If you want to rearrange, just change the order of $1,$2,$4,$5 and NR.
The answer depends enormously on how much context is needed to rewrite the lines and to determine the new ordering.
If it's possible to rewrite the individual lines without regard to context (this depends on how the ID number is derived), then you can use the csv module to read the file line-by-line as @Tal Kremerman illustrates, and write it out line-by-line in the same order. If you can determine the correct ordering of the lines at this time, then you can add an extra field indicating the new order they should appear in.
Then you can do a second pass to sort/rearrange the lines into the correct order. There are many recent threads on "how to sort huge files with Python", e.g. How to sort huge files with Python?
I think Tal Kremerman is right that the OP only wants to rearrange columns, and not rows.

csv row and column fetch

I'm working on a program in Python 3.3.2. I'm new to it all, but I've been getting through it. I have an app that takes 5 inputs: 3 of them are comboboxes and two are entry widgets. I then created a button event that saves those 5 inputs into a text file and a csv file. When I open each file, everything looks proper. For example, saved info would look like this:
Brad M.,Mike K.,Danny,Iconnoshper,Strong Wolf Lodge
I then followed a csv demo and copied this...
import csv
ifile = open('myTestfile.csv', "r")
reader = csv.reader(ifile)
rownum = 0
for row in reader:
    # Save header row.
    if rownum == 0:
        header = row
    else:
        colnum = 0
        for col in row:
            print('%-15s: %s' % (header[colnum], col))
            colnum += 1
    rownum += 1
ifile.close()
and that ends up printing beautifully as:
rTech: Brad M.
pTech: Mike K.
cTech: Danny
proNam: ohhh
jobNam: Yeah
rTech: Damien
pTech: Aaron
so on and so on. What I'm trying to figure out is if I've named my headers via
if rownum == 0:
    header = row
is there a way to pull a specific row / col combo and print what is held there??
I have figured out that I could after the program ran do
print(col)
or
print(col[0:10])
and I am able to print the last col printed, or the letters from the last printed col. But I can't go any farther back than that last printed col.
My ultimate goal is to be able to assign variables so that a label in another program could in turn get its information from the csv file.
rTech for job is???
look in Jobs csv at row 1, column 1, and return value for rTech
do I need to create a dictionary that is loaded with the information then call the dictionary?? Thanks for any guidance
Thanks for the direction. I've been trying a few different things, and one that I'm really liking is the following...
import csv
labels = ['rTech', 'pTech', 'cTech', 'productionName', 'jobName']
fn = 'my file.csv'
cameraTech = 'Danny'
f = open(fn, 'r')
reader = csv.DictReader(f, labels)
jobInformation = [(item["productionName"],
                   item["jobName"],
                   item["pTech"],
                   item["rTech"]) for item in reader if \
                  item['cTech'] == cameraTech]
f.close()
print ("Camera Tech: %s\n" % (cameraTech))
print ("\n".join(["Production Name: %s \nJob Name: %s \nPrep Tech: %s \nRental Agent: %s\n" % (item) for item in jobInformation]))
That shows me that I could create a variable through cameraTech and as long as that matched what was loaded into the reader that holds the csv file and that if cTech column had a match for cameraTech then it would fill in the proper information. 95% there WOOOOOO..
So now what I'm curious about is calling each item. The plan is in a window I have a listbox that is populated with items from a .txt file with "productionName" and "jobName". When I click on one of those items in the listbox a new window opens up and the matching information from the .csv file is then filled into the appropriate labels.
Thoughts??? Thanks again :)
I think that reading the CSV file into a dictionary might be a working solution for your problem.
The Python CSV package has built-in support for reading CSV files into a Python dictionary using DictReader, have a look at the documentation here: http://docs.python.org/2/library/csv.html#csv.DictReader
Here is an (untested) example using DictReader that reads the CSV file into a Python dictionary and prints the contents of the first row:
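import csv
with open("myTestfile.csv", newline="") as f:
    # DictReader is an iterator and cannot be indexed, so store the rows in a list first.
    csv_data = list(csv.DictReader(f))
print(csv_data[0])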
Okay so I was able to put this together after seeing the following (https://gist.github.com/zstumgoren/911615)
That showed me how to give each header a variable I could call. From there I could then create a function that would allow for certain variables to be called and compared and if that matched I would be able to see certain data needed. So the example I made to show myself it could be done is as follows:
import csv
source_file = open('jobList.csv', 'r')
for line in csv.DictReader(source_file, delimiter=','):
    pTech= line['pTech']
    cTech= line['cTech']
    rAgent= line['rTech']
    prodName= line['productionName']
    jobName= line['jobName']
if prodName == 'another':
    print(pTech, cTech, rAgent, jobName)
However I just noticed something, while my .csv file has one line this works great!!!! But, creating my proper .csv file, I am only able to print information from the last line read. Grrrrr.... Getting closer though.... I'm still searching but if someone understands my issue, would love some light.
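If the comparison only runs after the loop has finished, the variables hold the values of the last row and nothing else. A possible fix, sketched here with made-up rows that reuse the column names from the question, is to test each row while it is the current one and collect the matches in a list (`matches` is a hypothetical name).

```python
import csv
import io

# Made-up rows using the column names from the question.
source_file = io.StringIO(
    "rTech,pTech,cTech,productionName,jobName\n"
    "Brad M.,Mike K.,Danny,another,Strong Wolf Lodge\n"
    "Damien,Aaron,Danny,something else,Other Job\n"
)

matches = []
for line in csv.DictReader(source_file):
    # Check every row inside the loop and keep the ones that match.
    if line['productionName'] == 'another':
        matches.append((line['pTech'], line['cTech'], line['rTech'], line['jobName']))

for match in matches:
    print(*match)
```

The `matches` list can then be used after the loop, for example to fill in the labels of a new window.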
