Reading and writing to/from csv files - python

I want my program to read two columns (the first and the second) and add them to an array. They depend on each other, so they need to be stored alongside each other: the first row (both columns) together, then the second row, and so on.
I have managed to write the first column (containing the names) to the array, but have not managed to write the second column to the array.
rownum = 1
array = []
for row in reader:
    if row[1] != '' and row[1] != 'Score':
        array.append(row[1])
        rownum = rownum + 1
        if rownum == 11:
            break
I attempted to append more than one column at once, but it returns the error message 'only accepts one argument'.
Any ideas how I can do this so that I can reference the score for each name from the csv file?

Try using a dictionary.
d = {}  # curly braces denote an empty dictionary
for row in reader:
    d[row[0]] = row[1]
d, in this case, would be a dictionary with the first column of your csv file as the keys and the second column as the corresponding values.
You can access it much like you access a list. Say you had Brian,80 as one of the entries in your csv file; d["Brian"] would return '80' (csv.reader yields every field as a string, so convert it with int() if you need a number).
EDIT
OP has requested (in the comments) for a more complete version of the code. Assuming OP's code already works, I'll modify that code so it works with a dictionary:
rownum = 1
d = {}  # denotes an empty dictionary
for row in reader:
    if row[1] != '' and row[1] != 'Score':
        d[row[0]] = row[1]  # first column is the key/index, second column is the value
        rownum = rownum + 1
        if rownum == 11:
            break
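For anyone who wants to try the dictionary approach without the original file, here is a minimal, self-contained sketch. The sample data, the helper name read_scores and the limit parameter are made up for illustration, and an in-memory string stands in for the real csv file.

```python
import csv
import io

# Made-up sample data standing in for the real csv file (assumption).
SAMPLE = "Name,Score\nBrian,80\nAlice,92\n,\nCarol,75\n"


def read_scores(lines, limit=10):
    """Map first-column names to second-column scores, for up to `limit` names."""
    scores = {}
    for row in csv.reader(lines):
        # Skip rows too short to hold both columns, blank scores and the header.
        if len(row) < 2 or row[1] in ('', 'Score'):
            continue
        scores[row[0]] = int(row[1])  # fields arrive as strings, so convert
        if len(scores) == limit:
            break
    return scores


scores = read_scores(io.StringIO(SAMPLE))
print(scores["Brian"])
```

With a real file, the io.StringIO(SAMPLE) argument would be replaced by the open file object.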

Related

How do you modify a dictionary from a text file when you only need to get specific values?

So say we have some sort of file with maybe like 6 columns, and 6 rows. If I wanted to get one specific column that reads one line, and modify a current dictionary I have, how would I approach that?
The output should be all the data, with the key being the second column and the two values being the first column and the fourth column.
Can somebody please help me start this off?
I've tried using:
for line in file:
    (key, val) = line.split()
    data[int(key)] = val
print (data)
But obviously this will fail, since it expects only 2 values per line. I need 1 key and 2 values.
split returns a list. Use the list instead of expanding into named variables.
data = {}
for line in file:
    row = line.strip().split()
    data[int(row[1])] = row[0], row[3]
print(data)
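Here is a small runnable sketch of the same idea, with a guard for blank or short lines. The sample lines and the function name index_by_second_column are invented for the example; the real file layout may differ.

```python
# Made-up whitespace-separated lines standing in for the file (assumption).
SAMPLE_LINES = [
    "alpha 1 x red 10 a\n",
    "beta 2 y green 20 b\n",
    "\n",
    "gamma 3 z blue 30 c\n",
]


def index_by_second_column(lines):
    """Key each record by column 2 (as int); keep columns 1 and 4 as a tuple."""
    data = {}
    for line in lines:
        row = line.split()
        if len(row) < 4:  # skip blank or short lines instead of raising IndexError
            continue
        data[int(row[1])] = (row[0], row[3])
    return data


data = index_by_second_column(SAMPLE_LINES)
print(data[2])
```

Storing the two values as a tuple keeps them together under one key; a list would work just as well if they need to be modified later.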

Comparing QLineEdit() and csv file values

I'm using PyQt5 and want to compare values from a csv file with values entered by the user through QLineEdit(). Then, if the values are the same, I want to get the whole row imported to a QTableWidget.
The csv file contains 3 different columns, with width values, height values and thickness values.
I've tried this to solve the first problem:
import csv
with open('csvTest.csv') as file:
    reader = csv.reader(file)
    for row in reader:
        if row[0] == self.widthTextbox.text() or row[1] == self.heightTextbox.text() or row[2] == self.thickTextbox.text():
            print("Found: {}".format(row))
This didn't work, and I know that using "or" is problematic because I want this to act like a filter: if the user inputs only one of the three attributes he'll get some rows, if he inputs two he'll get fewer rows, and if he inputs all three he'll get even fewer. But using "or" accepts any line that meets any one of the conditions.
The second problem is, if this worked, I'd like to make the number of rows in the table equal to the number of rows that passed through the filter, using something like self.tableWidget.setRowCount('''number of rows found''') .
Finally, the last issue would be to make the QTableWidget rows identical to the ones that the filter found.
To solve the first and second issues, this could be a way:
import csv
from collections import Counter
rows_finded = []
with open('csvTest.csv') as file:
    reader = csv.reader(file)
    for row in reader:
        values = [self.widthTextbox.text(), self.heightTextbox.text(), self.thickTextbox.text()]
        if Counter(values) == Counter(row):
            rows_finded.append(row)
self.tableWidget.setRowCount(len(rows_finded))
To solve the last issue (source: Python - PyQt - QTable Widget - adding rows):
for i, row in enumerate(rows_finded):
    for j, col in enumerate(row):
        item = QTableWidgetItem(col)
        self.tableWidget.setItem(i, j, item)
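Note that the Counter comparison only matches a row when all three text boxes are filled in. If the goal is the progressive filter described in the question, one possible approach is to treat an empty text box as "match anything". The sketch below leaves PyQt out so it can be run on its own; the sample rows and the function name filter_rows are assumptions for illustration.

```python
import csv
import io

# Made-up rows of width, height, thickness standing in for csvTest.csv (assumption).
SAMPLE = "10,20,1\n10,30,1\n15,20,2\n10,20,2\n"


def filter_rows(rows, width="", height="", thickness=""):
    """Keep rows that match every non-empty criterion, column by column."""
    criteria = [width.strip(), height.strip(), thickness.strip()]
    found = []
    for row in rows:
        if len(row) < len(criteria):
            continue  # skip short or blank rows
        # An empty criterion matches anything, so only filled-in fields narrow the result.
        if all(not wanted or cell.strip() == wanted
               for wanted, cell in zip(criteria, row)):
            found.append(row)
    return found


rows = list(csv.reader(io.StringIO(SAMPLE)))
one_field = filter_rows(rows, width="10")
two_fields = filter_rows(rows, width="10", height="20")
three_fields = filter_rows(rows, width="10", height="20", thickness="2")
print(len(one_field), len(two_fields), len(three_fields))
```

In the PyQt code, the three arguments would come from self.widthTextbox.text(), self.heightTextbox.text() and self.thickTextbox.text(), and the length of the returned list could be passed to self.tableWidget.setRowCount(). With all three boxes empty, every row passes the filter.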

Mapping CSV data into Python

I am new to Python, and I am trying to 'migrate' an Excel Solver model that I have created to Python, in hopes of faster processing.
I receive a .csv sheet that I use as my input for the model; it is always in the same format.
This model essentially uses 4 different metrics associated with product A, B and C, and I essentially determine how to price A, B, and C accordingly.
I am at the very nascent stage of effectively inputting this data to Python. This is what I have, and I would not be surprised if there is a better approach, so open to trying anything you veterans have to recommend!
import csv
f = open("141881.csv")
for row in csv.reader(f):
    price = row[0]
    a_metric1 = row[1]
    a_metric2 = row[2]
    a_metric3 = row[3]
    a_metric4 = row[4]
    b_metric1 = row[7]
    b_metric2 = row[8]
    b_metric3 = row[9]
    b_metric4 = row[10]
    c_metric1 = row[13]
    c_metric2 = row[14]
    c_metric3 = row[15]
    c_metric4 = row[16]
The .csv file comes in the format of price,a_metric1,a_metric2,a_metric3,a_metric4,,price,b_metric1,b_metric2,b_metric3,b_metric4,price,,c_metric1,c_metric2,c_metric3,c_metric4
I skip the second and third price column as they are identical to the first one.
However when I run the python script, I get the following error:
c_metric1 = row[13]
IndexError: list index out of range
And I have no idea why this occurs, when I can see the data is there myself (in Excel, this .csv file goes all the way to column Q, which I understand to be row[16]).
Your help is appreciated, and any advice on my approach is more than welcomed.
Thanks in advance!
Using print() can be your friend here:
import csv
with open('141881.csv') as file_handle:
    file_reader = csv.reader(file_handle)
    for row in file_reader:
        print(row)
The code above will print out EACH row.
To print out ONLY the first row, replace the for loop with: print(next(file_reader)) (the built-in next() works in both Python 2.6+ and Python 3)
Printing out row(s) will allow you to see what exactly a "row" is.
P.S.
Using with is advisable because it closes the file for you, even if an error occurs partway through
Look into pandas.
Read the file as:
import pandas as pd
data = pd.read_csv('141881.csv')
To read a column:
col = data['column_name']
To read a row (by position):
row = data.iloc[row_number]
The csv module in Python transforms a spreadsheet into a matrix: a list of lists
The Python module for reading csv transforms each line of your input into a list.
For each row, it splits the row into a list of cells. In other words, each list has as many items as there are columns in that row of your Excel spreadsheet.
Try in a terminal:
>>> import csv
>>> f = open("141881.csv")
>>> print(list(csv.reader(f)))
[['id', 'name', 'company', 'email'], ['1563', 'defoe', 'SuperFastCompany', 'def#superfastcie.net'], ['1564', 'doe', 'Awsomestartup', 'doe#awesomestartup'], ...]
So that's why you iterate through the rows of your spreadsheet, assigning each value to a new variable.
I recommend reading up on the basics of list manipulation.
But...
What is an IndexError? Catching exceptions:
If one row has fewer columns than the others, indexing past its end raises an error such as the one you described. IndexError means Python wasn't able to find a value at that position in the list. In other words, if some lines of your spreadsheet are shorter than the others, there is no such value to assign and Python raises an IndexError. (An empty cell in the middle of a row is not a problem by itself: it is read as an empty string.) That's why knowing how to catch exceptions can be very useful for seeing the problem. Try to verify that every row has the same length; if not, assign an empty value, for example:
try:
    # if a row always has 17 cells,
    # I can assign them all at once by unpacking the list
    # (_ is a throwaway name for the empty cells and the repeated price columns)
    price, a_metric1, a_metric2, a_metric3, a_metric4, _, _, b_metric1, b_metric2, b_metric3, b_metric4, _, _, c_metric1, c_metric2, c_metric3, c_metric4 = row
except ValueError:
    # unpacking raises ValueError (not IndexError) when there are not exactly 17 cells
    # tell me how many cells are actually in the list
    # you will see that there are fewer than 17 elements
    print(len(row))
Now you can skip the error by assigning None to the values that don't appear in the csv file.
You can read more about catching exceptions.
Thanks everyone for your input - printing the results made me realize that I was getting the IndexError because of the very first row, which only had headers. Skipping that row got rid of the error.
I will look into pandas, it seems like that will be useful for the type of work I am doing.
Thanks again for all of your help, much appreciated.
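For future readers, skipping the header row as described above can be sketched like this. The sample data, the single-cell header, the EXPECTED_COLUMNS constant and the function name load_rows are assumptions made up for the example; the column positions follow the format given in the question.

```python
import csv
import io

# Made-up stand-in for the real file: a short header row followed by
# 17-column data rows (price, 4 metrics, blank, price, 4 metrics, price, blank, 4 metrics).
SAMPLE = (
    "Pricing model input\n"
    "9.99,1,2,3,4,,9.99,5,6,7,8,9.99,,9,10,11,12\n"
    "19.99,2,3,4,5,,19.99,6,7,8,9,19.99,,10,11,12,13\n"
)

EXPECTED_COLUMNS = 17  # columns A through Q, i.e. row[0] .. row[16]


def load_rows(lines):
    """Return one dict per data row, skipping the first (header) row."""
    reader = csv.reader(lines)
    next(reader, None)  # discard the header row; the default avoids StopIteration on empty input
    records = []
    for row in reader:
        if len(row) < EXPECTED_COLUMNS:
            continue  # a short row would raise IndexError below
        records.append({
            "price": float(row[0]),
            "a": [float(x) for x in row[1:5]],
            "b": [float(x) for x in row[7:11]],
            "c": [float(x) for x in row[13:17]],
        })
    return records


records = load_rows(io.StringIO(SAMPLE))
print(len(records), records[0]["c"])
```

Grouping the four metrics of each product into a list avoids twelve separate variables, though separate names work too if the model code reads better that way.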

Table/Data manipulation with Python Dictionary

I need help finishing up this python script. I'm an intern at a company, and this is my first week. I was asked to develop a python script that will take a .csv and put(append) any related columns into one column so that they have only the 15 or so necessary columns with the data in them. For example, if there are zip4, zip5, or postal code columns, they want those to all be underneath the zip code column.
I just started learning python this week as I was doing this project so please excuse my noobish question and vocabulary. I'm not looking for you guys to do this for me. I'm just looking for some guidance. In fact, I want to learn more about python, so anyone who could lead me in the right direction, please help.
I'm using dictionary keys and values. The keys are every column in the first row. The values of each key are the remaining rows (second through 3000ish). Right now, I'm only getting one key:value pair. I'm only getting the final row as my array of values, and I'm only getting one key. Also, I'm getting a KeyError message, so my keys aren't being identified correctly. My code so far is underneath. I'm gonna keep working on this, and any help is immensely appreciated! Hopefully, I can buy the person who helps me a beer and pick their brain a little :)
Thanks for your time
# To be able to read csv formatted files, we will first have to import the csv module
import csv
# cols = line.split(',')# each column is split by a comma
#read the file
CSVreader = csv.reader(open('N:/Individual Files/Jerry/2013 customer list qc, cr, db, gb 9-19-2013_JerrysMessingWithVersion.csv', 'rb'), delimiter=',', quotechar='"')
# define open dictionary
SLSDictionary={}# no empty dictionary. Need column names to compare to.
i=0
#top row are your keys. All other rows are your values
#adjust loop
for row in CSVreader:
    # mulitple loops needed here
    if i == 0:
        key = row[i]
    else:
        [values] = [row[1:]]
        SLSDictionary = dict({key: [values]}) # Dictionary is keys and array of values
    i=i+1
#print Dictionary to check errors and make sure dictionary is filled with keys and values
print SLSDictionary
# SLSDictionary has key of zip/phone plus any characters
#SLSDictionary.has_key('zip.+')
SLSDictionary.has_key('phone.+')
#value of key are set equal to x. Values of that column set equal to x
#[x]=value
#IF SLSDictionary has the key of zip plus any characters, move values to zip key
#if true:
# SLSDictionary['zip'].append([x])
#SLSDictionary['phone_home'].append([value]) # I need to append the values of the specific column, not all columns
#move key's values to correct, corresponding key
SLSDictionary['phone_home'].append(SLSDictionary[has_key('phone.+')])#Append the values of the key/column 'phone plus characters' to phone_home key/column in SLSDictionary
#if false:
# print ''
# go to next key
SLSDictionary.has_value('')
if true:
    print 'Error: No data in column'
# if there's no data in rows 1-?. Delete column
#if value <= 0:
# del column
print SLSDictionary
Found a couple of errors just quickly looking at it. One thing you need to watch out for is that you're assigning a new value to the existing dictionary every time:
SLSDictionary = dict({key: [values]})
You're re-assigning a new value to your SLSDictionary every time it enters that loop. Thus at the end you only have the bottom-most entry. To add a key to the dictionary you do the following:
SLSDictionary[key] = values
Also you shouldn't need the brackets in this line:
[values] = [row[1:]]
Which should instead just be:
values = row[1:]
Most importantly, you will only ever have one key, because key is set only on the first pass through the loop (when i == 0) and only from row[0]; after that i keeps incrementing, so everything is assigned to that single key. Without a sample of how the CSV looks I can't instruct you on how to restructure the loop so that it will catch all the keys.
Assuming your CSV is like this as you've described:
Col1, Col2, Col3, Col4
Val1, Val2, Val3, Val4
Val11, Val22, Val33, Val44
Val111, Val222, Val333, Val444
Then you probably want something like this:
dummy = [["col1", "col2", "col3", "col4"],
         ["val1", "val2", "val3", "val4"],
         ["val11", "val22", "val33", "val44"],
         ["val111", "val222", "val333", "val444"]]
column_index = []
SLSDictionary = {}
for each in dummy[0]:
    column_index.append(each)
    SLSDictionary[each] = []
for each in dummy[1:]:
    for i, every in enumerate(each):
        try:
            if column_index[i] in SLSDictionary.keys():
                SLSDictionary[column_index[i]].append(every)
        except:
            pass
print SLSDictionary
Which Yields...
{'col4': ['val4', 'val44', 'val444'], 'col2': ['val2', 'val22', 'val222'], 'col3': ['val3', 'val33', 'val333'], 'col1': ['val1', 'val11', 'val111']}
If you want them to stay in order then change the dictionary type to collections.OrderedDict() (in Python 3.7 and later, a plain dict already keeps insertion order).
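As a rough Python 3 sketch of where this could go next: the first function builds the same column dictionary straight from csv input, and the second folds related columns (such as zip5 and postal_code) into one target column, row by row. The sample data, the MERGE_INTO mapping and both function names are hypothetical; the real column names would come from the customer list.

```python
import csv
import io

# Made-up sample standing in for the customer list (assumption).
SAMPLE = (
    "name,zip5,postal_code,phone_home\n"
    "Ann,12345,,555-0100\n"
    "Bob,,A1B 2C3,555-0101\n"
)

# Hypothetical mapping from source column names to the column they belong under.
MERGE_INTO = {"zip5": "zip", "zip4": "zip", "postal_code": "zip"}


def columns_from_csv(lines):
    """Return {header: [values down that column]} using the first row as keys."""
    reader = csv.reader(lines)
    headers = next(reader)
    columns = {name: [] for name in headers}
    for row in reader:
        for i, name in enumerate(headers):
            # Pad short rows with '' so every column keeps the same length.
            columns[name].append(row[i] if i < len(row) else "")
    return columns


def merge_columns(columns, merge_into):
    """Fold related columns into their target column, keeping rows aligned."""
    merged = {}
    for name, values in columns.items():
        target = merge_into.get(name, name)
        if target not in merged:
            merged[target] = list(values)
        else:
            # Fill gaps in the target with values from the related column.
            merged[target] = [old or new for old, new in zip(merged[target], values)]
    return merged


columns = columns_from_csv(io.StringIO(SAMPLE))
merged = merge_columns(columns, MERGE_INTO)
print(merged["zip"])
```

This keeps the first non-empty value per row when two related columns both have data; whether that is the right rule depends on the data, so treat it as a starting point.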

write comma delimited text to excel

I'm iterating through a bunch of SDE's, then iterating through those and producing feature class titles, and it's in a list. I then perform a list.replace() and turn it into a comma delimited string.
So, I want to take that delimited string such as:
SDE_name1,thing1,thing2,thing3,thing4
SDE_name2,thing1,thing2,thing3,thing4
SDE_name3,thing1,thing2,thing3,thing4
and insert it into excel using XLWT
I can get it to write the entire length into one cell, with or without the commas,
but I'd like it to write each item into a new column...
So column A would have SDE_name1
Column B would have thing1
Column C would have thing2, etc.
So far I've tried:
listrow=2
listcol=0
for row in list:
    worksheet.write(listrow,listcol,row,style)
    listrow+=1
wbk.save(bookname)
and
listrow=2
listcol=0
list=something.split(anotherlist,delimiter=",")
for row in list:
    worksheet.write(listrow,listcol,row,style)
    listrow+=1
wbk.save(bookname)
So both with and without a delimiter. Either way, both write everything to one column left to right. It will write each item in the list to a new row...but I need it to write each item after the comma to a new column.
Any idea?
You are not iterating through the columns. As it's not clear exactly what the variables in your example refer to, assume you start with the following:
data = ["SDE_name1,thing1,thing2,thing3,thing4",
        "SDE_name2,thing1,thing2,thing3,thing4",
        "SDE_name3,thing1,thing2,thing3,thing4"]
The code to write the Excel file becomes:
rowCount = 2
for row in data:  # row = "SDE_name1,thing1,thing2,thing3,thing4"
    colCount = 0
    # row.split(",") = ["SDE_name1", "thing1", "thing2", "thing3", "thing4"]; column is each item in turn
    for column in row.split(","):
        worksheet.write(rowCount, colCount, column, style)
        colCount += 1
    rowCount += 1
wbk.save(bookname)
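The same row/column bookkeeping can be written with enumerate, which avoids the manual counters. The sketch below only computes where each value should go, so it runs without xlwt installed; the function name cell_positions and its parameters are made up for illustration.

```python
def cell_positions(lines, first_row=2, delimiter=","):
    """Yield (row_index, col_index, value) for every item of every delimited line."""
    for row_index, line in enumerate(lines, start=first_row):
        for col_index, value in enumerate(line.split(delimiter)):
            yield row_index, col_index, value.strip()


data = ["SDE_name1,thing1,thing2,thing3,thing4",
        "SDE_name2,thing1,thing2,thing3,thing4"]
cells = list(cell_positions(data))
print(cells[:3])
```

Each triple could then be handed to worksheet.write(row_index, col_index, value, style), followed by wbk.save(bookname), using the worksheet, style and wbk objects from the question.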
