I'm trying to create a list in Python from a CSV file. The CSV file contains only one column, with about 300 rows of data. The list should (ideally) contain one string per row.
When I execute the code below, I end up with a list of lists (each element is a list, not a string). Is the CSV file I'm using formatted incorrectly, or is there something else I'm missing?
import csv

filelist = []
with open(r'D:\blah\blahblah.csv', 'r') as expenses:
    reader = csv.reader(expenses)
    for row in reader:
        filelist.append(row)
Each row is a list with one field. You need to take the first item in that row:
filelist.append(row[0])
Or more concisely:
filelist = [row[0] for row in csv.reader(expenses)]
It seems your "csv" doesn't contain any separator like ";" or ",", because you said it contains only one column.
With a single column there is nothing to separate, so you don't really need a CSV parser here.
You could simply read the file line by line:
filelist = []
with open(r'D:\blah\blahblah.csv', 'r') as f:
    for line in f:
        filelist.append(line.strip())
Each row is read as a list of cells.
So what you want to do is
output = [ row[0] for row in reader ]
since you only have the first cell filled out in each row.
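One thing the answers above don't cover: csv.reader returns an empty list for a blank line, so row[0] would raise IndexError if the file contains one. A minimal Python 3 sketch that skips such rows, using made-up in-memory sample data instead of the real file:

```python
import csv
import io

# Hypothetical sample standing in for the one-column file; note the blank line.
sample = io.StringIO("rent\ngroceries\n\nfuel\n")

# csv.reader yields [] for a blank line, so "if row" filters those out
# before row[0] is evaluated.
filelist = [row[0] for row in csv.reader(sample) if row]

print(filelist)
```

With a real file, the same comprehension would go inside the with open(...) block, passing the file object to csv.reader.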
I created a blank CSV file with some field names, and I have a script that calculates values for each of those fields and appends them as a new row. The script iterates through a folder of .csv files; on each iteration of the for loop it builds one row from the data in the current file and appends it to the file I created initially. However, when I opened the resulting .csv file, there was a blank row after every entry: one after the header row, and one after each appended row. I am using Python.
I created the .csv file with this code:
import csv

fields = ['Field_1', 'Field_2', 'Field_3']
filename = 'final_results.csv'
with open(filename, 'w') as csvfile:
    # creating a csv writer object
    csvwriter = csv.writer(csvfile)
    # writing the fields
    csvwriter.writerow(fields)
In my script, I do calculations on the data from each of the .csv files I loop through. I ultimately compute a "Value_1", "Value_2", and a "Value_3", where "Value_1" should fall under "Field_1", "Value_2" should fall under "Field_2", and "Value_3" should fall under "Field_3". I then create a new row of these new values to be appended to my .csv file with simply:
new_row = [Value_1, Value_2, Value_3]
I then appended with this code:
from csv import writer

with open('Final_results.csv', 'a') as f_object:
    writer_object = writer(f_object)
    writer_object.writerow(new_row)
    f_object.close()
This produced a blank row after the header row and after each appended row when I looked at "Final_results.csv", the final product.
I then tried this, adding newline="":
with open('Final_results.csv', 'a', newline="") as f_object:
    writer_object = writer(f_object)
    writer_object.writerow(new_row)
    f_object.close()
And now when I look at "Final_results.csv", there are no spaces/blank rows between the newly appended rows, which is what I want, but there is still a blank row between the header row and the appended rows. How can I get rid of this space? I can't find what specific argument/parameter that would need to be added/changed within the "writer" module to address this issue.
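One possible explanation, assuming the script runs on Windows: the header row is written by the first open() call, which has no newline="" argument, so the writer's '\r\n' line ending is translated to '\r\r\n' and shows up as a blank row after the header. The csv module documentation recommends passing newline='' whenever a file is opened for csv.writer. A minimal Python 3 sketch that applies it to both the header and the appended rows (the placeholder values and the temporary directory are illustrative only):

```python
import csv
import os
import tempfile

fields = ['Field_1', 'Field_2', 'Field_3']
# A temporary directory is used here only to keep the sketch self-contained.
filename = os.path.join(tempfile.mkdtemp(), 'final_results.csv')

# newline='' on the header write as well, not only on the append.
with open(filename, 'w', newline='') as csvfile:
    csv.writer(csvfile).writerow(fields)

new_row = [1, 2, 3]  # placeholder for Value_1, Value_2, Value_3
with open(filename, 'a', newline='') as f_object:
    csv.writer(f_object).writerow(new_row)

# Read the file back to check that no blank rows were written.
with open(filename, newline='') as f:
    rows = list(csv.reader(f))
print(rows)
```

The with block closes the file on exit, so no explicit close() call is needed.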
I have a CSV file with text data separated by commas in some columns, but not in others, e.g.:
https://i.imgur.com/X6bq09I.png
I want to export each row of my CSV file to a new CSV file. An example desired output for the first row of my original file would look like this:
https://i.imgur.com/QB9sLeL.png
I have tried the code offered in the first answer of this post: Open CSV file and writing each row to new, dynamically named CSV file.
This is the code I used:
import csv
counter = 1
with open('mock_data.csv', 'rU') as csvfile:
    reader = csv.reader(csvfile)
    for row in reader:
        if row:
            filename = "trial%s" % str(counter)
            with open(filename, 'w') as csvfile_out:
                writer = csv.writer(csvfile_out)
                writer.writerow(row)
            counter = counter + 1
This code does produce a new .csv file for each row. However...
EDIT: I have three remaining issues, for which I have not found the right code:
I want each word to have its own cell in each row; I don't know how to do this when certain cells contain multiple words separated by commas, while other cells contain only a single word;
Once each word has its own cell, I want to transpose each row into a single column in the new .csv file;
I want to remove duplicate values from the column.
If you actually want a file extension, then use filename = "trial%s.csv" % str(counter)
But CSV files don't care about file extensions. Any file reader or code should be able to read the file.
TextEdit is just the Mac default for that.
I need a single column with one word in each cell, in each new output file
Before you call writer.writerow(row), check if len(row) == 1 rather than if row
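The three remaining issues from the EDIT (one word per cell, transposing the row into a column, and removing duplicates) could be handled along these lines. This is a rough Python 3 sketch, not a tested solution for the actual mock_data.csv; the helper names row_to_unique_words and write_column are made up for illustration:

```python
import csv

def row_to_unique_words(row):
    """Split every cell on commas, strip whitespace, and drop blanks and
    duplicates while keeping first-seen order."""
    words = []
    for cell in row:
        for word in cell.split(','):
            word = word.strip()
            if word and word not in words:
                words.append(word)
    return words

def write_column(words, filename):
    """Write one word per row, i.e. the original row transposed into a column."""
    with open(filename, 'w', newline='') as f:
        writer = csv.writer(f)
        for word in words:
            writer.writerow([word])

# Hypothetical row: one cell holds several comma-separated words.
example = row_to_unique_words(['red, blue', 'green', 'blue'])
print(example)
```

Inside the loop from the question, write_column(row_to_unique_words(row), filename) would take the place of writer.writerow(row).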
Started learning python after lots of ruby experience. With that context in mind:
I have a csv file that looks something like this:
city_names.csv
"abidjan","addis_ababa","adelaide","ahmedabad"
With the following python script I'd like to read this into a list:
city_names_reader.py
import csv
city_name_file = r"./city_names.csv"
with open(city_name_file, 'rb') as file:
    reader = csv.reader(file)
    city_name_list = list(reader)
print city_name_list
The result surprised me:
[['abidjan', 'addis_ababa', 'adelaide', 'ahmedabad']]
Any idea why I'm getting a nested list rather than a 4-element list? I must be overlooking something self-evident.
A CSV file represents a table of data. A table contains both columns and rows, like a spreadsheet. Each line in a CSV file is one row in the table. One row contains multiple columns, separated by commas.
When you read a CSV file you get a list of rows. Each row is a list of columns.
If your file has only one row, you can simply take that row from the list:
city_name_list = city_name_list[0]
Usually each column represents some kind of data (think "column of email addresses"). Each row then represents a different object (think "one object per row, each row can have one email address"). You add more objects to the table by adding more rows.
Wide tables, which grow by adding more columns instead of rows, are uncommon. In your case you have only one kind of data: city names. So you should have one column ("name"), with one row per city. To get city names from your file you could then read the first element from each row:
city_name_list = [row[0] for row in city_name_list]
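In both cases you can flatten the list by using itertools.chain.from_iterable (plain itertools.chain(city_name_list) would just yield the inner lists again):
city_name_list = list(itertools.chain.from_iterable(city_name_list))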
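To make the nesting and the flattening concrete, here is a small Python 3 sketch that parses the one-line sample from the question out of an in-memory string (the io.StringIO stand-in is an assumption for the sake of a self-contained example, not part of the original script):

```python
import csv
import io
from itertools import chain

# The single line from city_names.csv, held in memory.
sample = io.StringIO('"abidjan","addis_ababa","adelaide","ahmedabad"\n')

rows = list(csv.reader(sample))  # one list per line of the file
city_name_list = list(chain.from_iterable(rows))  # flatten rows into one list

print(rows)
print(city_name_list)
```

The reader always produces one list per line, which is why a one-line file still comes back as a list containing a single list.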
As others suggest, your file is not an idiomatic CSV file. You can simply do:
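with open(city_name_file, "r") as fp:
    # strip the trailing newline, then the double quotes around each name
    city_names_list = [name.strip('"') for name in fp.read().strip().split(",")]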
Based on comments, here is a possible solution:
import csv
city_name_file = r"./city_names.csv"
city_name_list = []
with open(city_name_file, 'rb') as file:
    reader = csv.reader(file)
    for item in reader:
        city_name_list += item
print city_name_list
I'm trying to skip the first pipe delimited piece of data in my .txt file when reading it with a csv.DictReader. Here is a sample of the data I'm working with:
someCSVfile.csv|cust_no,0|streetaddr,1|city,2|state,3|zip,4|phone_home,5|firstname,6|lastname,7|status,9|
someCSVfile1.csv|cust_no,0|streetaddr,1|city,2|state,3|zip,4|phone_home,5|firstname,6|lastname,7|status,9|
And here is my code so far:
import csv
reader = csv.reader(open('match_log.txt','rb'), dialect='excel', delimiter='|')
for row in reader:
    skipfirstRow=reader.next()
    skipfirstRowAgain=reader.next()
    Dictreader=csv.DictReader(reader,skipfirstRow)
    print row
I've been researching .next() pretty thoroughly, but that doesn't seem to work. When I print my rows, it prints every row, when I don't want the first row (the .csv files) to be printed. Is there another method that may work?
EDIT: Here is my latest code:
import csv
reader = csv.reader(open('match_log.txt','rb'), dialect='excel', delimiter='|')
data = {}
for row in reader:
    filenameVariable = row[0]
    data = dict(item.split(',') for item in row[1:])
print data
print filenameVariable
Right now, data and filenameVariable are printing the final row when I need all rows. I tried .append but that didn't work. What else could I use?
The .csv parts are the first column/field, not the first row. Advancing reader will indeed skip rows, but won't affect what's in each individual row. (Rows go across!)
If you want to leave off the first item in a sequence, print row[1:] instead of row.
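For the follow-up in the EDIT (only the final row surviving the loop), one option is to store each row's dictionary under its file name instead of overwriting data on every pass. A minimal Python 3 sketch, using a shortened in-memory copy of the sample lines; the name all_data is made up for illustration:

```python
import csv
import io

# Shortened stand-in for match_log.txt, same pipe-delimited layout.
sample = io.StringIO(
    "someCSVfile.csv|cust_no,0|streetaddr,1|city,2|\n"
    "someCSVfile1.csv|cust_no,0|streetaddr,1|city,2|\n"
)

all_data = {}
for row in csv.reader(sample, delimiter='|'):
    if not row:
        continue  # skip blank lines
    filename = row[0]  # first field is the .csv file name
    # The trailing '|' leaves an empty last item, so "if item" filters it out.
    all_data[filename] = dict(item.split(',') for item in row[1:] if item)

for filename, mapping in all_data.items():
    print(filename, mapping)
```

Keying on the file name keeps every row available after the loop; a list of (filename, mapping) tuples would work too if file names can repeat.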
I have an Excel file that I converted to CSV. There are several tables each separated by an empty row. After converting the Excel file to CSV, I see each empty row represented by a row of commas, with a comma for every column/field element. Can the CSV module (or some other Python module) account for multiple tables from this information? If not, is my only option to separate the tables into different files manually in Excel before conversion?
I know the CSV module will turn each row into a list. I'd like a table to be its own list and all the rows it has as lists within. Each table has the first row as fields. The fields can be different from table to table, and the number of fields can be different as well.
You can give this a try:
def extract_table(f):
    table = []
    for line in f:
        line = line.rstrip('\r\n')
        if not line.strip(','):
            # Table delimiter reached (a blank line or a row of only commas)
            break
        fields = line.split(',')
        table.append(fields)
    return table

def main():
    with open("myfile.csv") as f:
        while True:
            table = extract_table(f)
            if not len(table):
                # No table found, reached end of file
                break
            # Do something with table
            # ...
Sure, it's easy to read the data in that way. You have to decide what constitutes the separator row (is it sufficient to check for the first column being empty, or do you have to check that all columns are empty?). Assuming a check of just the first column is enough (and being extra verbose for clarity):
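import csv

rdr = csv.reader(open(filename))
tables = []
this_table = []
tables.append(this_table)
for row in rdr:
    if not row or row[0] == '':
        # separator row (csv.reader gives '' for empty cells, never None):
        # start a new table and skip the separator itself
        this_table = []
        tables.append(this_table)
        continue
    this_table.append(row)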
The result is a list called tables. Each entry is a list containing the data for one table. Each entry in a table is a list containing the column values for one row.
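If the separator should be a row in which every column is empty, itertools.groupby is another way to express the same split. A rough Python 3 sketch with made-up sample data; split_tables is a hypothetical helper name, and it assumes tables are separated by at least one all-empty row:

```python
import csv
import io
from itertools import groupby

def split_tables(rows):
    """Group consecutive non-empty rows into tables; rows whose cells are
    all empty (or rows with no cells at all) act as separators."""
    def is_separator(row):
        return not any(cell.strip() for cell in row)
    return [list(group)
            for sep, group in groupby(rows, key=is_separator)
            if not sep]

# Two small tables with different widths, separated by a row of commas.
sample = io.StringIO("a,b,c\n1,2,3\n,,\nx,y\n4,5\n")
tables = split_tables(csv.reader(sample))
print(tables)
```

Because csv.reader does the parsing, quoted fields that contain commas are handled, and several separator rows in a row do not produce empty tables.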