I'm writing some unicode output to CSV using the unicodecsv module. Everything is working as expected, but I'm trying to build it out by adding some headers or field names. So far, I've tried a number of different ways, but I can't work out how to add the field names.
I've tried other unicode solutions and this module seems to be the most elegant to implement, so I'm trying to use it if possible. If there are other suggestions, I'm up for them. Any ideas? Please see the relevant code below.
import unicodecsv
with open('c:\pull.csv', 'wb+') as f:
    csv_writer = unicodecsv.writer(f, encoding='utf-8')
    for i in changes['user']['login'], changes['title'], str(changes['changed_files']), str(changes['commits']):
        csv_writer.writerow([changes['user']['login'], changes['title'], changes['changed_files'], changes['commits']])
Sample output for changes in the csv file:
'John Doe', 'Some Title', 1, 1
For the JSON data you have, there is only one user entry, so the following should work:
with open('c:\pull.csv', 'wb+') as f:
    csv_writer = unicodecsv.writer(f, encoding='utf-8')

    # Write a header row (do once)
    # csv_writer.writerow(["login", "title", "changed_files", "commits"])

    # Write data row
    csv_writer.writerow([changes['user']['login'], changes['title'], changes['changed_files'], changes['commits']])
If you want a header row, uncomment the line. This would then give you an output file:
login,title,changed_files,commits
octocat,new-feature,5,3
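If you would rather address the columns by name, unicodecsv also mirrors the standard library's DictWriter interface; a minimal sketch along those lines (the field names are assumed from the sample output, not taken from your data):

import unicodecsv

# `changes` is the parsed JSON dict from the question; the field names below
# are assumptions based on the sample output
fieldnames = ['login', 'title', 'changed_files', 'commits']

with open('c:\pull.csv', 'wb+') as f:
    csv_writer = unicodecsv.DictWriter(f, fieldnames=fieldnames, encoding='utf-8')
    csv_writer.writeheader()  # writes the header row once
    csv_writer.writerow({'login': changes['user']['login'],
                         'title': changes['title'],
                         'changed_files': changes['changed_files'],
                         'commits': changes['commits']})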
I'm trying to scrape comments from a certain submission on Reddit and output them to a CSV file.
import praw
import csv

reddit = praw.Reddit(client_id='ClientID', client_secret='ClientSecret', user_agent='UserAgent')
submission = reddit.submission(id="SubmissionID")

with open('Reddit.csv', 'w') as csvfile:
    for comment in submission.comments:
        csvfile.write(comment.body)
The problem is that for each cell the comments seem to be randomly split up. I want each comment in its own cell. Any ideas on how to achieve this?
You are importing the csv library but you are not actually using it. Use it and your problem may go away.
https://docs.python.org/3/library/csv.html#csv.DictWriter
import csv

# ...
comment = "this string was created from your code"
# ...

with open('names.csv', 'w', newline='') as csvfile:
    fieldnames = ['comment']
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames)

    writer.writeheader()
    writer.writerow({'comment': comment})
To write a CSV file in Python, use the csv module, specifically csv.writer(). You import this module at the top of your code, but you never use it.
Applied to your code, this looks like:
with open('Reddit.csv', 'w') as csvfile:
    comment_writer = csv.writer(csvfile)
    for comment in submission.comments:
        comment_writer.writerow([comment.body])
Here, we use csv.writer() to create a CSV writer from the file that we've opened, and we call it comment_writer. Then, for each comment, we write another row to the CSV file. The row is represented as a list. Since we only have one piece of information to write on each row, the list contains just one item. The row is [comment.body].
The csv module takes care of making sure that values with new lines, commas, or other special characters are properly formatted as CSV values.
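As a small illustration (the file name here is made up), a comment containing a comma and a newline still ends up in a single, properly quoted cell:

import csv

with open('quoting_demo.csv', 'w', newline='') as csvfile:
    writer = csv.writer(csvfile)
    # the first field contains a comma and a newline, so csv.writer wraps it in quotes
    writer.writerow(['first comment, with a comma\nand a second line', 'second comment'])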
Note that, for some submissions with many comments, your PRAW code might raise an exception along the lines of 'MoreComments' object has no attribute 'body'. The PRAW docs discuss this, and I encourage you to read that to learn more, but know that we can avoid this happening in code by further modifying our loop:
from praw.models import Comment

# ...
with open('Reddit.csv', 'w') as csvfile:
    comment_writer = csv.writer(csvfile)
    for comment in submission.comments:
        if isinstance(comment, Comment):
            comment_writer.writerow([comment.body])
Also, your code only gets the top level comments of a submission. If you're interested in more, see this question, which is about how to get more than just top-level comments from a submission.
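For reference, a rough sketch of that approach: PRAW's replace_more() resolves the "load more comments" placeholders, and comments.list() flattens the whole tree:

import csv

# resolve "load more comments" placeholders, then flatten the comment tree
submission.comments.replace_more(limit=None)

with open('Reddit.csv', 'w') as csvfile:
    comment_writer = csv.writer(csvfile)
    for comment in submission.comments.list():
        comment_writer.writerow([comment.body])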
I'm guessing that the cells are not being split randomly, but rather at a comma, space, or semi-colon. You can choose which character the cells are split at by using the delimiter parameter.
import csv

with open('Reddit.csv', 'w') as csvfile:
    comments = ['comment one', 'comment two', 'comment three']
    csv_writer = csv.writer(csvfile, delimiter='-')
    csv_writer.writerow(comments)
I am trying to create a script to reformat a .CSV file. The file being read starts as pipe-delimited and gets written as comma-delimited.
All it needs to do is index the columns and output them to file the way I want.
I am able to make it work perfectly when printing the output to screen (see the two commented lines at the bottom of the code), but when I attempt to write to file I get the following error. I have tried to change the format of csv_writer.writerow({'F3'}) several different ways. It would seem I don't completely understand how to use writerow(), or I am completely missing something needed to make it function properly.
I also have to put static fields in front of the indexed fields (i.e. I need a "1" put in front of the F3 field). Is there an additional trick to that?
import csv

csv.register_dialect('piper', delimiter='|')

with open('pbfile.txt', 'r') as csv_file:
    csv_reader = csv.DictReader(csv_file, dialect='piper', quoting=csv.QUOTE_MINIMAL)

    with open('ouput2.csv', 'w', newline='') as new_file:
        fieldnames = ['F0', 'F1', 'F2', 'F3', 'F4', 'F5', 'F6']
        csv_writer = csv.DictWriter(new_file, delimiter=',', fieldnames=fieldnames)

        for line in csv_reader:
            #csv_writer.writeheader()
            csv_writer.writerow({'F3'})
            csv_writer.writerow({'F1', 'F2', 'F6'})

            #print('1', line['F3'])
            #print('380', line['F1'], line['F2'], line['F6'])
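Passing {'F3'} hands DictWriter a set rather than the dict of fieldname-to-value pairs it expects, which is the likely source of the error. One possible sketch uses a plain csv.writer with list rows, mirroring the commented print lines and prepending the static '1' and '380' values:

import csv

csv.register_dialect('piper', delimiter='|')

with open('pbfile.txt', 'r') as csv_file:
    csv_reader = csv.DictReader(csv_file, dialect='piper')

    with open('ouput2.csv', 'w', newline='') as new_file:
        csv_writer = csv.writer(new_file)  # rows are plain lists
        for line in csv_reader:
            csv_writer.writerow(['1', line['F3']])
            csv_writer.writerow(['380', line['F1'], line['F2'], line['F6']])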
I am using a certain REST API to get data, and then attempting to write it to a CSV using Python 2.7.
In the CSV, every item within a tuple has u' ' around it. For example, with the 'tags' field I am retrieving, I am getting [u'01d/02d/major/--', u'45m/04h/12h/24h', u'internal', u'net', u'premium_custom', u'priority_fields_swapped', u'priority_saved', u'problem', u'urgent', u'urgent_priority_issue']. However, if I print the data in the program before it is written to the CSV, the data looks fine, i.e. ('01d/02d/major--', '45m/04h/12h/24h', etc.). So I am assuming I have to modify something in the CSV write command or within the CSV writer object itself. My question is how to write the data into the CSV properly so that there are no stray u'' prefixes.
In Python 3:
Just define the encoding when opening the CSV file to write to.
If the row contains non-ASCII chars, you will get a UnicodeEncodeError.
import csv

row = [u'01d/02d/major/--', u'45m/04h/12h/24h', u'internal', u'net', u'premium_custom', u'priority_fields_swapped', u'priority_saved', u'problem', u'urgent', u'urgent_priority_issue']

with open('output.csv', 'w', newline='', encoding='ascii') as f:
    writer = csv.writer(f)
    writer.writerow(row)
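The question itself mentions Python 2.7, where open() takes neither an encoding nor a newline argument; a common workaround there (a sketch, assuming each value is a unicode string) is to encode the values before writing:

import csv

row = [u'01d/02d/major/--', u'45m/04h/12h/24h', u'internal', u'net']

with open('output.csv', 'wb') as f:
    writer = csv.writer(f)
    # write each tag in its own cell, encoded to UTF-8 bytes; writing the whole
    # list as a single field would store its repr, u'' prefixes and all
    writer.writerow([value.encode('utf-8') for value in row])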
I am currently trying to write a CSV file in Python. The format is as follows:
1; 2.51; 12
123; 2.414; 142
EDIT: I already get the above format in my CSV, so the Python code seems OK. It appears to be an Excel issue, which is solved by changing the settings as @chucksmash mentioned.
However, when I try to open the generated CSV file with Excel, it doesn't recognize decimal separators. 2.414 is treated as 2414 in Excel.
import csv

csvfile = open('C:/Users/SUUSER/JRITraffic/Data/data.csv', 'wb')
writer = csv.writer(csvfile, delimiter=";")
writer.writerow(some_array_with_floats)
Did you check that the CSV file is generated correctly, as you want it? Also, try to specify the delimiter character you're using for the CSV file when you import/open your file. In this case, it is a semicolon.
For Python 3, I think your above code will also run into a TypeError, which may be part of the problem.
I just modified your open call to use 'w' instead of 'wb', since the array has floats and not binary data. This seemed to generate the result that you were looking for.
csvfile = open('C:/Users/SUUSER/JRITraffic/Data/data.csv', 'w')
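Putting that together for Python 3, a minimal sketch using the sample rows from the question (newline='' is an extra assumption here; it stops blank rows from appearing between lines on Windows):

import csv

with open('C:/Users/SUUSER/JRITraffic/Data/data.csv', 'w', newline='') as csvfile:
    writer = csv.writer(csvfile, delimiter=";")
    writer.writerow([1, 2.51, 12])
    writer.writerow([123, 2.414, 142])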
An ugly solution, if you really want to use ; as the separator:
import csv
import os

with open('a.csv', 'wb') as csvfile:
    csvfile.write('sep=;' + os.linesep)  # new line
    writer = csv.writer(csvfile, delimiter=";")
    writer.writerow([1, 2.51, 12])
    writer.writerow([123, 2.414, 142])
This will produce:
sep=;
1;2.51;12
123;2.414;142
which is recognized fine by Excel.
I personally would go with , as the separator, in which case you do not need the first line, so you can basically do:
import csv

with open('a.csv', 'wb') as csvfile:
    writer = csv.writer(csvfile)  # default delimiter is `,`
    writer.writerow([1, 2.51, 12])
    writer.writerow([123, 2.414, 142])
And Excel will recognize what is going on.
A way to do this is to specify dialect=csv.excel in the writer. For example:
import csv

a = [[1, 2.51, 12], [123, 2.414, 142]]

csvfile = open('data.csv', 'wb')
writer = csv.writer(csvfile, delimiter=";", dialect=csv.excel)
writer.writerows(a)
csvfile.close()
Unless Excel is already configured to use semicolon as its default delimiter, it will be necessary to import data.csv using Data/FromText and specify semicolon as the delimiter in the Text Import Wizard step 2 screen.
Very little documentation is provided for the Dialect class at csv.Dialect. More information about it is at Dialects in the PyMOTW's "csv – Comma-separated value files" article on the Python csv module. More information about csv.writer() is available at https://docs.python.org/2/library/csv.html#csv.writer.
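As a brief illustration of the dialect machinery (the dialect name here is made up), a dialect can also be defined as a subclass of csv.excel and registered under a name:

import csv

class SemicolonDialect(csv.excel):
    # identical to csv.excel except for the delimiter
    delimiter = ';'

csv.register_dialect('semicolon', SemicolonDialect)

csvfile = open('data.csv', 'wb')
writer = csv.writer(csvfile, dialect='semicolon')
writer.writerows([[1, 2.51, 12], [123, 2.414, 142]])
csvfile.close()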
I have a python list as such:
[['a','b','c'],['d','e','f'],['g','h','i']]
I am trying to get it into a CSV format so I can load it into Excel:
a,b,c
d,e,f
g,h,i
Using this, I am trying to write the array to a CSV file:
with open('tables.csv', 'w') as f:
    f.write(each_table)
However, it prints out this:
[
[
'
a
'
,
...
...
So then I tried putting it into an array (again) and then printing it.
each_table_array = [each_table]

with open('tables.csv', 'w') as f:
    f.write(each_table_array)
Now when I open up the CSV file, it's a bunch of unknown characters, and when I load it into Excel, I get a character in every cell.
Not too sure if it's me using the csv library wrong, or the array portion.
I just figured out that the table I am pulling data from has another table within one of its cells; this expands out and messes up the whole formatting.
You need to use the csv library for your job:
import csv

each_table = [['a', 'b', 'c'], ['d', 'e', 'f'], ['g', 'h', 'i']]

with open('tables.csv', 'w') as csvfile:
    writer = csv.writer(csvfile)
    for row in each_table:
        writer.writerow(row)
As a more flexible and Pythonic way, use the csv module for dealing with CSV files. Note that in Python 3 you need the newline='' argument in your open() call. Then you can use csv.writer to open your CSV file for writing:
import csv

with open('file_name.csv', 'w', newline='') as csvfile:
    spamwriter = csv.writer(csvfile, delimiter=',')
    spamwriter.writerows(main_list)
From the Python docs: If newline='' is not specified, newlines embedded inside quoted fields will not be interpreted correctly, and on platforms that use \r\n line endings on write an extra \r will be added. It should always be safe to specify newline='', since the csv module does its own (universal) newline handling.