I have a sentence like !=This is great
How do I write the sentence into a .CSV file with "!" in the first column and "This is great" in another column?
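You can use the pandas to_csv method.
Code:
import pandas as pd
f = '!=This is great'
# split on the first '=' only, in case the sentence itself contains '='
col1, col2 = f.split('=', 1)
df = pd.DataFrame({'col1': [col1], 'col2': [col2]})
# index=False and header=False keep the row index and column names out of the file,
# so "!" really ends up in the first column
df.to_csv('test.csv', index=False, header=False)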
Split the text and write it to an output file:
text = open('in.txt').read() # if from input file
text = '!=This is great' # if not from input file
with open('out.csv', 'w') as f:
    f.write(','.join(text.split('=')))
Output:
!,This is great
If you have multiple lines, you will have to loop through the input file and split each one.
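For the multi-line case, a minimal sketch could look like the following. The sample lines are made up for illustration, and an in-memory buffer stands in for the output file. It uses split('=', 1) so that any further '=' stays in the second column, and csv.writer instead of a manual join so that commas in the sentence get quoted.

```python
import csv
import io

# Hypothetical input lines; in practice they would come from open('in.txt').
lines = ["!=This is great\n", "?=Is a=b true\n"]

out = io.StringIO()  # stands in for open('out.csv', 'w', newline='')
writer = csv.writer(out)
for line in lines:
    # Drop the trailing newline, then split on the first '=' only.
    writer.writerow(line.rstrip("\n").split("=", 1))

result = out.getvalue()
print(result)
```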
Of course, you could open the file with open() and insert the comma delimiter yourself for each line, but Python's standard library includes a csv module that does this for you. You can also specify the dialect.
In [1]: import csv
In [2]: sentence = "!=This is great"
In [3]: with open("test.csv", "w", newline='') as f:
   ...:     my_csvwriter = csv.writer(f)
   ...:     my_csvwriter.writerow(sentence.split("="))
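On specifying the dialect: csv.writer accepts a dialect name as well as individual formatting parameters such as quoting. A small sketch, writing to in-memory buffers rather than test.csv so the result can be inspected directly:

```python
import csv
import io

sentence = "!=This is great"

# The built-in 'excel-tab' dialect separates fields with tabs instead of commas.
tab_out = io.StringIO()
csv.writer(tab_out, dialect="excel-tab").writerow(sentence.split("="))

# quoting=csv.QUOTE_ALL wraps every field in double quotes.
quoted_out = io.StringIO()
csv.writer(quoted_out, quoting=csv.QUOTE_ALL).writerow(sentence.split("="))

print(repr(tab_out.getvalue()))
print(repr(quoted_out.getvalue()))
```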
With multiple sentences, assuming they are in a list called sentences, you can iterate through it when writing.
with open("test.csv", "w", newline='') as f:
    my_csvwriter = csv.writer(f)
    for sentence in sentences:
        my_csvwriter.writerow(sentence.split("="))
The library also handles commas inside a sentence, so you don't have to deal with them yourself. For instance, if you have:
sentence = "!=Hello, my name is.."
with open("test.csv", "w", newline='') as f:
    my_csvwriter = csv.writer(f)
    my_csvwriter.writerow(sentence.split("="))
# This will be written: !,"Hello, my name is.."
# Thanks to the quotes, Excel can still open the file correctly:
# it knows that `Hello, my name is..` belongs in a single column
Following other posts I saw on Stack Overflow, I wrote the following code to convert an XLSX file to TXT. However, it throws AttributeError: __exit__:
import xlrd
import csv
with xlrd.open_workbook('data.xlsx').sheet_by_index(0) as in_xslx:
    in_reader = csv.reader(in_xslx)
    with open("data.txt", "w", newline='', encoding='utf8') as out_text:
        out_writer = csv.writer(out_text, delimiter = '\t')
        for row in in_reader:
            out_writer.writerow(row)
However, it successfully converts a CSV if I replace the first two lines with:
with open("data.csv", "r", encoding='utf-8') as in_csv:
    in_reader = csv.reader(in_csv)
Any idea why this happens when converting XLSX->TXT, and how to correct it?
Thank you
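The sheet object returned by sheet_by_index() is not a context manager, so it cannot be used in a with statement, and csv.reader cannot iterate over it either. Read the rows through xlrd's own API instead:
import xlrd
import csv
# note: xlrd 2.0 and later read only .xls files; .xlsx needs xlrd < 2.0 or another reader
with open("data.txt", "w", newline='', encoding='utf8') as out_text:
    # define output writer
    out_writer = csv.writer(out_text, delimiter='\t')
    # open and read the Excel file
    data_file = xlrd.open_workbook('data.xlsx')
    # get the first worksheet
    worksheet = data_file.sheet_by_index(0)
    # get the row values and write them into the output file
    for rownum in range(worksheet.nrows):  # range, not xrange, on Python 3
        out_writer.writerow(worksheet.row_values(rownum))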
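As for why the original attempt raised an error at all: a with statement only works on objects that implement __enter__ and __exit__. The sketch below uses a made-up stand-in class (PlainObject is hypothetical, not part of xlrd) to reproduce the failure without needing a workbook. Depending on the Python version, the error surfaces as AttributeError or TypeError.

```python
class PlainObject:
    """Hypothetical stand-in for an object that defines no __enter__/__exit__."""


try:
    with PlainObject() as obj:
        pass
except (AttributeError, TypeError) as exc:
    # The exception type differs between Python versions, so catch both.
    failed = True
    print(type(exc).__name__, exc)
else:
    failed = False
```

File objects returned by open() do implement both methods, which is why the CSV variant of the script runs.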
I have an input csv that look like
email,trait1,trait2,trait3
foo#gmail,biz,baz,buzz
bar#gmail,bizzy,bazzy,buzzy
foobars#gmail,bizziest,bazziest,buzziest
and I need the output format to look like
Indv,AttrName,AttrValue,Start,End
foo#gmail,"trait1",biz,,,
foo#gmail,"trait2",baz,baz,,
foo#gmail,"trait3",buzz,,,
For each row in my input file I need to write one output row for each of the N-1 trait columns in the input csv. The Start and End fields in the output file can be empty in some cases.
I'm trying to read in the data using a DictReader. So far I've been able to read in the data with
import unicodecsv
import os
import codecs
with open('test.csv') as csvfile:
    reader = unicodecsv.csv.DictReader(csvfile)
    outfile = codecs.open("test-write", "w", "utf-8")
    outfile.write("Indv", "ATTR", "Value", "Start","End\n")
    for row in reader:
        outfile.write([row['email'],"trait1",row['trait1'],'',''])
        outfile.write([row['email'],"trait2",row['trait2'],row['trait2'],''])
        outfile.write([row['email'],"trait3",row['trait3'],'',''])
This doesn't work (I think I need to convert the list to a string), and it is also very brittle because I'm hardcoding the column names for each row. The bigger issue is that the data within the for loop isn't written to "test-write". Only the line
outfile.write("Indv", "ATTR", "Value", "Start","End\n") is actually written to the file. Is DictReader the appropriate class to use in my case?
This uses a unicodecsv.DictWriter and the zip() function to do what you want, and the code is fairly readable in my opinion.
import unicodecsv
import os
import codecs
with open('test.csv') as infile, \
     codecs.open('test-write.csv', 'w', 'utf-8') as outfile:
    reader = unicodecsv.DictReader(infile)
    fieldnames = 'Indv,AttrName,AttrValue,Start,End'.split(',')
    writer = unicodecsv.DictWriter(outfile, fieldnames)
    writer.writeheader()
    for row in reader:
        email = row['email']
        trait1, trait2, trait3 = row['trait1'], row['trait2'], row['trait3']
        writer.writerows([  # writes three rows of output from each row of input
            dict(zip(fieldnames, [email, 'trait1', trait1])),
            dict(zip(fieldnames, [email, 'trait2', trait2, trait2])),
            dict(zip(fieldnames, [email, 'trait3', trait3]))])
Here's the contents of the test-write.csv file it produced from your example input csv file:
Indv,AttrName,AttrValue,Start,End
foo#gmail,trait1,biz,,
foo#gmail,trait2,baz,baz,
foo#gmail,trait3,buzz,,
bar#gmail,trait1,bizzy,,
bar#gmail,trait2,bazzy,bazzy,
bar#gmail,trait3,buzzy,,
foobars#gmail,trait1,bizziest,,
foobars#gmail,trait2,bazziest,bazziest,
foobars#gmail,trait3,buzziest,,
I may be completely off since I don't do a lot of work with unicode, but it seems to me that the following should work:
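import csv
# 'ur' and 'uw' are not valid modes; on Python 3 use text mode with newline=''
with open('test.csv', newline='', encoding='utf-8') as csvin, \
     open('test-write', 'w', newline='', encoding='utf-8') as csvout:
    reader = csv.DictReader(csvin)
    writer = csv.DictWriter(csvout, fieldnames=['Indv', 'AttrName',
                                                'AttrValue', 'Start', 'End'])
    writer.writeheader()  # without this call the header row is never written
    for row in reader:
        for traitnum in range(1, 4):
            key = "trait{}".format(traitnum)
            # Start and End are omitted, so DictWriter leaves them empty
            writer.writerow({'Indv': row['email'], 'AttrName': key,
                             'AttrValue': row[key]})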
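To avoid hardcoding the trait names, the column list can be taken from the reader itself. The following is only a sketch under two assumptions that the question does not confirm: the first column is always the identifier, and every other column is a trait. It reads from and writes to in-memory buffers so it can be run as is, and it leaves Start and End blank.

```python
import csv
import io

# Sample input from the question; a real script would open the file instead.
source = io.StringIO(
    "email,trait1,trait2,trait3\n"
    "foo#gmail,biz,baz,buzz\n"
    "bar#gmail,bizzy,bazzy,buzzy\n"
)
out = io.StringIO()

reader = csv.DictReader(source)
# Assumption: the first column identifies the row, the rest are traits.
id_column, *trait_columns = reader.fieldnames

writer = csv.DictWriter(
    out, fieldnames=["Indv", "AttrName", "AttrValue", "Start", "End"]
)
writer.writeheader()
for row in reader:
    for name in trait_columns:
        # Missing keys (Start, End) are filled with DictWriter's default ''.
        writer.writerow(
            {"Indv": row[id_column], "AttrName": name, "AttrValue": row[name]}
        )

output_lines = out.getvalue().splitlines()
print("\n".join(output_lines))
```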
import pandas as pd
pd1 = pd.read_csv('input_csv.csv')
# melt turns the trait columns into (AttrName, AttrValue) rows
pd2 = (pd.melt(pd1, id_vars=['email'], value_vars=['trait1', 'trait2', 'trait3'],
               var_name='AttrName', value_name='AttrValue')
         .rename(columns={'email': 'Indv'})
         .sort_values(by=['Indv', 'AttrName'])  # DataFrame.sort() no longer exists in current pandas
         .reset_index(drop=True))
pd2.to_csv('output_csv.csv', index=False)
I'm not clear on what the Start and End fields represent, but this gets you everything else.
With the amazing help of Martijn I have come this far in my Python programming. However, when I tried to export the content of my cells to a CSV file, the file was written, but my result is as follows:
import urllib2
from bs4 import BeautifulSoup
soup = BeautifulSoup(urllib2.urlopen('https://clinicaltrials.gov/ct2/show/study/NCT01718158?term=NCT01718158&rank=1&show_locs=Y#locn').read())
import csv
filename = 'Trial1.csv'
f = open(filename, 'wb')
with f:
    writer = csv.writer(f)
    for row in soup('table')[5].findAll('tr'):
        tds = row('td')
        result = u' '.join([cell.string for cell in tds if cell.string])
        writer.writerow(result)
        print result
f.close()
Result: |j|o|h|n|1|2|3
instead of |john|123| for each particular cell.
How do I correct this? Thanks.
The problem is that writerow() expects a sequence of fields, but result is a single string, so the writer treats every character as its own field.
Pass the list of cell strings to writerow() instead of joining them first. Changing the delimiter from a comma to a tab is optional, but it is shown here as well:
...
# I'd suggest opening the file with "with ... as f" in one line
with open(filename, 'wb') as f:
    # set the delimiter to a tab (\t) rather than a comma
    writer = csv.writer(f, delimiter='\t')
    for row in soup('table')[5].findAll('tr'):
        tds = row('td')
        # pass the list to writerow directly; each element becomes one field
        writer.writerow([cell.string for cell in tds if cell.string])
...
Hope this helps.
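The one-character-per-cell result can be reproduced without any scraping. A small sketch, using an in-memory buffer, a made-up value, and '|' as the delimiter to mirror the output shown in the question:

```python
import csv
import io

buf = io.StringIO()
writer = csv.writer(buf, delimiter="|")

writer.writerow("john")           # a string is iterated character by character
writer.writerow(["john", "123"])  # a list yields one field per element

rows = buf.getvalue().splitlines()
print(rows)
```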
I have some data that needs to be written to a CSV file. The data is as follows
A ,B ,C
a1,a2 ,b1 ,c1
a2,a4 ,b3 ,ct
The first column has comma inside it. The entire data is in a list that I'd like to write to a CSV file, delimited by commas and without disturbing the data in column A. How can I do that? Mentioning delimiter = ',' splits it into four columns on the whole.
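Just use csv.writer from the csv module. It quotes any field that contains the delimiter.
import csv
data = [['A', 'B', 'C'],
        ['a1,a2', 'b1', 'c1'],
        ['a2,a4', 'b3', 'ct']]
fname = "myfile.csv"
# Python 3: text mode with newline=''; on Python 2 use open(fname, 'wb') instead
with open(fname, 'w', newline='') as f:
    writer = csv.writer(f)
    for row in data:
        writer.writerow(row)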
https://docs.python.org/library/csv.html#csv.writer
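You can do this without the csv module, but then you have to quote the fields yourself, because a plain join leaves the ',' in the first column unquoted and a reader will split it into an extra column:
with open('myfile.csv', 'w') as f:
    for row in data:
        # quote any field that contains a comma so it stays in one column
        f.write(','.join('"%s"' % x if ',' in x else x for x in row))
        f.write('\n')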
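Whether a hand-written approach keeps the columns intact is easy to check by reading the result back. The sketch below, which uses in-memory buffers instead of myfile.csv, compares an unquoted join with csv.writer on the data from the question:

```python
import csv
import io

data = [["A", "B", "C"], ["a1,a2", "b1", "c1"], ["a2,a4", "b3", "ct"]]

# Plain join: the comma inside 'a1,a2' is indistinguishable from a delimiter.
joined = "\n".join(",".join(row) for row in data)
joined_rows = list(csv.reader(io.StringIO(joined)))

# csv.writer quotes the field, so the columns survive a round trip.
buf = io.StringIO()
csv.writer(buf).writerows(data)
buf.seek(0)
quoted_rows = list(csv.reader(buf))

print(joined_rows[1])
print(quoted_rows[1])
```

The first list has four fields and the second has three, which is the difference the question is about.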
You could try the below.
Code:
import csv
import re
with open('infile.csv', 'r') as f:
    lst = []
    for line in f:
        # each run of non-space characters is one field; a leading comma is dropped
        lst.append(re.findall(r',?(\S+)', line))
with open('outfile.csv', 'w', newline='') as w:
    writer = csv.writer(w)
    for row in lst:
        writer.writerow(row)
Output:
A,B,C
"a1,a2",b1,c1
"a2,a4",b3,ct