Replacing commas with blank spaces from a read in text file - python

import os
os.chdir('my directory')
data = open('text.txt', 'r')
data = data.replace(",", " ")
print(data)
I get the error:
AttributeError: '_io.TextIOWrapper' object has no attribute 'replace'

You should open files in a with statement:
with open('text.txt', 'r') as data:
    plaintext = data.read()
plaintext = plaintext.replace(',', ' ')
The with statement ensures that resources are released properly, so you don't have to worry about remembering to close the file.
The more substantial thing you were missing is that data is a file object, while replace works on strings. data.read() returns the file's contents as a string.
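Putting the pieces together, a minimal self-contained sketch (it creates a small text.txt first so it can run on its own; the file contents are illustrative):

```python
# Create a sample file so the example is self-contained; in the question,
# 'text.txt' already exists on disk.
with open('text.txt', 'w') as f:
    f.write('a,b,c')

# Read the contents as a string, then call str.replace on that string
# (file objects have no .replace method).
with open('text.txt', 'r') as data:
    plaintext = data.read()          # read() returns a str
plaintext = plaintext.replace(',', ' ')
print(plaintext)                     # prints "a b c"
```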

Extra blank line is getting printed at the end of the output in Python

I am trying to read a file from command line and trying to replace all the commas in that file with blank. Below is my code:
import sys
datafile = sys.argv[1]
with open(datafile, 'r') as data:
    plaintext = data.read()
plaintext = plaintext.replace(',', '')
print(plaintext)
But while printing the plaintext I am getting one extra blank row at the end. Why is it happening and how can I get rid of that?
The file almost certainly ends with a newline of its own, and print() appends another one, which shows up as the extra blank line. You can strip the trailing newline before printing:
plaintext = plaintext.rstrip('\n')
This should remove the extra line.
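A self-contained sketch of the cause and the fix (it writes a small sample file first; the name data.txt and its contents are illustrative):

```python
# The sample file ends with a newline, as text files typically do.
with open('data.txt', 'w') as f:
    f.write('a,b\nc,d\n')

with open('data.txt', 'r') as data:
    plaintext = data.read()
# Remove commas, then strip the trailing newline so print() doesn't
# produce a blank line at the end of the output.
plaintext = plaintext.replace(',', '').rstrip('\n')
print(plaintext)
```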

_io.TextIOWrapper Error trying to open a file

I'm working with TSV file (here below my_file) and trying to write it down to another temp file with a random ID (here below my_temp_file)and this is what I wrote:
def temp_generator():
    while True:
        my_string = 'tmp' + str(random.randint(1,1000000000))
        if not os.path.exists(my_string):
            return my_string

randomID = temp_generator()
my_temp_file = open('mytemp_' + randomID + '.tsv', 'w')
with open(my_file, 'r+') as mf:
    for line in mf:
        my_temp_file.write(line)
my_temp_file.close()
mf.close()
The output is something like:
mytemp_1283189.tsv
Now I'd like to work with my_temp_file.tsv in order to modify its content and rename it but if I try to open it with:
with open(my_temp_file.tsv, 'r') as mtf:
    data = mtf.read()
    print(data)
This is what I obtain:
TypeError: expected str, bytes or os.PathLike object, not _io.TextIOWrapper
What can I do?
Issue
The pattern handle = open(path)
opens the file at path and returns a handle, assigned to handle. You can use handle to .write(), .read(), or .close(). But you cannot open it again, or use it as input to open, which expects a path-like object, e.g. a filename.
Fixed
def temp_generator():
    while True:
        my_string = 'tmp' + str(random.randint(1,1000000000))
        if not os.path.exists(my_string):
            return my_string

randomID = temp_generator()

# copy from input (my_file) to output, a random temp file (my_temp_file)
my_temp_file = 'mytemp_' + randomID + '.tsv'
with open(my_temp_file, 'w') as mtf:
    with open(my_file, 'r+') as mf:  # my_file is supposed to be a path-like object
        for line in mf:
            mtf.write(line)
# since with..open is used, no close() is needed (auto-close!)

# modify output (content and rename the file)
# remember: my_temp_file is holding a path or filename
with open(my_temp_file, 'r') as mtf:  # open the file again
    data = mtf.read()
    print(data)
See also:
[Solved] Python TypeError: expected str, bytes or os.PathLike object, not _io.TextIOWrapper
Python documentation about TextIOWrapper: io — Core tools for working with streams
Your error is here (assuming you are actually calling open(my_temp_file, ...)):
my_temp_file = open('mytemp_' + randomID + '.tsv', 'w')
with open(my_file, 'r+') as mf:
You've already opened the file, so you shouldn't open it again by passing the file handle as the parameter. You should also prefer opening files only via with.
For example
my_temp_file = 'mytemp_'+randomID + '.tsv'
with open(my_temp_file, 'r+') as mf:
Even then, if you're going to eventually rename the file, just make it the name you want from the start
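If renaming afterwards really is needed, one hedged sketch (the name final.tsv and the file contents are illustrative) is to keep only the path in a variable and rename with os.replace once writing is done:

```python
import os
import random

# Write the temp file under a random name, mirroring the question's scheme.
temp_name = 'mytemp_' + str(random.randint(1, 1000000000)) + '.tsv'
with open(temp_name, 'w') as mtf:
    mtf.write('col1\tcol2\n')

final_name = 'final.tsv'            # the name you actually want
os.replace(temp_name, final_name)   # rename on disk; overwrites an existing target

# Reopen by *path*, never by passing the old file object to open().
with open(final_name, 'r') as f:
    data = f.read()
```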

Writing yaml file: attribute error

I'm trying to read a yaml file, replacing part of it and write the result it into the same file, but I get an attribute error.
Code
import yaml
import glob
import re
from yaml import load, dump
from yaml import CLoader as Loader, CDumper as Dumper
import io

list_paths = glob.glob("my_path/*.yaml")
for path in list_paths:
    with open(path, 'r') as stream:
        try:
            text = load(stream, Loader=Loader)
            text = str(text)
            print text
            if "my_string" in text:
                start = "'my_string': '"
                end = "'"
                m = re.compile(r'%s.*?%s' % (start, end), re.S)
                m = m.search(text).group(0)
                text[m] = "'my_string': 'this is my string'"
        except yaml.YAMLError as exc:
            print(exc)
    with io.open(path, 'w', encoding='utf8') as outfile:
        yaml.dump(text, path, default_flow_style=False, allow_unicode=True)
Error
I get this error for the yaml_dump line
AttributeError: 'str' object has no attribute 'write'
What I have tried so far
Not converting the text to a string, but then I get an error on the m.search line:
TypeError: expected string or buffer
Converting first to string and then back to dict with dict(text), but then I get: ValueError: dictionary update sequence element #0 has length 1; 2 is required
Yaml file
my string: something
string2: something else
Expected result: yaml file
my string: this is my string
string2: something else
To stop getting that error all you need to do is change the
with io.open(path, 'w', encoding='utf8') as outfile:
    yaml.dump(text, path, default_flow_style=False, allow_unicode=True)
to
with open(path, 'w') as outfile:
    yaml.dump(text, outfile, default_flow_style=False, allow_unicode=True)
As the other answer says, this solution simply replaces the string path with the open file object.
This
yaml.dump(text, path, default_flow_style=False, allow_unicode=True)
is not possible if path is a str. It must be an open file.
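A minimal round-trip sketch of the corrected call, assuming PyYAML is installed (the name out.yaml and the document contents are illustrative):

```python
import yaml

# The second argument to yaml.dump must be the open file object,
# not the path string.
doc = {'my string': 'this is my string', 'string2': 'something else'}
with open('out.yaml', 'w', encoding='utf8') as outfile:
    yaml.dump(doc, outfile, default_flow_style=False, allow_unicode=True)

# Reading it back confirms the round trip.
with open('out.yaml', 'r', encoding='utf8') as f:
    loaded = yaml.safe_load(f)
```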

AttributeError: '_io.TextIOWrapper' object has no attribute 'decode'

I'm trying read multiple text files, doing word segmentation (use jieba) and then save the results to CSV files respectively. It shows
AttributeError: '_io.TextIOWrapper' object has no attribute 'decode'
Thanks for anyone's help.
The python code is:
import jieba
import csv
import glob

list_of_files = glob.glob('C:/Users/user/Desktop/speech./*.txt')
for file_name in list_of_files:
    FI = open(file_name, 'r')
    FO = open(file_name, 'w')
    seglist = jieba.cut(FI, cut_all=False)
    w = csv.writer(FO)
    w.writerows(seglist)
    FI.close()
    FO.close()
It seems that you need to send bytes to cut and not a file object
try this code instead:
list_of_files = glob.glob('C:/Users/user/Desktop/speech./*.txt')
for file_name in list_of_files:
    with open(file_name, 'rb') as f:
        text = f.read()
    seglist = jieba.cut(text, cut_all=False)
    with open(file_name, 'w') as f:
        w = csv.writer(f)
        w.writerows(seglist)
From the example in the source code and the definition of jieba.cut, it seems jieba.cut needs a string as its parameter, but you are giving it a file object.
seglist = jieba.cut(FI.read(), cut_all=False)
fixes the issue (FI.read() is the fix).
By the way, don't name variables FI / FO; that style is conventional for constants or maybe classes, not variables. Explicit is better than implicit: prefer something like file_input and file_output.
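A runnable sketch of the combined fix, with str.split standing in as a hypothetical substitute for jieba.cut so the example runs without the library. The two key points: read the whole file into a string before reopening the same path in 'w' mode (which truncates it), and pass that string, not the file object, to the cutter.

```python
import csv
import glob

# Create a sample input file so the example is self-contained; the name
# and contents are illustrative.
with open('speech.txt', 'w', encoding='utf-8') as f:
    f.write('hello world')

for file_name in glob.glob('speech*.txt'):
    with open(file_name, 'r', encoding='utf-8') as file_input:
        text = file_input.read()          # a str, not a file object
    seglist = text.split()                # stand-in for jieba.cut(text, cut_all=False)
    with open(file_name, 'w', encoding='utf-8', newline='') as file_output:
        csv.writer(file_output).writerow(seglist)

# Read the result back (newline='' preserves csv's \r\n row terminator).
with open('speech.txt', encoding='utf-8', newline='') as f:
    result = f.read()
```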

How to read a csv django http response

In a view, I create a Django HttpResponse object composed entirely of a csv using a simply csv writer:
response = HttpResponse(content_type='text/csv')
response['Content-Disposition'] = 'attachment; filename="foobar.csv"'
writer = csv.writer(response)
table_headers = ['Foo', 'Bar']
writer.writerow(table_headers)
bunch_of_rows = [['foo', 'bar'], ['foo2', 'bar2']]
for row in bunch_of_rows:
    writer.writerow(row)
return response
In a unit test, I want to test some aspects of this csv, so I need to read it. I'm trying to do so like so:
response = views.myview(args)
reader = csv.reader(response.content)
headers = next(reader)
row_count = 1 + sum(1 for row in reader)
self.assertEqual(row_count, 3) # header + 1 row for each attempt
self.assertIn('Foo', headers)
But the test fails with the following on the headers = next(reader) line:
nose.proxy.Error: iterator should return strings, not int (did you open the file in text mode?)
I see in the HttpResponse source that response.content returns the body as a byte string, but I'm not sure of the correct way to deal with that so csv.reader can parse it. I thought I could just replace response.content with response (since you write to the object itself, not its content), but that only produced a slight variation on the error:
response.content provides bytes. You need to decode this to a string:
foo = response.content.decode('utf-8')
Then pass this string to the csv reader using io.StringIO:
import io
reader = csv.reader(io.StringIO(foo))
You can use io.TextIOWrapper to convert the provided bytestring to a text stream:
import io
reader = csv.reader(io.TextIOWrapper(io.BytesIO(response.content), encoding='utf-8'))
This will convert the bytes to strings as they're being read by the reader.
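The whole decoding step can be sketched without Django by substituting a literal byte string for response.content (the sample bytes are illustrative; csv.writer terminates rows with \r\n, hence their shape):

```python
import csv
import io

# Stand-in for response.content: the bytes a csv.writer would have produced.
content = b'Foo,Bar\r\nfoo,bar\r\nfoo2,bar2\r\n'

# Decode the bytes to a str, then wrap it in a text stream for csv.reader.
reader = csv.reader(io.StringIO(content.decode('utf-8')))
headers = next(reader)
row_count = 1 + sum(1 for row in reader)  # header row plus the data rows
```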
