Writing yaml file: attribute error - python

I'm trying to read a yaml file, replace part of it, and write the result back into the same file, but I get an attribute error.
Code
import yaml
import glob
import re
from yaml import load, dump
from yaml import CLoader as Loader, CDumper as Dumper
import io

list_paths = glob.glob("my_path/*.yaml")
for path in list_paths:
    with open(path, 'r') as stream:
        try:
            text = load(stream, Loader=Loader)
            text = str(text)
            print(text)
            if "my_string" in text:
                start = "'my_string': '"
                end = "'"
                m = re.compile(r'%s.*?%s' % (start, end), re.S)
                m = m.search(text).group(0)
                text[m] = "'my_string': 'this is my string'"
        except yaml.YAMLError as exc:
            print(exc)
    with io.open(path, 'w', encoding='utf8') as outfile:
        yaml.dump(text, path, default_flow_style=False, allow_unicode=True)
Error
I get this error on the yaml.dump line:
AttributeError: 'str' object has no attribute 'write'
What I have tried so far
Not converting the text to a string, but then I get an error on the m.search line:
TypeError: expected string or buffer
Converting first to a string and then back to a dict, but dict(text) raises: ValueError: dictionary update sequence element #0 has length 1; 2 is required
Yaml file
my string: something
string2: something else
Expected result: yaml file
my string: this is my string
string2: something else

To stop getting that error, all you need to do is change
with io.open(path, 'w', encoding='utf8') as outfile:
    yaml.dump(text, path, default_flow_style=False, allow_unicode=True)
to
with open(path, 'w', encoding='utf8') as outfile:
    yaml.dump(text, outfile, default_flow_style=False, allow_unicode=True)
As the other answer says, this solution simply replaces the string path with the open file object.

This
yaml.dump(text, path, default_flow_style=False, allow_unicode=True)
is not possible if path is a str. It must be an open file.
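To make that concrete, here is a minimal sketch (the data dict and out.yaml filename are made up for illustration) of handing yaml.dump an open file object:

import yaml

# hypothetical data and output path, for illustration only
data = {'my_string': 'this is my string', 'string2': 'something else'}

with open('out.yaml', 'w', encoding='utf-8') as f:
    # the stream argument must be a writable file object, not a path string
    yaml.dump(data, f, default_flow_style=False, allow_unicode=True)

Dumping the parsed dict directly (rather than str(text)) also avoids the string surgery in the question: assign the new value to the key in the dict, then dump it.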


Convert bytes object to string object in python

Python code
#!python3
import sys
import os.path
import codecs

if not os.path.exists(sys.argv[1]):
    print("File does not exist: " + sys.argv[1])
    sys.exit(1)

file_name = sys.argv[1]
with codecs.open(file_name, 'rb', errors='ignore') as file:
    file_contents = file.readlines()

for line_content in file_contents:
    print(type(line_content))
    line_content = codecs.decode(line_content)
    print(line_content)
    print(type(line_content))
File content : Log.txt
b'\x03\x00\x00\x00\xc3\x8a\xc3\xacRb\x00\x00\x00\x00042284899:ATBADSFASF:DSF456582:US\r\n1'
Output:
python3 file_convert.py Log.txt
<class 'bytes'>
b'\x03\x00\x00\x00\xc3\x8a\xc3\xacRb\x00\x00\x00\x00042284899:ATBADSFASF:DSF456582:US\r\n1'
<class 'str'>
I tried all the below methods
line_content = line_content.decode('UTF-8')
line_content = line_content.decode()
line_content = codecs.decode(line_content, 'UTF-8')
Is there any other way to handle this?
The line_content variable still holds the byte data and only the type changes to str, which is kind of confusing.
The data in Log.txt is the string representation of a Python bytes object. That is odd, but we can deal with it. Since it's a bytes literal, evaluate it, which converts it to a real Python bytes object. There is still the question of what its encoding is.
I don't see any advantage to using codecs.open. That's a way to read Unicode files in Python 2.7, not usually needed in Python 3. Guessing UTF-8, your code would be
#!python3
import sys
import os
import ast

if not os.path.exists(sys.argv[1]):
    print("File does not exist: " + sys.argv[1])
    sys.exit(1)

file_name = sys.argv[1]
with open(file_name) as file:
    file_contents = file.readlines()

for line_content in file_contents:
    print(type(line_content))
    line_content = ast.literal_eval(line_content).decode("utf-8")
    print(line_content)
    print(type(line_content))
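To see why literal_eval helps, here is a quick illustration with a made-up line (the b'...' text below is a hypothetical stand-in for one line of Log.txt):

import ast

# the file holds the *text* b'\x48\x65\x6c\x6c\x6f', i.e. a printed bytes literal
line = "b'\\x48\\x65\\x6c\\x6c\\x6f'"

raw = ast.literal_eval(line)   # evaluates the literal -> b'Hello', a real bytes object
print(raw.decode('utf-8'))     # -> Hello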
I think it's a list, not a string. Whenever you see a byte string starting with \ (a backslash), it can potentially be a list.
try this
decoded_line_content = list(line_content)

_io.TextIOWrapper Error trying to open a file

I'm working with a TSV file (here below my_file) and trying to write it out to another temp file with a random ID (here below my_temp_file), and this is what I wrote:
def temp_generator():
    while True:
        my_string = 'tmp' + str(random.randint(1, 1000000000))
        if not os.path.exists(my_string):
            return my_string

randomID = temp_generator()
my_temp_file = open('mytemp_' + randomID + '.tsv', 'w')
with open(my_file, 'r+') as mf:
    for line in mf:
        my_temp_file.write(line)
my_temp_file.close()
mf.close()
The output is something like:
mytemp_1283189.tsv
Now I'd like to work with my_temp_file.tsv in order to modify its content and rename it, but if I try to open it with:
with open(my_temp_file.tsv, 'r') as mtf:
    data = mtf.read()
    print(data)
This is what I obtain:
TypeError: expected str, bytes or os.PathLike object, not _io.TextIOWrapper
What can I do?
Issue
The pattern handle = open(path)
opens the file located at path and returns a file handle, assigned here to handle. You can use handle to .write, .read, or .close. But you cannot pass the handle itself back to open, which expects a path-like object, e.g. a filename.
Fixed
def temp_generator():
    while True:
        my_string = 'tmp' + str(random.randint(1, 1000000000))
        if not os.path.exists(my_string):
            return my_string

randomID = temp_generator()

# copy from input (my_file) to output, a random temp file (my_temp_file)
my_temp_file = 'mytemp_' + randomID + '.tsv'
with open(my_temp_file, 'w') as mtf:
    with open(my_file, 'r+') as mf:  # my_file is supposed to be a Path-like object
        for line in mf:
            mtf.write(line)
# since with..open is used, no close() is needed (auto-close!)

# modify output (content and rename the file)
# remember: my_temp_file is holding a path or filename
with open(my_temp_file, 'r') as mtf:  # open the file again
    data = mtf.read()
    print(data)
See also:
[Solved] Python TypeError: expected str, bytes or os.PathLike object, not _io.TextIOWrapper
Python documentation about TextIOWrapper: io — Core tools for working with streams
Your error is here, assuming you are actually calling open(my_temp_file...):
my_temp_file = open('mytemp_' + randomID + '.tsv', 'w')
with open(my_file, 'r+') as mf:
You've already opened the file, so you shouldn't open it again by passing the file handle as the parameter. You should prefer the with way of opening files, too.
For example
my_temp_file = 'mytemp_' + randomID + '.tsv'
with open(my_temp_file, 'r+') as mf:
Even then, if you're going to eventually rename the file, just make it the name you want from the start.
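If you do get to the rename step the question mentions, a minimal sketch (final.tsv is a hypothetical target name) would close the file first and then use os.rename:

import os

my_temp_file = 'mytemp_' + randomID + '.tsv'
new_name = 'final.tsv'  # hypothetical target name

# ... write the temp file inside a with block as shown above, then:
os.rename(my_temp_file, new_name)  # afterwards, open(new_name, 'r') works as usual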

Decode from Escaped Unicode to Arabic using Python

I was trying to decode a JSON file that has escaped Unicode text (\uXXXX sequences); the original text is Arabic.
My research led me to the following code using Python:
s = '\u00d8\u00b5\u00d9\u0088\u00d8\u00b1 \u00d8\u00a7\u00d9\u0084\u00d9\u008a\u00d9\u0088\u00d9\u0085\u00d9\u008a\u00d8\u00a7\u00d8\u00aa'
ouy = s.encode('utf-8').decode('unicode-escape').encode('latin1').decode('utf-8')
print(ouy)
the result text will be: صÙر اÙÙÙÙÙات
which still needs a fix using an online tool to become the original text: صور اليوميات
Is there any way to perform that fix using the above code?
Would appreciate your help, guys. Thanks in advance.
You can use this script to update all JSON files:
import json

filename = 'YourFile.json'  # file name we want to compress
newname = filename.replace('.json', '.min.json')  # output file name

with open(filename, encoding="utf8") as fp:
    print("Compressing file: " + filename)
    print('Compressing...')
    jload = json.load(fp)
    newfile = json.dumps(jload, indent=None, separators=(',', ':'), ensure_ascii=False)
    newfile = newfile.encode('latin1').decode('utf-8')  # remove this
    # print(newfile)

with open(newname, 'w', encoding="utf8") as f:  # add encoding="utf8"
    f.write(newfile)

print('Compression complete!')
DecodeJsonToOrigin
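The essential step, applied to the question's own sample string, is just the latin1 to UTF-8 round trip; a minimal sketch of that fix alone:

s = '\u00d8\u00b5\u00d9\u0088\u00d8\u00b1 \u00d8\u00a7\u00d9\u0084\u00d9\u008a\u00d9\u0088\u00d9\u0085\u00d9\u008a\u00d8\u00a7\u00d8\u00aa'

# each character is really one UTF-8 byte that was mis-read as Latin-1,
# so encode back to those bytes and decode them as UTF-8
fixed = s.encode('latin1').decode('utf-8')
print(fixed)  # صور اليوميات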

Merging multiple JSONs into one: TypeError, a bytes-like object is required, not 'str'

So, I am trying to write a small program in Python 3.6 to merge multiple JSON files (17k of them), and I am getting the above error.
I put together the script by reading other SO Q&As. I played around a little bit, getting various errors; nevertheless, I couldn't get it to work. Here is my code:
# -*- coding: utf-8 -*-
import glob
import json
import os
import sys

def merge_json(source, target):
    os.chdir(source)
    read_files = glob.glob("*.json")
    result = []
    i = 0
    for f in glob.glob("*.json"):
        print("Files merged so far:" + str(i))
        with open(f, "rb") as infile:
            print("Appending file:" + f)
            result.append(json.load(infile))
            i = i + 1
    output_folder = os.path.join(target, "mergedJSON")
    output_folder = os.path.join(output_folder)
    if not os.path.exists(output_folder):
        os.makedirs(output_folder)
    os.chdir(output_folder)
    with open("documents.json", "wb") as outfile:
        json.dump(result, outfile)

try:
    sys.argv[1], sys.argv[2]
except:
    sys.exit("\n\n Error: Missing arguments!\n See usage example:\n\n python merge_json.py {JSON source directory} {output directory} \n\n")

merge_json(sys.argv[1], sys.argv[2])
In your case you are opening the file in 'wb' mode, which means it works with bytes-like objects only, but json.dump is trying to write a string to it. Simply change the open mode from 'wb' to 'w' (text mode) and it will work.
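Applied to the script above, only the output open call changes (adding an explicit encoding is an optional extra, not part of the original answer):

with open("documents.json", "w", encoding="utf-8") as outfile:  # 'w' = text mode
    json.dump(result, outfile)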

Replacing commas with blank spaces from a read in text file

import os
os.chdir('my directory')
data = open('text.txt', 'r')
data = data.replace(",", " ")
print(data)
I get the error:
AttributeError: '_io.TextIOWrapper' object has no attribute 'replace'
You should open files in a with statement:
with open('text.txt', 'r') as data:
    plaintext = data.read()
plaintext = plaintext.replace(',', ' ')
The with statement ensures that resources are released properly, so you don't have to worry about remembering to close them.
The more substantial thing you were missing is that data is a file object, while replace works on strings; data.read() returns the file's text as a string.
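If you also want to save the cleaned text (the question only prints it, so writing back is an assumption), a short follow-up could be:

# hypothetical extra step: write the de-commaed text back out
with open('text.txt', 'w') as out:
    out.write(plaintext)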
