I am currently learning Python. For a small project, I am writing a script to dump and load JSON extracted from the web. The file needs to be updated after each data pull, and to that end I have written the following code.
with open(os.path.join(d, fname), 'a+') as f:
    try:
        f.seek(0)
        t = json.load(f)
        for i in t:
            tmp[i] = t[i]
    except Exception as e:
        print(e, "New File ", fname, " is created in ", d)
    f.truncate()
    json.dump(tmp, f)
I have put in the try/except block since, the first time this program runs, the file will have no data written to it.
When I run the script, it works as expected, but on the fourth run it raises an "Extra data" exception:
Extra data: line 1 column 29245 (char 29244) New File TSLA_dann is created in 2017-12-20
I am not sure how another dictionary is being written to the file. Please guide me on this.
It is hard to say exactly how a second JSON object ends up in the file, because your code is not good: you mix try, open, seek and truncate too freely, and have possibly chosen the wrong file mode. Let me teach you a little so you can do much better:
A try block should cover only the code that can actually raise the error.
seek is not always needed; a freshly opened file already starts at position 0 (except in append mode).
open(x, 'a+') means every write is appended to the end of the file, no matter where you seek (I think this is the cause of your error).
Use proper spacing and indentation.
Be patient.
The problem is most likely the 'a+' mode: writes always go to the end, and truncate() with no argument cuts the file at the current position, which is the end of the file after a successful json.load, so nothing is removed and each run appends one more JSON object. In any case, clean up the code :)
Believe me, I have written 250,000-line programs without problems.
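To see the append-mode behaviour in isolation, here is a minimal demo you can run (demo.txt is just a throwaway example file, not part of your script):
with open('demo.txt', 'a+') as f:
    f.write('first')
    f.seek(0)
    f.write('second')  # append mode ignores the seek: this still lands at the end
    f.seek(0)
    print(f.read())    # prints 'firstsecond' -- both writes went to the end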
Here is clean code for you as a good example; it should work (I have not tested it, so fix it if a letter is missing, or just run it):
# read
file_path = os.path.join(d, fname)
t = {}
try:
    with open(file_path, 'r') as f:  # 'r' (read) is the default and can be omitted
        t = json.load(f)
except (IOError, ValueError) as e:  # missing file or invalid JSON
    print('%s %s' % (e, file_path))
for i in t:
    tmp[i] = t[i]
# write
with open(file_path, 'w') as f:
    json.dump(tmp, f)
I've run into an error. I've been trying to append a text file to itself like so:
file_obj = open("text.txt", "a+")
number = 6

def appender(obj, num):
    count = 0
    while count <= num:
        read = file_obj.read()
        file_obj.seek(0, 2)
        file_obj.write(read)
        count += 1

appender(file_obj, number)
However, the text.txt file is then filled with strange symbols. At first, the file contains only a simple "hello", but after the code runs, it contains this:
hellohello䀀 猀· d娀 Ť搀Ŭ娀ͤ攀ɪ昀Ѥ萀 夀ɚ搀ť樀Ŧ搀茀 婙ݤ攀Ѫ昀ࡤ萀 夀њ搀
ɥ攀ժ昀
茀 婙攀ť樀ɦ搀茀 婙萀 ݚ搀࡚攀४攀ƃ娀搀⡓ 癳 祐桴湯䌠慨慲瑣牥䴠灡楰杮
䌠摯捥挠ㅰ㔲‰敧敮慲整牦浯✠䅍偐义升嘯久佄卒䴯䍉䙓⽔䥗䑎坏⽓偃㈱〵吮员‧楷桴朠湥潣敤祰
മഊ椊 and so on.
Any help will be appreciated
I think I can fix your problem, even though I can't reproduce it. There's a logic error: after you write, you fail to return to the start of the file for reading. In terms of analysis, you failed to do anything to diagnose the problem. At the very least, use a print statement to see what you're reading: that highlights the problem quite well. Here's the loop I used:
count = 0
while count <= num:
    file_obj.seek(0)     # Read from the beginning of the file.
    read = file_obj.read()
    print(count, read)   # Trace what we're reading.
    file_obj.seek(0, 2)  # Jump to the end before writing.
    file_obj.write(read)
    count += 1
This gives the expected output of 128 (2^(6+1), since the loop runs num+1 = 7 times and each pass doubles the contents) repetitions of "hello".
EXTENSIONS
I recommend that you learn to use both the for loop and the with open ... as idiom. These will greatly shorten your program and improve the readability.
I am using this code and everything is working as expected:
with open("file.txt") as f:
for line in f:
f.write(line)
You just have the wrong mode - use 'r+' rather than 'a+'. See this link for a list of modes and an explanation of reading files.
I am trying to access the content of a file while that file is still being updated. Here is the code that does the writing:
for i in range(100000):
    fp = open("text.txt", "w")
    fp.write(str(i))
    fp.close()
    #time.sleep(1)
My problem now is that whenever I try to open my file while the for loop is still running, I get an empty file in the text editor (I expect to see an "updated" number).
I am wondering: is there a way to view the content of the file before the for loop ends?
Thanks in advance for any help :)
Do not open the file inside the for loop; that is bad practice and bad code. Every iteration re-opens the file in 'w' mode, which truncates it to zero length, so whenever you look at it you are likely to catch it freshly emptied. That is why you get an empty file.
fp = open("text.txt", "r+")
for i in range(100000):
fp.seek(0)
fp.write(str(i))
fp.truncate()
fp.close()
Python and the operating system buffer file writes to improve performance: several consecutive writes are lumped together before they actually reach the disk. If you want each write to propagate immediately, use the flush() method, but remember that it drastically reduces application performance:
with open("text.txt", "w") as fp:
for i in range(100000):
fp.write(str(i))
fp.flush()
When you write to a file, the data usually does not actually get written to disk until the file is closed. If you want it written immediately, you need to add a flush to each iteration, hence:
fp = open("text.txt", "w")
for i in range(100000):
fp.write(str(i))
fp.write("\n")
fp.flush()
fp.close()
I know this question has been asked quite a lot before, on SO and elsewhere, but I still couldn't get it done. And I'm sorry if my English is bad.
Removing a file on Linux was much simpler: just os.remove(my_file) did the job. But on Windows it gives:
os.remove(my_file)
WindowsError: [Error 32] The process cannot access the file because it is being used by another process: (file-name)
My code:
line_count = open(my_file, mode='r')
t_lines = len(line_count.readlines())  # For the total number of lines
outfile = open(dec_file, mode='w')
with open(my_file, mode='r') as contents:
    p_line = 1
    line_infile = contents.readline()[4:]
    while line_infile:
        dec_of_line = baseconvert(line_infile.rstrip(), base16, base10)
        if p_line == t_lines:
            dec_of_line += str(len(line_infile)).zfill(2)
            outfile.write(dec_of_line + "\r\n")
        else:
            outfile.write(dec_of_line + "\r\n")
        p_line += 1
        line_infile = contents.readline()[4:]
outfile.close()
os.remove(my_file)
Here my_file is a variable that contains the complete path to a file. Likewise, dec_file also contains a path, but to a new file. The file I'm trying to remove is the one that's being used in read mode. I need some help, please.
My attempts:
Tried closing the file with my_file.close(). The corresponding error I got was AttributeError: 'str' object has no attribute 'close'. I thought that when a file is in read mode it automatically closes when it reaches the end of the file, but I still gave it a try.
Also tried os.close(my_file), as per https://stackoverflow.com/a/1470388/3869739. I got TypeError: an integer is required.
Or am I getting this error just because I have opened the file twice (once for counting the lines and once to read the file contents)?
The Pythonic way of reading from or writing to a file is to use a with context.
To read a file:
with open("/path/to/file") as f:
contents = f.read()
#Inside the block, the file is still open
# Outside `with` here, f.close() is automatically called.
To write:
with open("/path/to/file", "w") as f:
print>>f, "Goodbye world"
# Outside `with` here, f.close() is automatically called.
Now, if there's no other process reading or writing to the file, and assuming you have all the permissions, you should be able to delete the file. There is a very good chance that there's a resource leak (a file handle not being closed), which is why Windows will not allow you to delete the file. The solution is to use with.
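For example, the line_count handle in your code is never closed. A minimal sketch of that part, rewritten with a with block (reusing your own variable names), would be:
with open(my_file, mode='r') as line_count:
    t_lines = len(line_count.readlines())  # the handle is closed when the block ends
# ... rest of the processing ...
os.remove(my_file)  # no stray handle is left open on my_file now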
Further, to clarify a few other points:
It's the garbage collector that closes the stream, when the file object is destroyed. A file is not auto-closed upon being read to the end; that wouldn't make sense if the programmer wanted to rewind, would it?
os.close(..) internally calls the C API close(..), which takes an integer file descriptor, not a string as you passed.
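To illustrate that last point, os.close() pairs with os.open(), which returns an integer descriptor (a contrived snippet just to show the API, not something your script needs):
import os
fd = os.open(my_file, os.O_RDONLY)  # os.open() returns an integer file descriptor
os.close(fd)                        # os.close() expects that integer, not a string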
I am trying to process a list of files, where each may be a regular text file OR a bz2 archive.
How can I use try-except blocks most efficiently to attempt to open each file in the appropriate format? I would rather not check the file's extension, as this cannot always be relied upon (and is not very EAFP).
Currently I am doing:
def data_generator(*corpora):
    def parse_lines(fobj):
        for line in fobj:
            # Do lots of processing.
            # ...
            # Many lines here omitted.
            yield ('lots', 'of', 'data')
    for corpus in corpora:
        try:
            with bz2.BZ2File(corpus, mode='r') as f:
                for data in parse_lines(f):
                    yield data
        except IOError:
            with codecs.open(corpus, encoding='utf-8') as f:
                for data in parse_lines(f):
                    yield data
I think the repeated for data in parse_lines(f): ... code looks superfluous, but I can't think of a way to get rid of it. Is there any way to reduce the above, or is there another way to "smart open" a file?
Edit: Optional followup
What would be an appropriate way to scale up the number of formats checked? As an example, the program 7zip allows you to right-click on any file and attempt to open it as an archive (any that 7zip supports). With the current try-except block strategy, it seems like you would start getting nested in blocks pretty quickly even after just a few formats, like:
try:
    f = ...
except IOError:
    try:
        f = ...
    except IOError:
        try:
            ...
If it's really just the duplicate loops that have you concerned, you could move f out of the scope of the try-except block, then put a single copy of the loop after everything's said and done:
try:
    f = bz2.BZ2File(corpus, mode='r')
except IOError:
    f = codecs.open(corpus, encoding='utf-8')
for data in parse_lines(f):
    yield data
f.close()
Although I'd look into only opening the file once, checking for the BZ2 header (the characters BZ as the first two bytes), and using that to decide whether to continue reading it as plaintext, or pass the data into a bz2.BZ2Decompressor instance.
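A minimal sketch of that header check might look like this (open_corpus is just an illustrative name; bz2 streams start with the two magic bytes 'BZ'):
import bz2
import codecs

def open_corpus(path):
    # Peek at the first two bytes without disturbing the handle we return.
    with open(path, 'rb') as probe:
        magic = probe.read(2)
    if magic == b'BZ':
        return bz2.BZ2File(path, mode='r')
    return codecs.open(path, encoding='utf-8')
The generator body then needs only a single loop: for data in parse_lines(open_corpus(corpus)): yield data.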
Here is my code:
# header.py
def add_header(filename):
    header = '"""\nName of Project"""'
    try:
        f = open(filename, 'w')
    except IOError:
        print "Sorry could not open file, please check path"
    else:
        with f:
            f.seek(0, 0)
            f.write(header)
            print "Header added to", filename

if __name__ == "__main__":
    filename = raw_input("Please provide path to file: ")
    add_header(filename)
When I run this script (by doing python header.py), even when I provide a filename which does not exist, it does not show the messages in the function. It shows nothing even when I replace the print statements with return statements. How would I get the messages in the function to appear?
I believe you are always creating the file; therefore, you won't see a file-not-found exception. It does not hurt to put a file open for writing under try/except, because you might not have the privileges to create the file.
With statements like try/except/else, I have found the Python command line an excellent place to test them out and work through cockpit error, and I'm very experienced at generating a lot of cockpit error while proving out a concept. The fact that you're using try/except is very good. I just have to go review what happens when the logic flows through one of them, and the command line is a good place to do that.
The correct course of action here is to try to read the file; if that works, keep the data, then write the file back out with the new data.
Writing to a file will create the file if it doesn't exist, and overwrite existing contents.
I'd also note you are using the with statement in an odd manner, consider:
try:
    with open(filename, 'w') as f:
        f.seek(0, 0)
        f.write(header)
        print("Header added to", filename)
except IOError:
    print("Sorry could not open file, please check path")
This way is more readable.
To see how to do this the best way possible, see user1313312's answer. My method works, but isn't the best way; I'll leave it up for the explanation.
Old answer:
Now, to solve your problem, you really want to do something like this:
def add_header(filename):
    header = '"""\nName of Project"""'
    try:
        with open(filename, 'r') as f:
            data = f.read()
        with open(filename, 'w') as f:
            f.write(header + "\n" + data)
        print("Header added to " + filename)
    except IOError:
        print("Sorry could not open file, please check path")

if __name__ == "__main__":
    filename = raw_input("Please provide path to file: ")
    add_header(filename)
As we only have the choices of writing to a file (overwriting the existing contents) and appending (at the end), we need to construct a way to prepend data. We can do this by reading the contents (which handily checks that the file exists at the same time) and then writing the header followed by the contents (here I added a newline for readability).
This is a slightly modified version of Lattyware's solution. Since it is not possible to prepend data to the beginning of a file, the whole content is read and the file is written anew, including your header. By opening the file in read/write mode, we can do both operations with the same file handle without releasing it. This should provide some protection against race conditions.
try:
    with open(filename, 'r+') as f:
        data = f.read()
        f.seek(0, 0)
        f.write(header)
        f.write(data)
        # f.truncate() is not needed here, as the file will always grow
        print("Header added to", filename)
except IOError:
    print("Sorry, could not open file for reading/writing")
This script opens the file in "w" mode (write mode), which means that if the file does not exist it will be created. So no IOError.
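You can verify this at the interpreter with a throwaway name (no_such_file.txt is just an example):
try:
    open('no_such_file.txt', 'r')    # 'r' needs the file to exist...
except IOError as e:
    print(e)                         # ...so this raises "No such file or directory"
f = open('no_such_file.txt', 'w')    # 'w' silently creates the missing file
f.close()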