Reading file stops at halfway - python

I am writing a DirectX importer for Blender so that it can open ASCII .x files, and I am trying to write a good Python script for it. I am pretty new to Python (I have only just started) but have had good results so far, except for one strange problem: my .x file is pretty large, exactly 3 263 453 bytes long. I will not post my whole code here, just a reduced version in the console that still shows the problem.
>>> teszt = open('d:\DRA_ACTORDEF_T0.x','rt')
>>> teszt
<_io.TextIOWrapper name='d:\\DRA_ACTORDEF_T0.x' mode='rt' encoding='cp1250'>
then I read the file:
>>> t2 = teszt.readlines()
>>> len(t2)
39768
but then again, when I verify:
>>> import os
>>> os.fstat(teszt.fileno()).st_size
3263453
Could someone lend me a hand and tell me what the problem is? Maybe I have to set a buffer size or something similar? I have no idea how this works in Python.
I open the file same way as above, and I use .readline().
Thank you very much.
EDIT:
The code simplified. I need .readline().
fajlnev = 'd:\DRA_ACTORDEF_T0.x'
import bpy
import os
fajl = open(fajlnev, 'rt')
fajl_teljes_merete = os.fstat(fajl.fileno()).st_size
while (fajl.tell() < fajl_teljes_merete):
    print(fajl.tell(),fajl.readline())

readlines returns a list of lines, so len(t2) returns the number of lines in the file, not the length of the file in bytes.
If you want the numbers to match you should do:
with open('your_file', 'rb') as f:
    data = f.read()
print(len(data))
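To make the difference concrete, here is a small sketch that reports three quantities for one file: the number of lines, the number of decoded characters, and the size in bytes. The helper name `file_stats` and the sample file are assumptions made for illustration, and the encoding has to be replaced with whatever the real file uses.

```python
import os
import tempfile

def file_stats(path, encoding):
    """Return (line_count, char_count, byte_count) for a text file."""
    # newline='' keeps '\r\n' untranslated, so characters map to bytes
    # one-to-one for a single-byte encoding such as cp1250.
    with open(path, 'r', encoding=encoding, newline='') as f:
        lines = f.readlines()            # one list entry per line
    line_count = len(lines)
    char_count = sum(len(line) for line in lines)
    byte_count = os.stat(path).st_size   # same figure os.fstat(...).st_size gives
    return line_count, char_count, byte_count

# Hypothetical sample file: three lines with Windows line endings.
with tempfile.NamedTemporaryFile('wb', suffix='.x', delete=False) as tmp:
    tmp.write(b'xof 0303txt 0032\r\nFrame Root {\r\n}\r\n')
    sample_path = tmp.name

print(file_stats(sample_path, 'cp1250'))
os.remove(sample_path)
```

The line count is always much smaller than the byte count, so 39768 lines in a file of 3263453 bytes does not by itself mean that anything was lost.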
Also, if the file's encoding differs from the default one, 'rt' mode may decode its contents (and so its newlines) incorrectly. It is much safer to state the encoding explicitly:
import io
with io.open('your_file', 'r', encoding='your_file_encoding') as f:
    lines = f.readlines()
And if you want a streaming line by line read then it's best to do:
import io
with io.open('d:\\DRA_ACTORDEF_T0.x', 'r', encoding='your_encoding') as f:
    for line in f:
        # each line keeps its own newline, so suppress print's extra one
        print(line, end='')
This will take care of streaming and not reading the whole file into memory.
If you still want to use readline:
import io
import os
filename = 'd:\\DRA_ACTORDEF_T0.x'
size = os.stat(filename).st_size
with io.open(filename, 'r', encoding='your_encoding') as f:
    while f.tell() < size:
        # Do what you want
        line = f.readline()
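In text mode, `tell()` returns an opaque position rather than a guaranteed byte offset, so comparing it with the file size is fragile. A sketch of the same loop that instead stops when `readline()` returns an empty string, which is how it signals end of file, could look like this. The function name `read_lines_with_readline` and the default encoding are assumptions, not something given in the question.

```python
def read_lines_with_readline(path, encoding='cp1250'):
    """Collect every line of a text file using readline() only."""
    lines = []
    with open(path, 'r', encoding=encoding) as f:
        while True:
            line = f.readline()
            if line == '':        # empty string means end of file
                break
            lines.append(line)    # blank lines arrive as '\n', never as ''
    return lines
```

Because a blank line is returned as '\n', the empty-string test cannot stop the loop early in the middle of the file.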

Related

Write list of bytes into a file, but some records got lost

I am new to programming and got an issue with writing bytes. Here is what I wrote:
file = open('filePath/input.train', 'wb')
for i in range(len(myList)):
    file.write(bytes((myList[i]),'UTF-8'));
If I print 'i' here, it is 629.
The '.train' suffix is required by the project. In order to check it, I read it and write to a txt file:
file = open('filePath/input.train', 'rb')
content = file.read()
testFile = open('filePath/test.txt', 'wb')
testFile.write(content)
Now, the problem is, len(list) = 629 while I got 591 lines in test.txt file. It brought me problems later.
Why did this happen and how should I solve it?
First, when you open a file and write to it, remember to close it afterwards. Until the file is closed (or flushed), some of the written data may still sit in a buffer rather than on disk. Like this:
file = open('filePath/input.train', 'wb')
for i in range(len(myList)):
    file.write(bytes((myList[i]),'UTF-8'));
file.close()
Second, Python does not need a ";" at the end of a statement.
Third, file is the name of a built-in in Python 2 (it is not a keyword), so it is better not to use it as a variable name. Use f, my_file or anything else that does not shadow a built-in.
Fourth, Python can iterate over a list directly, which is cleaner than for i in range(len(xxx)).
Putting all of this together, your code can look like this:
f = open('filePath/input.train', 'wb')
for line in myList:
    f.write(bytes(line, 'UTF-8'))
f.close()
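The same loop can also be written with a `with` block, which closes the file even if a write fails. The sketch below additionally ends each record with a newline, on the assumption (not stated in the question) that every list item is meant to become exactly one line. `write_records` and `count_lines` are hypothetical helper names.

```python
def write_records(path, records):
    """Write each string in records as one UTF-8 encoded line."""
    with open(path, 'wb') as f:
        for record in records:
            # rstrip avoids doubling a newline the record already carries
            f.write(record.rstrip('\n').encode('utf-8') + b'\n')

def count_lines(path):
    """Count the lines in a file by iterating over it in binary mode."""
    with open(path, 'rb') as f:
        return sum(1 for _ in f)
```

Comparing `count_lines(path)` with `len(myList)` after writing is a quick way to check whether records were merged because some of them lacked a line terminator.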

How to delete a specific line by line number in a file?

I'm trying to write a simple Python script that always deletes line number 5 in a text file and replaces it with another string, always at line 5. I looked around but couldn't find a solution. Can anyone tell me the correct way to do that? Here is what I have so far:
#!/usr/bin/env python3
import os
import sys
import fileinput
f = open('prova.js', 'r')
filedata = f.read()
f.close()
newdata = "mynewstring"
f = open('prova.js', 'w')
f.write(newdata, 5)  # this is the part that does not work: write() takes no line number
f.close()
basically I need to add newdata at line 5.
One possible simple solution to remove/replace 5th line of file. This solution should be fine as long as the file is not too large:
fn = 'prova.js'
newdata = "mynewstring"
with open(fn, 'r') as f:
    lines = f.read().split('\n')
# to delete the line use "del lines[4]"
# to replace the line:
lines[4] = newdata
with open(fn, 'w') as f:
    f.write('\n'.join(lines))
I will try to point you in the right direction without giving you the answer directly. As you said in your comment you know how to open a file. So after you open a file you might want to split the data by the newlines (hint: .split("\n")). Now you have a list of each line from the file. Now you can use list methods to change the 5th item in the list (hint: change the item at list[4]). Then you can convert the list into a string and put the newlines back (hint: "\n".join(list)). Then write that string to the file which you know how to do. Now, see if you can write the code yourself. Have fun!
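For readers who want to compare their attempt against something, the hints above can be sketched roughly as follows. It uses `readlines()` instead of `.split("\n")` so that a trailing newline does not produce a phantom empty last line. The function name `replace_line` and its 1-based `line_number` parameter are assumptions made for this example.

```python
def replace_line(path, line_number, new_text):
    """Replace the 1-based line_number of a text file with new_text."""
    with open(path, 'r') as f:
        lines = f.readlines()              # each entry keeps its '\n'
    if not 1 <= line_number <= len(lines):
        raise IndexError('file has no line %d' % line_number)
    index = line_number - 1                # list indices start at 0
    # Keep the original terminator (the last line may not have one).
    ending = '\n' if lines[index].endswith('\n') else ''
    lines[index] = new_text + ending
    with open(path, 'w') as f:
        f.write(''.join(lines))
```

With the file from the question this would be called as `replace_line('prova.js', 5, 'mynewstring')`. Like the other solution, it reads the whole file into memory, so it suits small files.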

Python failing to read lines properly

I'm supposed to open a file, read it line per line and display the lines out.
Here's the code:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import os
import re
in_path = "../vas_output/Glyph/20140623-FLYOUT_mins_cleaned.csv"
out_path = "../vas_gender/Glyph/"
csv_read_line = open(in_path, "rb").read().split("\n")
line_number = 0
for line in csv_read_line:
    line_number+=1
    print str(line_number) + line
Here's the contents of the input file:
12345^67890^abcedefg
random^test^subject
this^sucks^crap
And here's the result:
this^sucks^crapjectfg
Some weird combo of all three. In addition to this, the result of line_number is missing. Printing out the result of len(csv_read_line) outputs 1, for some reason, no matter how many is in the input file. Changing the split type from \n to ^ gives the expected output, though, so I'm assuming the problem is probably with the input file.
I'm using a Mac, and did both the python code and the input file (on Sublime Text) on the Mac itself.
Am I missing something?
You seem to be splitting on "\n" which isn't necessary, and could be incorrect depending on the line terminators used in the input file. Python includes functionality to iterate over the lines of a file one at a time. The advantages are that it will worry about processing line terminators in a portable way, as well as not requiring the entire file to be held in memory at once.
Further, note that you are opening the file in binary mode (the b character in your mode string) when you actually intend to read the file as text. This can cause problems similar to the one you are experiencing.
Also, you do not close the file when you are done with it. In this case that isn't a problem, but you should get in the habit of using with blocks when possible to make sure the file gets closed at the earliest possible time.
Try this:
with open(in_path, "r") as f:
    line_number = 0
    for line in f:
        line_number += 1
        print str(line_number) + line.rstrip('\r\n')
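One way to check the line-terminator theory is to look at the raw bytes. The sketch below is written for Python 3 (unlike the Python 2 snippets in this thread) and assumes, purely for illustration, that the file was saved with bare carriage returns. In that case splitting on "\n" yields a single element, while `splitlines()` recognises "\r", "\n" and "\r\n" alike.

```python
# Hypothetical file contents, assumed to use bare '\r' line endings.
raw = b"12345^67890^abcedefg\rrandom^test^subject\rthis^sucks^crap"

print(len(raw.split(b"\n")))   # number of pieces when splitting on "\n"
print(repr(raw[:25]))          # repr() makes hidden '\r' characters visible

# splitlines() splits on '\r', '\n' and '\r\n'
for number, line in enumerate(raw.splitlines(), start=1):
    print(number, line.decode("utf-8"))
```

If the real file does contain bare "\r" characters, printing it as one string sends the cursor back to the start of the line each time, which would explain output where later lines overwrite earlier ones.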
Your example works as-is for me.
But I only copied your text into a text editor on Linux and ran it from there, so any carriage returns will have been lost.
Try this code though:
in_path = "input.txt"
# "rU" turns on universal newline handling in Python 2
with open(in_path, "rU") as inputFile:
    for lineNumber, line in enumerate(inputFile):
        print lineNumber, line.strip()
It's a little cleaner, and the for line in file style deals with line breaks for you in a system-independent way, provided the file is opened with universal newline support (mode "rU" in Python 2; it is the default for text mode in Python 3).
I'd try the following Pythonic code:
#!/usr/bin/env python
in_path = "../vas_output/Glyph/20140623-FLYOUT_mins_cleaned.csv"
out_path = "../vas_gender/Glyph/"
with open(in_path, 'rb') as f:
    for i, line in enumerate(f):
        print(str(i) + line)
There are several improvements that can be made here to make it more idiomatic python.
import csv
in_path = "../vas_output/Glyph/20140623-FLYOUT_mins_cleaned.csv"
out_path = "../vas_gender/Glyph/"
# Let's open the file and make sure that it closes when we unindent
with open(in_path,"rb") as input_file:
    # Create a csv reader object that will parse the input for us
    reader = csv.reader(input_file,delimiter="^")
    # Enumerate over the rows (these will be lists of strings) and keep track
    # of the line number using Python's built-in enumerate function
    for line_num, row in enumerate(reader):
        # You can process whatever you would like here. But for now we will just
        # print out what you were originally printing
        print str(line_num) + "^".join(row)

Why can't I read file after my for loops python using ubuntu

For some reason, after my for loops I am not able to read the output text file.
For example :
for line in a:
    name = (x)
f = open('name','w')
for line in b:
    get = (something)
    f.write(get)
for line in c:
    get2 = (something2)
    f.write(get2)
(the below works if the above is commented out only)
f1 = open(name, 'r')
for line in f1:
    print line
If I comment out the loops, I am able to read the file and print the contents.
I am very new to coding and guessing this is something obvious that I am missing. However, I can't seem to figure it out. I have used Google, but the more I read the more I feel I am missing something. Any advice is appreciated.
@bernie is right in his comment above. The problem is that when you do open(..., 'w'), the file is rewritten to be blank, but Python/the OS doesn't actually write out the things you write to disk until its buffer is filled up or you call close(). (This delay helps speed things up, because writing to disk is slow.) You can also call flush() to force this without closing the file.
The with statement bernie referred to would look like this:
with open('name', 'w') as f:
    for line in b:
        f.write(...)
    for line in c:
        f.write(...)
# f is closed now that we're leaving the with block
# and any writes are actually written out to the file
with open('name', 'r') as f:
    for line in f:
        print line
If you're using Python 2.5 rather than 2.6 or 2.7, you'll have to do from __future__ import with_statement at the top of your file.
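The buffering effect described above can be observed directly. The following is a small Python 3 sketch (the thread itself uses Python 2), and the temporary file path is an assumption made for the demonstration. It reads the file through a second handle once before and once after `flush()`.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'name.txt')

f = open(path, 'w')
f.write('first line\n')           # held in the write buffer for now

with open(path, 'r') as reader:
    before_flush = reader.read()  # usually empty: nothing has reached the file yet

f.flush()                         # push the buffered text out to the OS
with open(path, 'r') as reader:
    after_flush = reader.read()   # now the text is visible to other readers

f.close()
print(repr(before_flush), repr(after_flush))
```

Closing the file, whether explicitly or by leaving a `with` block, flushes in the same way, which is why reading the file after the `with` block works.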

Prepend a line to an existing file in Python

I need to add a single line to the first line of a text file and it looks like the only options available to me are more lines of code than I would expect from python. Something like this:
f = open('filename','r')
temp = f.read()
f.close()
f = open('filename', 'w')
f.write("#testfirstline")
f.write(temp)
f.close()
Is there no easier way? Additionally, I see this two-handle example more often than opening a single handle for reading and writing ('r+') - why is that?
Python makes a lot of things easy and contains libraries and wrappers for a lot of common operations, but the goal is not to hide fundamental truths.
The fundamental truth you are encountering here is that you generally can't prepend data to an existing flat structure without rewriting the entire structure. This is true regardless of language.
There are ways to save a filehandle or make your code less readable, many of which are provided in other answers, but none change the fundamental operation: You must read in the existing file, then write out the data you want to prepend, followed by the existing data you read in.
By all means save yourself the filehandle, but don't go looking to pack this operation into as few lines of code as possible. In fact, never go looking for the fewest lines of code -- that's obfuscation, not programming.
I would stick with separate reads and writes, but we certainly can express each more concisely:
Python2:
with file('filename', 'r') as original: data = original.read()
with file('filename', 'w') as modified: modified.write("new first line\n" + data)
Python3:
with open('filename', 'r') as original: data = original.read()
with open('filename', 'w') as modified: modified.write("new first line\n" + data)
Note: file() function is not available in python3.
Other approach:
with open("infile") as f1:
    with open("outfile", "w") as f2:
        # end the new line with "\n" so it does not merge with the old first line
        f2.write("#test firstline\n")
        for line in f1:
            f2.write(line)
or a one liner:
open("outfile", "w").write("#test firstline\n" + open("infile").read())
Thanks for the opportunity to think about this problem :)
Cheers
with open("file", "r+") as f: s = f.read(); f.seek(0); f.write("prepend\n" + s)
You can save one write call with this:
f.write('#testfirstline\n' + temp)
When using 'r+', you would have to rewind the file after reading and before writing.
Here's a 3 liner that I think is clear and flexible. It uses the list.insert function, so if you truly want to prepend to the file use l.insert(0, 'insert_str'). When I actually did this for a Python Module I am developing, I used l.insert(1, 'insert_str') because I wanted to skip the '# -- coding: utf-8 --' string at line 0. Here is the code.
f = open(file_path, 'r'); s = f.read(); f.close()
l = s.splitlines(); l.insert(0, 'insert_str'); s = '\n'.join(l)
f = open(file_path, 'w'); f.write(s); f.close()
This does the job without reading the whole file into memory, though it may not work on Windows:
import os
import shutil
def prepend_line(path, line):
    with open(path, 'r') as old:
        # remove the name while the old handle stays open (POSIX behaviour)
        os.unlink(path)
        with open(path, 'w') as new:
            new.write(str(line) + "\n")
            shutil.copyfileobj(old, new)
One possibility is the following:
import os
open('tempfile', 'w').write('#testfirstline\n' + open('filename', 'r').read())
os.rename('tempfile', 'filename')
If you wish to prepend in the file after a specific text then you can use the function below.
def prepend_text(file, text, after=None):
    ''' Prepend file with given raw text '''
    f_read = open(file, 'r')
    buff = f_read.read()
    f_read.close()
    inject_pos = 0
    if after:
        found = buff.find(after)
        if found != -1:  # if the pattern is absent, insert at the start
            inject_pos = found + len(after)
    f_write = open(file, 'w')
    f_write.write(buff[:inject_pos] + text + buff[inject_pos:])
    f_write.close()
So first you open the file, read it and save it all into one string.
Then we try to find the character number in the string where the injection will happen. Then with a single write and some smart indexing of the string we can rewrite the whole file including the injected text now.
Am I not seeing something or couldn't we just use a buffer large-enough to read-in the input file in parts (instead of the whole content) and with this buffer traverse the file while it is open and keep exchanging file<->buffer contents?
This seems much more efficient (for big files especially) than reading the whole content in memory, modifying it in memory and writing it back to the same file or (even worse) a different one. Sorry that now I don't have time to implement a sample snippet, I'll get back to this later, but maybe you get the idea.
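A related sketch that avoids holding the whole file in memory is shown below. Note that it is not the in-place buffer exchange described above: it swaps in a simpler technique, streaming the old contents in chunks into a temporary file and then renaming that file over the original. The function name `prepend_line_streaming` and the `chunk_size` default are assumptions.

```python
import os
import shutil
import tempfile

def prepend_line_streaming(path, line, chunk_size=64 * 1024):
    """Prepend line to the file at path, copying the rest in chunks."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the target so os.replace stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, 'wb') as dst, open(path, 'rb') as src:
            dst.write(line.encode('utf-8') + b'\n')
            shutil.copyfileobj(src, dst, chunk_size)  # at most chunk_size bytes in memory
        os.replace(tmp_path, path)                    # swap the new file into place
    except BaseException:
        os.remove(tmp_path)                           # do not leave a stray temp file
        raise
```

The trade-off is temporary disk space equal to the file size, and the replacement file gets the temporary file's permissions rather than the original's.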
As I suggested in this answer, you can do it using the following:
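import fileinput
from pathlib import Path
from typing import Union
def prepend_text(filename: Union[str, Path], text: str):
    # with inplace=True, print() writes into the file instead of the console
    with fileinput.input(filename, inplace=True) as file:
        for line in file:
            if file.isfirstline():
                print(text)
            print(line, end="")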
If you rewrite it like this:
with open('filename') as f:
    read_data = f.read()
with open('filename', 'w') as f:
    f.write("#testfirstline\n" + read_data)
It's rather short and simple.
For 'r+' the file needs to exist already.
This worked for me:
def prepend(text, path):
    with open(path, "r") as fr:
        content = fr.read()
    # the with block closes the file, so no explicit close() is needed
    with open(path, "w") as fw:
        fw.write(text + content)
