I have use the following code to read a .txt file:
f = os.open(os.path.join(self.dirname, self.filename), os.O_RDONLY)
And when I want to output the content I use this:
os.read(f, 10);
Which means that this method reads 10 bytes from the beginning of the file on. While I need to read the content as much as it is, using some values such as -1 and so. What should I do?
You have two options:
Call os.read() repeatedly.
Open the file using the open() built-in (as opposed to os.open()), and just call f.read() with no arguments.
The second approach carries certain risk, in that you might run into memory issues if the file is very large.
Related
I am attempting to output a new txt file but it come up blank. I am doing this
my_file = open("something.txt","w")
#and then
my_file.write("hello")
Right after this line it just says 5 and then no text comes up in the file
What am I doing wrong?
You must close the file before the write is flushed. If I open an interpreter and then enter:
my_file = open('something.txt', 'w')
my_file.write('hello')
and then open the file in a text program, there is no text.
If I then issue:
my_file.close()
Voila! Text!
If you just want to flush once and keep writing, you can do that too:
my_file.flush()
my_file.write('\nhello again') # file still says 'hello'
my_file.flush() # now it says 'hello again' on the next line
By the way, if you happen to read the beautiful, wonderful documentation for file.write, which is only 2 lines long, you would have your answer (emphasis mine):
Write a string to the file. There is no return value. Due to buffering, the string may not actually show up in the file until the flush() or close() method is called.
If you don't want to care about closing file, use with:
with open("something.txt","w") as f:
f.write('hello')
Then python will take care of closing the file for you automatically.
As Two-Bit Alchemist pointed out, the file has to be closed. The python file writer uses a buffer (BufferedIOBase I think), meaning it collects a certain number of bytes before writing them to disk in bulk. This is done to save overhead when a lot of write operations are performed on a single file.
Also: When working with files, try using a with-environment to make sure your file is closed after you are done writing/reading:
with open("somefile.txt", "w") as myfile:
myfile.write("42")
# when you reach this point, i.e. leave the with-environment,
# the file is closed automatically.
The python file writer uses a buffer (BufferedIOBase I think), meaning
it collects a certain number of bytes before writing them to disk in
bulk. This is done to save overhead when a lot of write operations are
performed on a single file. Ref #m00am
Your code is also okk. Just add a statement for close file, then work correctly.
my_file = open("fin.txt","w")
#and then
my_file.write("hello")
my_file.close()
In python's OS module there is a method to open a file and a method to read a file.
The docs for the open method say:
Open the file file and set various flags according to flags and
possibly its mode according to mode. The default mode is 0777 (octal),
and the current umask value is first masked out. Return the file
descriptor for the newly opened file.
The docs for the read method say;
Read at most n bytes from file descriptor fd. Return a string
containing the bytes read. If the end of the file referred to by fd
has been reached, an empty string is returned.
I understand what it means to read n bytes from a file. But how does this differ from open?
"Opening" a file doesn't actually bring any of the data from the file into your program. It just prepares the file for reading (or writing), so when your program is ready to read the contents of the file it can do so right away.
Opening a file allows you to read or write to it (depending on the flag you pass as the second argument), whereas reading it actually pulls the data from a file that is typcially saved into a variable for processing or printed as output.
You do not always read from a file once it is opened. Opening also allows you to write to a file, either by overwriting all the contents or appending to the contents.
To read from a file:
>>> myfile = open('foo.txt', 'r')
>>> myfile.read()
First you open the file with read permission (r)
Then you read() from the file
To write to a file:
>>> myfile = open('foo.txt', 'r')
>>> myfile.write('I am writing to foo.txt')
The only thing that is being done in line 1 of each of these examples is opening the file. It is not until we actually read() from the file that anything is changed
open gets you a fd (file descriptor), you can read from that fd later.
One may also open a file for other purpose, say write to a file.
It seems to me you can read lines from the file handle without invoking the read method but I guess read() truly puts the data in the variable location. In my course we seem to be printing lines, counting lines, and adding numbers from lines without using read().
The rstrip() method needs to be used, however, because printing the line from the file handle using a for in statement also prints the invisible line break symbol at the end of the line, as does the print statement.
From Python for Everybody by Charles Severance, this is the starter code.
"""
7.2
Write a program that prompts for a file name,
then opens that file and reads through the file,
looking for lines of the form:
X-DSPAM-Confidence: 0.8475
Count these lines and extract the floating point
values from each of the lines and compute the
average of those values and produce an output as
shown below. Do not use the sum() function or a
variable named sum in your solution.
You can download the sample data at
http://www.py4e.com/code3/mbox-short.txt when you
are testing below enter mbox-short.txt as the file name.
"""
# Use the file name mbox-short.txt as the file name
fname = input("Enter file name: ")
fh = open(fname)
for line in fh:
if not line.startswith("X-DSPAM-Confidence:") :
continue
print(line)
print("Done")
I used to read files like this:
f = [i.strip("\n") for i in open("filename.txt")]
which works just fine. I prefer this way because it is cleaner and shorter than traditional file reading code samples available on the web (e.g. f = open(...) , for line in f.readlines() , f.close()).
However, I wonder if there can be any drawback for reading files like this, e.g. since I don't close the file, does Python interpreter handles this itself? Is there anything I should be careful of using this approach?
This is the recommended way:
with open("filename.txt") as f:
lines = [line.strip("\n") for line in f]
The other way may not close the input file for a long time. This may not matter for your application.
The with statement takes care of closing the file for you. In CPython, just letting the file handle object be garbage-collected should close the file for you, but in other flavors of Python (Jython, IronPython, PyPy) you definitely can't count on this. Also, the with statement makes your intentions very clear, and conforms with common practice.
From the docs:
When you’re done with a file, call f.close() to close it and free up any system resources taken up by the open file.
You should always close a file after working with it. Python will not automatically do it for you. If you want a cleaner and shorter way, use a with statement:
with open("filename.txt") as myfile:
lines = [i.strip("\n") for i in myfile]
This has two advantages:
It automatically closes the file after the with block
If an exception is raised, the file is closed regardless.
It might be fine in a limited number of cases, e.g. a temporary test.
Python will only close the file handle after it finishes the execution.
Therefore this approach is a no-go for a proper application.
When we write onto a file using any of the write functions. Python holds everything to write in the file in a buffer and pushes it onto the actual file on the storage device either at the end of the python file or if it encounters a close() function.
So if the file terminates in between then the data is not stored in the file. So I would suggest two options:
use with because as soon as you get out of the block or encounter any exception it closes the file,
with open(filename , file_mode) as file_object:
do the file manipulations........
or you can use the flush() function if you want to force python to write contents of buffer onto storage without closing the file.
file_object.flush()
For Reference: https://lerner.co.il/2015/01/18/dont-use-python-close-files-answer-depends/
in a py module, I write:
outFile = open(fileName, mode='w')
if A:
outFile.write(...)
if B:
outFile.write(...)
and in these lines, I didn't use flush or close method.
Then after these lines, I want to check whether this "outFile" object is empty or not. How can I do with it?
There are a few problems with your code.
You can't .write to a file that you opened with 'r'. You need to open(fileName, 'w').
If A or B then you've certainly written to the file, so it's not empty!
Barring those. you can get the length of a file with
os.stat(outFile.fileno())
EDIT: I'll explain what flush does. Python is often used to do quite large amounts of file reads and writes, which can be slow. It is thus tweaked to make them as fast as possible. One way that is does so is to "buffer" such writes and then do them all in one big block: when you write a small string, Python will remember it but won't actually write it to the file until it thinks it should.
This means that if you want to tell whether you have written data to the file by inspecting the file, you have to tell Python to write all the data it's remembering first, or else you might not see it. flush is the command to write all the buffered data.
Of course, if you ask Python whether it's written anything to the file, say by inspecting the position in the file (.tell()), then it will know about the buffering.
If you've already written to the file, you can use .tell() to check if the current file position is nonzero:
>>> handle = open('/tmp/file.txt', 'w')
>>> handle.write('foo')
>>> handle.tell()
3
This won't work if you .seek() back to the beginning of the file.
You can use os.stat to get file info:
import os
fileSize = os.stat(fileName).st_size
with open("filename.txt", "r+") as f:
if f.read():
# file isn't empty
f.write("something")
# uncomment this line if you want to delete everything else in the file
# f.truncate()
else:
# file is empty
f.write("somethingelse")
"r+" mode always you to read & write.
"with" will automatically close file
Is it possible to parse a file line by line, and edit a line in-place while going through the lines?
Is it possible to parse a file line by line, and edit a line in-place while going through the lines?
It can be simulated using a backup file as stdlib's fileinput module does.
Here's an example script that removes lines that do not satisfy some_condition from files given on the command line or stdin:
#!/usr/bin/env python
# grep_some_condition.py
import fileinput
for line in fileinput.input(inplace=True, backup='.bak'):
if some_condition(line):
print line, # this goes to the current file
Example:
$ python grep_some_condition.py first_file.txt second_file.txt
On completion first_file.txt and second_file.txt files will contain only lines that satisfy some_condition() predicate.
fileinput module has very ugly API, I find beautiful module for this task - in_place, example for Python 3:
import in_place
with in_place.InPlace('data.txt') as file:
for line in file:
line = line.replace('test', 'testZ')
file.write(line)
main difference from fileinput:
Instead of hijacking sys.stdout, a new filehandle is returned for writing.
The filehandle supports all of the standard I/O methods, not just readline().
Important Notes:
This solution deletes every line in the file if you don't re-write it with the file.write() line.
Also, if the process is interrupted, you lose any line in the file that has not already been re-written.
No. You cannot safely write to a file you are also reading, as any changes you make to the file could overwrite content you have not read yet. To do it safely you'd have to read the file into a buffer, updating any lines as required, and then re-write the file.
If you're replacing byte-for-byte the content in the file (i.e. if the text you are replacing is the same length as the new string you are replacing it with), then you can get away with it, but it's a hornets nest, so I'd save yourself the hassle and just read the full file, replace content in memory (or via a temporary file), and write it out again.
If you only intend to perform localized changes that do not change the length of the part of the file that is modified (e.g. changing all characters to lower case), then you can actually overwrite the old contents of the file dynamically.
To do that, you can use random file access with the seek() method of a file object.
Alternatively, you may be able to use an mmap object to treat the whole file as a mutable string. Keep in mind that mmap objects may impose a maximum file-size limit in the 2-4 GB range on a 32-bit CPU, depending on your operating system and its configuration.
You have to back up by the size of the line in characters. Assuming you used readline, then you can get the length of the line and back up using:
file.seek(offset[, whence])
Set whence to SEEK_CUR, set offset to -length.
See Python Docs or look at the manpage for seek.