Newbie question. In Python 2.7.2, I have a problem reading text files that apparently contain some stray control characters. Specifically, the loop
for line in f:
stops without any warning or error as soon as it reaches a line containing the SUB character (ASCII hex code 0x1A). Using f.readlines() gives the same result. Essentially, as far as Python is concerned, the file ends at the first SUB character, and the last value assigned to line is the text up to that character.
Is there a way to read beyond such a character and/or to issue a warning when encountering one?
On Windows, 0x1A (Ctrl-Z) is treated as an end-of-file marker when a file is opened in text mode. You'll need to open the file in binary mode to get past it:
f = open(filename, 'rb')
The downside is you will lose the line-oriented nature and have to split the lines yourself:
lines = f.read().split('\r\n') # assuming Windows line endings
Try opening the file in binary mode:
f = open(filename, 'rb')
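A minimal sketch of the binary-mode approach, extended to issue a warning when a SUB byte is present, as the question asks. The helper name read_lines_with_sub_warning is hypothetical, and the sketch assumes the file is small enough to read in one go.

```python
import warnings

SUB = b"\x1a"  # the SUB control character (Ctrl-Z)


def read_lines_with_sub_warning(path):
    """Read a file in binary mode and return its lines as byte strings.

    Emits a warning if the file contains any SUB (0x1A) bytes.
    """
    with open(path, "rb") as f:
        data = f.read()
    count = data.count(SUB)
    if count:
        warnings.warn("%d SUB (0x1A) byte(s) found in %s" % (count, path))
    # splitlines() accepts \r\n, \n and bare \r as line endings
    return data.splitlines()
```

Because the data is read as bytes, the SUB characters stay in the returned lines; strip or replace them afterwards if they are not wanted.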
Running the code below concurrently against Azure Blob Storage throws
OS Error 22: Invalid argument, pointing to f.close().
Does calling close() inside a with open() block cause OS error 22?
I understand that close() is not required, but I want to understand the root cause of OS error 22.
with open(openfilepath, "w+") as f:
    f.write(writeFile)
    f.close()
Check the possibilities below.
1
If you want to call f.close() yourself, try doing so without with:
f = open(file_name, 'w+')  # open for writing and reading (truncates the file)
f.write('write content')
f.close()
BUT
It is good practice to use the with keyword when dealing with file objects. The advantage is that the file is properly closed after its suite finishes: once Python exits the with block, the file is closed automatically.
So please remove f.close() and try again:
with open(file_name, 'w+') as f:
    f.write('write content')
2
If the above doesn't work, check the file path. The error may be caused by invalid characters in the path, which must not contain certain special characters.
For example, check whether the file path looks like "dbfs:/mnt/data/output/file_name.xlsx".
Check whether there is a "/" before dbfs (say, /dbfs:/mnt/…). If there is, try removing it.
NOTE:
"r+": Open for reading and writing. The stream is positioned at the beginning of the file.
"w+": Open for reading and writing. The file is created if it does not exist, otherwise it is truncated. The stream is positioned at the beginning of the file.
"a+": Open for reading and writing. The file is created if it does not exist. The stream is positioned at the end of the file. Subsequent writes to the file will always end up at the then current end of file, irrespective of any intervening fseek(3) or similar.
So try another mode such as r+ if w+ is not mandatory. See the Python documentation.
Removing f.close() fixed the issue when using with.
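As a small illustration of why the explicit call is redundant, the sketch below checks f.closed after the with block. It runs against a throwaway local file, so it says nothing about the root cause of the error on Azure; write_text and demo_path are hypothetical names.

```python
import os
import tempfile


def write_text(path, text):
    """Write text to path; the with block closes the file on exit."""
    with open(path, "w+") as f:
        f.write(text)
    # No explicit f.close() is needed: the file is already closed here.
    return f.closed


# Demo on a throwaway file in a fresh temporary directory
demo_path = os.path.join(tempfile.mkdtemp(), "demo.txt")
closed_after_with = write_text(demo_path, "write content")
```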
I am new to computer science and am trying to create a function in python that will open files on my computer.
I know that the function f.readline() grabs the current line as a string, but what makes the functions f.read() and for line in f: different? Thanks.
read(x) will read up to x bytes in a file. If you don't supply the size, the entire file is read.
readline(x) will read up to x bytes or a newline, whichever comes first. If you don't supply a size, it will read all data until it hits a newline.
When using for line in f, Python calls the next() method under the hood, which does something very similar to readline (although I have seen references saying that it may buffer more efficiently, since iterating usually means you plan to read the entire file).
There is also readlines() which reads all lines into memory.
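A short sketch comparing the four ways of reading. It uses io.StringIO as an in-memory stand-in for a file opened in text mode, so the sizes here count characters; the sample text is made up for illustration.

```python
import io

text = u"first\nsecond\nthird\n"
f = io.StringIO(text)  # behaves like a file opened in text mode

whole = f.read()             # the entire contents as one string

f.seek(0)                    # rewind before reading again
one = f.readline()           # reads up to and including the first newline
partial = f.readline(3)      # reads at most 3 characters of the next line

f.seek(0)
iterated = [line for line in f]   # iteration yields one line at a time

f.seek(0)
all_lines = f.readlines()    # every line, collected into a list
```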
I imagine this is a question asked already twenty thousand times, but I cannot understand why the file is always empty. I want to open a file, remove a string from the whole file and then rewrite the content, but the file ends up being empty. This is the code I use:
f = open(filename,'w+')
f.write(f.read().replace(str_to_del,""))
f.close()
But the file is always empty. If I use "r+" instead, the content is appended and I end up with duplicated text in the file. I'm using Python 3.3. What am I missing?
Opening the file in w+ mode truncates the file. So, your f.read() is guaranteed to return nothing.
You can do this by opening the file in r+ mode, reading it, calling f.seek(0), writing, and then calling f.truncate() (the new content is shorter, so without truncating, the old tail would remain). Or by opening the file in r mode, reading it, closing it, reopening it in w mode, and writing. Or, better, by writing a temporary file and moving it over the original, which gives you "atomic" behavior: no possibility of ending up with a half-written file.
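The r+ option and the temporary-file option can be sketched roughly as follows. The function names are hypothetical, and os.replace requires Python 3.3 or later, which matches the version in the question.

```python
import os
import tempfile


def remove_in_place(path, str_to_del):
    """r+ approach: read, rewind, write, then cut off the leftover tail."""
    with open(path, "r+") as f:
        content = f.read().replace(str_to_del, "")
        f.seek(0)
        f.write(content)
        f.truncate()  # the new content is shorter, so drop the old remainder


def remove_via_temp_file(path, str_to_del):
    """Write a temporary file, then move it over the original."""
    with open(path, "r") as f:
        content = f.read().replace(str_to_del, "")
    # Create the temp file next to the original so the move stays on one filesystem
    target_dir = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=target_dir)
    with os.fdopen(fd, "w") as tmp:
        tmp.write(content)
    os.replace(tmp_path, path)  # replaces the original in a single step
```

Note that the temporary file is created with the default permissions of mkstemp, which may differ from those of the original file.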
Is there a way, in Python, to modify a single line in a file without a for loop looping through all the lines?
The exact positions within the file that need to be modified are unknown.
This should work -
f = open(r'full_path_to_your_file', 'r')  # pass the path of the file to edit
lines = f.readlines()
# n is the line number to edit; subtract 1 because list indexes start at 0.
# Keep the trailing newline so the next line is not joined onto this one.
lines[n-1] = "your new text for this line\n"
# Close the file and reopen it in write mode. Opening in r+ mode and using
# seek() also works, but leaves unwanted old data if the new data is shorter.
f.close()
f = open(r'full_path_to_your_file', 'w')
f.writelines(lines)
# do the remaining operations on the file
f.close()
However, this can be expensive in both time and memory for a very large file, because f.readlines() loads the entire file into a list of lines.
This will be just fine for small and medium-sized files.
Unless we're talking about a fairly contrived situation in which you already know a lot about the file, the answer is no. You have to iterate over the file to determine where the newline characters are; there's nothing special about a "line" when it comes to file storage -- it all looks the same.
Yes, you can modify the line in place, but if the length changes, you will have to rewrite the remainder of the file.
You'll also need to know where the line is, in the file. This usually means the program needs to at least read through the file up to the line that needs to be changed.
There are exceptions, for example if the lines are all fixed length, or if you have some sort of index on the file.
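A sketch of the fixed-length exception: when every line has the same byte length, the position of a line can be computed and overwritten directly, without reading the rest of the file. RECORD_LEN and replace_record are hypothetical, and the sketch assumes ASCII data with "\n" line endings.

```python
RECORD_LEN = 10  # assumption: 9 characters of data plus the "\n" terminator


def replace_record(path, index, new_text):
    """Overwrite line number `index` (0-based) in a fixed-width file."""
    width = RECORD_LEN - 1
    # Pad or cut the new text so the line length stays exactly the same
    payload = new_text.ljust(width)[:width] + "\n"
    with open(path, "r+b") as f:
        f.seek(index * RECORD_LEN)        # jump straight to the line
        f.write(payload.encode("ascii"))  # overwrite it in place
```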
I want to prompt a user for a number of random numbers to be generated and saved to a file. He gave us that part. The part we have to do is to open that file, convert the numbers into a list, then find the mean, standard deviation, etc. without using the easy built-in Python tools.
I've tried using open but it gives me invalid syntax (the file name I chose was "numbers" and it saved into "My Documents" automatically, so I tried open(numbers, 'r') and open(C:\name\MyDocuments\numbers, 'r') and neither one worked).
with open('C:/path/numbers.txt') as f:
    lines = f.read().splitlines()
This gives you a list of the values (as strings) in your file, with newlines stripped.
Also, watch your backslashes in Windows path names, as those are also escape characters in strings. You can use forward slashes or double backslashes instead.
Two ways to read a file into a list in Python (note these are not either/or):
use of with - supported from Python 2.5 and above
use of list comprehensions
1. use of with
This is the Pythonic way of opening and reading files.
# Sample 1 - spells out each step, but is not memory efficient
lines = []
with open(r"C:\name\MyDocuments\numbers") as file:  # raw string, so \n is not an escape
    for line in file:
        line = line.strip()  # or some other preprocessing
        lines.append(line)   # storing everything in memory!
# Sample 2 - more Pythonic and idiomatic, but still not memory efficient
with open(r"C:\name\MyDocuments\numbers") as file:
    lines = [line.strip() for line in file]
# Sample 3 - efficient memory usage. Proper usage of with and file iterators.
with open(r"C:\name\MyDocuments\numbers") as file:
    for line in file:
        line = line.strip()  # preprocess line
        # Take action on the line instead of storing it in a list:
        # more memory efficient, at the cost of execution speed.
        doSomethingWithThisLine(line)
The .strip() call removes the \n newline character (and any other surrounding whitespace) from each line. When the with block ends, the file is closed automatically, even if an exception is raised inside it.
2. use of list comprehension
This could be considered inefficient, as the file descriptor might not be closed immediately. That could become an issue when this is called inside a function that opens thousands of files.
data = [line.strip() for line in open("C:/name/MyDocuments/numbers", 'r')]
Note that file closing is implementation dependent. Normally, unused variables are garbage collected by the Python interpreter. In CPython (the regular interpreter from python.org), that happens immediately, since its garbage collector works by reference counting. In another interpreter, such as Jython or IronPython, there may be a delay.
f = open("file.txt")
lines = f.readlines()
readlines() returns a list containing one line per element. Note that these lines include the \n (newline character) at the end. You can strip it off with the strip() method, i.e. call lines[index].strip() to get the string without the newline character.
As joaquin noted, do not forget to f.close() the file.
Converting strings to integers is easy: int("12").
The Pythonic way to read a file and put every line in a list:
from __future__ import with_statement  # for Python 2.5
with open('C:/path/numbers.txt', 'r') as f:
    lines = f.readlines()
Then, assuming that each line contains a number,
numbers = [int(e.strip()) for e in lines]
You need to pass a filename string to open. There's an extra complication when the string has \ in it, because that's a special string escape character to Python. You can fix this by doubling each backslash as \\ or by putting an r in front of the string as follows: r'C:\name\MyDocuments\numbers'.
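A tiny sketch of the escape problem, using a shortened, made-up path so the effect is easy to see:

```python
plain = "C:\name"     # \n is interpreted as a newline character
raw = r"C:\name"      # raw string: the backslash is kept as-is
doubled = "C:\\name"  # doubled backslash: same result as the raw string

has_newline = "\n" in plain    # the path was silently corrupted
same_path = (raw == doubled)   # both spellings give the intended path
```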
Edit: The edits to the question make it completely different from the original, and since none of them came from the original poster I'm not sure they're warranted. However, they do point out one obvious thing that might have been overlooked, and that's how to add "My Documents" to a filename.
In an English version of Windows XP, My Documents is actually C:\Documents and Settings\name\My Documents. This means the open call should look like:
open(r"C:\Documents and Settings\name\My Documents\numbers", 'r')
I presume you're using XP because you call it My Documents - it changed in Vista and Windows 7. I don't know if there's an easy way to look this up automatically in Python.
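One possible way to avoid hard-coding the user name is os.path.expanduser, which returns the current user's home directory. The sketch below is only a starting point: the folder name "Documents" is an assumption, since the documents folder's name and location vary by Windows version and locale.

```python
import os

# The current user's home directory (on Windows, typically the profile folder)
home = os.path.expanduser("~")

# Assumption: the documents folder sits directly under the home directory
# and is called "Documents"; adjust the name for your system.
numbers_path = os.path.join(home, "Documents", "numbers")
```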
hdl = open("C:/name/MyDocuments/numbers", 'r')
milist = hdl.readlines()
hdl.close()
To summarize a bit from what people have been saying:
f=open('data.txt', 'w') # will make a new file or erase a file of that name if it is present
f=open('data.txt', 'r') # will open a file as read-only
f=open('data.txt', 'a') # will open a file for appending (appended data goes to the end of the file)
If you wish to have something in place similar to a try/finally, so the file is always closed:
with open('data.txt') as f:
    for line in f:
        print(line)
I think @movieyoda's code is probably what you should use, however.
If you have multiple numbers per line and you have multiple lines, you can read them in like this:
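#!/usr/bin/env python
from os.path import dirname, join
# Build the path relative to this script's own directory
with open(join(dirname(__file__), 'data/path/filename.txt')) as input_data:
    # One inner list of integers per line of the file
    input_list = [list(map(int, line.split())) for line in input_data]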
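Since the original question also asks for the mean and standard deviation without the built-in helpers, here is a rough sketch of that last step. The function names are hypothetical, the input is assumed to be lines of whitespace-separated numbers, and std_dev computes the population standard deviation (dividing by n rather than n - 1).

```python
def parse_numbers(lines):
    """Convert an iterable of text lines into a flat list of floats."""
    numbers = []
    for line in lines:
        for token in line.split():  # split() also discards the newline
            numbers.append(float(token))
    return numbers


def mean(values):
    """Arithmetic mean, computed with an explicit loop."""
    total = 0.0
    for v in values:
        total += v
    return total / len(values)


def std_dev(values):
    """Population standard deviation (divides by n)."""
    m = mean(values)
    squared = 0.0
    for v in values:
        squared += (v - m) ** 2
    return (squared / len(values)) ** 0.5
```

With a real file, the lines would come from the open file object, for example parse_numbers(f) inside a with open(...) as f: block.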