Deleting the first line of a text file in python [duplicate] - python

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Editing specific line in text file in python
I am writing a software that allows users to write data into a text file. However, I am not sure how to delete the first line of the text file and rewrite the line. I want the user to be able to update the text file's first line by clicking on a button and inputing in something but that requires deleting and writing a new line as the first line which I am not sure how to implement. Any help would be appreciated.
Edit:
So I sought out the first line of the file and tried to write another line but that doesn't delete the previous line.
file.seek(0)
file.write("This is the new first line \n")

You did not describe how you opened the file to begin with. If you used file = open(somename, "a") that file will not be truncated but new data is written at the end (even after a seek on most if not all modern systems). You would have to open the file with "r+")
But your example assumes that the line you write is exactly the same length as what the user typed. There is no line organisation in the files, just bytes, some of which indicate line ending.
Wat you need to do is use a temporary file or a temporary buffer in memory for all the lines and then write the lines out with the first replaced.
If things fit in memory (which I assume since few users are going to type so much it does not fit), you should be able to do:
lines = open(somename, 'r').readlines()
lines[0] = "This is the new first line \n"
file = open(somename, 'w')
for line in lines:
file.write(line)
file.close()

You could use readlines to get an array of lines and then use del on the first index of the array. This might help. http://www.daniweb.com/software-development/python/threads/68765/how-to-remove-a-number-of-lines-from-a-text-file-

Related

Python script not finding value in log file when the value is in the file

The code below is meant to find any xls or csv file used in a process. The .log file contains full paths with extensions and definitely contains multiple values with "xls" or "csv". However, Python can't find anything...Any idea? The weird thing is when I copy the content of the log file and paste it to another notepad file and save it as log, it works then...
infile=r"C:\Users\me\Desktop\test.log"
important=[]
keep_words=["xls","csv"]
with open(infile,'r') as f:
for line in f:
for word in keep_words:
if word in line:
important.append(line)
print(important)
I was able to figure it out...encoding issue...
with io.open(infile,encoding='utf16') as f:
You must change the line
for line in f:
to
for line in f.readlines():
You made the python search in the bytes opened file, not in his content, even in his lines (in a list, just like the readlines method);
I hope I was able to help (sorry about my bad English).

How does readline() work behind the scenes when reading a text file?

I would like to understand how readline() takes in a single line from a text file. The specific details I would like to know about, with respect to how the compiler interprets the Python language and how this is handled by the CPU, are:
How does the readline() know which line of text to read, given that successive calls to readline() read the text line by line?
Is there a way to start reading a line of text from the middle of a text? How would this work with respect to the CPU?
I am a "beginner" (I have about 4 years of "simpler" programming experience), so I wouldn't be able to understand technical details, but feel free to expand if it could help others understand!
Example using the file file.txt:
fake file
with some text
in a few lines
Question 1: How does the readline() know which line of text to read, given that successive calls to readline() read the text line by line?
When you open a file in python, it creates a file object. File objects act as file descriptors, which means at any one point in time, they point to a specific place in the file. When you first open the file, that pointer is at the beginning of the file. When you call readline(), it moves the pointer forward to the character just after the next newline it reads.
Calling the tell() function of a file object returns the location the file descriptor is currently pointing to.
with open('file.txt', 'r') as fd:
print fd.tell()
fd.readline()
print fd.tell()
# output:
0
10
# Or 11, depending on the line separators in the file
Question 2: Is there a way to start reading a line of text from the middle of a text? How would this work with respect to the CPU?
First off, reading a file doesn't really have anything to do with the CPU. It has to do with the operating system and the file system. Both of those determine how files can be read and written to. Barebones explanation of files
For random access in files, you can use the mmap module of python. The Python Module of the Week site has a great tutorial.
Example, jumping to the 2nd line in the example file and reading until the end:
import mmap
import contextlib
with open('file.txt', 'r') as fd:
with contextlib.closing(mmap.mmap(fd.fileno(), 0, access=mmap.ACCESS_READ)) as mm:
print mm[10:]
# output:
with some text
in a few lines
This is a very broad question and it's unlikely that all details about what the CPU does would fit in an answer. But a high-level answer is possible:
readline reads each line in order. It starts by reading chunks of the file from the beginning. When it encounters a line break, it returns that line. Each successive invocation of readline returns the next line until the last line has been read. Then it returns an empty string.
with open("myfile.txt") as f:
while True:
line = f.readline()
if not line:
break
# do something with the line
Readline uses operating system calls under the hood. The file object corresponds to a file descriptor in the OS, and it has a pointer that keeps track of where in the file we are at the moment. The next read will return the next chunk of data from the file from that point on.
You would have to scan through the file first in order to know how many lines there are, and then use some way of starting from the "middle" line. If you meant some arbitrary line except the first and last lines, you would have to scan the file from the beginning identifying lines (for example, you could repeatedly call readline, throwing away the result), until you have reached the line you want). There is a ready-made module for this: linecache.
import linecache
linecache.getline("myfile.txt", 5) # we already know we want line 5

How to delete a line from a file in Python

I tried to use this: Deleting a line from a file in Python. I changed the code a bit to suit my purposes, like so:
def deleteLine():
f=open("cata.testprog","w+")
content=f.readlines()
outp = []
print(f.readline())
for lines in f:
print(lines)
if(lines[:4]!="0002"):
#if the first four characters of a line do not equal
outp.append(lines)
print(outp)
f.writelines(outp)
f.close()
The problem is that the entire file gets replaced by an empty string and outp is equal to an empty list. My file is not empty, and I want to delete the line that begins with "0002". Printing f.readline() gave me an empty string. Does anyone have any idea of how to fix it?
Thanks in advance,
You changed the file handling completely compared to the linked question. There, the file is first opened to read the content, closed and then opened to write the processed content. Here you open it and you first use f.readlines(), so all the data of the file is in content and the file pointer is at the end of the file.
You have opened file in wrong mode: "w+". This mode is used for writing and reading, but at first it is truncated. At first open file for reading. If it is not huge file read it into memory, convert there and save it by opening file in write mode.

Modifying a single line in a file [duplicate]

This question already has answers here:
Editing specific line in text file in Python
(11 answers)
Closed 3 months ago.
Is there a way, in Python, to modify a single line in a file without a for loop looping through all the lines?
The exact positions within the file that need to be modified are unknown.
This should work -
f = open(r'full_path_to_your_file', 'r') # pass an appropriate path of the required file
lines = f.readlines()
lines[n-1] = "your new text for this line" # n is the line number you want to edit; subtract 1 as indexing of list starts from 0
f.close() # close the file and reopen in write mode to enable writing to file; you can also open in append mode and use "seek", but you will have some unwanted old data if the new data is shorter in length.
f = open(r'full_path_to_your_file', 'w')
f.writelines(lines)
# do the remaining operations on the file
f.close()
However, this can be resource consuming (both time and memory) if your file size is too large, because the f.readlines() function loads the entire file, split into lines, in a list.
This will be just fine for small and medium sized files.
Unless we're talking about a fairly contrived situation in which you already know a lot about the file, the answer is no. You have to iterate over the file to determine where the newline characters are; there's nothing special about a "line" when it comes to file storage -- it all looks the same.
Yes, you can modify the line in place, but if the length changes, you will have to rewrite the remainder of the file.
You'll also need to know where the line is, in the file. This usually means the program needs to at least read through the file up to the line that needs to be changed.
There are exceptions - if the lines are all fixed length, or you have some sort of index on the file for example

question about splitting a large file

Hey I need to split a large file in python into smaller files that contain only specific lines. How do I do this?
You're probably going to want to do something like this:
big_file = open('big_file', 'r')
small_file1 = open('small_file1', 'w')
small_file2 = open('small_file2', 'w')
for line in big_file:
if 'Charlie' in line: small_file1.write(line)
if 'Mark' in line: small_file2.write(line)
big_file.close()
small_file1.close()
small_file2.close()
Opening a file for reading returns an object that allows you to iterate over the lines. You can then check each line (which is just a string of whatever that line contains) for whatever condition you want, then write it to the appropriate file that you opened for writing. It is worth noting that when you open a file with 'w' it will overwrite anything already written to that file. If you want to simply add to the end, you should open it with 'a', to append.
Additionally, if you expect there to be some possibility of error in your reading/writing code, and want to make sure the files are closed, you can use:
with open('big_file', 'r') as big_file:
<do stuff prone to error>
Do you mean breaking it down into subsections? Like if I had a file with chapter 1, chapter 2, and chapter 3, you want it to be broken down into separate files for each chapter?
The way I've done this is similar to Wilduck's response, but closes the input file as soon as it reads in the data and keeps all the lines read in.
data_file = open('large_file_name', 'r')
lines = data_file.readlines()
data_file.close()
outputFile = open('output_file_one', 'w')
for line in lines:
if 'SomeName' in line:
outputFile.write(line)
outputFile.close()
If you wanted to have more than one output file you could either add more loops or open more than one outputFile at a time.
I'd recommend using Wilducks response, however, as it uses less space and will take less time with larger files since the file is read only once.
How big and does it need to be done in python? If this is on unix, would split/csplit/grep suffice?
First, open the big file for reading.
Second, open all the smaller file names for writing.
Third, iterate through every line. Every iteration, check to see what kind of line it is, then write it to that file.
More info on File I/O: http://docs.python.org/tutorial/inputoutput.html

Categories

Resources