I'm trying to write the output of a computation that runs over three big iterations, and each time I'm opening and closing the outfile. Counters and the like get reset after each iteration, and I'm a massive newb who would struggle to work around this with the shoddy code I've written. So even if it's slower, I'd like to change the way the output is written.
Currently the output just overwrites the first line, so I only have the output of the last run of the program. (tau and output are variables given values in the iterations above in the code.)
with open(fileName + '.autocorrelate', "w") as outfile:
    outfile.writelines('{0} {1}{2}'.format(tau, output, '\n'))
I was wondering if there are any quick ways to get python to check for the first empty line when it opens a file and write the new line there?
Opening with "a" instead of "w" will write at the end of the file. That's the way to avoid overwriting.
If you open your file in append mode, "a" instead of "w", you will be able to write a new line at the end of your file.
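A minimal sketch of the append approach, assuming the results arrive in three separate passes. The file name `demo.autocorrelate`, the helper `append_result` and the sample values are hypothetical stand-ins for the asker's `fileName`, `tau` and `output`.

```python
import os
import tempfile


def append_result(path, tau, output):
    # "a" creates the file if it is missing and always writes at the end.
    with open(path, "a") as outfile:
        outfile.write("{0} {1}\n".format(tau, output))


# Hypothetical path inside a temporary directory, so the sketch leaves no clutter.
path = os.path.join(tempfile.mkdtemp(), "demo.autocorrelate")

for tau, output in [(1, 0.9), (2, 0.5), (3, 0.1)]:  # made-up values
    append_result(path, tau, output)

with open(path) as infile:
    lines = infile.read().splitlines()
print(lines)
```

Note that append mode also keeps whatever earlier runs of the program wrote, so if each run should start from an empty file, remove or truncate it once before the first iteration.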
You can do something like this to keep a reference (line number) to every empty line in a file:
# Get file contents
fd = open(file)
contents = fd.readlines()
fd.close()
empty_line = []
# find empty lines; readlines() keeps the trailing "\n", so strip before comparing
for i, line in enumerate(contents):
    if line.strip() == "":
        empty_line.append(i)
Okay, this may have been talked about before, but I am unable to find it anywhere on Stack, so here I am.
Basically I am writing a script that will take a .txt document, store every other line (the even lines, say) and print them into a new text document.
I was able to write code that scans the text, picks out the even-numbered lines and puts them into a list, but when I go to add each item of the list to the new text document, depending on where I do that I get either the first line or the last line, but never more than one.
Here is what I have:
f = open('stuffs.txt', 'r')
i = 1
x = []
for line in f.readlines():
    if i % 2 == 0:
        x.append(line)
    i += 1
I have tested that this successfully takes the proper lines and stores them in list x.
I have tried
for m in x:
    t = open('stuffs2.txt','w')
    t.write(m)
directly after, and it only prints the last line.
If I do
for line in f.readlines():
    if i % 2 == 0:
        t = open('stuffs2.txt','w')
        t.write(line)
    i += 1
it will print the first line.
If I try to add the first solution to the for loop as a nested for loop, it will also print the first line. I have no idea why it is not taking each individual item and putting it in the .txt.
When I print the list, it is in there as it should be.
Did look for a canonical - did not find one...
open('stuffs2.txt','w') - "w" == kill what's there and open a new empty file ...
Read the documentation: reading-and-writing-files :
7.2. Reading and Writing Files
open() returns a file object, and is most commonly used with two arguments: open(filename, mode).
f = open('workfile', 'w')
The first argument is a string containing the filename. The second argument is another string containing a few characters describing the way in which the file will be used. mode can be 'r' when the file will only be read, 'w' for only writing (an existing file with the same name will be erased), and 'a' opens the file for appending; any data written to the file is automatically added to the end. 'r+' opens the file for both reading and writing. The mode argument is optional; 'r' will be assumed if it’s omitted.
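The modes described in the quoted documentation can be tried side by side. This is only a sketch: the file name `workfile` comes from the docs, while the temporary directory and the strings written are assumptions made for the demonstration.

```python
import os
import tempfile

# Work in a temporary directory so nothing in the current folder is touched.
path = os.path.join(tempfile.mkdtemp(), "workfile")

with open(path, "w") as f:       # 'w': create the file, or erase an existing one
    f.write("first\n")
with open(path, "w") as f:       # 'w' again: "first" is gone
    f.write("second\n")
with open(path, "a") as f:       # 'a': data is added at the end
    f.write("third\n")
with open(path, "r+") as f:      # 'r+': read and write, nothing is erased
    before = f.read()
    f.seek(0, os.SEEK_END)       # make the write position explicit
    f.write("fourth\n")
with open(path) as f:            # mode omitted: 'r' is assumed
    after = f.read()

print(repr(before))
print(repr(after))
```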
To write every 2nd line more economically:
with open("file.txt") as f, open("target.txt","w") as t:
    write = True
    for line in f:
        if write:
            t.write(line)
        write = not write
This way you do not need to store all lines in memory.
The with open(...) as name : syntax is also better - it will close your filehandle (which you do not do) even if exceptions arise.
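To see the closing behaviour for yourself, here is a small sketch. The file name and the `ValueError` are made up for the demonstration; the point is only that the handle is closed, and the data written so far is flushed, once the `with` block is left through an exception.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "target.txt")  # hypothetical file name

try:
    with open(path, "w") as t:
        t.write("partial line\n")
        raise ValueError("simulated failure")  # made-up error for the demo
except ValueError:
    pass

# The with block closed the handle on the way out, despite the exception.
closed_after_error = t.closed

with open(path) as f:
    saved = f.read()

print(closed_after_error, repr(saved))
```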
I don't know what's wrong with the code; it doesn't give me any kind of error message from the shell.
What I'm trying to do is:
1) Merge all list files from a directory into a single list (one single column with a single string per row) - done!
2) Compare that list with a big-file and copy every single corresponding line into a new single file for each line - (maybe?) done! But not working. =/
3) Save the files from step 2 in a new output_directory - not working.
4) Remove the corresponding lines from the big-file and save it in the same output_directory - no idea. (maybe pop?)
Is it possible to name the output 'singlelinefiles' with the same string used in step 2? Can anyone show me how?
It would be much appreciated.
Here's the code so far:
#!/usr/bin/python
import os, sys, glob
#use: thisone.py <lists_dir><majorfile><out_dir>
lists = glob.glob(sys.argv[1]+ '*.txt')
listsmatrix = []
for line in lists:
    listsmatrix.append(line.strip().split('\n'))
majorfile = open(sys.argv[2],'r')
majormatrix = []
for line in majorfile:
    majormatrix.append(line.strip().split('\t'))
os.mkdir(sys.argv[3])
i=0
for line in majormatrix:
    if line [0] in listsmatrix:
        outfile = open(sys.argv[3]+ 'file'+str(i), 'w')
        outfile.write(line)
        outfile.close()
        i+=1
I'll be thankful for any help from you.
When you open the file with 'w', the file gets cleared. So every time you open the file, the new line overrides the previous one.
Two possible solutions:
1) Replace 'w' with 'a', so you're appending to the file rather than overwriting it.
2) Open the file once, ideally using a 'with' block so that the file gets closed correctly even if an exception occurs:
with open(sys.argv[3]+ 'file'+str(i), 'w') as outfile:
    for line in majormatrix:
        if line[0] in listsmatrix:
            # line is a list of columns, so join it back into a string before writing
            outfile.write('\t'.join(line) + '\n')
            i+=1
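For the original goal of one output file per matching line, named after the string that matched, a sketch along these lines might help. It assumes tab-separated rows whose first column is the key, and that every key is usable as a file name; `split_by_key`, the sample rows and the `out` directory are all hypothetical.

```python
import os
import tempfile


def split_by_key(rows, wanted, out_dir):
    """Write every row whose first column is in `wanted` to out_dir/<key>.txt
    and return the rows that did not match."""
    os.makedirs(out_dir, exist_ok=True)
    remaining = []
    for row in rows:
        key = row[0]
        if key in wanted:
            # Append, so a key that occurs twice does not overwrite itself.
            with open(os.path.join(out_dir, key + ".txt"), "a") as outfile:
                outfile.write("\t".join(row) + "\n")
        else:
            remaining.append(row)
    return remaining


# Made-up data standing in for the parsed big file and the merged lists.
rows = [["geneA", "1"], ["geneB", "2"], ["geneC", "3"]]
wanted = {"geneA", "geneC"}
out_dir = os.path.join(tempfile.mkdtemp(), "out")

remaining = split_by_key(rows, wanted, out_dir)
written = sorted(os.listdir(out_dir))
print(written, remaining)
```

Using `os.path.join` avoids the missing path separator that `sys.argv[3] + 'file'` produces when the directory argument has no trailing slash, and the returned `remaining` rows are what would be saved as the reduced big file.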
I got a text file like this
Bruce
brucechungulloa#outlook.com
I've used this to read the text file and export it to a list
with open('info.txt') as f:
    info = f.readlines()
for item in info:
    reportePaises = open('reportePaises.txt', 'w')
    reportePaises.write("%s\n" % item)
But when I want to write the elements of the list(info) into another text file, only the info[1] is written (the mail)
How can I write the entire list onto the text file?
with open('data.csv') as f:
    with open('test2.txt', 'a') as wp:
        for item in f.readlines():
            wp.write("%s" % item)
        wp.write('\n') # adds a new line after the looping is done
That will give you:
Bruce
brucechungulloa#outlook.com
In both files.
You were having problems because every time you open a file with 'w' flag, you overwrite it on the disk. So, you created a new file every time.
You should open the second file only once, in the with statement:
with open('info.txt') as f, open('reportePaises.txt', 'w') as reportePaises:
    info = f.readlines()
    for item in info:
        reportePaises.write(item)
As @Pynchia suggested, it's probably better not to use .readlines(), and loop directly on the input file instead.
with open('info.txt') as f, open('reportePaises.txt', 'w') as reportePaises:
    for item in f:
        reportePaises.write(item)
This way you don't create a copy of the whole file in your RAM by saving it to a list, which may cause a huge delay if the file is big (and, obviously, uses more RAM). Instead, you treat the input file as an iterator and just read the next line directly from your HDD on each iteration.
You also (if I did the testing right) don't need to append '\n' to every line. The newlines are already in item. Because of that you don't need to use string formatting at all, just reportePaises.write(item).
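A quick way to check the point about newlines is to look at what iteration actually yields. The sketch below writes a throwaway copy of the two sample lines (the temporary path is an assumption) and compares writing each item as-is with appending an extra "\n".

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "info.txt")
with open(path, "w") as f:
    f.write("Bruce\nbrucechungulloa#outlook.com\n")

with open(path) as f:
    items = list(f)  # iterating a file yields each line with its "\n" kept

plain = "".join(items)                               # what write(item) produces
doubled = "".join("%s\n" % item for item in items)   # what "%s\n" % item produces

print(items)
print(repr(plain))
print(repr(doubled))
```

The `doubled` string has a blank line after every entry, which is the symptom you get when a newline is added to lines that already carry one.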
You are opening your file in write mode every time you write to a file, effectively overwriting the previous line that you wrote. Use the append mode, a, instead.
reportePaises = open('reportePaises.txt', 'a')
Edit: Alternatively, you can open the file once and instead of looping through the lines, write the whole contents as follows:
with open('info.txt') as f, open('reportePaises.txt', 'w') as file:
    file.write(f.read())
Try this, without opening the output file again and again.
with open('info.txt') as f:
    info = f.readlines()
with open('reportePaises.txt', 'w') as f1:
    for x in info:
        f1.write(x)  # lines from readlines() keep their '\n'
That will work.
Two problems here. One is you are opening the output file inside the loop. That means it is being opened several times. Since you also use the "w" flag that means the file is truncated to zero each time it is opened. Therefore you only get the last line written.
It would be better to open the output file once outside the loop. You could even use an outer with block.
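The difference between the two placements of `open` can be reproduced in a few lines. This is a sketch with made-up file names in a temporary directory; `info` stands in for the list read from the input file.

```python
import os
import tempfile

info = ["Bruce\n", "brucechungulloa#outlook.com\n"]
workdir = tempfile.mkdtemp()
inside = os.path.join(workdir, "opened_inside_loop.txt")
outside = os.path.join(workdir, "opened_once.txt")

for item in info:
    with open(inside, "w") as out:   # truncates the file on every pass
        out.write(item)

with open(outside, "w") as out:      # opened once, before the loop
    for item in info:
        out.write(item)

with open(inside) as f:
    inside_content = f.read()
with open(outside) as f:
    outside_content = f.read()

print(repr(inside_content))
print(repr(outside_content))
```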
You can simply try the code below. Your code did not work because you opened the file handle 'reportePaises' inside the for loop. You don't need to open the file handle again and again.
Try re-running your code line by line in the Python shell, as that makes it very easy to find the bugs.
The code below will work:
with open('something.txt') as f:
    info = f.readlines()
reportePaises = open('reportePaises.txt', 'w')
for item in info:
    reportePaises.write("%s" % item)
reportePaises.close()  # close the handle so the data is flushed to disk
You don't need to add a \n to the output line because when you call readlines, the \n character is preserved in the info list. You can observe this below.
Try this:
with open('something.txt') as f:
    info = f.readlines()
print(info)
The output you will get is
['Bruce\n', 'brucechungulloa#outlook.com']
So I have a program which runs. This is part of the code:
FileName = 'Numberdata.dat'
NumberFile = open(FileName, 'r')
for Line in NumberFile:
    if Line == '4':
        print('1')
    else:
        print('9')
NumberFile.close()
A pretty pointless thing to do, yes, but I'm just doing it to enhance my understanding. However, this code doesn't work. The file remains as it is and the 4's are not replaced by 1's and everything else isn't replaced by 9's, they merely stay the same. Where am I going wrong?
Numberdata.dat is "444666444666444888111000444"
It is now:
FileName = 'Binarydata.dat'
BinaryFile = open(FileName, 'w')
for character in BinaryFile:
    if charcter == '0':
        NumberFile.write('')
    else:
        NumberFile.write('#')
BinaryFile.close()
You need to build up a string and write it to the file.
FileName = 'Numberdata.dat'
NumberFileHandle = open(FileName, 'r')
newFileString = ""
for Line in NumberFileHandle:
    for char in Line: # this will work for any number of lines.
        if char == '4':
            newFileString += "1"
        elif char == '\n':
            newFileString += char
        else:
            newFileString += "9"
NumberFileHandle.close()
NumberFileHandle = open(FileName, 'w')
NumberFileHandle.write(newFileString)
NumberFileHandle.close()
First, Line will never equal 4 because each line read from the file includes the newline character at the end. Try if Line.strip() == '4'. This will remove all white space from the beginning and end of the line.
Edit: I just saw your edit... naturally, if you have all your numbers on one line, the line will never equal 4. You probably want to read the file a character at a time, not a line at a time.
Second, you're not writing to any file, so naturally the file won't be getting changed. You will run into difficulty changing a file as you read it (since you have to figure out how to back up to the same place you just read from), so the usual practice is to read from one file and write to a different one.
Because you need to write to the file as well.
with open(FileName, 'w') as f:
    f.write(...)
Right now you are just reading and manipulating the data, but you're not writing them back.
At the end you'll need to reopen your file in write mode and write to it.
If you're looking for references, take a look at the open() documentation and at the Reading and Writing Files section of the Python Tutorial.
Edit: You shouldn't read and write at the same time from the same file. You could either write to a temp file and at the end call shutil.move(), or load and manipulate your data and then re-open your original file in write mode and write them back.
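The temp-file route could look roughly like this. It is a sketch, not the only way to do it: `replace_digits` is a hypothetical helper, the temporary file is created next to the original so the final move stays on one filesystem, and the sample content is the string from the question.

```python
import os
import shutil
import tempfile


def replace_digits(path):
    """Rewrite `path` so that '4' becomes '1' and every other
    non-newline character becomes '9'."""
    directory = os.path.dirname(os.path.abspath(path))
    # delete=False so the temp file survives closing and can be moved afterwards.
    with open(path) as src, tempfile.NamedTemporaryFile(
            "w", dir=directory, delete=False) as tmp:
        for line in src:
            tmp.write("".join(
                "1" if ch == "4" else ch if ch == "\n" else "9"
                for ch in line))
    # Both files are closed here, so the original can be replaced safely.
    shutil.move(tmp.name, path)


path = os.path.join(tempfile.mkdtemp(), "Numberdata.dat")
with open(path, "w") as f:
    f.write("444666444666444888111000444")

replace_digits(path)

with open(path) as f:
    result = f.read()
print(result)
```

The original file is only replaced after the new content has been written completely, so a crash halfway through leaves the input untouched.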
You are not sending any output to the file; you are simply printing 1 and 9 to stdout, which is usually the terminal or interpreter.
If you want to write to the file you have to use open again with 'w', e.g.
out = open(FileName, 'w')
Then you can call out.write('1'), for example. In Python 2 you can also use
print >>out, '1'
(in Python 3 the equivalent is print('1', file=out)).
Also, if you want to overwrite the file, it is a better idea to read it completely first and write afterwards.
According to your comment:
Numberdata is just a load of numbers all one line. Maybe that's where I'm going wrong? It is "444666444666444888111000444"
I can tell you that the for loop iterates over lines, not over characters. That is a logic error.
Moreover, you have to write the file, as Rik Poggi said (just remember to open it in write mode).
A few things:
The r flag to open indicates read-only mode. This obviously won't let you write to the file.
print() outputs things to the screen. What you really want to do is output to the file. Have you read the Python File I/O tutorial?
for line in file_handle: loops through files one line at a time. Thus, if line == '4' will only be true if the line consists of a single character, 4, all on its own.
If you want to loop over characters in a string, then do something like for character in line:.
Modifying bits of a file "in place" is a bit harder than you think.
This is because if you insert data into the middle of a file, the rest of the data has to shuffle over to make room - this is really slow because everything after your insertion has to be rewritten.
In theory, a one-byte for one-byte replacement can be done fast, but in general people don't want to replace byte-for-byte, so this is an advanced feature. (See seek().) The usual approach is to just write out a whole new file.
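For completeness, here is what a one-byte-for-one-byte replacement with seek() might look like. It is a sketch under the assumption that the file is small enough to read at once and that only '4' needs replacing; the path and content are made up. Binary mode is used so that offsets correspond exactly to bytes.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "Numberdata.dat")
with open(path, "wb") as f:
    f.write(b"444666444")

# 'r+b' opens for reading and writing in binary mode without truncating.
with open(path, "r+b") as f:
    data = f.read()
    for offset, byte in enumerate(data):
        if byte == ord("4"):
            f.seek(offset)   # jump to the byte to replace
            f.write(b"1")    # overwrite exactly one byte in place

with open(path, "rb") as f:
    result = f.read()
print(result)
```

This only works because the replacement has the same length as what it replaces; as soon as the sizes differ, the rest of the file would have to be shifted, which is why writing a new file is the usual approach.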
Because print doesn't write to your file.
You have to open the file and read it, build a new string from the one you obtain, then open the file again and write the new string to it.
FileName = 'Numberdata.dat'
NumberFile = open(FileName, 'r')
data = NumberFile.read()
NumberFile.close()
dl = list(data)  # one entry per character, since all the digits are on one line
for i in range(len(dl)):
    if dl[i] == '4':
        dl[i] = '1'
    elif dl[i] != '\n':  # leave newlines as they are
        dl[i] = '9'
NumberFile = open(FileName, 'w')
NumberFile.write(''.join(dl))
NumberFile.close()
Try in this way. There are for sure different methods but this seems to be the most "linear" to me =)
Hey I need to split a large file in python into smaller files that contain only specific lines. How do I do this?
You're probably going to want to do something like this:
big_file = open('big_file', 'r')
small_file1 = open('small_file1', 'w')
small_file2 = open('small_file2', 'w')
for line in big_file:
    if 'Charlie' in line: small_file1.write(line)
    if 'Mark' in line: small_file2.write(line)
big_file.close()
small_file1.close()
small_file2.close()
Opening a file for reading returns an object that allows you to iterate over the lines. You can then check each line (which is just a string of whatever that line contains) for whatever condition you want, then write it to the appropriate file that you opened for writing. It is worth noting that when you open a file with 'w' it will overwrite anything already written to that file. If you want to simply add to the end, you should open it with 'a', to append.
Additionally, if you expect there to be some possibility of error in your reading/writing code, and want to make sure the files are closed, you can use:
with open('big_file', 'r') as big_file:
    <do stuff prone to error>
Do you mean breaking it down into subsections? Like if I had a file with chapter 1, chapter 2, and chapter 3, you want it to be broken down into separate files for each chapter?
The way I've done this is similar to Wilduck's response, but closes the input file as soon as it reads in the data and keeps all the lines read in.
data_file = open('large_file_name', 'r')
lines = data_file.readlines()
data_file.close()
outputFile = open('output_file_one', 'w')
for line in lines:
    if 'SomeName' in line:
        outputFile.write(line)
outputFile.close()
If you wanted to have more than one output file you could either add more loops or open more than one outputFile at a time.
I'd recommend using Wilduck's response, however, as it uses less space and will take less time with larger files since the file is read only once.
How big and does it need to be done in python? If this is on unix, would split/csplit/grep suffice?
First, open the big file for reading.
Second, open all the smaller file names for writing.
Third, iterate through every line. Every iteration, check to see what kind of line it is, then write it to that file.
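The three steps above can be sketched with a dictionary that maps a keyword to its output file. Everything named here (`split_file`, the keywords, the file names, the sample content) is hypothetical; `contextlib.ExitStack` from the standard library is used so that all files are closed together, however many outputs there are.

```python
import os
import tempfile
from contextlib import ExitStack


def split_file(big_path, targets):
    """`targets` maps a keyword to the output path for lines containing it."""
    with ExitStack() as stack:
        # First: open the big file for reading.
        big_file = stack.enter_context(open(big_path))
        # Second: open all the smaller files for writing.
        outputs = {word: stack.enter_context(open(out_path, "w"))
                   for word, out_path in targets.items()}
        # Third: check every line and write it to each file it belongs in.
        for line in big_file:
            for word, out in outputs.items():
                if word in line:
                    out.write(line)


workdir = tempfile.mkdtemp()
big_path = os.path.join(workdir, "big_file")
with open(big_path, "w") as f:
    f.write("Charlie 1\nMark 2\nAlice 3\nCharlie and Mark 4\n")

targets = {"Charlie": os.path.join(workdir, "small_file1"),
           "Mark": os.path.join(workdir, "small_file2")}
split_file(big_path, targets)

with open(targets["Charlie"]) as f:
    charlie_lines = f.read().splitlines()
with open(targets["Mark"]) as f:
    mark_lines = f.read().splitlines()
print(charlie_lines, mark_lines)
```

A line that contains several keywords ends up in several files, and a line that contains none is dropped; whether that is the desired behaviour depends on what "specific lines" means for your data.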
More info on File I/O: http://docs.python.org/tutorial/inputoutput.html