How to copy single lines into single file for each one - python

I don't know what's wrong with my code; it runs without any error message in the shell.
What I'm trying to do is:
1. Merge all the list files from a directory into a single list (one column, one string per row) - done!
2. Compare that list with a big file and copy every matching line into its own new file, one file per line - (maybe?) done, but not working. =/
3. Save the files from step 2 in a new output directory - not working.
4. Remove the matching lines from the big file and save the result in the same output directory - no idea. (maybe pop?)
Is it possible to name each single-line output file after the string matched in step 2? Can anyone show me how?
It would be much appreciated.
Here's the code so far:
#!/usr/bin/python
import os, sys, glob
#use: thisone.py <lists_dir><majorfile><out_dir>
lists = glob.glob(sys.argv[1]+ '*.txt')
listsmatrix = []
for line in lists:
    listsmatrix.append(line.strip().split('\n'))
majorfile = open(sys.argv[2],'r')
majormatrix = []
for line in majorfile:
    majormatrix.append(line.strip().split('\t'))
os.mkdir(sys.argv[3])
i=0
for line in majormatrix:
    if line [0] in listsmatrix:
        outfile = open(sys.argv[3]+ 'file'+str(i), 'w')
        outfile.write(line)
        outfile.close()
        i+=1
I'll be thankful for any help from you.

When you open a file with 'w', the file gets cleared. So every time you reopen the same file, the new line overwrites the previous one.
Two possible solutions:
1) Replace 'w' with 'a', so you're appending to the file rather than overwriting it.
2) Open the file once, ideally using a 'with' block so that the file gets closed correctly even if an exception occurs:
with open(sys.argv[3]+ 'file'+str(i), 'w') as outfile:
    for line in majormatrix:
        if line [0] in listsmatrix:
            # line is a list of fields, so join it back into a string before writing
            outfile.write('\t'.join(line) + '\n')
            i+=1
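
For the original goal (one output file per matching line, named after the matched string, plus the leftover lines), here is a minimal sketch. It assumes each list file holds one key per line and that the key is the first tab-separated column of the big file. The function name `split_matches` and the file name `remainder.txt` are hypothetical, not from the question.

```python
import glob
import os


def split_matches(lists_dir, major_path, out_dir):
    """Write each matching line of major_path to its own file in out_dir."""
    # Collect every key (one per line) from all *.txt list files.
    keys = set()
    for list_path in glob.glob(os.path.join(lists_dir, '*.txt')):
        with open(list_path) as list_file:
            keys.update(line.strip() for line in list_file if line.strip())

    os.makedirs(out_dir, exist_ok=True)
    remainder_path = os.path.join(out_dir, 'remainder.txt')  # hypothetical name
    matched = 0
    with open(major_path) as major, open(remainder_path, 'w') as remainder:
        for line in major:
            key = line.rstrip('\n').split('\t')[0]
            if key in keys:
                # Append, so a key that occurs twice does not lose its first line.
                with open(os.path.join(out_dir, key + '.txt'), 'a') as out:
                    out.write(line)
                matched += 1
            else:
                # Lines without a match end up in the leftover file.
                remainder.write(line)
    return matched
```

This sketch assumes the keys are valid file names and that no key is literally `remainder`; real data may need sanitising first.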

Related

How do i check for a keyword in a specific line of a text file? python [duplicate]

I want to go to line 34 in a .txt file and read it. How would you do that in Python?
Use the Python standard library's linecache module:
import linecache
line = linecache.getline(thefilename, 34)
This should do exactly what you want; linecache numbers lines starting from 1. You don't even need to open the file, because linecache does it all for you!
This code will open the file, read the line and print it.
# Open the file and read all of its lines into a list
with open(filename, "r") as f:
    lines = f.readlines()
# List indexes start at 0, so line 34 is at index 33
x = lines[33]
print(x)
A solution that will not read more of the file than necessary is
from itertools import islice
line_number = 34
with open(filename) as f:
    # Adjust index since Python/islice indexes from 0 and the first
    # line of a file is line 1
    line = next(islice(f, line_number - 1, line_number))
A very straightforward solution is
line_number = 34
with open(filename) as f:
    line = f.readlines()[line_number - 1]
There are two ways:
1. Read the file line by line and stop when you reach the line you want.
2. Use f.readlines(), which reads the entire file into memory and returns it as a list of lines, then extract the 34th item from that list.
Solution 1
Benefit: You only keep, in memory, the specific line you want.
code:
for i in range(34):
    line = f.readline()
# when you get here, line will be the 34th line, or an empty string if
# there weren't enough lines in the file
Solution 2
Benefit: Much less code
Downside: Reads the entire file into memory
Problem: Raises IndexError if the list has fewer than 34 elements, so it needs error handling
line = f.readlines()[33]
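
To avoid both the memory cost and the crash on short files, one option is to combine `itertools.islice` with the default argument of `next()`. This is only a sketch; the helper name `read_line` and its `default` parameter are hypothetical.

```python
from itertools import islice


def read_line(path, line_number, default=None):
    """Return line `line_number` (1-based) of `path`, or `default` if the file is shorter."""
    with open(path) as f:
        # islice skips the first line_number - 1 lines without storing them;
        # next() falls back to `default` when the file is too short.
        return next(islice(f, line_number - 1, line_number), default)
```

The returned line still carries its trailing newline, so call `.rstrip('\n')` on it if that matters.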
You could just read all the lines and index the line you're after.
line = open('filename').readlines()[33]
for linenum, line in enumerate(open("file")):
    if linenum + 1 == 34:
        print(line.rstrip())
I made a thread about this and didn't receive help, so I took matters into my own hands.
There's no complicated code here.
import linecache
# Import the linecache module to read the line of our choosing
number = int(input("Enter a number from 1-10 for a random quote "))
# Ask the user which line number they would like to read (not necessary)
lines = linecache.getline("Quotes.txt", number)
# Store the requested line in a new variable; `number` can be replaced
# by any integer of your choosing.
print(lines)
# This will print the line of your choosing.
If you are running this in Python, make sure both files (.py and .txt) are in the same location; otherwise Python will not be able to find the text file unless you specify its full path, e.g.
linecache.getline("C:/Directory/Folder/Quotes.txt
This is used when the file is in another folder than the .py file you are using.
Hope this helps!
Option that always closes the file and doesn't load the whole file into memory
with open('file.txt') as f:
    for i, line in enumerate(f):
        if i+1 == 34: break
print(line.rstrip())

Python: Create a file which contains a particular string and does not contain another particular string

I am trying to write a script that removes rows containing one string and keeps rows containing another. I think I have an indentation error at the end; can anyone see how to fix this?
import os
import sys
#Reading Input file
f = open(sys.argv[1]).readlines()
for line in f: #(read line 0 to last line of input file)
    if 'Futures' in line and 'Elec' not in line: #if string "Futures" is not there in dictionary i.e it is unique so store it into a dictionary
        #f = open("C://Python27//New_File.csv", 'w')
        #f.close()
        #opens and close new file
        nf = open("C://Python27//New_File.csv", "w")
        nf.write(data)
        nf.close()
Your indentation and logic are both wrong. If you keep opening the file with "w" you will end up with a single line; you need to open the output file once, outside the loop, and write as you go:
import sys
# Read the input file and write the matching lines to the output file
with open(sys.argv[1]) as f, open("C://Python27//New_File.csv", "w") as out:
    for line in f:  # iterate from the first to the last line of the input file
        if 'Futures' in line and 'Elec' not in line:  # keep lines that contain "Futures" but not "Elec"
            out.write(line)
You can also iterate over the file object directly; there is no need or reason to use readlines unless you actually need a list of lines.
On another note, you may want to handle the cases where the file passed in does not exist or you do not have permission to read it.
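
One way to handle a missing or unreadable file is sketched below. The function name `filter_lines` and its parameters are hypothetical; the 'Futures'/'Elec' defaults come from the question.

```python
import sys


def filter_lines(src_path, dst_path, keep='Futures', drop='Elec'):
    """Copy lines containing `keep` but not `drop`; return the count, or None on failure."""
    try:
        with open(src_path) as src, open(dst_path, 'w') as dst:
            count = 0
            for line in src:
                if keep in line and drop not in line:
                    dst.write(line)
                    count += 1
            return count
    except (FileNotFoundError, PermissionError) as err:
        # err.filename names whichever path could not be opened.
        print('Cannot open {}: {}'.format(err.filename, err.strerror), file=sys.stderr)
        return None
```

Returning None instead of exiting is a design choice for this sketch; a command-line script might prefer `sys.exit(1)` after printing the message.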
Try this:
for line in f:
    if 'Futures' in line and 'Elec' not in line:
        nf = open("C://Python27//New_File.csv", "a")
        nf.write(line)
        nf.close()

change first line of a file in python

I only need to read the first line of a huge file and change it.
Is there a trick to only change the first line of a file and save it as another file using Python? All my code is done in Python and would help me to keep consistency.
The idea is to not have to read and then write the whole file.
shutil.copyfileobj() should be much faster than running line-by-line. Note from the docs:
Note that if the current file position of the [from_file] object is not 0,
only the contents from the current file position to the end of the
file will be copied.
Thus:
from_file.readline() # and discard
to_file.write(replacement_line)
shutil.copyfileobj(from_file, to_file)
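
Put together as a runnable sketch (the function name `copy_with_new_first_line` is hypothetical, and UTF-8 text with "\n" line endings is assumed):

```python
import shutil


def copy_with_new_first_line(src_path, dst_path, replacement_line):
    """Copy src_path to dst_path with its first line replaced."""
    # Binary mode avoids decoding the bulk of the file.
    with open(src_path, 'rb') as from_file, open(dst_path, 'wb') as to_file:
        from_file.readline()  # read and discard the old first line
        to_file.write(replacement_line.encode('utf-8') + b'\n')
        # Copies everything from the current file position to the end.
        shutil.copyfileobj(from_file, to_file)
```

The rest of the file is still read and written once, but in large blocks rather than line by line.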
If you want to modify the top line of a file and save it under a new file name, it is not possible to simply modify the first line without iterating over the entire file. On the bright side, as long as you are not printing to the terminal, modifying the first line of a file is very fast, even on very large files.
Assuming you are working with text-based files (not binary), this should fit your needs and perform well enough for most applications.
source_fp = open('source-filename', 'r')
target_fp = open('target-filename', 'w')
first_row = True
for row in source_fp:
    if first_row:
        # Each row keeps its own trailing newline, so the replacement needs one too.
        row = 'the first row now says this.\n'
        first_row = False
    target_fp.write(row)
source_fp.close()
target_fp.close()
An alternative solution that does not require iterating over the lines that are not of interest:
def replace_first_line(src_filename, target_filename, replacement_line):
    with open(src_filename) as f:
        first_line, remainder = f.readline(), f.read()
    with open(target_filename, "w") as t:
        t.write(replacement_line + "\n")
        t.write(remainder)
Unless the new line is the same length as the old line, you cannot do this in place. If it is, you could solve this problem with mmap.
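
A minimal sketch of the same-length case using the standard mmap module follows. The function name `overwrite_first_line` is hypothetical, the replacement is passed as bytes, and the file is assumed to be non-empty (an empty file cannot be memory-mapped).

```python
import mmap


def overwrite_first_line(path, replacement):
    """Overwrite the first line of `path` in place; byte lengths must match exactly."""
    with open(path, 'r+b') as f:
        with mmap.mmap(f.fileno(), 0) as mm:
            end = mm.find(b'\n')
            if end == -1:
                end = len(mm)  # single-line file without a trailing newline
            if len(replacement) != end:
                raise ValueError('replacement must be exactly %d bytes' % end)
            # Only the first line's bytes are touched; the rest of the file stays put.
            mm[:end] = replacement
            mm.flush()
```

Because the file is modified in place, nothing is copied, but there is also no backup if the wrong bytes are written.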
The sh module worked for me:
import sh
first = "new string"
sh.sed("-i", "1s/.*/" + first + "/", "file.x")
The solution I would use is to create a file that is missing the old first line:
from_file.readline() # and discard
shutil.copyfileobj(from_file, tail_file)
then create a file with the new first line,
then use the following to concatenate the newfirstline file and tail_file:
with open('output_file.txt', 'wb') as wfd:  # the output file name is an assumption
    for f in ['newfirstline.txt','tail_file.txt']:
        with open(f,'rb') as fd:
            shutil.copyfileobj(fd, wfd, 1024*1024*10)
Here is a working example of "Nacho"'s answer:
import subprocess
cmd = ['sed', '-i', '-e', '1,1s/.*/' + new_line + '/g', 'filename.txt']
subprocess.call(cmd)

Python: Write to next empty line

I'm trying to write the output of a computation that runs over three big iterations, and each time I open and close the outfile. Counters get reset after each iteration, and I'm a massive newb who would struggle to work around this with the shoddy code I've written. So even if it's slower, I'd like to change the way the output is written.
Currently the output just overwrites the first line, so I only have the output of the last run of the program. (tau and output are variables given values in the iterations earlier in the code.)
with open(fileName + '.autocorrelate', "w") as outfile:
    outfile.writelines('{0} {1}{2}'.format(tau, output, '\n'))
I was wondering if there are any quick ways to get python to check for the first empty line when it opens a file and write the new line there?
Opening with "a" instead of "w" writes at the end of the file. That's the way to avoid overwriting.
If you open your file in append mode : "a" instead of "w", you will be able to write a new line at the end of your file.
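
As a small sketch of the append approach (the helper name `append_result` is hypothetical; `tau` and `output` stand for the values from the question):

```python
def append_result(path, tau, output):
    """Append one 'tau output' record to the end of `path`."""
    # Mode "a" creates the file if needed and always writes at the end,
    # so earlier records are never overwritten.
    with open(path, 'a') as outfile:
        outfile.write('{0} {1}\n'.format(tau, output))
```

Calling it once per iteration adds one line each time. If every run of the program should start from an empty file, the file would need to be removed or truncated once at startup.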
You could do something like this to keep a reference (line number) to every empty line in a file:
# Get file contents
fd = open(file)
contents = fd.readlines()
fd.close()
empty_line = []
# find empty lines (readlines() keeps the trailing newline, so strip before comparing)
for i, line in enumerate(contents):
    if line.strip() == "":
        empty_line.append(i)

Replace a word in a file

I am new to Python programming.
I have a .txt file. It looks like this:
0,Salary,14000
0,Bonus,5000
0,gift,6000
I want to replace the first '0' value with '1' in each line. How can I do this? Can anyone help me with some sample code?
Thanks in advance.
Nimmyliji
I know that you're asking about Python, but forgive me for suggesting that perhaps a different tool is better for the job. :) It's a one-liner via sed:
sed 's/^0,/1,/' yourtextfile.txt > output.txt
This applies the regex /^0,/ (which matches any 0, that occurs at the beginning of a line) to each line and replaces the matched text with 1, instead. The output is directed into the file output.txt specified.
inFile = open("old.txt", "r")
outFile = open("new.txt", "w")
for line in inFile:
    outFile.write(",".join(["1"] + (line.split(","))[1:]))
inFile.close()
outFile.close()
If you would like something more general, take a look at the Python csv module. It contains utilities for processing comma-separated values (abbreviated as csv) in files, but it can work with an arbitrary delimiter, not only a comma. Since your sample is obviously a csv file, you can use it as follows:
import csv
reader = csv.reader(open("old.txt"))
writer = csv.writer(open("new.txt", "w"))
writer.writerows(["1"] + line[1:] for line in reader)
To overwrite original file with new one:
import os
os.remove("old.txt")
os.rename("new.txt", "old.txt")
I think that writing to a new file and then renaming it is more fault-tolerant and less likely to corrupt your data than directly overwriting the source file. Imagine that your program raised an exception after the source file had been read into memory and reopened for writing: you would lose the original data, and your new data wouldn't be saved because of the crash. With my approach, I only lose the new data while preserving the original.
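
The write-then-rename idea can be sketched with a temporary file and `os.replace`, which swaps the new file in with a single call instead of a separate remove and rename. The function name `set_first_column` is hypothetical, and the "\n" line terminator is an assumption about the desired output.

```python
import csv
import os
import tempfile


def set_first_column(path, value='1'):
    """Rewrite `path` with the first CSV column set to `value`."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file next to the original so the final rename
    # stays on the same filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix='.tmp')
    try:
        with os.fdopen(fd, 'w', newline='') as tmp, open(path, newline='') as src:
            writer = csv.writer(tmp, lineterminator='\n')
            for row in csv.reader(src):
                # Leave blank lines alone; otherwise replace the first field.
                writer.writerow([value] + row[1:] if row else row)
        os.replace(tmp_path, path)  # the original is only replaced on success
    except BaseException:
        os.remove(tmp_path)  # clean up the partial file and keep the original
        raise
```

If anything fails while writing, the original file is left untouched and only the temporary file is discarded.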
o=open("output.txt","w")
for line in open("file"):
    s=line.split(",")
    s[0]="1"
    o.write(','.join(s))
o.close()
Or you can use fileinput with in place edit
import fileinput
for line in fileinput.FileInput("file",inplace=1):
    s=line.split(",")
    s[0]="1"
    # each line already ends with a newline, so suppress the one print would add
    print(','.join(s), end='')
f = open(filepath,'r')
data = f.readlines()
f.close()
edited = []
for line in data:
    edited.append( '1'+line[1:] )
f = open(filepath,'w')
f.writelines(edited)
f.flush()
f.close()
Or in Python 2.5+:
with open(filepath,'r') as f:
    data = f.readlines()
with open(outfilepath, 'w') as f:
    for line in data:
        f.write( '1' + line[1:] )
This should do it. I wouldn't recommend it for a truly big file though ;-)
What is going on (ex 1):
1: Open the file in read mode.
2,3: Read all the lines into a list (each line is a separate element) and close the file.
4,5,6: Iterate over the list, constructing a new list where each line has its first character replaced by a 1. The line[1:] slices the string from index 1 onward. We concatenate the 1 with the sliced string.
7,8,9: Reopen the file in write mode, write the list to the file (overwriting it), flush the buffer, and close the file handle.
In ex. 2:
I use the with statement, which closes the file handle automatically, but otherwise do essentially the same thing.
