How to replace command line arguments sys.argv by stdin stdout? - python

I realize that my question is a very simple one, but I can't find any explicit example of using stdin and stdout in a Python script.
I have a script that works perfectly well with command line arguments:
newlist = []
def f1()
....
def f2(input_file):
vol_id = sys.argv[3]
for line in input_file:
if ... :
line = line.replace('abc','def')
line = line.replace('id', 'id'+vol_id)
....
newlist.append(line)
return newlist
def main():
if len(sys.argv) < 4:
print 'usage: ./myscript.py [file_in... file_out... volume_id]'
sys.exit(1)
else:
filename = sys.argv[1]
filename_out = sys.argv[2]
tree = etree.parse(filename)
extract(tree)
input_file = open(filename, 'rU')
change_class(input_file)
file_new = open(filename_out, 'w')
for x in newlist:
if '\n' in x:
x = x.replace('\n', '')
print>>file_new, x
Now I should somehow use stdin and stdout instead of my arguments in order to make my script usable within pipelines, like for example using multiple files as input:
cat input1 input1 input3 | myscript.py
Or to process its output with some UNIX tools before printing it to a file.
I tried to replace arguments in my script by sys.stdin:
filename = sys.stdin
filename_out = sys.stdout
Then I ran my script like this:
./myscript.py < inputfile > outputfile
It resulted in an empty outputfile, but didn't yield any error messages at all.
Could you please help me with this replacement?
P.S. Then I modified my main() like this:
filename = sys.argv[1]
filename_out = sys.argv[2]
if filename == '-':
filename = sys.stdin
else:
input_file = open(filename, 'rU')
if filename_out == '-':
filename_out = sys.stdout
file_new = filename_out
else:
file_new = open(filename_out, 'w')
tree = etree.parse(filename)
extract(tree)
input_file = filename
change_class(input_file)
for x in newlist:
if '\n' in x:
x = x.replace('\n', '')
print>>file_new, x
I tried to run it from the command line like this:
./myscript.py - - volumeid < filein > fileout
But I still got an empty output file :(

The common placeholder for stdin or stdout is -:
./myscript.py - - volumeid
and:
if filename == '-':
    input_file = sys.stdin
else:
    input_file = open(filename, 'rU')
etc.
In addition, you could default filename and filename_out to - when there are fewer than 3 command line arguments. You should consider using a dedicated command-line argument parser such as argparse, which can handle these cases for you, including defaulting to stdin and stdout, and using -.
As a side note, I'd not use print to write to a file; I'd just use:
file_new.write(x)
which removes the need to strip off the newlines as well.
You appear to read from the input file twice; once to parse the XML tree, once again to call change_class() with the open file object. What are you trying to do there? You'll have problems replicating that with sys.stdin as you cannot re-read the data from a stream the way you can from a file on disk.
You'd have to read all the data into memory first, then parse the XML from it, then read it again for change_class(). It'd be better if you used the parsed XML tree for this instead, if possible (e.g. read the file only once, then use the parsed structure from there on out).
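A minimal sketch of the approach described above, assuming Python 3 and argparse: both file arguments default to -, and the input is read into memory exactly once. The names parse_args, transform and run, and the --volume-id option, are made up for illustration; transform only stands in for the question's replacement logic.

```python
import argparse
import sys


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Filter text from a file or stdin.")
    # Both positionals are optional and default to "-", meaning stdin/stdout.
    parser.add_argument("infile", nargs="?", default="-")
    parser.add_argument("outfile", nargs="?", default="-")
    # Hypothetical option replacing the question's third positional argument.
    parser.add_argument("--volume-id", default="", help="suffix appended to every 'id'")
    return parser.parse_args(argv)


def transform(text, volume_id):
    """Stand-in for the question's per-line replacements."""
    out = []
    for line in text.splitlines(keepends=True):
        line = line.replace("abc", "def")
        line = line.replace("id", "id" + volume_id)
        out.append(line)
    return "".join(out)


def run(argv=None, stdin=None, stdout=None):
    args = parse_args(argv)
    if stdin is None:
        stdin = sys.stdin
    if stdout is None:
        stdout = sys.stdout
    # Read the input exactly once; a pipe cannot be rewound like a file on disk.
    if args.infile == "-":
        data = stdin.read()
    else:
        with open(args.infile) as handle:
            data = handle.read()
    result = transform(data, args.volume_id)
    if args.outfile == "-":
        stdout.write(result)
    else:
        with open(args.outfile, "w") as handle:
            handle.write(result)
```

Calling run() under the usual if __name__ == '__main__': guard should then allow both ./myscript.py in.txt out.txt and cat input1 input3 | ./myscript.py --volume-id 7 > out.txt.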

Related

Reading a line and replacing it in Python [duplicate]

How do I search and replace text in a file using Python 3?
Here is my code:
import os
import sys
import fileinput
print ("Text to search for:")
textToSearch = input( "> " )
print ("Text to replace it with:")
textToReplace = input( "> " )
print ("File to perform Search-Replace on:")
fileToSearch = input( "> " )
#fileToSearch = 'D:\dummy1.txt'
tempFile = open( fileToSearch, 'r+' )
for line in fileinput.input( fileToSearch ):
if textToSearch in line :
print('Match Found')
else:
print('Match Not Found!!')
tempFile.write( line.replace( textToSearch, textToReplace ) )
tempFile.close()
input( '\n\n Press Enter to exit...' )
Input file:
hi this is abcd hi this is abcd
This is dummy text file.
This is how search and replace works abcd
When I search and replace 'ram' by 'abcd' in the above input file, it works like a charm. But when I do it the other way round, i.e. replacing 'abcd' by 'ram', some junk characters are left at the end.
Replacing 'abcd' by 'ram'
hi this is ram hi this is ram
This is dummy text file.
This is how search and replace works rambcd
As pointed out by michaelb958, you cannot replace in place with data of a different length because this will put the rest of the sections out of place. I disagree with the other posters suggesting you read from one file and write to another. Instead, I would read the file into memory, fix the data up, and then write it out to the same file in a separate step.
# Read in the file
with open('file.txt', 'r') as file:
    filedata = file.read()
# Replace the target string
filedata = filedata.replace('abcd', 'ram')
# Write the file out again
with open('file.txt', 'w') as file:
    file.write(filedata)
This works well unless you've got a massive file that is too big to load into memory in one go, or you are concerned about potential data loss if the process is interrupted during the second step, in which you write data to the file.
fileinput already supports inplace editing. It redirects stdout to the file in this case:
#!/usr/bin/env python3
import fileinput
with fileinput.FileInput(filename, inplace=True, backup='.bak') as file:
    for line in file:
        print(line.replace(text_to_search, replacement_text), end='')
As Jack Aidley had posted and J.F. Sebastian pointed out, this code will not work:
# Read in the file
filedata = None
with file = open('file.txt', 'r') :
filedata = file.read()
# Replace the target string
filedata.replace('ram', 'abcd')
# Write the file out again
with file = open('file.txt', 'w') :
file.write(filedata)
But this code WILL work (I've tested it):
f = open(filein,'r')
filedata = f.read()
f.close()
newdata = filedata.replace("old data","new data")
f = open(fileout,'w')
f.write(newdata)
f.close()
Using this method, filein and fileout can be the same file: the input is read completely and closed first, and opening a file with 'w' then truncates it before the new data is written.
You can do the replacement like this
f1 = open('file1.txt', 'r')
f2 = open('file2.txt', 'w')
for line in f1:
    f2.write(line.replace('old_text', 'new_text'))
f1.close()
f2.close()
You can also use pathlib, which is part of the standard library on Python 3 (pathlib2 is a backport for older versions).
from pathlib import Path
path = Path(file_to_search)
text = path.read_text()
text = text.replace(text_to_search, replacement_text)
path.write_text(text)
(pip install python-util)
from pyutil import filereplace
filereplace("somefile.txt","abcd","ram")
Will replace all occurrences of "abcd" with "ram".
The function also supports regex by specifying regex=True
from pyutil import filereplace
filereplace("somefile.txt","\\w+","ram",regex=True)
Disclaimer: I'm the author (https://github.com/MisterL2/python-util)
Open the file in read mode and read its contents into a string. Replace the text as intended. Leaving the with block closes the file. Then open the file again in write mode and write the replaced text back to the same file.
try:
    with open("file_name", "r") as text_file:
        texts = text_file.read()
    texts = texts.replace("to_replace", "replace_string")
    with open("file_name", "w") as text_file:
        text_file.write(texts)
except FileNotFoundError:
    print("Could not find the file you are trying to read.")
Late answer, but this is what I use to find and replace inside a text file:
with open("test.txt") as r:
    text = r.read().replace("THIS", "THAT")
with open("test.txt", "w") as w:
    w.write(text)
With a single with block, you can search and replace your text:
with open('file.txt', 'r+') as f:
    filedata = f.read()
    filedata = filedata.replace('abc', 'xyz')
    f.seek(0)      # rewind first; truncate() does not move the file position
    f.truncate()   # drop the old content
    f.write(filedata)
Your problem stems from reading from and writing to the same file. Rather than opening fileToSearch for writing, open an actual temporary file and then after you're done and have closed tempFile, use os.rename to move the new file over fileToSearch.
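The temporary-file approach can be sketched roughly like this, assuming Python 3. The function name replace_in_file is made up for illustration, and os.replace is used here instead of os.rename because it also overwrites an existing target on Windows.

```python
import os
import tempfile


def replace_in_file(path, old, new):
    """Rewrite `path` with `old` replaced by `new`, going through a temporary file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the target so the final move stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, text=True)
    try:
        with os.fdopen(fd, "w") as tmp, open(path) as src:
            for line in src:
                tmp.write(line.replace(old, new))
        # Both files are closed here; move the new file over the original.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if anything went wrong before the move.
        os.remove(tmp_path)
        raise
```

Because the original is only replaced once the new file is completely written, an interruption midway leaves the original untouched. The file is also processed line by line, so it never has to fit into memory.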
My variant, one word at a time on the entire file.
I read it into memory.
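import os
import sys
def replace_word(infile, old_word, new_word):
    if not os.path.isfile(infile):
        print("Error on replace_word, not a regular file: " + infile)
        sys.exit(1)
    # Read everything first, then reopen for writing (which truncates the file).
    with open(infile, 'r') as f1:
        text = f1.read()
    with open(infile, 'w') as f2:
        f2.write(text.replace(old_word, new_word))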
Using re.subn gives more control over the substitution, for example matching a word split across two lines or matching case-insensitively. It also returns the number of replacements made, so the file is only rewritten when something actually matched.
import re
file = 'path/to/file'  # placeholder: path to file
# the search text can also be a raw string or a regex
textToSearch = r'Ha.*O'  # here an example with a regex
textToReplace = 'hallo'
# read and replace
with open(file, 'r') as fd:
    # sample case-insensitive find-and-replace; flags must be passed by keyword,
    # because the fourth positional argument of re.subn is count
    text, counter = re.subn(textToSearch, textToReplace, fd.read(), flags=re.I)
# check if there is at least one match
if counter > 0:
    # edit the file
    with open(file, 'w') as fd:
        fd.write(text)
# summary result
print(f'{counter} occurrence(s) of "{textToSearch}" were replaced with "{textToReplace}".')
Some regex notes:
add the re.I flag, short form of re.IGNORECASE, for a case-insensitive match
for multi-line replacement use re.subn(r'\n*'.join(textToSearch), textToReplace, fd.read()); depending on the data, '\n{,1}' may fit better. Note that in this case textToSearch must be a plain string, not a regex!
Besides the answers already mentioned, here is an explanation of why you have some random characters at the end:
You are opening the file in r+ mode, not w mode. The key difference is that w mode clears the contents of the file as soon as you open it, whereas r+ doesn't.
This means that if your file content is "123456789" and you write "www" to it, you get "www456789". It overwrites the characters with the new input, but leaves any remaining input untouched.
You can clear a section of the file contents by using truncate(<startPosition>), but you are probably best off saving the updated file content to a string first, then doing truncate(0) and writing it all at once.
Or you can use my library :D
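The r+ behaviour described in this answer can be reproduced with a small experiment. This is only a sketch; the file name demo.txt and the temporary directory are arbitrary choices.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.txt")

with open(path, "w") as f:
    f.write("123456789")

# r+ keeps the old content; writing overwrites from the current position.
with open(path, "r+") as f:
    f.write("www")
with open(path) as f:
    after_overwrite = f.read()

# Write from the start, then cut off whatever is left of the old content.
with open(path, "r+") as f:
    f.seek(0)
    f.write("www")
    f.truncate()  # truncates at the current position, i.e. right after "www"
with open(path) as f:
    after_truncate = f.read()

print(after_overwrite)  # www456789
print(after_truncate)   # www
```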
I had the same issue. The problem is that f.read() returns a single string, so iterating over it yields individual characters rather than lines. Split it into lines first:
swapString = []
with open(filepath) as f:
    s = f.read()
for each in s.splitlines():
    swapString.append(each.replace('this', 'that'))
s = swapString
print(s)
I tried this and used readlines instead of read
with open('dummy.txt','r') as file:
list = file.readlines()
print(f'before removal {list}')
for i in list[:]:
list.remove(i)
print(f'After removal {list}')
with open('dummy.txt','w+') as f:
for i in list:
f.write(i)
You can use sed, awk, or grep from Python (with some restrictions). Here is a very simple example. It changes Banana to BananaToothpaste in the file. You can edit and use it. (I tested it and it worked. Note: if you are testing under Windows, you need to install the sed command and add it to the PATH first.)
import os
file="a.txt"
oldtext="Banana"
newtext=" BananaToothpaste"
os.system('sed -i "s/{}/{}/g" {}'.format(oldtext,newtext,file))
#print(f'sed -i "s/{oldtext}/{newtext}/g" {file}')
print('This command was applied: sed -i "s/{}/{}/g" {}'.format(oldtext,newtext,file))
if you want to see results on the file directly apply: "type" for windows/ "cat" for linux:
####FOR WINDOWS:
os.popen("type " + file).read()
####FOR LINUX:
os.popen("cat " + file).read()
I have done this:
#!/usr/bin/env python3
import fileinput
import os
Dir = input ("Source directory: ")
os.chdir(Dir)
Filelist = os.listdir()
print('File list: ',Filelist)
NomeFile = input ("Insert file name: ")
CarOr = input ("Text to search: ")
CarNew = input ("New text: ")
with fileinput.FileInput(NomeFile, inplace=True, backup='.bak') as file:
for line in file:
print(line.replace(CarOr, CarNew), end='')
file.close ()
I modified Jayram Singh's post slightly in order to replace every instance of a '!' character to a number which I wanted to increment with each instance. Thought it might be helpful to someone who wanted to modify a character that occurred more than once per line and wanted to iterate. Hope that helps someone. PS- I'm very new at coding so apologies if my post is inappropriate in any way, but this worked for me.
f1 = open('file1.txt', 'r')
f2 = open('file2.txt', 'w')
n = 1
# if word=='!'replace w/ [n] & increment n; else append same word to
# file2
for line in f1:
for word in line:
if word == '!':
f2.write(word.replace('!', f'[{n}]'))
n += 1
else:
f2.write(word)
f1.close()
f2.close()
def word_replace(filename,old,new):
c=0
with open(filename,'r+',encoding ='utf-8') as f:
a=f.read()
b=a.split()
for i in range(0,len(b)):
if b[i]==old:
c=c+1
old=old.center(len(old)+2)
new=new.center(len(new)+2)
d=a.replace(old,new,c)
f.truncate(0)
f.seek(0)
f.write(d)
print('All words have been replaced!!!')
I have worked this out as an exercise of a course: open file, find and replace string and write to a new file.
class Letter:
def __init__(self):
with open("./Input/Names/invited_names.txt", "r") as file:
# read the list of names
list_names = [line.rstrip() for line in file]
with open("./Input/Letters/starting_letter.docx", "r") as f:
# read letter
file_source = f.read()
for name in list_names:
with open(f"./Output/ReadyToSend/LetterTo{name}.docx", "w") as f:
# replace [name] with name of the list in the file
replace_string = file_source.replace('[name]', name)
# write to a new file
f.write(replace_string)
brief = Letter()
Like so:
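def find_and_replace(file, word, replacement):
    with open(file, 'r+') as f:
        text = f.read()
        f.seek(0)       # go back to the start before writing
        f.write(text.replace(word, replacement))
        f.truncate()    # cut off leftovers if the new text is shorter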
def findReplace(find, replace):
import os
src = os.path.join(os.getcwd(), os.pardir)
for path, dirs, files in os.walk(os.path.abspath(src)):
for name in files:
if name.endswith('.py'):
filepath = os.path.join(path, name)
with open(filepath) as f:
s = f.read()
s = s.replace(find, replace)
with open(filepath, "w") as f:
f.write(s)

How to use commands from a txt file and store Python outputs in a txt file

Input file looks like this
I am trying to do the following:
1) Take shell commands from a txt file
2) Store the outputs of those commands in another txt file.
But I am not sure how to run those commands and store their outputs.
import os
def read_file(file_name): #file_name must be a string
current_dir_path = os.getcwd() #getting current directory path
reading_file_name = file_name
reading_file_path = os.path.join(current_dir_path, reading_file_name) #file path to read
# Open file
with open(reading_file_path, "r") as f: #"r" for reading
data = f.readlines()
for i in range(len(data)):
data[i] = data[i].replace("\n", "")
return data
This is my function to read given file and return commands as a list of strings. And,
outputs = "?"
def write_file(file_name): #file_name must be a string
current_dir_path = os.getcwd()
writing_file_name = file_name
writing_file_path = os.path.join(current_dir_path, writing_file_name)
# Open file and add
with open(writing_file_path, "w") as f:
f.write(outputs)
There are several functions that I created. Input file contains lines such that,
func1 val1 val2 val3
func3 valx valy valz
func2 val
...
I couldn't figure out how to run the commands I stored in 'data' and store their outcomes WITHOUT USING LIBRARIES other than Python's built-in libraries.
You can store the output of a command using subprocess. You can try the following:
from subprocess import Popen, PIPE
def write_file(command):
    proc = Popen(command, shell=True, stdout=PIPE, stderr=PIPE)
    # communicate() waits for the command to finish and avoids pipe deadlocks
    out, err = proc.communicate()
    with open('output.txt', 'a') as file:
        file.write(out.decode('utf-8'))
# usage example
write_file('echo hi')
# 'hi' will be written to output.txt
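If the lines in the input file name Python functions rather than shell commands, as the question's func1 val1 val2 val3 format suggests, a dispatch table may be closer to what is needed. This is only a sketch: the bodies of func1 and func2 are invented placeholders, and COMMANDS and run_commands are hypothetical names.

```python
def func1(a, b, c):
    # Placeholder body; the real func1 is whatever the asker has defined.
    return "func1:" + ",".join([a, b, c])


def func2(a):
    # Placeholder body.
    return "func2:" + a


# Map the command name found in the file to the function that handles it.
COMMANDS = {"func1": func1, "func2": func2}


def run_commands(lines):
    """Run each 'name arg1 arg2 ...' line and collect the results as strings."""
    results = []
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # skip empty lines
        name, args = parts[0], parts[1:]
        handler = COMMANDS.get(name)
        if handler is None:
            results.append("unknown command: " + name)
        else:
            results.append(str(handler(*args)))
    return results
```

The list returned by the question's read_file() could be passed straight to run_commands(), and "\n".join(results) written out by write_file(). All arguments arrive as strings, so numeric values would need converting inside the functions.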

Function not returning an output file

I have the following python code whose purpose is to remove blank lines from an input text file. It should return an output file with all blank lines removed but it doesn't. What's the bug? Thank you!
import sys
def main():
inputFileName = sys.argv[1]
outputFileName = sys.argv[2]
inputFile = open(inputFileName, "r")
outputFile = open(inputFileName, "w")
for line in inputFile:
if "\n" in line:
removeBlank = line.replace("\n", "")
outputFile.write(removeBlank)
else:
outputFile.write(line)
inputFile.close()
outputFile.close()
main()
Your code has several problems, especially the condition used to detect an empty line. Others have rightly pointed out some of them.
Here is a solution that should work and generate the output file with no empty lines.
import sys
def main():
    inputFileName = sys.argv[1]
    outputFileName = sys.argv[2]
    with open(inputFileName) as inputFile, open(outputFileName, "w") as outputFile:
        for line in inputFile:
            if line.strip() != '':
                outputFile.write(line)
if __name__ == '__main__':
    main()
At present your code appears to truncate its input file immediately after opening it. At best this might give differing results on different platforms. On some platforms the file might be empty. I presume that opening the input file for writing was a typo.
A better way to approach this problem is to use a generator. Also, the correct test for an empty line is line == '\n', not '\n' in line, which will be true for all returned lines except perhaps the last.
def noblanks(file):
    for line in file:
        if line != '\n':
            yield line
You can use this like so:
with open(inputFileName, "r") as inf, open(outputFileName, 'w') as outf:
    for line in noblanks(inf):
        outf.write(line)
The context managers in the with statement will ensure that your files are properly closed without further action on your part.
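A possible variant of the generator, assuming that lines containing only spaces or tabs should count as blank too (an assumption; the question does not say). It is shown on an in-memory stream so it can be tried without any files.

```python
import io


def noblanks(lines):
    """Yield only the lines that contain something other than whitespace."""
    for line in lines:
        if line.strip():
            yield line


# io.StringIO behaves like an open text file, including line iteration.
sample = io.StringIO("first\n\n   \nsecond\nlast")
kept = list(noblanks(sample))
print(kept)  # ['first\n', 'second\n', 'last']
```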

Segmentation Fault

I am using python 2.4.4 (old machine, can't do anything about it) on a UNIX machine. I am extremely new to python/programming and have never used a UNIX machine before. This is what I am trying to do:
extract a single sequence from a FASTA file (proteins + nucleotides) to a temporary text file.
Give this temporary file to a program called 'threader'
Append the output from threader (called tempresult.out) to a file called results.out
Remove the temporary file.
Remove the tempresult.out file.
Repeat using the next FASTA sequence.
Here is my code so far:
import os
from itertools import groupby
input_file = open('controls.txt', 'r')
output_file = open('results.out', 'a')
def fasta_parser(fasta_name):
input = fasta_name
parse = (x[1] for x in groupby(input, lambda line: line[0] == ">"))
for header in parse:
header = header.next()[0:].strip()
seq = "\n".join(s.strip() for s in parse.next())
yield (header, '\n', seq)
parsedfile = fasta_parser(input_file)
mylist = list(parsedfile)
index = 0
while index < len(mylist):
temp_file = open('temp.txt', 'a+')
temp_file.write(' '.join(mylist[index]))
os.system('threader' + ' temp.txt' + ' tempresult.out' + ' structures.txt')
os.remove('temp.txt')
f = open('tempresult.out', 'r')
data = str(f.read())
output_file.write(data)
os.remove('tempresult.out')
index +=1
output_file.close()
temp_file.close()
input_file.close()
When I run this script I get the error 'Segmentation Fault'. From what I gather this is to do with me messing with memory I shouldn't be messing with (???). I assume it is something to do with the temporary files but I have no idea how I would get around this.
Any help would be much appreciated!
Thanks!
Update 1:
Threader works fine when I give it the same sequence multiple times like this:
import os
input_file = open('control.txt', 'r')
output_file = open('results.out', 'a')
x=0
while x<3:
os.system('threader' + ' control.txt' + ' tempresult.out' + ' structures.txt')
f = open('tempresult.out', 'r')
data = str(f.read())
output_file.write(data)
os.remove('result.out')
x += 1
output_file.close()
input_file.close()
Update 2: In the event that someone else gets this error. I forgot to close temp.txt before invoking the threader program.
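The fix from Update 2 can be sketched as follows, assuming a modern Python 3 (3.7 or later) rather than the 2.4 interpreter from the question. The function name run_on_record is hypothetical, and the sketch assumes the external program takes the file name as its last argument and writes its result to stdout, which is not how threader is invoked above. The important part is that the temporary file is closed, and therefore flushed to disk, before the external program starts.

```python
import os
import subprocess
import tempfile


def run_on_record(record_text, command):
    """Write record_text to a temp file, close it, then run command on that file."""
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
        tmp.write(record_text)
    # Leaving the with block closed the file, so its content is on disk
    # before the external program opens it.
    try:
        completed = subprocess.run(
            list(command) + [tmp.name],
            capture_output=True,
            text=True,
            check=True,
        )
        return completed.stdout
    finally:
        os.remove(tmp.name)
```

On an old interpreter the same idea applies to the original loop: call temp_file.close() right after temp_file.write(...) and before os.system(...).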

Using Argparse to create file converter in Python

I have to use the command prompt and Python to receive an input in the form of a CSV file, then read it and convert it into an XML file with the same name as the CSV file but with the .xml extension, unless the user sets the output file name and path using the -o --output optional command line argument. I have searched on Google for days. So far my program accepts command line arguments and I can convert the CSV to an XML file, but it doesn't write it using the same name as the CSV file or the name the user sets. Instead it just writes out a blank file. Here is my code:
import sys, argparse
import csv
import indent
from xml.etree.ElementTree import ElementTree, Element, SubElement, Comment, tostring
parser=argparse.ArgumentParser(description='Convert wordlist text files to various formats.', prog='Text Converter')
parser.add_argument('-v','--verbose',action='store_true',dest='verbose',help='Increases messages being printed to stdout')
parser.add_argument('-c','--csv',action='store_true',dest='readcsv',help='Reads CSV file and converts to XML file with same name')
parser.add_argument('-x','--xml',action='store_true',dest='toxml',help='Convert CSV to XML with different name')
parser.add_argument('-i','--inputfile',type=argparse.FileType('r'),dest='inputfile',help='Name of file to be imported',required=True)
parser.add_argument('-o','--outputfile',type=argparse.FileType('w'),dest='outputfile',help='Output file name')
args = parser.parse_args()
def main(argv):
reader = read_csv()
if args.verbose:
print ('Verbose Selected')
if args.toxml:
if args.verbose:
print ('Convert to XML Selected')
generate_xml(reader)
if args.readcsv:
if args.verbose:
print ('Reading CSV file')
read_csv()
if not (args.toxml or args.readcsv):
parser.error('No action requested')
return 1
def read_csv():
with open ('1250_12.csv', 'r') as data:
return list(csv.reader(data))
def generate_xml(reader):
root = Element('Solution')
root.set('version','1.0')
tree = ElementTree(root)
head = SubElement(root, 'DrillHoles')
head.set('total_holes', '238')
description = SubElement(head,'description')
current_group = None
i = 0
for row in reader:
if i > 0:
x1,y1,z1,x2,y2,z2,cost = row
if current_group is None or i != current_group.text:
current_group = SubElement(description, 'hole',{'hole_id':"%s"%i})
collar = SubElement (current_group, 'collar',{'':', '.join((x1,y1,z1))}),
toe = SubElement (current_group, 'toe',{'':', '.join((x2,y2,z2))})
cost = SubElement(current_group, 'cost',{'':cost})
i+=1
indent.indent(root)
tree.write(open('hole.xml','w'))
if (__name__ == "__main__"):
sys.exit(main(sys.argv))
You can ignore the details of the generate_xml() function, since it expects CSV files formatted a certain way. However, I think the problem lies in tree.write(), since that part generates the XML file with a name that is hard-coded rather than taken from the arguments at the command prompt.
You need to pass a file argument to generate_xml(). You appear to have the output file in args.outputfile.
generate_xml(reader, args.outputfile)
...
def generate_xml(reader, outfile):
    ...
    tree.write(outfile)
You should probably also make use of args.inputfile:
reader = read_csv(args.inputfile)
...
def read_csv(inputfile):
    return list(csv.reader(inputfile))
And this line does not do anything useful, it processes the .csv file, but doesn't do anything with the results:
read_csv()
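The requirement from the question, the same name as the CSV file with an .xml extension unless -o is given, can be handled when parsing the arguments. This is a sketch under the assumption that plain file names are passed instead of argparse.FileType objects, so that the output name can be derived from the input name; parse_args is a hypothetical helper.

```python
import argparse
import os


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Convert a CSV file to XML.")
    parser.add_argument("-i", "--inputfile", required=True, help="CSV file to read")
    parser.add_argument("-o", "--outputfile",
                        help="XML file to write (default: input name with .xml)")
    args = parser.parse_args(argv)
    if args.outputfile is None:
        # Swap the input file's extension for .xml.
        args.outputfile = os.path.splitext(args.inputfile)[0] + ".xml"
    return args
```

With this, -i data.csv would give data.xml as the output name, and -i data.csv -o result.xml would give result.xml. The resulting names can then be opened and passed on to read_csv() and generate_xml().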
The following code has been adapted from FB36's recipe on code.activestate.com
It will do what you need and you don't have to worry about the headers in the csv file, though there should only be one header (the first row) in the csv file. Have a look at the bottom of this page if you want to do batch conversion.
'''Convert csv to xml file
csv2xml.py takes two arguments:
1. csvFile: name of the csv file (may need to specify path to file)
2. xmlFile: name of the desired xml file (path to destination can be specified)
If only the csv file is provided, its name is used for the xml file.
Command line usage:
example1: python csv2xml.py 'fileName.csv' 'desiredName.xml'
example2: python csv2xml.py '/Documents/fileName.csv' '/NewFolder/desiredName.xml'
example3: python csv2xml.py 'fileName.csv'
This code has been adapted from: http://code.activestate.com/recipes/577423/
'''
import csv
def converter(csvFile, xmlFile):
csvData = csv.reader(open(csvFile))
xmlData = open(xmlFile, 'w')
xmlData.write('<?xml version="1.0"?>' + "\n")
# there must be only one top-level tag
xmlData.write('<csv_data>' + "\n")
rowNum = 0
for row in csvData:
if rowNum == 0:
tags = row
# replace spaces w/ underscores in tag names
for i in range(len(tags)):
tags[i] = tags[i].replace(' ', '_')
else:
xmlData.write('<row>' + "\n")
for i in range(len(tags)):
xmlData.write(' ' + '<' + tags[i] + '>' \
+ row[i] + '</' + tags[i] + '>' + "\n")
xmlData.write('</row>' + "\n")
rowNum +=1
xmlData.write('</csv_data>' + "\n")
xmlData.close()
## for using csv2xml.py from the command line
if __name__ == '__main__':
import sys
if len(sys.argv)==2:
import os
csvFile = sys.argv[1]
xmlFile = os.path.splitext(csvFile)[0] + '.xml'
converter(csvFile,xmlFile)
elif len(sys.argv)==3:
csvFile = sys.argv[1]
xmlFile = sys.argv[2]
converter(csvFile,xmlFile)
else:
print __doc__
