Read and write from pigz and subprocess with python3

I am trying to use the Linux pigz command to speed up file compression and decompression. I managed to open a file with pigz through subprocess.Popen(), but after several attempts I still cannot read the stream from Popen(), modify some lines, and write the result directly to a new file, again through pigz and subprocess. For now I write the new file with gzip.open() from the gzip library, and the whole process is as slow as reading and writing directly with gzip.open().
Question:
In the following code, is there a way to modify the data coming out of the subprocess and write it directly to a compressed file using subprocess and pigz, in order to speed up the whole operation?
import gzip
import shlex
import sys
from subprocess import Popen, PIPE

inputFile = "file1.txt.gz"
outputFile = "file2.txt.gz"

def pigzStream2(inputFile, outputFile):
    # -d: decompress, -k: keep the original file, -c: write output to stdout
    cmd = f'pigz -dkc {inputFile} '
    if not sys.platform.startswith("win"):
        cmd = shlex.split(cmd)
    res = Popen(cmd, stdout=PIPE, stdin=PIPE, bufsize=1, text=True)
    with res.stdout as f_in:
        with gzip.open(outputFile, 'ab') as f_out:
            count = 0
            while True:
                count += 1
                line = f_in.readline()
                if line.startswith('#'):
                    line = f"line {count} changed"
                if not line:
                    print(count)
                    break
                f_out.write(line.encode())
    return 0
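One possible approach, sketched under assumptions rather than offered as a definitive answer: start a second pigz process for compression and feed the modified lines into its stdin, so that both ends run outside Python. The names find_compressor, rewrite_compressed and mark_comments are hypothetical. The fallback to gzip is an assumption added so the sketch still runs where pigz is not installed, and the -k flag is omitted because -c already leaves the input file in place.

```python
import shutil
import subprocess


def find_compressor():
    """Prefer pigz; fall back to gzip, which accepts the same -d and -c flags."""
    tool = shutil.which("pigz") or shutil.which("gzip")
    if tool is None:
        raise FileNotFoundError("neither pigz nor gzip was found on PATH")
    return tool


def rewrite_compressed(input_file, output_file, transform):
    """Decompress input_file, pass each line through transform, recompress."""
    tool = find_compressor()
    count = 0
    with open(output_file, "wb") as raw_out:
        # Reader: decompress to a pipe. Writer: compress its stdin into raw_out.
        reader = subprocess.Popen([tool, "-dc", input_file],
                                  stdout=subprocess.PIPE, text=True)
        writer = subprocess.Popen([tool, "-c"], stdin=subprocess.PIPE,
                                  stdout=raw_out, text=True)
        with reader, writer:
            for count, line in enumerate(reader.stdout, start=1):
                writer.stdin.write(transform(count, line))
        # Leaving the with-block closes both pipes and waits for both processes.
    if reader.returncode != 0 or writer.returncode != 0:
        raise RuntimeError("compressor exited with a non-zero status")
    return count


def mark_comments(count, line):
    """Example transform: replace lines starting with '#', keep the newline."""
    return f"line {count} changed\n" if line.startswith("#") else line
```

A call such as rewrite_compressed("file1.txt.gz", "file2.txt.gz", mark_comments) would then return the number of lines processed. The per-line work still happens in Python, so any speed-up depends on how much of the total time was spent in compression.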

Related

Unable to read file with python

I'm trying to read the content of a file with python 3.8.5 but the output is empty, I don't understand what I'm doing wrong.
Here is the code:
import subprocess
import os
filename = "ls.out"
ls_command = "ls -la"
file = open(filename, "w")
subprocess.Popen(ls_command, stdout=file, shell=True)
file.close()
# So far, all is OK. The file "ls.out" is correctly created and filled with the output of the "ls -la" command.
file = open(filename, "r")
for line in file:
    print(line)
file.close()
The output of this script is empty; it doesn't print anything. I'm not able to see the content of ls.out.
What is wrong here?

Popen creates a new process and launches it but returns immediately. So the end result is that you've forked your code and have both processes running at once. Your Python code is executing faster than ls can start and finish. Thus, you need to wait for the process to finish by adding a call to wait():
import subprocess
import os
filename = "ls.out"
ls_command = "ls -la"
file = open(filename, "w")
proc = subprocess.Popen(ls_command, stdout=file, shell=True)
proc.wait()
file.close()
file = open(filename, "r")
for line in file:
    print(line)
file.close()

Popen merely starts the subprocess. Chances are the file is not yet populated when you open it.
If you want to wait for the Popen object to finish, you have to call its wait method, etc; but a much better and simpler solution is to use subprocess.check_call() or one of the other higher-level wrappers.
If the command prints to standard output, why don't you read it directly?
import subprocess
import shlex
ls_command = "ls -la"
result = subprocess.run(
    shlex.split(ls_command),  # avoid shell=True
    check=True, text=True, capture_output=True)
line = result.stdout
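If the output still has to end up in a file, the higher-level wrappers mentioned above block until the command exits, so the file is complete by the time it is reopened. A minimal sketch, assuming a hypothetical helper named run_to_file:

```python
import subprocess


def run_to_file(cmd, path):
    """Run cmd with stdout redirected to path, then return the file's contents."""
    with open(path, "w") as out:
        # check_call waits for the command and raises CalledProcessError
        # if it exits with a non-zero status.
        subprocess.check_call(cmd, stdout=out)
    with open(path) as f:
        return f.read()
```

For the example in the question, this could be called as run_to_file(["ls", "-la"], "ls.out").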

Redirecting stdout to Terminal AND a log .txt file with Python Subprocess.run()

I am trying to run a command and redirect its output to a .txt file while still seeing it in the terminal, using subprocess.run(). I previously used 2>&1 | file.txt to accomplish this, but I would like to get the same behavior with subprocess.run() and shell = False.
I am currently able to redirect stdout to a .txt file successfully, but I would like to see it in the terminal as well. Is there a way to accomplish this? I am on Python 3.6.
with open(model_dest_dir + 'Deepspeech_progress.txt', 'w') as f:
    train_model = subprocess.run(train_cmds, shell = False, cwd = '/home/', env = export_dict, stdout = f)
    #for stdout_line in iter(train_model.stdout.readline, b''):
    #    print(stdout_line)

Looks like I have a working solution albeit not with subprocess.run(). I had to use Popen like so:
# train the model and write logs to .txt file
with open(model_dest_dir + 'Deepspeech_progress.txt', 'w') as f:
    train_model_proc = subprocess.Popen(train_cmds, cwd = '//DeepSpeech/', env = export_dict, stdout = subprocess.PIPE, stderr = subprocess.STDOUT, universal_newlines = True)
    for line in train_model_proc.stdout:
        sys.stdout.write(line)
        f.write(line)
    train_model_proc.wait() # wait for Popen to finish
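The same pattern can be wrapped in a small reusable function. This is only a sketch: the name run_and_tee is hypothetical, and it assumes the command writes line-oriented text.

```python
import subprocess
import sys


def run_and_tee(cmd, log_path):
    """Echo the command's combined stdout/stderr to the terminal and to log_path."""
    with open(log_path, "w") as log, subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into stdout, like 2>&1
        universal_newlines=True,   # text mode; also works on Python 3.6
    ) as proc:
        for line in proc.stdout:
            sys.stdout.write(line)
            log.write(line)
    # Leaving the with-block waits for the process, so returncode is set here.
    return proc.returncode
```

Returning the exit status lets the caller decide how to react to a failed run, which subprocess.run() would otherwise report through check=True.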

Writing to a file and reading it from a subprocess in python?

I'm creating a text file and, immediately afterwards, calling a subprocess that does some computation based on that file.
When I call the subprocess by itself, it reads from the file as expected, but when I create the file and write to it immediately beforehand, the subprocess is not able to read from it.
f = open('name_data.txt', 'w')
f.write(name)
f.close()
cmd = ['g2p-seq2seq', '--decode', 'name_data.txt', '--model', 'g2p-seq2seq-cmudict']
process = subprocess.Popen(cmd, stdout=subprocess.PIPE)
process.wait()
#etc....

import subprocess
# Close the file before starting the command so everything is flushed to disk
with open("Edited.py", "w") as f:
    f.write("Thing To Write")
A = subprocess.Popen('Command you want to call', shell = True, stdout = subprocess.PIPE, stderr = subprocess.PIPE)
print(A.communicate())
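To check that a child process really sees a freshly written file, the pattern can be tried with a stand-in command. The sketch below is an assumption-laden illustration: write_then_process is a hypothetical name, and the child is simply the Python interpreter upper-casing the file, not the g2p-seq2seq tool from the question.

```python
import subprocess
import sys


def write_then_process(path, text):
    """Write text to path, then let a child process read and transform it."""
    with open(path, "w") as f:
        f.write(text)
    # The file is closed (and flushed) here, before the child starts.
    child_code = "import sys; print(open(sys.argv[1]).read().upper(), end='')"
    result = subprocess.run(
        [sys.executable, "-c", child_code, path],
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,  # raise CalledProcessError if the child fails
    )
    return result.stdout
```

If a version like this works while the real command does not, the cause is more likely the command's own arguments or working directory than the timing of the write.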

Module require to be run two times to produce result

I am trying to display output from a system command, but my script produces the result only when I run it twice. Below is the script. Using subprocess.Popen in both places does not produce any output, and neither does subprocess.call.
#!/usr/bin/env python
import subprocess
import re
contr = 0
spofchk='su - dasd -c "java -jar /fisc/dasd/bin/srmclient.jar -spof_chk"'
res22 = subprocess.call("touch /tmp/logfile",shell=True,stdout=subprocess.PIPE)
fp = open("/tmp/logfile","r+")
res6 =subprocess.Popen(spofchk,shell=True,stdout=fp)
fil_list=[]
for line in fp:
    line = line.strip()
    fil_list.append(line)
fp.close()
for i in fil_list[2:]:
    if contr % 2 == 0:
        if 'no SPOF' in i:
            flag=0
            #print(flag)
            #print(i)
        else:
            flag = 1
    else:
        continue
    #Incrementing the counter by 2 so that we will only read line with spof and no SPOF
    contr+=2

The child process has its own file descriptor and therefore you may close the file in the parent as soon as the child process is started.
To read the whole child process' output that is redirected to a file, wait until it exits:
import subprocess
with open('logfile', 'wb', 0) as file:
    subprocess.check_call(command, stdout=file)
with open('logfile') as file:
    contents = file.read()  # read file here...
If you want to consume the output while the child process is running, use PIPE:
#!/usr/bin/env python3
from subprocess import Popen, PIPE
with Popen(command, stdout=PIPE) as process, open('logfile', 'wb') as file:
    for line in process.stdout: # read b'\n'-separated lines
        # handle line...
        # copy it to file
        file.write(line)
Here's a version for older Python versions and links to fix other possible issues.

Since subprocess opens a new shell, the first run cannot create the file and write the output of another subprocess to it at the same time.
So the only solution for this is to use os.system.

Outputting the results of os.popen()

I am trying to send the results of os.popen() to an output file. Here is the code I have been trying
import os
cmd = 'dir'
fp = os.popen(cmd)
print(fp.read())  # Prints the results to the screen
res = fp.read()
fob = open('popen_output.txt','w')
fob.write(res)
fob.close()
fp.close()
The output file is just blank. The results of the command are however displayed on screen. I have also tried using Popen like this (as per the subprocess management documentation):
import subprocess
fob = subprocess.Popen('dir',stdout='popen_output.txt',shell=True).stdout
As well as:
import subprocess
subprocess.Popen('dir',stdout='popen_output.txt',shell=True)

Pass a file object to stdout, not the file name as a string. You can also use check_call in place of Popen, which will raise a CalledProcessError for a non-zero exit status:
with open('popen_output.txt',"w") as f:
    subprocess.check_call('dir',stdout=f)
If you are on Windows, use subprocess.check_call('dir',stdout=f, shell=True). You could also redirect with > when using shell=True:
subprocess.check_call('dir > popen_output.txt',shell=True)

Ok. This got it going. Thanks for the help!
fob = open('popen_output.txt','a')
subprocess.Popen('dir',stdout=fob,shell=True)
fob.close()

This seems to be more what you'd like to do. You can process each line, then write it to the file.
process = subprocess.Popen(cmd, stdout=subprocess.PIPE, shell=True)
# file: a file object already opened for writing in binary mode ('wb')
for line in process.stdout:
    #processing then write line to file...
    file.write(line)
If you don't want to process, then you could just do it in your subprocess call.
subprocess.run('dir > popen_output.txt', shell=True)

The problem is you are calling fp.read() twice instead of saving the results of a single fp.read() call as res, printing res, and writing the res to the output file. A file handle is stateful, so if you call read on it twice, the current position after the first call will be at the end of the file/stream, hence your blank file.
Try this (just providing the relevant changes):
fp = os.popen(cmd)
res = fp.read()
print(res)
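Put together, the single-read version could look like the sketch below. The function name save_command_output is hypothetical; the rest follows the question's own approach with os.popen().

```python
import os


def save_command_output(cmd, path):
    """Run cmd through os.popen, print its output, and save it to path."""
    with os.popen(cmd) as fp:
        res = fp.read()  # read the stream exactly once and keep the result
    print(res, end="")
    with open(path, "w") as fob:
        fob.write(res)
    return res
```

With the question's values this would be called as save_command_output('dir', 'popen_output.txt').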
