I would like to run these three bash commands in Python:
sed $'s/\r//' -i filename
sed -i 's/^ *//; s/ *$//; /^$/d' filename
awk -F, 'NF==10' filename > temp_file && mv temp_file filename
I wrote the following code:
cmd_1 = ["sed $'s/\r//' -i", file]
cmd_2 = ["sed -i 's/^ *//; s/ *$//; /^$/d'", file]
cmd_3 = ["awk -F, 'NF==10'", file, "> tmp_file && mv tmp_file", file]
subprocess.run(cmd_1)
subprocess.run(cmd_2)
subprocess.run(cmd_3)
But I'm getting this error:
FileNotFoundError: [Errno 2] No such file or directory: "sed $'s/\r//' -i": "sed $'s/\r//' -i"
What am I getting wrong?
If you provide the command as a list, then each argument should be a separate list member. Therefore:
cmd_1 = ["sed" r"s/\r//", "-i", file]
cmd_2 = ["sed" "-i" "s/^ *//; s/ *$//; /^$/d", file]
subprocess.run(cmd_1)
subprocess.run(cmd_2)
The last command requires the > and && operators, which are provided by the shell, so you will also need to specify shell=True and pass the command as a single string:
cmd_3 = f"awk -F, 'NF==10' '{file}' > temp_file && mv temp_file '{file}'"
subprocess.run(cmd_3, shell=True)
Alternatively, you can use the shell=True parameter; in that case the command must be a single string (exactly as you would type it at the prompt), not a list.
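Putting it together, a minimal sketch of all three original commands run through the shell (this assumes file holds the path, and uses shlex.quote() to guard against spaces or shell metacharacters in the name):
import shlex
import subprocess

file = "filename"  # assumed: the path to your data file
quoted = shlex.quote(file)

# Python's "\r" already produces a literal carriage return, so the bash-only
# $'...' quoting from the original command is not needed here.
subprocess.run(f"sed -i 's/\r//' {quoted}", shell=True, check=True)
subprocess.run(f"sed -i 's/^ *//; s/ *$//; /^$/d' {quoted}", shell=True, check=True)
subprocess.run(f"awk -F, 'NF==10' {quoted} > temp_file && mv temp_file {quoted}", shell=True, check=True)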
I'm using python subprocess to unzip a zip archive. My code is as below:
subprocess.Popen(['unzip', '{}.zip'.format(inputFile), '-d', output_directory])
Is there an unzip option to remove the zip source file after unzipping it? If not, how can I pipe an rm to subprocess.Popen while making sure it waits for the file to finish unzipping first?
Thanks.
You could use && in the shell, which executes the second command only if the first one succeeded:
import subprocess
import os
values = {'zipFile': '/tmp/simple-grid.zip', 'outDir': '/tmp/foo'}
command = 'unzip {zipFile} -d {outDir} && rm {zipFile}'.format(**values)
proc = subprocess.Popen(command, shell=True)
_ = proc.communicate()
print('Success' if proc.returncode == 0 else 'Error')
Or, call os.remove() from Python if unzip succeeded:
inputFile = values['zipFile']
output_directory = values['outDir']
proc = subprocess.Popen(
    ['unzip', '{}'.format(inputFile), '-d', output_directory]
)
_ = proc.communicate()  # communicate blocks!
if proc.returncode == 0:
    os.remove(values['zipFile'])
print('Success' if not os.path.exists(inputFile) else 'Error')
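If you don't need Popen specifically, subprocess.run() gives a slightly shorter sketch of the same idea: it blocks until unzip finishes, and check=True makes it raise on failure, so the archive is only removed after a successful extraction.
import os
import subprocess

inputFile = '/tmp/simple-grid.zip'   # same sample paths as above
output_directory = '/tmp/foo'

# run() waits for unzip to exit; check=True raises CalledProcessError on failure
subprocess.run(['unzip', inputFile, '-d', output_directory], check=True)
os.remove(inputFile)
print('Success' if not os.path.exists(inputFile) else 'Error')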
I'm trying to combine a video (with no sound) and its separate audio file.
I've tried ffmpeg -i video.mp4 -i audio.mp4 -c copy output.mp4 on the command line
and it works fine.
I'm trying to achieve the same output from ffmpeg-python but with no luck. Any help on how to do this?
I had the same problem.
Here is the Python code, after you have run pip install ffmpeg-python in your environment:
import ffmpeg
input_video = ffmpeg.input('./test/test_video.webm')
input_audio = ffmpeg.input('./test/test_audio.webm')
ffmpeg.concat(input_video, input_audio, v=1, a=1).output('./processed_folder/finished_video.mp4').run()
v=1: Set the number of output video streams, that is also the number of video streams in each segment. Default is 1.
a=1: Set the number of output audio streams, that is also the number of audio streams in each segment. Default is 0.
For the details of ffmpeg.concat, check out: https://ffmpeg.org/ffmpeg-filters.html#concat.
You can check out more examples here: https://github.com/kkroening/ffmpeg-python/issues/281
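If you want to double-check what ffmpeg-python is actually going to run, the assembled stream can also report its generated argument list; a quick sketch reusing the same inputs (get_args() is part of the ffmpeg-python API as far as I know, but treat this as an illustration rather than the canonical way):
import ffmpeg

input_video = ffmpeg.input('./test/test_video.webm')
input_audio = ffmpeg.input('./test/test_audio.webm')
out = ffmpeg.concat(input_video, input_audio, v=1, a=1).output('./processed_folder/finished_video.mp4')

# Print the argument list that will be passed to the ffmpeg binary, then run it.
print(out.get_args())
out.run()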
Hope this helps!
PS. If you are using macOS and get the error FileNotFoundError: [Errno 2] No such file or directory: 'ffmpeg' while running the code, just run brew install ffmpeg in your terminal.
You could use subprocess. When the command is passed as a single string like this, you also need shell=True (otherwise split it into a list of arguments):
import subprocess
subprocess.run("ffmpeg -i video.mp4 -i audio.mp4 -c copy output.mp4", shell=True)
You can also use f-strings to insert variable values into the command:
videofile = "video.mp4"
audiofile = "audio.mp4"
outputfile = "output.mp4"
codec = "copy"
subprocess.run(f"ffmpeg -i {videofile} -i {audiofile} -c {codec} {outputfile}")
import ffmpeg
input_video = ffmpeg.input("../resources/video_with_audio.mp4")
added_audio = ffmpeg.input("../resources/dance_beat.ogg").audio.filter('adelay', "1500|1500")
merged_audio = ffmpeg.filter([input_video.audio, added_audio], 'amix')
(ffmpeg
.concat(input_video, merged_audio, v=1, a=1)
.output("mix_delayed_audio.mp4")
.run(overwrite_output=True))
You can review this link: https://github.com/kkroening/ffmpeg-python/issues/281#issuecomment-546724993
Added the following code:
https://github.com/russellstrei/combineViaFFMPEG
It walks the directory, finds ".mp4" files, adds each one to the list file used by ffmpeg, then executes the command (the snippet below assumes directory and processList point to your own folder and list file):
import glob
import os

directory = "videos"       # assumed: folder containing the .mp4 files
processList = "list.txt"   # assumed: the list file consumed by ffmpeg's concat demuxer

def execute():
    cmd = "ffmpeg -f concat -safe 0 -i " + processList + " -c copy " + directory + "\\output.mp4"
    os.system(cmd)

for name in glob.glob(directory + "\\" + '*.mp4'):
    print(name)
    file1 = open(processList, "a")  # append mode
    file1.write("file '" + name + "'\n")
    file1.close()

execute()
There are a few Wireshark .pcap files. I need to separate each .pcap into incoming and outgoing traffic (by giving source and destination MAC addresses), and these separated files have to be written into two different folders, namely Incoming and Outgoing. The output files (the files separated as incoming and outgoing) have to get the same names as the input files and need to be written as .csv files. I tried the code below, but it is not working. Any help is greatly appreciated. Thanks
import os
import csv
startdir= '/root/Desktop/Test'
suffix= '.pcap'
for root, dirs, files in os.walk(startdir):
    for name in files:
        if name.endswith(suffix):
            filename = os.path.join(root, name)
            cmdOut = 'tshark -r "{}" -Y "wlan.sa==00:00:00:00:00:00 && wlan.da==11:11:11:11:11:11" -T fields -e frame.time_delta_displayed -e frame.len -E separator=, -E header=y > "{}"'.format(filename, filename)
            cmdIn = 'tshark -r "{}" -Y "wlan.sa==11:11:11:11:11:11 && wlan.da==00:00:00:00:00:00" -T fields -e frame.time_delta_displayed -e frame.len -E separator=, -E header=y > "{}"'.format(filename, filename)
            #os.system(cmd1)
            #os.system(cmd2)
            with open('/root/Desktop/Incoming/', 'w') as csvFile:
                writer = csv.writer(csvFile)
                writer.writerows(os.system(cmdIn))
            with open('/root/Desktop/Outgoing/', 'w') as csvFile:
                writer = csv.writer(csvFile)
                writer.writerows(os.system(cmdOut))
            csvFile.close()
A correct implementation might look more like:
import csv
import os
import subprocess
startdir = 'in.d' # obviously, people other than you won't have /root/Desktop/test
outdir = 'out.d'
suffix = '.pcap'
def decode_to_file(cmd, in_file, new_suffix):
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    fileName = outdir + '/' + in_file[len(startdir):-len(suffix)] + new_suffix
    os.makedirs(os.path.dirname(fileName), exist_ok=True)
    csv_writer = csv.writer(open(fileName, 'w'))
    for line_bytes in proc.stdout:
        line_str = line_bytes.decode('utf-8')
        csv_writer.writerow(line_str.strip().split(','))
for root, dirs, files in os.walk(startdir):
    for name in files:
        if not name.endswith(suffix):
            continue
        in_file = os.path.join(root, name)
        cmdCommon = [
            'tshark', '-r', in_file,
            '-T', 'fields',
            '-e', 'frame.time_delta_displayed',
            '-e', 'frame.len',
            '-E', 'separator=,',
            '-E', 'header=y',
        ]
        decode_to_file(
            cmd=cmdCommon + ['-Y', 'wlan.sa==00:00:00:00:00:00 && wlan.da==11:11:11:11:11:11'],
            in_file=in_file,
            new_suffix='.out.csv'
        )
        decode_to_file(
            cmd=cmdCommon + ['-Y', 'wlan.sa==11:11:11:11:11:11 && wlan.da==00:00:00:00:00:00'],
            in_file=in_file,
            new_suffix='.in.csv'
        )
Note:
We don't use os.system(). (This would never have worked anyway, since it returns a numeric exit status, not strings in a format you can write to a CSV file.)
We don't need to generate any temporary files; we can read directly into our Python code from the stdout of the tshark subprocess.
We construct our output file name by modifying the input file name (replacing its extension with .out.csv and .in.csv, respectively).
Because writerow() requires an iterable of fields, we generate one by splitting each output line on commas.
Note that I'm not completely clear why you wanted to use the Python CSV module at all, since the fields output appears to already be CSV, so one could also just redirect the output straight to a file with no other processing.
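For completeness, here is a sketch of that simpler variant: skip the csv module and point tshark's stdout (already comma-separated thanks to -E separator=,) straight at the output file. The helper name is only illustrative:
import subprocess

def write_tshark_csv(cmd, out_path):
    # tshark's "-T fields -E separator=, -E header=y" output is already CSV,
    # so its stdout can be streamed directly into the target file.
    with open(out_path, 'w') as out_file:
        subprocess.run(cmd, stdout=out_file, check=True)

# e.g. write_tshark_csv(cmdCommon + ['-Y', 'wlan.sa==00:00:00:00:00:00 && wlan.da==11:11:11:11:11:11'], 'out.d/example.out.csv')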
I'm relatively new to programming. I have a folder, with subfolders, containing several thousand HTML files that are generically named, i.e. 1006.htm, 1007.htm, that I would like to rename using the <title> tag from within each file.
For example, if file 1006.htm contains <title>Page Title</title>, I would like to rename it Page Title.htm. Ideally spaces are replaced with dashes.
I've been working in the shell with a bash script with no luck. How do I do this, with either bash or python?
This is what I have so far:
#!/usr/bin/env bash
FILES=/Users/Ben/unzipped/*
for f in $FILES
do
if [ ${FILES: -4} == ".htm" ]
then
awk 'BEGIN{IGNORECASE=1;FS="<title>|</title>";RS=EOF} {print $2}' $FILES
fi
done
I've also tried
#!/usr/bin/env bash
for f in *.html;
do
title=$( grep -oP '(?<=<title>).*(?=<\/title>)' "$f" )
mv -i "$f" "${title//[^a-zA-Z0-9\._\- ]}".html
done
But I get an error from the terminal explaining how to use grep...
Use awk instead of grep in your bash script and it should work:
#!/bin/bash
for f in *.html;
do
title=$( awk 'BEGIN{IGNORECASE=1;FS="<title>|</title>";RS=EOF} {print $2}' "$f" )
mv -i "$f" "${title//[^a-zA-Z0-9\._\- ]}".html
done
don't forget to change your bash env on the first line ;)
EDIT: full answer with all the modifications:
#!/bin/bash
for f in `find . -type f | grep \.html`
do
title=$( awk 'BEGIN{IGNORECASE=1;FS="<title>|</title>";RS=EOF} {print $2}' "$f" )
mv -i "$f" "${title//[ ]/-}".html
done
Here is a Python script I just wrote:
import os
import re
from lxml import etree

class MyClass(object):
    def __init__(self, dirname=''):
        self.dirname = dirname
        self.exp_title = "<title>(.*)</title>"
        self.re_title = re.compile(self.exp_title)

    def rename(self):
        for afile in os.listdir(self.dirname):
            originfile = os.path.join(self.dirname, afile)
            if os.path.isfile(originfile):
                with open(originfile, 'rb') as fp:
                    contents = fp.read()
                try:
                    html = etree.HTML(contents)
                    title = html.xpath("//title")[0].text
                except Exception:
                    try:
                        # fall back to a regex; decode first, since the pattern is a str
                        title = self.re_title.findall(contents.decode('utf-8', 'ignore'))[0]
                    except Exception:
                        title = ''
                if title:
                    # keep the .htm extension on the renamed file
                    newfile = os.path.join(self.dirname, title + '.htm')
                    os.rename(originfile, newfile)
>>> test = MyClass('/path/to/your/dir')
>>> test.rename()
You want to use an HTML parser (like lxml.html) to parse your HTML files. Once you've got that, retrieving the title tag is one line (for example, page.find(".//title").text).
Translating that to a file name and renaming the document should be trivial.
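A minimal sketch of that approach, assuming the files parse cleanly and actually contain a <title> element (rename_by_title is just an illustrative helper name):
import os
import lxml.html

def rename_by_title(path):
    # Parse the file and take the text of its <title> element.
    doc = lxml.html.parse(path)
    title = doc.find('.//title').text.strip()
    # Turn the title into a file name: spaces become dashes, keep the .htm suffix.
    new_name = title.replace(' ', '-') + '.htm'
    os.rename(path, os.path.join(os.path.dirname(path), new_name))

rename_by_title('1006.htm')  # would become e.g. Page-Title.htm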
A python3 recursive globbing version that does a bit of title sanitising before renaming.
import re
from pathlib import Path
import lxml.html
root = Path('.')
for path in root.rglob("*.html"):
    soup = lxml.html.parse(path)
    title_els = soup.xpath('/html/head/title')
    if len(title_els):
        title = title_els[0].text
        if title:
            print(f'Original title {title}')
            name = re.sub(r'[^\w\s-]', '', title.lower())
            name = re.sub(r'[\s]+', '-', name)
            new_path = (path.parent / name).with_suffix(path.suffix)
            if not Path(new_path).exists():
                print(f'Renaming [{path.absolute()}] to [{new_path}]')
                path.rename(new_path)
            else:
                print(f'{new_path.name} already exists!')