I have to use the bash command below in a Python script; it chains several pipes with grep and other commands.
grep name | cut -d':' -f2 | tr -d '"'| tr -d ','
I tried to do the same using the subprocess module but didn't succeed.
Can anyone help me run the above command in a Python 3 script?
I have to get the below output from a file file.txt.
Tom
Jack
file.txt contains:
"name": "Tom",
"Age": 10
"name": "Jack",
"Age": 15
Actually, I want to know how I can run the bash command below using Python.
cat file.txt | grep name | cut -d':' -f2 | tr -d '"'| tr -d ','
This works without having to use the subprocess library or any other os cmd related library, only Python.
with open("./file.txt") as my_file:
    for line in my_file:
        line_array = line.split()
        # Matching lines look like: "name": "Tom",
        if len(line_array) > 1 and line_array[0] == '"name":':
            print(line_array[1].replace('"', '').replace(',', ''))
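If shelling out is still preferred, the original pipeline can also be handed to a shell through subprocess. The following is a minimal sketch, assuming grep, cut and tr are available on the PATH; the function name names_via_shell is hypothetical and not part of the question.

```python
import shlex
import subprocess


def names_via_shell(path):
    """Run the grep | cut | tr pipeline in a shell and return the cleaned names."""
    # shlex.quote protects against spaces or shell metacharacters in the path.
    command = (
        f"grep name {shlex.quote(path)} | cut -d':' -f2 | tr -d '\"' | tr -d ','"
    )
    result = subprocess.run(command, shell=True, capture_output=True, text=True)
    # cut keeps the blank that followed the colon, so strip every line.
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]
```

shell=True is what makes the pipe characters work here; without a shell, each stage would have to be started separately and connected by hand.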
If you are not trying to parse a JSON file or another structured format (for which a proper parser would be the best approach), just change your command to:
grep -oP '(?<="name":[[:blank:]]").*(?=",)' file.txt
You do not need any pipe at all.
This will give you the output:
Tom
Jack
Explanations:
-P activates Perl-compatible regex (GNU grep), which is needed for lookahead/lookbehind
-o outputs only the matching string, not the whole line
Regex used: (?<="name":[[:blank:]]").*(?=",)
(?<="name":[[:blank:]]") is a positive lookbehind: the match must be preceded by "name": followed by one blank character and a double quote ".
.* matches the name itself.
(?=",) is a positive lookahead: the name must be followed by a double quote " and a comma.
demo: https://regex101.com/r/JvLCkO/1
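The same idea can be expressed with Python's re module. This is only a sketch: it uses a capture group instead of lookbehind/lookahead, and the names NAME_PATTERN and extract_names are hypothetical.

```python
import re

# "name": then one blank, then the quoted value followed by a comma.
# The capture group plays the role of the lookbehind/lookahead pair.
NAME_PATTERN = re.compile(r'"name":[ \t]"(.*)",')


def extract_names(text):
    """Return every captured name, in order of appearance."""
    return NAME_PATTERN.findall(text)


sample = '"name": "Tom",\n"Age": 10\n"name": "Jack",\n"Age": 15\n'
print("\n".join(extract_names(sample)))
```

Because . does not match a newline by default, each match stays within a single line, just as with grep.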
Related
I need to count the occurrences of '|' in each line of a file and compare the count with a given inputcount; an exception should be thrown when the count is wrong.
Say if the inputcount=3 and the file has following content
s01|test|aaa|hh
S02|test|bbb
so3|test|ccc|oo
Then an exception should be thrown at line 2 and processing of the file should stop.
I tried the awk command below to get the count for each line, but I am not sure how to compare it and throw the exception when it does not match.
awk ' {print (split($0,a,"\|")-1) }' test.dat
Can anyone please help me with it?
You may use this awk:
awk -v inputcount=3 -F '\\|' 'NF && NF != inputcount+1 {exit 1}' file &&
echo "good" || echo "bad"
Details:
-F '\\|' sets | as input field separator
NF && NF != inputcount+1 is true for any non-empty line that doesn't have exactly inputcount pipe delimiters; awk then exits with status 1.
$ inputcount=3
$ awk -v c="$inputcount" 'gsub(/\|/,"&") != c{exit 1}' file
$ echo $?
1
As you also tagged the post with python, I will write a Python answer that could be a simple script.
The core is:
with open(filename) as f:
    for n, line in enumerate(f):
        if line.count("|") != 3:
            print(f"Not valid file at line {n + 1}")
Then you can add some boilerplate:
import fileinput
import sys

with fileinput.input() as f:
    for n, line in enumerate(f):
        if line.count("|") != 3:
            print(f"Not valid file at line {n + 1}")
            sys.exit(1)
And with fileinput you can accept almost any sort of input: see Piping to a python script from cmd shell
Maybe try
awk -F '[|]' -v cols="$inputcount" 'NF != cols+1 {
print FILENAME ":" FNR ":" $0 >"/dev/stderr"; exit 1 }' test.dat
The -F argument says to split on this delimiter; the number of resulting fields NF will be one more than there are delimiters, so we scream and die when that number is wrong.
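Since the question asks for an exception rather than an exit code, the same check can be sketched in Python as a function that raises on the first bad line. The names DelimiterCountError and check_delimiter_count are hypothetical, and skipping blank lines mirrors the NF && guard of the first awk answer.

```python
class DelimiterCountError(ValueError):
    """Raised when a line has an unexpected number of delimiters."""


def check_delimiter_count(lines, expected, delimiter="|"):
    """Raise DelimiterCountError at the first line whose count differs."""
    for line_number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # ignore blank lines
        found = line.count(delimiter)
        if found != expected:
            raise DelimiterCountError(
                f"line {line_number}: expected {expected} '{delimiter}', found {found}"
            )


try:
    check_delimiter_count(
        ["s01|test|aaa|hh", "S02|test|bbb", "so3|test|ccc|oo"], expected=3
    )
except DelimiterCountError as error:
    print(error)
```

An open file object can be passed as lines directly, since iterating over it yields one line at a time.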
I have thousands of text files on my disk.
I need to search them for selected words.
Currently, I use:
grep -Eri 'text1|text2|text3|textn' dir/ > results.txt
The result is saved to a file: results.txt
I would like the results to be saved to separate files, one per word:
results_text1.txt, results_text2.txt, results_textn.txt
Has anyone come across a script for this, e.g. in Python?
One solution might be to use a bash for loop.
for word in text1 text2 text3 textn; do grep -Eri "$word" dir/ > "results_$word.txt"; done
You can run this directly from the command line.
Using a combination of sed and xargs:
echo "text1,text2,text3,textn" | sed "s/,/\n/g" | xargs -I{} sh -c "grep -ir {} * > result_{}"
One way (using Perl because it's easier for regex and one-liner).
Sample data:
% mkdir dir dir/dir1 dir/dir2
% echo -e "text1\ntext2\nnope" > dir/file1.txt
% echo -e "nope\ntext3" > dir/dir1/file2.txt
% echo -e "nope\ntext2" > dir/dir2/file3.txt
Search:
% find dir -type f -exec perl -ne '/(text1|text2|text3|textn)/ or next;
$pat = $1; unless ($fh{$pat}) {
($fn = $1) =~ s/\W+/_/ag;
$fn = "results_$fn.txt";
open $fh{$pat}, ">>", $fn;
}
print { $fh{$pat} } "$ARGV:$_"' {} \;
Content of results_text1.txt:
dir/file1.txt:text1
Content of results_text2.txt:
dir/dir2/file3.txt:text2
dir/file1.txt:text2
Content of results_text3.txt:
dir/dir1/file2.txt:text3
Note:
you need to put the pattern inside parentheses to capture it. grep doesn't allow one to do this.
the captured pattern is then filtered (s/\W+/_/ag means to replace nonalphanumeric characters with underscore) to ensure it's safe as part of a filename.
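As the question asks about Python, here is a rough equivalent of the approach above. It is a sketch under several assumptions: the search words are treated as literal text rather than regexes, matching is case-insensitive like grep -i, and the output directory lies outside the searched tree so that result files are not scanned themselves. The function name split_matches_by_word is hypothetical.

```python
import os
import re


def split_matches_by_word(root, words, out_dir="."):
    """Append each matching line to results_<word>.txt; return the files written."""
    pattern = re.compile("|".join(re.escape(w) for w in words), re.IGNORECASE)
    written = set()
    for folder, _subdirs, files in os.walk(root):
        for name in sorted(files):
            path = os.path.join(folder, name)
            with open(path, errors="replace") as source:
                for line in source:
                    text = line.rstrip("\n")
                    # A line may contain several words; record it once per word.
                    for word in {m.group(0).lower() for m in pattern.finditer(line)}:
                        safe = re.sub(r"\W+", "_", word)  # filename-safe
                        target = os.path.join(out_dir, f"results_{safe}.txt")
                        with open(target, "a") as out:
                            out.write(f"{path}:{text}\n")
                        written.add(target)
    return sorted(written)
```

The files are opened in append mode, so results from earlier runs should be removed first if a fresh listing is wanted.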
I want to run the following lines of linux bash commands inside a python program.
tail /var/log/omxlog | stdbuf -o0 grep player_new | while read i
do
Values=$(omxd S | awk -F/ '{print $NF}')
x1="${Values}"
x7="${x1##*_}"
x8="${x7%.*}"
echo ${x8}
done
I know that for a single-line command, we can use the following syntax:
subprocess.call(['my','command'])
But how can I use subprocess.call when there are several commands spread over multiple lines?
Quoting https://mail.python.org/pipermail/tutor/2013-January/093474.html:
use subprocess.check_output(shell_command, shell=True)
import subprocess
cmd = '''
tail /var/log/omxlog | stdbuf -o0 grep player_new | while read i
do
Values=$(omxd S | awk -F/ '{print $NF}')
x1="${Values}"
x7="${x1##*_}"
x8="${x7%.*}"
echo ${x8}
done
'''
subprocess.check_output(cmd, shell=True)
I have tried this with some other examples and it works.
Here is a pure python solution that I think does the same as your bash:
import os

logname = '/var/log/omxlog'
with open(logname) as f:
    # not sure why you only want the last 10 lines (the tail default), but here you go
    lines = f.readlines()[-10:]

for line in lines:
    if 'player_new' in line:
        omxd = os.popen('omxd S').read().strip()
        # keep the text after the last '_' and before the last '.'
        after_ = omxd[omxd.rfind('_') + 1:]
        before_dot = after_[:after_.rfind('.')]
        print(before_dot)
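The awk call and the two parameter expansions in the shell loop, ${x1##*_} and ${x7%.*}, map onto plain string methods. The helper below is a sketch; the name title_from_path and the sample path are made up for illustration and are not actual omxd output.

```python
def title_from_path(path):
    """Mimic: awk -F/ '{print $NF}', then ${x##*_}, then ${x%.*}."""
    last_field = path.rsplit("/", 1)[-1]       # awk -F/ '{print $NF}'
    after_underscore = last_field.rpartition("_")[2]  # ${x1##*_}
    return after_underscore.rsplit(".", 1)[0]  # ${x7%.*}


print(title_from_path("/media/music/01_track_name.mp3"))
```

Unlike slicing with rfind, these calls leave the string unchanged when it has no underscore or no dot, which is also what the shell expansions do.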
I am trying to convert the following code which is written in python to use cat | grep instead of opening a file.
The original code:
LOG_NAME="/raft/log/{}{}{}_{}_Rx_flow.log".format(NTIME.tm_year,str(NTIME.tm_mon).zfill(2),str(NTIME.tm_mday).zfill(2),str(NTIME.tm_hour).zfill(2))
print time.strftime("%Y/%m/%d %H:%M:%S") + " Log file name, RX restart, is: " + LOG_NAME
print time.strftime("%Y/%m/%d %H:%M:%S") + " ERRTIMEMIN value: " + ERRTIMEMIN + " RXRESTART message value: " + RXRESTART
LINK_LOG_FILE = open(LOG_NAME, "r")
ISRXRESTART=0
for REPline in LINK_LOG_FILE:
    ***if RXRESTART in REPline and (ERRTIMEMIN in REPline or ERRTIMEMIN1 in REPline) and ISRXRESTART==0:***
        #Link restarted - correct behaviour.
        print time.strftime("%Y/%m/%d %H:%M:%S") + " RX restarted - This is correct behaviour"
        ISRXRESTART=1
I have to delete the line that opens the file and replace the line marked with *** *** with something that uses cat and grep,
for example:
os.popen("sshpass -p ssh root@"+self.ipaddr+" cat "+LOG_NAME+" | egrep `"+device_start+" "+ERRTIMEMIN+`").read().strip()
But I don't know how to combine OR and AND in the same grep.
"OR" can be simulated by simply grepping multiple patterns at once:
grep -E 'thing1|thing2' <file.txt
-E tells grep to use extended regex (egrep is shorthand for grep -E).
To my knowledge there is no "AND" operator in grep, but it can be simulated by grepping for the patterns in both forward and backward order:
grep -E 'thing1.*thing2|thing2.*thing1' <file.txt
Use -v for "NOT":
grep -E 'thing1' <file.txt | grep -v 'thing2'
This finds every line containing "thing1", then keeps only those without "thing2".
Hope this helped.
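For comparison, the same AND/OR logic can be kept in Python. This is a sketch, with line_matches and AND_PATTERN as hypothetical names; the regex form is a direct transcription of the both-orders grep pattern.

```python
import re

# Same pattern as the grep -E "AND" example: either order on one line.
AND_PATTERN = re.compile(r"thing1.*thing2|thing2.*thing1")


def line_matches(line, all_of=(), any_of=()):
    """True if line contains every term in all_of and, if given, one of any_of."""
    has_all = all(term in line for term in all_of)
    has_any = any(term in line for term in any_of) if any_of else True
    return has_all and has_any


lines = ["thing1 then thing2", "thing2 then thing1", "only thing1"]
print([l for l in lines if line_matches(l, all_of=("thing1", "thing2"))])
print([l for l in lines if AND_PATTERN.search(l)])
```

The condition from the question, one required string plus either of two timestamps, would then read line_matches(line, all_of=(RXRESTART,), any_of=(ERRTIMEMIN, ERRTIMEMIN1)).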
Question:
How do I use sed with Python successfully? I have to run this command on a remote server to get a list of comma-delimited hosts. When run from bash I get what I want, which is something like host1, host2, host3
Here is what I have:
process = subprocess.Popen(["ssh $USER@mychefserver knife search node "chef_environment:*" | grep -i "node name" | egrep -i "stuff|ruff" | uniq -u | sort -n | cut -d ":" -f 2 | sed -e 's/^[ \t]*//' | tr '\n' ', '
"], shell=False, stdout=PIPE)
I know I'll have to escape the \n, \t, etc, but I'm having trouble with the rest. Whenever I try to run it from my Python script I get an error for invalid syntax even though I've tried a cornucopia of escapes.
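Your string quoting is broken because you use " inside a double-quoted string. You have to escape each inner " as \". Note also that most of the double quotes in the command line can be replaced by single quotes '. Finally, the command is a shell pipeline, so it has to be passed as a single string with shell=True; with shell=False, Popen would try to execute the whole string as one program name. The backslashes in \t and \n are doubled so that sed and tr receive them unchanged. The following code should work:
process = subprocess.Popen("ssh $USER@mychefserver knife search node \"chef_environment:*\" | grep -i 'node name' | egrep -i 'stuff|ruff' | uniq -u | sort -n | cut -d':' -f 2 | sed -e 's/^[ \\t]*//' | tr '\\n' ', '", shell=True, stdout=subprocess.PIPE)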
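Much of the quoting trouble goes away if only the knife search runs through ssh and the filtering is done in Python. The sketch below approximates the grep, uniq -u, sort, cut and sed stages on text that has already been captured; the function name hosts_from_knife_output and the sample text are assumptions, sorted() is only a stand-in for sort -n, and the result is joined with ", " to match the desired output rather than the exact behaviour of tr.

```python
import re
from itertools import groupby


def hosts_from_knife_output(text, pattern=r"stuff|ruff"):
    """Filter 'node name' lines and return the host names as 'a, b, c'."""
    wanted = [
        line for line in text.splitlines()
        if "node name" in line.lower() and re.search(pattern, line, re.IGNORECASE)
    ]
    # uniq -u: keep only lines that are not immediately repeated.
    unique = [line for line, group in groupby(wanted) if len(list(group)) == 1]
    hosts = []
    for line in sorted(unique):
        parts = line.split(":")
        # cut -d':' -f2 prints the whole line when there is no delimiter.
        field = parts[1] if len(parts) > 1 else line
        hosts.append(field.lstrip(" \t"))  # sed 's/^[ \t]*//'
    return ", ".join(hosts)


sample = "Node Name:   stuff01\nEnvironment: prod\nNode Name:   ruff02\n"
print(hosts_from_knife_output(sample))
```

The text itself could come from something like subprocess.check_output(["ssh", host, "knife", "search", "node", query], text=True), where host and query are placeholders for the values in the question.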