I'm looking to print a matching line from a file on a Linux host, joined into a single line with the line immediately before the match.
Below is the relevant content from the log file:
[2020/02/18 08:25:21.229198, 1] ../source3/lib/smbldap.c:1206(get_cached_ldap_connect)
Connection to LDAP server failed for the 1 try!
[2020/02/18 08:25:21.229221, 2] ../source3/passdb/pdb_ldap_util.c:287(smbldap_search_domain_info)
smbldap_search_domain_info: Problem during LDAPsearch: Timed out
What I have tried:
I tried the following with grep and sed, which more or less works:
$ egrep -B 1 "failed|Timed" /var/log/samba/smbd.log.old |tr -d "\n" | sed "s/--/\n/g"
[2020/02/18 08:25:21.229198, 1] ../source3/lib/smbldap.c:1206(get_cached_ldap_connect) Connection to LDAP server failed for the 1 try!
[2020/02/18 08:25:21.229221, 2] ../source3/passdb/pdb_ldap_util.c:287(smbldap_search_domain_info) smbldap_search_domain_info: Problem during LDAPsearch: Timed out
This does not look like a clean solution. I'm hoping for an expert one-liner; awk, sed, grep, or even Python is acceptable.
It can be done with awk alone:
awk ' /Timed|failed/ { print previous, $0; }; {previous = $0;}' /var/log/samba/smbd.log.old
This might work for you (GNU sed):
sed -n 'N;/\n.*\(failed\|Timed\)/s/\n//p;D' file
Turn off implicit printing. Append the next line. If the appended line contains failed or Timed, delete the newline and print the result. Delete the first line in the pattern space and repeat.
Could you please try the following tac + awk solution:
tac Input_file | awk '/failed/{found=1;val=$0;next} found && NF{print $0,val;val=found=""}'
Or, in non-one-liner form:
tac Input_file |
awk '
/failed/{
  found=1
  val=$0
  next
}
found && NF{
  print $0,val
  val=found=""
}'
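Since the question also accepts Python, here is a minimal sketch of the same idea: remember the previous line and print it joined with any line that matches. The sample lines are embedded for illustration; in practice you would read them from the log file.

```python
import re

# Sample lines from the question; in practice read them from the log file.
lines = [
    "[2020/02/18 08:25:21.229198, 1] ../source3/lib/smbldap.c:1206(get_cached_ldap_connect)",
    "  Connection to LDAP server failed for the 1 try!",
    "[2020/02/18 08:25:21.229221, 2] ../source3/passdb/pdb_ldap_util.c:287(smbldap_search_domain_info)",
    "  smbldap_search_domain_info: Problem during LDAPsearch: Timed out",
]

pattern = re.compile(r"failed|Timed")

previous = ""
joined = []
for line in lines:
    line = line.rstrip("\n")
    if pattern.search(line):
        # Join the previous line with the matching one.
        joined.append(previous + " " + line.strip())
    previous = line

for j in joined:
    print(j)
```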
I have a requirement to count the occurrences of '|' in each line of a file and compare the count against a given inputcount; an exception needs to be thrown when the count is wrong.
Say inputcount=3 and the file has the following content:
s01|test|aaa|hh
S02|test|bbb
so3|test|ccc|oo
then an exception should be thrown when processing line 2, and it should stop reading the file.
I tried the Awk command below to get the count for each line, but I'm not sure how to compare it and throw the exception when it doesn't match:
awk ' {print (split($0,a,"\|")-1) }' test.dat
Can anyone please help me with it?
You may use this awk:
awk -v inputcount=3 -F '\\|' 'NF && NF != inputcount+1 {exit 1}' file &&
echo "good" || echo "bad"
Details:
-F '\\|' sets | as input field separator
NF != inputcount+1 will return true if any line doesn't have inputcount pipe delimiters.
$ inputcount=3
$ awk -v c="$inputcount" 'gsub(/\|/,"&") != c{exit 1}' file
$ echo $?
1
Since you also tagged the post with python, here is a Python answer that could be a simple script.
The core is:
with open(filename) as f:
    for n, line in enumerate(f):
        if line.count("|") != 3:
            print(f"Not valid file at line {n + 1}")
Then you can add some boilerplate:
import fileinput
import sys

with fileinput.input() as f:
    for n, line in enumerate(f):
        if line.count("|") != 3:
            print(f"Not valid file at line {n + 1}")
            sys.exit(1)
And with fileinput you can accept almost any sort of input: see Piping to a python script from cmd shell
Maybe try
awk -F '[|]' -v cols="$inputcount" 'NF != cols+1 {
print FILENAME ":" FNR ":" $0 >"/dev/stderr"; exit 1 }' test.dat
The -F argument says to split on this delimiter; the number of resulting fields NF will be one more than there are delimiters, so we scream and die when that number is wrong.
I have to use the bash command below, which includes multiple pipes and grep commands, in a Python script.
grep name | cut -d':' -f2 | tr -d '"'| tr -d ','
I tried to do the same using the subprocess module but didn't succeed.
Can anyone help me run the above command in a Python 3 script?
I have to get the output below from a file, file.txt.
Tom
Jack
file.txt contains:
"name": "Tom",
"Age": 10
"name": "Jack",
"Age": 15
Actually, I want to know how I can run the below bash command using Python.
cat file.txt | grep name | cut -d':' -f2 | tr -d '"'| tr -d ','
This works without having to use the subprocess library or any other OS-command-related library, only Python.
my_file = open("./file.txt")
line = True
while line:
    line = my_file.readline()
    line_array = line.split()
    try:
        if line_array[0] == '"name":':
            print(line_array[1].replace('"', '').replace(',', ''))
    except IndexError:
        pass
my_file.close()
If you are not trying to parse a JSON file or any other structured file, for which a parser would be the best approach, just change your command into:
grep -oP '(?<="name":[[:blank:]]").*(?=",)' file.txt
You do not need any pipe at all.
This will give you the output:
Tom
Jack
Explanations:
-P activate perl regex for lookahead/lookbehind
-o just output the matching string not the whole line
Regex used: (?<="name":[[:blank:]]").*(?=",)
(?<="name":[[:blank:]]") is a positive lookbehind: it requires "name": followed by a blank character and an opening double quote. The name itself is then matched by .* and bounded by the positive lookahead (?=",), which requires a closing double quote and comma without including them in the output.
demo: https://regex101.com/r/JvLCkO/1
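For completeness, the original pipeline can also be run from Python 3 with subprocess by letting the shell handle the pipes. This is a sketch: the temporary file stands in for file.txt, and it assumes grep, cut, and tr are available on the host.

```python
import os
import subprocess
import tempfile

# Recreate the question's file.txt in a temporary location.
content = '"name": "Tom",\n"Age": 10\n"name": "Jack",\n"Age": 15\n'
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
    f.write(content)
    path = f.name

# shell=True hands the whole pipeline to /bin/sh, just as on the command line.
cmd = "grep name " + path + " | cut -d':' -f2 | tr -d '\"' | tr -d ','"
result = subprocess.run(cmd, shell=True, capture_output=True, text=True)
os.unlink(path)

# Each output line still carries the leading space left by cut, so strip it.
names = [line.strip() for line in result.stdout.splitlines()]
print(names)
```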
I have thousands of text files on my disk.
I need to search them for selected words.
Currently, I use:
grep -Eri 'text1|text2|text3|textn' dir/ > results.txt
The result is saved to a file: results.txt
I would like the result to be saved to many files.
results_text1.txt, results_text2.txt, results_textn.txt
Maybe someone has encountered some kind of script for this, e.g. in Python?
One solution might be to use a bash for loop.
for word in text1 text2 text3 textn; do grep -Eri "$word" dir/ > "results_$word.txt"; done
You can run this directly from the command line.
By using a combination of sed and xargs:
echo "text1,text2,text3,textn" | sed "s/,/\n/g" | xargs -I{} sh -c "grep -ir {} * > result_{}"
One way (using Perl because it's easier for regex and one-liner).
Sample data:
% mkdir dir dir/dir1 dir/dir2
% echo -e "text1\ntext2\nnope" > dir/file1.txt
% echo -e "nope\ntext3" > dir/dir1/file2.txt
% echo -e "nope\ntext2" > dir/dir1/file3.txt
Search:
% find dir -type f -exec perl -ne '/(text1|text2|text3|textn)/ or next;
    $pat = $1;
    unless ($fh{$pat}) {
        ($fn = $1) =~ s/\W+/_/ag;
        $fn = "results_$fn.txt";
        open $fh{$pat}, ">>", $fn;
    }
    print { $fh{$pat} } "$ARGV:$_"' {} \;
Content of results_text1.txt:
dir/file1.txt:text1
Content of results_text2.txt:
dir/dir1/file3.txt:text2
dir/file1.txt:text2
Content of results_text3.txt:
dir/dir1/file2.txt:text3
Note:
you need to put the pattern inside parentheses to capture it. grep doesn't allow one to do this.
the captured pattern is then filtered (s/\W+/_/ag means to replace nonalphanumeric characters with underscore) to ensure it's safe as part of a filename.
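Since the question mentions Python, the same per-word fan-out can be sketched in pure Python. The sample tree created at the top mirrors the Perl answer's setup and is only there for illustration; the word list and directory name are placeholders.

```python
import os
import re

# Sample data, mirroring the Perl answer's setup (illustration only).
os.makedirs("dir/dir1", exist_ok=True)
with open("dir/file1.txt", "w") as f:
    f.write("text1\ntext2\nnope\n")
with open("dir/dir1/file2.txt", "w") as f:
    f.write("nope\ntext3\n")

words = ["text1", "text2", "text3", "textn"]
directory = "dir"

# One case-insensitive pattern for all words, like grep -Eri 'text1|text2|...'
pattern = re.compile("|".join(re.escape(w) for w in words), re.IGNORECASE)

handles = {}
for root, _, files in os.walk(directory):
    for name in files:
        path = os.path.join(root, name)
        with open(path, errors="replace") as f:
            for line in f:
                m = pattern.search(line)
                if m:
                    word = m.group(0).lower()
                    if word not in handles:
                        handles[word] = open("results_%s.txt" % word, "w")
                    # Mirror grep -r output: filename:matching line
                    handles[word].write("%s:%s" % (path, line))
for h in handles.values():
    h.close()
```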
In my attempt to write an Apache log parser, I try to filter IP addresses with the following code:
for r in log:
    host_line = "'", r['host'], "'"
    for line in host_line:
        if not line.startswith("178.255.20.20"):
            print line.strip()
The result of this code is:
p4fdf6780.dip0.t-ipconnect.de
'
'
79.223.103.128
'
'
p4fdf6780.dip0.t-ipconnect.de
'
'
With line.replace("'", "") I remove the single quotes.
print line.replace("'", "")
The result:
p4fdf6780.dip0.t-ipconnect.de
79.223.103.128
p4fdf6780.dip0.t-ipconnect.de
This leaves me with the two line breaks.
How can I avoid those line breaks?
And is there a workaround, or a better, more pythonic solution to get what I want?
What do you want the program to do? What is the intended purpose of the for line in host_line loop?
If you're just looking to print hosts other than 178.255.20.20, would the following not work?
for r in log:
    host = str(r['host']).strip()  # not sure if the str() is required, depends on type of r['host']
    if not host.startswith("178.255.20.20"):
        print host
Simply change your code as below. You don't need a replace function.
for r in log:
    host_line = "'", r['host'], "'"
    for line in host_line:
        if not line.startswith("178.255.20.20"):
            if not line == "'":
                print line.strip()
One way is to use bash and a dedicated search tool like Ag, or just standard grep, which will be really fast because it is written in C:
grep -v "178.255.20.20" your_log.txt | grep -v -E "^'"
If you need to stick to Python, then make better use of strip so that it also removes the quote character, and print the line only if it is not empty:
for r in log:
    host_line = "'", r['host'], "'"
    for line in host_line:
        if not line.startswith("178.255.20.20"):
            line = line.strip("'\n")
            if len(line) > 0: print line
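Assuming log is a list of records with a 'host' key, as the loop suggests, the most pythonic form skips the quote tuple entirely and filters the hosts directly. The sample records below are made up for illustration.

```python
# Hypothetical sample records standing in for the parsed Apache log.
log = [
    {"host": "p4fdf6780.dip0.t-ipconnect.de"},
    {"host": "178.255.20.20"},
    {"host": "79.223.103.128"},
]

# A list comprehension keeps only the hosts we want, with no quote
# characters to strip and no blank lines to suppress.
hosts = [r["host"] for r in log if not r["host"].startswith("178.255.20.20")]
for host in hosts:
    print(host)
```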
I need to split the different pieces of code contained in one file into many files.
The file is apparently shared by AWK's creators on their homepage.
The file is also here for easy use.
My attempt at the problem
I can get the word that locates each piece of code by
awk '{ print $1 }'
However, I do not know how
to get the exact line numbers so that I can use them,
to collect the code between the specific lines while ignoring the first word of each line, or
to put these separate pieces of code into new files named after the first word of the line.
I am sure that the problem can be solved with AWK, and with Python too. Perhaps we need to use them together.
[edit] after the first answer
I get the following error when I try to execute it with awk
$ awk awkcode.txt
awk: syntax error at source line 1
context is
>>> awkcode <<< .txt
awk: bailing out at source line 1
Did you try to:
Create a file unbundle.awk with the following content:
$1 != prev { close(prev); prev = $1 }
{ print substr($0, index($0, " ") + 1) >$1 }
Remove the following lines from the file awkcode.txt:
# unbundle - unpack a bundle into separate files
$1 != prev { close(prev); prev = $1 }
{ print substr($0, index($0, " ") + 1) >$1 }
Run the following command:
awk -f unbundle.awk awkcode.txt
Are you trying to unpack a file in that format? It's a kind of shell archive. For more information, see http://en.wikipedia.org/wiki/Shar
If you execute that program with awk, awk will create all those files. You don't need to write or rewrite much. You can simply run that awk program, and it should still work.
First, view the file in "plain" format. http://dpaste.com/12282/plain/
Second, save the plain version of the file as 'awkcode.shar'
Third, I think you need to use the following command.
awk -f awkcode.shar
If you want to replace it with a Python program, it would be something like this.
import urllib2, sys

data = urllib2.urlopen("http://dpaste.com/12282/plain/")
currName, currFile = None, None
for line in data:
    # Each line starts with the target file name; the rest is its content.
    fileName, _, text = line.strip().partition(' ')
    if fileName != currName:
        # New file name: close the previous file and open the next one.
        if currFile is not None:
            currFile.close()
        currName = fileName
        currFile = open(currName, "w")
    currFile.write(text + "\n")
if currFile is not None:
    currFile.close()
The awk file awkcode.txt should not contain ANY blank line. If a blank line is encountered, the awk program fails; there is no error check in the code to filter out blank lines. It took me several days of struggle to find this out.
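A Python 3 sketch of the same unbundling idea (the urllib2 code above is Python 2), reading lines from a local bundle. It skips blank lines, so the failure described above cannot occur. The function name and the sample bundle are made up for illustration.

```python
def unbundle(lines):
    """Write each 'filename content' line into its named file."""
    handles = {}
    for line in lines:
        if not line.strip():
            continue  # skip blank lines instead of choking on them
        name, _, text = line.rstrip("\n").partition(" ")
        if name not in handles:
            handles[name] = open(name, "w")
        handles[name].write(text + "\n")
    for h in handles.values():
        h.close()

# Example bundle: two interleaved files plus a blank line.
unbundle([
    "a.txt first line of a",
    "",
    "b.txt first line of b",
    "a.txt second line of a",
])
```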