I am having a problem creating a script that searches a file for certain patterns. Once a pattern is found in the file, the script should print a message.
The problem is that only the first for loop prints its message; the second loop never prints anything.
import re
f = open('/R1.ios')
V96189 = ['session-limit']
V96197 = ['logging enable']
for pattern in V96189:
    for line in f:
        if pattern in line:
            print('V-96189', 'Not a finding', 'session-limit is set')
for pattern in V96197:
    for line in f:
        if pattern in line:
            print('V-96197', 'Not a finding', 'logging enable is set')
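A likely explanation, offered as a sketch rather than a confirmed diagnosis: a file object is an iterator, so the first "for line in f" consumes it and the second loop has nothing left to read. One way around this is to read the lines once into a list and scan that list for every check. The CHECKS table, the scan helper, and the sample lines below are illustrative names and data, not part of the original script; io.StringIO stands in for the real file.

```python
import io

# Hypothetical table: check ID -> (pattern to look for, message to report).
CHECKS = {
    'V-96189': ('session-limit', 'session-limit is set'),
    'V-96197': ('logging enable', 'logging enable is set'),
}

def scan(lines):
    """Return one (check_id, status, message) tuple per check whose pattern occurs in any line."""
    findings = []
    for check_id, (pattern, message) in CHECKS.items():
        if any(pattern in line for line in lines):
            findings.append((check_id, 'Not a finding', message))
    return findings

# Made-up config excerpt used instead of /R1.ios.
f = io.StringIO('line vty 0 4\n session-limit 2\nlogging enable\n')
first_pass = list(f)    # consumes the whole file
second_pass = list(f)   # empty: the iterator is already exhausted

for finding in scan(first_pass):
    print(*finding)
```

With a real file, the list would come from f.readlines() inside a with block. Calling f.seek(0) before the second loop is another option if the original two-loop structure has to stay.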
I am trying to scan multiple files for lines that contain two keywords, "SEQADV" and "MUTATION", on the same line. The problem is that I keep getting the error "NameError: name 'wt_residue' is not defined". When I search for only one keyword, "SEQADV", the program runs smoothly.
if 'SEQADV' and 'MUTATION' in line:
    try:
        mutation = line.split()
        sequence_number = mutation[4]
        chain = mutation[3]
        mutant_residue = mutation[2]
        wt_residue = mutation[7]
    except IndexError:
        pass
    # Prints all data to the .csv file above and closes the file
    print(",".join([pdb_name, mutant_residue, chain, sequence_number, wt_residue]), file=datafile)
datafile.close()
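The condition 'SEQADV' and 'MUTATION' in line does not test both keywords. Python parses it as 'SEQADV' and ('MUTATION' in line), and a non-empty string is always true, so only 'MUTATION' is actually checked. A line that contains 'MUTATION' but has fewer than eight fields then raises an IndexError, which the except block swallows; wt_residue is never assigned, and the print call is the likely source of the NameError. Try changing your if statement to if 'SEQADV' in line.split() and 'MUTATION' in line.split():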
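The difference between the two conditions can be checked with a small sketch. The has_both helper and both sample lines are made up for illustration; real SEQADV records may be laid out differently.

```python
def has_both(line, first='SEQADV', second='MUTATION'):
    """Return True only when both keywords appear as whitespace-separated fields."""
    fields = line.split()
    return first in fields and second in fields

# Made-up lines: only the first one contains both keywords.
line_a = 'SEQADV 1ABC ALA A 42 UNP P00000 GLY 42 ENGINEERED MUTATION'
line_b = 'REMARK 1 MUTATION'

# The original condition is true for line_b even though SEQADV is absent,
# because it is evaluated as 'SEQADV' and ('MUTATION' in line_b).
original_check = bool('SEQADV' and 'MUTATION' in line_b)

print(original_check)     # True
print(has_both(line_b))   # False
print(has_both(line_a))   # True
```

Splitting the line once and reusing the resulting list also avoids calling line.split() twice per line.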
I am trying to use threads in Python to read some files (big files, some of them might be over a gigabyte in size) and parse them to find specific info. I am using the re module for that.
The problem is that I'm seeing very high execution times.
Reading over 4 files and then parsing them for my data takes over 30 seconds. Is this expected, or is there any recommendation you can give me to improve these times?
I apologize in advance; I'm sure this has been asked in the forum already. I really tried to find it myself but could not find the right words to search for this problem.
Below is my current code:
import re
import threading
import time

# path, list_files and files_dir are globals defined elsewhere in the script
def get_hostname(file: str) -> str:
    """
    Get the hostname from show tech/show running file
    :param file: show tech/show running string
    :return: the hostname as a string
    """
    hostname = re.findall('hostname.*', file, flags=re.IGNORECASE)
    if len(hostname) > 0:
        return hostname[0].split(' ')[1]
    else:
        print('Could not find a valid hostname on file ' + file)

def set_file_dictionary():
    threads_list = []

    def set_file_dictionary_thread(file_name: str):
        thread_set_file_dict_time = time.time()
        current_file = open(path + file_name, encoding='utf8', errors='ignore').read()
        files_dir[get_hostname(current_file)] = current_file
        print('set_file_dictionary_thread is ' + str(time.time() - thread_set_file_dict_time))

    for file in list_files:
        threads_list.append(threading.Thread(target=set_file_dictionary_thread, args=(file, )))
    for thread in threads_list:
        thread.start()
    for thread in threads_list:
        thread.join()
The result is
set_file_dictionary_thread is 12.55484390258789
set_file_dictionary_thread is 13.184206008911133
set_file_dictionary_thread is 16.15609312057495
set_file_dictionary_thread is 16.19360327720642
Main exec time is 16.1940469741821
Thanks for reading.
NOTE - The indentation is OK; for some reason it gets messed up when copying from PyCharm.
Firstly, running the regex in multiple Python threads won't help much: regex matching is CPU-bound, and CPython's global interpreter lock lets only one thread execute Python code at a time (see https://stackoverflow.com/a/9984414/14035728).
Secondly, you can improve your get_hostname function by:
- compiling the regex beforehand
- using search instead of findall, since apparently you only need the first match
- using a group to capture the hostname, instead of a string split
Here's my suggested get_hostname function:
# \S+ stops at the end of the hostname; [^ ]* would also match newlines
hostname_re = re.compile(r'hostname (\S+)', flags=re.IGNORECASE)
def get_hostname(file: str) -> str:
    match = hostname_re.search(file)
    if match:
        return match.group(1)
    else:
        print('Could not find a valid hostname on file ' + file)
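A quick check of the compiled-pattern approach on a made-up config excerpt may help here. The sketch uses \S+ for the capture group, because a class such as [^ ]* also matches newlines and can therefore run past the end of the hostname line. The names HOSTNAME_RE, find_hostname, and the sample text are illustrative assumptions, not taken from the original script.

```python
import re

# Compiled once at import time and reused for every file.
HOSTNAME_RE = re.compile(r'hostname (\S+)', flags=re.IGNORECASE)

def find_hostname(text):
    """Return the first hostname found in text, or None when there is none."""
    match = HOSTNAME_RE.search(text)
    return match.group(1) if match else None

# Made-up excerpt of a "show running" output.
sample = 'version 15.2\nhostname R1\ninterface GigabitEthernet0/0\n'

# For comparison: [^ ]* keeps matching across the line break.
greedy = re.search(r'hostname ([^ ]*)', sample).group(1)

print(find_hostname(sample))        # R1
print(repr(greedy))                 # 'R1\ninterface'
print(find_hostname('no match'))    # None
```

Returning None instead of printing the whole file content also keeps the failure case cheap when the input is a gigabyte-sized string.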
My goal is to create a script that searches for credentials in an input file.
I found endless examples, even here on StackOverflow, that show how to search for a range of words in a file:
Example 1
Example 2
However, when I try to apply those approaches to my script, it returns nothing.
Here is my code:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import os
import argparse
import sys
parser = argparse.ArgumentParser()
parser.add_argument('-input', dest='input',help="input one or more files",metavar=None)
args = parser.parse_args()
GrabdirectoryFile = open(args.input,"r",encoding='UTF8')
directoryFile = GrabdirectoryFile.read()
HotWords = ['password', 'admin']
def search_for_lines(filename, words_list):
    words_found = 0
    for line_no, line in enumerate(filename):
        if any(word in line for word in words_list):
            print(line_no, ':', line)
            words_found += 1
    return words_found
search_for_lines(directoryFile,HotWords)
I tried following the instructions I found in the 2 links provided above, but no luck.
The code is definitely executed and Python returns no errors.
The file contains many words, including a few occurrences of 'password' and 'admin', but no line is returned.
Why?
EDIT:
Dear @Kirk Broadhurst, @SIM, @André Schild, @kasperhj, @Garrett Hyde, I tried to follow your link and substitute my code with:
with open(args.input) as openfile:
    for line in openfile:
        for part in line.split():
            if "color=" in part:
                print(part)
but unfortunately it is still not working. The right solution was provided below by @Farhan.K: I had to use readlines() instead of read().
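You are reading the file using file.read(), which returns a single string, but your function expects a list of lines. Iterating over a string yields one character at a time, so a multi-character word such as 'password' is never found in any "line". Use file.readlines() instead. As an aside, it is better to open and close files using the with statement.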
Replace
GrabdirectoryFile = open(args.input,"r",encoding='UTF8')
directoryFile = GrabdirectoryFile.read()
with...
GrabdirectoryFile = open(args.input,"r",encoding='UTF8')
directoryFile = GrabdirectoryFile.readlines()
Using a with statement is better:
with open(args.input,"r",encoding='UTF8') as GrabdirectoryFile:
    directoryFile = GrabdirectoryFile.readlines()
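As a minimal sketch of the same idea, the search function can also take the open file object directly, since iterating over a file yields one line at a time without loading everything into a list. Here io.StringIO stands in for a real file, and the sample content is made up.

```python
import io

def search_for_lines(lines, words_list):
    """Print every line containing any of the words; return how many lines matched."""
    lines_found = 0
    for line_no, line in enumerate(lines, start=1):
        if any(word in line for word in words_list):
            print(line_no, ':', line.rstrip('\n'))
            lines_found += 1
    return lines_found

hot_words = ['password', 'admin']

# io.StringIO behaves like an open text file for iteration purposes.
fake_file = io.StringIO('user=guest\npassword=hunter2\nrole=admin\n')
count_from_file = search_for_lines(fake_file, hot_words)

# Passing a plain string iterates character by character, so nothing matches.
count_from_string = search_for_lines('password=hunter2', hot_words)

print(count_from_file, count_from_string)
```

With a real file this becomes: with open(args.input, encoding='UTF8') as f: search_for_lines(f, HotWords).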
I'm trying to read the words that follow a matched string on a line.
To be exact:
I have a file with the text below:
-- Host: localhost
-- Generation Time: Nov 15, 2006 at 09:58 AM
-- Server version: 5.0.21
-- PHP Version: 5.1.2
I want to check whether the file contains the substring 'Server version:' and, if it does, read the characters that follow it up to the end of the line, in this case '5.0.21'.
I tried the following code, but it gives the next line (-- PHP Version: 5.1.2) instead of the next word (5.0.21).
with open('/root/Desktop/test.txt', 'r+') as f:
    for line in f:
        if 'Server version:' in line:
            print f.next()
You are using f.next(), which returns the next line.
Instead you need:
with open('/root/Desktop/test.txt', 'r+') as f:
    for line in f:
        found = line.find('Server version:')
        if found != -1:
            version = line[found+len('Server version:')+1:]
            print version
You might want to replace that text like this:
if 'Server version: ' in line:
    print line.rstrip().replace('-- Server version: ', '')
We call line.rstrip() because the line read from the file has a newline at the end, and we strip that.
Might be overkill, but you could also use the regular expressions module re:
match = re.search("Server version: (.+)", line)
if match: # found a line matching this pattern
    print match.group(1) # whatever was matched for (.+)
The advantage is that you need to type the key only once, but of course you can have the same effect by wrapping any of the other solutions into a function definition. Also, you could do some additional validation.
You can try using the split method on strings, using the string to remove (i.e. 'Server version: ') as separator:
if 'Server version: ' in line:
    print line.split('Server version: ', 1)[1]
As you have
line='-- Server version: 5.0.21'
just:
line.split()[-1]
This gives you the last word rather than all the characters after the colon.
If you want all the characters after the colon:
line.split(':', 1)[-1].strip()
Replace ':' with another string as needed.
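The snippets in this thread use Python 2 print statements. For reference, a Python 3 sketch of the regex variant could look like the following; VERSION_RE and server_version are illustrative names, and the header list simply reproduces the four lines from the question.

```python
import re

# Capture the first run of non-whitespace characters after the key.
VERSION_RE = re.compile(r'Server version:\s*(\S+)')

def server_version(lines):
    """Return the text after 'Server version:' from the first matching line, else None."""
    for line in lines:
        match = VERSION_RE.search(line)
        if match:
            return match.group(1)
    return None

header = [
    '-- Host: localhost\n',
    '-- Generation Time: Nov 15, 2006 at 09:58 AM\n',
    '-- Server version: 5.0.21\n',
    '-- PHP Version: 5.1.2\n',
]

print(server_version(header))   # 5.0.21
```

Because the function accepts any iterable of lines, an open file object can be passed in place of the list.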
I have a difficult problem. I know there are many 're' masters in Python out there, so please help me. I have a huge log file. The format is something like this:
[text hello world yadda
lines lines lines
exceptions]
[something i'm not interested in]
[text hello world yadda
lines lines lines
exceptions]
And so on...
So blocks 1 and 3 are the same, and there are multiple cases like this. My question is: how can I read this file and write only the unique blocks to an output file? If there's a duplicate, it should be written only once. Sometimes there are multiple blocks between two duplicate blocks. I'm currently pattern matching, and this is the code as of now. It only matches the pattern but doesn't do anything about duplicates.
import re
import sys
from itertools import islice
try:
    if len(sys.argv) != 3:
        sys.exit("You should enter 3 parameters.")
    elif sys.argv[1] == sys.argv[2]:
        sys.exit("The two file names cannot be the same.")
    else:
        file = open(sys.argv[1], "r")
        file1 = open(sys.argv[2],"w")
        java_regex = re.compile(r'[java|javax|org|com]+?[\.|:]+?', re.I) # java
        at_regex = re.compile(r'at\s', re.I) # at
        copy = False # flag that control to copy or to not copy to output
        for line in file:
            if re.search(java_regex, line) and not (re.search(r'at\s', line, re.I) or re.search(r'mdcloginid:|webcontainer|c\.h\.i\.h\.p\.u\.e|threadPoolTaskExecutor|caused\sby', line, re.I)):
                # start copying if "java" is in the input
                copy = True
            else:
                if copy and not re.search(at_regex, line):
                    # stop copying if "at" is not in the input
                    copy = False
            if copy:
                file1.write(line)
        file.close()
        file1.close()
except IOError:
    sys.exit("IO error or wrong file name.")
except IndexError:
    sys.exit('\nYou must enter 3 parameters.') #prevents less than 3 inputs which is mandatory
except SystemExit as e: #Exception handles sys.exit()
    sys.exit(e)
I don't care whether the duplicate removal happens in this code; it can be in a separate .py file as well.
This is the original snippet of the log file:
javax.xml.ws.soap.SOAPFaultException: Uncaught BPEL fault http://schemas.xmlsoap.org/soap/envelope/:Server
at org.apache.axis2.jaxws.marshaller.impl.alt.MethodMarshallerUtils.createSystemException(MethodMarshallerUtils.java:1326) ~[org.apache.axis2.jar:na]
at org.apache.axis2.jaxws.marshaller.impl.alt.MethodMarshallerUtils.demarshalFaultResponse(MethodMarshallerUtils.java:1052) ~[org.apache.axis2.jar:na]
at org.apache.axis2.jaxws.marshaller.impl.alt.DocLitBareMethodMarshaller.demarshalFaultResponse(DocLitBareMethodMarshaller.java:415) ~[org.apache.axis2.jar:na]
at org.apache.axis2.jaxws.client.proxy.JAXWSProxyHandler.getFaultResponse(JAXWSProxyHandler.java:597) ~[org.apache.axis2.jar:na]
at org.apache.axis2.jaxws.client.proxy.JAXWSProxyHandler.createResponse(JAXWSProxyHandler.java:537) ~[org.apache.axis2.jar:na]
at org.apache.axis2.jaxws.client.proxy.JAXWSProxyHandler.invokeSEIMethod(JAXWSProxyHandler.java:403) ~[org.apache.axis2.jar:na]
at org.apache.axis2.jaxws.client.proxy.JAXWSProxyHandler.invoke(JAXWSProxyHandler.java:188) ~[org.apache.axis2.jar:na]
com.hcentive.utils.exception.HCRuntimeException: Unable to Find User Profile:null
at com.hcentive.agent.service.AgentServiceImpl.getAgentByUserProfile(AgentServiceImpl.java:275) ~[agent-service-core-4.0.0.jar:na]
at com.hcentive.agent.service.AgentServiceImpl$$FastClassByCGLIB$$e3caddab.invoke(<generated>) ~[cglib-2.2.jar:na]
at net.sf.cglib.proxy.MethodProxy.invoke(MethodProxy.java:191) ~[cglib-2.2.jar:na]
at org.springframework.aop.framework.Cglib2AopProxy$CglibMethodInvocation.invokeJoinpoint(Cglib2AopProxy.java:689) ~[spring-aop-3.1.2.RELEASE.jar:3.1.2.RELEASE]
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150) ~[spring-aop-3.1.2.RELEASE.jar:3.1.2.RELEASE]
at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:110) ~[spring-tx-3.1.2.RELEASE.jar:3.1.2.RELEASE]
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172) ~[spring-aop-3.1.2.RELEASE.jar:3.1.2.RELEASE]
at org.springframework.security.access.intercept.aopalliance.MethodSecurityInterceptor.invoke(MethodSecurityInterceptor.java:64) ~[spring-security-core-3.1.2.RELEASE.jar:3.1.2.RELEASE]
javax.xml.ws.soap.SOAPFaultException: Uncaught BPEL fault http://schemas.xmlsoap.org/soap/envelope/:Server
at org.apache.axis2.jaxws.marshaller.impl.alt.MethodMarshallerUtils.createSystemException(MethodMarshallerUtils.java:1326) ~[org.apache.axis2.jar:na]
at org.apache.axis2.jaxws.marshaller.impl.alt.MethodMarshallerUtils.demarshalFaultResponse(MethodMarshallerUtils.java:1052) ~[org.apache.axis2.jar:na]
at org.apache.axis2.jaxws.marshaller.impl.alt.DocLitBareMethodMarshaller.demarshalFaultResponse(DocLitBareMethodMarshaller.java:415) ~[org.apache.axis2.jar:na]
at org.apache.axis2.jaxws.client.proxy.JAXWSProxyHandler.getFaultResponse(JAXWSProxyHandler.java:597) ~[org.apache.axis2.jar:na]
at org.apache.axis2.jaxws.client.proxy.JAXWSProxyHandler.createResponse(JAXWSProxyHandler.java:537) ~[org.apache.axis2.jar:na]
at org.apache.axis2.jaxws.client.proxy.JAXWSProxyHandler.invokeSEIMethod(JAXWSProxyHandler.java:403) ~[org.apache.axis2.jar:na]
And so on and on....
You can remove duplicate blocks with this:
import re
yourstr = r'''
[text hello world yadda
lines lines lines
exceptions]
[something i'm not interested in]
[text hello world yadda
lines lines lines
exceptions]
'''
pat = re.compile(r'\[([^]]+])(?=.*\[\1)', re.DOTALL)
result = pat.sub('', yourstr)
Note that only the last occurrence of each block is preserved. If you want to keep the first one instead, you must reverse the string and use this pattern:
(][^[]+)\[(?=.*\1\[)
and then reverse the string again.
You could use a hashing algorithm, such as one from hashlib, and a dictionary that looks like this: {123456789: True}
The value is not important, but a dict (or a set) lookup is significantly faster than searching a list if it's a big file.
Hash each block as you come across it and store the hash in the dictionary if it is not already there. If it is already in the dictionary, ignore the block. This assumes that duplicate blocks are absolutely identical.
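The hashing idea can be sketched as follows, using a set of SHA-256 digests instead of a dict with dummy values. The block-splitting rule is an assumption made for this example: a new block starts at every line that is not an "at ..." stack frame. The names split_blocks and unique_blocks and the shortened log lines are made up for illustration.

```python
import hashlib

def split_blocks(lines):
    """Group lines into blocks; a block starts at each line that is not an 'at ...' frame."""
    block = []
    for line in lines:
        if not line.lstrip().startswith('at ') and block:
            yield ''.join(block)
            block = []
        block.append(line)
    if block:
        yield ''.join(block)

def unique_blocks(blocks):
    """Yield each distinct block once, in order of first appearance."""
    seen = set()
    for block in blocks:
        digest = hashlib.sha256(block.encode('utf-8')).hexdigest()
        if digest not in seen:
            seen.add(digest)
            yield block

# Shortened, made-up stack traces: the first and third blocks are identical.
log = [
    'javax.Example: first fault\n',
    'at pkg.A.run(A.java:1)\n',
    'com.Example: second fault\n',
    'at pkg.B.run(B.java:2)\n',
    'javax.Example: first fault\n',
    'at pkg.A.run(A.java:1)\n',
]

result = list(unique_blocks(split_blocks(log)))
print(len(result))   # 2
```

Storing only the digests keeps memory use small even when the blocks themselves are long; the unique blocks can be written to the output file as they are yielded.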