I'm checking whether several strings are present in a file's contents at the same time, and I'm curious whether this could be done with a switch-like construct, since I'm restructuring the code right now and I have a lot of lines that look like the one below.
As larsks mentioned in the comments, yes, I do mean something like the match statement. I'm aiming for results like that statement gives, but I've also found another solution that works for me in cases where I'm looking for only one substring.
My current code looks like this:
f = open('somesortoffilename')
if "string" in f.read() and "otherstring" in f.read():
    variable = 'value'
And I would like something like this:
f = open('somesortoffilename')
def f(variable):
    return {
        'string' and 'otherstring' in f.read(): 'value'
    }
Is it possible in any way?
First, we need to make sure that we read the file stream only once: the first call to f.read() consumes all the bytes in the file, so a second call returns an empty string.
Let's store the contents in a string instead.
with open('somesortoffilename') as file:
    contents = file.read()
The with form makes sure that the file stream is closed after we fetch its contents.
The “switch” pattern can be implemented in Python with dictionaries (the dict type).
switches = {
    'term1': ['string', 'other_string'],
    'term2': ['another_string']
}
We can use this lookup to check whether any string corresponding to a term is found in the file.
def f(contents):
    for term, values in switches.items():
        if any(x in contents for x in values):
            return term
    return None
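As a quick illustration, here is a self-contained sketch of how this lookup behaves (the sample inputs are invented; note that 'other_string' itself contains 'string' as a substring, so term1 catches any such overlap first):

```python
switches = {
    'term1': ['string', 'other_string'],
    'term2': ['another_string']
}

def f(contents):
    # Return the first term whose strings appear in the contents
    for term, values in switches.items():
        if any(x in contents for x in values):
            return term
    return None

print(f("line containing other_string"))  # term1
print(f("nothing relevant here"))         # None
```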
I am quite new to Python and I would like to ask the following:
Let's say for example i have the following .txt file:
USERNAME -- example.name
SERVER -- server01
COMPUTERNAME -- computer01
I would like to search through this document for the 3 keywords: USERNAME, SERVER and COMPUTERNAME, and when I find these I would like to extract their values, i.e. "example.name", "server01" and "computer01" respectively for each line.
Is this possible? I have already tried searching by line numbers, but I would much prefer searching by keywords instead.
Question 2: Somewhere in this .txt file there is a line with the keyword Adresses:, which has multiple values listed on separate lines, like so:
Adresses:
6001:8000:1080:142::12
8002:2013:2380:110::53
9007:2013:2380:117::80
.
.
Would there be any way to get all of the listed addresses as a result, not just the first one? The number of said addresses is dynamic, so it may change in the future.
Honestly, I have no idea how to begin with this one. I would appreciate any hints or a pointer in the right direction.
Thank you very much for your time and attention!
Like this:
with open("filename.txt") as f:
    for x in f:
        a = x.split(" -- ")
        print(a[1])
If the line with a given value always starts with the keyword, you can try something like this:
with open('file.txt', 'r') as file:
    for line in file:
        if line.startswith('keyword'):
            keyword, value = line.split(' -- ')
To gather all the addresses, I'd initialise a list of addresses beforehand, then add the line
addresses.append(value)
inside the if statement.
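Putting those pieces together, a minimal sketch of the approach (the file lines are inlined as a list for illustration, and the `collecting` flag is my own naming, not from the answer):

```python
values = {}
addresses = []
collecting = False  # True while we are inside the Adresses block

# Inlined sample lines standing in for the file from the question
lines = ["SERVER -- server01", "Adresses:", "6001:8000:1080:142::12"]

for line in lines:
    line = line.strip()
    if line.startswith('Adresses:'):
        collecting = True          # the following lines are addresses
    elif ' -- ' in line:
        collecting = False
        keyword, value = line.split(' -- ')
        values[keyword] = value
    elif collecting and line:
        addresses.append(line)
```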
Your best friend for this kind of task will be the str.split method. You can put your data in a dict which will map keywords to values:
data = {}  # Create a new dict
with open('data.txt') as file:  # Open the file containing the data
    lines = file.read().split('\n')  # Split the file into lines
    for line in lines:  # For each line
        if ' -- ' not in line:  # Skip empty or malformed lines (e.g. the trailing newline)
            continue
        keyword, value = line.split(' -- ')  # Extract keyword and value
        data[keyword] = value  # Put it in the dict
Then, you can access your values with data['USERNAME'], for example. This method will work on any document containing a key-value association on each line (even if you have more than 3 keywords). However, it will not work if the same text file contains the addresses in the format you mentioned.
If you want to include the addresses in the same file, you'll need to adapt the code. You can, for example, check whether the splitted line contains two elements (= key and value on the same line, like USERNAME) or only one (= key with values on multiple lines, like Adresses:). Then you can store all the different addresses in a list, and bind this list to the data dict. It's not a problem to have a dict of the form:
{
    'USERNAME': 'example.name',
    'Addresses': ['6001:8000:1080:142::12', '8002:2013:2380:110::53']
}
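A hedged sketch of that adaptation, branching on the length of the splitted line (the sample data is inlined, and the `endswith(':')` heuristic for spotting a multi-line block is my assumption, not part of the original answer):

```python
data = {}
current_list = None  # the list we are currently appending addresses to

sample = ["USERNAME -- example.name", "Adresses:", "6001:8000:1080:142::12"]

for line in sample:
    splitted_line = line.strip().split(' -- ')
    if len(splitted_line) == 2:            # key and value on the same line
        data[splitted_line[0]] = splitted_line[1]
        current_list = None
    elif splitted_line[0].endswith(':'):   # start of a multi-line block
        current_list = data.setdefault(splitted_line[0].rstrip(':'), [])
    elif current_list is not None and splitted_line[0]:
        current_list.append(splitted_line[0])
```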
I have a csv file that I need to format before I can send the data to zabbix, but I've encountered a problem in one of the csv files I have to use.
This is an example of the part of the file I have problems with:
Some_String_That_ineed_01[ifcontainsCell01removehere],xxxxxxxxx02
Some_String_That_ineed_01vtcrmp01[ifcontainsCell01removehere]-[1],aass
So these are 2 lines from the file; the other lines I have already treated.
I need to check if Cell01 is in the line:
if 'Cell01' in h: do something.
I need to remove all the content between the [ ] (brackets included) when they contain the word Cell01, and leave only this:
Some_String_That_ineed_01,xxxxxxxxx02
Some_String_That_ineed_01vtcrmp01-[1],aass
Otherwise my script already handles it quite easily. There must be a better way than what I have in mind, which is to use h.split on the first [, then split again on the ,, then remove the content I want gone and join the remaining strings. I can't simply use replace, because I need to keep this data ([1]).
Later on, with the desired result, I will add this to zabbix sender as the item and item key_. I already have the hosts, timestamps and values.
You should use a regexp (re module):
import re

s = "Some_String_That_ineed_01[ifcontainsCell01removehere],xxxxxxxxx02"
replaced = re.sub(r'\[.*Cell01.*\]', '', s)
print(replaced)
will return:
[root#somwhere test]# python replace.py
Some_String_That_ineed_01,xxxxxxxxx02
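One caveat worth adding: the greedy `.*` can overmatch when a line contains more than one bracketed group, as in the asker's second line. A character-class variant (an alternative I'm suggesting, not the answer's original pattern) removes only the bracket pair that actually contains Cell01:

```python
import re

s = "Some_String_That_ineed_01vtcrmp01[ifcontainsCell01removehere]-[1],aass"

# Greedy: matches from the first '[' to the last ']', so '-[1]' is lost too
print(re.sub(r'\[.*Cell01.*\]', '', s))
# Some_String_That_ineed_01vtcrmp01,aass

# '[^][]*' forbids brackets inside the match, so only the Cell01 pair goes
print(re.sub(r'\[[^][]*Cell01[^][]*\]', '', s))
# Some_String_That_ineed_01vtcrmp01-[1],aass
```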
I'm working with a Rest Api for finding address details. I pass it an address and it passes back details for that address: lat/long, suburb etc. I'm using the requests library with the json() method on the response and adding the json response to a list to analyse later.
What I'm finding is that when there is a single match for an address the 'FoundAddress' key in the json response contains a dictionary but when more than one match is found the 'FoundAddress' key contains a list of dictionaries.
The returned json looks something like:
For a single match:
{
'FoundAddress': {AddressDetails...}
}
For multiple matches:
{
'FoundAddress': [{Address1Details...}, {Address2Details...}]
}
I don't want to write code to handle a single match and then multiple matches.
How can I modify the 'FoundAddress' so that when there is a single match it changes it to a list with a single dictionary entry? Such that I get something like this:
{
'FoundAddress': [{AddressDetails...}]
}
If it's the external API sending responses in that format then you can't really change FoundAddress itself, since it will always arrive in that format.
You can change the response if you want to, since you have full control over what you've received:
r = json.loads(response)
fixed = r['FoundAddress'] if (type(r['FoundAddress']) is list) else [r['FoundAddress']]
r['FoundAddress'] = fixed
Alternatively you can do the distinction at address usage time:
def func(foundAddress):
# work with a single dictionary instance here
then:
result = list(map(func, r['FoundAddress'])) if (type(r['FoundAddress']) is list) else [func(r['FoundAddress'])]
But honestly I'd take a clear:
if type(r['FoundAddress']) is list:
    result = list(map(func, r['FoundAddress']))
else:
    result = [func(r['FoundAddress'])]
or the response fix-up over the a if b else c one-liner any day.
If you can, I would just change the API. If you can't there's nothing magical you can do. You just have to handle the special case. You could probably do this in one place in your code with a function like:
def handle_found_addresses(found_addresses):
    if not isinstance(found_addresses, list):
        found_addresses = [found_addresses]
    ...
and then proceed from there to do whatever you do with found addresses as if the value is always a list with one or more items.
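A runnable sketch of that helper, completed with invented sample payloads (the suburb values are placeholders, not from the real API):

```python
def handle_found_addresses(found_addresses):
    # Wrap a lone dict in a list so callers always see a list
    if not isinstance(found_addresses, list):
        found_addresses = [found_addresses]
    return found_addresses

single = {'FoundAddress': {'Suburb': 'Placeholder'}}
multi = {'FoundAddress': [{'Suburb': 'A'}, {'Suburb': 'B'}]}

for r in (single, multi):
    r['FoundAddress'] = handle_found_addresses(r['FoundAddress'])

print(single['FoundAddress'])      # [{'Suburb': 'Placeholder'}]
print(len(multi['FoundAddress']))  # 2
```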
I am seeking some advice whether it be in terms of a script (possibly python?) that I could use to do the following.
I basically have two documents, taken from a DB:
document one contains :
hash / related username.
example:
fb4aa888c283428482370 username1
fb4aa888c283328862370 username2
fb4aa888c283422482370 username3
fb4aa885djsjsfjsdf370 username4
fb4aa888c283466662370 username5
document two contains:
hash : plaintext
example:
fb4aa888c283428482370:plaintext
fb4aa888c283328862370:plaintext2
fb4aa888c283422482370:plaintext4
fb4aa885djsjsfjsdf370:plaintextetc
fb4aa888c283466662370:plaintextetc
Can anyone think of an easy way for me to match up the hashes in document two with the relevant usernames from document one into a new document (say, document three), and add the plaintext, so it would look like the following:
Hash : Relevant Username : plaintext
This would save me a lot of time having to cross reference two files, find the relevant hash manually and the user it belongs to.
I've never actually used python before, so some examples would be great!
Thanks in advance
I don't have any code for you but a very basic way to do this would be to whip up a script that does the following:
Read the first doc into a dictionary with the hashes as keys.
Read the second doc into a dictionary with the hashes as keys.
Iterate through both dictionaries, by key, in the same loop, writing out the info you want into the third doc.
You didn't really specify how you wanted the output, but this should get you close enough to modify to your liking. There are guys out there good enough to shorten this into a few lines of code - but I think the readability of keeping it long may be helpful to you just getting started.
Btw, I would probably avoid this altogether and do the join in SQL before creating the file -- but that wasn't really your question : )
usernames = dict()
plaintext = dict()
result = dict()

with open('username.txt') as un:
    for line in un:
        arry = line.split()  # Turns the line into an array of two parts
        hash, user = arry[0], arry[1]
        usernames[hash] = user.rsplit()[0]  # add to dictionary

with open('plaintext.txt') as un:
    for line in un:
        arry = line.split(':')
        hash, txt = arry[0], arry[1]
        plaintext[hash] = txt.rsplit()[0]

for key, val in usernames.items():
    hash = key
    txt = plaintext[hash]
    result[val] = txt

with open("dict.txt", "w") as w:
    for name, txt in result.items():
        w.write('{0} = {1}\n'.format(name, txt))

print(usernames)  # {'fb4aa888c283466662370': 'username5', 'fb4aa888c283422482370': 'username3' ...
print(plaintext)  # {'fb4aa888c283466662370': 'plaintextetc', 'fb4aa888c283422482370': 'plaintext4' ...
print(result)     # {'username1': 'plaintext', 'username3': 'plaintext4', ...
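One fragility worth flagging in the join loop above: `plaintext[hash]` raises a KeyError if a hash from the first file has no match in the second. `dict.get` sidesteps that (a variation on the answer, with made-up sample data; 'deadbeef' deliberately has no match):

```python
usernames = {'fb4aa888c283428482370': 'username1', 'deadbeef': 'username9'}
plaintext = {'fb4aa888c283428482370': 'plaintext'}

result = {}
for hash_, user in usernames.items():
    txt = plaintext.get(hash_)  # None instead of a KeyError when unmatched
    if txt is not None:
        result[user] = txt

print(result)  # {'username1': 'plaintext'}
```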
I have a text log file that looks like this:
Line 1 - Date/User Information
Line 2 - Type of LogEvent
Line 3-X, variable number of lines with additional information,
could be 1, could be hundreds
Then the sequence repeats.
There are around 20K lines of log, 50+ types of log events, approx. 15K separate user/date events. I would like to parse this in Python and make this information queryable.
So I thought I'd create a class LogEvent that records user, date (which I extract and convert to datetime), action, description... something like:
class LogEvent():
    def __init__(self, date, user):
        self.date = date  # string converted to datetime object
        self.user = user
        self.content = ""
Such an event is created each time a line of text with user/date information is parsed.
To add the log event information and any descriptive content, there could be something like:
def classify(self, logevent):
    self.logevent = logevent

def addContent(self, lineoftext):
    self.content += lineoftext
To process the text file, I would use readline() and proceed one line at a time. If the line is user/date, I instantiate a new object and add it to a list...
newevent = LogEvent(date,user)
eventlist.append(newevent)
and start adding action/content until I encounter a new object.
eventlist[-1].classify(logevent)
eventlist[-1].addContent(line)
All this makes sense (unless you convince me there is a smarter way to do it, or point me to a useful Python module I am not aware of). I'm trying to decide how best to classify the log event type when working with a set list of more than 50 possible types, and I don't just want to accept the entire line of text as the log event type. Instead I need to compare the start of the line against a list of possible values...
What I don't want to do is have 50 of these:
if line.startswith("ABC"):
    logevent = "foo"
if line.startswith("XYZ"):
    logevent = "boo"
I thought about using a dict as lookup table but I am not sure how to implement that with the "startswith"... Any suggestions would be appreciated, and my apologies if I was way too long winded.
If you have a dictionary with your logEvent types as keys, and whatever you want to go into the logevent attribute as values, you can do this:
logEvents = {"ABC":"foo", "XYZ":"boo", "Data Load":"DLtag"}
and the line from your log file is this,
line = "Data Load: 127 row uploaded"
you can check if any of the keys above are at the beginning of the line,
for k in logEvents:
    if line.startswith(k):
        logevent = logEvents[k]
This will loop over all the keys in logEvents and check if line starts with one of them. You can do whatever you like after the if conditional. You could put this into a function that is called after a line of text with user/date information is parsed. If you want to do something if no keys are found you can do this,
for k in logEvents:
    if line.startswith(k):
        logevent = logEvents[k]
        return
raise ValueError("logEvent not recognized.\n line = " + line)
Note, the exact type of exception you raise is not super important. I chose one of the builtin exceptions to avoid subclassing. Here you can see all the builtin exceptions.
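The same prefix lookup can also be written as a single expression with next() over a generator, where None signals "not recognized" (`classify_line` is just an illustrative name, not from the answer):

```python
logEvents = {"ABC": "foo", "XYZ": "boo", "Data Load": "DLtag"}

def classify_line(line):
    # First key that prefixes the line wins; None means "not recognized"
    return next((tag for prefix, tag in logEvents.items()
                 if line.startswith(prefix)), None)

print(classify_line("Data Load: 127 rows uploaded"))  # DLtag
print(classify_line("unrelated line"))                # None
```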
Since I didn't do a good job posing my question, I have given it more thought and come up with this answer, which is similar to this thread.
I would like a clean, easily manageable solution to process each line of text differently, based on whether certain conditions are met. I didn't want to use a bunch of if/else clauses. So I tried instead to move both condition and consequence (processing) into a decisionDict = {}.
### RESPONSES WHEN CERTAIN CONDITIONS ARE MET - simple examples
def shorten(line):
    return line[:25]

def abc_replace(line):
    return line.replace("xyz", "abc")

### CONDITIONAL CHECKS FOR CONTENTS OF LINES OF TEXT - simple examples
def check_if_string_in_line(line):
    response = False
    if "xyz" in line:
        response = True
    return response

def check_if_longer_than25(line):
    response = False
    if len(line) > 25:
        response = True
    return response
### DECISION DICTIONARY - could be extended for any number of condition/response
decisionDict = {check_if_string_in_line: abc_replace, check_if_longer_than25: shorten}
### EXAMPLE LINES OF SILLY TEXT
lines = ["Alert level raised to xyz",
         "user 5 just uploaded duplicate file",
         "there is confusion between xyz and abc"]
for line in lines:
    for k in decisionDict.keys():
        if k(line):
            print(decisionDict[k](line))
            break
This keeps all the conditions and actions neatly separated. It also currently does not allow more than one condition to apply to any one line of text: once the first condition resolves to True, we move on to the next line of text.
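One variation worth considering: on Python versions before 3.7, dict iteration order was not guaranteed, so a list of (condition, action) pairs makes the evaluation order explicit. A sketch along those lines (the names are my own, reusing the helper functions above):

```python
def abc_replace(line):
    return line.replace("xyz", "abc")

def shorten(line):
    return line[:25]

# Evaluated top to bottom; the first matching rule wins
rules = [
    (lambda line: "xyz" in line, abc_replace),
    (lambda line: len(line) > 25, shorten),
]

def process(line):
    for condition, action in rules:
        if condition(line):
            return action(line)
    return line  # no rule matched; leave the line unchanged

print(process("Alert level raised to xyz"))  # Alert level raised to abc
```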