Ignore non-existent attributes in a config file parser (Python)

I have a function that returns one list at a time, like this:
['7:49', 'Section1', '181', '1578', '634', '4055']
['7:49', 'Section2', '181', '1578', '634', '4055']
These values are time, section, count, avg, min, max (the order is always the same).
My aim is to send an alert if any of the values breaches the limits defined in a config file.
So I created a config like this:
[Section1]
Count:10
Min:20
Max:100
Avg:50
[Section2]
Count:10
Min:20
Max:100
Avg:50
My function to check the max and min limits:
def checklimit(line):
    print "Inside CheckLimit", line[1],line[4],line[5]
    if line[4] < ConfigSectionMap(line[1])['min'] or line[5] > ConfigSectionMap(line[1])['max']:
        sendAlert(line)
This works, but it could be improved and it misses some corner cases.
Suppose someone leaves the config as below:
[Section1]
Count:10
Min:
Max:
Avg:50
[Section2]
Count:10
Avg:50
This means the user only wants to check Count and Avg. How should I handle these cases so that only the parameters given in the config file are checked? I have used Config Parser from here
Suggestions for question title improvement are welcome. It was hard to put one. Thanks

There are many ways to approach this. For key lookups in dictionaries, you can use the dict.get() method and provide a fallback value.
So instead of
ConfigSectionMap(line[1])['min']
you can use something like this, which returns 0 if the key does not exist:
ConfigSectionMap(line[1]).get('min', 0)
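One caveat with a numeric fallback: 0 behaves like "no lower bound" for min, but as a fallback for max it would make every positive value count as a breach. A minimal sketch of an alternative, assuming Python 3's configparser and treating a missing key and an empty value the same way (as "do not check this parameter"), might look like this. The names CONFIG_TEXT, FIELD_INDEX, load_limits and breached are hypothetical and not from the original post. The list elements are strings, so the sketch converts them to int before comparing.

```python
import configparser

# Sample config from the question: Section1 has empty Min/Max, Section2 omits them.
CONFIG_TEXT = """
[Section1]
Count:10
Min:
Max:
Avg:50
[Section2]
Count:10
Avg:50
"""

# Position of each metric in the incoming list: time, section, count, avg, min, max.
FIELD_INDEX = {"count": 2, "avg": 3, "min": 4, "max": 5}


def load_limits(text):
    """Return {section: {key: int}} with missing or empty options left out."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    limits = {}
    for section in parser.sections():
        limits[section] = {}
        # configparser lower-cases option names by default.
        for key, raw in parser.items(section):
            if raw.strip():  # skip options such as "Min:" with no value
                limits[section][key] = int(raw)
    return limits


def breached(line, limits):
    """Return the names of the configured limits that this line breaches."""
    hits = []
    for name, limit in limits.get(line[1], {}).items():
        if name not in FIELD_INDEX:
            continue  # unknown option, nothing to compare against
        value = int(line[FIELD_INDEX[name]])
        # Assumption: 'min' is a lower bound, every other key is an upper bound.
        if (name == "min" and value < limit) or (name != "min" and value > limit):
            hits.append(name)
    return hits


limits = load_limits(CONFIG_TEXT)
print(breached(['7:49', 'Section1', '181', '1578', '634', '4055'], limits))
```

With this shape, an alert would be sent only when breached() returns a non-empty list, and parameters the user left out are never compared.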

Related

Extract text from a config file [duplicate]

I'm using a config file to inform my Python script of a few key-values, for use in authenticating the user against a website.
I have three variables: the URL, the user name, and the API token.
I've created a config file with each key on a different line, so:
url:<url string>
auth_user:<user name>
auth_token:<API token>
I want to be able to extract the text after the key words into variables, also stripping any "\n" that exist at the end of the line. Currently I'm doing this, and it works but seems clumsy:
with open(argv[1], mode='r') as config_file:
    lines = config_file.readlines()
for line in lines:
    url_match = match('jira_url:', line)
    if url_match:
        jira_url = line[9:].split("\n")[0]
    user_match = match('auth_user:', line)
    if user_match:
        auth_user = line[10:].split("\n")[0]
    token_match = match('auth_token', line)
    if token_match:
        auth_token = line[11:].split("\n")[0]
Can anybody suggest a more elegant solution? Specifically it's the ... = line[10:].split("\n")[0] lines that seem clunky to me.
I'm also slightly confused why I can't reuse my match object within the for loop, and have to create new match objects for each config item.
You could use a .yml file and read the values with the yaml.load() function:
import yaml
with open('settings.yml') as file:
    settings = yaml.load(file, Loader=yaml.FullLoader)
Now you can access elements like settings["url"] and so on.
If the format is always <tag>:<value>, you can easily parse it by splitting each line at the colon and filling a dictionary:
settings = dict()
with open(filename, "r") as config_file:
    for line in config_file:
        # Drop the trailing newline (if any), then split at the colons.
        elements = line.rstrip("\n").split(':')
        # Re-join the rest so values that contain colons (e.g. URLs) stay intact.
        settings[elements[0]] = ':'.join(elements[1:])
So, you get a dictionary that has the tags as keys and the values as values. You can then just refer to these dictionary entries in your program
(e.g. if you need the auth_token, just use settings["auth_token"]).
If you can add one line to the config file, configparser is a good choice:
https://docs.python.org/3/library/configparser.html
[1] Config file: 1.cfg
# configparser needs a section name in the config file
[DEFAULT]
url:<url string>
auth_user:<user name>
auth_token:<API token>
[2] Python script
import configparser
config = configparser.ConfigParser()
config.read('1.cfg')
print(config.get('DEFAULT','url'))
print(config.get('DEFAULT','auth_user'))
print(config.get('DEFAULT','auth_token'))
[3] output
<url string>
<user name>
<API token>
configparser's methods are also useful when you can't guarantee that the config file is always complete.
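As a small sketch of that last point, assuming Python 3: ConfigParser.get() accepts a fallback keyword that is returned when an option is missing. The sample values below (the example.invalid URL and the user name alice) are placeholders, not values from the original post, and the config is read from a string only to keep the example self-contained.

```python
import configparser

config = configparser.ConfigParser()
# In a real script this would be config.read('1.cfg').
config.read_string("""
[DEFAULT]
url: https://example.invalid
auth_user: alice
""")

url = config.get('DEFAULT', 'url')
# auth_token is absent from the file, so the fallback is returned
# instead of a NoOptionError being raised.
auth_token = config.get('DEFAULT', 'auth_token', fallback=None)

if auth_token is None:
    print('No auth_token configured')
```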
You have a couple of great answers already, but I wanted to step back and provide some guidance on how you might approach these problems in the future. Getting quick answers sometimes prevents you from understanding how those people knew about the answers in the first place.
When you zoom out, the first thing that strikes me is that your task is to provide config, using a file, to your program. Software has the remarkable property of solve-once, use-anywhere. Config files have been a problem worth solving for at least 40 years, so you can bet your bottom dollar you don't need to solve this yourself. And already-solved means someone has already figured out all the little off-by-one and edge-case dramas like stripping line endings and dealing with unexpected input. The challenge, of course, is knowing what solution already exists. If you haven't spent 40 years peeling back the covers of computers to see how they tick, it's difficult to "just know". So you might have a poke around on Google for "config file format" or something.
That would lead you to one of the most prevalent config file systems on the planet - the INI file. Just as useful now as it was 30 years ago, and as a bonus, looks not too dissimilar to your example config file. Then you might search for "read INI file in Python" or something, and come across configparser and you're basically done.
Or you might see that sometime in the last 30 years, YAML became the more trendy option, and wouldn't you know it, PyYAML will do most of the work for you.
But none of this gets you any better at using Python to extract from text files in general. So zooming in a bit, you want to know how to extract parts of lines in a text file. Again, this problem is an age-old problem, and if you were to learn about this problem (rather than just be handed the solution), you would learn that this is called parsing and often involves tokenisation. If you do some research on, say "parsing a text file in python" for example, you would learn about the general techniques that work regardless of the language, such as looping over lines and splitting each one in turn.
Zooming in one more step closer, you're looking to strip the new line off the end of the string so it doesn't get included in your value. Once again, this ain't a new problem, and with the right keywords you could dig up the well-trodden solutions. This is often called "chomping" or "stripping", and with some careful search terms, you'd find rstrip() and friends, and not have to do awkward things like splitting on the '\n' character.
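To make the stripping step concrete, here is a short sketch. The sample line and the variable names are made up for illustration. rstrip() with an argument removes only the listed trailing characters, while rstrip() with no argument removes all trailing whitespace.

```python
line = "auth_user: alice \r\n"

# Remove only trailing carriage returns and newlines.
no_newline = line.rstrip("\r\n")

# Remove all trailing whitespace, including the space after "alice".
no_trailing_ws = line.rstrip()

# partition() splits at the first colon only; strip() tidies the value.
key, _, value = no_trailing_ws.partition(":")
value = value.strip()

print(repr(no_newline), repr(key), repr(value))
```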
Your final question is about re-using the match object. This is much harder to research. But again, the "solution" won't necessarily show you where you went wrong. What you need to keep in mind is that the statements in the for loop are sequential. To think them through you should literally execute them in your mind, one after another, and imagine what's happening. Each time you call match, it either returns None or a Match object. You never use the object, except to check for truthiness in the if statement. And next time you call match, you do so with different arguments so you get a new Match object (or None). Therefore, you don't need to keep the object around at all. You can simply do:
if match('jira_url:', line):
    jira_url = line[9:].split("\n")[0]
if match('auth_user:', line):
    auth_user = line[10:].split("\n")[0]
and so on. Not only that, if the first if triggered then you don't need to bother calling match again - it will certainly not trigger any of the other matches for the same line. So you could do:
if match('jira_url:', line):
    jira_url = line[9:].rstrip()
elif match('auth_user:', line):
    auth_user = line[10:].rstrip()
and so on.
But then you can start to think - why bother doing all these matches on the colon, only to then manually split the string at the colon afterwards? You could just do:
# Split at the first colon only, so a URL value keeps its own colons.
tokens = line.rstrip().split(':', 1)
if tokens[0] == 'jira_url':
    jira_url = tokens[1]
elif tokens[0] == 'auth_user':
    auth_user = tokens[1]
If you keep making these improvements (and there's lots more to make!), eventually you'll end up re-writing configparser, but at least you'll have learned why it's often a good idea to use an existing library where practical!

Does "in" do the same thing as str.contains()?

I'm new to Python but am very confused as to how this code works:
Correct code I don't understand:
I don't understand how, in the function, you can just write '.org' in domain to check whether the referrer_domain is an organization. I thought you would have to filter via .str.contains() to see whether the domain includes .org or .com.
I originally coded:
dot_org = data[data['referrer_domain'].str.contains('.org')]
dot_com = data[data['referrer_domain'].str.contains('.com')]
def domain_type(type):
    if type in dot_org['referrer_domain']:
        return 'organization'
    elif type in dot_com['referrer_domain']:
        return 'company'
    else:
        return 'other'
data['new_column'] = data['referrer_domain'].apply(domain_type)
But this ended up labeling all of the rows in the new column I created as "other".
Is anyone able to explain why the code in the picture works, but why the code above doesn't?
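First, you should not use type as a variable name, because it shadows the built-in type() function.
Aside from that, there is no str.contains method in plain Python; .str.contains() belongs to pandas Series. The standard way of checking whether a string contains another string is the in operator, which is why '.org' in domain works on the single value that apply() passes to the function. Your version fails because in applied to a pandas Series tests the index labels, not the values, so with the default integer index neither condition is ever true and every row ends up as 'other'.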
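The code in the picture is not reproduced here, so the following is only a sketch of what such a function presumably looks like, using plain Python strings and the in operator. The sample domains are invented for illustration.

```python
def domain_type(domain):
    """Classify a single domain string by the text it contains."""
    if '.org' in domain:
        return 'organization'
    elif '.com' in domain:
        return 'company'
    return 'other'


# Each call receives one string, which is also what Series.apply() would pass in.
labels = [domain_type(d) for d in ['wikipedia.org', 'example.com', 'python.net']]
print(labels)
```

Because the function works on one string at a time, it can be handed to data['referrer_domain'].apply(domain_type) exactly as in the last line of the question's code.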

How to delete a document from the index by its path in Whoosh

First I add documents to the index like this:
writer.add_document(title=doc_path.split(os.sep)[-1], path=doc_path, content=text, textdata=text)
Then I need to delete one of them completely from the index by its path. The documentation says there are a few methods for this that are not low-level:
delete_by_term(fieldname, termtext)
Deletes any documents where the given (indexed) field contains the given term. This is mostly useful for ID or KEYWORD fields.
delete_by_query(query)
Deletes any documents that match the given query.
But I can't find a convenient method where I can specify the path of the document and just remove it. There is also a low-level method that takes the internal doc_number, which I would have to obtain somehow.
Can anyone advise me on the best way to accomplish this?
ix = open_dir('/my_index_dir_path/..')
writer = ix.writer()
writer.delete_by_term('path', doc_path)
writer.commit()
The delete_by_term method does exactly what I need. Note that the first argument is the text string 'path' (the field name), and then comes the actual path. My mistake was to pass the actual path instead of the field name.

Building a set of records incrementally as app progresses

I have a sysadmin-type CLI app that reads in info from a config file. I cannot change the format of the config file below.
TYPE_A = "value1,value2,value2"
TYPE_A = "value3,value4,value5"
TYPE_B = "valuex,valuey,valuez"
Based on the TYPE, I'll need to do some initial processing with each one. After I'm done with that step for all, I need to do some additional processing and depending on the options chosen either print the state and intended action(s) or execute those action(s).
I'd like to do the initial parsing of the config into a dict of lists of dicts and update every instance of TYPE_A, TYPE_B, TYPE_C, etc with all the pertinent info about it. Then either print the full state or execute the actions (or fail if the state of something was incorrect)
My thought is it would look something like:
dict
    TYPE_A_list
        dict_A[0] key:value,key:value,key:value
        dict_A[1] key:value,key:value,key:value
    TYPE_B_list
        dict_B[0] key:value,key:value,key:value
        dict_B[1] key:value,key:value,key:value
I think I'd want to read the config into that and then add keys and values or update values as the app progresses and reprocesses each TYPE.
Finally, my questions:
I'm not sure how to iterate over each list of dicts, or how to add list elements and add or update key:value pairs.
Is what I describe above the best way to go about this?
I'm fairly new to Python, so I'm open to any advice. FWIW, this will be Python 2.6.
A little clarification on the config file lines
CAR_TYPE = "Ford,Mustang,Blue,2005"
CAR_TYPE = "Honda,Accord,Green,2009"
BIKE_TYPE = "Honda,VTX,Black,2006"
BIKE_TYPE = "Harley,Sportster,Red,2010"
TIRE_TYPE = "170R15,whitewall"
Each type will have the same order and number of values.
No need to "remember" there are two different TYPE_A assignments - you can combine them.
TYPE_A = "value1,value2,value2"
TYPE_A = "value3,value4,value5"
Whether these are parsed as only one of them or as both depends on the implementation of your sysadmin CLI app.
Then the data model should be:
dict
    TYPE_A: list(value1, value2, value3)
    TYPE_B: list(valuex, valuey, valuez)
That way, you can iterate through the dictionary's items() pretty easily:
# config is the dictionary described above
for _type, values in config.items():
    for value in values:
        print "%s: %s" % (_type, value)
        # or whatever you wish to do
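For the "dict of lists of dicts" layout the question describes, a rough sketch could look like the following. It is written for Python 3 rather than the 2.6 mentioned in the question. The field names in FIELDS and the added "state" key are assumptions for illustration, since the original post does not name the columns.

```python
from collections import defaultdict

# Hypothetical column names per type; the original post does not name them.
FIELDS = {
    "CAR_TYPE": ("make", "model", "color", "year"),
    "BIKE_TYPE": ("make", "model", "color", "year"),
    "TIRE_TYPE": ("size", "style"),
}


def parse_config(lines):
    """Build {TYPE: [record_dict, ...]}, keeping repeated TYPE lines in order."""
    records = defaultdict(list)
    for line in lines:
        key, sep, raw = line.strip().partition("=")
        key = key.strip()
        if not sep or key not in FIELDS:
            continue  # skip blank, malformed, or unknown lines
        values = raw.strip().strip('"').split(",")
        records[key].append(dict(zip(FIELDS[key], values)))
    return records


sample = [
    'CAR_TYPE = "Ford,Mustang,Blue,2005"',
    'CAR_TYPE = "Honda,Accord,Green,2009"',
    'BIKE_TYPE = "Honda,VTX,Black,2006"',
    'BIKE_TYPE = "Harley,Sportster,Red,2010"',
    'TIRE_TYPE = "170R15,whitewall"',
]
records = parse_config(sample)

# Later processing steps can add or update keys on each record in place.
for type_name, entries in records.items():
    for entry in entries:
        entry["state"] = "checked"

for type_name, entries in records.items():
    for entry in entries:
        print(type_name, entry)
```

Appending to a list per type keeps both CAR_TYPE lines instead of letting the second overwrite the first, and each record stays a plain dict that later steps can extend.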

dictionary key-call

I'm building a test program. It's essentially a database of bugs and bug fixes. It may end up being an entire database for all my time working in Python.
I want to create an effect of layers by using a dictionary.
Here is the code as of April 29, 2011:
modules=['pass']
syntax={'PRINT':''' in eclipse anthing which
you "PRINT" needs to be within a set of paranthesis''','StrRet':'anytime you need to use the return action in a string, you must use the triple quotes.'}
findinp= input('''where would you like to go?
Dir:''')
if findinp=='syntax':
    print(syntax)
    dir2= input('choose a listing')
    if dir2=='print'or'PRINT'or'Print':
        print('PRINT' in syntax)
Now when I use this, I get the ENTIRE dictionary, not just the first layer. How would I do something like this? Do I need to just list links in the console, or is there a better way to do so?
Thanks,
Pre.Shu.
I'm not quite sure what you want, but to print the content of a single key of a dictionary, you index it:
syntax['PRINT']
Maybe this helps a bit:
modules=['pass']
syntax={
    'PRINT':''' in eclipse anthing which
you "PRINT" needs to be within a set of paranthesis''',
    'STRRET':'anytime you need to use the return action in a string, you must use the triple quotes.'}
choice = input('''where would you like to go?
Dir:''').upper()
if choice in syntax:
    print(syntax[choice])
else:
    print("no data ...")
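Two details in the question's code are worth spelling out. print(syntax) prints the whole dictionary, and the condition dir2=='print'or'PRINT'or'Print' is always true, because the non-empty strings 'PRINT' and 'Print' are truthy on their own. The sketch below shows both points; the shortened dictionary entries and the helper name lookup are made up for illustration.

```python
syntax = {
    'PRINT': 'anything you print needs to be within a set of parentheses',
    'STRRET': 'use triple quotes when a string needs a line break',
}


def lookup(topic, table):
    """Return the entry for topic, ignoring case and surrounding spaces."""
    return table.get(topic.strip().upper(), 'no data ...')


user_input = 'something else'

# Parsed as (user_input == 'print') or 'PRINT' or 'Print', so it is always truthy.
always_true = bool(user_input == 'print' or 'PRINT' or 'Print')

# A membership test compares the input against each allowed spelling.
really_matches = user_input in ('print', 'PRINT', 'Print')

print(always_true, really_matches)
print(lookup(' print ', syntax))
```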
