How to iterate more efficiently with multiple if statements in Python

I have 3 if statements and they are really ugly in terms of style and efficiency.
They parse HTML with BS4.
HTML is in example_post variable.
If element exists -> get text
If does not exist -> assign 'None' as a str.
if example_post.find('span', class_='tag1'):
    post_reactions = example_post.find('span', class_='tag1').getText()
else:
    post_reactions = 'None'
if example_post.find('span', class_='tag2'):
    post_comments = example_post.find('span', class_='tag2').getText()
else:
    post_comments = 'None'
if example_post.find('span', class_='tag3'):
    post_shares = example_post.find('span', class_='tag3').getText()
else:
    post_shares = 'None'
I started to google how to make it better and found that it is possible to drive the if statements from a dictionary, so I defined this dict:
post_reactions_dict = {'post_reactions': 'tag1', 'post_comments': 'tag2', 'post_shares': 'tag3'}
and tried it like this:
post_titles = []
post_values = []
for key, value in post_reactions_dict.items():
    # key is the variable name, value is the CSS class to look up
    if example_post.find('span', class_=value):
        post_values.append(example_post.find('span', class_=value).getText())
        post_titles.append(key)
    else:
        post_titles.append(key)
        post_values.append('None')
It is ok, but maybe it is possible to make it even better?
Ideal result:
post_titles = ['post_reactions', 'post_comments', 'post_shares']
post_values (it depends) but for the question ['None', 'None', 'None']

I would suggest making this a bit more generic and avoid using exceptions as the "normal" program flow:
def get_span(element, class_):
    tag = element.find('span', class_=class_)
    return None if tag is None else tag.getText()

post_reactions = get_span(example_post, 'tag1')
post_comments = get_span(example_post, 'tag2')
post_shares = get_span(example_post, 'tag3')

post = {}
attributes = ('reactions', 'tag1'), ('comments', 'tag2'), ('shares', 'tag3')
for attribute, tag in attributes:
    try:
        post[attribute] = example_post.find('span', class_=tag).getText()
    except AttributeError:
        post[attribute] = None
Don't use individual variables, use a dict to store the data.
Figure out what the variables (the differences between your repeated code) are; in this case it's just post_* and tag*, put them as pairs together as data.
Don't repeat the example_post.find(...) call; here we rely on the fact that find() returns None when nothing matches, so calling .getText() on the result raises an AttributeError.
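A minimal sketch that combines the two suggestions above: a helper that checks for None explicitly, plus a dict of name/tag pairs. FakeTag and FakePost are hypothetical stand-ins so the snippet runs without BeautifulSoup; with a real parsed post only get_span and the dict comprehension would be needed. The default='None' parameter is an assumption that mirrors the string the question asks for.

```python
class FakeTag:
    """Hypothetical stand-in for a BeautifulSoup Tag, only for this demo."""
    def __init__(self, text):
        self._text = text

    def getText(self):
        return self._text


class FakePost:
    """Hypothetical stand-in for the parsed post; find() returns None on a miss."""
    def __init__(self, spans):
        self._spans = spans

    def find(self, name, class_=None):
        return self._spans.get(class_)


def get_span(element, class_, default='None'):
    # Call find() once and test the result instead of searching twice.
    tag = element.find('span', class_=class_)
    return default if tag is None else tag.getText()


example_post = FakePost({'tag1': FakeTag('12'), 'tag3': FakeTag('3')})
attributes = {'post_reactions': 'tag1', 'post_comments': 'tag2', 'post_shares': 'tag3'}

# One dict replaces the three separate variables.
post = {name: get_span(example_post, tag) for name, tag in attributes.items()}
post_titles = list(post)
post_values = list(post.values())
print(post_titles, post_values)
```

If the two lists from the question are still needed, they fall out of the dict as shown in the last lines.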

I assume the .find() method returns an object or None? If so, here is my approach without any if:
def get_text(class_):
    try:
        return example_post.find('span', class_=class_).getText()
    except AttributeError:
        return 'None'

post_reactions = get_text('tag1')
post_comments = get_text('tag2')
post_shares = get_text('tag3')

Related

python function try & except

I wonder what I am missing in my very simple function.
I want to create a simple function which will take as argument other function and will try to run it. If it can't proceed then it should go to except statement.
def if_not_found(formula):
    try:
        return formula
    except:
        print('nothing')

text1 = if_not_found(element.find('span', {'class': 'listitem123'}).text.split()[0].replace('kd', ''))
print(text1)
Why do I receive the error below? Why didn't it print 'nothing'?
AttributeError: 'NoneType' object has no attribute 'text'
The error most likely comes from the parsing call: element.find('span', {'class': 'listitem123'}) is returning None. Besides that, Python evaluates the argument expression before if_not_found is called, so you are not passing a function as an argument; you are passing the result of the replace method, and the exception is raised outside the try block.
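To illustrate the point about evaluation order, here is a small sketch in which the expression is wrapped in a lambda so that it only runs inside the try block. FakeElement is a hypothetical stand-in whose find() always returns None, and the default parameter is an addition that is not in the original function.

```python
def if_not_found(formula, default=None):
    """Call formula() and return its result, or default if it raises AttributeError."""
    try:
        return formula()  # the expression is evaluated here, inside the try
    except AttributeError:
        return default


class FakeElement:
    """Hypothetical stand-in for a parsed element whose search finds nothing."""
    def find(self, name, attrs=None):
        return None


element = FakeElement()
text1 = if_not_found(
    lambda: element.find('span', {'class': 'listitem123'}).text.split()[0].replace('kd', ''),
    default='nothing',
)
print(text1)
```

Catching only AttributeError, rather than using a bare except, keeps unrelated errors visible.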
You have a typo. element.find("span", {"class": "listitem123"}) was not found, so it returned None. .text is not an attribute of None.
Maybe something like this would suit your situation better?
def found_or_none(element):
    found = element.find("span", {"class": "listitem123"})
    if found:
        return found.text.split()[0].replace("kd", "")
    else:
        return None

text1 = found_or_none(element)
if text1:
    print(text1)
else:
    print("Element not found")

I am trying to import an XML file with some empty attributes into a table, and I am getting this error: AttributeError: 'NoneType' object has no attribute 'strip'

def XML_get_fields_and_types_and_data_levels_3(xml_file_name):
    data_2d = []
    for child in root:
        grandchildren = child.findall(".//")
        fields = []
        types = []
        data_1d = []
        data_2d.append(data_1d)
        for grandchild in grandchildren:
            data_1d.append(convert_string_to_type(grandchild.text))
            if grandchild.tag not in fields:
                fields.append(grandchild.tag)
                types.append(get_type_of_string(grandchild.text))
    return (fields, types, data_2d)

def get_type_of_string(string):
    clean_string = string.strip()
    try:
        if clean_string is not None:
            clean_string = string.strip()
            return string.strip()
        if "." in clean_string:
            clean_string = string.split()
            if isinstance(clean_string, list):
                point_or_segment = [float(i) for i in clean_string]
                if len(point_or_segment) == 2:
                    return 'POINT'
                else:
                    return 'LSEG'
            else:
                val = float(clean_string)
                return 'REAL'
        else:
            val = int(clean_string)
            return 'INTEGER'
    except ValueError:
        return 'TEXT'
The issue is the first line of code after your function definition:
def get_type_of_string(string):
    clean_string = string.strip()
Here string might be None, for example when an element has no text, so the exception is raised before the try block is even entered. Instead of rewriting the function for you, which would be easy for me but not very helpful for you, I suggest you redesign it. Here are my hints:
there is duplicated code
the split() method always returns a list, whether or not the separator is found, so there is no need for the line if isinstance(clean_string, list)
why is the conversion to val in place if val is never used afterwards? The easiest way to check a variable's type is the isinstance() built-in function, as you did a few lines above.
try to split this function into simpler and smaller functions, or try to simplify its logic.
Hope these hints help you through. Enjoy coding.
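For readers who want to see where those hints could lead, here is one possible sketch, not the answerer's own solution. Treating None and blank text as 'TEXT' is an assumption, as is classifying any whitespace-separated run of numbers as 'POINT' (two values) or 'LSEG' (otherwise); adjust both to your schema.

```python
def get_type_of_string(string):
    """Classify a raw XML text value as INTEGER, REAL, POINT, LSEG or TEXT."""
    # Assumption: a missing or blank value is stored as TEXT.
    if string is None:
        return 'TEXT'
    clean_string = string.strip()
    if not clean_string:
        return 'TEXT'

    parts = clean_string.split()  # always a list, so no isinstance() check needed
    try:
        if len(parts) > 1:
            for part in parts:
                float(part)  # raises ValueError if any part is not numeric
            return 'POINT' if len(parts) == 2 else 'LSEG'
        if '.' in clean_string:
            float(clean_string)
            return 'REAL'
        int(clean_string)
        return 'INTEGER'
    except ValueError:
        return 'TEXT'


for sample in (None, '42', '3.14', '1.0 2.0', '1.0 2.0 3.0 4.0', 'abc'):
    print(repr(sample), '->', get_type_of_string(sample))
```

The None check comes first, so strip() is never called on a missing value.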

Python function to assign data to multiple variables

I'm using this code to assign data to some variables if a form doesn't validate. This is some logic I'll be using a lot in my script. I want to create a function so that the else portion of this statement is stored in a function, so that I can just call it rather than pasting these lines each time.
if form.validate_on_submit():
    pass  # do something
else:
    brand_title = form.brand_title.data or ''
    carrier_format = form.carrier_format.data or ''
    recording_artist = form.recording_artist.data or ''
    producer = form.producer.data or ''
    session = form.session.data or ''
    tx_date = form.tx_date.data or ''
    network = form.network.data or ''
    programme_number = form.programme_number.data or ''
    start_time_1 = form.start_time_1.data or ''
I've created a function like so:
def variables():
    brand_title = form.brand_title.data or ''
    carrier_format = form.carrier_format.data or ''
    recording_artist = form.recording_artist.data or ''
    producer = form.producer.data or ''
    session = form.session.data or ''
    tx_date = form.tx_date.data or ''
    network = form.network.data or ''
    programme_number = form.programme_number.data or ''
    start_time_1 = form.start_time_1.data or ''
But how do I return the variables so that calling the function mirrors typing each line out (as in the else branch of the first snippet)?
I've read that simply returning each variable like so:
return (brand_title, carrier_format, recording_artist, producer, session, tx_date, network, programme_number, start_time_1)
would create a tuple, which doesn't seem like the correct option for my needs.
Might I suggest a small change to your function:
def variables():
    var_dict = {}
    var_dict['brand_title'] = form.brand_title.data or ''
    var_dict['carrier_format'] = form.carrier_format.data or ''
    var_dict['recording_artist'] = form.recording_artist.data or ''
    var_dict['producer'] = form.producer.data or ''
    var_dict['session'] = form.session.data or ''
    var_dict['tx_date'] = form.tx_date.data or ''
    var_dict['network'] = form.network.data or ''
    var_dict['programme_number'] = form.programme_number.data or ''
    var_dict['start_time_1'] = form.start_time_1.data or ''
    return var_dict
Using a dictionary to store your data is cleaner, especially when you have so many related variables.
If this is a WTForm, you might just access the data which is already in a dict with var_dict = form.data.copy(), and then access your fields with var_dict.get(<var>, '') as needed.
A dictionary would be better. Also, you can create a list or other container with the names of the attributes you want and then generalize your logic into a loop:
def get_values(form):
    names = ['brand_title', 'carrier_format', 'recording_artist']  # add others
    return {name: getattr(form, name).data or '' for name in names}
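A quick way to try the loop-based approach without a web framework is sketched below. The SimpleNamespace objects are hypothetical stand-ins for a form whose fields expose a .data attribute, and passing names as a parameter is a small variation on the function above.

```python
from types import SimpleNamespace


def get_values(form, names):
    # Missing or falsy field data (None, '') becomes an empty string.
    return {name: getattr(form, name).data or '' for name in names}


# Hypothetical stand-in for a submitted form: each field has a .data attribute.
form = SimpleNamespace(
    brand_title=SimpleNamespace(data='Morning Show'),
    producer=SimpleNamespace(data=None),
)

values = get_values(form, ['brand_title', 'producer'])
print(values)
```

The caller then reads values['brand_title'] instead of a separate local variable, which avoids having to return and unpack a long tuple.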

How to combine try/except in python into a pretty one liner?

In order to make CSV files with many columns, I have many, many instances of
try:
    printlist.append(text['data'])
except:
    printlist.append('')
Is it possible to condense these 4 lines into 1 or 2 (mostly for easier reading of the code)? I've tried with this function but I haven't discovered a way to pass something that doesn't exist.
def tryexcept(input):
    try:
        printlist.append(input)
    except:
        printlist.append('')
    return printlist
UPDATE I should mention that 'text' is actually a dict value, so it should look like
printlist.append(text['data'])
(changed above)
What about:
printlist.append(text['data'] if 'data' in text else '')
Or even better, as @bruno desthuilliers suggested:
printlist.append(text.get('data',''))
[EDIT]
For nested dict I normally use my dict selector:
class TokenDataType:
    LIST = "list"
    DICT = "dict"

def _select_key(keyitt, data):
    # Walk the token iterator, descending one level per token.
    try:
        new_key = next(keyitt)
    except StopIteration:
        return data
    if new_key["t"] == TokenDataType.DICT:
        return _select_key(keyitt, data[new_key["k"]])
    elif new_key["t"] == TokenDataType.LIST:
        return _select_key(keyitt, data[new_key["i"]])

def tokenize_query(query):
    # Split "a.b.[0]" into dict-key tokens and list-index tokens.
    tokens = []
    for token in query.split("."):
        token = token.strip()
        if token:
            ttype = TokenDataType.LIST if "[" in token else TokenDataType.DICT
            if ttype == TokenDataType.LIST:
                index = None
                if len(token) >= 3:
                    index = int(token.replace("[", "").replace("]", ""))
                tokens.append({"k": token, "t": ttype, "i": index})
            else:
                tokens.append({"k": token, "t": ttype})
    return tokens

def normalize_query(query=None, tokens=None):
    if tokens is None:
        tokens = tokenize_query(query)
    return ".".join([token["k"] for token in tokens])

def select(query, data, throw_exception_on_key_not_found=False):
    tokens = tokenize_query(query)
    try:
        return _select_key(iter(tokens), data)
    except Exception as e:
        if throw_exception_on_key_not_found:
            raise e
        return None

DQ = select

if __name__ == "__main__":
    test = {"bla": 1, "foo": {"bar": 2}, "baz": [{"x": 1}, {"x": 2}]}
    print(DQ(".bla", test))
    print(DQ("bla", test))
    print(DQ("nothere", test))
    print(DQ(".foo", test))
    print(DQ("foo.bar", test))
    print(DQ("baz", test))
    print(DQ("baz.[0]", test))
    print(DQ("baz.[1].x", test))
    print(DQ("baz.[2].x", test))
for your case (appends None when one of the keys is not found):
printlist.append(DQ("data.someotherkey.yetanotherkey", text))
There's a dictionary method that does exactly this, and it lets you specify any default value. Note that the default must be passed positionally, because dict.get() does not accept keyword arguments.
value = text.get('data', '')
printlist.append(value)
It checks whether the key exists in the dictionary and, if not, returns the default value.
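Applied to the CSV use case, the many try/except blocks can collapse into a single comprehension over the column names. The column names and sample data below are made up for illustration.

```python
# Hypothetical record and column list; 'missing_column' is absent on purpose.
text = {'title': 'Example', 'data': 'payload'}
columns = ['title', 'data', 'missing_column']

# dict.get(key, default) returns the default instead of raising KeyError.
printlist = [text.get(column, '') for column in columns]
print(printlist)  # ['Example', 'payload', '']
```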
Try this simple wrapper:
def execute_with_exception_handling(f):
    try:
        return f()
    except:
        raise  # re-raises as is; replace with your own handling
Then you execute your function:
def my_func():
    return 0 / 0
execute_with_exception_handling(my_func)
You can also pass arguments to the function with *args. And you can even use decorators... which you can google, because I can't recall off the top of my head how that works.
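For the *args and decorator ideas mentioned above, one possible shape is sketched here. The names with_default and lookup are hypothetical, and the set of caught exceptions is an assumption chosen for dict and list lookups.

```python
import functools


def with_default(default=''):
    """Decorator factory: return `default` when the wrapped call fails a lookup."""
    def decorator(func):
        @functools.wraps(func)  # keep the wrapped function's name and docstring
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except (KeyError, IndexError, TypeError):
                return default
        return wrapper
    return decorator


@with_default('')
def lookup(record, key):
    return record[key]


printlist = [lookup({'data': 'x'}, 'data'), lookup({}, 'data')]
print(printlist)
```

Unlike the bare except in the question, listing the exception types means a typo or an unrelated bug still surfaces as an error.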
Why do you need to pass something that does not exist?
You can simply call the function only if the value being passed in is not None (or any other unwanted value).
tryexcept( x ) if x is not None else None
Edit:
Are you trying to see whether the variable is declared or not? If so, one way to get around this would be to declare the variable beforehand:
x = None
...
tryexcept( x ) if x is not None else None

Python - Continuing after exception at the point of exception

I'm trying to extract data from an xml file. A sample of my code is as follows:
from xml.dom import minidom
dom = minidom.parse("algorithms.xml")
...
parameter = dom.getElementsByTagName("Parameters")[0]
# loop over parameters
try:
    while True:
        parameter_id = parameter.getElementsByTagName("Parameter")[m].getAttribute("Id")
        parameter_name = parameter.getElementsByTagName("Name")[m].lastChild.data
        ...
        parameter_default = parameter.getElementsByTagName("Default")[m].lastChild.data
        print(parameter_id)
        print(parameter_default)
        m = m+1
except IndexError:
    # reached end of available parameters
    pass
# except AttributeError:
#     parameter doesn't exist
#     ?
If all elements for each parameter exist, the code runs correctly. Unfortunately the data I am supplied often has missing entries in it, raising an AttributeError exception. If I simply pass on that error, then any elements that do exist but are retrieved later in the loop than when the exception occurred are skipped, which I don't want. I need some way to continue where the code left off and skip to the next line of code if this specific exception is raised.
The only way to work around this that I can think of would be to override the minidom's class methods and catch the exception there, but that seems far too messy and too much work to handle what should be a very simple and common problem. Is there some easier way to handle this that I am missing?
Instead of "an individual try-except block for every statement", why not abstract out that part?
def getParam(p, tagName, index, post=None):
    # Default post-processing is the identity; the parentheses are required around the lambda.
    post = post or (lambda i: i)
    try:
        return post(p.getElementsByTagName(tagName)[index])
    except AttributeError:
        print("informative message")
        return None  # will happen anyway, but why not be explicit?
then in the loop you could have things like:
parameter_id = getParam(parameter, "Parameter", m, lambda x: x.getAttribute("Id"))
parameter_name = getParam(parameter, "Name", m, lambda x: x.lastChild.data)
...
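A self-contained sketch of this pattern, using a small inline document instead of algorithms.xml, is shown below. The XML content and the snake_case name get_param are made up for illustration; the empty Default element reproduces the missing-entry case, where lastChild is None and accessing .data raises AttributeError.

```python
from xml.dom import minidom

# Made-up sample data: the second parameter has an empty <Default>.
XML = """<Parameters>
  <Parameter Id="p1"><Name>alpha</Name><Default>1</Default></Parameter>
  <Parameter Id="p2"><Name>beta</Name><Default></Default></Parameter>
</Parameters>"""


def get_param(parent, tag_name, index, post=None):
    """Return post(node) for the index-th tag, or None if the value is missing."""
    post = post or (lambda node: node)
    try:
        return post(parent.getElementsByTagName(tag_name)[index])
    except AttributeError:
        return None  # IndexError is deliberately not caught here


parameter = minidom.parseString(XML).getElementsByTagName("Parameters")[0]
rows = []
m = 0
while True:
    try:
        param_id = get_param(parameter, "Parameter", m, lambda x: x.getAttribute("Id"))
    except IndexError:
        break  # reached the end of the available parameters
    default = get_param(parameter, "Default", m, lambda x: x.lastChild.data)
    rows.append((param_id, default))
    m += 1

print(rows)
```

Because each lookup handles its own AttributeError, a missing value yields None for that field only and the remaining fields of the same parameter are still read.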
I think there are two parts to your question. First, you want the loop to continue after the first AttributeError. This you do by moving the try and except into the loop.
Something like this:
try:
    while True:
        try:
            parameter_id = parameter.getElementsByTagName("Parameter")[m].getAttribute("Id")
            parameter_name = parameter.getElementsByTagName("Name")[m].lastChild.data
            ...
            parameter_default = parameter.getElementsByTagName("Default")[m].lastChild.data
            print(parameter_id)
            print(parameter_default)
            m = m+1
        except AttributeError:
            print("parameter doesn't exist")
            m = m+1  # advance anyway, otherwise the same entry fails forever
except IndexError:
    # reached end of available parameters
    pass
The second part is more tricky. But it is nicely solved by the other answer.
