More than one for-loop in a Python while-loop

I'm a Python/coding newbie and I'm trying to put two for loops inside a while loop. Can I do this? How can I print out the dictionary mydict to make sure I am doing this correctly?
I'm stuck.
40 minutes later. Not stuck anymore. Thanks everyone!
def runloop():
    while uid<uidend:
        for row in soup.findAll('h1'):
            try:
                name = row.findAll(text = True)
                name = ''.join(name)
                name = name.encode('ascii','ignore')
                name = name.strip()
                mydict['Name'] = name
            except Exception:
                continue
        for row in soup.findAll('div', {'class':'profile-row clearfix'}):
            try:
                field = row.find('div', {'class':'profile-row-header'}).findAll$
                field = ''.join(field)
                field = field.encode('ascii','ignore')
                field = field.strip()
            except Exception:
                continue
            try:
                value = row.find('div', {'class':'profile-information'}).findAl$
                value = ''.join(value)
                value = value.encode('ascii','ignore')
                value = value.strip()
                return mydict
                mydict[field] = value
                print mydict
            except Exception:
                continue
        uid = uid + 1
runloop()

On nested loops:
You can nest for and while loops very deeply before Python will give you an error, but it's usually bad form to go more than 4 deep. Make another function if you find yourself needing to do a lot of nesting. Your use is fine though.
Some problems with the code:
It will never reach the print statement because there is a return statement just before it, inside the for loop. When Python reaches a return inside a function, it leaves the function immediately and hands back the return value.
I would avoid using try and except until you understand why you're getting the errors that you get without those.
Make sure the indentation is consistent. Maybe it's a copy and paste error, but it looks like the indentation of some lines is a character more than others. Use 4 spaces per level and don't mix tabs with spaces. Python, unlike most languages, will freak out if the indentation is off.
Not sure if you just didn't post the function call, but you would need to call runloop() to actually use the function.
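To illustrate the point about return, here is a minimal sketch in Python 3 syntax. The function names and sample data are made up for illustration and are not from the question; the sketch only shows that a return inside a loop ends the function on the first pass.

```python
def collect_early(pairs):
    """return sits inside the loop: the function exits on the first pass."""
    result = {}
    for key, value in pairs:
        return result          # leaves the function immediately
        result[key] = value    # never reached
    return result


def collect_all(pairs):
    """return sits after the loop: every pair is stored first."""
    result = {}
    for key, value in pairs:
        result[key] = value
    return result


pairs = [("Name", "Ann"), ("City", "Oslo")]
print(collect_early(pairs))  # prints an empty dict
print(collect_all(pairs))    # prints both entries
```

Moving the return to after the loops (and the print before it) is the likely fix for the posted code.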

You can put as many loops within other loops as you'd like. These are called nested loops.
Also, printing a dictionary is simple:
mydict = {}
print mydict

You are not helping yourself by having these all over the place
except Exception:
    continue
That basically says, "if anything goes wrong, carry on and don't tell me about it."
Something like this lets you at least see the exception
except Exception as e:
    print e
    continue
Is mydict declared somewhere? That could be your problem
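As a rough sketch of the "at least see the exception" advice, written in Python 3 syntax: the function parse_rows and its sample input are hypothetical, and int() stands in for whatever parsing step might fail. It catches only the error it expects and keeps a record of each failure instead of hiding it.

```python
def parse_rows(rows):
    """Convert each row to int, reporting (not hiding) the rows that fail."""
    parsed = []
    errors = []
    for row in rows:
        try:
            parsed.append(int(row))
        except ValueError as e:
            # Catch only the expected error and keep a record of it.
            errors.append((row, str(e)))
            print("skipping {!r}: {}".format(row, e))
            continue
    return parsed, errors


parsed, errors = parse_rows(["1", "x", "3"])
print(parsed)  # [1, 3]
```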

Related

How to do elegant error handling in Python

I'm scraping LinkedIn using Selenium. This is a very brittle task and exceptions are raised often. I want to find an elegant way to handle errors. The internet has the usual try/except, but it's clunky. See the code below:
try:
    WebDriverWait(driver, 20).until(EC.element_to_be_clickable(job))
    job_title = job.find_element(By.CLASS_NAME, "base-search-card__title").text
    company = job.find_element(By.CLASS_NAME, "base-search-card__subtitle").text
    location = job.find_element(By.CLASS_NAME, "job-search-card__location").text
except:
    print("Boom Boom")
If any of the find_element calls throws, the except part runs and the rest of the code in the try block doesn't execute. I'd like a scenario where one failure doesn't abort the others, i.e. if a lookup fails I get an empty string instead. I can wrap everything in a function and do something like this:
def extract_job_title(job):
    try:
        return job.find_element(By.CLASS_NAME, "base-search-card__title").text
    except:
        return ""
and have:
job_title = extract_job_title(job)
but that is also clunky... I want something like I would have in Swift. Something like this:
let job_title = try? job.find_element(By.CLASS_NAME, "base-search-card__title").text ?? ""
Does something similar to Swift's try? exist in Python, and if not, can anyone see a way of making things "nicer" other than using functions?
Generally, if you have a lot of very repetitive code with only small differences, you can probably extract that into a function or loop. E.g.:
searches = {'title': 'base-search-card__title', ...}
data = {}
for item, cls in searches.items():
    try:
        data[item] = job.find_element(By.CLASS_NAME, cls).text
    except:
        pass
This would also extract data into one dict, which seems more logical than a bunch of separate variables.
Alternatively, extracted into a function:
def get(job, cls):
    try:
        return job.find_element(By.CLASS_NAME, cls).text
    except:
        return None

job_title = get(job, 'base-search-card__title')
Note that you should intercept a specific exception in except, not any and all exceptions.
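One way to combine the helper-function idea with catching only a specific exception is a small generic wrapper, which reads a little like Swift's `try? ... ?? ""`. This is only a sketch: attempt, FakeJob and MissingElement are hypothetical stand-ins so the example runs without a browser. With Selenium you would presumably pass a lambda around job.find_element(By.CLASS_NAME, cls).text and list selenium.common.exceptions.NoSuchElementException as the exception to catch.

```python
class MissingElement(Exception):
    """Stand-in for the lookup error a real driver would raise."""


class FakeJob:
    """Hypothetical element that only knows some class names."""

    def __init__(self, fields):
        self._fields = fields

    def text_of(self, cls):
        if cls not in self._fields:
            raise MissingElement(cls)
        return self._fields[cls]


def attempt(func, default="", exceptions=(MissingElement,)):
    """Call func(); return default if one of the listed exceptions is raised."""
    try:
        return func()
    except exceptions:
        return default


job = FakeJob({"base-search-card__title": "Engineer"})
job_title = attempt(lambda: job.text_of("base-search-card__title"))
company = attempt(lambda: job.text_of("base-search-card__subtitle"))
print(repr(job_title), repr(company))  # 'Engineer' ''
```

Because only the listed exception types are caught, unrelated errors (a typo, a lost connection) still surface instead of silently turning into empty strings.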
Alternatively, turn the entire thing on its head and only evaluate the elements that actually exist, and sort them into the right bins. Something along the lines of:
classes = {'base-search-card__title': 'title', ...}
data = {}
for elem in job.some_broader_find_query():
    data[classes[elem.class_name]] = elem.text

In case of an exception during a loop: How to return the intermediate result before passing on the exception?

def store(self) -> list:
    result = []
    for url in self.urls():
        if url.should_store():
            stored_url = self.func_that_can_throw_errors(url)
            if stored_url: result.append(stored_url)
    return result
Preface: not actual method names. Silly names chosen to emphasize
During the loop errors may occur. In that case, I desire the intermediate result to be returned by store() and still raise the original exception for handling in a different place.
Doing something like
try:
    <accumulating results ... might break>
except Exception:
    return result
    raise
sadly doesn't do the trick, since the raise statement trivially won't be reached (and thus an empty list gets returned).
Do you guys have recommendations on how not to lose the intermediate result?
Thanks a lot in advance - Cheers!
It is not possible as you imagine it. You can't raise an exception and return a value.
So I think what you are asking for is a work around. There, I see two possibilities:
Return a flag or the exception along with the actual return value.
Return flag:
except Exception:
    return result, False
where False is the flag telling the caller that something went wrong.
Return exception:
except Exception as e:
    return result, e
Since it appears that store is a method of some class, you could instead raise the exception and retrieve the intermediary result with a second call, like so:
def store(self):
    result = []
    try:
        # something that fills result and may raise
        pass
    except Exception:
        self.intermediary_result = result
        raise
    return result

def retrieve_intermediary(self):
    return self.intermediary_result
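A runnable sketch of this second possibility, seen from the caller's side. The class Store, its process method and the sample data are hypothetical; process stands in for the call that can raise.

```python
class Store:
    """Hypothetical class; process() stands in for the call that may fail."""

    def __init__(self, items):
        self.items = items
        self.intermediary_result = []

    def process(self, item):
        if item < 0:
            raise ValueError("negative item: {}".format(item))
        return item * 2

    def store(self):
        result = []
        try:
            for item in self.items:
                result.append(self.process(item))
        except Exception:
            # Keep what was accumulated so far, then let the error propagate.
            self.intermediary_result = result
            raise
        return result


s = Store([1, 2, -1, 4])
try:
    s.store()
except ValueError as e:
    print("failed:", e)
    print("partial:", s.intermediary_result)  # partial: [2, 4]
```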
The best answer I can come up with given my limited knowledge of Python would be to always return a pair, where the first part of the pair is the result and the second part of the pair is an optional exception value.
def store(self) -> tuple:
    '''
    TODO: Insert documentation here.
    If an error occurs during the operation, a partial list of results along with
    the exception value will be returned.
    :return: A tuple of (list of results, exception). The exception part may be None.
    '''
    result = []
    for url in self.urls():
        if url.should_store():
            try:
                stored_url = self.func_that_can_throw_errors(url)
            except Exception as e:
                return result, e
            if stored_url: result.append(stored_url)
    return result, None
That said, as you have mentioned, if you have this call in multiple places in your code, you would have to be careful to change it in all relevant places, and possibly change the handling as well. Type checking might be helpful there, though I only have very limited knowledge of Python's type hints.
Meanwhile I had the idea to just use an accumulator which appears to be the 'quickest' fix for now with the least amount of changes in the project where store() is called.
The (intermediate) result is not needed everywhere (let's say it's optional). So...
I'd like to share that with you:
def store(self, result_accu=None) -> list:
    if result_accu is None:
        result_accu = []
    for url in self.urls():
        if url.should_store():
            stored_url = self.func(url)
            if stored_url: result_accu.append(stored_url)
    return result_accu
It still returns a list, but the intermediate result is also accessible through the accumulator list that the caller passed in.
Making the parameter optional lets most calls in the project stay as they are, since the result is not needed everywhere.
store() is more of a command that already does most of the data-integrity work internally. The result is nice-to-have for now.
But you guys also made me notice that there's work to do in order to process the intermediate result anyway. Thanks! #attalos #MelvinWM
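For completeness, a small sketch of how a caller might use the accumulator version. It is written as a free function with made-up data rather than as the real method, and url.upper() is just a placeholder for the step that can fail.

```python
def store(urls, result_accu=None):
    """Append each processed URL to result_accu; errors propagate unchanged."""
    if result_accu is None:
        result_accu = []
    for url in urls:
        if not url:
            raise ValueError("empty URL")
        result_accu.append(url.upper())
    return result_accu


partial = []
try:
    store(["a", "b", "", "d"], partial)
except ValueError as e:
    print("failed:", e)

print(partial)  # ['A', 'B']
```

The exception reaches the caller untouched, and the list the caller owns still holds everything that was stored before the failure.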

python switch code always matches even when it should not do so

I am trying to replace a largish if-elif block in Python with something a bit more like a Java switch. My understanding is that this should be a bit faster, and we are parsing lots of data, so if I can get a speed improvement I will take it. However, the code always acts as if the key is 'deposits', even for entries that are not. The "func =" line is there to validate that I am getting things correctly. I will probably not return a result to func, since my goal is to fill a list with results.
What am I doing wrong that the switcher.get always finds a match even when one does not exist?
def parsePollFile(thisFile):
    line = ''
    switcher = {
        'deposits': deposits(line)
    }
    try:
        reader = csv.reader(open(thisFile, 'r'))
        for line in reader:
            try:
                if line[2] == "D":
                    func = switcher.get(line[0], lambda: 'invalid key')
                    print('key: {} -- {}\n'.format(line[0],func))
            except IndexError:
                continue
    except Exception as e:
        print("exception {}\n".format(e))
Your problem is that the code
switcher = {'deposits': deposits(line)}
doesn't create a key in your dictionary whose value is the deposits function object. 'deposits': deposits(line) actually runs the deposits function once, when the dictionary is built, and stores the return value as the value of the 'deposits' key. You need to store a function object in the dictionary.
Since your function takes arguments, this is a bit tricky. There are several ways around this problem, but perhaps the simplest is to wrap your function call in another function
switcher = {'deposits': lambda: deposits(line)}
You would then use the dictionary like so
func = switcher.get(line[0], lambda: 'invalid key')
func()
My solution resulted from both the comment and the answer. I changed 'deposits': deposits(line) to 'deposits': deposits, and then I run func(line) when a match is found.
Thank you to the answers. I did not realize I was actually running the method instead of defining the value.
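A minimal sketch of the approach described in that follow-up: store the function object itself and call it with the line once a match is found. The bodies of deposits and invalid_key and the helper handle are hypothetical, since the real handler was not shown. Note that the fallback takes the same argument as the real handlers, so func(line) works either way.

```python
def deposits(line):
    """Hypothetical handler; the real one was not shown in the question."""
    return "deposit of {}".format(line[1])


def invalid_key(line):
    """Fallback with the same signature as the real handlers."""
    return "invalid key: {}".format(line[0])


# Values are function objects (no parentheses), so nothing runs here.
switcher = {"deposits": deposits}


def handle(line):
    func = switcher.get(line[0], invalid_key)
    return func(line)


print(handle(["deposits", "100", "D"]))     # deposit of 100
print(handle(["withdrawals", "50", "D"]))   # invalid key: withdrawals
```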

Raise error after unreached return statement in Python?

Relevant code:
def return_t_done(queryList):
    for item in queryList:
        if item[0] == "t_done":
            return int(item[1])
    raise Error("Attempted to get t_done's value, yet no t_done exists")
So basically this code runs through a nested list in the form of
[[spam,eggs],[ham,yum]]
and looks for a field ([[field,value],[field,value]]) of 't_done' and, when it finds it, returns the value associated with that field. There should always be a t_done field, but in case there isn't (this is parsing automatically generated log files using an app I didn't write and can't access the source code for, so who knows what could go wrong) I would like to elegantly raise an exception in the most appropriate way possible.
That said, this seems like the most elegant way to me, but I'm not particularly versed in Python, and my research into the docs confused me a bit. So, should I wrap the for loop in a try, except clause? Should I write a class for this error? Should I raise an exception instead of an error? Is it okay as written?
Thanks!
The way you have it is fine. But one other possibility is to use the little-known else clause of the for loop. It runs if and only if the loop finishes without being cut short by break (or, as here, by return).
for item in queryList:
    if item[0] == "t_done":
        return int(item[1])
else:
    raise Error(...)
Note the indentation: the else lines up with the for, not the if.
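Here is a runnable sketch of the for/else version. Python has no built-in named Error, so this uses the built-in LookupError as a stand-in; a small custom exception class would work just as well. The sample lists are made up.

```python
def return_t_done(query_list):
    """Return the integer value of the 't_done' field, or raise LookupError."""
    for item in query_list:
        if item[0] == "t_done":
            return int(item[1])
    else:
        # Runs only when the loop finished without return or break.
        raise LookupError("no t_done field found")


print(return_t_done([["spam", "eggs"], ["t_done", "42"]]))  # 42

try:
    return_t_done([["spam", "eggs"], ["ham", "yum"]])
except LookupError as e:
    print("error:", e)
```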

Python: variable "tricking" try-exception, but works for if statement

I know the title seems crazy, but it is true. Here is my predicament. First, I am still a beginner at Python so please be considerate. I am trying to test if a variable exists. Now the variable comes from a file that is parsed in yaml. Whenever I try this code,
if not name['foo']:
    print "No name given."
    return False
Python does as you would expect and returns false. However, when I try to change it to this code,
try:
    name['foo']
except:
    print "ERROR: No name given."
    raise
the exception is never raised. I have searched and searched and could not find any questions or sites that could explain this to me. My only thought is that the parser is "tricking" the exception handler, but that doesn't really make sense to me.
I have made sure that there were no white spaces in the name field of the document I am parsing. The field has the format:
*name: foo
*ver: bar
Like I have said, I made sure that foo was completely deleted along with any whitespace between lines. If anyone could help, it would be greatly appreciated.
EDIT:
And I apologize for the negative logic in the if statement. The function has to go through a series of checks. The best way I could think of to make sure all the checks were executed was to return true at the very end and to return false if any individual check failed.
A few things:
That shouldn't throw an exception! You're doing name['foo'] in two places and expecting different behavior.
If name were a dictionary without the key 'foo', you would NOT get a return of False in the first example; you'd get an exception. Try doing this:
name = {}
name['foo']
Then you'll get a KeyError exception.
Don't ever use a bare except: block! Always catch the specific exception you're after, like KeyError or IndexError.
I don't understand why you think it would raise an exception. The key evidently exists, otherwise the first snippet would not work. The value for that key is obviously some false-y value like False, None, [] or ''.
"Python does as you would expect and returns false."
The exception (KeyError) will be thrown only if there is no key in a dictionary (assuming name is a dictionary). If you do
if not name['foo']:
and it does not throw an exception, then it means that "foo" is in name but the value evaluates to boolean false (it can be False, None, empty string "", empty list [], empty dictionary {}, some custom object, etc. etc.). Thus wrapping name['foo'] with try:except: is pointless - the key is there.
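The difference between a missing key and a present-but-empty value can be checked directly. In this sketch, check is a hypothetical helper, and the sample dictionary assumes the YAML parser produces None for a field that has a key but no value.

```python
# Key present, value empty (assumed result of parsing a field with no value).
name = {"foo": None}


def check(mapping, key):
    """Classify a lookup as 'missing', 'empty', or 'ok'."""
    try:
        value = mapping[key]
    except KeyError:
        return "missing"
    if not value:
        return "empty"
    return "ok"


print(check(name, "foo"))           # empty
print(check(name, "bar"))           # missing
print(check({"foo": "x"}, "foo"))   # ok
```

Only the "missing" case raises KeyError; the "empty" case is what the if not name['foo'] test in the question was detecting.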
