How can I replace multiple try/except blocks with less code? - python

I find myself writing code like the below quite a bit. It's very verbose. What I'd like to do is assign array indices to different variables, and if there's an IndexError, assign False. I feel like there should be a shorter syntax for doing this (compared to what I have below).
Edit - here's my actual code. page is a valid lxml.html object. Each of the selectors may or may not return a value, depending on whether that section is present on the page.
def extract_data(page):
    # given an lxml.html obj, extract data from g+ page and return as dict
    try:
        profile_name = page.xpath('//div[@guidedhelpid="profile_name"]/text()')[0]
    except IndexError:
        profile_name = False
    try:
        website = page.cssselect('span.K9a')[0].text_content().rstrip('/')
    except IndexError:
        website = False
    try:
        contact_div = html.tostring(page.xpath('//div[normalize-space(text())="Contact Information"]/../../..')[0])
    except IndexError:
        contact_div = False
    return {
        'profile_name': profile_name,
        'website': website,
        'contact_div': contact_div,
    }

Assuming what you're trying to do makes sense within the context of your use case, you can encapsulate this notion of a default value inside a function:
def retrieve(my_list, index, default_value=False):
    try:
        return my_list[index]
    except IndexError:
        return default_value
That way you can do something like:
my_list = [2, 4]
first = retrieve(my_list, 0)
# first will be 2
second = retrieve(my_list, 1)
# second will be 4
third = retrieve(my_list, 2)
# third will be False
You can even change the value you'd like to default to in case the index does not exist.
In general, when you find yourself repeating code in the manner above, the first thing to consider is whether you can write a function that does what you're trying to do.
Using your actual code, you could do something like:
profile_name = retrieve(page.xpath( '//div[#guidedhelpid="profile_name"]/text()'), 0)
website = retrieve(page.cssselect( 'span.K9a' ), 0)
if website:
website = website.text_content().rstrip('/')
contact_div = retrieve(page.xpath( '//div[normalize-space(text())="Contact Information"]/../../..' ), 0)
if contact_div:
contact_div = html.tostring(contact_div)

l = [2, 4]  # example list, shorter than the list of target names
vars = ['first', 'second', 'third']
r = {}
for i, var in enumerate(vars):
    try:
        r[var] = l[i]
    except IndexError:
        r[var] = False
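The same idea fits in a single dict comprehension, sidestepping try/except entirely (a sketch, assuming an example list `l` that is shorter than the list of names):

```python
l = [2, 4]
names = ['first', 'second', 'third']
# bounds-check instead of catching IndexError
r = {name: (l[i] if i < len(l) else False) for i, name in enumerate(names)}
print(r)  # {'first': 2, 'second': 4, 'third': False}
```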

This should solve your problem :) exec + looping to the rescue!
l = [0, 2]
numberWords = {0: "first", 1: "second", 2: "third"}
for i in range(len(numberWords)):  # iterate over the names, not the list, so the except can actually fire
    try:
        exec(numberWords[i] + "=l[" + str(i) + "]")
    except IndexError:
        exec(numberWords[i] + "=False")
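If you'd rather avoid exec (it is hard to debug and easy to misuse), `itertools.zip_longest` can pad the missing values with a default instead (a hedged sketch of an alternative, not the original answer's approach):

```python
from itertools import zip_longest

l = [0, 2]
names = ["first", "second", "third"]
# zip_longest pads the shorter iterable with fillvalue
values = dict(zip_longest(names, l, fillvalue=False))
print(values)  # {'first': 0, 'second': 2, 'third': False}
```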

Related

Python or expression on exception

I have this code:
try:
    info_model = Doc2Vec.load('models/info_model')
    salary_model = Doc2Vec.load('models/salary_model')
    education_model = Doc2Vec.load('models/education_model')
    experience_model = Doc2Vec.load('models/experience_model')
    skills_model = Doc2Vec.load('models/skills_model')
except:
    info_model = lrn.info_model()
    salary_model = lrn.salary_model()
    education_model = lrn.education_model()
    experience_model = lrn.experience_model()
    skills_model = lrn.skills_model()
Basically, it checks whether the file exists and creates it if not. But for this to work correctly I would like to check each of these variables one by one, which would mean a separate try/except for each.
I came up with something like this:
experience_model = Doc2Vec.load('models/experience_model') or lrn.experience_model()
But this line still gives me a FileNotFound exception. Is there a workaround, or should I write a try/except statement for each variable?
You could define a helper like this:
def load_or_default(filename, default):
    try:
        return Doc2Vec.load(filename)
    except FileNotFoundError:
        return default()

info_model = load_or_default('models/info_model', lrn.info_model)
salary_model = load_or_default('models/salary_model', lrn.salary_model)
education_model = load_or_default('models/education_model', lrn.education_model)
experience_model = load_or_default('models/experience_model', lrn.experience_model)
skills_model = load_or_default('models/skills_model', lrn.skills_model)
It's worth noting that the default factory is only called inside the function, and only when loading fails.
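The same pattern generalizes beyond Doc2Vec: pass the loader in as well and the helper has no library dependency at all (a sketch with a stand-in JSON loader; the file name is made up for illustration):

```python
import json

def load_or_default(load, filename, default):
    # try the loader; fall back to the default factory on a missing file
    try:
        return load(filename)
    except FileNotFoundError:
        return default()

def load_json(path):
    with open(path) as f:
        return json.load(f)

config = load_or_default(load_json, "no_such_file_12345.json", dict)
print(config)  # {}
```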

Passing over errors in loop for my web-scraper

I currently have a loop running for my web-scraper. If it encounters an error (i.e can't load the page) I have it set to ignore it and continue with the loop.
for i in links:
    try:
        driver.get(i)
        d = driver.find_elements_by_xpath('//p[@class="list-details__item__date"]')
        s = driver.find_elements_by_xpath('//p[@class="list-details__item__score"]')
        m = driver.find_elements_by_xpath('//span[@class="list-breadcrumb__item__in"]')
        o = driver.find_elements_by_xpath('//tr[@data-bid]')
        l = len(o)
        lm = len(m)
        for i in range(l):
            a = o[i].text
        for i in range(lm):
            b = m[i].text
            c = s[i].text
            e = d[i].text
            odds.append((a, b, c, e))
    except:
        pass
However, I now wish for there to be a note of some kind when an error was encountered so that I can see which pages didn't load. Even if they are just left blank in the output table, that would be fine.
Thanks for any help.
You can add a catch for the exception and then do something with that catch. This should be suitable for your script.
import ...  # this is where your initial imports are
import io
import traceback
for i in links:
    try:
        driver.get(i)
        d = driver.find_elements_by_xpath('//p[@class="list-details__item__date"]')
        s = driver.find_elements_by_xpath('//p[@class="list-details__item__score"]')
        m = driver.find_elements_by_xpath('//span[@class="list-breadcrumb__item__in"]')
        o = driver.find_elements_by_xpath('//tr[@data-bid]')
        l = len(o)
        lm = len(m)
        for i in range(l):
            a = o[i].text
        for i in range(lm):
            b = m[i].text
            c = s[i].text
            e = d[i].text
            odds.append((a, b, c, e))
    except Exception as error_script:
        print(traceback.format_exc())
        odds.append('Error: could not add')
Essentially, you catch the exception using the `except Exception as error_script:` line. Afterwards, you can print the actual error message to the console using the `traceback.format_exc()` command.
But most importantly, you can append a string to the list inside the exception handler; the loop then simply carries on to the next iteration.
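If you also want a standalone record of which pages failed, collecting the offending links in a second list is a common pattern (a minimal runnable sketch with a stand-in `process` function in place of the Selenium calls):

```python
def process(link):
    # stand-in for the real scraping work
    if link == "bad":
        raise ValueError("could not load page")
    return link.upper()

results, failed = [], []
for link in ["good", "bad", "also-good"]:
    try:
        results.append(process(link))
    except Exception:
        failed.append(link)  # remember which page did not load

print(results)  # ['GOOD', 'ALSO-GOOD']
print(failed)   # ['bad']
```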

Writing scraped headers from webpages to pandas frame

Wrote this code to download the h1, h2 and h3 headers from a list of URLs and write them to a pandas frame, but it raises an unpacking error: expected 3 values.
def url_corrector(url):
    if not str(url).startswith('http'):
        return "https://" + str(url)
    else:
        return str(url)

def header_agg(url):
    h1_list = []
    h2_list = []
    h3_list = []
    p = requests.get(url_corrector(url), proxies=proxy_data, verify=False)
    soup = BeautifulSoup(p.text, 'lxml')
    for tag in soup.find_all('h1'):
        h1_list.append(tag.text)
    for tag in soup.find_all('h2'):
        h2_list.append(tag.text)
    for tag in soup.find_all('h3'):
        h3_list.append(tag.text)
    return h1_list, h2_list, h3_list

headers_frame = url_list.copy()
headers_frame['H1'], headers_frame['H2'], headers_frame['H3'] = headers_frame.url.map(lambda x: header_agg(x))
Any help on how to do it?
Getting this error:
ValueError: too many values to unpack (expected 3)
Let's assume that url_list is a dict with the following structure:
url_list = {'url': [<url1>, <url2>, <url3>, <url4>, ..., <urln>]}
the call to headers_frame.url.map(lambda x: header_agg(x)) will return a list with n elements in the form:
[<url1(h1_list, h2_list, h3_list)>, <url2(h1_list, h2_list, h3_list)>, ..., <urln(h1_list, h2_list, h3_list)>]
For the code to produce the output you require, you may have to re-write the last statement as a loop
headers_frame.update({'H1': [], 'H2': [], 'H3': []})
for url in headers_frame.url:
    headers = header_agg(url)
    headers_frame['H1'].extend(headers[0])
    headers_frame['H2'].extend(headers[1])
    headers_frame['H3'].extend(headers[2])
You have to return one entity. Just change:
return [h1_list, h2_list, h3_list]
Did this to work around this issue. However, still unsure why the original isn't working.
headers_frame = url_list.copy()
H1 = []
H2 = []
H3 = []
for url in headers_frame.url:
    k = header_agg(url)
    H1.append(k[0])
    H2.append(k[1])
    H3.append(k[2])
pd.DataFrame(np.column_stack([headers_frame.url, H1, H2, H3]))
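For reference, the original one-liner fails because `map` yields one 3-tuple per URL, so the assignment tries to unpack n rows into 3 names; transposing with `zip(*...)` fixes that. A plain-Python sketch of the mechanics (with a stand-in `header_agg`):

```python
def header_agg(url):
    # stand-in returning (h1_list, h2_list, h3_list) for one url
    return ([url + "-h1"], [url + "-h2"], [url + "-h3"])

rows = [header_agg(u) for u in ["a", "b"]]  # n tuples of 3 lists
h1, h2, h3 = zip(*rows)                     # transpose into 3 tuples of n lists
print(h1)  # (['a-h1'], ['b-h1'])
```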

When using Q objects in django queryset for OR condition error occured

I want to use filter with an OR condition in Django, using Q objects.
My code is this:
def product(request):
    try:
        proTitle = request.GET.get('title')
        ProDescription = request.GET.get('description')
    except:
        pass
    list0 = []
    result = Product.objects.filter(Q(title__contains=proTitle) | Q(description__contains=ProDescription))
    for res in result:
        list0.append(res.project_id)
    data = {'title result': list0}
    return HttpResponse(json.dumps(data))
    return HttpResponse(json.dumps(data), content_type='application/json')
When I pass all the values (proTitle, ProDescription), it works fine.
If any value is None, I get the error `Cannot use None as a query value`.
Why does this error occur when I use the OR operator in my queryset?
I also tried this:
result = Project.objects.filter(title__contains=proTitle) | Project.objects.filter(description__contains=ProDescription)
but the same error occurred.
I am unable to understand what the actual problem is.
You may set the defaults of the get method to something other than None, say empty string:
proTitle = request.GET.get('title', '')
ProDescription = request.GET.get('description', '')
funAria = request.GET.get('funAria', '')
femaleReq = request.GET.get('femaleReq', '')
This is however likely to return all results in the DB when using __contains.
Otherwise you may build the Q functions discarding all None values.
Use this as a guideline: How to dynamically compose an OR query filter in Django?
You can build the Q object in a loop, discarding all values that are None. Note that your except will never be reached, because get simply returns None when the GET parameter is absent; you can therefore replace the try with an if.
The following code assumes that you simply want the unset values to be ignored in the filter.
(This is untested--let me know if you have any issues.)
attributes = (('title', 'title__contains'),
              ('description', 'description__contains'),
              ('funAria', 'functional_area__contains'),
              ('femaleReq', 'female_candidate'))

query_filter = Q()  # initialize
for get_param, kw_arg in attributes:
    get_value = request.GET.get(get_param)
    if get_value:
        query_filter |= Q(**{kw_arg: get_value})

result = Product.objects.filter(query_filter)
Your error seems to mean that one of your values is None:
def product(request):
    try:
        proTitle = request.GET.get('title')
        ProDescription = request.GET.get('description')
        funAria = request.GET.get('funAria')
        femaleReq = request.GET.get('femaleReq')
        someIntValue = request.GET.get('someIntValue')
    except:
        pass
    allQs = Q()
    if proTitle is not None:
        allQs |= Q(title__contains=proTitle)
    if ProDescription is not None:
        allQs |= Q(description__contains=ProDescription)
    if funAria is not None:
        allQs |= Q(functional_area__contains=funAria)
    if someIntValue is not None:
        allQs |= Q(some_int_value__gte=someIntValue)  # no need to convert someIntValue to an int here as long as you are guaranteed it is numeric
    allQs |= Q(female_candidate=femaleReq)
    list0 = []
    result = Product.objects.filter(allQs)
    for res in result:
        list0.append(res.project_id)
    data = {'title result': list0}
    return HttpResponse(json.dumps(data))
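The accumulate-only-if-set idea is not Django-specific; the same thing can be sketched with a plain dict of keyword filters (illustrative parameter names only):

```python
params = {'title': 'foo', 'description': None, 'funAria': None}

# keep only the parameters the caller actually supplied
filters = {key: value for key, value in params.items() if value is not None}
print(filters)  # {'title': 'foo'}
```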

Python Maya - If objectType returns "No object name specified"

I am trying to get maya to check if the listed object is a blendshape node or not.
This is my code:
def bake(self, *args):
    self.items["selection"] = cmds.ls(sl=True)
    self.items["shapes"] = cmds.listRelatives(self.items["selection"], ad=True)
    shapes = ()
    for i in self.items["shapes"]:
        bs = cmds.listConnections(i, type="blendShape", exactType=True)
        if cmds.objectType(bs, isType="blendShape"):
            print bs
It returns: `# Error: RuntimeError: file X:/Documents/maya/scripts\jtBakeCharacter.py line 16: No object name specified`
Line 16 is: if cmds.objectType(bs, isType = "blendShape"):
Except that I AM specifying an object name, and that object name is bs. I have printed the result of bs and it has many objects listed. Many.
The code is redundant; you don't need most of the lines. The listConnections call already ensures that you only have blendshapes. The exact problem is that, for some of those extra shapes, you are calling something like:
cmds.objectType([])
which is illegal. Your code can mostly be condensed as follows:
selected = cmds.ls(sl=True, dag=True, shapes=True)
blends = cmds.listConnections(selected, type="blendShape", exactType=True)
for item in blends:
    print item
This may not capture your intent perfectly, but it shows how many extra steps you are taking. In practice you don't need the line if cmds.objectType(bs, isType = "blendShape"): at all.
Joojaa's answer is elegant, but you can get it down even shorter by using the default selection behavior:
blendshapes = cmds.ls(cmds.listHistory(pdo=True), type='blendShape') or []
for item in blendshapes:
    print item
(In the quest to make it even shorter I'm not checking for the selection, so this one fails if nothing is selected).
PS: if you need to get to the blendshape from one of the upstream shapes, instead of the deformed shape, you can use listHistory (f=True)
You could try this:
from pymel.core import *

for obj in selected():
    shapeNode = obj.getChildren()[0]
    for output in shapeNode.outputs():
        if nodeType(output) == "blendShape":
            print obj, "is a blendshape"
