Strange issue when using Google DFP API Python client

Have a look at this code:

order_service = client.GetService('OrderService', version='v201208')
creative_service = client.GetService('CreativeService', version='v201208')

with open('/tmp/urls.txt', 'w') as f:
    for i in range(0, 3929, 100):
        print 'ORDER BY ID LIMIT 100 OFFSET '+str(i)
        creatives = creative_service.getCreativesByStatement(
            {'query': 'ORDER BY ID LIMIT 100 OFFSET '+str(i)})
        try:
            for creative in creatives[0]['results']:
                try:
                    for var in creative['creativeTemplateVariableValues']:
                        if var['uniqueName'] == 'DetailsPageURL':
                            print var['value']
                            f.write(creative['advertiserId']+','+var['value']+"\n")
                except:
                    pass
        except:
            raise
            pass
On the second iteration, when the offset is 200, the line for creative in creatives[0]['results'] complains with a KeyError on 'results'. But if I change the inner try/except to if creative.has_key('creativeTemplateVariableValues'): as follows, the problem goes away:
order_service = client.GetService('OrderService', version='v201208')
creative_service = client.GetService('CreativeService', version='v201208')

with open('/tmp/urls.txt', 'w') as f:
    for i in range(0, 3929, 100):
        print 'ORDER BY ID LIMIT 100 OFFSET '+str(i)
        creatives = creative_service.getCreativesByStatement(
            {'query': 'ORDER BY ID LIMIT 100 OFFSET '+str(i)})
        try:
            print creatives[0]['results']
        except:
            print creatives
        #creatives = creative_service.getCreativesByStatement({'query': 'ORDER BY ID LIMIT 10 OFFSET 200'})
        try:
            for creative in creatives[0]['results']:
                if creative.has_key('creativeTemplateVariableValues'):
                    for var in creative['creativeTemplateVariableValues']:
                        if var['uniqueName'] == 'DetailsPageURL':
                            print var['value']
                            f.write(creative['advertiserId']+','+var['value']+"\n")
        except:
            raise
            pass
Why???

The field 'creativeTemplateVariableValues' only exists on creatives of type 'TemplateCreative', so if you have other creatives on your network that are not TemplateCreatives, they will not have the field and will raise the KeyError you have seen. You can do the has_key check as you have done, or an alternative is to do a type check:
if creative['Creative_Type'] == 'TemplateCreative':
    for var in creative['creativeTemplateVariableValues']:
        ...
If you only care about TemplateCreatives, I would suggest using a statement filter for that particular creative type. Please see the get_creatives_by_statement example (http://code.google.com/p/google-api-ads-python/source/browse/trunk/examples/adspygoogle/dfp/v201208/get_creatives_by_statement.py)
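As a rough sketch of that approach (assuming creativeType is an allowed PQL filter column for getCreativesByStatement in v201208, as in the linked example; adjust the column name if your version differs), the paging loop above could request only TemplateCreatives:

creatives = creative_service.getCreativesByStatement(
    {'query': "WHERE creativeType = 'TemplateCreative' "
              "ORDER BY ID LIMIT 100 OFFSET " + str(i)})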
For future questions regarding DFP API and the related client libraries, please post to the DFP API forum: https://groups.google.com/forum/#!forum/google-doubleclick-for-publishers-api

Related

Duplicate output with arrays in python

I'm trying to gather the data for 6 stocks from the array, but when the API can't find the data for a symbol I want it to move on to the next one while still selecting just 6 in total.
I tried this code and other variants, but nothing seems to work. The output always duplicates one stock and I don't know why.
portfolio = ['NVDA', 'SPCE', 'IMGN', 'SUMR', 'EXPE', 'PWM.V', 'SVMK', 'DXCM']
tt = 0
irange = 6
for i in range(irange):
    try:
        t = requests.get('https://finnhub.io/api/v1/stock/profile?symbol='+portfolio[tt])
        t = t.json()
    except Exception as e:
        print("Error calling API, waiting 70 seconds and trying again...")
        time.sleep(70)
        t = requests.get('https://finnhub.io/api/v1/stock/profile?symbol='+portfolio[tt])
        t = t.json()
    try:
        coticker = t['ticker']
        coexchange = t['exchange']
        coname = t['name']
        codesc = t['ggroup']
        coipo = t['ipo']
        cosector = t['gsector']
        costate = t['state']
        coweburl = t['weburl']
    except Exception as e:
        print("Information not available")
        irange = irange+1
    print("THE TT IS:"+str(tt))
    tt = tt+1
    print("")
    print(coticker,coexchange,coname,codesc,coipo,cosector,costate,coweburl)
This is the output:
THE TT IS:0
NVDA -- GATHERED DATA
Information not available
THE TT IS:1
NVDA -- GATHERED DATA
THE TT IS:2
IMGN -- GATHERED DATA
THE TT IS:3
SUMR -- GATHERED DATA
THE TT IS:4
EXPE -- GATHERED DATA
Information not available
THE TT IS:5
EXPE -- GATHERED DATA
As you can see, when there is no information available, it doesn't move to the next one, it repeats the same one. What's the mistake? Thanks in advance for your kind help.
Put the line that prints the information inside the try block that sets all the variables. Otherwise, you'll print the variables from the previous stock.
To make it keep going past 6 items when you have failures, don't use range(irange). Loop over the entire list with for symbol in portfolio:, and use a variable to count the successful attempts. Then break out of the loop when you've printed 6 stocks.
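For illustration, here is a rough sketch of those two suggestions on their own (print moved inside the try, loop over the whole portfolio, count successes); it uses the same field names and URL as in the question:

import requests

portfolio = ['NVDA', 'SPCE', 'IMGN', 'SUMR', 'EXPE', 'PWM.V', 'SVMK', 'DXCM']
wanted = 6
successes = 0
for symbol in portfolio:
    t = requests.get('https://finnhub.io/api/v1/stock/profile?symbol=' + symbol).json()
    try:
        # building the row inside the try means a failed lookup never reprints old data
        row = (t['ticker'], t['exchange'], t['name'], t['ggroup'],
               t['ipo'], t['gsector'], t['state'], t['weburl'])
        print(*row)
        successes += 1
    except KeyError:
        print("Information not available for " + symbol)
    if successes >= wanted:
        break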
I've changed the code to use if statements instead of try/except to handle empty responses.
portfolio = ['NVDA', 'SPCE', 'IMGN', 'SUMR', 'EXPE', 'PWM.V', 'SVMK', 'DXCM']
irange = 6
successes = 0
for symbol in portfolio:
    try:
        t = requests.get('https://finnhub.io/api/v1/stock/profile?symbol='+symbol)
        t = t.json()
    except Exception as e:
        print("Error calling API, waiting 70 seconds and trying again...")
        time.sleep(70)
        t = requests.get('https://finnhub.io/api/v1/stock/profile?symbol='+symbol)
        if t:
            t = t.json()
    if t:
        coticker = t['ticker']
        coexchange = t['exchange']
        coname = t['name']
        codesc = t['ggroup']
        coipo = t['ipo']
        cosector = t['gsector']
        costate = t['state']
        coweburl = t['weburl']
        print("")
        print(coticker,coexchange,coname,codesc,coipo,cosector,costate,coweburl)
        successes += 1
        if successes >= irange:
            break
    else:
        print("Information not available for "+symbol)

python-ldap unable to do any basic search queries on open server

I have been trying to do some basic search queries, but I cannot connect to an open LDAP server no matter what. I tried a couple of servers, and none of them worked. I used Apache Directory Studio to make sure that the keyword was there, but it did not work either way. I tried a variety of different code from different sources.
This was the first one I used: https://www.linuxjournal.com/article/6988
import ldap

keyword = "boyle"

def main():
    server = "ldap.forumsys.com"
    username = "cn=read-only-admin,dc=example,dc=com"
    password = "password"
    try:
        l = ldap.open(server)
        l.simple_bind_s(username, password)
        print "Bound to server . . . "
        l.protocol_version = ldap.VERSION3
        print "Searching . . ."
        mysearch(l, keyword)
    except ldap.LDAPError:
        print "Couldnt connect"

def mysearch(l, keyword):
    base = ""
    scope = ldap.SCOPE_SUBTREE
    filter = "cn=" + "*" + keyword + "*"
    retrieve_attributes = None
    count = 0
    result_set = []
    timeout = 0
    try:
        result_id = l.search(base, scope, filter, retrieve_attributes)
        while l != 1:
            result_id = l.search(base, scope, filter, retrieve_attributes)
            result_type, result_data = l.result(result_id, timeout)
            if result_data == []:
                break
            else:
                if result_type == ldap.RES_SEARCH_ENTRY:
                    result_set.append(result_data)
        if len(result_set) == 0:
            print "No Results"
        for i in range(len(result_set)):
            for entry in result_set[i]:
                try:
                    name = entry[1]['cn'][0]
                    mail = entry[1]['mail'][0]
                    #phone = entry[1]['telephonenumber'][0]
                    #desc = entry[1]['description'][0]
                    count = count + 1
                    print name + mail
                except:
                    pass
    except ldap.LDAPError, error_message:
        print error_message

main()
Every time I ran this program, I received an error
{'desc': u"No such object"}
I also tried this
import ldap

try:
    l = ldap.open("ldap.example.com")
except ldap.LDAPError, e:
    print e

base_dn = "cn=read-only-admin,dc=example,dc=com"
search_scope = ldap.SCOPE_SUBTREE
retrieve_attributes = None
search_filter = "uid=myuid"

try:
    l_search = l.search(base_dn, search_scope, search_filter, retrieve_attributes)
    result_status, result_data = l.result(l_search, 0)
    print result_data
except ldap.LDAPError, e:
    print e
The error on this one was
{'desc': u"Can't contact LDAP server"}
I spent about 5 hours trying to figure this out. I would really appreciate it if you guys could give me some advice. Thanks.
There are several bogus things in there.
I will only comment on your first code sample, because it can be run by anyone against that public LDAP server.
l = ldap.open(server)
The function ldap.open() has been deprecated for many years. You should use the function ldap.initialize() with an LDAP URI as argument instead, like this:
l = ldap.initialize("ldap://ldap.forumsys.com")
l_search = l.search(..)
This is the asynchronous method which just returns a message ID (int) of the underlying OpenLDAP C API (libldap). It's needed if you want to retrieve extended controls returned by the LDAP server along with search results. Is that what you want?
As a beginner you probably want to use the simpler method LDAPObject.search_s() which immediately returns a list of (DN, entry) 2-tuples.
See also: python-ldap -- Sending LDAP requests
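For instance, a minimal sketch of the synchronous variant against the same public server, reusing the bound connection l from the question:

# search_s() blocks until the search completes and returns the results directly
results = l.search_s("dc=example,dc=com", ldap.SCOPE_SUBTREE, "(cn=*boyle*)")
for dn, entry in results:          # each result is a (DN, entry) 2-tuple
    print dn, entry.get("mail")    # entry maps attribute names to lists of values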
while l != 1
This does not make sense at all because l is your LDAPObject instance (LDAP connection object). Note that LDAPObject.search() would raise an exception if it gets an Integer error code from OpenLDAP's libldap. No need to do C-style error checks at this level.
filter = "cn=" + "" + keyword + ""
If keyword can be arbitrary input, this is prone to LDAP injection attacks. Don't do that.
For adding arbitrary input into an LDAP filter, use the function ldap.filter.escape_filter_chars() to properly escape special characters. Also avoid the variable name filter, because it shadows a built-in Python function, and properly enclose the filter in parentheses.
Better example:
ldap_filter = "(cn=*%s*)" % (ldap.filter.escape_filter_chars(keyword))
base = ""
The correct search base you have to use is:
base = "dc=example,dc=com"
Otherwise ldap.NO_SUCH_OBJECT is raised.
So here's a complete example:
import pprint
import ldap
from ldap.filter import escape_filter_chars

BINDDN = "cn=read-only-admin,dc=example,dc=com"
BINDPW = "password"
KEYWORD = "boyle"

ldap_conn = ldap.initialize("ldap://ldap.forumsys.com")
ldap_conn.simple_bind_s(BINDDN, BINDPW)
ldap_filter = "(cn=*%s*)" % (ldap.filter.escape_filter_chars(KEYWORD))
ldap_results = ldap_conn.search_s(
    "dc=example,dc=com",
    ldap.SCOPE_SUBTREE,
    ldap_filter,
)
pprint.pprint(ldap_results)

Search via Python Search API timing out intermittently

We have an application that is basically just a form submission for requesting a team drive to be created. It's hosted on Google App Engine.
This timeout error is coming from a single field in the form that simply does typeahead for an email address. All of the names on the domain are indexed in the datastore, about 300k entities - nothing is being pulled directly from the directory api. After 10 seconds of searching (via the Python Google Search API), it will time out. This is currently intermittent, but errors have been increasing in frequency.
Error: line 280, in get_result raise _ToSearchError(e) Timeout: Failed to complete request in 9975ms
Essentially, speeding up the searches would resolve this. I looked at the code and I don't believe there is any room for improvement there. I am not sure if increasing the instance class would help; it is currently an F2. Perhaps there is another way to improve the index efficiency, but I'm not entirely sure how one would do that. Any thoughts would be appreciated.
Search Code:
class LookupUsersorGrpService(object):
    '''
    lookupUsersOrGrps accepts various params and performs search
    '''
    def lookupUsersOrGrps(self, params):
        search_results_json = {}
        search_results = []
        directory_users_grps = GoogleDirectoryUsers()
        error_msg = 'Technical error'
        query = ''
        try:
            #Default few values if not present
            if ('offset' not in params) or (params['offset'] is None):
                params['offset'] = 0
            else:
                params['offset'] = int(params['offset'])
            if ('limit' not in params) or (params['limit'] is None):
                params['limit'] = 20
            else:
                params['limit'] = int(params['limit'])
            #Search related to field name
            query = self.appendQueryParam(q=query, p=params, qname='search_name', criteria=':',
                                          pname='query', isExactMatch=True, splitString=True)
            #Search related to field email
            query = self.appendQueryParam(q=query, p=params, qname='search_email', criteria=':',
                                          pname='query', isExactMatch=True, splitString=True)
            #Perform search
            log.info('Search initialized :\"{}\"'.format(query))
            # sort results by name ascending
            expr_list = [search.SortExpression(expression='name', default_value='',
                                               direction=search.SortExpression.ASCENDING)]
            # construct the sort options
            sort_opts = search.SortOptions(expressions=expr_list)
            #Prepare the search index
            index = search.Index(name="GoogleDirectoryUsers", namespace="1")
            search_query = search.Query(
                query_string=query.strip(),
                options=search.QueryOptions(
                    limit=params['limit'],
                    offset=params['offset'],
                    sort_options=sort_opts,
                    returned_fields=directory_users_grps.get_search_doc_return_fields()
                ))
            #Execute the search query
            search_result = index.search(search_query)
            #Start collecting the values
            total_cnt = search_result.number_found
            params['limit'] = len(search_result.results)
            #Prepare the response object
            for teamdriveDoc in search_result.results:
                teamdriveRecord = GoogleDirectoryUsers.query(
                    GoogleDirectoryUsers.email == teamdriveDoc.doc_id).get()
                if teamdriveRecord:
                    if teamdriveRecord.suspended == False:
                        search_results.append(teamdriveRecord.to_dict())
            search_results_json.update({"users": search_results})
            search_results_json.update({"limit": params['limit'] if len(search_results) > 0 else '0'})
            search_results_json.update({"total_count": total_cnt if len(search_results) > 0 else '0'})
            search_results_json.update({"status": "success"})
        except Exception as e:
            log.exception("Error in performing search")
            search_results_json.update({"status": "failed"})
            search_results_json.update({"description": error_msg})
        return search_results_json

    ''' Retrieves the given param from dict and adds to query if exists
    '''
    def appendQueryParam(self, q='', p=[], qname=None, criteria='=', pname=None,
                         isExactMatch=False, splitString=False, defaultValue=None):
        if (pname in p) or (defaultValue is not None):
            if len(q) > 0:
                q += ' OR '
            q += qname
            if criteria:
                q += criteria
            if defaultValue is None:
                val = p[pname]
            else:
                val = defaultValue
            if splitString:
                val = val.replace("", "~")[1:-1]
            #Helps to retain passed argument as it is, example email
            if isExactMatch:
                q += "\"" + val + "\""
            else:
                q += val
        return q
An Index instance's search method accepts a deadline parameter, so you could use that to increase the time that you are willing to wait for the search to respond:
search_result = index.search(search_query, deadline=30)
The documentation doesn't specify acceptable values for deadline, but other App Engine services tend to accept values up to 60 seconds.

urllib.request: Data Not Writing to Outfile

I've got a script here which (ideally) iterates through multiple pages X of JSON data for each entity Y (in this case, multiple loans X for each team Y). Because of the way the API is constructed, I believe I must physically change a subdirectory within the URL in order to iterate through multiple entities. Here is the explicit documentation and URL:
GET /teams/:id/loans
Returns loans belonging to a particular team.
Example http://api.kivaws.org/v1/teams/2/loans.json
Parameters:
    id (number) Required. The team ID for which to return loans.
    page (number) The page position of results to return. Default: 1
    sort_by (string) The order by which to sort results. One of: oldest, newest. Default: newest
    app_id (string) The application id in reverse DNS notation.
    ids_only (string) Return IDs only to make the return object smaller. One of: true, false. Default: false
Response: loan_listing – HTML, JSON, XML, RSS
Status: Production
And here is my script, which does run and appear to extract the correct data, but doesn't seem to write any data to the outfile:
# -*- coding: utf-8 -*-
import urllib.request as urllib
import json
import time

# storing team loans dict. The key is the team id, and the value is the list of lenders
team_loans = {}
url = "http://api.kivaws.org/v1/teams/"

# team ids range 1 - 11885
for i in range(1, 100):
    params = dict(
        id = i
    )
    #i = 1
    try:
        handle = urllib.urlopen(str(url+str(i)+"/loans.json"))
        print(handle)
    except:
        print("Could not handle url")
        continue
    # reading response
    item_html = handle.read().decode('utf-8')
    # converting bytes to str
    data = str(item_html)
    # converting to json
    data = json.loads(data)
    # getting number of pages to crawl
    numPages = data['paging']['pages']
    # deleting paging data
    data.pop('paging')
    # calling additional pages
    if numPages > 1:
        for pa in range(2, numPages+1, 1):
            #pa = 2
            handle = urllib.urlopen(str(url+str(i)+"/loans.json?page="+str(pa)))
            print("Pulling loan data from team " + str(i) + "...")
            # reading response
            item_html = handle.read().decode('utf-8')
            # converting bytes to str
            datatemp = str(item_html)
            # converting to json
            datatemp = json.loads(datatemp)
            # pagings are redundant headers
            datatemp.pop('paging')
            # adding data to initial list
            for loan in datatemp['loans']:
                data['loans'].append(loan)
            time.sleep(2)
    # recording loans by team in dict
    team_loans[i] = data['loans']
    if (data['loans']):
        print("===Data added to the team_loan dictionary===")
    else:
        print("!!!FAILURE to add data to team_loan dictionary!!!")
    # recording data to file when 10 teams are read
    print("===Finished pulling from page " + str(i) + "===")
    if (int(i) % 10 == 0):
        outfile = open("team_loan.json", "w")
        print("===Now writing data to outfile===")
        json.dump(team_loans, outfile, sort_keys=True, indent=2, ensure_ascii=True)
        outfile.close()
    else:
        print("!!!FAILURE to write data to outfile!!!")
    # compliance with API # of requests
    time.sleep(2)

print('Done! Check your outfile (team_loan.json)')
I know that may be a heady amount of code to throw in your faces, but it's a pretty sequential process.
Again, this program is pulling the correct data, but it is not writing this data to the outfile. Can anyone understand why?
For others who may read this post: the script does in fact write data to an outfile. It was simply the test-code logic that was wrong. Ignore the print statements I put in place.
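In other words, the misleading part is the else branch of the i % 10 check: it prints "!!!FAILURE to write data to outfile!!!" on every iteration that simply isn't a write iteration. A rough sketch of that checkpoint step with the same behaviour, just without the false alarm, would be:

# write a checkpoint every 10 teams; stay quiet otherwise
if i % 10 == 0:
    print("===Now writing data to outfile===")
    with open("team_loan.json", "w") as outfile:
        json.dump(team_loans, outfile, sort_keys=True, indent=2, ensure_ascii=True)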

Send an email from Python script whenever the threshold is met for a given machine?

I have a URL which gives me the JSON string below if I hit it in the browser.
Below is my URL; let's say it is URL-A, and I have around three URLs like this:
http://hostnameA:1234/Service/statistics?%24format=json
And below is the JSON string which I get back from the URL:
{
    "description": "",
    "statistics": {
        "dataCount": 0
    }
}
Now I have written a Python script which scans all three of my URLs and parses the JSON string to extract the value of dataCount from each. It should keep running every few seconds, scanning the URLs and parsing the responses.
Below are my URLs:
hostnameA http://hostnameA:1234/Service/statistics?%24format=json
hostnameB http://hostnameB:1234/Service/statistics?%24format=json
hostnameC http://hostnameC:1234/Service/statistics?%24format=json
And the data which I am seeing on the console is like this after running my python script -
hostnameA - dataCount
hostnameB - dataCount
hostnameC - dataCount
Below is my Python script, which works fine:

import json
import smtplib
import requests
from time import sleep

def get_data_count(url):
    try:
        req = requests.get(url)
    except requests.ConnectionError:
        return 'could not get page'
    try:
        data = json.loads(req.content)
        return int(data['statistics']['dataCount'])
    except TypeError:
        return 'field not found'
    except ValueError:
        return 'not an integer'

def send_mail(data):
    sender = 'user#host.com'
    receivers = ['some_name#host.com']
    message = """\
From: user#host.com
To: some_name#host.com
Subject: Testing Script
"""
    body = '\n\n'
    for item in data:
        body = body + '{name} - {res}\n'.format(name=item['name'], res=item['res'])
    message = message + body
    try:
        smtpObj = smtplib.SMTP('some_server_name')
        smtpObj.sendmail(sender, receivers, message)
        print "Mail sent"
    except smtplib.SMTPException:
        print "Mail sending failed!"

def main():
    urls = [
        ('hostnameA', 'http://hostnameA:1234/Service/statistics?%24format=json'),
        ('hostnameB', 'http://hostnameB:1234/Service/statistics?%24format=json'),
        ('hostnameC', 'http://hostnameC:1234/Service/statistics?%24format=json')
    ]
    count = 0
    while True:
        data = []
        print('')
        for name, url in urls:
            res = get_data_count(url)
            print('{name} - {res}'.format(name=name, res=res))
            data.append({'name': name, 'res': res})
        if len([item['res'] for item in data if item['res'] >= 20]) >= 1:
            count = count + 1
        else:
            count = 0
        if count == 2:
            send_mail(data)
            count = 0
        sleep(10.)

if __name__ == "__main__":
    main()
The script also sends out an email if any machine's dataCount value is greater than or equal to 20 two times in a row, and that works fine as well.
One issue I am noticing: suppose hostnameB is down for whatever reason; then the first time it will print out like this -
hostnameA - 1
hostnameB - could not get page
hostnameC - 10
And the second time it will print out like this -
hostnameA - 5
hostnameB - could not get page
hostnameC - 7
So my script sends out an email in this case as well, since 'could not get page' appeared two times in a row, even though hostnameB's dataCount was never greater than or equal to 20. There is clearly some bug in my script and I am not sure how to solve it.
I just need to send out an email if any hostname's dataCount value is greater than or equal to 20 two times in a row. If a machine is down for whatever reason, I will skip that case, but the script should keep running.
Without changing the get_data_count function:
I took the liberty of making data a dictionary with the server name as the key; this makes looking up the last value easier.
I keep the result of the previous pass and compare both the current and the previous values against 20. In Python 2 a string compares greater than an int, so instead I call int() on the result; that raises an exception when the result is an error string, which I catch so that shut-down servers are not counted.
last = False
while True:
    data = {}
    hit = False
    print('')
    for name, url in urls:
        res = get_data_count(url)
        print('{name} - {res}'.format(name=name, res=res))
        data[name] = res
        try:
            if int(res) > 19:
                hit = True
        except ValueError:
            continue
    if hit and last:
        send_mail(data)
    last = hit
    sleep(10.)
Pong Wizard is right: you should not handle errors like that. Either return False or None and check the value later, or just throw an exception.
You should use False for a failed request, instead of the string "could not get page". This would be cleaner, but a False value will also double as a 0 if it is treated as an int.
>>> True + False
1
Summing two or more False values will therefore equal 0.
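As a rough sketch of that suggestion (reusing the URLs and polling loop from above, with None marking a failed request), get_data_count would return None on any failure, and the caller would only compare real integers against the threshold:

def get_data_count(url):
    try:
        req = requests.get(url)
        data = json.loads(req.content)
        return int(data['statistics']['dataCount'])
    except (requests.ConnectionError, TypeError, KeyError, ValueError):
        return None  # unreachable host, missing field, or non-integer value

# inside the polling loop:
res = get_data_count(url)
if res is not None and res >= 20:
    hit = True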
