Python JIRA jira.issue slow performance

I am trying to retrieve from JIRA a tree graph of parent-child issues
(epic -> story -> task) using Python 3.10 and jira 3.1.1.
For a project with 400 issues it takes minutes to retrieve the result.
Is there a way to improve this code for better performance?
The result is displayed in a Flask front-end web page, and a response time of 2 minutes is not pleasant for users.
jira = JIRA(options={'server': self.url, 'verify': False}, basic_auth=(self.username, self.password))
for singleIssue in jira.search_issues(jql_str=querystring, maxResults=False):
    issue = jira.issue(singleIssue.key)
    links = issue.fields.issuelinks
    for link in links:
        item = {}  # link details collected per issue link (rest of the loop omitted in the question)
        if hasattr(link, 'inwardIssue'):
            item['inward_issue'] = link.inwardIssue.key
        if hasattr(link, 'outwardIssue'):
            item['outward_issue'] = link.outwardIssue.key
        item['typename'] = link.type.name

I realised this is more or less a matter of Flask and long-running query handling.
It should be solved either by asynchronous calls or by queuing.
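A minimal sketch of that background/queuing idea, assuming the JIRA traversal above is wrapped in a hypothetical build_tree() helper: the Flask view serves the last cached result and triggers a refresh in a background thread instead of blocking the request for two minutes.

import threading
from flask import Flask, render_template

app = Flask(__name__)
cached_tree = []  # last tree that was successfully built

def build_tree():
    # Hypothetical helper: runs the slow JIRA traversal from the snippet above
    # and returns the epic -> story -> task structure.
    return []

def refresh_tree():
    global cached_tree
    cached_tree = build_tree()

@app.route("/tree")
def tree_view():
    # Serve the cached result immediately and refresh it in the background,
    # so users never wait for the full JIRA traversal.
    threading.Thread(target=refresh_tree, daemon=True).start()
    return render_template("tree.html", tree=cached_tree)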

Limit the fields you are requesting via the fields=['key'] parameter to include only those you need.
In your example you are fetching the whole issue twice: once from the .search_issues() iterator every time round the loop (as expected), and (unnecessarily) again in the .issue() call on the following line.
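A sketch of that suggestion applied to the snippet in the question: ask search_issues for only the key and issuelinks fields and use each returned issue directly, so the extra jira.issue() request per issue disappears.

# Fetch only the fields the loop actually uses; each search result already
# carries its issuelinks, so no second request per issue is needed.
for singleIssue in jira.search_issues(jql_str=querystring,
                                      maxResults=False,
                                      fields=['key', 'issuelinks']):
    for link in singleIssue.fields.issuelinks:
        item = {}  # same per-link collection as in the question
        if hasattr(link, 'inwardIssue'):
            item['inward_issue'] = link.inwardIssue.key
        if hasattr(link, 'outwardIssue'):
            item['outward_issue'] = link.outwardIssue.key
        item['typename'] = link.type.name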

Related

Struggling with how to iterate data

I am learning Python 3 and I have a fairly simple task to complete, but I am struggling with how to glue it all together. I need to query an API and return the full list of applications, which I can do and store; I then need to use that list again to gather more data for each application from a different API call.
applistfull = requests.get(url, authmethod)
if applistfull.ok:
    data = applistfull.json()
    for app in data["_embedded"]["applications"]:
        print(app["profile"]["name"], app["guid"])
        summaryguid = app["guid"]
else:
    print(applistfull.status_code)
Next I have, I believe, 'summaryguid', and I need to query a different API and return a value that could exist many times for each application; in this case, the compiler used to build the code.
I can statically put a GUID in the URL and return the correct information, but I haven't yet figured out how to get it to do the below for all of the above and build a master list:
summary = requests.get(f"url{summaryguid}moreurl", authmethod)
if summary.ok:
    fulldata = summary.json()
    for appsummary in fulldata["static-analysis"]["modules"]["module"]:
        print(appsummary["compiler"])
I would prefer not to have someone just type out the right answer yet, but rather drop a few hints and let me continue to work through it logically, so I learn how to deal with what I assume is a common issue. My thought right now is that I need to move my second if up into my initial block and continue the logic in that space, but I am stuck there.
You are on the right track! Here is the hint: the second API request can be nested inside the loop that iterates through the list of applications in the first API call. By doing so, you can get the information you require by making the second API call for each application.
import requests

applistfull = requests.get("url", authmethod)
if applistfull.ok:
    data = applistfull.json()
    for app in data["_embedded"]["applications"]:
        print(app["profile"]["name"], app["guid"])
        summaryguid = app["guid"]
        summary = requests.get(f"url/{summaryguid}/moreurl", authmethod)
        fulldata = summary.json()
        for appsummary in fulldata["static-analysis"]["modules"]["module"]:
            print(app["profile"]["name"], appsummary["compiler"])
else:
    print(applistfull.status_code)
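Since the stated goal is a master list rather than printed output, one possible extension of that hint (a sketch, keeping the placeholder "url" strings and the authmethod variable from the question) is to append one record per application/compiler pair as the nested loop runs:

import requests

master_list = []  # one entry per (application, compiler) pair

applistfull = requests.get("url", authmethod)
if applistfull.ok:
    data = applistfull.json()
    for app in data["_embedded"]["applications"]:
        summaryguid = app["guid"]
        summary = requests.get(f"url/{summaryguid}/moreurl", authmethod)
        if summary.ok:
            fulldata = summary.json()
            for appsummary in fulldata["static-analysis"]["modules"]["module"]:
                master_list.append({
                    "name": app["profile"]["name"],
                    "guid": summaryguid,
                    "compiler": appsummary["compiler"],
                })
else:
    print(applistfull.status_code)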

Waiting MySQL execution before rendering template

I am using Flask and MySQL. I have an issue with updated data not showing up after the execution.
Currently, I am deleting and redirecting back to the admin page so that I then have a refreshed version of the website. However, I still get old entries showing up in the table I have in the front end. After refreshing manually, everything works normally. The issue sometimes also happens with data simply not being sent to the front end at all, I assume as a result of the template being rendered before the MySQL execution finishes and an empty list being sent forward.
On the Python side I have:
#app.route("/admin")
def admin_main():
query = "SELECT * FROM categories"
conn.executing = True
results = conn.exec_query(query)
return render_template("src/admin.html", categories = results)
#app.route('/delete_category', methods = ['POST'])
def delete_category():
id = request.form['category_id']
query = "DELETE FROM categories WHERE cid = '{}'".format(id)
conn.delete_query(query)
return redirect("admin", code=302)
admin_main is the main page. I tried adding some sort of "semaphore" system, only rendering once "conn.executing" became false, but that did not work out either. I also tried playing around with async and await, but no luck ("The view function did not return a valid response").
I am somewhat out of options in this case and do not really know how to treat the problem. Any help is appreciated!
I figured out that the problem was not with the data not being properly read, but with the page not being refreshed. Though the console prints a GET request for the page, it does not redirect since it is already on that same page.
The only workaround I am currently working on is a socket.io implementation to have the content updated dynamically.
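A minimal sketch of that socket.io direction, assuming Flask-SocketIO and reusing the conn helper and delete query from the question: after the delete, emit an event so the admin page can re-fetch its table with the Socket.IO JavaScript client instead of relying on a redirect.

from flask import Flask, request
from flask_socketio import SocketIO

app = Flask(__name__)
socketio = SocketIO(app)

@app.route('/delete_category', methods=['POST'])
def delete_category():
    category_id = request.form['category_id']
    # Same delete helper as in the question (a parameterized query would be
    # safer than string formatting, if conn supports it).
    conn.delete_query("DELETE FROM categories WHERE cid = '{}'".format(category_id))
    # Notify connected admin pages so they re-fetch the categories table.
    socketio.emit('categories_changed')
    return ('', 204)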

Recursive API Calls using AWS Lambda Functions Python

I've never written a recursive Python script before. I'm used to splitting up a monolithic function into sub AWS Lambda functions. However, this particular script I am working on is challenging to break up into smaller functions.
Here is the code I am currently using, for context. I am using one API request to return a list of objects within a table.
url_pega_EEvisa = requests.get('https://cloud.xxxx.com:443/prweb/api/v1/data/D_pxCaseList?caseClass=xx-xx-xx-xx', auth=(username, password))
pega_EEvisa_raw = url_pega_EEvisa.json()
pega_EEvisa = pega_EEvisa_raw['pxResults']
This returns every object (primary key) within a particular table as a list. For example,
['XX-XXSALES-WORK%20PO-1', 'XX-XXSALES-WORK%20PO-10', 'XX-XXSALES-WORK%20PO-100', 'XX-XXSALES-WORK%20PO-101', 'XX-XXSALES-WORK%20PO-102', 'XX-XXSALES-WORK%20PO-103', 'XX-XXSALES-WORK%20PO-104', 'XX-XXSALES-WORK%20PO-105', 'XX-XXSALES-WORK%20PO-106', 'XX-XXSALES-WORK%20PO-107']
I then use this list to populate more get requests using a for loop which then grabs me all the data per object.
for t in caseid:
    url = requests.get('https://cloud.xxxx.com:443/prweb/api/v1/cases/{}'.format(t), auth=(username, password)).json()
    data.append(url)
This particular Lambda function takes about 15 minutes, which is the limit for one AWS Lambda invocation. Ideally, I'd like to split the list into smaller parts and run the same process on each. I am struggling with marking the point where it last ran before failure and passing that information on to the next function.
Any help is appreciated!
I'm not sure I entirely understand what you want to do with the data once you've fetched all the information about the cases, but in terms of breaking up the work one Lambda is doing into many Lambdas, you should be able to chunk the list of cases and pass the chunks to new invocations of the same Lambda. Python pseudocode below; hopefully it helps illustrate the idea. I stole the chunks method from this answer, which helps break the list into batches.
import json
import boto3
import requests

client = boto3.client('lambda')

def handler(event, context):
    url_pega_EEvisa = requests.get('https://cloud.xxxx.com:443/prweb/api/v1/data/D_pxCaseList?caseClass=xx-xx-xx-xx', auth=(username, password))
    pega_EEvisa_raw = url_pega_EEvisa.json()
    pega_EEvisa = pega_EEvisa_raw['pxResults']
    for chunk in chunks(pega_EEvisa, 10):
        # Fan out: each batch of ten cases is handled by its own Lambda invocation.
        client.invoke(
            FunctionName='lambdaToHandleBatchOfTenCases',
            Payload=json.dumps(chunk)
        )
Hopefully that helps? Let me know if this was not on target 😅
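For reference, the chunks helper the answer borrows is usually the standard batching generator below (a sketch; see the linked answer for the original):

def chunks(lst, n):
    """Yield successive n-sized chunks from lst."""
    for i in range(0, len(lst), n):
        yield lst[i:i + n]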

Jira-Python: updating issues takes a very long time

I am using Python with the jira package and writing a simple script that will create or update all the existing issues for a project on my company's server.
Creating multiple issues via Python is very fast, and I can create 100 issues within 30 seconds or so. The problem is when I want to update those issues: it takes a very long time, probably 4 or 5 minutes to update 100 issues. I am getting InsecureRequestWarning warnings. I tried disabling the warnings as well, but the program is still very slow when it comes to updating the issues. How can I make updating issues faster?
NOTE: each issue update takes more than 3.1 seconds.
from jira import *
import urllib3

urllib3.disable_warnings()  # Comment this to see warnings

options = {'server': 'Company Server', 'verify': False}
jira = JIRA(options, basic_auth=('username', "password"))
nameOfProjects = "Project name from jira"
issuesJira = jira.search_issues(jql_str='project="{}"'.format(nameOfProjects),
                                fields='summary, key,type,status',
                                startAt=0, maxResults=1000)
test = 0
for issue in issuesJira:
    issue.update(notify=False, fields={
        'summary': 'some Text',
        'description': 'some Text- ' + str(test),
        'priority': {"name": 'High'},
        'components': [{'name': 'TestMode'}],
        "issuetype": {"name": 'Requirement'},
        'fixVersions': [{'name': 'test'}]})
    print('issue is updated-', test)
    test = test + 1
print('END')
I'm also experiencing this issue.
It seems there may be a bug in the Python library that adds a 4-second sleep in the update method.
https://github.com/pycontribs/jira/issues/622
Since this bug has been open for almost 2 years, I think it's time for a fork!
Is the username in the first user directory configured in Jira user management? That can affect authentication times.
Does the project have a complex permission scheme or issue security scheme?
What changes if you only update one field, such as Summary? Does that change the timing?
I would expect about 1 second per issue update in many installations.
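One mitigation not mentioned in the thread (a sketch, not the library's own fix): since each update is an independent HTTP call, the per-issue latency can be overlapped by running updates concurrently with a small thread pool. Whether this is acceptable depends on your Jira server's rate limits, and note that sharing one JIRA session across threads is not guaranteed to be thread-safe.

from concurrent.futures import ThreadPoolExecutor

def update_issue(args):
    index, issue = args
    # Same kind of update as in the question, trimmed to two fields for brevity.
    issue.update(notify=False, fields={
        'summary': 'some Text',
        'description': 'some Text- ' + str(index),
    })
    return issue.key

# A handful of workers is usually enough to hide most of the per-request latency.
with ThreadPoolExecutor(max_workers=8) as pool:
    for key in pool.map(update_issue, enumerate(issuesJira)):
        print('issue is updated-', key)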

Updated approach to Google search with Python

I was trying to use xgoogle, but it has not been updated for 3 years and I keep getting no more than 5 results even if I set 100 results per page. If anyone uses xgoogle without any problem, please let me know.
Since the only (apparently) available wrapper is xgoogle, the other option is to use some sort of browser automation, like mechanize, but that would make the code entirely dependent on Google's HTML, which they might change at any time.
The final option is to use the Custom Search API that Google offers, but it has a ridiculous 100-requests-per-day limit and pricing after that.
I need help on which direction I should go: what other options do you know of, and what works for you?
Thanks!
It only needs a minor patch.
The function GoogleSearch._extract_result (Line 237 of search.py) calls GoogleSearch._extract_description (Line 258), which fails, causing _extract_result to return None for most of the results and therefore showing fewer results than expected.
Fix:
In search.py, change Line 259 from this:
desc_div = result.find('div', {'class': re.compile(r'\bs\b')})
to this:
desc_div = result.find('span', {'class': 'st'})
I tested using:
#!/usr/bin/python
#
# This program does a Google search for "quick and dirty" and returns
# 200 results.
#
from xgoogle.search import GoogleSearch, SearchError

class give_me(object):
    def __init__(self, query, target):
        self.gs = GoogleSearch(query)
        self.gs.results_per_page = 50
        self.current = 0
        self.target = target
        self.buf_list = []

    def __iter__(self):
        return self

    def next(self):
        if self.current >= self.target:
            raise StopIteration
        else:
            if not self.buf_list:
                self.buf_list = self.gs.get_results()
            self.current += 1
            return self.buf_list.pop(0)

try:
    sites = {}
    for res in give_me("quick and dirty", 200):
        t_dict = {
            "title": res.title.encode('utf8'),
            "desc": res.desc.encode('utf8'),
            "url": res.url.encode('utf8')
        }
        sites[t_dict["url"]] = t_dict
        print t_dict
except SearchError, e:
    print "Search failed: %s" % e
I think you misunderstand what xgoogle is. xgoogle is not a wrapper; it's a library that fakes being a human user with a browser, and scrapes the results. It's heavily dependent on the format of Google's search queries and results pages as of 2009, so it's no surprise that it doesn't work the same in 2013. See the announcement blog post for more details.
You can, of course, hack up the xgoogle source and try to make it work with Google's current format (as it turns out, they've only broken xgoogle by accident, and not very badly…), but it's just going to break again.
Meanwhile, you're trying to get around Google's Terms of Service:
Don’t misuse our Services. For example, don’t interfere with our Services or try to access them using a method other than the interface and the instructions that we provide.
They've been specifically asked about exactly what you're trying to do, and their answer is:
Google's Terms of Service do not allow the sending of automated queries of any sort to our system without express permission in advance from Google.
And you even say that's explicitly what you want to do:
The final option is to use the Custom Search API that Google offers, but it has a ridiculous 100-requests-per-day limit and pricing after that.
So, you're looking for a way to access Google search using a method other than the interface they provide, in a deliberate attempt to get around their free usage quota without paying. They are completely within their rights to do anything they want to break your code, and if they get enough hits from people doing this kind of thing, they will.
(Note that when a program is scraping the results, nobody's seeing the ads, and the ads are what pay for the whole thing.)
Of course nobody's forcing you to use Google. EntireWeb has a free "unlimited" (as in "as long as you don't use too much, and we haven't specified the limit") search API. Bing gives you a higher quota, amortized by month instead of by day. Yahoo BOSS is flexible and super-cheap (and even offers a "discount server" that provides lower-quality results if it's not cheap enough), although I believe you're forced to type the ridiculous exclamation point. If none of them are good enough for you… then pay for Google.
