How to use Azure DevOps / VSTS to fetch query results in Python

Below is my current code. It connects successfully to the organization. How can I fetch the results of a query in Azure, like they do here? I know this was solved, but there isn't an explanation and there's quite a big gap in what they're doing.
from azure.devops.connection import Connection
from msrest.authentication import BasicAuthentication
from azure.devops.v5_1.work_item_tracking.models import Wiql
personal_access_token = 'xxx'
organization_url = 'zzz'
# Create a connection to the org
credentials = BasicAuthentication('', personal_access_token)
connection = Connection(base_url=organization_url, creds=credentials)
wit_client = connection.clients.get_work_item_tracking_client()
results = wit_client.query_by_id("my query ID here")
P.S. Please don't link me to the github or documentation. I've looked at both extensively for days and it hasn't helped.
Edit: I've added the results line that successfully gets the query. However, it returns a WorkItemQueryResult object, which is not exactly what is needed. I need a way to view the columns of the query and the results for each column.

So I've figured this out in probably the most inefficient way possible, but I hope it helps someone else and that they find a way to improve it.
The issue with the WorkItemQueryResult class stored in the variable results is that it doesn't expose the contents of the work items directly.
So the goal is to use the get_work_item method, which requires an id; you can get that (in a rather roundabout way) through item.target.id from the results' work_item_relations. The code below is added on.
for item in results.work_item_relations:
    id = item.target.id
    work_item = wit_client.get_work_item(id)
    fields = work_item.fields
This gets the id from every work item in your result class and then grants access to the fields of that work item, which you can access by fields.get("System.Title"), etc.
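Building on the snippet above, here is a minimal sketch of how the query columns and the corresponding field values could be printed together. The attribute names used here (results.columns, reference_name, get_work_items) are taken from the v5_1 SDK models and may differ in other versions, so treat this as an assumption rather than a guaranteed API.
# Sketch only: assumes results.columns holds WorkItemFieldReference objects
# and that wit_client.get_work_items accepts a list of ids (v5_1 SDK).
column_names = [col.reference_name for col in results.columns]
ids = [item.target.id for item in results.work_item_relations if item.target]
for wi in wit_client.get_work_items(ids):
    # build one row per work item, keyed by the query's column reference names
    print({name: wi.fields.get(name) for name in column_names})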

Related

Struggling with how to iterate data

I am learning Python 3 and I have a fairly simple task to complete, but I am struggling with how to glue it all together. I need to query an API and return the full list of applications, which I can do; I store this and need to use it again to gather more data for each application from a different API call.
applistfull = requests.get(url, authmethod)
if applistfull.ok:
    data = applistfull.json()
    for app in data["_embedded"]["applications"]:
        print(app["profile"]["name"], app["guid"])
        summaryguid = app["guid"]
else:
    print(applistfull.status_code)
Next I have, I think, summaryguid, and I need to query a different API and return a value that could exist many times for each application; in this case, the compiler used to build the code.
I can statically put a GUID in the URL and return the correct information, but I haven't yet figured out how to run the code below for every application above and build a master list:
summary = requests.get(f"url{summaryguid}moreurl", authmethod)
if summary.ok:
    fulldata = summary.json()
    for appsummary in fulldata["static-analysis"]["modules"]["module"]:
        print(appsummary["compiler"])
I would prefer that someone not just type out the right answer yet, but instead drop a few hints and let me continue to work through it logically, so that I learn how to deal with what I assume is a common issue in the future. My thought right now is that I need to move my second if up into my initial block and continue the logic in that space, but I am stuck there.
You are on the right track! Here is the hint: the second API request can be nested inside the loop that iterates through the list of applications in the first API call. By doing so, you can get the information you require by making the second API call for each application.
import requests

applistfull = requests.get("url", authmethod)
if applistfull.ok:
    data = applistfull.json()
    for app in data["_embedded"]["applications"]:
        print(app["profile"]["name"], app["guid"])
        summaryguid = app["guid"]
        summary = requests.get(f"url/{summaryguid}/moreurl", authmethod)
        fulldata = summary.json()
        for appsummary in fulldata["static-analysis"]["modules"]["module"]:
            print(app["profile"]["name"], appsummary["compiler"])
else:
    print(applistfull.status_code)
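Since the question mentions building a master list, here is a minimal sketch (the URL placeholders and authmethod are kept from the question and are assumptions) that accumulates the compilers per application in a dictionary instead of only printing them:
compilers_by_app = {}  # master list: application name -> list of compilers
for app in data["_embedded"]["applications"]:
    summaryguid = app["guid"]
    summary = requests.get(f"url/{summaryguid}/moreurl", authmethod)
    if not summary.ok:
        continue  # skip applications whose summary call failed
    modules = summary.json()["static-analysis"]["modules"]["module"]
    compilers_by_app[app["profile"]["name"]] = [m["compiler"] for m in modules]
print(compilers_by_app)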

Waiting MySQL execution before rendering template

I am using Flask and MySQL. I have an issue with updated data not showing up after the execution.
Currently, I delete the row and redirect back to the admin page so that I get a refreshed version of the website. However, I still see old entries in the table on the front end; after refreshing manually, everything works normally. Sometimes the data is not sent to the front end at all, which I assume happens because the template is rendered before the MySQL execution finishes and an empty list is passed forward.
On Python I have:
@app.route("/admin")
def admin_main():
    query = "SELECT * FROM categories"
    conn.executing = True
    results = conn.exec_query(query)
    return render_template("src/admin.html", categories=results)

@app.route('/delete_category', methods=['POST'])
def delete_category():
    id = request.form['category_id']
    query = "DELETE FROM categories WHERE cid = '{}'".format(id)
    conn.delete_query(query)
    return redirect("admin", code=302)
admin_main is the main page. I tried adding some sort of "semaphore" system, only executing once conn.executing became false, but that did not work out either. I also tried playing around with async and await, but no luck ("The view function did not return a valid response").
I am somehow out of options in this case and do not really know how to treat the problem. Any help is appreciated!
I figured out that the problem was not with the data not being properly read, but with the page not being refreshed. Though the console prints a GET request for the page, it does not redirect since it is already on that same page.
The only workaround I am currently working on is a socket.io implementation to have the content updated dynamically.
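If the stale rows turn out to come from the browser caching the admin page rather than from MySQL timing, one alternative worth trying (an assumption on my part, not what the answer above ended up doing) is to disable caching for that view:
from flask import make_response

@app.route("/admin")
def admin_main():
    results = conn.exec_query("SELECT * FROM categories")
    resp = make_response(render_template("src/admin.html", categories=results))
    resp.headers["Cache-Control"] = "no-store"  # ask the browser to re-fetch the page after the redirect
    return resp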

(BigQuery PY Client Library v0.28) - Fetch result from table 'query' job

I'm learning the BigQuery API using the Python Client Libraries v0.28
https://googlecloudplatform.github.io/google-cloud-python/latest/bigquery/usage.html#run-a-simple-query
Wrote this simple code to fetch data from the table
1) Create client object
from google.cloud import bigquery
import time
import uuid
client_ = bigquery.Client.from_service_account_json('/Users/xyz/key.json')
2) Begin new Async query job
QUERY = 'SELECT visitid FROM `1234567.ga_sessions_20180101`'
query_job = client_.query(QUERY, job_id=str(uuid.uuid4()))
3) poll until the query is DONE
while query_job.state == 'RUNNING':
    time.sleep(5)
    query_job.reload()
4) Fetch the results in iteration
query_job.reload()
iter = query_job.result()
At this stage I'd like to fetch how many rows are in the table. As per the docs and the GitHub code, iter is of type bigquery.table.RowIterator, which has a total_rows property.
5) However, at this stage when I print:
print(iter.total_rows)
It keeps returning None
I'm pretty sure this table is NOT empty and my query is correctly formatted!
Any help or pointers on what I'm missing here will be really appreciated... Thanks a lot!
Cheers!
You also need to check query_job.error_result to make sure the query succeeded.
You can also see your job in the UI, which can be useful for debugging, using project id and job id:
https://bigquery.cloud.google.com/results/projectid:jobid
Also, query_job.result() already waits for the job completion so you don't need to poll.
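As a concrete illustration of that point, here is a minimal sketch (reusing client_ and QUERY from the question) that skips the manual polling and surfaces any job error; result() blocks until completion and raises if the job failed:
query_job = client_.query(QUERY)
try:
    rows = query_job.result()      # blocks until the job is done; no polling loop needed
except Exception:
    print(query_job.error_result)  # shows why the job failed
    raise
row_count = sum(1 for _ in rows)   # counting by iteration works regardless of when total_rows is populated
print(row_count)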
The current behavior, where RowIterator.total_rows returns None, is indeed perplexing. Luckily, according to this issue, tswast's comment from 10 days ago indicates that the developers are working on a better solution.
Current awkward behavior of .total_rows
Currently, .total_rows is initialized only once iteration begins. (In what follows, for clarity I renamed your iter variable to row_iter.)
row_iter = query_job.result()
itr = iter(row_iter)
first_row = next(itr)
print(row_iter.total_rows) # Now you get a number instead of None.
This is ugly because to continue the iteration, we must either handle the first row differently or call row_iter = query_job.result() again.
Temporary workaround
A currently-working alternative is to use the value of query_job._query_results.total_rows. Unfortunately this is cheating because _query_results is private, so there is no reason to expect that this will work in the future.
Future behavior
If tswast's proposal is implemented, then row_iter.total_rows will be initialized at the beginning, just as you expect.
Suggestion
In my code, I'm going to use something like
try:
    num_rows = row_iter.total_rows or query_job._query_results.total_rows
except AttributeError:  # _query_results is private and may disappear in future versions
    num_rows = None
to be compatible with future behavior while falling back to the temporary workaround if necessary.

How to use PyOrient to create functions (stored procedures) in OrientDB?

I'm trying to create an OrientDB graph database using PyOrient, and I can't find enough documentation to allow me to get Functions working. I've been able to create a function using record_create into the ofunction cluster, but although it doesn't crash, it doesn't appear to work either.
Here's my code:
#!/usr/bin/python
import pyorient
ousername="user"
opassword="pass"
client = pyorient.OrientDB("localhost", 2424)
session_id = client.connect( ousername, opassword )
db_name="database"
client.db_create( db_name, pyorient.DB_TYPE_GRAPH, pyorient.STORAGE_TYPE_PLOCAL )
# Set up the schema of the database
client.command( "create class URL extends V" )
client.command( "CREATE PROPERTY URL.url STRING")
client.command( "CREATE PROPERTY URL.id INTEGER")
client.command( "CREATE SEQUENCE urlseq")
client.command( "CREATE INDEX urls ON URL (url) UNIQUE")
# Get the id numbers of all the clusters
info=client.db_reload()
clusters={}
for c in info:
    clusters[c.name] = c.id
print(clusters)
# Construct a test function
# All this should do is create a new URL vertex. Eventually it will check for uniqueness of url, etc.
code="INSERT INTO URL SET id = sequence('urlseq').next(), url='?'"
addURL_func = { '#OFunction': { 'name': 'addURL', 'code':'orient.getGraph().command("sql","%s",[urlparam]);' % code, 'language':'javascript', 'parameters':'urlparam', 'idempotent':False } }
client.record_create( clusters['ofunction'], addURL_func )
# Assume allURLs contains the list of URLs I want to store
for url in allURLs:
client.command("select addURL('%s')" % url)
vs = client.command("select * from URL")
for v in vs:
    print(v.url)
Doing all the select addURL bits runs happily, but doing select * from URL simply times out, presumably because (as I've discovered by examining the database in Studio) there are still no URL vertices. Although why that should time out rather than returning an empty list or giving a useful error message, I'm not sure.
What am I doing wrong, and is there an easier way to create Functions through PyOrient?
I don't want to just write the Functions in Studio, because I am prototyping and want them written from the Python code rather than being lost every time I drop the mangled experimental graph!
I've mainly been using the OrientDB wiki page to find out about OrientDB functions, and the PyOrient github page as almost my only source of documentation for that.
Edit: I've been able to create a working Function in SQL (see my own answer below), but I still can't create a working JavaScript Function which creates a vertex. My current best attempt is:
code2="""var g=orient.getGraph();g.command('sql','CREATE VERTEX URL SET id = sequence(\\"urlseq\\").next(), url = \\"'+urlparam+'\\"',[urlparam]);"""
myFunction2 = 'CREATE FUNCTION addURL2 "' + code2 + '" parameters [urlparam] idempotent false language javascript'
client.command(myFunction2)
which runs without crashing when called from PyOrient, but doesn't actually create any vertices. But if I call it from Studio, it works!?! I have no idea what's going on.
OK, after a lot of hacking and Googling, I've got it working:
code="CREATE VERTEX URL SET id = sequence('urlseq').next(), url = :urlparam;"
myFunction = 'CREATE FUNCTION addURL "' + code + '" parameters [urlparam] idempotent false language sql'
client.command(myFunction)
The key here seems to be the use of a colon before parameter names in OrientDB's version of SQL. I couldn't find any reference to this anywhere in the OrientDB docs, but someone online had discovered it somehow.
I'm answering my own question in the hope that this will help others struggling wth ODB's poor documentation!
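For completeness, a minimal usage sketch building on the SQL function above; the naive single-quote escaping is an assumption added here, not something taken from the answer:
for url in allURLs:
    safe_url = url.replace("'", "\\'")               # naive escaping so quotes in the URL don't break the SQL string
    client.command("SELECT addURL('%s')" % safe_url)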
You could try something like:
code="var g=orient.getGraph();\ng.command(\\'sql\\',\\'%s\\',[urlparam]);"
myFunction = "CREATE FUNCTION addURL '" + code + "' parameters [urlparam] idempotent false language javascrip"
client.command(myFunction);
UPDATE
I used this code (version 2.2.5) and it worked for me
code="var g=orient.getGraph().command(\\'sql\\',\\'%s\\',[urlparam]);"
myFunction = "CREATE FUNCTION addURL '" + code + "' parameters [urlparam] idempotent false language javascrip"
client.command(myFunction);
Hope it helps

Multi-tenancy with SQLAlchemy

I've got a web-application which is built with Pyramid/SQLAlchemy/Postgresql and allows users to manage some data, and that data is almost completely independent for different users. Say, Alice visits alice.domain.com and is able to upload pictures and documents, and Bob visits bob.domain.com and is also able to upload pictures and documents. Alice never sees anything created by Bob and vice versa (this is a simplified example, there may be a lot of data in multiple tables really, but the idea is the same).
Now, the most straightforward option to organize the data in the DB backend is to use a single database, where each table (pictures and documents) has user_id field, so, basically, to get all Alice's pictures, I can do something like
user_id = _figure_out_user_id_from_domain_name(request)
pictures = session.query(Picture).filter(Picture.user_id==user_id).all()
This is all easy and simple; however, there are some disadvantages:
- I need to remember to always add the extra filter condition when making queries, otherwise Alice may see Bob's pictures;
- if there are many users, the tables may grow huge;
- it may be tricky to split the web application between multiple machines.
So I'm thinking it would be really nice to somehow split the data per-user. I can think of two approaches:
Have separate tables for Alice's and Bob's pictures and documents within the same database (Postgres schemas seem to be the correct approach to use in this case):
documents_alice
documents_bob
pictures_alice
pictures_bob
and then, using some dark magic, "route" all queries to one or to the other table according to the current request's domain:
_use_dark_magic_to_configure_sqlalchemy('alice.domain.com')
pictures = session.query(Picture).all() # selects all Alice's pictures from "pictures_alice" table
...
_use_dark_magic_to_configure_sqlalchemy('bob.domain.com')
pictures = session.query(Picture).all() # selects all Bob's pictures from "pictures_bob" table
Use a separate database for each user:
- database_alice
- pictures
- documents
- database_bob
- pictures
- documents
which seems like the cleanest solution, but I'm not sure if multiple database connections would require much more RAM and other resources, limiting the number of possible "tenants".
So, the question is, does it all make sense? If yes, how do I configure SQLAlchemy to either modify the table names dynamically on each HTTP request (for option 1) or to maintain a pool of connections to different databases and use the correct connection for each request (for option 2)?
After pondering jd's answer, I was able to achieve the same result with PostgreSQL 9.2, SQLAlchemy 0.8, and the Flask 0.9 framework:
from sqlalchemy import event
from sqlalchemy.pool import Pool
from flask import session

@event.listens_for(Pool, 'checkout')
def on_pool_checkout(dbapi_conn, connection_rec, connection_proxy):
    tenant_id = session.get('tenant_id')
    cursor = dbapi_conn.cursor()
    if tenant_id is None:
        cursor.execute("SET search_path TO public, shared;")
    else:
        cursor.execute("SET search_path TO t" + str(tenant_id) + ", shared;")
    dbapi_conn.commit()
    cursor.close()
OK, I've ended up modifying search_path at the beginning of every request, using Pyramid's NewRequest event:
from pyramid import events

def on_new_request(event):
    schema_name = _figure_out_schema_name_from_request(event.request)
    DBSession.execute("SET search_path TO %s" % schema_name)

def app(global_config, **settings):
    """ This function returns a WSGI application.
    It is usually called by the PasteDeploy framework during
    ``paster serve``.
    """
    ....
    config.add_subscriber(on_new_request, events.NewRequest)
    return config.make_wsgi_app()
This works really well, as long as you leave transaction management to Pyramid (i.e. do not commit/roll back transactions manually, letting Pyramid do that at the end of the request), which is fine, as committing transactions manually is not a good approach anyway.
What works very well for me is to set the search path at the connection pool level, rather than in the session. This example uses Flask and its thread-local proxies to pass the schema name, so you'll have to change schema = current_schema._get_current_object() and the try block around it.
from sqlalchemy.interfaces import PoolListener

class SearchPathSetter(PoolListener):
    '''
    Dynamically sets the search path on connections checked out from a pool.
    '''
    def __init__(self, search_path_tail='shared, public'):
        self.search_path_tail = search_path_tail

    @staticmethod
    def quote_schema(dialect, schema):
        return dialect.identifier_preparer.quote_schema(schema, False)

    def checkout(self, dbapi_con, con_record, con_proxy):
        try:
            schema = current_schema._get_current_object()
        except RuntimeError:
            search_path = self.search_path_tail
        else:
            if schema:
                search_path = self.quote_schema(con_proxy._pool._dialect, schema) + ', ' + self.search_path_tail
            else:
                search_path = self.search_path_tail
        cursor = dbapi_con.cursor()
        cursor.execute("SET search_path TO %s;" % search_path)
        dbapi_con.commit()
        cursor.close()
At engine creation time:
engine = create_engine(dsn, listeners=[SearchPathSetter()])
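For context, here is a minimal sketch (the subdomain convention and the use of flask.g are assumptions of mine, not from the answer) of how the per-request current_schema proxy referenced above might be populated in Flask:
from flask import Flask, g, request
from werkzeug.local import LocalProxy

app = Flask(__name__)

# current_schema resolves lazily per request; outside a request context the
# lookup raises RuntimeError, which checkout() above catches.
current_schema = LocalProxy(lambda: getattr(g, 'schema', None))

@app.before_request
def pick_schema():
    # hypothetical convention: tenant schema taken from the subdomain,
    # e.g. alice.domain.com -> "alice"
    g.schema = request.host.split('.')[0]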
