I am developing a DAG to be scheduled on Apache Airflow which main porpuse will be to post survey data (on json format) to an API and then getting a response (the answers to the surveys). Since this whole process is going to be automated, every part of it has to be programmed in the DAG, so I canĀ“t use Postman or any similar app (unless there is a way to automate their usage, but I don't know if this is possible).
I was thinking of using the requests library for Python, and the function I've written for posting the json to the API looks like this:
def postFileToAPI(**context):
print('uploadFileToAPI() ------ ')
json_file = context['ti'].xcom_pull(task_ids='toJson') ## this pulls the json file from a previous task
print('--------------- Posting survey request to API')
r = requests.post('https://[request]', data = json_file)
(I haven't finished defining the http link for the request because my source data is incomplete.)
However, since this is my frst time working with APIs and the requests library, I don't know if this is enough. For example, I'm unsure if I need to provide a token from the API to perform the request.
I also don't know if there are other libraries that are better suited for this or that could be a good support.
In short: I don't know if what I'm doing will work as intended, what other information I need t provide my DAG or if there are any libraries to make my work easier.
The Python requests package that you're using is all you need, except if you're making a request that needs extra authorisation - then you should also import for example requests_jwt (then from requests_jwt import JWTAuth) if you're using JSON web tokens, or whatever relevant requests package corresponds for your authorisation style.
You make POST and GET requests and all individual requests separately.
Include the URL and data arguments as you have done and that should work!
You may also need headers and/or auth arguments to get through security,
eg for the GitLab api for a private repository you would include these extra arguments, where GITLAB_TOKEN is a GitLab web token.
```headers={'PRIVATE-TOKEN': GITLAB_TOKEN},
auth=JWTAuth(GITLAB_TOKEN)```
If you just try it it should work, if it doesn't work then test the API with curl requests directly in the Terminal, or let us know :)
Related
I am new to working with APIs in general and am writing code in python that needs to consume/interact with an API someone else has set up. I was wondering if there is any package out there that would build some sort of custom client class to interact with an API given a file outlining the API in some way (like a json or something where each available endpoint and http verb could be outlined in terms of stuff like allowed payload json schema for posts, general params allowed and their types, expected response json schema, the header key/value for a business verb, etc.). It would be helpful if I could have one master file outlining the endpoints available and then some package uses that to generate a client class we can use to consume the API as described.
In my googling most API packages I have found in python are much more focused on the generation of APIs but this isn't what I want.
Basically I believe you are looking for the built in requests package.
response = requests.get(f'{base_url}{endpoint}',
params={'foo': self.bar,
'foo_2':self.bar_2},
headers={'X-Api-Key': secret}
)
And from here, you can build you own class, pass it to a dataframe or whatever.
In the requests package is basically everything you need. Status handling, exception handling everything you need.
Please check the docs.
https://pypi.org/project/requests/
I am working with the AWS Transcribe streaming service that boto3 does not support yet, so to make HTTP/2 requests, I need to manually setup the authorization header with the "AWS Signature Version 4"
I've found some example implementation, but I was hoping to just call whatever function boto3/botocore have implemented using the same configuration object.
Something like
session = boto3.Session(...)
auth = session.generate_signature('POST', '/stream-transcription', ...)
Any pointers in that direction?
Contrary to the AWS SDKs for most other programming languages, boto3/botocore don't offer the functionality to sign arbitrary requests using "AWS Signature Version 4" yet. However there is at least already an open feature request for that: https://github.com/boto/botocore/issues/1784
In this feature request, existing alternatives are discussed as well. One is the third-party Python library aws-requests-auth, which provides a thin wrapper around botocore and requests to sign HTTP-requests. That looks like the following:
import requests
from aws_requests_auth.boto_utils import BotoAWSRequestsAuth
auth = BotoAWSRequestsAuth(aws_host="your-service.domain.tld",
aws_region="us-east-1",
aws_service="execute-api")
response = requests.get("https://your-service.domain.tld",
auth=auth)
Another alternative presented in the feature request is to implement the necessary glue-code on your own, as shown in the following gist: https://gist.github.com/rhboyd/1e01190a6b27ca4ba817bf272d5a5f9a.
Did you check this SDK? Seems very recent but might do what you need.
https://github.com/awslabs/amazon-transcribe-streaming-sdk/tree/master
It looks like it handles the signing: https://github.com/awslabs/amazon-transcribe-streaming-sdk/blob/master/amazon_transcribe/signer.py
I have not tested this, but you can likely accomplish this by following along with with this SigV4 unit test:
https://github.com/boto/botocore/blob/master/tests/unit/test_auth_sigv4.py
Note, this constructs a request using the botocore.awsrequest.AWSRequest helper. You'll likely need to dig around to figure out how to send the actual HTTP request (perhaps with httpsession.py)
I'm coding an app which has to use this api. So I want to do at a certain point a search on their database. Now I'm struggling with which python library is the right one to use in order to authenticate about oAuth2? I couldn't find any by now, where I was sure, it would offer the necessary functions.
I wonder if this library (python-oauth2) offers, what I need. But this isn't a library for the client, is it? It seems it is for the server...
I'd be really grateful, if someone could just give me an advice, with what I should work.
Method 1
You will need to use the following modules. No need to use oauth. Just need to get the token before performing any search using the api.
requests, json, urllib
Here's a short Example code for that
import requests, json, urllib
BASE_URL = "http://scoilnet.com/grants/apikey/"
r = requests.post(_BASE_URL+"user/token/", data={'username': username , 'password': password })
print r.getcontent
The above code will show you how to request a token from the api. Using that token you will be making get and post requests to the api which will give a json response. That json response will be shown as a Dictionary from which you will load you data in your program.
Method 2
You can also use urllib or urllib2 or urllib3
I have a tiny Flask server that is supposed to load data from a file and run a function on it. This function will return a DataFrame and I return the json version of it. Much to my surprise this all works nicely. However, how would I test this? I have included some attempts below but I don't understand Flask (nor REST) well enough yet:
#!/home/thomas/python
from flask import Flask
from flask.ext.restful import Resource, Api
app = Flask(__name__)
api = Api(app)
class UniverseAPI(Resource):
def get(self):
import pandas as pd
frame = pd.read_csv("//datasrv10//data$//AQ//test.csv", index_col=0, header=0)
return frame.to_json()
api.add_resource(UniverseAPI, '/data/universe')
I am happy to include a few of my attempts here... I appreciate any hints. I have read the official documentation.
I should specify what I mean with testing. I can run this on my linux server and can extract all the required information with the requests package. However, I want to create a unittest that comes without the need to start the server on the localhost. I think I have managed with the FLASK test-client. However, the problem now is that the requests response object and the flask response object treat the underlying json strings rather differently. So I guess my problem is more related to json string issues rather than FLASK. Thanks for all your helpful feedback though
Well, the basics of writing a REST API are essentially a set of design principles. My understanding of it is based on this article by Miguel Grinberg, http://blog.miguelgrinberg.com/post/designing-a-restful-api-with-python-and-flask .
In it, he talks about how a REST API is:
"Stateless" - All interactions with the service can happen using the information from one request.
Built upon accessing "resources" from URIs using HTTP requests like GET, PUTS, and POST. A resource could be an order in a store, a task in a web app, or whatever you like.
There's also a bunch of stuff about how the server should standardize all forms of communication between itself and the client, indicate whether it can do cacheing, and other stuff like that. From an initial design standpoint, though, this is "the point" as he put it:
"The task of designing a web service or API that adheres to the REST guidelines then becomes > an exercise in identifying the resources that will be exposed and how they will be affected > by the different request methods."
If you're looking for an interesting example of a REST API that might be suited to your interests (I know it is to mine), reddit's is open source. It's a relatable example to see how they try and structure the interactions behind requests: http://www.reddit.com/dev/api
I am new to Flask.
I have a public api, call it api.example.com.
#app.route('/api')
def api():
name = request.args.get('name')
...
return jsonify({'address':'100 Main'})
I am building an app on top of my public api (call it www.coolapp.com), so in another app I have:
#app.route('/make_request')
def index():
params = {'name':'Fred'}
r = requests.get('http://api.example.com', params=params)
return render_template('really_cool.jinja2',address=r.text)
Both api.example.com and www.coolapp.com are hosted on the same server. It seems inefficient the way I have it (hitting the http server when I could access the api directly). Is there a more efficient way for coolapp to access the api and still be able to pass in the params that api needs?
Ultimately, with an API powered system, it's best to hit the API because:
It's user testing the API (even though you're the user, it's what others still access);
You can then scale easily - put a pool of API boxes behind a load balancer if you get big.
However, if you're developing on the same box you could make a virtual server that listens on localhost on a random port (1982) and then forwards all traffic to your api code.
To make this easier I'd abstract the API_URL into a setting in your settings.py (or whatever you are loading in to Flask) and use:
r = requests.get(app.config['API_URL'], params=params)
This will allow you to make a single change if you find using this localhost method isn't for you or you have to move off one box.
Edit
Looking at your comments you are hoping to hit the Python function directly. I don't recommend doing this (for the reasons above - using the API itself is better). I can also see an issue if you did want to do this.
First of all we have to make sure the api package is in your PYTHONPATH. Easy to do, especially if you're using virtualenvs.
We from api import views and replace our code to have r = views.api() so that it calls our api() function.
Our api() function will fail for a couple of reasons:
It uses the flask.request to extract the GET arg 'name'. Because we haven't made a request with the flask WSGI we will not have a request to use.
Even if we did manage to pass the request from the front end through to the API the second problem we have is using the jsonify({'address':'100 Main'}). This returns a Response object with an application type set for JSON (not just the JSON itself).
You would have to completely rewrite your function to take into account the Response object and handle it correctly. A real pain if you do decide to go back to an API system again...
Depending on how you structure your code, your database access, and your functions, you can simply turn the other app into package, import the relevant modules and call the functions directly.
You can find more information on modules and packages here.
Please note that, as Ewan mentioned, there's some advantages to using the API. I would advise you to use requests until you actually need faster requests (this is probably premature optimization).
Another idea that might be worth considering, depending on your particular code, is creating a library that is used by both applications.