Public API with Private Elements in Python - python

I'm working on a web mapping service and would like to provide my users with a Python API that they can use to create custom plugins. These plugins would be running on my server so I'm trying to lock down Python as much as possible.
To ensure that users can't access files they are not supposed to, I'm planning on running the plugins inside of a PyPy sandboxed interpreter.
The last large hurdle I'm trying to overcome is the API itself. I have a database API that will allow the user to make controlled queries to the database. For example, I have a select(column, condition) function that will allow users to retrieve items from the database, but they will be restricted to one table. Then there is a private function, _query(sql), that takes raw SQL commands and executes them on the server.
The problem is that Python (at least to my knowledge) provides no way of preventing users from calling _query directly and nuking my database. In order to prevent this, I was thinking of breaking my API into two parts:
-------------   pipe   --------------      ------------
| publicAPI | -------> | privateAPI | ---> | database |
-------------          --------------      ------------
So publicAPI would basically be a proxy API communicating with the main API via a pipe. Because publicAPI would only contain proxies to the public API functions, users would be unable to get hold of the private elements (such as _query(sql)).
Do you think this is a viable solution? Am I making way too much work for myself by overlooking a simpler solution?
Thanks so much for your help!

This looks like a clean way to implement this to me. I believe it's also sometimes referred to as the "Facade" design pattern.
In Python this is very easy to implement using explicit method delegation (a short snippet to give you a general idea):
class FacadingAPI():
    def __init__(self, fullapi_instance):
        self.select = fullapi_instance.select  # method delegating
        ...
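A slightly fuller, runnable sketch of the same idea follows. FullAPI, its method bodies, and the plugin_data table name are hypothetical stand-ins, not part of the original question. Note that inside a single Python process the facade hides _query only by convention: a bound method keeps a reference to the object it came from, so this alone is probably not a security boundary.

```python
class FullAPI:
    """Hypothetical stand-in for the real database API."""

    def select(self, column, condition):
        # Illustration only: no escaping, and the table name is made up.
        return self._query(f"SELECT {column} FROM plugin_data WHERE {condition}")

    def _query(self, sql):
        # The real version would execute raw SQL; this one just echoes it.
        return f"executed: {sql}"


class FacadingAPI:
    def __init__(self, fullapi_instance):
        # Expose only the public method.
        self.select = fullapi_instance.select


full = FullAPI()
facade = FacadingAPI(full)

# The facade object itself has no _query attribute...
print(hasattr(facade, "_query"))
# ...but the delegated bound method still points back at the full API.
print(facade.select.__self__ is full)
```

This is one reason the pipe-based split from the question may still be worth the extra work: across a process boundary there is no object reference for the plugin to follow.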

There are many ways that you can do this. I tend to keep a tuple with all the function names that are available and then check whether the function being called is in the tuple; if it is not, I throw an error.
e.g.
funcs = ("available", "functions", "here")
if calledFunction in funcs:
    pass  # do something
else:
    raise ValueError("function not available: " + calledFunction)
Or you can take the approach that Google uses on their App Engine page:
http://code.google.com/appengine/articles/rpc.html
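The tuple check described above could be wrapped in a small dispatcher, roughly like this. DatabaseAPI, ALLOWED, and call_api are hypothetical names chosen for the sketch; the private side of a pipe could run the same check on each incoming request.

```python
class DatabaseAPI:
    """Hypothetical API object; only `select` is meant to be public."""

    def select(self, column, condition):
        return ("select", column, condition)

    def _query(self, sql):
        return ("raw", sql)


ALLOWED = ("select",)  # names that plugins may call


def call_api(api, name, *args):
    # Reject anything not explicitly whitelisted before touching the object.
    if name not in ALLOWED:
        raise PermissionError(f"{name!r} is not part of the public API")
    return getattr(api, name)(*args)


api = DatabaseAPI()
print(call_api(api, "select", "name", "id = 1"))
```

Calling call_api(api, "_query", ...) raises PermissionError instead of reaching the private method.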

Related

todoist - Fetching the list of projects with python library

I am specifically trying to get the project id given the project name. I saw in the API docs that api.sync() is supposed to return all the projects as an array under a key, which I was then planning to iterate through.
I tried using sync with the Python library, but my projects array is empty. Is it some sort of promise mechanism? If so, how do I wait for a success response in Python?
import todoist
api = todoist.TodoistAPI(token)
response = api.sync()
projects = response['projects']
for project in projects:
    # str() avoids a TypeError when the id is an integer
    print(project['name'] + '-' + str(project['id']))
The python library automatically syncs so the sync method response rarely contains anything useful BUT that information is contained in the api class which has an overridden _find_object method. Therefore you can use the same notation as a dict to find elements - i.e. api['projects'].
I know this is an old question but I'd been having this problem and was struggling to find an answer so hopefully this will be useful to someone at some point!
The library takes care of the sync for you. If you have already synced in the past, it stores a hash in $HOME/.todoist-sync/. I recommend clearing that directory and trying again.
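Once the project list is available, looking up an id by name is a plain loop. The sketch below assumes each project behaves like a mapping with 'name' and 'id' keys, as in the question's code; find_project_id and the sample data are made up for illustration.

```python
def find_project_id(projects, name):
    """Return the id of the first project whose 'name' matches, else None."""
    for project in projects:
        if project["name"] == name:
            return project["id"]
    return None


# Stand-in data; with the library this would be api['projects'],
# following the answer above.
sample_projects = [
    {"id": 101, "name": "Inbox"},
    {"id": 202, "name": "Work"},
]
print(find_project_id(sample_projects, "Work"))
```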

simple-salesforce convert lead

I'm pretty new to the Salesforce API. I've been using the Python module simple-salesforce to create leads. It works great, but it's really unclear to me how to do non-CRUD-like actions. For example, I want to programmatically convert a lead into an account.
The salesforce GUI makes this easy. One would simply open the lead, then click the convert button. Does anyone out there know how to do this with simple-salesforce?
UPDATE
I found this post describing the creation of an APEX resource: "Is there any REST service available in Saleforce to Convert Leads into Accounts?"
I'm hoping there is a more elegant way to achieve this but I'll post what I do with simple salesforce's apex support if that's what ends up happening.
It looks like the best way to deal with this problem is to create an APEX class as detailed in the linked post. After creating that class, you can use simple salesforce to query it like this:
conversion_result = sf.apexecute('Lead/{id}'.format(id=lead_result['id']), method='GET')
A tip to anyone trying this: please make sure you create the class in a sandbox account. I tried for a good 20 minutes to create the apex class in our production environment without realizing that salesforce doesn't let you do that.
After making the changes in your sandbox you need to upload them to production. Of course, the environments are not connected by default! Here is an explanation on how to allow uploads to your production environment.
UPDATE:
Here is the test class I created for the linked APEX class. Salesforce requires test classes with 75% coverage. This doesn't actually test any functionality, it just passes Salesforce's arbitrary requirements.
@isTest
class RestLeadConvertTest {
    @isTest static void testIt() {
        Lead lead = new Lead();
        lead.LastName = 'salesforce';
        lead.Company = 'unittest';
        insert lead;
        RestRequest req = new RestRequest();
        RestResponse res = new RestResponse();
        req.requestURI = '/services/apexrest/Lead/' + lead.Id; // Request URL
        req.httpMethod = 'GET'; // HTTP request type
        RestContext.request = req;
        RestContext.response = res;
        RestLeadConvert.doGet();
    }
}

Best way to parse arguments in a REST api

I am building a REST API based on Flask-RESTful, and I'm unsure what the best way is to parse the arguments I am expecting.
The info:
Table Schema:
-----------------------------------------
| ID | NAME | IP | MAIL | OS | PASSWORD |
-----------------------------------------
Method that does the job:
def update_entry(data, id):
    ...
    ...
    pass
The resource that handles the request:
def put(self, id):
    json_data = request.get_json(force=True)
    update_entry(json_data, id)
    pass
Json format:
{"NAME": "john", "OS": "windows"}
I have to mention that I do not know if all the above are relevant to my question.
Now what I would like to know is: where is the proper place to check whether the client sent the arguments I want and whether the keys in the request are valid?
I have thought of a couple of alternatives, but I have the feeling that I'm missing a best practice here.
1. Pass whatever the client sends to the backend, let an error happen, and catch it.
2. Create something like a JSON template and validate the client's request against it before passing it on.
Of course the first option is simpler, but the second doesn't create unnecessary load on my DB, although it might become quite complex.
Any opinion on either of the above two, or any other suggestion, is welcome.
Why don't you consider using a library like marshmallow, since the Flask-RESTful documentation suggests it? It will solve your problem in a proper, non-custom way, unlike writing the validation from scratch.
It's not a good idea to let your database API catch errors. Repeated DB access will hamper performance.
Best case scenario: if you can error-check the JSON at the client, do it.
Error-check the JSON in Python anyhow. You must always assume that the network is compromised and that you will get garbage values or malicious requests.
A design principle I read somewhere (Clean Code, I think) was to be strict on output but go easy on the input.
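As a rough illustration of checking the JSON in Python before it reaches the database, here is a library-free sketch of the "template" option from the question. The allowed keys come from the table schema above; the rule that every value must be a string, and the names ALLOWED_FIELDS and validate_update, are assumptions made for the example.

```python
# Columns a client may update, taken from the table schema; ID comes from the URL.
ALLOWED_FIELDS = {"NAME", "IP", "MAIL", "OS", "PASSWORD"}


def validate_update(payload):
    """Return a list of error messages; an empty list means the payload is acceptable."""
    if not isinstance(payload, dict):
        return ["payload must be a JSON object"]
    errors = []
    if not payload:
        errors.append("payload must contain at least one field")
    for key, value in payload.items():
        if key not in ALLOWED_FIELDS:
            errors.append(f"unknown field: {key}")
        elif not isinstance(value, str):
            # Assumption for this sketch: every column holds text.
            errors.append(f"{key} must be a string")
    return errors


print(validate_update({"NAME": "john", "OS": "windows"}))
print(validate_update({"AGE": 3}))
```

In the resource's put method, a non-empty result could be turned into a 400 response before update_entry is ever called. A schema library such as marshmallow covers the same ground with less hand-written code.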

Creating custom source for reading from cloud datastore using latest python apache_beam cloud dataflow sdk

Recently the Cloud Dataflow Python SDK was made available and I decided to use it. Unfortunately, support for reading from Cloud Datastore is yet to come, so I have to fall back on writing a custom source so that I can use the benefits of dynamic splitting, progress estimation, etc., as promised. I studied the documentation thoroughly but am unable to put the pieces together to speed up my entire process.
To be more clear, my first approach was:
1. querying the Cloud Datastore
2. creating a ParDo function and passing the returned query to it.
But with this it took 13 minutes to iterate over 200k entries.
So I decided to write a custom source that would read the entities efficiently, but I am unable to achieve that because I don't understand how to put the pieces together. Can anyone please help me with how to create a custom source for reading from Datastore?
Edited:
For first approach the link to my gist is:
https://gist.github.com/shriyanka/cbf30bbfbf277deed4bac0c526cf01f1
Thank you.
In the code you provided, the access to Datastore happens before the pipeline is even constructed:
query = client.query(kind='User').fetch()
This executes the whole query and reads all entities before the Beam SDK gets involved at all.
More precisely, fetch() returns a lazy iterable over the query results, and they get iterated over when you construct the pipeline, at beam.Create(query) - but, once again, this happens in your main program, before the pipeline starts. Most likely, this is what's taking 13 minutes, rather than the pipeline itself (but please feel free to provide a job ID so we can take a deeper look). You can verify this by making a small change to your code:
query = list(client.query(kind='User').fetch())
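To see how much of the time goes into that step, the materialization can be timed on its own. This is only a sketch: fake_fetch is a hypothetical stand-in for client.query(kind='User').fetch(), and timed_materialize is a helper name invented here.

```python
import time


def timed_materialize(iterable):
    """Consume an iterable into a list and report how long that took."""
    start = time.perf_counter()
    items = list(iterable)
    return items, time.perf_counter() - start


def fake_fetch(n):
    # Hypothetical stand-in for the lazy Datastore query results.
    for i in range(n):
        yield {"id": i}


entities, seconds = timed_materialize(fake_fetch(1000))
print(f"read {len(entities)} entities in {seconds:.3f}s before any pipeline exists")
```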
However, I think your intention was to both read and process the entities in parallel.
For Cloud Datastore in particular, the custom source API is not the best choice to do that. The reason is that the underlying Cloud Datastore API itself does not currently provide the properties necessary to implement the custom source "goodies" such as progress estimation and dynamic splitting, because its querying API is very generic (unlike, say, Cloud Bigtable, which always returns results ordered by key, so e.g. you can estimate progress by looking at the current key).
We are currently rewriting the Java Cloud Datastore connector to use a different approach, which uses a ParDo to split the query and a ParDo to read each of the sub-queries. Please see this pull request for details.

Is there a way to print out output in a pyramid view callable?

I am new to python and pyramid and I am trying to figure out a way to print out some object values that I am using in a view callable to get a better idea of how things are working. More specifically, I am wanting to see what is coming out of a sqlalchemy query.
DBSession.query(User).filter(User.name.like('%'+request.matchdict['search']+'%'))
I need to take that query and then look up what Office a user belongs to by the office_id attribute that is part of the User object. I was thinking of looping through the users that come up from that query and doing another query to look up the office information (in the offices table). I need to build a dictionary that includes some User information and some Office information then return it to the browser as json.
Is there a way that I can experiment with different attempts at this while viewing my output without having to rely on the browser. I am more of a front end developer so when I am writing javascript I just view my outputs using console.log(output).
console.log(output) is to JavaScript
as
????? is to Python (specifically pyramid view callable)
Hope the question is not dumb. Just trying to learn. Appreciate anyone's help.
This is a good reason to experiment with pshell, Pyramid's interactive python interpreter. From within pshell you can tinker with things on the command-line and see what they will do before adding them to your application.
http://docs.pylonsproject.org/projects/pyramid/en/1.4-branch/narr/commandline.html#the-interactive-shell
Of course, you can always use "print" and things will show up in the console. SQLAlchemy also has the sqlalchemy.echo ini option that you can turn on to see all queries. And finally, it sounds like you just need to do a join but maybe aren't familiar with how to write complex database queries, so I'd suggest you look into that before resorting to writing separate queries. Likely a single query can return you what you need.
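To make the join suggestion concrete, here is a small self-contained sketch. It uses the standard library's sqlite3 module instead of SQLAlchemy so that it runs anywhere; the table layout, column aliases, and sample rows are assumptions, with only office_id taken from the question. The final print plays the role of console.log.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows can be converted to dicts by column name
conn.executescript("""
    CREATE TABLE offices (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, office_id INTEGER);
    INSERT INTO offices VALUES (1, 'Berlin'), (2, 'Austin');
    INSERT INTO users VALUES (1, 'Joanna', 1), (2, 'Bob', 2), (3, 'Joan', 2);
""")

search = "Joa"
# One query returns user and office information together.
rows = conn.execute(
    "SELECT users.name AS user_name, offices.name AS office_name "
    "FROM users JOIN offices ON offices.id = users.office_id "
    "WHERE users.name LIKE ? ORDER BY users.id",
    ("%" + search + "%",),
).fetchall()

result = [dict(row) for row in rows]
print(result)  # shows up in the console where the server was started
```

A list of dictionaries like result is also the shape a Pyramid view can hand to a JSON renderer.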
