Query google datastore by key in gcloud api - python

I'm trying to query for some data using the gcloud api that I just discovered. I'd like to query for a KeyPropery. e.g.:
from google.appengine.ext import ndb

class User(ndb.Model):
    email = ndb.StringProperty()

class Data(ndb.Model):
    user = ndb.KeyProperty('User')
    data = ndb.JsonProperty()
In GAE, I can query this pretty easily assuming I have a user's key:
user = User.query(User.email == 'me@domain.com').get()
data_records = Data.query(Data.user == user.key).fetch()
I'd like to do something similar using gcloud:
from gcloud import datastore
client = datastore.Client(project='my-project-id')
user_qry = client.query(kind='User')
user_qry.add_filter('email', '=', 'me@domain.com')
users = list(user_qry.fetch())
user = users[0]
data_qry = client.query(kind='Data')
data_qry.add_filter('user', '=', user.key) # This doesn't work ...
results = list(data_qry.fetch()) # results = []
Looking at the documentation for add_filter, it doesn't appear that Entity.key is a supported type:
value (int, str, bool, float, NoneType, datetime.datetime) – The value to filter on.
Is it possible to add filters for key properties?
I've done a bit more sleuthing to try to figure out what is really going on here. I'm not sure that this is helpful for me to understand this issue at the present, but maybe it'll be helpful for someone else.
I've mocked out the underlying calls in the respective libraries to record the protocol buffers that are being serialized and sent to the server. For GAE, it appears to be Batch.create_async in the datastore_query module.
For gcloud, it is the datastore.Client.connection.run_query method. Looking at the resulting protocol buffers (anonymized), I see:
gcloud query pb.
kind {
  name: "Data"
}
filter {
  composite_filter {
    operator: AND
    filter {
      property_filter {
        property {
          name: "user"
        }
        operator: EQUAL
        value {
          key_value {
            partition_id {
              dataset_id: "s~app-id"
            }
            path_element {
              kind: "User"
              name: "user_string_id"
            }
          }
        }
      }
    }
  }
}
GAE query pb.
kind: "Data"
Filter {
op: 5
property <
name: "User"
value <
ReferenceValue {
app: "s~app-id"
PathElement {
type: "User"
name: "user_string_id"
}
}
>
multiple: false
>
}
The two libraries are using different versions of the proto as far as I can tell, but the data being passed looks very similar...

This is a subtle bug with your use of the ndb library:
All ndb properties accept a single positional argument that specifies the property's name in Datastore
Looking at your model definition, you'll see user = ndb.KeyProperty('User'). This isn't actually saying that the user property is a key of a User entity, but that it should be stored in Datastore with the property name User. You can verify this in your GAE protocol buffer query, where the property name is (case-sensitive) User.
If you want to limit the key to a single kind, you need to specify it using the kind option.
user = ndb.KeyProperty(kind="User")
The KeyProperty also supports:
user = ndb.KeyProperty(User) # User is a class here, not a string
Here is a description of all the magic.
As it is now, your gcloud query is filtering on the wrongly cased property name user; it should be:
data_qry = client.query(kind='Data')
data_qry.add_filter('User', '=', user.key)
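To see why the positional argument bites here, consider a toy sketch (not the real ndb API; the class and names are illustrative) of how KeyProperty interprets its arguments:

```python
class ToyKeyProperty:
    """Toy model (not the real ndb API) of KeyProperty's argument handling."""

    def __init__(self, arg=None, kind=None):
        if isinstance(arg, str):
            # A positional *string* sets the stored property name, not the kind.
            self.name, self.kind = arg, kind
        elif arg is not None:
            # A positional *class* restricts the kind instead.
            self.name, self.kind = None, arg.__name__
        else:
            self.name, self.kind = None, kind


class User:
    pass


p1 = ToyKeyProperty('User')       # stored under the name "User"; kind unrestricted
p2 = ToyKeyProperty(kind='User')  # stored under the attribute's own name; kind "User"
p3 = ToyKeyProperty(User)         # class positional -> kind "User"
```

This mirrors the behavior described above: only the kind= form (or a class passed positionally) restricts the key's kind, while a bare string changes the stored name.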

Related

How to access a FastAPI Depends value from a Pydantic validator?

Let's say I have a route that allows clients to create a new user
(pseudocode)
#app.route("POST")
def create_user(user: UserScheme, db: Session = Depends(get_db)) -> User:
...
and my UserScheme accepts a field such as an email. I would like to be able to set some settings (for example max_length) globally in a different model Settings. How do I access that inside a scheme? I'd like to access the db inside my scheme.
So basically my scheme should look something like this (the given code does not work):
class UserScheme(BaseModel):
    email: str

    @validator("email")
    def validate_email(cls, value: str) -> str:
        settings = get_settings(db)  # `db` should be set somehow
        if len(value) > settings.email_max_length:
            raise ValueError("Your mail might not be that long")
        return value
I couldn't find a way to somehow pass db to the scheme. I was thinking about validating such fields (the ones that depend on db) inside my route. While this approach works after a fashion, the error is not attached to the specific field but to the entire form; it should be reported on the correct field so that frontends can display it properly.
One option is to accept arbitrary JSON objects as input, and then construct a UserScheme instance manually inside the route handler:
@app.route(
    "POST",
    response_model=User,
    openapi_extra={
        "requestBody": {
            "content": {
                "application/json": {
                    "schema": UserScheme.schema(ref_template="#/components/schemas/{model}")
                }
            }
        }
    },
)
async def create_user(request: Request, db: Session = Depends(get_db)) -> User:
    settings = get_settings(db)
    user_data = await request.json()
    user_schema = UserScheme(settings=settings, **user_data)
Note that this idea was borrowed from https://stackoverflow.com/a/68815913/2954547, and I have not tested it myself.
In order to facilitate the above, you might want to redesign this class so that the settings object itself is an attribute of the UserScheme model. That way you never need to perform database access or other effectful operations inside the validator, and you also can't instantiate a UserScheme without some kind of sensible settings in place, even if they are fallbacks or defaults.
class SystemSettings(BaseModel):
    ...

def get_settings(db: Session) -> SystemSettings:
    ...

EmailAddress = typing.NewType('EmailAddress', str)

class UserScheme(BaseModel):
    settings: SystemSettings

    if typing.TYPE_CHECKING:
        email: EmailAddress
    else:
        email: str | EmailAddress

    @validator("email")
    def _validate_email(cls, value: str, values: dict[str, typing.Any]) -> EmailAddress:
        if len(value) > values['settings'].max_email_length:
            raise ValueError('...')
        return EmailAddress(value)
The use of typing.NewType isn't necessary here, but I think it's a good tool in situations like this. Note that the typing.TYPE_CHECKING trick is required to make it work, as per https://github.com/pydantic/pydantic/discussions/4823.
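Since typing.NewType can look magical, it is worth noting that at runtime it is just an identity function; wrapping the validated value in EmailAddress costs nothing and hands back the same str:

```python
import typing

EmailAddress = typing.NewType('EmailAddress', str)

# At runtime NewType returns its argument unchanged; only static type
# checkers distinguish EmailAddress from a plain str.
addr = EmailAddress("user@example.com")
print(type(addr).__name__)  # str
```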

Partial update of an Object field in MongoDB document FastAPI

I am trying to update only given fields in an Object. In this case the object is a user whose schema looks like this:
class BasicUser(BaseModel):
    timestamp: datetime = Field(default_factory=datetime.now)
    name: str = Field(...)
    surname: str = Field(...)
    email: EmailStr = Field(...)
    phone: constr(
        strip_whitespace=True,
        regex=r"^[\+]?[(]?[0-9]{3}[)]?[-\s\.]?[0-9]{3}[-\s\.]?[0-9]{4,6}$",
    )
    age: PositiveInt = Field(...)
This user is only a part of the entire document. The full document looks like this:
I am using FastAPI and I want to receive only the fields that need updating from the frontend,
my endpoint looks like this:
@router.put('/user/{id}', response_description='Updates a user', response_model=DocumentWithoutPassword)
async def updateUser(id: str, user: BasicUserUpdate = Body(...)):
    new_user = {k: v for k, v in user.dict().items() if v is not None}
    new_user = jsonable_encoder(new_user)
    if len(new_user) >= 1:
        update_result = await dbConnection.update_one({"_id": id}, {"$set": {"user": new_user}})
the body of the request can include any number of fields (includes only the fields that need updating)
for example the body could look like:
{
    "email" : "abcd123@gmail.com"
}
or
{
    "email" : "abcd123@gmail.com",
    "phone" : "+123456789"
}
The problem with the above code is that when the request arrives, instead of updating the fields, it overwrites the entire user with only the email (if the email was sent) or the email and phone (if they were both sent).
So my question is: how can I update specific values in user without overwriting everything else? e.g. if I send {"email" : "abcd123@gmail.com"} as the body, it should update only the email and leave everything else as-is, but instead the entire user object is replaced.
Assuming that the generated new_user object does indeed only contain the fields to change, nested inside the user field of the document (which sounds like it is the case), then probably the most straightforward option here is to use an aggregation pipeline to describe the update. This approach gives you access to a variety of pipeline operators, notably the $mergeObjects operator. Change the following line:
update_result = await dbConnection.update_one({"_id": id}, {"$set": { "user" : new_user}})
To something like:
update_result = await dbConnection.update_one({"_id": id}, [ { $set: { user: { $mergeObjects: [ "$user", new_user ] } } } ])
Should yield the results that you want. See a demonstration of it at this playground link.
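The effect of $mergeObjects here is a shallow merge: top-level keys in new_user replace the matching keys of the stored user subdocument, and every other key is kept. In plain Python terms (a sketch of the semantics with made-up field values, not a MongoDB call):

```python
# Existing "user" subdocument in the database (illustrative values).
stored_user = {
    "name": "Jane",
    "surname": "Doe",
    "email": "old@example.com",
    "phone": "+111111111",
}
# Partial update built from the request body.
new_user = {"email": "abcd123@gmail.com"}

# {"$set": {"user": new_user}} replaces the whole subdocument.
overwritten = new_user

# {"$mergeObjects": ["$user", new_user]} keeps the unlisted fields.
merged = {**stored_user, **new_user}

print(merged["email"])    # abcd123@gmail.com
print(merged["surname"])  # Doe
```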
One general question does come to mind. If these documents represent users, then is there any particular value in nesting all of the fields underneath a parent user field? It probably doesn't make a big difference one way or another in the end, but certainly the fact that this question was asked helps demonstrate some minor additional friction that can be encountered by nesting data (especially if it is unnecessary).

The alias field of the Pydantic Model schema is by default in swagger instead of the original field

I need to receive data from an external platform (Cognito) that uses PascalCase, and the Pydantic model supports this through field aliases; by adding alias_generator = to_camel in the config, every field gets a corresponding PascalCase alias.
In this way, the model:
class AuthenticationResult(BaseModel):
    access_token: str
    expires_in: int
    token_type: str
    refresh_token: str
    id_token: str

    class Config:
        alias_generator = to_camel
        allow_population_by_field_name = True
it can receive the following dictionary without the slightest problem:
data = {
    "AccessToken": "myToken",
    "ExpiresIn": 0,
    "TokenType": "string",
    "RefreshToken": "string",
    "IdToken": "string"
}

auth_data = AuthenticationResult(**data)
print(auth_data.access_token)
# Output: myToken
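to_camel itself is not shown in the question; a minimal generator producing the PascalCase aliases used above could look like this (an illustrative sketch, not necessarily the exact helper in use):

```python
def to_camel(field_name: str) -> str:
    # "access_token" -> "AccessToken": capitalize each underscore-separated part.
    return "".join(part.capitalize() for part in field_name.split("_"))

print(to_camel("access_token"))   # AccessToken
print(to_camel("refresh_token"))  # RefreshToken
```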
However, in the application's Swagger documentation the schema also appears in PascalCase, even though it should be snake_case, which is strange since by_alias is False by default.
I need the schema to be in snake_case to send to the client. How can I do this while still allowing the model to be built from a PascalCase dictionary?
Might be easiest to create a sub model inheriting from your main model to set the alias generator on, then use that model for validation and the first one to generate the schema.

mongoengine - filter by "OR" on a single field in a single query

Here's a (really) rough example of what I'm looking for - given a document with the following schema:
from mongoengine import Document, StringField, DateTimeField

class Client(Document):
    name = StringField()
    activated_on = DateTimeField(required=False)
How would I query it for a client that was never activated or activated before a certain point in time?
In other words, both of the documents would show up in the results if I searched for entries without an activation date or one that occurred before 2016-07-22.
{ "name": "Bob Lawbla" }
{ "name": "Gerry Mander", "activated_on": 2016-07-01T00:00:00 }
I know I can do:
Client.objects(activated_on__lte=datetime.datetime(2016,7,22))
and
Client.objects(activated_on__exists=False)
but how do I combine them into one query?
You can use the Q class:
from mongoengine.queryset.visitor import Q as MQ

Client.objects(MQ(activated_on__exists=False) | MQ(activated_on__lte=datetime.datetime(2016, 7, 22)))

Sailsjs route and template rendering

This question is directed to anyone with both flask (python) and sailsjs knowledge. I am very new to the concept of web frameworks and such. I started using flask but now I must use Sailsjs. In flask, I can define a route as:
@app.route('/company/<org_name>')
def myfunction(org_name):
    # ...use org_name to filter my database and get data for that company...
    return render_template('companies.html', org_name=org_name, mydata=mydata)
Where I can use myfunction() to render a template in which I can pass the parameters org_name and mydata.
In sails, I am confused as to how to define my route with a given parameter. I understand:
'/': {
    view: 'companies'
}
but I am not sure how to make the route dynamic in order to accept any variable org_name.
Another problem is that in python, mydata is a query from a MySQL database. I have the same data base connected to Sailsjs with the model completely set up but I am sure this model is useless. The site that I am creating will not be producing any new data (i.e. I will neither be updating nor saving new data to the database).
My Question is thus: with
'/company/:org_name': {
    view: 'companies'
}
Where should I create the function that filters the database? How can I be sure that Sails will pass that org_name parameter into the function, and how should I pass the data as a parameter into an HTML template?
Thanks a ton.
There are 2 options here but it helps to explain a bit about each in order for you to pick the best course of action.
Firstly, routes...You can indeed as you have shown render a view from a route directly, but you can also do a few other things. Below are 2 snippets from the sails.js website docs:
module.exports.routes = {
    'get /signup': { view: 'conversion/signup' },
    'post /signup': 'AuthController.processSignup',
    'get /login': { view: 'portal/login' },
    'post /login': 'AuthController.processLogin',
    '/logout': 'AuthController.logout',
    'get /me': 'UserController.profile'
}
'get /privacy': {
    view: 'users/privacy',
    locals: {
        layout: 'users'
    }
},
Snippet 1 shows how you can render a view directly as you have shown but also how you can point to a controller in order to do some more complex logic.
The logic within your controller could mean that for the /me "GET" route you can execute a database query within the "profile" method to accept a GET parameter, find a user and then display a view with the user's data within. An example of that would be:
profile: function (req, res) {
    User.find({name: req.param('name')}).exec(function founduser(err, result) {
        return res.view('profile', {userdata: result});
    });
}
In the second snippet from the sails docs you can see "locals" being mentioned. Here, in the GET privacy route, the view is being told whether or not to use the layout template. With that said, there is nothing stopping you pushing more into the locals, such as a user's name etc.
In my opinion and what I feel is best practice, I would leave your routes.js to be quite thin and logicless, put the database queries/logic/redirections in to your controller.
For your specific example:
My routes.js file may look like this
// config/routes.js
module.exports.routes = {
    'get /companies': 'CompanyController.index',
    'get /companies/:orgname': 'CompanyController.company'
}
This allows the first route to show a list of companies by going to /companies, while the second route fires when a GET request is made by clicking a company name, e.g. /companies/Siemens.
My CompanyController.js for these may look like this:
module.exports = {
    index: function (req, res) {
        return res.view('companieslist');
    },
    company: function (req, res) {
        var companyName = req.param('orgname'); // Get the company name supplied
        // My company model, which could be in api/models as Company.js, is used to find the company
        Company.find({name: companyName}).limit(1).exec(function companyresult(err, result) {
            // Error catch
            if (err) { return res.negotiate(err); }
            // The result of our query is pushed into the view as a local
            return res.view('company', {companydata: result}); // Within the .find callback to ensure we keep async
        });
    }
};
In my view I can access the data retrieved under "companydata" e.g. for EJS:
<%=companydata[0].name%>
If you need any further help/clarifications let me know. I do recommend taking a look at the sails.js documentation, but if you really want to get your head around things I recommend Sails.js in Action, an ebook from Manning. A couple of days of reading really got me up to speed!
