Unique index in MongoDB using db.command - Python

I am using MongoDB with PyMongo and I would like to separate the schema definition from the rest of the application. I have a file user_schema.json with the schema for the user collection:
{
    "collMod": "user",
    "validator": {
        "$jsonSchema": {
            "bsonType": "object",
            "required": ["name"],
            "properties": {
                "name": {
                    "bsonType": "string"
                }
            }
        }
    }
}
Then in db.py:
with open("user_schema.json", "r") as coll:
    data = OrderedDict(json.loads(coll.read()))  # Read JSON schema.

name = data["collMod"]  # Get name of collection.
db.create_collection(name)  # Create collection.
db.command(data)  # Add validation to the collection.
Is there any way to add a unique index to the name field in the user collection without changing db.py (only by changing user_schema.json)? I know I can use this:
db.user.create_index("name", unique=True)
however, then I have information about the collection in two places. I would like to have all configuration of the collection in the user_schema.json file. I need something like this:
{
    "collMod": "user",
    "validator": {
        "$jsonSchema": {
            ...
        }
    },
    "index": {
        "name": {
            "unique": true
        }
    }
}

No, you won't be able to do that without changing db.py.
The contents of user_schema.json are a single object that can be passed to db.command to run the collMod command.
In order to create an index, you need to run the createIndexes command, or one of the helpers that calls this.
It is not possible to complete both of these actions with a single command.
A simple modification would be to store an array of commands to run in user_schema.json, and have db.py iterate the array and run each command.
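That last suggestion could be sketched like this. The array layout of user_schema.json and the createIndexes entry shown in the docstring are assumptions about how you might restructure the file, not anything PyMongo mandates:

```python
import json
from collections import OrderedDict

def run_schema_commands(db, path):
    """Run every database command listed in a JSON file, in order.

    Assumes the file holds a JSON array of command documents, e.g.:
    [{"create": "user"},
     {"collMod": "user", "validator": {"$jsonSchema": {...}}},
     {"createIndexes": "user",
      "indexes": [{"key": {"name": 1}, "name": "name_1", "unique": true}]}]

    object_pairs_hook=OrderedDict preserves key order, which matters
    because db.command treats the first key as the command name.
    """
    with open(path) as f:
        commands = json.load(f, object_pairs_hook=OrderedDict)
    for command in commands:
        db.command(command)
```

db.py then shrinks to a single call, and everything about the collection (creation, validator, indexes) lives in the JSON file.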

Related

JSON Schema validation in Python

I wonder if it is possible to have a Swagger JSON with all schemas in one file to be validated across all tests.
Assume somewhere in a long .json file a schema is specified like so:
...
"Stop": {
    "type": "object",
    "properties": {
        "id": {
            "description": "id identifier",
            "type": "string"
        },
        "lat": {
            "$ref": "#/components/schemas/coordinate"
        },
        "lng": {
            "$ref": "#/components/schemas/coordinate"
        },
        "name": {
            "description": "Name of location",
            "type": "string"
        }
    },
    "required": [
        "id",
        "name",
        "lat",
        "lng"
    ]
}
...
So lat and lng schemas are defined in another schema (the same file at the top).
Now I want to get response from backend with array of those stops and would like to validate it against the schema.
How should I approach it?
I am trying to load a partial schema and validate against it, but then the $ref won't resolve. Also, is there any way to tell the validator to accept arrays? The schema only describes how a single object looks. If I manually add
"type": "array",
"items": {
    ... the Stop schema, with the coordinates hardcoded ...
}
Then it seems to work.
Here are my functions to validate arbitrary response JSON against chosen "chunk" of the full Swagger schema:
def pick_schema(schema):
    with open("json_schemas/full_schema.json", "r") as f:
        jschema = json.load(f)
    return jschema["components"]["schemas"][schema]

def validate_json_with_schema(json_data, json_schema_name):
    validate(
        json_data,
        schema=pick_schema(json_schema_name),
        format_checker=jsonschema.FormatChecker(),
    )
Maybe another approach is preferred? Separate files with schemas for API responses (it is quite tedious to write, full_schema.json is generated from Swagger itself).
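One way to keep a single full_schema.json and still validate arrays is to wrap the chunk in an array schema and hand the full document to a RefResolver, so nested $refs like #/components/schemas/coordinate still resolve. A sketch (note RefResolver is deprecated in newer jsonschema releases in favour of the referencing library, but it still works):

```python
import jsonschema

def validate_array_of(json_data, full_schema, schema_name):
    """Validate a list of objects against one named schema chunk.

    The chunk stays inside the full schema document, so $ref pointers
    such as #/components/schemas/coordinate resolve normally.
    """
    array_schema = {
        "type": "array",
        "items": {"$ref": f"#/components/schemas/{schema_name}"},
    }
    # Resolve all $ref lookups against the full schema, not the chunk.
    resolver = jsonschema.RefResolver(base_uri="", referrer=full_schema)
    jsonschema.validate(json_data, array_schema, resolver=resolver)
```

This avoids both problems at once: the chunk is never extracted from its document (so $ref resolves), and the array wrapper is built on the fly instead of being hardcoded into the schema file.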

PyMongo: Is there a way to add data to an existing document in MongoDB using Python?

I have a database 'Product' which contains a collection named 'ProductLog'. Inside this collection, there are 2 documents in the following format:
{
    "environment": "DevA",
    "data": [
        {
            "Name": "ABC",
            "Stream": "Yes"
        },
        {
            "Name": "ZYX",
            "Stream": "Yes"
        }
    ]
},
{
    "environment": "DevB",
    "data": [
        {
            "Name": "ABC",
            "Stream": "Yes"
        },
        {
            "Name": "ZYX",
            "Stream": "Yes"
        }
    ]
}
This gets added as 2 documents to the collection. I want to append more data to the 'data' field of an already existing document in MongoDB using Python. Is there a way to do that? I guess an update would remove the existing entries in the "data" field, or replace the whole document.
For example: adding one more array to the EmployeeDetails field, while the earlier data in EmployeeDetails also remains.
I want to show how you can append more data in the already existing document 'data' field in MongoDB using python:
First install PyMongo:
pip install pymongo
Now let's get our hands dirty:
from pymongo import MongoClient

mongo_uri = "mongodb://user:pass@mongosrv:27017/"
client = MongoClient(mongo_uri)
database = client["Product"]
collection = "ProductLog"

database[collection].update_one(
    {"environment": "DevB"},
    {"$push": {"data": {"Name": "DEF", "Stream": "NO"}}},
)
If you need to insert entirely new documents rather than update existing ones, PyMongo also provides insert_one and insert_many.
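If you need to append several entries at once, $push accepts an $each modifier. A small helper that just builds the update document (the entry values below are made up for illustration):

```python
def push_entries(entries):
    """Build an update document appending several items to "data".

    $push with $each adds all the given elements in one operation,
    leaving the existing array contents untouched.
    """
    return {"$push": {"data": {"$each": list(entries)}}}
```

Usage with the collection from above would look like: database[collection].update_one({"environment": "DevA"}, push_entries([{"Name": "DEF", "Stream": "No"}, {"Name": "GHI", "Stream": "Yes"}])).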

read-only user in solr not working as expected

I would like to create two users in solr: An admin and a dev.
The dev should not be able to edit the Solr metadata. This user should not be able to use solr.add or solr.delete; I would like him only to be able to use solr.search on our metadata Solr core (using pysolr in Python).
However, the user can always use solr.add and solr.delete, no matter what permissions I set for him. Here is my security.json:
{
    "authentication": {
        "blockUnknown": true,
        "class": "solr.BasicAuthPlugin",
        "credentials": {
            "my_admin": "<creds>",
            "my_dev": "<creds>"
        },
        "forwardCredentials": false,
        "": {"v": 0}
    },
    "authorization": {
        "class": "solr.RuleBasedAuthorizationPlugin",
        "user-role": {
            "my_admin": ["admin"],
            "my_dev": ["dev"]
        },
        "permissions": [
            {
                "name": "security-edit",
                "role": ["admin"],
                "index": 1
            },
            {
                "name": "read",
                "role": ["dev"],
                "index": 2
            }
        ],
        "": {"v": 0}
    }
}
I also tried zk-read, metrics-read, security-read, and collection-admin-read instead of read, always with the same result: the user my_dev can always use solr.add and solr.delete.
So, thanks to @MatsLindh, I got it. I changed the security.json as follows and now it works as expected!
{
    "authentication": {
        "blockUnknown": true,
        "class": "solr.BasicAuthPlugin",
        "credentials": {
            "my_admin": "<...>",
            "my_dev": "<...>",
            "my_user": "<...>"
        },
        "forwardCredentials": false,
        "": {"v": 0}
    },
    "authorization": {
        "class": "solr.RuleBasedAuthorizationPlugin",
        "user-role": {
            "my_admin": ["admin"],
            "my_dev": ["dev"],
            "my_user": ["user"]
        },
        "permissions": [
            { "name": "update", "role": ["admin", "dev"] },
            { "name": "read", "role": ["admin", "dev", "user"] },
            { "name": "security-edit", "role": ["admin"] },
            { "name": "all", "role": ["admin"] }
        ]
    }
}

Azure VM provisioning and user-assigned identity

I am trying to apply a user-assigned identity to the VM I am creating with the Python SDK. The code:
print("Creating VM " + resource_name)
compute_client.virtual_machines.begin_create_or_update(
    resource_group_name,
    resource_name,
    {
        "location": "eastus",
        "storage_profile": {
            "image_reference": {
                # Image ID can be retrieved from `az sig image-version show -g $RG -r $SIG -i $IMAGE_DEFINITION -e $VERSION --query id -o tsv`
                "id": "/subscriptions/..image id"
            }
        },
        "hardware_profile": {
            "vm_size": "Standard_F8s_v2"
        },
        "os_profile": {
            "computer_name": resource_name,
            "admin_username": "adminuser",
            "admin_password": "somepassword",
            "linux_configuration": {
                "disable_password_authentication": True,
                "ssh": {
                    "public_keys": [
                        {
                            "path": "/home/adminuser/.ssh/authorized_keys",
                            # Add the public key for a key pair that can get access to SSH to the runners
                            "key_data": "ssh-rsa …"
                        }
                    ]
                }
            }
        },
        "network_profile": {
            "network_interfaces": [
                {
                    "id": nic_result.id
                }
            ]
        },
        "identity": {
            "type": "UserAssigned",
            "user_assigned_identities": {
                "identity_id": { myidentity }
            }
        }
    }
)
The last part, the identity block, I found somewhere on the web (not sure where), but it fails with a strange set/get error when I try to use it. The VM is created fine if I comment out the identity block, but I need the user-assigned identity. I spent the better part of today trying to find information on the options for begin_create_or_update and on the identity piece, but I have had no luck. I am looking for help on how to apply a user-assigned identity to the VM I am creating with Python.
The set and get error occurs because you are declaring the identity block in the wrong way.
If you have an existing user-assigned identity, then you can use the identity block as below:
"identity": {
    "type": "UserAssigned",
    "user_assigned_identities": {
        "/subscriptions/948d4068-xxxxxxxxxxxxxxx/resourceGroups/ansumantest/providers/Microsoft.ManagedIdentity/userAssignedIdentities/mi-identity": {}
    }
}
As you can see, the entry inside user_assigned_identities has to be:
'User Assigned Identity ResourceID':{}
instead of
"identity_id":{'User Assigned Identity ResourceID'}
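Since the key is the resource ID itself, it can help to build the block with a small helper rather than hand-writing the nested dict. A sketch (the resource ID you pass in is whatever `az identity show --query id` returns for your identity):

```python
def user_assigned_identity(identity_resource_id):
    """Build the identity block for begin_create_or_update.

    The user_assigned_identities mapping is keyed by the full ARM
    resource ID of the identity; the value is an empty dict.
    """
    return {
        "type": "UserAssigned",
        "user_assigned_identities": {identity_resource_id: {}},
    }
```

The VM parameters then become `{..., "identity": user_assigned_identity(my_identity_id)}`, which avoids the set-literal mistake from the question.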

Python Parse JSON objects in Different Way

I'm working on a Python project in which we are getting some inputs from the user.
We are working on microservice deployments. The user needs to provide the following things:
1) The user will provide a GitHub repo which includes, inside a specific directory, all of the microservices he wants to deploy.
For example, we have a directory structure in GitHub repo like this:
mysvcs
|----nodeservice
|----pyservice
2) The user will provide a JSON object in which he mentions the URL for this repo and some other information for these microservices, like this:
{
    "repo_url": "https://github.com/arycloud/mysvcs.git",
    "services": [
        {
            "name": "pyservice",
            "routing": {
                "path": "/",
                "service": "pyservice",
                "port": "5000"
            }
        },
        {
            "name": "nodeservice",
            "routing": {
                "path": "/",
                "service": "nodeservice",
                "port": "8080"
            }
        }
    ]
}
Then we read all the services from the GitHub repo and use their directories to read the source code. Along with that, we parse the JSON object to get some information regarding these services.
We are reading the repo like this:
tempdir = tempfile.mkdtemp()
saved_unmask = os.umask(0o077)
out_dir = os.path.join(tempdir)
Repo.clone_from(data['repo_url'], out_dir)
list_dir = os.listdir(out_dir)
print(list_dir)

services = []
for svc in range(0, len(data['services'])):
    services.append(list_dir[svc])
print(services)
According to our example above, it will return:
['nodeservice', 'pyservice']
But in the JSON object the user may have mentioned the services in a different order rather than alphabetically. When we loop through the services, we use the same index for the JSON object's services and for the list of directories from the cloned GitHub repo, so with different orders the data gets interchanged.
Here's a sample code:
def my_deployment(data):
    # data is a JSON object
    # Clone the GitHub repo and grab the Dockerfiles
    tempdir = tempfile.mkdtemp()
    saved_unmask = os.umask(0o077)
    out_dir = os.path.join(tempdir)
    Repo.clone_from(data['repo_url'], out_dir)
    list_dir = os.listdir(out_dir)
    print(list_dir)

    services = []
    for svc in range(0, len(data['services'])):
        services.append(list_dir[svc])
    print(services)

    for service in range(len(services)):
        # Here I need to use the data from the JSON object for the current service
        data['services'][service]['routing']['port']
        # Here it's using the data of **pyservice** instead of **nodeservice** and vice versa.
Important: The order of services in GitHub is ['nodeservice', 'pyservice'], but in the JSON object the user can mention his services in a different order, like pyservice, nodeservice. So when we loop through, how can we sync the order of both of these sources? This is the main issue.
I have tried it by changing the structure of my JSON object in this way:
"services": [
    "pyservice": {
        "routing": {
            "path": "/",
            "service": "pyservice",
            "port": "5000"
        }
    },
    "nodeservice": {
        "routing": {
            "path": "/node",
            "service": "nodeservice",
            "port": "8080"
        }
    }
]
But it says syntax is not correct.
How can I overcome this issue?
Thanks in advance!
You are thinking too complicated.
for svc in data['services']:
    print(svc['name'], svc['routing']['port'])
Done.
General observation: You seem to cling to loop indexes. Don't. It's a good thing that Python loops have no indexes.
Whenever you are tempted to write
for thing in range(len(some_list)):
stop, and write
for thing in some_list:
instead.
The reason the JSON is not valid is that you can't have name/value pairs in a JSON array. This page tells you an array can be:
A comma-delimited list of unnamed values, either simple or complex, enclosed in brackets
Is the below JSON any use?
{
    "repo_url": "https://github.com/arycloud/mysvcs.git",
    "services": [
        {
            "pyservice": {
                "routing": {
                    "path": "/",
                    "service": "pyservice",
                    "port": "5000"
                }
            }
        },
        {
            "nodeservice": {
                "routing": {
                    "path": "/node",
                    "service": "nodeservice",
                    "port": "8080"
                }
            }
        }
    ]
}
If you want services sorted alphabetically, you can do the following:
services = data["services"]
b = {}
for node in services:
    b.update(dict(node))
alphabetical_list = sorted(b)
Note:
This gives you a list ['nodeservice', 'pyservice'] which you can then use to get the object in b.
Here is an approach we can use to overcome this order-sync issue:
First, the directories in a GitHub repo are listed alphabetically by default, so if we sort the array of services in the JSON object we will get the same index for both sources. To be sure, we can sort both of these sources alphabetically.
Here's the code:
First sort the services array of JSON object as:
data['services'] = sorted(data["services"], key=lambda d: d["name"])
By considering the example in the question, it will give us:
services = [
    {
        "nodeservice": {
            "A": "B"
        }
    },
    {
        "pyservice": {
            "X": "Y"
        }
    }
]
Then we will sort the list of directories from GitHub repo like this:
Repo.clone_from(data['repo_url'], out_dir)
list_dir = os.listdir(out_dir)
print(list_dir)

services = []
for svc in range(0, len(data['services'])):
    services.append(list_dir[svc])
services.sort()
print(services)
It will give us: ['nodeservice', 'pyservice'] according to the example above in the question.
So in both cases we have nodeservice first and then pyservice: the same order.
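Another way to avoid depending on order at all is to index the JSON entries by name and look each directory up in that mapping (a sketch using the JSON shape from the question, with the "name" field per service):

```python
def routing_by_service(data):
    """Map service name -> routing config.

    With a name lookup, neither the os.listdir order nor the order
    of entries in the JSON file matters.
    """
    return {svc["name"]: svc["routing"] for svc in data["services"]}
```

Then, while iterating the cloned directories, `routing_by_service(data)[dir_name]["port"]` always yields the config for that directory, whatever order the user wrote the services in.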
