I am new to ORMs and am trying to query a table that has a timestamp column. My results are empty, understandably, because I am filtering a timestamp field with a date. I read that I can use func (from sqlalchemy.sql import func), but my filter is built dynamically from query params, so I am not sure how to apply it.
Code for query model:
def merch_trans_sum_daily_summaries(db_engine, query_params):
    query_filters = get_filters(query_params)
    page, limit, filters = query_filters.page, query_filters.limit, query_filters.filters
    strict_limit = query_filters.strict_limit
    with Session(db_engine) as sess:
        results = paginate(sess.query(MerchTransSumDaily)
                           .filter_by(**filters).yield_per(1000),
                           page, limit, strict_limit)
        metadata = results.metadata
        query_data = results.data
        if not query_data:
            raise exc.NoResultFound
        data = [record._asdict() for record in query_data]
        return data, metadata
Here is my get_filters function:
def get_filters(query_parameters, strict_limit=100, default_page=1):
    if query_parameters and "batch_type" in query_parameters:
        query_parameters.pop('batch_type')
    limit = int(query_parameters["limit"]) if query_parameters and "limit" in query_parameters else strict_limit
    page = int(query_parameters["page"]) if query_parameters and "page" in query_parameters else default_page
    # Default to an empty dict so that filter_by(**filters) still works without params
    filters = {}
    if query_parameters:
        filters = {key_: value_ for key_, value_ in query_parameters.items() if key_ not in ["page", "limit", "paginate", "filter"]}
    return QueryFilters(limit, page, filters, strict_limit)
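One common way to match a date against a timestamp column is to turn the date into a half-open range (midnight up to, but not including, the next midnight) and filter on that range instead of on equality. Below is a minimal sketch, assuming the date arrives as an ISO "YYYY-MM-DD" string. The helper name split_date_filters, the column name created_at and the sample params are hypothetical, not taken from the code above.

```python
from datetime import date, datetime, time, timedelta


def split_date_filters(filters, date_keys):
    """Separate plain equality filters from date filters.

    Returns (equality, ranges), where ranges maps each date key to a
    half-open (start, end) pair of datetimes covering that whole day.
    """
    equality, ranges = {}, {}
    for key, value in filters.items():
        if key in date_keys:
            day = date.fromisoformat(value)  # assumes "YYYY-MM-DD" input
            start = datetime.combine(day, time.min)
            ranges[key] = (start, start + timedelta(days=1))
        else:
            equality[key] = value
    return equality, ranges


equality, ranges = split_date_filters(
    {"merchant_id": "42", "created_at": "2022-03-07"},  # hypothetical params
    date_keys={"created_at"},
)
# Applying it to the query would then look roughly like:
#   query = sess.query(MerchTransSumDaily).filter_by(**equality)
#   for name, (start, end) in ranges.items():
#       column = getattr(MerchTransSumDaily, name)
#       query = query.filter(column >= start, column < end)
```

A range comparison also generally lets the database use an ordinary index on the timestamp column, which wrapping the column in func.date(...) usually does not.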
db.comments.find({"_id" : {"$gte": ObjectId("6225f932a7bce76715a9f3bd"), "$lt":ObjectId("6225f932a7bce76715a9f3bd")}}).sort({"created_datetime":1}).limit(10).pretty()
I am using this query, which should give me the current "6225f932a7bce76715a9f3bd" doc, the 4 docs inserted before it, and the 5 docs inserted after it. But when I run it, I get an empty result. Where am I going wrong?
I had no other option but to separate my queries to get the result I expected.
query = request.args.to_dict()
find_query = {}
find_query["_id"] = {"$lt": ObjectId(query["comment_id"])}
previous_comments = list(db.comments.find(find_query))
find_query["_id"] = {"$gte": ObjectId(query["comment_id"])}
next_comments = list(db.comments.find(find_query))
previous_comments.extend(next_comments)
return {"comments":previous_comments}
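The original query returns nothing because no _id can be both $gte and $lt the same value. The two-query approach can also be tightened so that each side is sorted and limited, rather than fetching everything before and after the target. This is a sketch, assuming a PyMongo-style collection and that _id order is an acceptable stand-in for insertion order (ObjectIds begin with a timestamp). The function name and the default counts are illustrative.

```python
def surrounding_comments(collection, comment_id, before=4, after=5):
    """Return up to `before` older docs, the target doc, and up to `after` newer docs.

    `collection` is expected to behave like a PyMongo collection and
    `comment_id` to be an already constructed ObjectId.
    Note: `before` should be positive, because limit(0) means "no limit".
    """
    # Older docs: walk backwards from the target, then restore ascending order.
    older = list(
        collection.find({"_id": {"$lt": comment_id}}).sort("_id", -1).limit(before)
    )
    older.reverse()
    # The target doc plus the newer docs, already in ascending order.
    newer = list(
        collection.find({"_id": {"$gte": comment_id}}).sort("_id", 1).limit(after + 1)
    )
    return older + newer

# e.g. surrounding_comments(db.comments, ObjectId(query["comment_id"]))
```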
I have a pretty reasonable use case: Multiple possible filter_by matches for a single column. Basically, a multiselect JS dropdown on front end posts multiple company industries to the backend. I need to know how to write the SQLAlchemy query and am surprised at how I couldn't find it.
{ filters: { type: "Industry", minmax: false, value: ["Financial Services", "Biotechnology"] } }
@app.route("/dev/api/saved/symbols", methods=["POST"])
@cross_origin(origin="*")
def get_saved_symbols():
    req = request.get_json()
    # res = None
    # if "minmax" in req["filters"]:
    #     idx = req["filters"].index("minmax")
    #     if req["filters"][idx] == "min":
    #         res = db.session.query.filter(Company[req["filter"]["type"]] >= req["filters"]["value"])
    #     else:
    #         res = db.session.query.filter(Company[req["filter"]["type"]] <= req["filters"]["value"])
    # else:
    res = db.session.query.filter_by(Company[req["filters"]["type"]] == req["filters"]["value"])
    return jsonify(res)
As you can see I am also working on a minmax which is like an above or below filter for other columns like price or market cap. However, the multiselect OR dynamic statement is really what I am stuck on...
I ended up creating a separate filter function for this, whose results I can then loop over.
I will show only the first case for brevity. The front end sends a list of strings; I build one filter per string and combine them with the or_ operator imported from the sqlalchemy package.
def company_filter(db, filter_type, filter_value, minmax):
    match filter_type:
        case "industry":
            filter_list = []
            for value in filter_value:
                filter_list.append(Company.industry == value)
            return db.query(Company).with_entities(Company.id, Company.symbol, Company.name, Company.monthly_exp).filter(or_(*filter_list))
...
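For the multiselect case, SQLAlchemy columns also offer in_(), which expresses the same OR over one column in a single call, and the min/max case can reuse Python's operator module. This is a sketch rather than the original author's method. The helper build_condition and the "min"/"max" flag values are assumptions based on the commented-out code in the question.

```python
import operator

# Map the front end's "minmax" flag to a comparison operator (assumed values).
COMPARATORS = {"min": operator.ge, "max": operator.le}


def build_condition(column, value, minmax=None):
    """Build a filter condition for one column.

    `column` is expected to be a SQLAlchemy column attribute such as
    Company.industry, which can be looked up with getattr(Company, name).
    """
    if minmax in COMPARATORS:
        # e.g. column >= value for "min", column <= value for "max"
        return COMPARATORS[minmax](column, value)
    if isinstance(value, (list, tuple, set)):
        # Multiselect: column IN (v1, v2, ...)
        return column.in_(value)
    return column == value

# Usage with the payload from the question would look roughly like:
#   column = getattr(Company, "industry")
#   condition = build_condition(column, ["Financial Services", "Biotechnology"])
#   query = db.query(Company).filter(condition)
```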
I have a Lambda function that needs to retrieve an item from DynamoDB and update a counter on that item.
The DynamoDB table is structured as:
id: int
options: map
    some_option: 0
    some_other_option: 0
I need to first retrieve the item of the table that has a certain id and a certain option listed as a key in the options.
Then I want to increment that counter by some value.
Here is what I have so far:
import boto3
from botocore.exceptions import ClientError

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('options')
response = None
try:
    response = table.get_item(Key={'id': id})
except ClientError as e:
    print(e.response['Error']['Message'])
# response stays None if get_item failed, so guard before reading it
option = response.get('Item') if response else None
if option:
    option['options'][some_option] = int(option['options'][some_option]) + some_value
    # how to update item in DynamoDB now?
My issue is how to update the record now and, more importantly, whether such a solution will cause data races. Could 2 simultaneous Lambda calls that try to update the same option on the same item race with each other? If so, how can that be solved?
Any pointers/help is appreciated.
Ok, I found the answer:
All I need is:
response = table.update_item(
    Key={
        'id': my_id,
    },
    UpdateExpression='SET options.#s = options.#s + :val',
    ExpressionAttributeNames={
        "#s": my_option
    },
    ExpressionAttributeValues={
        ':val': Decimal(some_value)
    },
    ReturnValues="UPDATED_NEW"
)
This is inspired by Step 3.4: Increment an Atomic Counter, which provides an atomic approach to incrementing values. According to the documentation:
DynamoDB supports atomic counters, which use the update_item method to
increment or decrement the value of an existing attribute without
interfering with other write requests. (All write requests are applied
in the order in which they are received.)
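If the update should only go through when the option is already a key of the map, a ConditionExpression can be added to the same call, which makes that requirement explicit. The sketch below only builds the arguments. The helper name build_increment_kwargs is hypothetical, and the resulting dict would be passed as table.update_item(**kwargs).

```python
from decimal import Decimal


def build_increment_kwargs(item_id, option, amount):
    """Arguments for an atomic increment of options.<option> on one item."""
    return {
        "Key": {"id": item_id},
        "UpdateExpression": "SET options.#s = options.#s + :val",
        # Reject the write if the option is not already a key of the map.
        "ConditionExpression": "attribute_exists(options.#s)",
        "ExpressionAttributeNames": {"#s": option},
        "ExpressionAttributeValues": {":val": Decimal(amount)},
        "ReturnValues": "UPDATED_NEW",
    }


kwargs = build_increment_kwargs(7, "some_option", 3)
# response = table.update_item(**kwargs)
```

When the condition fails, boto3 raises a ClientError whose error code is ConditionalCheckFailedException, so the caller can tell a missing option apart from other failures.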
In this SO question I learnt that I cannot delete a Cosmos DB document using SQL.
Using Python, I believe I need the DeleteDocument() method. This is how I'm getting the document IDs that are required (I believe) to then call the DeleteDocument() method.
# set up the client
client = document_client.DocumentClient()
# use a SQL based query to get a bunch of documents
query = { 'query': 'SELECT * FROM server s' }
result_iterable = client.QueryDocuments('dbs/DB/colls/coll', query, options)
results = list(result_iterable)
for x in range(0, len(results)):
    docID = results[x]['id']
Now, at this stage I want to call DeleteDocument().
The inputs into which are document_link and options.
I can define document_link as something like
document_link = 'dbs/DB/colls/coll/docs/'+docID
And successfully call ReadAttachments() for example, which has the same inputs as DeleteDocument().
When I do however, I get an error...
The partition key supplied in x-ms-partitionkey header has fewer
components than defined in the the collection
...and now I'm totally lost
UPDATE
Following on from Jay's help, I believe I'm missing the partitionKey element in the options.
In this example, I've created a testing database, it looks like this
So I think my partition key is /testPART
When I include the partitionKey in the options however, no results are returned, (and so print len(results) outputs 0).
Removing partitionKey means that results are returned, but the delete attempt fails as before.
# Query them in SQL
query = { 'query': 'SELECT * FROM c' }
options = {}
options['enableCrossPartitionQuery'] = True
options['maxItemCount'] = 2
options['partitionKey'] = '/testPART'
result_iterable = client.QueryDocuments('dbs/testDB/colls/testCOLL', query, options)
results = list(result_iterable)
# should be > 0
print len(results)
for x in range(0, len(results)):
    docID = results[x]['id']
    print docID
    client.DeleteDocument('dbs/testDB/colls/testCOLL/docs/'+docID, options=options)
    print 'deleted', docID
Based on your description, I tried using the pydocumentdb module to delete a document in my Azure DocumentDB, and it works for me.
Here is my code:
import pydocumentdb
import pydocumentdb.document_client as document_client
config = {
    'ENDPOINT': 'Your url',
    'MASTERKEY': 'Your master key',
    'DOCUMENTDB_DATABASE': 'familydb',
    'DOCUMENTDB_COLLECTION': 'familycoll'
}
# Initialize the Python DocumentDB client
client = document_client.DocumentClient(config['ENDPOINT'], {'masterKey': config['MASTERKEY']})
# use a SQL based query to get a bunch of documents
query = { 'query': 'SELECT * FROM server s' }
options = {}
options['enableCrossPartitionQuery'] = True
options['maxItemCount'] = 2
result_iterable = client.QueryDocuments('dbs/familydb/colls/familycoll', query, options)
results = list(result_iterable)
print(results)
client.DeleteDocument('dbs/familydb/colls/familycoll/docs/id1',options)
print 'delete success'
Console Result:
[{u'_self': u'dbs/hitPAA==/colls/hitPAL3OLgA=/docs/hitPAL3OLgABAAAAAAAAAA==/', u'myJsonArray': [{u'subId': u'sub1', u'val': u'value1'}, {u'subId': u'sub2', u'val': u'value2'}], u'_ts': 1507687788, u'_rid': u'hitPAL3OLgABAAAAAAAAAA==', u'_attachments': u'attachments/', u'_etag': u'"00002100-0000-0000-0000-59dd7d6c0000"', u'id': u'id1'}, {u'_self': u'dbs/hitPAA==/colls/hitPAL3OLgA=/docs/hitPAL3OLgACAAAAAAAAAA==/', u'myJsonArray': [{u'subId': u'sub3', u'val': u'value3'}, {u'subId': u'sub4', u'val': u'value4'}], u'_ts': 1507687809, u'_rid': u'hitPAL3OLgACAAAAAAAAAA==', u'_attachments': u'attachments/', u'_etag': u'"00002200-0000-0000-0000-59dd7d810000"', u'id': u'id2'}]
delete success
Please note that you need to set the enableCrossPartitionQuery property to True in options if your documents are spread across partitions.
Must be set to true for any query that requires to be executed across
more than one partition. This is an explicit flag to enable you to
make conscious performance tradeoffs during development time.
You can find the description above here.
Update Answer:
I think you misunderstand the meaning of the partitionKey property in the options.
For example, my container is created like this:
My documents as below :
{
    "id": "1",
    "name": "jay"
}
{
    "id": "2",
    "name": "jay2"
}
My partition key is 'name', so here I have two partitions: 'jay' and 'jay2'.
So here you should set the partitionKey property to 'jay' or 'jay2', not 'name'.
Please modify your code as below:
options = {}
options['enableCrossPartitionQuery'] = True
options['maxItemCount'] = 2
options['partitionKey'] = 'jay'  # change this to your own partition key value
result_iterable = client.QueryDocuments('dbs/db/colls/testcoll', query, options)
results = list(result_iterable)
print(results)
Hope it helps you.
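Putting the two points together: the query runs across partitions with enableCrossPartitionQuery, while each delete passes the partition key value of the document being deleted. The sketch below assumes a pydocumentdb DocumentClient and a collection partitioned on /testPART, so that every document carries a testPART field. The function name delete_all is hypothetical.

```python
def delete_all(client, collection_link, partition_key_field):
    """Delete every document in a partitioned collection.

    `client` is expected to be a pydocumentdb DocumentClient.
    Returns the ids of the deleted documents.
    """
    query = {'query': 'SELECT * FROM c'}
    # Querying needs the cross-partition flag, not a partitionKey.
    query_options = {'enableCrossPartitionQuery': True}
    # Materialize the results first so deletes do not interfere with paging.
    docs = list(client.QueryDocuments(collection_link, query, query_options))
    deleted = []
    for doc in docs:
        # Deleting needs the partition key *value* of this document.
        delete_options = {'partitionKey': doc[partition_key_field]}
        client.DeleteDocument(collection_link + '/docs/' + doc['id'], delete_options)
        deleted.append(doc['id'])
    return deleted

# e.g. delete_all(client, 'dbs/testDB/colls/testCOLL', 'testPART')
```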
Using the azure.cosmos library:
Install and import the azure-cosmos package:
from azure.cosmos import exceptions, CosmosClient, PartitionKey
Define a function that deletes items, in this case using the partition key in the query:
def deleteItems(deviceid):
    client = CosmosClient(config.cosmos.endpoint, config.cosmos.primarykey)
    # Create the database if it does not exist
    database = client.create_database_if_not_exists(id='azure-cosmos-db-name')
    # Create the container if it does not exist
    # Using a good partition key improves the performance of database operations.
    container = database.create_container_if_not_exists(id='container-name', partition_key=PartitionKey(path='/your-partition-path'), offer_throughput=400)
    # Fetch items
    query = f"SELECT * FROM c WHERE c.device.deviceid IN ('{deviceid}')"
    items = list(container.query_items(query=query, enable_cross_partition_query=False))
    for item in items:
        # 'partition-key' is a placeholder for the item's partition key value
        container.delete_item(item, 'partition-key')
Usage:
deviceid = 10
deleteItems(deviceid)
github full example here: https://github.com/eladtpro/python-iothub-cosmos
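A variation on the function above passes the device id as a query parameter instead of formatting it into the SQL string, and reads each item's partition key value from the item itself. This is a sketch, assuming an azure.cosmos ContainerProxy and a container partitioned on a top-level field. The function name and the partition_key_field argument are hypothetical.

```python
def delete_device_items(container, deviceid, partition_key_field):
    """Delete all items for one device and return how many were deleted.

    `container` is expected to be an azure.cosmos ContainerProxy;
    `partition_key_field` is the top-level item field the container is
    partitioned on (an assumption about the data model).
    """
    items = list(container.query_items(
        query="SELECT * FROM c WHERE c.device.deviceid = @deviceid",
        parameters=[{"name": "@deviceid", "value": deviceid}],
        enable_cross_partition_query=True,
    ))
    for item in items:
        # delete_item needs the partition key value of this particular item.
        container.delete_item(item=item, partition_key=item[partition_key_field])
    return len(items)

# e.g. delete_device_items(container, 10, "your-partition-field")
```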