Document found on table but not present on a global secondary index

I'm really wondering what's going on here.
We have stored documents in the (abbreviated) form:
{
"a_certain_id": "259217078123",
"name": "company name",
"vat_number": "BE0912111111"
}
The pk in the table is the "vat_number" property, but we also have a global secondary index on "a_certain_id".
When we query the table with the VAT number, we get the document as expected.
We then query the secondary index with the "a_certain_id" value copy-pasted from the document, and we find no document.
We then perform a scan with the VAT number described above, and again no document is found.
I can only conclude that the document doesn't exist in the index!
Is there a way to manage this, such as repopulating the index, or is there something wrong with the chosen hash key / pk? There shouldn't be, according to the documentation.
We do the queries in the following form:
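from boto3.dynamodb.conditions import Key
# Equality condition on the hash key of the table or index being queried
key_condition_expression = Key(hash_key).eq(hash_value)
query_args = {"IndexName": index, "KeyConditionExpression": key_condition_expression}
result = dynamo_table.query(**query_args)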
That should not matter, though, as we get the same result via boto3 and via the AWS console.
The query works for most companies; it only fails for the companies that are apparently not in the index.
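
One thing that may be worth ruling out before concluding the item is missing (this is an assumption about a possible cause, not a diagnosis of the case above): Query and Scan responses are paginated, and when a filter is applied a page can come back with no items while still carrying a LastEvaluatedKey. Note also that an item only appears in a global secondary index if it actually has the index key attribute. The sketch below follows every page; query_all_pages is a hypothetical helper name, and table is assumed to behave like a boto3 Table resource.

```python
def query_all_pages(table, **query_args):
    """Collect the items from every page of a DynamoDB Query.

    `table` is assumed to behave like a boto3 Table resource: query() returns
    a dict with "Items" and, when more pages exist, "LastEvaluatedKey".
    """
    items = []
    while True:
        page = table.query(**query_args)
        items.extend(page.get("Items", []))
        last_key = page.get("LastEvaluatedKey")
        if last_key is None:
            # No further pages: everything matching has been collected.
            return items
        # Resume the next request where the previous page stopped.
        query_args["ExclusiveStartKey"] = last_key
```

With the names from the question, this would be called as query_all_pages(dynamo_table, IndexName=index, KeyConditionExpression=key_condition_expression).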

Related

DynamoDB Query for users with expired IP addresses

So I have a DynamoDB database table which looks like this (exported to csv):
"email (S)","created_at (N)","firstName (S)","ip_addresses (L)","lastName (S)","updated_at (N)"
"name#email","1628546958.837838381","ddd","[ { ""M"" : { ""expiration"" : { ""N"" : ""1628806158"" }, ""IP"" : { ""S"" : ""127.0.0.1"" } } }]","ddd","1628546958.837940533"
I want to be able to do a "query", not a "scan", for all of the IPs (an attribute attached to users) that are expired. The time is stored in Unix time.
Right now I'm scanning the entire table and looking through each user, one by one, and then I loop through all of their IPs to see if they are expired or not. But I need to do this using a query, because scans are expensive.
The table layout is like this:
primaryKey = email
attributes = firstName, lastName, ip_addresses (array of {} maps where each map has IP, and Expiration as two keys).
I have no idea how to do this using a query so I would greatly appreciate if anyone could show me how! :)
I'm currently running the scan using python and boto3 like this:
response = client.scan(
    TableName='users',
    Select='SPECIFIC_ATTRIBUTES',
    AttributesToGet=[
        'ip_addresses',
    ])

As per the boto3 documentation, the Query operation finds items based on primary key values. You can query any table or secondary index that has a composite primary key (a partition key and a sort key).
Use the KeyConditionExpression parameter to provide a specific value for the partition key. The Query operation will return all of the items from the table or index with that partition key value. You can optionally narrow the scope of the Query operation by specifying a sort key value and a comparison operator in KeyConditionExpression. To further refine the Query results, you can optionally provide a FilterExpression. A FilterExpression determines which items within the results should be returned to you. All of the other results are discarded.
Long story short, a query only fetches items whose partition key value you specify in the request, so on its own it cannot find the expired IPs across all users.
A Query operation always returns a result set. If no matching items are found, the result set will be empty.
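
As a minimal sketch of what that means for the table in the question: a query can fetch the IP list of one known user by partition key, and the expiry check then happens on the client. The helper names build_user_query and expired_ips are hypothetical, and the item layout is assumed to be the low-level typed format shown in the CSV export above.

```python
import time


def build_user_query(email, table_name="users"):
    """Keyword arguments for client.query() that fetch one user's IP list."""
    return {
        "TableName": table_name,
        # A Query needs an equality condition on the partition key.
        "KeyConditionExpression": "#pk = :email",
        "ProjectionExpression": "#ips",
        "ExpressionAttributeNames": {"#pk": "email", "#ips": "ip_addresses"},
        "ExpressionAttributeValues": {":email": {"S": email}},
    }


def expired_ips(item, now=None):
    """Return the IPs of a low-level (typed) item whose expiration has passed."""
    now = time.time() if now is None else now
    entries = item.get("ip_addresses", {}).get("L", [])
    return [
        entry["M"]["IP"]["S"]
        for entry in entries
        if float(entry["M"]["expiration"]["N"]) < now
    ]
```

A call would look like client.query(**build_user_query("name#email")), followed by expired_ips(item) for each returned item. Finding expired IPs across all users without knowing the email still requires a scan or a different table design.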

How to get dynamodb to only return certain columns

Hello, I have a simple dynamodb table here filled with placeholder values.
How would I go about retrieving only sort_number, current_balance and side with a query/scan?
I'm using python and boto3, however, just stating what to configure for each of the expressions and parameters is also enough.

Within the Boto3 SDK you can use:
get_item if you're trying to retrieve a specific item
query if you're trying to get items from a single partition (the hash key)
scan if you're trying to retrieve items from across multiple partitions
Each of these has a parameter named ProjectionExpression, which the documentation describes as follows:
A string that identifies one or more attributes to retrieve from the specified table or index. These attributes can include scalars, sets, or elements of a JSON document. The attributes in the expression must be separated by commas.
You specify the attributes that you want to retrieve, comma separated. Be aware that this does not reduce the RCU cost of the operation.
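
A minimal sketch of building that parameter in Python follows. The helper name projection_kwargs is hypothetical; it uses ExpressionAttributeNames placeholders so that attribute names that happen to collide with DynamoDB reserved words do not break the expression.

```python
def projection_kwargs(attribute_names):
    """Build ProjectionExpression arguments for get_item, query, or scan."""
    # Map a placeholder such as "#attr0" to each real attribute name.
    placeholders = {f"#attr{i}": name for i, name in enumerate(attribute_names)}
    return {
        # The expression is a single comma-separated string, not a list.
        "ProjectionExpression": ", ".join(placeholders),
        "ExpressionAttributeNames": placeholders,
    }
```

Assuming table is a boto3 Table resource, the attributes from the question could then be requested with table.scan(**projection_kwargs(["sort_number", "current_balance", "side"])).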

import boto3

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('tablename')
response = table.scan(
    AttributesToGet=['id']
)
This works, but AttributesToGet is deprecated (a legacy parameter); using ProjectionExpression is recommended instead.

To return only some fields, use ProjectionExpression in the query parameters object. This is a single string listing the fields, separated by commas:
var params = {
    TableName: 'TableName',
    KeyConditionExpression: '#pk = :pk AND #sk = :sk',
    ExpressionAttributeValues: {
        ':pk': pk,
        ':sk': sk,
    },
    ExpressionAttributeNames: {
        '#sk': 'sk',
        '#pk': 'pk'
    },
    ProjectionExpression: 'sort_number, current_balance, side'
};

Firebase Python: How to query for range of children if child values cannot be sorted?

I am very new to Firebase, so forgive me if my question is not well thought out.
I am trying to query for the N-th entry of a database. For example, if I have a database called 'dinosaurs', I should be able to query for the 10th entry, or a range of entries, such as 10th to 100th.
In the Firebase documentation, there are a few ways to query for specific entry or entries like so:
Querying for specific matches according to the child:
ref = db.reference()
ref.order_by_child('height').equal_to(25).get()
or querying for a range of starting and ending index values:
ref.order_by_key().start_at('b').end_at(u'b\uf8ff').get()
But my database, discover-db, has its (first-level) child populated with random alphanumeric characters, followed by second-level children such as "request", "results", and "timestamp".
I am able to query for top/bottom N values, by using the following code:
db.reference('discover-db').order_by_key().limit_to_last(N).get()
But how do I query for, say, 10th to 100th entries in the database, without the ability to sort the child values using order_by_child('height')?

I've figured it out. Indexing was not enabled in the database rules:
{
    "rules": {
        "dinosaurs": {
            ".indexOn": ["height", "length"]
        }
    }
}
Doing this solved the problem of querying by timestamp or by another child of the child entries.
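
For the original question of fetching the 10th to 100th entries, one possible approach (a sketch that is not part of the answer above) is to request the first 100 entries, for example with db.reference('discover-db').order_by_key().limit_to_first(100).get(), and drop the first nine on the client. The helper name entries_in_range is hypothetical, and it assumes the result is an ordered mapping of keys to entries.

```python
from itertools import islice


def entries_in_range(ordered_result, start, end):
    """Return the start-th to end-th entries (1-based, inclusive).

    `ordered_result` is assumed to be an ordered mapping, such as the value
    returned by a query limited to the first `end` entries.
    """
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    # islice uses 0-based, end-exclusive positions.
    return dict(islice(ordered_result.items(), start - 1, end))
```

This still downloads the entries before the requested range, so it is only reasonable when the starting offset is small.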

How to select all data in PyMongo?

I want to select all data, or select with a condition, from the collection random, but I can't find any guide for doing this with MongoDB in Python.
I also can't display the data that was selected.
Here is my code:
from pymongo import MongoClient

def mongoSelectStatement(result_queue):
    client = MongoClient('mongodb://localhost:27017')
    db = client.random
    cursor = db.random.find({"gia_tri": "0.5748676522161966"})
    # cursor = db.random.find()
    # Note: Cursor.count() was removed in PyMongo 4; use count_documents() there.
    inserted_documents_count = cursor.count()
    for document in cursor:
        result_queue.put(document)

There is quite comprehensive documentation for MongoDB. For Python (PyMongo), here is the URL: https://api.mongodb.org/python/current/
Note: consider the version you are running, since the latest version has new features and functions.
To verify which PyMongo version you are using, execute the following:
import pymongo
pymongo.version
Now, regarding the select query you asked about: as far as I can tell, the code you presented is fine. Here is the select structure in MongoDB.
First off, it is called find().
In PyMongo, if you want to select specific documents (the MongoDB equivalent of SQL rows) from a collection (the equivalent of a table), use the following structure. I will use random as the collection name and assume that its documents have the following attributes: age:10, type:ninja, class:black, level:1903.
db.random.find({ "age":"10" })
This will return all documents that have age 10 in them.
You can add more conditions simply by separating them with commas:
db.random.find({ "age":"10", "type":"ninja" })
This will select all data with age 10 and type ninja.
If you want to get all data, just leave the filter empty:
db.random.find({})
The previous examples return every attribute (age, type, class, level and _id). If you want only specific attributes, say only the age, add a second argument to find, called the projection (1 is show, 0 is do not show):
{'age':1}
Note that this returns age as well as _id, because _id is always returned by default. You have to explicitly exclude it:
db.random.find({ "age":"10", "type":"ninja" }, {"age":1, "_id":0} )
I hope that gets you started.
Take a look at the documentation; it is very thorough.
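
Since the filter and the projection are both plain dictionaries, they can be assembled in ordinary Python before being passed to find(). The following is a small sketch under that assumption; build_find_args is a hypothetical helper name, not part of PyMongo.

```python
def build_find_args(conditions, fields=None, include_id=False):
    """Return a (filter, projection) pair suitable for collection.find()."""
    filter_doc = dict(conditions)
    if fields is None:
        # No projection: find() returns every attribute of each document.
        return filter_doc, None
    projection = {name: 1 for name in fields}
    if not include_id:
        # _id is returned by default, so it has to be excluded explicitly.
        projection["_id"] = 0
    return filter_doc, projection
```

With the example collection above, db.random.find(*build_find_args({"age": "10", "type": "ninja"}, ["age"])) would request only the age attribute of the matching documents.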

What is the proper way to perform a contextual search against NoSQL key-value pairs?

With MySQL, I might search through a table "photos" looking for matching titles as follows:
SELECT *
FROM photos
WHERE title LIKE '[string]%';
If the field "title" is indexed, this would perform rather efficiently. I might even set a FULLTEXT index on the title field to perform substring matching.
What is a good strategy for performing a similar search against a NoSQL table of photos, like Amazon's DynamoDB, in the format:
{key} -> photo_id,
{value} -> {photo_id = 2332532532235,
title = 'this is a title'}
I suppose one way would be to search the contents of each entry's value and return matches. But this seems pretty inefficient, especially when the data set gets very large.
Thanks in advance.

I can give you a Mongo shell example.
From the basic tutorial on MongoDB site:
j = { name : "mongo" };
t = { x : 3 };
db.things.save(j);
db.things.save(t);
So you now have a collection called things and have stored two documents in it.
Suppose you now want to do the equivalent of
SELECT * FROM things WHERE name like 'mon%'
In SQL, this would have returned the "mongo" record.
In the Mongo shell, you can do this:
db.things.find({name:{$regex:'^mon'}}).forEach(printjson);
This returns the "mongo" document. The ^ anchors the pattern to the start of the string, which matches the prefix semantics of LIKE 'mon%'.
Hope this helps.
Atish
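
From Python, a prefix filter of this kind can be built as a plain dictionary and passed to PyMongo's find(). This is a minimal sketch: prefix_filter is a hypothetical helper name, and it assumes that Python's re.escape produces escapes that MongoDB's regex engine also accepts, which holds for ordinary text.

```python
import re


def prefix_filter(field, prefix):
    """MongoDB filter matching documents whose `field` starts with `prefix`."""
    # "^" anchors the match at the start; re.escape keeps user input literal.
    return {field: {"$regex": "^" + re.escape(prefix)}}
```

For the photos example in the question, db.photos.find(prefix_filter("title", "this is")) would correspond to WHERE title LIKE 'this is%', assuming a collection named photos.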
