SQLAlchemy + PostgreSQL: accessing an array of jsonb elements

I have a query that looks like this.
query = db.query(status_cte.c.finding_status_history)
The finding_status_history column is of type ARRAY when I check its .type. It's an array of jsonb objects; I can easily change it to json instead if that's easier, and I've also tested this with json.
[
{
"data": [
{
"status": "closed",
"created_at": "2023-01-27T18:05:27.579817",
"previous_status": "open"
},
{
"status": "open",
"created_at": "2023-01-27T18:05:28.694352",
"previous_status": "closed"
}
]
},
...
]
I'm trying to access the first dictionary nested inside data and access the status column.
I've tried to grab it using query = db.query(status_cte.c.finding_status_history[0]) but this returns a list of empty dictionaries like so.
[
{},
{},
{},
{},
{},
{},
{}
]
I'm not sure why that doesn't work, as my impression is that it should grab the first entry. I'm assuming I need to access "data" somehow first, but I've also tried...
query = db.query(status_cte.c.finding_status_history.op('->>')('data'))
which gives me an error saying that the operator jsonb[] ->> unknown doesn't exist. I've tried casting 'data' to String and I get the same error, but with jsonb[] ->> String, etc.
Also, when looping through the items with for item in query.all(), I'm seeing that [0] results in (None,) and [1] results in
({
"status": "closed",
"created_at": "2023-01-27T18:05:27.579817",
"previous_status": "open"
},)
as a tuple...

The secret was that [0] is not the first element; [1] is, because PostgreSQL arrays are 1-based by default. I also noted that [-1] doesn't give me the last element, so I had to order my aggregated JSON objects as well.
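To make the indexing behaviour concrete, here is a small pure-Python model of PostgreSQL's default array subscripting. It is only an illustration: pg_subscript is a hypothetical helper, not part of SQLAlchemy or any PostgreSQL driver, and it assumes the default lower bound of 1. It shows why [0] and [-1] come back as NULL (None) while [1] is the first element.

```python
def pg_subscript(array, index):
    """Mimic PostgreSQL's default array subscripting: 1-based, NULL when out of bounds."""
    if 1 <= index <= len(array):
        return array[index - 1]
    return None  # PostgreSQL yields NULL here instead of raising an error


# Sample data shaped like the "data" entries in the question.
history = [
    {"status": "closed", "previous_status": "open"},
    {"status": "open", "previous_status": "closed"},
]

first = pg_subscript(history, 1)              # first element
zeroth = pg_subscript(history, 0)             # out of bounds
negative = pg_subscript(history, -1)          # out of bounds, not "last"
last = pg_subscript(history, len(history))    # last element

print(first["status"], zeroth, negative, last["status"])
```

In other words, the last element has to be addressed by the array's length rather than by a negative index, which is why ordering the aggregated objects matters.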

Related

How to get the field data from a JSON file?

This code successfully gets the "version" and "shapes" data (for i in data.get("shapes"):),
but when I switch "shapes" to "label", a problem occurs:
TypeError: 'NoneType' object is not iterable
Thanks for the help.
{
"version": "4.5.7",
"flags": {},
"shapes": [
{
"label": "2",
"points": [
[
602.8,
590.2
],
[
610.3,
642.0
]
]
}
],
"imagePath": "ladybug_13451176_20200410_141545_ColorProcessed_000497_Cam0_20059_026-2678.png",
"imageWidth": 1024
}
import json

# Use a name other than "dict" so the built-in type is not shadowed.
with open("B123.json", mode="r") as f:
    data = json.load(f)
for i in data.get("shapes"):
    print(i, end="")

data is of type dict. Calling dict.get() would either return the value (if present) or None (if not present).
So if your data varies and might have no "shapes" key, don't iterate over the result directly, as you could be iterating over None. Instead of this:
for i in data.get("shapes"):
put a check first:
if "shapes" in data:
    for i in data.get("shapes"):
Or, better yet, take advantage of the or operator, which falls through to the next operand if the previous one is falsy:
for i in data.get("shapes") or []:
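As a minimal sketch of the "or []" pattern, the following wraps it in a small generator. The helper name iter_shapes and the sample documents are made up for illustration. It also covers the case where the key is present but holds null.

```python
import json


def iter_shapes(data):
    """Yield each entry under "shapes", or nothing if the key is missing or null."""
    for shape in data.get("shapes") or []:
        yield shape


with_shapes = json.loads('{"shapes": [{"label": "2"}]}')
without_shapes = json.loads('{"version": "4.5.7"}')
null_shapes = json.loads('{"shapes": null}')

print(list(iter_shapes(with_shapes)))
print(list(iter_shapes(without_shapes)))
print(list(iter_shapes(null_shapes)))
```

Note that a plain membership check ("shapes" in data) would not protect against the null case, whereas "or []" handles both.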

Because "shapes" is a list of dicts, you can iterate over it. But "label" is inside each dict in "shapes", so you can't just replace "shapes" with "label": the top-level dict has no such key, so data.get("label") returns None. Also, once you have obtained the contents of "label", iterating over it would not be useful, because the value of the key "label" is a string such as "2", not a list or dict.
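A short sketch of reading "label" from inside each shape, using a trimmed copy of the JSON from the question inlined as a string (the real code would load it from the file instead):

```python
import json

# Trimmed copy of the question's document, inlined so the example is self-contained.
raw = """
{
  "version": "4.5.7",
  "shapes": [
    {"label": "2", "points": [[602.8, 590.2], [610.3, 642.0]]}
  ]
}
"""
data = json.loads(raw)

# "label" lives inside each element of "shapes", not at the top level.
labels = [shape.get("label") for shape in data.get("shapes") or []]
top_level_label = data.get("label")

print(labels)
print(top_level_label)
```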

Filter JSON using Jmespath and return value if expression exist, if it doesn't return None/Null (python)

How can I get JMESPath to only return the value in a json if it exists, if it doesn't exist return none/null. I am using JMESPath in a python application, below is an example of a simple JSON data.
{
"name": "Sarah",
"region": "south west",
"age": 21,
"occupation": "teacher",
"height": 145,
"education": "university",
"favouriteMovie": "matrix",
"gender": "female",
"country": "US",
"level": "medium",
"tags": [],
"data": "abc",
"moreData": "xyz",
"logging" : {
"systemLogging" : [ {
"enabled" : true,
"example" : [ "this", "is", "an", "example", "array" ]
} ]
}
}
For example, I want it to check whether the key "occupation" contains the word "banker" and, if it doesn't, return null.
In this case if I do jmespath query "occupation == 'banker'" I would get false. However for more complicated jmespath queries like "logging.systemLogging[?enabled == `false`]" this would result in an empty array [] because it doesn't exist, which is what I want.
The reason I want it to return none or null is because in another part of the application (my base class) I have code that checks if the dictionary/json data will return a value or not, this piece of code iterates through an array of dictionaries/ json data like the one above.
One thing I've noticed with JMESPath is that it is inconsistent with its return values. In more complicated dictionaries I am able to achieve what I want, but with simple dictionaries I can't. Also, if you use a function, e.g. starts_with, it returns a boolean, but if you just use an expression it returns the value you are looking for if it exists; otherwise it returns None or an empty array.

This is traditionally accomplished by:
dictionary = json.loads(my_json)
dictionary.get(key, None) # None is the default value that is returned.
That will work if you know the exact structure to expect from the JSON. Alternatively, you can make two calls to JMESPath, using one to try to get the value / None / empty list, and one to run the query you want.
The problem is that JMESPath is trying to answer your query: does this structure contain this information pattern? It makes sense that the result of such a query should be True/False. If you want to get something like an empty list back, you need to modify your query to ask "Give me back all instances where this structure contains the information I'm looking for" or "Give me back the first instance where this structure contains the information I'm looking for."
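For the simple, flat case, the dictionary.get approach can be sketched in plain Python as follows. The helper name value_if_equal is hypothetical, and the document is a shortened version of the one in the question.

```python
import json


def value_if_equal(document, key, expected):
    """Return the value stored under `key` only when it equals `expected`, else None."""
    value = document.get(key)  # None when the key is absent
    return value if value == expected else None


doc = json.loads('{"name": "Sarah", "occupation": "teacher"}')

print(value_if_equal(doc, "occupation", "banker"))
print(value_if_equal(doc, "occupation", "teacher"))
print(value_if_equal(doc, "salary", 100))
```

This returns None rather than False when the value does not match, which is the behaviour the question asks for.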

Filters in JMESPath do apply to arrays (or list, to speak in Python).
So, indeed, your case is not a really common one.
This said, you can create an array out of a hash (or dictionary, to speak in Python again) using the to_array function.
Then, since you do know you started from a hash, you can select back the first element of the created array, and indeed, if the array ends up being empty, it will return a null.
To me, at least, it looks consistent: an array can be empty [], but an empty object is a null.
To use this trick, though, you will also have to reset the projection you created out of the array, with the pipe expression:
Projections are an important concept in JMESPath. However, there are times when projection semantics are not what you want. A common scenario is when you want to operate of the result of a projection rather than projecting an expression onto each element in the array.
For example, the expression people[*].first will give you an array containing the first names of everyone in the people array. What if you wanted the first element in that list? If you tried people[*].first[0] that you just evaluate first[0] for each element in the people array, and because indexing is not defined for strings, the final result would be an empty array, []. To accomplish the desired result, you can use a pipe expression, <expression> | <expression>, to indicate that a projection must stop.
Source: https://jmespath.org/tutorial.html#pipe-expressions
And so, with all this, the expression ends up being:
to_array(@)[?occupation == `banker`]|[0]
which gives
null
on your example JSON, while the expression
to_array(@)[?occupation == `teacher`]|[0]
would return your existing object, so:
{
"name": "Sarah",
"region": "south west",
"age": 21,
"occupation": "teacher",
"height": 145,
"education": "university",
"favouriteMovie": "matrix",
"gender": "female",
"country": "US",
"level": "medium",
"tags": [],
"data": "abc",
"moreData": "xyz",
"logging": {
"systemLogging": [
{
"enabled": true,
"example": [
"this",
"is",
"an",
"example",
"array"
]
}
]
}
}
And following this trick, all your other tests will probably start to work, e.g.
to_array(@)[?starts_with(occupation, `tea`)]|[0]
will give you back your object, while
to_array(@)[?starts_with(occupation, `ban`)]|[0]
will give you a null.
And if you only need the value of the occupation property, as you are falling back to a hash now, it is as simple as doing, e.g.
to_array(@)[?starts_with(occupation, `tea`)]|[0].occupation
which gives
"teacher"
while
to_array(@)[?starts_with(occupation, `ban`)]|[0].occupation
gives
null
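The wrap, filter, take-first idea behind these expressions can be mirrored in plain Python. This is only an analogue for intuition: it does not call the jmespath library, and first_match is a hypothetical helper.

```python
def first_match(document, predicate):
    """Wrap the dict in a list, filter it, and return the first hit or None."""
    wrapped = [document]                             # analogue of to_array(...)
    matches = [d for d in wrapped if predicate(d)]   # analogue of the [? ...] filter
    return matches[0] if matches else None           # analogue of | [0]


person = {"name": "Sarah", "occupation": "teacher"}

banker = first_match(person, lambda d: d.get("occupation") == "banker")
teacher = first_match(person, lambda d: d.get("occupation") == "teacher")
starts_tea = first_match(
    person, lambda d: str(d.get("occupation", "")).startswith("tea")
)
occupation = starts_tea["occupation"] if starts_tea else None

print(banker, teacher, occupation)
```

An empty filter result becomes None, just as the empty array becomes null once the first element is selected.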

How to extract data from json and add additional values to the extracted values using python?

I want to parse the value from json response using python and assign additional value to the list
{ "form": [{ "box": [60,120,260,115], "text": "hello", "label": "question", "words": [{ "box": [90,190,160,215 ],"text": "hello"} ], "linking": [[0,13]],"id": 0 }]}
I am trying to parse the value and assign it to a variable using Python. What I am trying to achieve is:
If the actual output is ([60,120,260,115],hello), I want to add a few more values to the list, so the expected output should be:
([60,120,260,120,260,115,60,115],hello)

try this:
tmp_json = { "form": [{ "box": [60,120,260,115], "text": "hello", "label": "question", "words": [{ "box": [90,190,160,215 ],"text": "hello"} ], "linking": [[0,13]],"id": 0 }]}
# Then do whatever you need to do with the list by accessing it as follows
# tmp_json["form"][0]["box"]

You can iterate through all elements of the list and, if an item matches the required condition, extend its existing list with the required values.
# Sketch; "data" is the parsed JSON from the question
for item in data["form"]:
    # check that the item's box has the expected elements, i.e. 60, 120, 260, 115,
    # AND that the item's text attribute has the value "hello"
    if item["box"] == [60, 120, 260, 115] and item["text"] == "hello":
        # add extra values to the box list with <list>.extend(...)
        item["box"].extend([120, 115, 260])
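One way to produce exactly the expected output from the question is sketched below. It rests on an assumption that is not stated in the question: that the box is [x0, y0, x1, y1] and the expected list enumerates the four corners of the rectangle. The helper name box_to_polygon is made up for illustration.

```python
def box_to_polygon(box):
    """Expand [x0, y0, x1, y1] into four corner points, flattened into one list."""
    x0, y0, x1, y1 = box
    return [x0, y0, x1, y0, x1, y1, x0, y1]


tmp_json = {
    "form": [
        {
            "box": [60, 120, 260, 115],
            "text": "hello",
            "label": "question",
            "words": [{"box": [90, 190, 160, 215], "text": "hello"}],
            "linking": [[0, 13]],
            "id": 0,
        }
    ]
}

# Build (expanded_box, text) pairs without mutating the original data.
result = [(box_to_polygon(item["box"]), item["text"]) for item in tmp_json["form"]]
print(result)
```

Building a new list instead of calling extend keeps the original box intact, which is convenient if the parsed JSON is reused elsewhere.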

Finding particular value from a list by matching in Python

I have a small doubt in accessing the values from a list.
I have a list of elements
"result":[{"_id": "55b8b9913f32df094c7ba922", "Total": "450"},
{"_id": "55b8a2083f32df1030b9ef16", "Total": "400"}]
Normally we get values from a list by index, e.g. list[0], depending on the number of list elements.
I would like to know if we can get only a particular value from the list by matching on the _id, since with a large database it would be difficult to get the values by index.
My actual code is:
id = self.body['_id']
test = yield db.Result.aggregate(
    [
        { '$group': { '_id' : "$StudentId",
                      'Total': {'$max': "$Total"}}
        }
    ]
)
list = test.get('result')
print(list)
I would like to get the total of the provided id only.

Use $match first to just get the documents you want.
id = self.body['_id']
test = yield db.Result.aggregate(
    [
        { '$match': { '_id': id } },
        { '$group': { '_id' : "$StudentId",
                      'Total': {'$max': "$Total"}}
        }
    ]
)
list = test.get('result')
print(list)
And for multiple values to match, use $in. Declare an array:
listOfIds = [id1, id2, id3]
And change the pipeline stage to:
{ '$match': { '_id': { '$in': listOfIds } } },
As I said earlier though:
Your "Total" field contains a "string". If you don't change that to be numeric you will get unexpected results. Strings sort differently from numbers, i.e. "8" is greater than "100".
So you really should change that in your data.
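The string-versus-number point can be seen directly in Python, whose string comparison is also character by character. The sample totals below are made up (the first two are taken from the question), and this only illustrates the ordering problem, not MongoDB's own comparison code.

```python
totals_as_strings = ["450", "400", "8", "100"]

# Compared as strings, "8" wins because "8" > "4" > "1" character-wise.
max_as_strings = max(totals_as_strings)

# Compared as numbers, the result is what one would expect.
max_as_numbers = max(int(t) for t in totals_as_strings)

print(max_as_strings, max_as_numbers)
print("8" > "100", 8 > 100)
```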

MongoDB search for each dict in list in collection

I have a collection containing a list of dicts and I want to search if any dict contains two specific key:values.
So for example I want to find_one where a dict contains a specific first and last names. This is my collection:
{
"names": [
{
"firstName": "bob",
"lastName": "jones",
"age": "34",
"gender": "m"
},
{
"firstName": "alice",
"lastName": "smith",
"age": "56",
"gender": "f"
},
{
"firstName": "bob",
"lastName": "smith",
"age": "19",
"gender": "m"
},
]
}
I want to see if there is a record with bob smith as first and last names, I am searching this as:
first = 'bob'
last = 'smith'
nameExists = db.user.find_one({'$and':[{'names.firstName':first,'names.lastName':last}]})
Would this query retrieve the one record for bob smith?

While it is mentioned that indeed the $and operator is not required, in either form this is not the query that you want. Consider the following:
db.user.find_one({ 'names.firstName': 'alice','names.lastName': 'jones' })
This in fact does match the given record as there are both elements with "firstName" equal to "alice" and "lastName" values equal to "jones". But of course the problem here is simple in that there is no actual element in the array that has a sub-document for both of those values.
In order to match where an array element contains "both" the criteria given, you need to use the $elemMatch operator. This applies the query condition to the "elements" of the array.
db.user.find_one({
    'names': { '$elemMatch': { 'firstName': 'alice', 'lastName': 'smith' } }
})
And of course if you tried "alice" and "jones", that would not match, as no single element actually contains that combination.
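The difference between the two query styles can be modelled in plain Python. This is a sketch of the matching semantics only, with hypothetical helper names; it does not talk to MongoDB.

```python
names = [
    {"firstName": "bob", "lastName": "jones"},
    {"firstName": "alice", "lastName": "smith"},
    {"firstName": "bob", "lastName": "smith"},
]


def dotted_match(elements, first, last):
    """Dotted-field style: each condition may be satisfied by a different element."""
    return any(e["firstName"] == first for e in elements) and any(
        e["lastName"] == last for e in elements
    )


def elem_match(elements, first, last):
    """$elemMatch style: both conditions must hold on the same element."""
    return any(
        e["firstName"] == first and e["lastName"] == last for e in elements
    )


print(dotted_match(names, "alice", "jones"))  # matches across two elements
print(elem_match(names, "alice", "jones"))    # no single element has both
print(elem_match(names, "bob", "smith"))      # one element has both
```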

Almost!
nameExists = db.user.find_one({'$and':[{'names.firstName':first},{'names.lastName':last}]})
You need to separate the asks into separate {} brackets.

You don't even need to add the $and parameter. In MongoDB, comma-separated fields inside a query are joined by an implicit AND operator, so using simply {'names.firstName':first,'names.lastName':last} inside the find_one will work.
Anyway, that's only a "clean code" fix; your code will work properly as you are doing an "and" operation with just one element (note that the list used for the parameter $and contains only one dictionary).
