Get first item from nested list in glom - python

from glom import glom, T

target = {
    "items": [
        {
            "label": "valuation",
            "value": [
                "900 USD"
            ]
        },
    ]
}
spec = ('items', [T['value'][0]])
r = glom(target, spec)
print(r)
The above code returns a list, ['900 USD'], but I'd like to get just the content of that list, i.e. the first item in the 'value' list. In this case the result should just be 900 USD.
Part 2
from glom import glom, T, Check, SKIP

target = {
    "items": [
        {
            "label": "valuation",
            "value": [
                "900 USD"
            ]
        },
        {
            "label": "other_info",
            "value": [
                "700 USD"
            ]
        },
    ]
}
spec = {
    'answer': ('items',
               [Check('label', equal_to='valuation', default=SKIP)],
               [T['value'][0]])
}
r = glom(target, spec)
print(r)
The above code results in {'answer': ['900 USD']}, but I need it to return just 900 USD.
I tried adding [0] at the end of the brackets, but that didn't work.
Playing around with the T type also didn't produce what I'm looking for.

I solved it by iterating over my result list and picking out the first element. The following spec worked:
spec = {
    'answer': ('items',
               [Check('label', equal_to='valuation', default=SKIP)],
               [T['value'][0]],
               Iter().first())
}
Notice the Iter().first() call that was added (Iter also needs to be imported from glom).
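Putting it together, a minimal runnable version of that spec:

from glom import glom, T, Check, SKIP, Iter

target = {
    "items": [
        {"label": "valuation", "value": ["900 USD"]},
        {"label": "other_info", "value": ["700 USD"]},
    ]
}
spec = {
    'answer': ('items',
               [Check('label', equal_to='valuation', default=SKIP)],  # keep only matching items
               [T['value'][0]],   # first 'value' entry of each remaining item
               Iter().first())    # unwrap the single-element list
}
print(glom(target, spec))  # {'answer': '900 USD'}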

spec = {
    'answer': ('items', Iter().filter(
        lambda x: x['label'] == 'valuation'
    ).map('value.0').first())
}
Note, this is a streaming solution, so it performs better on larger datasets.
A string spec -- here 'value.0', the argument to the map method -- understands indexing into deep lists.
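For instance, a quick check of that behaviour in isolation:

from glom import glom
glom({'value': ['900 USD']}, 'value.0')  # -> '900 USD'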
Depending on the specific logic that is desired this might work as well:
spec = {
    'answer': ('items', Iter().first(
        lambda x: x['label'] == 'valuation'
    ), 'value.0')
}
Here, we have combined the filter logic with the restriction to a single result.
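In plain Python terms, that last spec does roughly the following (a sketch of the behaviour, not glom's actual implementation):

target = {'items': [{'label': 'valuation', 'value': ['900 USD']},
                    {'label': 'other_info', 'value': ['700 USD']}]}

def first_valuation(data):
    # next() over a generator mirrors Iter().first(predicate):
    # it scans lazily and stops at the first matching item
    item = next(x for x in data['items'] if x['label'] == 'valuation')
    return item['value'][0]

print(first_valuation(target))  # 900 USD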

Is there a way to get particular values from a JSON array using Robot Framework or Python code?

JSON OUTPUT:
${response}= [
    {
        "Name": "7122Project",
        "checkBy": [
            {
                "keyId": "NA",
                "target": "1232"
            }
        ],
        "Enabled": false,
        "aceess": "123"
    },
    {
        "Name": "7122Project",
        "checkBy": [
            {
                "keyId": "_GU6S3",
                "target": "123"
            }
        ],
        "aceess": "11222",
        "Enabled": false
    },
    {
        "Name": "7122Project",
        "checkBy": [
            {
                "keyId": "-1lLUy",
                "target": "e123"
            }
        ],
        "aceess": "123"
    }
]
How do I get the keyId values from the JSON without using a hardcoded index in Robot Framework?
I did
${ID}=    Set Variable    ${response[0]['checkBy'][0]['keyId']}
But I need to check the length, get all keyId values, and store the values that do not contain NA.
How can I check the length and use a for loop in Robot Framework?
I suppose you can have more elements in checkBy arrays, like so:
response = [
    {
        "Name": "7122Project",
        "checkBy": [
            {
                "keyId": "NA",
                "target": "1232"
            }
        ],
        "Enabled": False,
        "aceess": "123"
    },
    {
        "Name": "7122Project",
        "checkBy": [
            {
                "keyId": "_GUO6g6S3",
                "target": "123"
            }
        ],
        "aceess": "11222",
        "Enabled": False
    },
    {
        "Name": "7122Project",
        "checkBy": [
            {
                "keyId": "-1lLlZOUy",
                "target": "e123"
            },
            {
                "keyId": "test_NA",
                "target": "e123"
            }
        ],
        "aceess": "123"
    }
]
Then you can get all keyIds in Python with this code:
def get_key_ids(response):
    checkbys = [x["checkBy"] for x in response]
    key_ids = []
    for check_by in checkbys:
        for key_id in check_by:
            key_ids.append(key_id["keyId"])
    return key_ids
For the example above, it will return: ['NA', '_GUO6g6S3', '-1lLlZOUy', 'test_NA'].
You want to get both ids with NA and without NA, so perhaps you can change the function a bit:
def get_key_ids(response, predicate):
    checkbys = [x["checkBy"] for x in response]
    key_ids = []
    for check_by in checkbys:
        for key_id in check_by:
            if predicate(key_id["keyId"]):
                key_ids.append(key_id["keyId"])
    return key_ids
and use it like so:
get_key_ids(response, lambda id: id == "NA") # ['NA']
get_key_ids(response, lambda id: id != "NA") # ['_GUO6g6S3', '-1lLlZOUy', 'test_NA']
get_key_ids(response, lambda id: "NA" in id) # ['NA', 'test_NA']
get_key_ids(response, lambda id: "NA" not in id) # ['_GUO6g6S3', '-1lLlZOUy']
Now it's just a matter of creating a library and importing it into RF. You can get inspiration in the official documentation.
But I need to check the length, get all keyId values, and store the values that do not contain NA.
I don't completely understand what you are after. Do you mean the length of the keyId strings, like "NA" with its length of 2, or the number of keyIds in the response?
How can I check the length and use a for loop in Robot Framework?
You can use the Should Be Equal * keywords from the BuiltIn library. Some examples of for loops can be found in the user guide here.
Now you should have all the parts you need to accomplish your task, you can try to put it all together.
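As a sketch of that last step, the helper above could live in a plain Python file and be imported as a library; the file and function names below are illustrative, not fixed:

# KeyIdLibrary.py -- illustrative name; import it in a suite with:  Library    KeyIdLibrary.py
def get_non_na_key_ids(response):
    """Return every checkBy keyId that does not contain 'NA'."""
    return [check["keyId"]
            for item in response
            for check in item["checkBy"]
            if "NA" not in check["keyId"]]

Robot Framework would expose this as the keyword Get Non Na Key Ids, and Length Should Be from BuiltIn can then check the count of the returned list.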

Append item to Mongo Array

I have a MongoDB document I am trying to update. This answer was helpful, but every time I insert into the database, the data is inserted as an array inside the array, whereas I just want to insert the object directly into the array.
Here is what I am doing.
# My function to update the array
def append_site(gml_id, new_site):
    col.update_one({'gml_id': gml_id}, {'$push': {'websites': new_site}}, upsert=True)

# My DataFrame
data = {'name': ['ABC'],
        'gml_id': ['f9395e09'],
        'url': ['ABC.com']}
df = pd.DataFrame(data)

# Grouping data for upsert
df = df.groupby(['gml_id']).apply(lambda x: x[['name', 'url']].to_dict('r')).reset_index().rename(columns={0: 'websites'})

# Apply function to every row
df.apply(lambda row: append_site(row['gml_id'], row['websites']), axis=1)
Here is the outcome:
{
    "gml_id": "f9395e09",
    "websites": [
        {
            "name": "XYZ.com",
            "url": "...xyz.com"
        },
        [
            {
                "name": "ABC.com",
                "url": "...abc.com"
            }
        ]
    ]
}
Here is the goal:
{
    "gml_id": "f9395e09",
    "websites": [
        {
            "name": "XYZ.com",
            "url": "...xyz.com"
        },
        {
            "name": "ABC.com",
            "url": "...abc.com"
        }
    ]
}
Your issue is that the websites array is being appended with a list object rather than a dict, i.e. new_site is a list.
As you haven't posted where you call append_site(), this is a little speculative, but you could try changing this line and seeing if it gives the effect you need.
col.update_one({'gml_id': gml_id}, {'$push': {'websites': new_site[0]}}, upsert=True)
Alternatively make sure you are passing a dict object to the function.
Instead of doing an unnecessary groupby, I decided to leave the dataframe flat and then adjust the function like this:
def append_site(gml_id, name, url):
    col.update_one({'gml_id': gml_id}, {'$push': {'websites': {'name': name, 'url': url}}}, upsert=True)
I now call it like this: df.apply(lambda row: append_site(row['gml_id'], row['name'], row['url']), axis=1)
Works perfectly fine.
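For reference, a self-contained sketch of that flat approach (the connection and collection names below are placeholders):

import pandas as pd
from pymongo import MongoClient

col = MongoClient()['mydb']['sites']  # placeholder database/collection

def append_site(gml_id, name, url):
    # $push with a plain dict appends one object, not a nested list
    col.update_one({'gml_id': gml_id},
                   {'$push': {'websites': {'name': name, 'url': url}}},
                   upsert=True)

df = pd.DataFrame({'name': ['ABC'], 'gml_id': ['f9395e09'], 'url': ['ABC.com']})
df.apply(lambda row: append_site(row['gml_id'], row['name'], row['url']), axis=1)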

Python Eve: using Sub Resource value in $match

I need to get a value inside a URL (/some/url/value as a Sub Resource) usable as a parameter in an aggregation $match:
event/mac/11:22:33:44:55:66 --> {value:'11:22:33:44:55:66'}
and then:
{"$match":{"MAC":"$value"}},
Here is a non-working example:
event = {
    'url': 'event/mac/<regex("([\w:]+)"):value>',
    'datasource': {
        'source': "event",
        'aggregation': {
            'pipeline': [
                {"$match": {"MAC": "$value"}},
                {"$group": {"_id": "$MAC", "total": {"$sum": "$count"}}},
            ]
        }
    }
}
This example works correctly with:
event/mac/blablabla?aggregate={"$value":"aa:11:bb:22:cc:33"}
Any suggestions?
The quick and easy way would be:
path = "event/mac/11:22:33:44:55:66"
value = path.replace("event/mac/", "")
# or
value = path.split("/")[-1]

Adding keys to values in JSON using Python

This is the structure of my JSON:
"docs": [
{
"key": [
null,
null,
"some_name",
"12345567",
"test_name"
],
"value": {
"lat": "29.538208354844658",
"long": "71.98762580927113"
}
},
I want to add the keys to the key list. This is what I want the output to look like:
"docs": [
{
"key": [
"key1":null,
"key2":null,
"key3":"some_name",
"key4":"12345567",
"key5":"test_name"
],
"value": {
"lat": "29.538208354844658",
"long": "71.98762580927113"
}
},
What's a good way to do it? I tried this, but it doesn't work:
for item in data['docs']:
    item['test'] = data['docs'][3]['key'][0]
UPDATE 1
Based on the answer below, I have tweaked the code to this:
for number, item in enumerate(data['docs']):
    # pprint(item)
    # print item['key'][4]
    newdict["key1"] = item['key'][0]
    newdict["yek1"] = item['key'][1]
    newdict["key2"] = item['key'][2]
    newdict["yek2"] = item['key'][3]
    newdict["key3"] = item['key'][4]
    newdict["latitude"] = item['value']['lat']
    newdict["longitude"] = item['value']['long']
This creates the JSON I am looking for (and I can eliminate the list I had previously). How do I make this JSON persist outside the for loop? Otherwise, only the last value from the dictionary remains after the loop.
In your first block, key is a list, but in your second block it's a dict. You need to completely replace the key item.
for doc in data['docs']:
    newdict = {}
    for number, item in enumerate(doc['key']):
        newdict['key%d' % (number + 1)] = item
    doc['key'] = newdict
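If you also want the lat/long fields from your UPDATE 1, and a result that persists outside the loop, one way (a sketch based on the code in the question) is to build one dict per document and collect them in a list:

results = []
for doc in data['docs']:
    keyed = {'key%d' % (i + 1): v for i, v in enumerate(doc['key'])}
    keyed['latitude'] = doc['value']['lat']
    keyed['longitude'] = doc['value']['long']
    results.append(keyed)
# results survives the loop: one dict per document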

How to read this JSON into a dataframe with a specific dataframe format

This is my JSON string, and I want to read it into a dataframe in the following tabular format.
I have no idea what to do after pd.DataFrame(json.loads(data)).
JSON data, edited
{
    "data": [
        {
            "data": {
                "actual": "(0.2)",
                "upper_end_of_central_tendency": "-"
            },
            "title": "2009"
        },
        {
            "data": {
                "actual": "2.8",
                "upper_end_of_central_tendency": "-"
            },
            "title": "2010"
        },
        {
            "data": {
                "actual": "-",
                "upper_end_of_central_tendency": "2.3"
            },
            "title": "longer_run"
        }
    ],
    "schedule_id": "2014-03-19"
}
That's a somewhat overly nested JSON. But if that's what you have to work with, and assuming your parsed JSON is in jdata:
datapts = jdata['data']
rownames = ['actual', 'upper_end_of_central_tendency']
colnames = [ item['title'] for item in datapts ] + ['schedule_id' ]
sched_id = jdata['schedule_id']
rows = [ [item['data'][rn] for item in datapts ] + [sched_id] for rn in rownames]
df = pd.DataFrame(rows, index=rownames, columns=colnames)
df is now:

                                2009 2010 longer_run schedule_id
actual                         (0.2)  2.8          -  2014-03-19
upper_end_of_central_tendency      -    -        2.3  2014-03-19
If you wanted to simplify that a bit, you could construct the core data without the asymmetric schedule_id field, then add that after the fact:
datapts = jdata['data']
rownames = ['actual', 'upper_end_of_central_tendency']
colnames = [ item['title'] for item in datapts ]
rows = [ [item['data'][rn] for item in datapts ] for rn in rownames]
d2 = pd.DataFrame(rows, index=rownames, columns=colnames)
d2['schedule_id'] = jdata['schedule_id']
That will make an identical DataFrame (i.e. df == d2). It helps when learning pandas to try a few different construction strategies, and get a feel for what is more straightforward. There are more powerful tools for unfolding nested structures into flatter tables, but they're not as easy to understand first time out of the gate.
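One of those tools, for the record: json_normalize flattens the nested records in a single call (pd.json_normalize in recent pandas, pandas.io.json.json_normalize in older versions), though you would still need to reshape the result to match the layout above:

import pandas as pd

flat = pd.json_normalize(jdata['data'])
# columns: title, data.actual, data.upper_end_of_central_tendency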
(Update) If you wanted a better structure for your JSON to make it easier to put into this format, ask pandas what it likes. E.g. df.to_json() output, slightly prettified:
{
    "2009": {
        "actual": "(0.2)",
        "upper_end_of_central_tendency": "-"
    },
    "2010": {
        "actual": "2.8",
        "upper_end_of_central_tendency": "-"
    },
    "longer_run": {
        "actual": "-",
        "upper_end_of_central_tendency": "2.3"
    },
    "schedule_id": {
        "actual": "2014-03-19",
        "upper_end_of_central_tendency": "2014-03-19"
    }
}
That is a format from which pandas' read_json function will immediately construct the DataFrame you desire.
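For instance, assuming the prettified output above is stored in a string s:

import io
import pandas as pd

df3 = pd.read_json(io.StringIO(s))
# columns: 2009, 2010, longer_run, schedule_id
# index: actual, upper_end_of_central_tendency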
