Extract values from array in Python

I'm having some trouble accessing a value inside a list of dictionaries, where each dictionary also contains a nested list of dictionaries.
It looks like this:
[{'name': 'Alex',
'number_of_toys': [{'classification': 3, 'count': 383},
{'classification': 1, 'count': 29},
{'classification': 0, 'count': 61}],
'total_toys': 473},
{'name': 'John',
'number_of_toys': [{'classification': 3, 'count': 8461},
{'classification': 0, 'count': 3825},
{'classification': 1, 'count': 1319}],
'total_toys': 13605}]
I want to access the 'count' number for each 'classification'. For example, for 'name' Alex, if 'classification' is 3, then the code returns the 'count' of 383, and so on for the other classifications and names.
Thanks for your help!

Not sure what your question asks, but if it's just a mapping exercise this will get you on the right track.
def get_toys(personDict):
    person_toys = personDict.get('number_of_toys')
    return [(toys.get('classification'), toys.get('count')) for toys in person_toys]

def get_person_toys(database):
    return [(personDict.get('name'), get_toys(personDict)) for personDict in database]
The result is:
[('Alex', [(3, 383), (1, 29), (0, 61)]), ('John', [(3, 8461), (0, 3825), (1, 1319)])]
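If you want the direct lookup described in the question (the count for a given name and classification), here is a minimal sketch building on the same data; the helper get_count and the variable data (bound to the list from the question) are mine, not from the original post:
def get_count(data, name, classification):
    # Return the 'count' for the given name and classification, or None if absent.
    for person in data:
        if person.get('name') == name:
            for toy in person.get('number_of_toys', []):
                if toy.get('classification') == classification:
                    return toy.get('count')
    return None

print(get_count(data, 'Alex', 3))  # 383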

This isn't as elegant as the previous answer because it doesn't iterate over the values, but if you want to select specific elements, this is one way to do that:
data = [{'name': 'Alex',
'number_of_toys': [{'classification': 3, 'count': 383},
{'classification': 1, 'count': 29},
{'classification': 0, 'count': 61}],
'total_toys': 473},
{'name': 'John',
'number_of_toys': [{'classification': 3, 'count': 8461},
{'classification': 0, 'count': 3825},
{'classification': 1, 'count': 1319}],
'total_toys': 13605}]
import pandas as pd
df = pd.DataFrame(data)
# Access by row label and column name rather than chained positional indexing
print(df.loc[0, 'name'])
print(df.loc[0, 'number_of_toys'][0]['classification'])
print(df.loc[0, 'number_of_toys'][0]['count'])
which gives:
Alex
3
383
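If you'd rather end up with one row per classification instead of indexing into the nested dicts, here is a rough sketch continuing from the df above (DataFrame.explode needs pandas 0.25 or newer):
# One row per toy dict, keeping the person's name alongside it
exploded = df.explode('number_of_toys').reset_index(drop=True)
toys = pd.DataFrame(exploded['number_of_toys'].tolist())
toys['name'] = exploded['name']
# e.g. Alex's count for classification 3
print(toys[(toys['name'] == 'Alex') & (toys['classification'] == 3)]['count'].iloc[0])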

Related

How to put specific field data into one list and another field's data into another list in Django

I'm using Django 3.x. I have one table called Book. When I write a query for it, I get data like this:
book_details = Book.objects.all().values()
print(book_details)
<QuerySet [{'id': 1, 'author': 'ABC', 'price': 150, 'category': 'Comic', 'available': 1}, {'id': 2, 'author': 'XYZ', 'price': 500, 'category': 'Historical Fiction', 'available': 1}, {'id': 3, 'author': 'MNO', 'price': 200, 'category': 'Horror', 'available': 0}]>
I'm trying to create data in the format below:
{'id':[1,2,3], 'author':['ABC', 'XYZ', 'MNO'], 'price':[150,500,200], 'category':['Comic', 'Historical Fiction', 'Horror'], 'available':[1,1,0]}
How can I create data in this format? Please advise the best way to do this. An explanation would be much appreciated. Thanks!
If you do:
lst_of_dicts = list(book_details)
you will obtain the following; printing lst_of_dicts gives:
[{'id': 1, 'author': 'ABC', 'price': 150, 'category': 'Comic', 'available': 1},
{'id': 2, 'author': 'XYZ', 'price': 500, 'category': 'Historical Fiction', 'available': 1},
{'id': 3, 'author': 'MNO', 'price': 200, 'category': 'Horror', 'available': 0}]
This is a list of dictionaries, but you want a dict of lists, so there is one further step you need to do:
from itertools import chain # Necessary import
keys = set(chain(*[dic.keys() for dic in lst_of_dicts]))
final_dict = {key: [dic[key] for dic in lst_of_dicts if key in dic] for key in keys}
final_dict will look like this:
{'id': [1, 2, 3], 'author': ['ABC', 'XYZ', 'MNO'], 'available': [1, 1, 0], 'price': [150, 500, 200], 'category': ['Comic', 'Historical Fiction', 'Horror']}
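If every row has the same keys (which QuerySet.values() guarantees here), a simpler alternative sketch without itertools is to accumulate into a defaultdict:
from collections import defaultdict

final_dict = defaultdict(list)
for row in lst_of_dicts:
    for key, value in row.items():
        final_dict[key].append(value)
final_dict = dict(final_dict)
# {'id': [1, 2, 3], 'author': ['ABC', 'XYZ', 'MNO'], 'price': [150, 500, 200], ...}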

Extract key and value from json to new dataframe

I have a dataframe with JSON values in its columns, nested several levels deep. I would like to extract the leaf keys and values into a new dataframe. A sample column value is below:
{'shipping_assignments': [{'shipping': {'address': {'address_type':
'shipping', 'city': 'Calder', 'country_id': 'US',
'customer_address_id': 1, 'email': 'roni_cost#example.com',
'entity_id': 1, 'firstname': 'Veronica', 'lastname': 'Costello',
'parent_id': 1, 'postcode': '49628-7978', 'region': 'Michigan',
'region_code': 'MI', 'region_id': 33, 'street': ['6146 Honey Bluff
Parkway'], 'telephone': '(555) 229-3326'}, 'method':
'flatrate_flatrate', 'total': {'base_shipping_amount': 5,
'base_shipping_discount_amount': 0,
'base_shipping_discount_tax_compensation_amnt': 0,
'base_shipping_incl_tax': 5, 'base_shipping_invoiced': 5,
'base_shipping_tax_amount': 0, 'shipping_amount': 5,
'shipping_discount_amount': 0,
'shipping_discount_tax_compensation_amount': 0, 'shipping_incl_tax':
5, 'shipping_invoiced': 5, 'shipping_tax_amount': 0}}, 'items':
[{'amount_refunded': 0, 'applied_rule_ids': '1',
'base_amount_refunded': 0, 'base_discount_amount': 0,
'base_discount_invoiced': 0, 'base_discount_tax_compensation_amount':
0, 'base_discount_tax_compensation_invoiced': 0,
'base_original_price': 29, 'base_price': 29, 'base_price_incl_tax':
31.39, 'base_row_invoiced': 29, 'base_row_total': 29, 'base_row_total_incl_tax': 31.39, 'base_tax_amount': 2.39,
'base_tax_invoiced': 2.39, 'created_at': '2019-09-27 10:03:45',
'discount_amount': 0, 'discount_invoiced': 0, 'discount_percent': 0,
'free_shipping': 0, 'discount_tax_compensation_amount': 0,
'discount_tax_compensation_invoiced': 0, 'is_qty_decimal': 0,
'item_id': 1, 'name': 'Iris Workout Top', 'no_discount': 0,
'order_id': 1, 'original_price': 29, 'price': 29, 'price_incl_tax':
31.39, 'product_id': 1434, 'product_type': 'configurable', 'qty_canceled': 0, 'qty_invoiced': 1, 'qty_ordered': 1,
'qty_refunded': 0, 'qty_shipped': 1, 'row_invoiced': 29, 'row_total':
29, 'row_total_incl_tax': 31.39, 'row_weight': 1, 'sku':
'WS03-XS-Red', 'store_id': 1, 'tax_amount': 2.39, 'tax_invoiced':
2.39, 'tax_percent': 8.25, 'updated_at': '2019-09-27 10:03:46', 'weight': 1, 'product_option': {'extension_attributes':
{'configurable_item_options': [{'option_id': '141', 'option_value':
167}, {'option_id': '93', 'option_value': 58}]}}}]}],
'payment_additional_info': [{'key': 'method_title', 'value': 'Check /
Money order'}], 'applied_taxes': [{'code': 'US-MI-*-Rate 1', 'title':
'US-MI-*-Rate 1', 'percent': 8.25, 'amount': 2.39, 'base_amount':
2.39}], 'item_applied_taxes': [{'type': 'product', 'applied_taxes': [{'code': 'US-MI-*-Rate 1', 'title': 'US-MI-*-Rate 1', 'percent':
8.25, 'amount': 2.39, 'base_amount': 2.39}]}], 'converting_from_quote': True}
Above is a single row value of the dataframe column df['x'].
My code to convert it is below:
sample = data['x'].tolist()
data = json.dumps(sample)
df = pd.read_json(data)
It gives a new dataframe with these columns:
Index(['applied_taxes', 'converting_from_quote', 'item_applied_taxes',
'payment_additional_info', 'shipping_assignments'],
dtype='object')
When I tried the same approach to convert the column that holds those row values:
m_df = df['applied_taxes'].apply(lambda x : re.sub('.?\[|$.|]',"", str(x)))
m_sample = m_df.tolist()
m_data = json.dumps(m_sample)
c_df = pd.read_json(m_data)
it doesn't work.
Check this link to see the beautified JSON.
I came across a nice ETL package in Python called petl. Convert the JSON list into a petl table with the fromdicts() function:
order_table = fromdicts(data_list)
If any of the columns contains a nested dict, use unpackdict(order_table, 'nested_col') to unpack it.
In my case, I needed to unpack the applied_taxes column. The code below unpacks it and appends the keys and values as columns and rows of the same table.
order_table = unpackdict(order_table, 'applied_taxes')
If you want to know more about petl, see its documentation.
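Putting the above together, a rough sketch with the imports spelled out; it simply mirrors the calls from this answer, and data_list stands for your list of order dicts:
from petl import fromdicts, unpackdict

# data_list: a list of order dicts shaped like the one in the question
order_table = fromdicts(data_list)

# Unpack the nested 'applied_taxes' field into its own columns/rows
order_table = unpackdict(order_table, 'applied_taxes')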
It seems that your mistake was in tolist(). Try the following:
import pandas as pd
import json
import re
data = {"shipping_assignments":[{"shipping":{"address":{"address_type":"shipping","city":"Calder","country_id":"US","customer_address_id":1,"email":"roni_cost#example.com","entity_id":1,"firstname":"Veronica","lastname":"Costello","parent_id":1,"postcode":"49628-7978","region":"Michigan","region_code":"MI","region_id":33,"street":["6146 Honey Bluff Parkway"],"telephone":"(555) 229-3326"},"method":"flatrate_flatrate","total":{"base_shipping_amount":5,"base_shipping_discount_amount":0,"base_shipping_discount_tax_compensation_amnt":0,"base_shipping_incl_tax":5,"base_shipping_invoiced":5,"base_shipping_tax_amount":0,"shipping_amount":5,"shipping_discount_amount":0,"shipping_discount_tax_compensation_amount":0,"shipping_incl_tax":5,"shipping_invoiced":5,"shipping_tax_amount":0}},"items":[{"amount_refunded":0,"applied_rule_ids":"1","base_amount_refunded":0,"base_discount_amount":0,"base_discount_invoiced":0,"base_discount_tax_compensation_amount":0,"base_discount_tax_compensation_invoiced":0,"base_original_price":29,"base_price":29,"base_price_incl_tax":31.39,"base_row_invoiced":29,"base_row_total":29,"base_row_total_incl_tax":31.39,"base_tax_amount":2.39,"base_tax_invoiced":2.39,"created_at":"2019-09-27 10:03:45","discount_amount":0,"discount_invoiced":0,"discount_percent":0,"free_shipping":0,"discount_tax_compensation_amount":0,"discount_tax_compensation_invoiced":0,"is_qty_decimal":0,"item_id":1,"name":"Iris Workout Top","no_discount":0,"order_id":1,"original_price":29,"price":29,"price_incl_tax":31.39,"product_id":1434,"product_type":"configurable","qty_canceled":0,"qty_invoiced":1,"qty_ordered":1,"qty_refunded":0,"qty_shipped":1,"row_invoiced":29,"row_total":29,"row_total_incl_tax":31.39,"row_weight":1,"sku":"WS03-XS-Red","store_id":1,"tax_amount":2.39,"tax_invoiced":2.39,"tax_percent":8.25,"updated_at":"2019-09-27 10:03:46","weight":1,"product_option":{"extension_attributes":{"configurable_item_options":[{"option_id":"141","option_value":167},{"option_id":"93","option_value":58}]}}}]}],"payment_additional_info":[{"key":"method_title","value":"Check / Money order"}],"applied_taxes":[{"code":"US-MI-*-Rate 1","title":"US-MI-*-Rate 1","percent":8.25,"amount":2.39,"base_amount":2.39}],"item_applied_taxes":[{"type":"product","applied_taxes":[{"code":"US-MI-*-Rate 1","title":"US-MI-*-Rate 1","percent":8.25,"amount":2.39,"base_amount":2.39}]}],"converting_from_quote":"True"}
df = pd.read_json(json.dumps(data))
m_df = df['applied_taxes'].apply(lambda x : re.sub('.?\[|$.|]',"", str(x)))
c_df = pd.read_json(json.dumps(list(m_df)))
print(c_df)
prints the following:
0
0 {'code': 'US-MI-*-Rate 1', 'title': 'US-MI-*-R...
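Alternatively, instead of stripping brackets with a regex, pandas can expand the nested records directly; a minimal sketch assuming data is the dict defined above (pd.json_normalize needs pandas 1.0 or newer):
# One row per entry in 'applied_taxes', with its keys as columns
taxes = pd.json_normalize(data, record_path='applied_taxes')
print(taxes)
#             code           title  percent  amount  base_amount
# 0  US-MI-*-Rate 1  US-MI-*-Rate 1     8.25    2.39         2.39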

Adding node elements to json object in Python from NetworkX

I have a json object that I made using networkx:
json_data = json_graph.node_link_data(network_object)
It is structured like this (mini version of my output):
>>> json_data
{'directed': False,
'graph': {'name': 'compose( , )'},
'links': [{'source': 0, 'target': 7, 'weight': 1},
{'source': 0, 'target': 2, 'weight': 1},
{'source': 0, 'target': 12, 'weight': 1},
{'source': 0, 'target': 9, 'weight': 1},
{'source': 2, 'target': 18, 'weight': 25},
{'source': 17, 'target': 25, 'weight': 1},
{'source': 29, 'target': 18, 'weight': 1},
{'source': 30, 'target': 18, 'weight': 1}],
'multigraph': False,
'nodes': [{'bipartite': 1, 'id': 'Icarus', 'node_type': 'Journal'},
{'bipartite': 1,
'id': 'A Giant Step: from Milli- to Micro-arcsecond Astrometry',
'node_type': 'Journal'},
{'bipartite': 1,
'id': 'The Astrophysical Journal Supplement Series',
'node_type': 'Journal'},
{'bipartite': 1,
'id': 'Astronomy and Astrophysics Supplement Series',
'node_type': 'Journal'},
{'bipartite': 1, 'id': 'Astronomy and Astrophysics', 'node_type': 'Journal'},
{'bipartite': 1,
'id': 'Astronomy and Astrophysics Review',
'node_type': 'Journal'}]}
What I want to do is add the following elements to each of the nodes so I can use this data as an input for sigma.js:
"x": 0,
"y": 0,
"size": 3
"centrality": 0
I can't seem to find an efficient way to do this with add_node(), though. Is there some obvious way to add these that I'm missing?
While your data is still a networkx graph, you can use the set_node_attributes method to add the attributes (e.g. stored in a Python dictionary) to all the nodes in the graph.
In my example the new attributes are stored in the dictionary attr:
import networkx as nx
from networkx.readwrite import json_graph

# example graph
G = nx.Graph()
G.add_nodes_from(["a", "b", "c", "d"])

# your data
# G = json_graph.node_link_graph(json_data)

# dictionary of new attributes
attr = {"x": 0,
        "y": 0,
        "size": 3,
        "centrality": 0}

for name, value in attr.items():
    # networkx >= 2.0 expects set_node_attributes(G, values, name)
    nx.set_node_attributes(G, value, name)

# check new node attributes
print(G.nodes(data=True))
You can then export the new graph in JSON with node_link_data.
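For completeness, a short sketch of that export step, continuing the example above:
# Each exported node dict now carries the new attributes,
# e.g. {'x': 0, 'y': 0, 'size': 3, 'centrality': 0, 'id': 'a'}
json_data = json_graph.node_link_data(G)
print(json_data['nodes'][0])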

Pythonic sort a list of dictionaries in a tricky order

I have a list of ids sorted in the proper order:
ids = [1, 2, 4, 6, 5, 0, 3]
I also have a list of dictionaries, sorted in some random way:
rez = [{'val': 7, 'id': 1}, {'val': 8, 'id': 2}, {'val': 2, 'id': 3}, {'val': 0, 'id': 4}, {'val': -1, 'id': 5}, {'val': -4, 'id': 6}, {'val': 9, 'id': 0}]
My intention is to sort the rez list so that it corresponds to ids:
rez = [{'val': 7, 'id': 1}, {'val': 8, 'id': 2}, {'val': 0, 'id': 4}, {'val': -4, 'id': 6}, {'val': -1, 'id': 5}, {'val': 9, 'id': 0}, {'val': 2, 'id': 3}]
I tried:
rez.sort(key = lambda x: ids.index(x['id']))
However, that approach is too slow for me, as len(ids) > 150K and each dict actually has a lot of keys (some of the values are strings). Any suggestions for the most pythonic, yet still fast, way to do it?
You don't need to sort because ids specifies the entire ordering of the result. You just need to pick the correct elements by their ids:
rez_dict = {d['id']:d for d in rez}
rez_ordered = [rez_dict[id] for id in ids]
Which gives:
>>> rez_ordered
[{'id': 1, 'val': 7}, {'id': 2, 'val': 8}, {'id': 4, 'val': 0}, {'id': 6, 'val': -4}, {'id': 5, 'val': -1}, {'id': 0, 'val': 9}, {'id': 3, 'val': 2}]
This should be faster than sorting because it can be done in linear time on average, while sorting is O(n log n).
Note that this assumes that there will be one entry per id, as in your example.
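If some ids could be missing from rez (going beyond the example, where every id has exactly one entry), a small variation of the same dict lookup skips them:
rez_dict = {d['id']: d for d in rez}
# Keep only ids that actually have a matching entry
rez_ordered = [rez_dict[i] for i in ids if i in rez_dict]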
I think you are on the right track. If you need to speed it up because your list is long and the repeated ids.index() lookups make it quadratic, you can turn the list into a dictionary first, mapping each id to its position:
indices = {id_: pos for pos, id_ in enumerate(ids)}
rez.sort(key = lambda x: indices[x['id']])
This way, indices is {0: 5, 1: 0, 2: 1, 3: 6, 4: 2, 5: 4, 6: 3}, and rez is
[{'id': 1, 'val': 7},
{'id': 2, 'val': 8},
{'id': 4, 'val': 0},
{'id': 6, 'val': -4},
{'id': 5, 'val': -1},
{'id': 0, 'val': 9},
{'id': 3, 'val': 2}]

MongoDB $elemMatch in a doubly nested array

I have a problem trying to use $elemMatch on a doubly nested array.
Suppose I have this document:
a = {'cart': [[{'id': 1, 'count': 1}, {'id': 2, 'count': 3}], [{'id': 1, 'count': 5}]]}
And I want to select the document when id is 1 and count is greater than 2:
db.cart.find_one({'cart.0.id': 1, 'cart.0.count': {'$gt': 2}})
But this query selects a even though, in cart[0], the element with id 1 only has a count of 1.
Then I have tried these queries:
db.cart.find_one({'cart': {'$elemMatch': {'id': 1, 'count': {'$gt': 2}}}})
db.cart.find_one({'cart': {'$elemMatch': {'id': 2, 'count': {'$gt': 2}}}})
db.cart.find_one({'cart.0': {'$elemMatch': {'id': 1, 'count': {'$gt': 2}}}})
db.cart.find_one({'cart.0': {'$elemMatch': {'id': 2, 'count': {'$gt': 2}}}})
But all return None.
So does $elemMatch support matching inside nested arrays? If so, how should I adjust my query?
Given that you have an array within an array, I think you could try something like:
db.cart.find_one({'cart': {'$elemMatch': { '$elemMatch' : {'id': 1, 'count': {'$gt': 2}}}}})
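A minimal PyMongo sketch of that query; the connection settings and the test database name are assumptions for illustration:
from pymongo import MongoClient

client = MongoClient()   # assumes a MongoDB instance on localhost:27017
db = client['test_db']   # hypothetical database name

a = {'cart': [[{'id': 1, 'count': 1}, {'id': 2, 'count': 3}],
              [{'id': 1, 'count': 5}]]}
db.cart.insert_one(a)

# The outer $elemMatch selects an inner array; the inner $elemMatch requires a
# single element of that inner array to satisfy both conditions at once.
doc = db.cart.find_one(
    {'cart': {'$elemMatch': {'$elemMatch': {'id': 1, 'count': {'$gt': 2}}}}})
print(doc)  # matched via the second inner array: [{'id': 1, 'count': 5}]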
