customized OrderedDict format and transfer into dictionary - python

I am trying to customize the format of an OrderedDict and convert it into JSON or a plain dictionary, while being able to rename the keys and change the structure. I have the data below:
result= OrderedDict([('index', 'cfs_fsd_00001'),
('host', 'GIISSP707'),
('source', 'D:\\usrLLSS_SS'),
('_time', '2018-11-02 14:43:30.000 EDT'),
('count', '153')])
However, I want to change the format to this:
{
    "servarname": {
        "index": "cfs_fsd_00001",
        "host": "GIISSP707"
    },
    "times": "2018-11-02 14:43:30.000 EDT",
    "metricTags": {
        "source": "D:\\ddevel.log"
    },
    "metricName": "serverice count",
    "metricValue": 153,
    "metricType": "count"
}
I would really appreciate your help. Basically, the output I get is flat, but I want to customize the structure. The original structure is
OrderedDict([('index', 'cfs_fsd_00001'),('host', 'GIISSP707').....]).
The output I want to achieve is {"servarname": {"index":"cfs_fsd_00001","host":"GIISSP707"},......

You can simply reference the result dict with the respective keys that you want your target data structure to have:
{
    "servarname": {
        "index": result['index'],
        "host": result['host']
    },
    "times": result['_time'],
    "metricTags": {
        "source": result['source']
    },
    "metricName": "serverice count",
    "metricValue": result['count'],
    "metricType": "count"
}
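As a minimal runnable sketch of the answer above: the dict literal is built from the question's sample data and serialized with json.dumps. The int() conversion is an addition here, assumed from the question's desired output, where metricValue is the number 153 while the source holds the string '153'.

```python
import json
from collections import OrderedDict

result = OrderedDict([
    ('index', 'cfs_fsd_00001'),
    ('host', 'GIISSP707'),
    ('source', 'D:\\usrLLSS_SS'),
    ('_time', '2018-11-02 14:43:30.000 EDT'),
    ('count', '153'),
])

target = {
    "servarname": {"index": result['index'], "host": result['host']},
    "times": result['_time'],
    "metricTags": {"source": result['source']},
    "metricName": "serverice count",
    # Assumption: the value should be numeric, so the string '153' is converted.
    "metricValue": int(result['count']),
    "metricType": "count",
}

payload = json.dumps(target, indent=2)
print(payload)
```

If the count may be missing or non-numeric, the int() call would need a guard; that case is not covered by the question's data.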

I'm not sure how flexible your method needs to be. I assume you have a few common keys in your OrderedDict, and that you want to find the metric among the remaining keys and then reformat everything into a new dict. Here is a short function, written for Python 3, that I hope helps.
from collections import OrderedDict
import json

def reformat_ordered_dict(dict_result):
    """Reconstruct the OrderedDict result into a specific format.

    This method assumes that your input OrderedDict has the following common keys: 'index',
    'host', 'source', '_time', and a potential metric which is subject to change (of course
    you can support more metrics with a minor tweak of the code). The function also re-maps
    the keys (for example, mapping '_time' to 'times', packing 'index' and 'host' into
    'servarname').

    :param dict_result: the OrderedDict
    :return: the reformatted OrderedDict
    """
    common_keys = ('index', 'host', 'source', '_time')
    assert all(common_key in dict_result for common_key in common_keys), (
        'You have to provide all the common keys!')
    # write common keys
    reformated = OrderedDict()
    reformated["servarname"] = OrderedDict([
        ("index", dict_result['index']),
        ("host", dict_result['host'])
    ])
    reformated["times"] = dict_result['_time']
    reformated["metricTags"] = {"source": dict_result['source']}
    # write metric: the first key that is not one of the common keys
    metric = None
    for key in dict_result.keys():
        if key not in common_keys:
            metric = key
            break
    assert metric is not None, 'Cannot find metric in the OrderedDict!'
    # I don't know where you get this value, but you can customize it if needed,
    # for example if the metric name is needed here
    reformated['metricName'] = "serverice count"
    reformated['metricValue'] = dict_result[metric]
    reformated['metricType'] = metric
    return reformated

if __name__ == '__main__':
    result = OrderedDict([('index', 'cfs_fsd_00001'),
                          ('host', 'GIISSP707'),
                          ('source', 'D:\\usrLLSS_SS'),
                          ('_time', '2018-11-02 14:43:30.000 EDT'),
                          ('count', '153')])
    reformated = reformat_ordered_dict(result)
    print(json.dumps(reformated))

Related

Filter the config dictionary based on user input - Python

I have a JSON config and, based on user input, I need to filter it and keep only the matching sections. I tried the code below, but it returns only part of the expected result.
Config:
superset_config = """
[ {
"Area":"Texas",
"Fruits": {
"RED": {
"Apple":["val1"],
"Grapes":["green"]
},
"YELLOW": {"key2":["val2"]}
}
},
{
"Area":"Dallas",
"Fruits": {
"GREEN": { "key3": ["val3"]}
}
}
]
"""
User Input:
inputs = ['Apple'] # input list
Code:
import json
derived_config = []
for each_src in json.loads(superset_config):
    temp = {}
    for src_keys in each_src:
        if src_keys == 'Fruits':
            temp_inner = {}
            for key, value in each_src[src_keys].items():
                metrics = {key_inner: value_inner for key_inner, value_inner in value.items() if key_inner in inputs}
                temp_inner[key] = metrics
            temp[src_keys] = temp_inner
        else:
            temp[src_keys] = each_src[src_keys]
    derived_config.append(temp)
What I get from the above code:
derived_config= [
{'Area': 'Texas',
'Fruits': {'RED': {'Apple': 'val1'},
'YELLOW': {}
}
},
{'Area': 'Dallas',
'Fruits': {'GREEN': {}
}
}
]
What is needed: I need the result below
derived_config= [
{'Area': 'Texas',
'Fruits': {'RED': {'Apple': 'val1'}
}
}
]
Can anyone please help? Thanks.
Maybe something like this:
import json
inputs = ['Apple']  # input list
derived_config = []
for each_src in json.loads(superset_config):
    filtered_fruits = {k: v for k, v in (each_src.get('Fruits') or {}).items()
                       if any(input_ in v for input_ in inputs)}
    if filtered_fruits:
        each_src['Fruits'] = filtered_fruits
        derived_config.append(each_src)
print(derived_config)
Edit: Based on the comments, it looks like you might also want to filter the inner Fruits map by the input list of fruits. In that case, we don't need the any function used above.
There is also a risk of unintentionally mutating the original source config. For example, if you save the result of json.loads(superset_config) to a variable and then filter several fruit lists from it, the in-place assignment will likely mutate the original config object. If you call json.loads each time, you don't need to worry about this; but because list and dict are mutable types in Python, it is worth being aware of.
The solution below avoids mutating the original source object. Again, if you call json.loads each time anyway, this is not a concern and you are free to modify the parsed config object.
import json
# Note: If you are using Python 3.9+, you can just use the built-in
# `dict` and `list`, as they now support parameterized values.
from typing import Dict, Any, List

# The inferred type of the 'Fruits' key in the superset config.
# This is a mapping of fruit color to a `FruitMap`.
Fruits = Dict[str, 'FruitMap']
FruitMap = Dict[str, Any]
# The inferred type of the superset config.
Config = List[Dict[str, Any]]

def get_fruits_config(src_config: Config, fruit_names: List[str]) -> Config:
    """
    Returns the specified fruit section(s) from the superset config.
    """
    fruits_config: Config = []
    final_src: Dict
    for each_src in src_config:
        fruits: Fruits = each_src.get('Fruits') or {}
        final_fruits: Fruits = {}
        for fruit_color, fruit_map in fruits.items():
            desired_fruits = {fruit: val for fruit, val in fruit_map.items()
                              if fruit in fruit_names}
            if desired_fruits:
                final_fruits[fruit_color] = desired_fruits
        if final_fruits:
            # Shallow copy, so the 'Fruits' key of the source entry is not overwritten
            final_src = each_src.copy()
            final_src['Fruits'] = final_fruits
            fruits_config.append(final_src)
    return fruits_config
Usage:
inputs = ['Apple'] # input list
config = json.loads(superset_config)
derived_config = get_fruits_config(config, inputs)
print(derived_config)
# prints:
# [{'Area': 'Texas', 'Fruits': {'RED': {'Apple': ['val1']}}}]
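To illustrate the mutation concern discussed above, here is a small sketch comparing a shallow dict.copy() with copy.deepcopy on one config entry. The sample data is trimmed from the question's config, and the variable names are only illustrative. It shows that reassigning a top-level key on a shallow copy leaves the source intact, while in-place changes to nested objects still leak through.

```python
import copy
import json

source = json.loads(
    '[{"Area": "Texas", "Fruits": {"RED": {"Apple": ["val1"]}, "YELLOW": {"key2": ["val2"]}}}]'
)
entry = source[0]

# Shallow copy: top-level keys can be reassigned without touching `entry`.
shallow = entry.copy()
shallow['Fruits'] = {'RED': entry['Fruits']['RED']}
yellow_still_present = 'YELLOW' in entry['Fruits']

# The nested objects are still shared, so in-place changes leak into `entry`.
shallow['Fruits']['RED']['Apple'].append('extra')
leaked = entry['Fruits']['RED']['Apple']

# Deep copy: nothing is shared with the original.
deep = copy.deepcopy(entry)
deep['Fruits']['RED']['Apple'].append('only-in-deep')

print(yellow_still_present, leaked)
```

A shallow copy is enough when only top-level keys are replaced, as in get_fruits_config; a deep copy is the safer choice if the filtered result will later be modified in place.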

What is the pythonic way to extract values from dict

What is the best way to extract values from a dictionary? Suppose we have a list of dicts:
projects = [{'project': 'project_name1',
'dst-repo': 'some_dst_path',
'src-repo': 'some_src_path',
'branches': ['*']},
{...},
{...}]
Now I just iterate through this list and get the values, something like:
for project in projects:
    project_name = project.get('project')
    project_src = ....
    project_dst = ....
    ....
    ....
So the question is: "Is there a more Pythonic way to extract values by key from a dictionary, without needing a separate line of code for each new variable assignment?"
There's nothing wrong with what you're doing, but you can make it more compact by using a list comprehension to extract the values from the current dictionary. E.g.,
projects = [
{
'project': 'project_name1',
'dst-repo': 'some_dst_path',
'src-repo': 'some_src_path',
'branches': ['*']
},
]
keys = ['project', 'src-repo', 'dst-repo', 'branches']
for project in projects:
    name, src, dst, branches = [project[k] for k in keys]
    # Do stuff with the values
    print(name, src, dst, branches)
output
project_name1 some_src_path some_dst_path ['*']
However, this approach gets unwieldy if the number of keys is large.
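As an alternative sketch of the same idea, the standard library's operator.itemgetter can fetch several keys in one call and returns the values as a tuple in the requested order. The sample data is taken from the example above; the names get_fields and rows are only illustrative. Like project[k], it raises KeyError when a key is missing.

```python
from operator import itemgetter

projects = [
    {
        'project': 'project_name1',
        'dst-repo': 'some_dst_path',
        'src-repo': 'some_src_path',
        'branches': ['*'],
    },
]

# With several keys, itemgetter returns a tuple of values in that order.
get_fields = itemgetter('project', 'src-repo', 'dst-repo', 'branches')

rows = [get_fields(project) for project in projects]
for name, src, dst, branches in rows:
    # Do stuff with the values
    print(name, src, dst, branches)
```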
If keys are sometimes absent from the dict, you will need to use the .get method, which returns None for missing keys (unless you pass it a default argument):
name, src, dst, branches = [project.get(k) for k in keys]
If you need a specific default for each key, you could put the defaults into a dict, e.g.
defaults = {
'project': 'NONAME',
'src-repo': 'NOSRC',
'dst-repo': 'NODEST',
'branches': ['*'],
}
projects = [
{
'project': 'project_name1',
'src-repo': 'some_src_path',
},
]
keys = ['project', 'src-repo', 'dst-repo', 'branches']
for project in projects:
    name, src, dst, branches = [project.get(k, defaults[k]) for k in keys]
    # Do stuff with the values
    print(name, src, dst, branches)
output
project_name1 some_src_path NODEST ['*']
out = [elt.values() for elt in projects]
for project in projects:
    project_name = project['project']
    project_src = ....
    project_dst = ....
    ....
    ....
I'm not sure you can get away with less typing.
//EDIT:
OK, it looks like I misunderstood the question.
Assume we have a list of dicts like this:
projects = [ {'project': "proj1", 'other': "value1", 'other2': "value2"},
{'project': "proj2", 'other': "value3", 'other2': "value4"},
{'project': "proj2", 'other': "value3", 'other2': "value4"} ]
To extract the list of project fields, you can use the following expression:
projects_names = [x['project'] for x in projects]
This iterates over the projects list, extracting the value of the 'project' key from each dictionary.

How to fix mis-cast floats in IbPy messages

I'm using IbPy to read current orders. The response messages which come back to be processed with EWrapper methods have some attributes which appear to be of the wrong type.
To start, here is my handler for Order-related messages. It is intended to catch all messages due to having called reqAllOpenOrders().
import logging

from ib.opt import ibConnection, message
from ib.ext.Contract import Contract
from ib.ext.Order import Order
from ib.ext.OrderState import OrderState

# `log` was not defined in the original snippet; a standard logger is assumed here.
log = logging.getLogger(__name__)

_order_resp = dict(openOrderEnd=False, openOrder=[], openStatus=[])

def order_handler(msg):
    """ Update our global Order data response dict
    """
    global _order_resp
    if msg.typeName in ['openStatus', 'openOrder']:
        d = dict()
        for i in msg.items():
            # Flatten API objects into plain dicts of their attributes
            if isinstance(i[1], (Contract, Order, OrderState)):
                d[i[0]] = i[1].__dict__
            else:
                d[i[0]] = i[1]
        _order_resp[msg.typeName].append(d.copy())
    elif msg.typeName == 'openOrderEnd':
        _order_resp['openOrderEnd'] = True
    log.info('ORDER: {})'.format(msg))
In the above code, I'm loading all the objects and their attributes to a dict which is then appended to lists within _order_resp.
The log output lines show healthy interaction with IB:
25-Jan-16 14:57:04 INFO ORDER: <openOrder orderId=1, contract=<ib.ext.Contract.Contract object at 0x102a98150>, order=<ib.ext.Order.Order object at 0x102a98210>, orderState=<ib.ext.OrderState.OrderState object at 0x102a98350>>)
25-Jan-16 14:57:04 INFO ORDER: <orderStatus orderId=1, status=PreSubmitted, filled=0, remaining=100, avgFillPrice=0.0, permId=1114012437, parentId=0, lastFillPrice=0.0, clientId=0, whyHeld=None>)
25-Jan-16 14:57:04 INFO ORDER: <openOrderEnd>)
But when looking at the data put into the _order_resp dict, it looks like some numbers are off:
{
"contract": {
"m_comboLegsDescrip": null,
"m_conId": 265598,
"m_currency": "USD",
"m_exchange": "SMART",
...
},
"order": {
"m_account": "DU12345",
"m_action": "SELL",
"m_activeStartTime": "",
"m_activeStopTime": "",
"m_algoStrategy": null,
"m_allOrNone": false,
"m_auctionStrategy": 0,
"m_auxPrice": 0.0,
"m_basisPoints": 9223372036854775807,
"m_basisPointsType": 9223372036854775807,
...
},
"orderId": 1,
"orderState": {
"m_commission": 9223372036854775807,
"m_commissionCurrency": null,
"m_equityWithLoan": "1.7976931348623157E308",
"m_initMargin": "1.7976931348623157E308",
"m_maintMargin": "1.7976931348623157E308",
"m_maxCommission": 9223372036854775807,
"m_minCommission": 9223372036854775807,
...
}
}
],
"openOrderEnd": true,
In the source code, we see that m_maxCommission is a float(), yet the value looks like an int, and is much larger than most commissions people like paying.
Some other keys like m_equityWithLoan have string type values, but the source code says that's correct.
How do I fix the case where I'm getting large ints instead of floats? Is it possible to read the value from memory and reinterpret it as a float? Is this an Interactive Brokers API problem?
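The two odd values are recognizable constants: 9223372036854775807 is 2**63 - 1, the largest signed 64-bit integer, and 1.7976931348623157E308 is the largest finite double. One plausible reading, which is an assumption here and not confirmed anywhere in this thread, is that they are "unset" placeholders, not mis-cast floats. Under that assumption, a minimal sketch for replacing them with None before further processing could look like this (is_unset, clean and sample_attrs are hypothetical names, not part of IbPy):

```python
import sys

# Assumed sentinel values (not confirmed by the source): the largest signed
# 64-bit integer and the largest finite double, as a number or as text.
INT_SENTINEL = 2**63 - 1
FLOAT_SENTINEL = sys.float_info.max


def is_unset(value):
    """Return True if value looks like one of the assumed 'unset' markers."""
    if isinstance(value, bool):
        return False
    if isinstance(value, (int, float)):
        return value == INT_SENTINEL or value == FLOAT_SENTINEL
    if isinstance(value, str):
        try:
            return float(value) == FLOAT_SENTINEL
        except ValueError:
            return False
    return False


def clean(attrs):
    """Copy a dict of attributes, replacing assumed sentinels with None."""
    return {key: (None if is_unset(val) else val) for key, val in attrs.items()}


sample_attrs = {
    "m_action": "SELL",
    "m_auxPrice": 0.0,
    "m_commission": 9223372036854775807,
    "m_equityWithLoan": "1.7976931348623157E308",
}
cleaned = clean(sample_attrs)
print(cleaned)
```

Whether None is the right replacement depends on how the values are consumed downstream; the sketch only makes the placeholders easy to spot.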

How to turn a dataframe of categorical data into a dictionary

I have a dataframe that I need to transform into JSON. I think it would be easier to first turn it into a dictionary, but I can't figure out how. I need the JSON so that I can visualize the data with D3.js.
Here is what the data looks like currently:
NAME, CATEGORY, TAG
Ex1, Education, Books
Ex2, Transportation, Bus
Ex3, Education, Schools
Ex4, Education, Books
Ex5, Markets, Stores
Here is what I want the data to look like:
Data = {
    Education {
        Books {
            key: Ex1,
            key: Ex4
        }
        Schools {
            key: Ex3
        }
    }
    Transportation {
        Bus {
            key: Ex2
        }
    }
    Markets {
        Stores {
            key: Ex5
        }
    }
}
(I think my JSON isn't perfect here, but I just wanted to convey the general idea).
This code is thanks to Brent Washburne's very helpful answer. I just needed to remove the tags column because for now it was too messy (many of the rows had more than one tag separated by commas). I also added a column (of integers) which I wanted connected to the names. Here it is (updated to Python 3 syntax):
import json

def to_json(file):
    data = {}
    with open(file) as f:
        for line in f:
            fields = [field.strip() for field in line.split(',')]
            categories = data.get(fields[1], [])
            # fields[3] holds the added integer column, keyed by the name in fields[0]
            to_append = {fields[0]: fields[3]}
            categories.append(to_append)
            data[fields[1]] = categories
    return json.dumps(data)

print(to_json('data.csv'))
You can't use 'key' as a key more than once, so the innermost group is a list (shown here in Python 3 syntax):
import json

def to_json(file):
    data = {}
    with open(file) as f:
        for line in f:
            fields = [field.strip() for field in line.split(',')]
            # category -> tag -> list of names
            categories = data.get(fields[1], {})
            tags = categories.get(fields[2], [])
            tags.append(fields[0])
            categories[fields[2]] = tags
            data[fields[1]] = categories
    return json.dumps(data)

print(to_json('data.csv'))
Result (key order may differ on Python 3.7+, where dicts preserve insertion order):
{"Markets": {"Stores": ["Ex5"]}, "Education": {"Schools": ["Ex3"], "Books": ["Ex1", "Ex4"]}, "Transportation": {"Bus": ["Ex2"]}}
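The same grouping can be sketched with csv.reader and dict.setdefault, which avoids the explicit get/assign pairs. The rows are the sample data from the question, inlined as a string so the example runs without a file; a real script would read data.csv instead.

```python
import csv
import io
import json

# Sample rows from the question, inlined so the sketch runs without a file.
raw = """Ex1, Education, Books
Ex2, Transportation, Bus
Ex3, Education, Schools
Ex4, Education, Books
Ex5, Markets, Stores
"""

data = {}
# skipinitialspace drops the blank that follows each comma in the sample data.
for name, category, tag in csv.reader(io.StringIO(raw), skipinitialspace=True):
    # setdefault creates the nested dict/list the first time a key is seen.
    data.setdefault(category, {}).setdefault(tag, []).append(name)

print(json.dumps(data, indent=2))
```

This sketch assumes exactly three columns per row; rows with extra comma-separated tags would need different unpacking.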

Parsing json file with changeable structure in Python

I'm using the Yahoo Placemaker API, which returns JSON with a different structure depending on the input.
A simple response looks like this:
{
    'document': {
        'itemDetails': {
            'id': '0',
            'prop1': '1',
            'prop2': '2'
        },
        'other': {
            'propA': 'A',
            'propB': 'B'
        }
    }
}
When I want to access itemDetails I simply write json_file['document']['itemDetails'].
But when I get a more complicated response, such as
{
    'document': {
        '1': {
            'itemDetails': {
                'id': '1',
                'prop1': '1',
                'prop2': '2'
            }
        },
        '0': {
            'itemDetails': {
                'id': '0',
                'prop1': '1',
                'prop2': '2'
            }
        },
        '2': {
            'itemDetails': {
                'id': '1',
                'prop1': '1',
                'prop2': '2'
            }
        },
        'other': {
            'propA': 'A',
            'propB': 'B'
        }
    }
}
this obviously does not work.
I use id, prop1 and prop2 to create objects.
What would be the best approach to automatically access itemDetails in the second case without writing json_file['document']['0']['itemDetails'] ?
If I understand correctly, you want to loop through all of json_file['document']['0']['itemDetails'], json_file['document']['1']['itemDetails'], ...
If that's the case, then:
item_details = {}
for key, value in json_file['document'].items():
    # 'other' sits next to the numbered entries and has no itemDetails
    if key != 'other':
        item_details[key] = value['itemDetails']
Or, as a one-liner:
item_details = {k: v['itemDetails'] for k, v in json_file['document'].items() if k != 'other'}
Then you would access them as item_details['0'], item_details['1'], ...
Note: If you prefer integer keys (item_details[0] instead of item_details['0']), use int(key) or int(k) when building the dict.
Edit:
If you want to handle both cases seamlessly (whether there is one result or many), you could check:
if 'itemDetails' in json_file['document']:
    item_details = {'0': json_file['document']['itemDetails']}
else:
    item_details = {k: v['itemDetails'] for k, v in json_file['document'].items() if k != 'other'}
Then loop through the item_details dict.
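The check above can be wrapped in a small helper so that callers never need to know which shape came back. This is a sketch: collect_item_details is a hypothetical name, and the sample responses are trimmed versions of the structures in the question, written as Python dicts. It keeps only entries that carry an 'itemDetails' key instead of excluding 'other' by name.

```python
def collect_item_details(response):
    """Return {key: itemDetails} for both response shapes shown above."""
    document = response['document']
    if 'itemDetails' in document:
        # Single result: wrap it under the placeholder key '0'.
        return {'0': document['itemDetails']}
    # Multiple results: keep only entries that actually carry itemDetails.
    return {
        key: value['itemDetails']
        for key, value in document.items()
        if isinstance(value, dict) and 'itemDetails' in value
    }


single = {
    'document': {
        'itemDetails': {'id': '0', 'prop1': '1', 'prop2': '2'},
        'other': {'propA': 'A', 'propB': 'B'},
    }
}
multi = {
    'document': {
        '1': {'itemDetails': {'id': '1', 'prop1': '1', 'prop2': '2'}},
        '0': {'itemDetails': {'id': '0', 'prop1': '1', 'prop2': '2'}},
        'other': {'propA': 'A', 'propB': 'B'},
    }
}

for response in (single, multi):
    for key, details in sorted(collect_item_details(response).items()):
        print(key, details['id'], details['prop1'], details['prop2'])
```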
