main_dict = {
'NSE:ACC': {'average_price': 0,
'buy_quantity': 0,
'depth': {'buy': [{'orders': 0, 'price': 0, 'quantity': 0},
{'orders': 0, 'price': 0, 'quantity': 0},
{'orders': 0, 'price': 0, 'quantity': 0},
{'orders': 0, 'price': 0, 'quantity': 0},
{'orders': 0, 'price': 0, 'quantity': 0}],
'sell': [{'orders': 0, 'price': 0, 'quantity': 0},
{'orders': 0, 'price': 0, 'quantity': 0},
{'orders': 0, 'price': 0, 'quantity': 0},
{'orders': 0, 'price': 0, 'quantity': 0},
{'orders': 0, 'price': 0, 'quantity': 0}]},
'instrument_token': 5633,
'last_price': 2488.9,
'last_quantity': 0,
'last_trade_time': '2022-09-23 15:59:10',
'lower_circuit_limit': 2240.05,
'net_change': 0,
'ohlc': {'close': 2555.7,
'high': 2585.5,
'low': 2472.2,
'open': 2575},
'oi': 0,
'oi_day_high': 0,
'oi_day_low': 0,
'sell_quantity': 0,
'timestamp': '2022-09-23 18:55:17',
'upper_circuit_limit': 2737.75,
'volume': 0},
}
I want to convert this dict to a pandas DataFrame, for example:
symbol   last_price  net_change  Open  High    Low     Close
NSE:ACC  2488.9      0           2575  2585.5  2472.2  2555.7
I am trying pd.DataFrame.from_dict(main_dict),
but it does not work.
What is the best way to do this?
I would first select the necessary data from your dict and then pass that as input to pd.DataFrame()
import pandas as pd

# Pick only the fields needed for each symbol
df_input = [{
    "symbol": symbol,
    "last_price": main_dict.get(symbol).get("last_price"),
    "net_change": main_dict.get(symbol).get("net_change"),
    "open": main_dict.get(symbol).get("ohlc").get("open"),
    "high": main_dict.get(symbol).get("ohlc").get("high"),
    "low": main_dict.get(symbol).get("ohlc").get("low"),
    "close": main_dict.get(symbol).get("ohlc").get("close")
} for symbol in main_dict]

df = pd.DataFrame(df_input)
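If more nested fields are needed later, picking each one by hand gets tedious. Below is a minimal sketch of a generic alternative, assuming a hypothetical helper named flatten() (not part of pandas) and a trimmed-down copy of the quote from the question. It joins nested keys with a dot, so 'ohlc' -> 'open' becomes 'ohlc.open'.

```python
def flatten(record, parent_key="", sep="."):
    """Flatten nested dicts into one level, joining keys with `sep`."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into nested dicts such as 'ohlc'
            flat.update(flatten(value, full_key, sep))
        else:
            # Scalars and lists are kept as they are
            flat[full_key] = value
    return flat


# Illustrative subset of the data in the question
quotes = {
    "NSE:ACC": {
        "last_price": 2488.9,
        "net_change": 0,
        "ohlc": {"close": 2555.7, "high": 2585.5, "low": 2472.2, "open": 2575},
    },
}

# One flat dict per symbol
rows = [{"symbol": symbol, **flatten(quote)} for symbol, quote in quotes.items()]
print(rows)
```

Passing rows to pd.DataFrame(rows) should then give one row per symbol. pandas also provides pd.json_normalize, which performs a similar flattening on a list of dicts.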
My code is below:
r = [{'eid': '1', 'data': 'Health'},
{'eid': '2', 'data': 'countries'},
{'eid': '3', 'data': 'countries currency'},
{'eid': '4', 'data': 'countries language'}]
from elasticsearch import Elasticsearch
es = Elasticsearch()
es.cluster.health()
es.indices.create(index='my-index_1', ignore=400)
for e in enumerate(r):
    es.index(index="my-index_1", body=e[1])
search1 = es.search(index="my-index_1", body={'query': {'term' : {'data.keyword': 'Health'}}})
search1
The output the first time I run it is below:
{'took': 0,
'timed_out': False,
'_shards': {'total': 1, 'successful': 1, 'skipped': 0, 'failed': 0},
'hits': {'total': {'value': 0, 'relation': 'eq'},
'max_score': None,
'hits': []}}
Second time
{'took': 0,
'timed_out': False,
'_shards': {'total': 1, 'successful': 1, 'skipped': 0, 'failed': 0},
'hits': {'total': {'value': 1, 'relation': 'eq'},
'max_score': 1.2039728,
'hits': [{'_index': 'my-index_1',
'_type': '_doc',
'_id': 'Rov4UHMBpo0uANDoY2_5',
'_score': 1.2039728,
'_source': {'eid': '1', 'data': 'Health'}}]}}
Third time
{'took': 0,
'timed_out': False,
'_shards': {'total': 1, 'successful': 1, 'skipped': 0, 'failed': 0},
'hits': {'total': {'value': 2, 'relation': 'eq'},
'max_score': 1.2809337,
'hits': [{'_index': 'my-index_1',
'_type': '_doc',
'_id': 'Rov4UHMBpo0uANDoY2_5',
'_score': 1.2809337,
'_source': {'eid': '1', 'data': 'Health'}},
{'_index': 'my-index_1',
'_type': '_doc',
'_id': 'aov4UHMBpo0uANDonm_E',
'_score': 1.2809337,
'_source': {'eid': '1', 'data': 'Health'}}]}}
An entry like the one below keeps being repeated each time I run the code again:
{'_index': 'my-index_1',
'_type': '_doc',
'_id': 'aov4UHMBpo0uANDonm_E',
'_score': 1.2809337,
'_source': {'eid': '1', 'data': 'Health'}}
Is it because of enumerate? My input is a list of dictionaries, each with multiple keys. If enumerate is the problem, how else should I iterate over it?
My expected output is that each document shows up only once, no matter how many times I run the code.
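The outputs above are consistent with the same documents being indexed again on every run: the two hits have identical '_source' values but different '_id' values, which is what happens when es.index() is called without an id and Elasticsearch generates a new one each time. enumerate only adds a counter and is unlikely to be the cause. A minimal sketch of one way around it, assuming 'eid' is unique per document and can be reused as the document id; build_index_calls is a hypothetical helper, and the actual client calls are left as comments because they need a running cluster.

```python
r = [{'eid': '1', 'data': 'Health'},
     {'eid': '2', 'data': 'countries'},
     {'eid': '3', 'data': 'countries currency'},
     {'eid': '4', 'data': 'countries language'}]


def build_index_calls(docs, index_name):
    """Return keyword arguments for es.index(), one dict per document.

    Reusing 'eid' as the id means a second run overwrites the existing
    document instead of creating a duplicate.
    """
    return [{"index": index_name, "id": doc["eid"], "body": doc} for doc in docs]


index_calls = build_index_calls(r, "my-index_1")
for kwargs in index_calls:
    print(kwargs)
    # With the client from the question:
    # es.index(**kwargs)

# A search issued right after indexing can miss documents that have not
# been refreshed yet, which would explain the empty first result:
# es.indices.refresh(index="my-index_1")
```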
I am new to Python and JSON. I am calling an API and getting the following response body:
{'product': 'Cycle', 'available': 20, 'blocked': 0, 'orderBooked': 0, 'transfer': 0, 'restock': 0, 'unavailable': 0, 'total': 0, 'lCode': '2000112', 'locationId': '745', 'locationCode': '425', 'stockType': 'IN STOCK', 'adminStock': {'rp': 0, 'management': 0, 'rc': 0, 'total': 0, 'default': 0}, 'isBlocked': False, 'plannedDate': None, 'plannedUpdate': True, 'bookedQuantity': 0}
{'product': 'Cooker', 'available': 958, 'blocked': 10, 'orderBooked': 10, 'transfer': 30, 'restock': 0, 'unavailable': 0, 'total': 0, 'lCode': '589620', 'locationId': '420', 'locationCode': '695', 'stockType': 'PRE ORDER', 'adminStock': {'rp': 0, 'management': 0, 'rc': 0, 'total': 0, 'default': 0}, 'isBlocked': False, 'plannedDate': None, 'plannedUpdate': True, 'bookedQuantity': 0}
{'product': 'Cycle', 'available': 96220, 'blocked': 0, 'orderBooked': 0, 'transfer': 0, 'restock': 0, 'unavailable': 0, 'total': 0, 'lCode': '2000112', 'locationId': '745', 'locationCode': '425', 'stockType': 'CONFIRMED', 'adminStock': {'rp': 0, 'management': 0, 'rc': 0, 'total': 0, 'default': 0}, 'isBlocked': False, 'plannedDate': None, 'plannedUpdate': True, 'bookedQuantity': 0}
{'product': 'Lapms', 'available': 89958, 'blocked': 1890, 'orderBooked': 1045, 'transfer': 230, 'restock': 0, 'unavailable': 0, 'total': 0, 'lCode': '78963', 'locationId': '896', 'locationCode': '463', 'stockType': 'TRANSIT', 'adminStock': {'rp': 0, 'management': 0, 'rc': 0, 'total': 0, 'default': 0}, 'isBlocked': False, 'plannedDate': None, 'plannedUpdate': True, 'bookedQuantity': 0}
The data above will vary with each API request. Whatever the response is, I need to print one line of data per product. I want to read this JSON and get the following data:
Name:<'product'>, Code:<'lCode'>, Location:<'locationCode'>, Stock Type:<'stockType'>, Availability:<'available'>
So for the JSON above, the output should be:
Name:Cycle, Code:2000112, Location:425, Stock Type:IN STOCK, Availability:20
Name:Cooker, Code:589620, Location:695, Stock Type:PRE ORDER, Availability:958
Name:Cycle, Code:2000112, Location:425, Stock Type:CONFIRMED, Availability:96220
Name:Lapms, Code:78963, Location:463, Stock Type:TRANSIT, Availability:89958
The output should have as many lines as there are product entries in the response.
I don't have any idea how to parse JSON in Python. Please help me understand how I can get the data in the format above. I haven't tried anything yet, as I am stuck.
This is what I believe you want. As some comments say, these outputs should be treated as dictionaries or lists, with dictionaries and/or lists nested within them. The difference matters because a dictionary is addressed by its keys, whereas a list is addressed by index.
import pandas as pd
json_1 = {'product': 'Cycle', 'available': 20, 'blocked': 0, 'orderBooked': 0, 'transfer': 0, 'restock': 0, 'unavailable': 0, 'total': 0, 'lCode': '2000112', 'locationId': '745', 'locationCode': '425', 'stockType': 'IN STOCK', 'adminStock': {'rp': 0, 'management': 0, 'rc': 0, 'total': 0, 'default': 0}, 'isBlocked': False, 'plannedDate': None, 'plannedUpdate': True, 'bookedQuantity': 0}
json_2 = {'product': 'Cooker', 'available': 958, 'blocked': 10, 'orderBooked': 10, 'transfer': 30, 'restock': 0, 'unavailable': 0, 'total': 0, 'lCode': '589620', 'locationId': '420', 'locationCode': '695', 'stockType': 'PRE ORDER', 'adminStock': {'rp': 0, 'management': 0, 'rc': 0, 'total': 0, 'default': 0}, 'isBlocked': False, 'plannedDate': None, 'plannedUpdate': True, 'bookedQuantity': 0}
json_3 = {'product': 'Cycle', 'available': 96220, 'blocked': 0, 'orderBooked': 0, 'transfer': 0, 'restock': 0, 'unavailable': 0, 'total': 0, 'lCode': '2000112', 'locationId': '745', 'locationCode': '425', 'stockType': 'CONFIRMED', 'adminStock': {'rp': 0, 'management': 0, 'rc': 0, 'total': 0, 'default': 0}, 'isBlocked': False, 'plannedDate': None, 'plannedUpdate': True, 'bookedQuantity': 0}
json_4 = {'product': 'Lapms', 'available': 89958, 'blocked': 1890, 'orderBooked': 1045, 'transfer': 230, 'restock': 0, 'unavailable': 0, 'total': 0, 'lCode': '78963', 'locationId': '896', 'locationCode': '463', 'stockType': 'TRANSIT', 'adminStock': {'rp': 0, 'management': 0, 'rc': 0, 'total': 0, 'default': 0}, 'isBlocked': False, 'plannedDate': None, 'plannedUpdate': True, 'bookedQuantity': 0}
support_list = [json_1, json_2, json_3, json_4]
support_dict = {'Name':[],'Code':[],'Location':[],'Stock type':[],'Availability':[]}
# Each item is a dict, so its fields are read by key
for item in support_list:
    support_dict['Name'].append(item['product'])
    support_dict['Code'].append(item['lCode'])
    support_dict['Location'].append(item['locationCode'])
    support_dict['Stock type'].append(item['stockType'])
    support_dict['Availability'].append(item['available'])
df = pd.DataFrame(support_dict)
print(df)
Output:
     Name     Code Location Stock type  Availability
0   Cycle  2000112      425   IN STOCK            20
1  Cooker   589620      695  PRE ORDER           958
2   Cycle  2000112      425  CONFIRMED         96220
3   Lapms    78963      463    TRANSIT         89958
EDIT: OP says the response is a single list with multiple JSON objects in it.
The same logic applies:
import pandas as pd
json_output= [{'product': 'Cycle', 'available': 20, 'blocked': 0, 'orderBooked': 0, 'transfer': 0, 'restock': 0, 'unavailable': 0, 'total': 0, 'lCode': '2000112', 'locationId': '745', 'locationCode': '425', 'stockType': 'IN STOCK', 'adminStock': {'rp': 0, 'management': 0, 'rc': 0, 'total': 0, 'default': 0}, 'isBlocked': False, 'plannedDate': None, 'plannedUpdate': True, 'bookedQuantity': 0},{'product': 'Cooker', 'available': 958, 'blocked': 10, 'orderBooked': 10, 'transfer': 30, 'restock': 0, 'unavailable': 0, 'total': 0, 'lCode': '589620', 'locationId': '420', 'locationCode': '695', 'stockType': 'PRE ORDER', 'adminStock': {'rp': 0, 'management': 0, 'rc': 0, 'total': 0, 'default': 0}, 'isBlocked': False, 'plannedDate': None, 'plannedUpdate': True, 'bookedQuantity': 0},{'product': 'Cycle', 'available': 96220, 'blocked': 0, 'orderBooked': 0, 'transfer': 0, 'restock': 0, 'unavailable': 0, 'total': 0, 'lCode': '2000112', 'locationId': '745', 'locationCode': '425', 'stockType': 'CONFIRMED', 'adminStock': {'rp': 0, 'management': 0, 'rc': 0, 'total': 0, 'default': 0}, 'isBlocked': False, 'plannedDate': None, 'plannedUpdate': True, 'bookedQuantity': 0},{'product': 'Lapms', 'available': 89958, 'blocked': 1890, 'orderBooked': 1045, 'transfer': 230, 'restock': 0, 'unavailable': 0, 'total': 0, 'lCode': '78963', 'locationId': '896', 'locationCode': '463', 'stockType': 'TRANSIT', 'adminStock': {'rp': 0, 'management': 0, 'rc': 0, 'total': 0, 'default': 0}, 'isBlocked': False, 'plannedDate': None, 'plannedUpdate': True, 'bookedQuantity': 0}]
support_dict = {'Name':[],'Code':[],'Location':[],'Stock type':[],'Availability':[]}
for item in json_output:
    support_dict['Name'].append(item['product'])
    support_dict['Code'].append(item['lCode'])
    support_dict['Location'].append(item['locationCode'])
    support_dict['Stock type'].append(item['stockType'])
    support_dict['Availability'].append(item['available'])
df = pd.DataFrame(support_dict)
print(df)
Output:
     Name     Code Location Stock type  Availability
0   Cycle  2000112      425   IN STOCK            20
1  Cooker   589620      695  PRE ORDER           958
2   Cycle  2000112      425  CONFIRMED         96220
3   Lapms    78963      463    TRANSIT         89958
EDIT 2: If you want the output as lines:
json_output= [{'product': 'Cycle', 'available': 20, 'blocked': 0, 'orderBooked': 0, 'transfer': 0, 'restock': 0, 'unavailable': 0, 'total': 0, 'lCode': '2000112', 'locationId': '745', 'locationCode': '425', 'stockType': 'IN STOCK', 'adminStock': {'rp': 0, 'management': 0, 'rc': 0, 'total': 0, 'default': 0}, 'isBlocked': False, 'plannedDate': None, 'plannedUpdate': True, 'bookedQuantity': 0},{'product': 'Cooker', 'available': 958, 'blocked': 10, 'orderBooked': 10, 'transfer': 30, 'restock': 0, 'unavailable': 0, 'total': 0, 'lCode': '589620', 'locationId': '420', 'locationCode': '695', 'stockType': 'PRE ORDER', 'adminStock': {'rp': 0, 'management': 0, 'rc': 0, 'total': 0, 'default': 0}, 'isBlocked': False, 'plannedDate': None, 'plannedUpdate': True, 'bookedQuantity': 0},{'product': 'Cycle', 'available': 96220, 'blocked': 0, 'orderBooked': 0, 'transfer': 0, 'restock': 0, 'unavailable': 0, 'total': 0, 'lCode': '2000112', 'locationId': '745', 'locationCode': '425', 'stockType': 'CONFIRMED', 'adminStock': {'rp': 0, 'management': 0, 'rc': 0, 'total': 0, 'default': 0}, 'isBlocked': False, 'plannedDate': None, 'plannedUpdate': True, 'bookedQuantity': 0},{'product': 'Lapms', 'available': 89958, 'blocked': 1890, 'orderBooked': 1045, 'transfer': 230, 'restock': 0, 'unavailable': 0, 'total': 0, 'lCode': '78963', 'locationId': '896', 'locationCode': '463', 'stockType': 'TRANSIT', 'adminStock': {'rp': 0, 'management': 0, 'rc': 0, 'total': 0, 'default': 0}, 'isBlocked': False, 'plannedDate': None, 'plannedUpdate': True, 'bookedQuantity': 0}]
for item in json_output:
    print('Name: ' + str(item['product']) + ', Code: ' + str(item['lCode']) + ', Location: ' + str(item['locationCode']) + ', Stock type: ' + str(item['stockType']) + ', Availability: ' + str(item['available']))
Output:
Name: Cycle, Code: 2000112, Location: 425, Stock type: IN STOCK, Availability: 20
Name: Cooker, Code: 589620, Location: 695, Stock type: PRE ORDER, Availability: 958
Name: Cycle, Code: 2000112, Location: 425, Stock type: CONFIRMED, Availability: 96220
Name: Lapms, Code: 78963, Location: 463, Stock type: TRANSIT, Availability: 89958
If you parse a JSON string with json.loads (or a JSON file with json.load), a JSON object becomes a standard Python dictionary.
import json
json_data = '{"a": 1, "b": 2, "c": 3, "d": 4, "e": 5}'
parsed_json = json.loads(json_data)
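The same call handles arrays and nested objects. A minimal sketch, assuming the API returned a JSON array as text; the raw string below is a made-up, shortened version of the response in the question.

```python
import json

# JSON text uses double quotes, null, true and false
raw = ('[{"product": "Cycle", "available": 20, "plannedDate": null, '
       '"isBlocked": false, "adminStock": {"total": 0}}]')

items = json.loads(raw)

print(type(items).__name__)             # JSON array  -> Python list
print(type(items[0]).__name__)          # JSON object -> Python dict
print(items[0]["plannedDate"])          # JSON null   -> None
print(items[0]["isBlocked"])            # JSON false  -> False
print(items[0]["adminStock"]["total"])  # nested object -> nested dict
```

Note that the response printed in the question uses single quotes, None and False. That is how Python prints an object that has already been parsed, so it probably needs no further json.loads call.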
What is the correct way to use elasticsearch-py in a multiprocessing script? Should I create one client object before starting the processes and share it, or should I create a new client inside each process? The second approach gives me connection errors from Elasticsearch.
Thanks
Kiran
It seems the first method works for me, when I declare the client object as a global variable.
from multiprocessing import Pool
from elasticsearch import Elasticsearch
import time
def task(body):
    # Uses the module-level client created in the __main__ block below
    result = es.index(index='test', doc_type='test', body=body)
    return result

def main():
    pool = Pool(processes=MAX_CONNECTS)
    result = []
    for x in range(10):
        result.append(pool.apply_async(task, ({'id': x},)))
    time.sleep(1)
    for rs in result:
        print(rs.get())

if __name__ == "__main__":
    # Globals set here reach the workers only with the fork start method
    MAX_CONNECTS = 5
    es = Elasticsearch(hosts="localhost", maxsize=MAX_CONNECTS)
    main()
The output looks like
{'_index': 'test', '_type': 'test', '_id': 'xEjqBWcB9xsUYKqz-P6U', '_version': 1, 'result': 'created', '_shards': {'total': 2, 'successful': 1, 'failed': 0}, '_seq_no': 1, '_primary_term': 1}
{'_index': 'test', '_type': 'test', '_id': 'w0jqBWcB9xsUYKqz-P6U', '_version': 1, 'result': 'created', '_shards': {'total': 2, 'successful': 1, 'failed': 0}, '_seq_no': 0, '_primary_term': 1}
{'_index': 'test', '_type': 'test', '_id': 'x0jqBWcB9xsUYKqz-P6X', '_version': 1, 'result': 'created', '_shards': {'total': 2, 'successful': 1, 'failed': 0}, '_seq_no': 4, '_primary_term': 1}
{'_index': 'test', '_type': 'test', '_id': 'xkjqBWcB9xsUYKqz-P6X', '_version': 1, 'result': 'created', '_shards': {'total': 2, 'successful': 1, 'failed': 0}, '_seq_no': 3, '_primary_term': 1}
{'_index': 'test', '_type': 'test', '_id': 'xUjqBWcB9xsUYKqz-P6W', '_version': 1, 'result': 'created', '_shards': {'total': 2, 'successful': 1, 'failed': 0}, '_seq_no': 2, '_primary_term': 1}
{'_index': 'test', '_type': 'test', '_id': 'yEjqBWcB9xsUYKqz-P66', '_version': 1, 'result': 'created', '_shards': {'total': 2, 'successful': 1, 'failed': 0}, '_seq_no': 4, '_primary_term': 1}
{'_index': 'test', '_type': 'test', '_id': 'ykjqBWcB9xsUYKqz-P7I', '_version': 1, 'result': 'created', '_shards': {'total': 2, 'successful': 1, 'failed': 0}, '_seq_no': 2, '_primary_term': 1}
{'_index': 'test', '_type': 'test', '_id': 'yUjqBWcB9xsUYKqz-P7I', '_version': 1, 'result': 'created', '_shards': {'total': 2, 'successful': 1, 'failed': 0}, '_seq_no': 3, '_primary_term': 1}
{'_index': 'test', '_type': 'test', '_id': 'y0jqBWcB9xsUYKqz-P7P', '_version': 1, 'result': 'created', '_shards': {'total': 2, 'successful': 1, 'failed': 0}, '_seq_no': 4, '_primary_term': 1}
{'_index': 'test', '_type': 'test', '_id': 'zEjqBWcB9xsUYKqz-P7V', '_version': 1, 'result': 'created', '_shards': {'total': 2, 'successful': 1, 'failed': 0}, '_seq_no': 5, '_primary_term': 1}
The recommended way is to create a single client object and share it. You can increase the number of simultaneous connections available to your threads with the maxsize parameter (10 by default).
es = Elasticsearch("host1", maxsize=25)
Source
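The single-shared-client pattern can be sketched with threads as follows. This is only an illustration: StubClient is a hypothetical stand-in so the example runs without a cluster, and it assumes the real client is safe to share between threads. With a real cluster the client line would be the Elasticsearch(...) call from this answer.

```python
from concurrent.futures import ThreadPoolExecutor
import threading


class StubClient:
    """Hypothetical stand-in for a thread-safe client."""

    def __init__(self):
        self._lock = threading.Lock()
        self._next_id = 0

    def index(self, index, body):
        # Hand out ids under a lock so concurrent calls never collide
        with self._lock:
            doc_id = self._next_id
            self._next_id += 1
        return {"_index": index, "_id": doc_id, "result": "created", "body": body}


client = StubClient()  # created once, shared by every worker thread


def task(body):
    return client.index(index="test", body=body)


with ThreadPoolExecutor(max_workers=5) as pool:
    # pool.map returns results in the order the inputs were given
    results = list(pool.map(task, [{"id": x} for x in range(10)]))

print(len(results))
```

With the real client, keeping max_workers at or below maxsize means each worker thread can hold its own connection to a node.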