Why index keep on creating while hitting the api in elastic search - python

code is below
r = [{'eid': '1', 'data': 'Health'},
{'eid': '2', 'data': 'countries'},
{'eid': '3', 'data': 'countries currency'},
{'eid': '4', 'data': 'countries language'}]
from elasticsearch import Elasticsearch
es = Elasticsearch()
es.cluster.health()
es.indices.create(index='my-index_1', ignore=400)
for e in enumerate(r):
es.index(index="my-index_1", body=e[1])
search1 = es.search(index="my-index_1", body={'query': {'term' : {'data.keyword': 'Health'}}})
search1
First time out is below
{'took': 0,
'timed_out': False,
'_shards': {'total': 1, 'successful': 1, 'skipped': 0, 'failed': 0},
'hits': {'total': {'value': 0, 'relation': 'eq'},
'max_score': None,
'hits': []}}
Second time
{'took': 0,
'timed_out': False,
'_shards': {'total': 1, 'successful': 1, 'skipped': 0, 'failed': 0},
'hits': {'total': {'value': 1, 'relation': 'eq'},
'max_score': 1.2039728,
'hits': [{'_index': 'my-index_1',
'_type': '_doc',
'_id': 'Rov4UHMBpo0uANDoY2_5',
'_score': 1.2039728,
'_source': {'eid': '1', 'data': 'Health'}}]}}
Third time
{'took': 0,
'timed_out': False,
'_shards': {'total': 1, 'successful': 1, 'skipped': 0, 'failed': 0},
'hits': {'total': {'value': 2, 'relation': 'eq'},
'max_score': 1.2809337,
'hits': [{'_index': 'my-index_1',
'_type': '_doc',
'_id': 'Rov4UHMBpo0uANDoY2_5',
'_score': 1.2809337,
'_source': {'eid': '1', 'data': 'Health'}},
{'_index': 'my-index_1',
'_type': '_doc',
'_id': 'aov4UHMBpo0uANDonm_E',
'_score': 1.2809337,
'_source': {'eid': '1', 'data': 'Health'}}]}}
​Below tag are keep on repating while hitting again and again
{'_index': 'my-index_1',
'_type': '_doc',
'_id': 'aov4UHMBpo0uANDonm_E',
'_score': 1.2809337,
'_source': {'eid': '1', 'data': 'Health'}}
Is it because of enumerate?. My input is list of dictionary then which having multiple keys, otherwise how to parse this?
My expected out is it should show only one time for every hit
?

Related

Iterating over JSON data and printing. (or creating Pandas DataFrame from JSON file)

I’m trying to use Python print specific values from a JSON file that I pulled from an API. From what I understand, I am pulling it as a JSON file that has a list of dictionaries of players, with a nested dictionary for each player containing their data (i.e. name, team, etc.).
I’m running into issues printing the values within the JSON file, as each character is printing on a separate line.
The end result I am trying to get to is a Pandas DataFrame containing all the values from the JSON file, but I can’t even seem to iterate through the JSON file correctly.
Here is my code:
url = "https://api-football-v1.p.rapidapi.com/v3/players"
querystring = {"league":"39","season":"2020", "page":"2"}
headers = {
"X-RapidAPI-Host": "api-football-v1.p.rapidapi.com",
"X-RapidAPI-Key": "xxxxxkeyxxxxx"
}
response = requests.request("GET", url, headers=headers, params=querystring).json()
response_dump = json.dumps(response)
for item in response_dump:
for player_item in item:
print(player_item)
This is the output when I print the JSON response (first two items):
{'get': 'players', 'parameters': {'league': '39', 'page': '2', 'season': '2020'}, 'errors': [], 'results': 20, 'paging': {'current': 2, 'total': 37}, 'response': [{'player': {'id': 301, 'name': 'Benjamin Luke Woodburn', 'firstname': 'Benjamin Luke', 'lastname': 'Woodburn', 'age': 23, 'birth': {'date': '1999-10-15', 'place': 'Nottingham', 'country': 'England'}, 'nationality': 'Wales', 'height': '174 cm', 'weight': '72 kg', 'injured': False, 'photo': 'https://media.api-sports.io/football/players/301.png'}, 'statistics': [{'team': {'id': 40, 'name': 'Liverpool', 'logo': 'https://media.api-sports.io/football/teams/40.png'}, 'league': {'id': 39, 'name': 'Premier League', 'country': 'England', 'logo': 'https://media.api-sports.io/football/leagues/39.png', 'flag': 'https://media.api-sports.io/flags/gb.svg', 'season': 2020}, 'games': {'appearences': 0, 'lineups': 0, 'minutes': 0, 'number': None, 'position': 'Attacker', 'rating': None, 'captain': False}, 'substitutes': {'in': 0, 'out': 0, 'bench': 3}, 'shots': {'total': None, 'on': None}, 'goals': {'total': 0, 'conceded': 0, 'assists': None, 'saves': None}, 'passes': {'total': None, 'key': None, 'accuracy': None}, 'tackles': {'total': None, 'blocks': None, 'interceptions': None}, 'duels': {'total': None, 'won': None}, 'dribbles': {'attempts': None, 'success': None, 'past': None}, 'fouls': {'drawn': None, 'committed': None}, 'cards': {'yellow': 0, 'yellowred': 0, 'red': 0}, 'penalty': {'won': None, 'commited': None, 'scored': 0, 'missed': 0, 'saved': None}}]}, {'player': {'id': 518, 'name': 'Meritan Shabani', 'firstname': 'Meritan', 'lastname': 'Shabani', 'age': 23, 'birth': {'date': '1999-03-15', 'place': 'München', 'country': 'Germany'}, 'nationality': 'Germany', 'height': '185 cm', 'weight': '78 kg', 'injured': False, 'photo': 'https://media.api-sports.io/football/players/518.png'}, 'statistics': [{'team': {'id': 39, 'name': 'Wolves', 'logo': 'https://media.api-sports.io/football/teams/39.png'}, 'league': {'id': 39, 'name': 'Premier League', 'country': 'England', 'logo': 'https://media.api-sports.io/football/leagues/39.png', 'flag': 'https://media.api-sports.io/flags/gb.svg', 'season': 2020}, 'games': {'appearences': 0, 'lineups': 0, 'minutes': 0, 'number': None, 'position': 'Midfielder', 'rating': None, 'captain': False}, 'substitutes': {'in': 0, 'out': 0, 'bench': 3}, 'shots': {'total': None, 'on': None}, 'goals': {'total': 0, 'conceded': 0, 'assists': None, 'saves': None}, 'passes': {'total': None, 'key': None, 'accuracy': None}, 'tackles': {'total': None, 'blocks': None, 'interceptions': None}, 'duels': {'total': None, 'won': None}, 'dribbles': {'attempts': None, 'success': None, 'past': None}, 'fouls': {'drawn': None, 'committed': None}, 'cards': {'yellow': 0, 'yellowred': 0, 'red': 0}, 'penalty': {'won': None, 'commited': None, 'scored': 0, 'missed': 0, 'saved': None}}]},
This is the data type of each layer of the JSON file, from when I iterated through it with a For loop:
print(type(response)) <class 'dict'>
print(type(response_dump)) <class 'str'>
print(type(item)) <class 'str'>
print(type(player_item)) <class 'str'>
You do not have to json.dumps() in my opinion, just use the JSON from response to iterate:
for player in response['response']:
print(player)
{'player': {'id': 301, 'name': 'Benjamin Luke Woodburn', 'firstname': 'Benjamin Luke', 'lastname': 'Woodburn', 'age': 23, 'birth': {'date': '1999-10-15', 'place': 'Nottingham', 'country': 'England'}, 'nationality': 'Wales', 'height': '174 cm', 'weight': '72 kg', 'injured': False, 'photo': 'https://media.api-sports.io/football/players/301.png'}, 'statistics': [{'team': {'id': 40, 'name': 'Liverpool', 'logo': 'https://media.api-sports.io/football/teams/40.png'}, 'league': {'id': 39, 'name': 'Premier League', 'country': 'England', 'logo': 'https://media.api-sports.io/football/leagues/39.png', 'flag': 'https://media.api-sports.io/flags/gb.svg', 'season': 2020}, 'games': {'appearences': 0, 'lineups': 0, 'minutes': 0, 'number': None, 'position': 'Attacker', 'rating': None, 'captain': False}, 'substitutes': {'in': 0, 'out': 0, 'bench': 3}, 'shots': {'total': None, 'on': None}, 'goals': {'total': 0, 'conceded': 0, 'assists': None, 'saves': None}, 'passes': {'total': None, 'key': None, 'accuracy': None}, 'tackles': {'total': None, 'blocks': None, 'interceptions': None}, 'duels': {'total': None, 'won': None}, 'dribbles': {'attempts': None, 'success': None, 'past': None}, 'fouls': {'drawn': None, 'committed': None}, 'cards': {'yellow': 0, 'yellowred': 0, 'red': 0}, 'penalty': {'won': None, 'commited': None, 'scored': 0, 'missed': 0, 'saved': None}}]}
{'player': {'id': 518, 'name': 'Meritan Shabani', 'firstname': 'Meritan', 'lastname': 'Shabani', 'age': 23, 'birth': {'date': '1999-03-15', 'place': 'München', 'country': 'Germany'}, 'nationality': 'Germany', 'height': '185 cm', 'weight': '78 kg', 'injured': False, 'photo': 'https://media.api-sports.io/football/players/518.png'}, 'statistics': [{'team': {'id': 39, 'name': 'Wolves', 'logo': 'https://media.api-sports.io/football/teams/39.png'}, 'league': {'id': 39, 'name': 'Premier League', 'country': 'England', 'logo': 'https://media.api-sports.io/football/leagues/39.png', 'flag': 'https://media.api-sports.io/flags/gb.svg', 'season': 2020}, 'games': {'appearences': 0, 'lineups': 0, 'minutes': 0, 'number': None, 'position': 'Midfielder', 'rating': None, 'captain': False}, 'substitutes': {'in': 0, 'out': 0, 'bench': 3}, 'shots': {'total': None, 'on': None}, 'goals': {'total': 0, 'conceded': 0, 'assists': None, 'saves': None}, 'passes': {'total': None, 'key': None, 'accuracy': None}, 'tackles': {'total': None, 'blocks': None, 'interceptions': None}, 'duels': {'total': None, 'won': None}, 'dribbles': {'attempts': None, 'success': None, 'past': None}, 'fouls': {'drawn': None, 'committed': None}, 'cards': {'yellow': 0, 'yellowred': 0, 'red': 0}, 'penalty': {'won': None, 'commited': None, 'scored': 0, 'missed': 0, 'saved': None}}]}
or
for player in response['response']:
print(player['player'])
{'id': 301, 'name': 'Benjamin Luke Woodburn', 'firstname': 'Benjamin Luke', 'lastname': 'Woodburn', 'age': 23, 'birth': {'date': '1999-10-15', 'place': 'Nottingham', 'country': 'England'}, 'nationality': 'Wales', 'height': '174 cm', 'weight': '72 kg', 'injured': False, 'photo': 'https://media.api-sports.io/football/players/301.png'}
{'id': 518, 'name': 'Meritan Shabani', 'firstname': 'Meritan', 'lastname': 'Shabani', 'age': 23, 'birth': {'date': '1999-03-15', 'place': 'München', 'country': 'Germany'}, 'nationality': 'Germany', 'height': '185 cm', 'weight': '78 kg', 'injured': False, 'photo': 'https://media.api-sports.io/football/players/518.png'}
To get a DataFrame simply call pd.json_normalize() - Cause your question is not that clear I am not sure wiche information is needed and how to displayed. This is predestinated to ask a new question with exact that focus.:
pd.json_normalize(response['response'])
EDIT
Based on your comment and improvment:
pd.concat([pd.json_normalize(response,['response'])\
,pd.json_normalize(response,['response','statistics'])], axis=1)\
.drop(['statistics'], axis=1)
player.id
player.name
player.firstname
player.lastname
player.age
player.birth.date
player.birth.place
player.birth.country
player.nationality
player.height
player.weight
player.injured
player.photo
team.id
team.name
team.logo
league.id
league.name
league.country
league.logo
league.flag
league.season
games.appearences
games.lineups
games.minutes
games.number
games.position
games.rating
games.captain
substitutes.in
substitutes.out
substitutes.bench
shots.total
shots.on
goals.total
goals.conceded
goals.assists
goals.saves
passes.total
passes.key
passes.accuracy
tackles.total
tackles.blocks
tackles.interceptions
duels.total
duels.won
dribbles.attempts
dribbles.success
dribbles.past
fouls.drawn
fouls.committed
cards.yellow
cards.yellowred
cards.red
penalty.won
penalty.commited
penalty.scored
penalty.missed
penalty.saved
0
301
Benjamin Luke Woodburn
Benjamin Luke
Woodburn
23
1999-10-15
Nottingham
England
Wales
174 cm
72 kg
False
https://media.api-sports.io/football/players/301.png
40
Liverpool
https://media.api-sports.io/football/teams/40.png
39
Premier League
England
https://media.api-sports.io/football/leagues/39.png
https://media.api-sports.io/flags/gb.svg
2020
0
0
0
Attacker
False
0
0
3
0
0
0
0
0
0
0
1
518
Meritan Shabani
Meritan
Shabani
23
1999-03-15
München
Germany
Germany
185 cm
78 kg
False
https://media.api-sports.io/football/players/518.png
39
Wolves
https://media.api-sports.io/football/teams/39.png
39
Premier League
England
https://media.api-sports.io/football/leagues/39.png
https://media.api-sports.io/flags/gb.svg
2020
0
0
0
Midfielder
False
0
0
3
0
0
0
0
0
0
0

How to convert a column of dictionaries to separate columns in pandas?

Given the following dictionary created from df['statistics'].head().to_dict()
{0: {'executions': {'total': '1',
'passed': '1',
'failed': '0',
'skipped': '0'},
'defects': {'product_bug': {'total': 0, 'PB001': 0},
'automation_bug': {'AB001': 0, 'total': 0},
'system_issue': {'total': 0, 'SI001': 0},
'to_investigate': {'total': 0, 'TI001': 0},
'no_defect': {'ND001': 0, 'total': 0}}},
1: {'executions': {'total': '1',
'passed': '1',
'failed': '0',
'skipped': '0'},
'defects': {'product_bug': {'total': 0, 'PB001': 0},
'automation_bug': {'AB001': 0, 'total': 0},
'system_issue': {'total': 0, 'SI001': 0},
'to_investigate': {'total': 0, 'TI001': 0},
'no_defect': {'ND001': 0, 'total': 0}}},
2: {'executions': {'total': '1',
'passed': '1',
'failed': '0',
'skipped': '0'},
'defects': {'product_bug': {'total': 0, 'PB001': 0},
'automation_bug': {'AB001': 0, 'total': 0},
'system_issue': {'total': 0, 'SI001': 0},
'to_investigate': {'total': 0, 'TI001': 0},
'no_defect': {'ND001': 0, 'total': 0}}},
3: {'executions': {'total': '1',
'passed': '1',
'failed': '0',
'skipped': '0'},
'defects': {'product_bug': {'total': 0, 'PB001': 0},
'automation_bug': {'AB001': 0, 'total': 0},
'system_issue': {'total': 0, 'SI001': 0},
'to_investigate': {'total': 0, 'TI001': 0},
'no_defect': {'ND001': 0, 'total': 0}}},
4: {'executions': {'total': '1',
'passed': '1',
'failed': '0',
'skipped': '0'},
'defects': {'product_bug': {'total': 0, 'PB001': 0},
'automation_bug': {'AB001': 0, 'total': 0},
'system_issue': {'total': 0, 'SI001': 0},
'to_investigate': {'total': 0, 'TI001': 0},
'no_defect': {'ND001': 0, 'total': 0}}}}
Is there a way to expand the dictionary key/value pairs into their own columns and prefix these columns with the name of the original column, i.e. statisistics.executions.total would become statistics_executions_total or even executions_total?
I have demonstrated that I can create the columns using the following:
pd.concat([df.drop(['statistics'], axis=1), df['statistics'].apply(pd.Series)], axis=1)
However, you will notice that each of these newly created columns have a duplicate name "total".
I; however, have not been able to find a way to prefix the newly created columns with the original column name, i.e. executions_total.
For additional insight, statistics will expand into executions and defects and executions will expand into pass | fail | skipped | total and defects will expand into automation_bug | system_issue | to_investigate | product_bug | no_defect. The later will then expand into total | **001 columns where total is duplicated several times.
Any ideas are greatly appreciated. -Thanks!
.apply(pd.Series) is slow, don't use it.
See timing in Splitting dictionary/list inside a Pandas Column into Separate Columns
Create a DataFrame with a 'statistics' column from the dict in the OP.
This will create a DataFrame with a column of dictionaries.
Use pandas.json_normalize on the 'statistics' column.
The default sep is ..
Nested records will generate names separated by sep.
import pandas as pd
# this is for setting up the test dataframe from the data in the question, where data is the name of the dict
df = pd.DataFrame({'statistics': [v for v in data.values()]})
# display(df)
statistics
0 {'executions': {'total': '1', 'passed': '1', 'failed': '0', 'skipped': '0'}, 'defects': {'product_bug': {'total': 0, 'PB001': 0}, 'automation_bug': {'AB001': 0, 'total': 0}, 'system_issue': {'total': 0, 'SI001': 0}, 'to_investigate': {'total': 0, 'TI001': 0}, 'no_defect': {'ND001': 0, 'total': 0}}}
1 {'executions': {'total': '1', 'passed': '1', 'failed': '0', 'skipped': '0'}, 'defects': {'product_bug': {'total': 0, 'PB001': 0}, 'automation_bug': {'AB001': 0, 'total': 0}, 'system_issue': {'total': 0, 'SI001': 0}, 'to_investigate': {'total': 0, 'TI001': 0}, 'no_defect': {'ND001': 0, 'total': 0}}}
2 {'executions': {'total': '1', 'passed': '1', 'failed': '0', 'skipped': '0'}, 'defects': {'product_bug': {'total': 0, 'PB001': 0}, 'automation_bug': {'AB001': 0, 'total': 0}, 'system_issue': {'total': 0, 'SI001': 0}, 'to_investigate': {'total': 0, 'TI001': 0}, 'no_defect': {'ND001': 0, 'total': 0}}}
3 {'executions': {'total': '1', 'passed': '1', 'failed': '0', 'skipped': '0'}, 'defects': {'product_bug': {'total': 0, 'PB001': 0}, 'automation_bug': {'AB001': 0, 'total': 0}, 'system_issue': {'total': 0, 'SI001': 0}, 'to_investigate': {'total': 0, 'TI001': 0}, 'no_defect': {'ND001': 0, 'total': 0}}}
4 {'executions': {'total': '1', 'passed': '1', 'failed': '0', 'skipped': '0'}, 'defects': {'product_bug': {'total': 0, 'PB001': 0}, 'automation_bug': {'AB001': 0, 'total': 0}, 'system_issue': {'total': 0, 'SI001': 0}, 'to_investigate': {'total': 0, 'TI001': 0}, 'no_defect': {'ND001': 0, 'total': 0}}}
# normalize the statistics column
dfs = pd.json_normalize(df.statistics)
# display(dfs)
total passed failed skipped product_bug.total product_bug.PB001 automation_bug.AB001 automation_bug.total system_issue.total system_issue.SI001 to_investigate.total to_investigate.TI001 no_defect.ND001 no_defect.total
0 1 1 0 0 0 0 0 0 0 0 0 0 0 0
1 1 1 0 0 0 0 0 0 0 0 0 0 0 0
2 1 1 0 0 0 0 0 0 0 0 0 0 0 0
3 1 1 0 0 0 0 0 0 0 0 0 0 0 0
4 1 1 0 0 0 0 0 0 0 0 0 0 0 0

Elasticsearch [5.6] Sum Aggregations Losing Precision?

I'm trying to do a sum aggregation of amounts ($21.28 for example). But in the aggregation result, it only shows (21.0)
I've also tried changing the type of the mapping to float, and I got the same results.
The query itself looks like:
'aggs': {
'total': {
'sum': {
'field': 'amount'
}
}
}
And the mapping looks like:
'amount': {
'type': 'double',
'index': 'not_analyzed',
'store': False
},
And finally, here's the result, I've omitted some data but the important bits are the amounts:
{
'took': 3,
'aggregations': {'total': {'value': 21.0}},
'hits': {'total': 1, 'max_score': 0.51623213,
'hits': [
{
'_score': 0.51623213,
'_index': 'some_index',
'_type': 'donation',
'_source': {
'amount': 21.28,
'created_on': '2019-06-15T01:37:42.451249+00:00'
}
}
]},
'timed_out': False,
'_shards': {'total': 5, 'successful': 5, 'skipped': 0, 'failed': 0}
}
I'd expect to see 21.28 in the results, and not 21.0.
My solution was to update to Elasticsearch 6.x

Getting 500 Server Error when using Python Requests.put on Myjson.com API

Python Newbie here. I am using http://myjson.com/api to host a json file store and am having trouble using Python to PUT using their API.
I have a dict called budget that I would like to PUT to a specific API endpoint. It says to "Send updated JSON as String".
This the response when I type budget which I believe is a dict?:
{'spend': [{'isTransfer': False, 'catTypeFilter': 'Personal', 'rbal': 0.0, 'st'
1, 'isIncome': False, 'ex': False, 'isExpense': True, 'cat': 'Auto Insurance',
'amt': 150.0, 'date': '02/01/2016', 'aamt': 0.0, 'bgt': 150.0, 'type': 1, 'tbgt
: 1800.0, 'islast': False, 'id': 65369859, 'ramt': 0, 'period': 12, 'pid': 14},
{'isTransfer': False, 'rbal': -929.44, 'st': 3, 'ex': False, 'isExpense': True,
'cat': 'Auto Payment', 'amt': 1816.44, 'bgt': 887.0, 'type': 0, 'isIncome': Fal
e, 'catTypeFilter': 'Personal', 'id': 65369856, 'ramt': 0, 'pid': 14}, {'isTran
fer': False, 'rbal': 183.31, 'st': 1, 'ex': False, 'isExpense': True, 'cat': 'G
s & Fuel', 'amt': 266.69, 'bgt': 450.0, 'type': 0, 'isIncome': False, 'catTypeF
lter': 'Personal', 'id': 65369851, 'ramt': 0, 'pid': 14}, {'roll': True, 'isTra
sfer': False, 'rbal': -188.32, 'st': 3, 'isIncome': False, 'ex': False, 'isExpe
se': True, 'cat': 'Service & Parts', 'amt': 238.32, 'bgt': 50.0, 'type': 0, 'ca
TypeFilter': 'Personal', 'id': 65369855, 'ramt': 238.32, 'pid': 14}, {'isTransf
r': False, 'rbal': -14.99, 'st': 3, 'ex': False, 'isExpense': True, 'cat': 'Int
rnet', 'amt': 49.99, 'bgt': 35.0, 'type': 0, 'isIncome': False, 'catTypeFilter'
'Personal', 'id': 65369845, 'ramt': 0, 'pid': 13}, {'isTransfer': False, 'rbal
: -88.04, 'st': 3, 'ex': False, 'isExpense': True, 'cat': 'Mobile Phone', 'amt'
245.04, 'bgt': 157.0, 'type': 0, 'isIncome': False, 'catTypeFilter': 'Personal
, 'id': 65369858, 'ramt': 0, 'pid': 13}, {'isTransfer': False, 'rbal': 80.63, '
t': 1, 'ex': False, 'isExpense': True, 'cat': 'Utilities', 'amt': 74.37, 'bgt':
155.0, 'type': 0, 'isIncome': False, 'catTypeFilter': 'Personal', 'id': 6536985
, 'ramt': 0, 'pid': 13}, {'isTransfer': False, 'rbal': 74.0, 'st': 1, 'ex': Fal
e, 'isExpense': True, 'cat': 'Life Insurance', 'amt': 0, 'bgt': 74.0, 'type': 0
'isIncome': False, 'catTypeFilter': 'Personal', 'id': 65369843, 'ramt': 0, 'pi
': 11}, {'isTransfer': False, 'rbal': -31.55, 'st': 3, 'ex': False, 'isExpense'
True, 'cat': 'Groceries', 'amt': 831.55, 'bgt': 800.0, 'type': 0, 'isIncome':
alse, 'catTypeFilter': 'Personal', 'id': 65369842, 'ramt': 0, 'pid': 7}, {'isTr
nsfer': False, 'rbal': 6.73, 'st': 2, 'ex': False, 'isExpense': True, 'cat': 'R
staurants', 'amt': 93.27, 'bgt': 100.0, 'type': 0, 'isIncome': False, 'catTypeF
lter': 'Personal', 'id': 65369840, 'ramt': 0, 'pid': 7}, {'isTransfer': False,
rbal': 38.0, 'st': 1, 'ex': False, 'isExpense': True, 'cat': 'Charity', 'amt':
, 'bgt': 38.0, 'type': 0, 'isIncome': False, 'catTypeFilter': 'Personal', 'id':
65369860, 'ramt': 0, 'pid': 8}, {'isTransfer': False, 'rbal': 60.0, 'st': 1, 'e
': False, 'isExpense': True, 'cat': 'Gift', 'amt': 0, 'bgt': 60.0, 'type': 0, '
sIncome': False, 'catTypeFilter': 'Personal', 'id': 65369857, 'ramt': 0, 'pid':
8}, {'isTransfer': False, 'rbal': 560.0, 'st': 1, 'ex': False, 'isExpense': Tru
, 'cat': 'Tithes and Offerings', 'amt': 0, 'bgt': 560.0, 'type': 0, 'isIncome':
False, 'catTypeFilter': 'Personal', 'id': 65369846, 'ramt': 0, 'pid': 8}, {'isT
ansfer': False, 'rbal': -12.0, 'st': 3, 'ex': False, 'isExpense': True, 'cat':
Unknown', 'amt': 112, 'bgt': 100.0, 'type': 0, 'isIncome': False, 'catTypeFilte
': 'Personal', 'id': 65369853, 'ramt': 0, 'pid': None}, {'roll': True, 'isTrans
er': False, 'rbal': 1150.0, 'st': 1, 'isIncome': False, 'ex': False, 'isExpense
: True, 'cat': 'Home Insurance', 'amt': -1100.0, 'bgt': 50.0, 'type': 0, 'catTy
eFilter': 'Personal', 'id': 65369848, 'ramt': -1100.0, 'pid': 12}, {'isTransfer
: False, 'rbal': 12.0, 'st': 1, 'ex': False, 'isExpense': True, 'cat': 'Home Se
vices', 'amt': 0, 'bgt': 12.0, 'type': 0, 'isIncome': False, 'catTypeFilter': '
ersonal', 'id': 65369849, 'ramt': 0, 'pid': 12}, {'isTransfer': False, 'rbal':
.09, 'st': 2, 'ex': False, 'isExpense': True, 'cat': 'Mortgage & Rent', 'amt':
395.91, 'bgt': 2400.0, 'type': 0, 'isIncome': False, 'catTypeFilter': 'Personal
, 'id': 65369854, 'ramt': 0, 'pid': 12}, {'isTransfer': False, 'rbal': -35.97,
st': 3, 'ex': False, 'isExpense': True, 'cat': 'Unknown', 'amt': 185.97, 'bgt':
150.0, 'type': 0, 'isIncome': False, 'catTypeFilter': 'Personal', 'id': 6536984
, 'ramt': 0, 'pid': None}, {'isTransfer': False, 'rbal': 75.0, 'st': 1, 'ex': F
lse, 'isExpense': True, 'cat': 'Hair', 'amt': 0, 'bgt': 75.0, 'type': 0, 'isInc
me': False, 'catTypeFilter': 'Personal', 'id': 65369861, 'ramt': 0, 'pid': 4},
'isTransfer': False, 'rbal': -15.57, 'st': 3, 'ex': False, 'isExpense': True, '
at': 'Unknown', 'amt': 165.57, 'bgt': 150.0, 'type': 0, 'isIncome': False, 'cat
ypeFilter': 'Personal', 'id': 65369844, 'ramt': 0, 'pid': None}, {'roll': True,
'isTransfer': False, 'rbal': 14122.0, 'st': 1, 'isIncome': False, 'ex': False,
isExpense': True, 'cat': 'Property Tax', 'amt': -13508.0, 'bgt': 614.0, 'type':
0, 'catTypeFilter': 'Personal', 'id': 65369841, 'ramt': -13508.0, 'pid': 19}],
income': [{'isTransfer': False, 'rbal': 1319.87, 'st': 1, 'ex': False, 'isExpen
e': False, 'cat': 'Unknown', 'amt': 6680.13, 'bgt': 8000.0, 'type': 0, 'isIncom
': True, 'catTypeFilter': 'Personal', 'id': 65369852, 'ramt': 0, 'pid': None}]}
This is my script:
requests.put('https://api.myjson.com/bins/1quvq', data=str(budget))
But I keep getting a 500 Error:
b'{"status":500,"message":"Internal Server Error","more_info":"undefined method`string\' for #\\u003cPhusionPassenger::Utils::TeeInput:0x007f83edc7eac0\\u003e"}'
What am I doing wrong?
Ok this works for me: -
import requests
budget = {'s': 'd'}
r = requests.put('https://api.myjson.com/bins/1quvq', json=budget)
print r.content
If you are still having a problem with your data you need to include your full data in the question. Your dict seems to have been truncated at the edges.

elasticsearch-py and multiprocessing

What is the correct way to use elasticsearch-py in multiprocessing script? Should I create a new client object before start processes and use that object or should I create a new object inside each of the processes. The 2nd one gives me an an error with connection issues from elasticsearch
Thanks
Kiran
It seems the first method works for me, when I declare the client object as a global variable.
from multiprocessing import Pool
from elasticsearch import Elasticsearch
import time
def task(body):
result = es.index(index='test', doc_type='test', body=body)
return result
def main():
pool = Pool(processes=MAX_CONNECTS)
result = []
for x in range(10):
result.append(pool.apply_async(task, ({'id': x},)))
time.sleep(1)
for rs in result:
print(rs.get())
if __name__ == "__main__":
MAX_CONNECTS = 5
es = Elasticsearch(hosts="localhost", maxsize=MAX_CONNECTS)
main()
The output looks like
{'_index': 'test', '_type': 'test', '_id': 'xEjqBWcB9xsUYKqz-P6U', '_version': 1, 'result': 'created', '_shards': {'total': 2, 'successful': 1, 'failed': 0}, '_seq_no': 1, '_primary_term': 1}
{'_index': 'test', '_type': 'test', '_id': 'w0jqBWcB9xsUYKqz-P6U', '_version': 1, 'result': 'created', '_shards': {'total': 2, 'successful': 1, 'failed': 0}, '_seq_no': 0, '_primary_term': 1}
{'_index': 'test', '_type': 'test', '_id': 'x0jqBWcB9xsUYKqz-P6X', '_version': 1, 'result': 'created', '_shards': {'total': 2, 'successful': 1, 'failed': 0}, '_seq_no': 4, '_primary_term': 1}
{'_index': 'test', '_type': 'test', '_id': 'xkjqBWcB9xsUYKqz-P6X', '_version': 1, 'result': 'created', '_shards': {'total': 2, 'successful': 1, 'failed': 0}, '_seq_no': 3, '_primary_term': 1}
{'_index': 'test', '_type': 'test', '_id': 'xUjqBWcB9xsUYKqz-P6W', '_version': 1, 'result': 'created', '_shards': {'total': 2, 'successful': 1, 'failed': 0}, '_seq_no': 2, '_primary_term': 1}
{'_index': 'test', '_type': 'test', '_id': 'yEjqBWcB9xsUYKqz-P66', '_version': 1, 'result': 'created', '_shards': {'total': 2, 'successful': 1, 'failed': 0}, '_seq_no': 4, '_primary_term': 1}
{'_index': 'test', '_type': 'test', '_id': 'ykjqBWcB9xsUYKqz-P7I', '_version': 1, 'result': 'created', '_shards': {'total': 2, 'successful': 1, 'failed': 0}, '_seq_no': 2, '_primary_term': 1}
{'_index': 'test', '_type': 'test', '_id': 'yUjqBWcB9xsUYKqz-P7I', '_version': 1, 'result': 'created', '_shards': {'total': 2, 'successful': 1, 'failed': 0}, '_seq_no': 3, '_primary_term': 1}
{'_index': 'test', '_type': 'test', '_id': 'y0jqBWcB9xsUYKqz-P7P', '_version': 1, 'result': 'created', '_shards': {'total': 2, 'successful': 1, 'failed': 0}, '_seq_no': 4, '_primary_term': 1}
{'_index': 'test', '_type': 'test', '_id': 'zEjqBWcB9xsUYKqz-P7V', '_version': 1, 'result': 'created', '_shards': {'total': 2, 'successful': 1, 'failed': 0}, '_seq_no': 5, '_primary_term': 1}
The recommended way is to create a unique client object and you can increase the number of simultaneous thread using the maxsize (10 by default).
es = Elasticsearch( "host1", maxsize=25)
Source

Categories

Resources