How to bulk update a field in elasticsearch?


def update(index):
    index = "twitter"
    list = ['0oPwSm4BxbPrifDrF7C1', 'r4MOWm4BxbPrifDrjbgR', 'y4NbWm4BxbPrifDrLLhh']
    data = []
    for i in list:
        d = { "update" : {"_id" : i, "_index" : index, "retry_on_conflict" : 3} }
        e = { "doc" : {"answer_1" : "test"} }
        data.append(json.dumps(d))
        data.append(json.dumps(e))
    v = "\n".join(data)
    response = requests.post('https://url/_bulk', headers='application/x-ndjson',
                             data=json.loads(v)
I want to bulk update the answer_1 field for several documents. I guess I am not sending the request in the proper format.

The bulk data set should look like this:
{'index': ''}\n
{'your': 'data'}\n
{'index': ''}\n
{'other': 'data'}\n
NB: every line must end with a newline, including the last one.
Your existing data seems OK to me:
{"update": {"_id": "0oPwSm4BxbPrifDrF7C1", "retry_on_conflict": 3, "_index": "twitter"}}
{"doc": {"answer_1": "test"}}
{"update": {"_id": "r4MOWm4BxbPrifDrjbgR", "retry_on_conflict": 3, "_index": "twitter"}}
{"doc": {"answer_1": "test"}}
{"update": {"_id": "y4NbWm4BxbPrifDrLLhh", "retry_on_conflict": 3, "_index": "twitter"}}
{"doc": {"answer_1": "test"}}
You have a syntax error in the requests.post() call: the closing parenthesis ) is missing. You also need to send v directly, without the extra json.loads(v):
response = requests.post('https://url/_bulk',
                         data=v + "\n",  # the bulk body must end with a newline
                         headers={'Content-Type': 'application/x-ndjson'})
print(response)

This seems to be the problem:
data=json.loads(v)
v does not contain a single parsable JSON string; it contains multiple JSON documents separated by newlines. Try sending v directly without parsing.
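As a minimal sketch of the point made in both answers, the bulk body can be built as newline-delimited JSON and sent as a plain string. The helper name build_bulk_update_body is hypothetical (it is not part of Elasticsearch or requests), and the URL in the comment is the placeholder from the question.

```python
import json


def build_bulk_update_body(index, doc_ids, partial_doc, retry_on_conflict=3):
    """Return an NDJSON bulk body: one action line and one doc line per id.

    Hypothetical helper for illustration only.
    """
    lines = []
    for doc_id in doc_ids:
        action = {"update": {"_id": doc_id, "_index": index,
                             "retry_on_conflict": retry_on_conflict}}
        lines.append(json.dumps(action))
        lines.append(json.dumps({"doc": partial_doc}))
    # Every line, including the last one, has to end with a newline.
    return "\n".join(lines) + "\n"


body = build_bulk_update_body(
    "twitter",
    ["0oPwSm4BxbPrifDrF7C1", "r4MOWm4BxbPrifDrjbgR"],
    {"answer_1": "test"},
)
print(body)

# Sending it would then look roughly like this (placeholder URL, not executed here):
# requests.post("https://url/_bulk", data=body,
#               headers={"Content-Type": "application/x-ndjson"})
```

The body is passed as a string through data=, never through json.loads(), because the whole payload is several JSON documents rather than one.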

Related

Python retrieve specified nested JSON value

I have a .json file with many entries looking like this:
{
    "name": "abc",
    "time": "20220607T190731.442",
    "id": "123",
    "relatedIds": [
        {
            "id": "456",
            "source": "sourceA"
        },
        {
            "id": "789",
            "source": "sourceB"
        }
    ]
}
I am saving each entry in a Python object; however, I only need the related ID from source A. The problem is that the related ID from source A is not always in first place in that nested list.
So data['relatedIds'][0]['id'] does not reliably yield the right ID.
Currently I am solving the issue like this:
import json
with open("filepath", 'r') as file:
    data = json.load(file)
for value in data['relatedIds']:
    if(value['source'] == 'sourceA'):
        id_from_a = value['id']
entry = Entry(data['name'], data['time'], data['id'], id_from_a)
I don't think this approach is the optimal solution though, especially if relatedIds list gets longer and more entries appended to the JSON file.
Is there a more sophisticated way of singling out this 'id' value from a specified source without looping through all entries in that nested list?

For a cleaner solution, you could try using python's filter() function with a simple lambda:
import json
with open("filepath", 'r') as file:
    data = json.load(file)
filtered_data = filter(lambda a : a["source"] == "sourceA", data["relatedIds"])
id_from_a = next(filtered_data)['id']
entry = Entry(data['name'], data['time'], data['id'], id_from_a)
Correct me if I misunderstand how your json file looks, but it seems to work for me.

One step at a time, in order to get to all entries:
>>> data["relatedIds"]
[{'id': '789', 'source': 'sourceB'}, {'id': '456', 'source': 'sourceA'}]
Next, in order to get only those entries with source=sourceA:
>>> [e for e in data["relatedIds"] if e["source"] == "sourceA"]
[{'id': '456', 'source': 'sourceA'}]
Now, since you don't want the whole entry, but just the ID, we can go a little further:
>>> [e["id"] for e in data["relatedIds"] if e["source"] == "sourceA"]
['456']
From there, just grab the first ID:
>>> [e["id"] for e in data["relatedIds"] if e["source"] == "sourceA"][0]
'456'
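A variation on the same idea, as a sketch: next() over a generator expression stops at the first match and accepts a default, so a missing source does not raise IndexError or StopIteration. The function name related_id and the "sourceC" lookup are made up for illustration.

```python
data = {
    "name": "abc",
    "id": "123",
    "relatedIds": [
        {"id": "789", "source": "sourceB"},
        {"id": "456", "source": "sourceA"},
    ],
}


def related_id(entry, source, default=None):
    """Return the id of the first related entry from `source`, else `default`."""
    return next(
        (e["id"] for e in entry["relatedIds"] if e["source"] == source),
        default,
    )


id_from_a = related_id(data, "sourceA")
missing = related_id(data, "sourceC")  # no such source in the sample data
print(id_from_a, missing)
```

This still scans the list, but only until the first hit.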

Can you get whatever generates your .json file to produce the relatedIds as an object rather than a list?
{
    "name": "abc",
    "time": "20220607T190731.442",
    "id": "123",
    "relatedIds": {
        "sourceA": "456",
        "sourceB": "789"
    }
}
If not, I'd say you're stuck looping through the list until you find what you're looking for.
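If the file format cannot be changed, a rough middle ground is to build that object yourself once per entry. The sketch below assumes each source appears at most once in relatedIds; the variable name ids_by_source is made up for illustration.

```python
data = {
    "name": "abc",
    "time": "20220607T190731.442",
    "id": "123",
    "relatedIds": [
        {"id": "456", "source": "sourceA"},
        {"id": "789", "source": "sourceB"},
    ],
}

# One pass over the list builds a source -> id mapping.
# If a source occurred twice, the later entry would win.
ids_by_source = {e["source"]: e["id"] for e in data["relatedIds"]}

id_from_a = ids_by_source.get("sourceA")
print(id_from_a)
```

Building the mapping is still one loop over the list, so it mainly pays off when several sources are looked up for the same entry.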

Generating specific fields from JSON objects

I have a rather massive JSON object and I'm trying to generate a specific JSON using only certain elements from it.
My current code looks like this:
data = get.req("/v2/users")
data = json.loads(data)
print (type(data)) # This returns <class 'list'>
print (data) # See below for what this returns
all_data = []
for d in data:
    login_value = d['login']
    if login_value.startswith('fe'):
        continue
    s = get.req("/v2/users/" + str(login_value)) # Sending another request with each
                                                 # login from the first request
    all_data.append(s)
print (all_data) # See below for what this looks like
For information, data is a str before json.loads, and print (data) returns something like this:
[
    {
        "id": 68663,
        "login": "test1",
        "url": "https://x.com/test"
    },
    {
        "id": 67344,
        "login": "test2",
        "url": "https://x.com/test"
    },
    {
        "id": 66095,
        "login": "hi",
        "url": "https://x.com/test"
    }
]
print (all_data) returns a result similar to this for every user a request was sent for:
[b'{"id":68663,"email":"x#gmail.com","login":"xg","phone":"hidden","fullname":"xg gx","image_url":"https://imgur.com/random.png","mod":false,"points":5,"activity":0,"groups":['skill': 'archery']}
And this repeats for every user.
What I'm attempting to do is keep only a few fields from all the results I received, so the final JSON will look something like this:
[
    {
        "email": "x#gmail.com",
        "fullname": "xg gf",
        "points": 5,
        "image_url": "https://imgur.com/random.png"
    },
    {
        ... similar json for the next user and so on
    }
]
I feel as if the way I'm iterating over the data might be inefficient, so if you could guide me to a better way it would be wonderful.

To fetch the login values you have to iterate over data at least once, and to fetch the details of every user you have to make one call per user, which is exactly what you have done.
After you receive the user details, instead of appending the whole object to the all_data list, take only the fields you need, build a dict from them, and append that to all_data.
So your code has a time complexity of O(n), which is the best you can do as far as I understand.
Edit :
For each user you are receiving a byte response like below.
byte_response = [ b'{"id":68663,"email":"x#gmail.com","login":"xg","phone":"hidden","fullname":"xg gx","image_url":"https://imgur.com/random.png","mod":false,"points":5,"activity":0,"groups":[]}']
I'm not sure why you would get the response in a list [], but if it is like that, take byte_response[0] so that you have the actual byte data, like below.
byte_response = b'{"id":68663,"email":"x#gmail.com","login":"xg","phone":"hidden","fullname":"xg gx","image_url":"https://imgur.com/random.png","mod":false,"points":5,"activity":0,"groups":[]}'
response_decoded = byte_response.decode("utf-8") #decode it
import json
json_object_in_dict_form = json.loads(response_decoded) #convert it into dictionary
and then...
json_object_in_dict_form['take the field u want']

You can write:
data = get.req("/v2/users")
data = json.loads(data)
all_data = []
for d in data:
    ...
    s = json.loads(get.req("/v2/users/" + str(login_value)))  # parse the raw response before indexing it
    new_data = {
        'email': s['email'],
        'fullname': s['fullname'],
        'points': s['points'],
        'image_url': s['image_url']
    }
    all_data.append(new_data)
print (all_data)
Or you can make it fancy by using a list of the fields you need:
data = get.req("/v2/users")
data = json.loads(data)
all_data = []
fields = ['email', 'fullname', 'points', 'image_url']
for d in data:
    ...
    s = json.loads(get.req("/v2/users/" + str(login_value)))  # parse the raw response before indexing it
    new_data = dict()
    for field in fields:
        new_data[field] = s[field]
    all_data.append(new_data)
print (all_data)
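The field selection can be tried in isolation from the HTTP calls. The sketch below assumes each per-user response is a bytes object like the one shown in the question; pick_fields is a made-up helper name, and the sample bytes are copied from the answer above.

```python
import json

FIELDS = ("email", "fullname", "points", "image_url")


def pick_fields(raw_response, fields=FIELDS):
    """Parse one raw user response and keep only the wanted fields.

    Hypothetical helper; json.loads accepts both str and bytes input.
    """
    user = json.loads(raw_response)
    # .get() yields None for a missing field instead of raising KeyError.
    return {field: user.get(field) for field in fields}


raw = (b'{"id":68663,"email":"x#gmail.com","login":"xg","phone":"hidden",'
       b'"fullname":"xg gx","image_url":"https://imgur.com/random.png",'
       b'"mod":false,"points":5,"activity":0,"groups":[]}')

all_data = [pick_fields(raw)]
print(json.dumps(all_data, indent=2))
```

Using s[field] instead of .get() would fail loudly on a missing field, which may be preferable if every user is expected to have all four.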

How to add Panda 'to_json' output to a rest call

I have the following JSON data:
x = df.to_json(orient='records')
print(x)
[{"val":"3760","id":"204","quantity":2},{"val":"8221","id":"220","quantity":8}]
I want to add the data to my REST call, but it results in the following string in the payload (note the single quotes around the square brackets):
'updateVals': '[{"val":"3760","id":"204","quantity":2},{"val":"8221","id":"220","quantity":8}]'}}
Because the JSON values are sent as one big string, the REST call results in an HTTP 400 error.
The code is:
url = 'my_url'
payload = {
    'client_id': 'my_id',
    'api_key': 'my_key',
    "data": {
        "uuid": "myUUID",
        "timeStamp": "2018-09-12T06:17:48+00:00",
        "updateVals": x
    }
}
How do I plug the JSON into the REST call? I assume I have to split the string, or maybe there is a more straightforward answer?

Still not sure what you need; JSON is really one big string:
>>> x = [{"val":"3760","id":"204","quantity":2},
...      {"val":"8221","id":"220","quantity":8}]
>>> json.dumps(x)
'[{"val": "3760", "id": "204", "quantity": 2}, {"val": "8221", "id": "220", "quantity": 8}]'
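One possible reading of the question is that updateVals should be a nested list in the payload rather than a quoted string. A sketch under that assumption: parse the to_json output back into Python objects before putting it in the payload, then serialize the whole payload once. Here x is a literal stand-in for df.to_json(orient='records') so the example runs without pandas.

```python
import json

# Stand-in for x = df.to_json(orient='records'), which returns a JSON string.
x = '[{"val":"3760","id":"204","quantity":2},{"val":"8221","id":"220","quantity":8}]'

payload = {
    "client_id": "my_id",
    "api_key": "my_key",
    "data": {
        "uuid": "myUUID",
        "timeStamp": "2018-09-12T06:17:48+00:00",
        # Parse the string so it is embedded as a list, not as a quoted string.
        "updateVals": json.loads(x),
    },
}

body = json.dumps(payload)  # serialize the whole payload exactly once
print(body)
```

With pandas available, df.to_dict(orient='records') should give the list of dicts directly and avoid the round trip through a string.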

Use function output as input to another function

My question is how to use the output of one function as the input to another function.
Like:
def first(one,two):
    # some code that gives me info from JSON files
    api_data = json.loads(api.text)
    return api_data["response"]["info"][one][two]
def second(moreinfo):
    # this code uses the result of the first function to do something else with it; I also want to keep the first function as it is, because I'm using it elsewhere in the project.
    api_data = json.loads(api.text)
    return api_data[moreinfo]["data"]["name"]
I call the second function like this to get the result:
second(""+str(first(one,"two"))+"")
And I get this error:
return api_data[first(one,"two")]["data"]["name"]
KeyError: 'data'
I think the error occurs because the value of first(one,"two") is not picked up inside the second function, because when I hard-code the key like this:
return api_data["1"]["data"]["name"]
it works and gives me the info for result number 1 from the first function.
Thanks and regards :)
EDITED
Some examples for json
(for first function)
{
    "response": {
        "info": [
            {
                "id": 1,
                "desc": "desc"
            }
        ]
    }
}
Second example (I want to print this in the second function)
{
    "1": {
        "data": {
            "name": "Name"
        }
    }
}

There are two approaches: either change the code to match your JSON response, or change the JSON response to match your code.
1st approach:
def first(index):
    return api_data["response"]["info"][index]['id']
def second(moreinfo):
    return api_data[moreinfo]["data"]["name"]
JSON:
api_data = {'1': {'data': {'name': 'name'}}, 'response': {'info': [{"id": 1, "name": "name"}]}}
Function Call:
second(""+str(first(0))+"")
2nd approach: Based on your given code JSON should be like this.
api_data = {'1': {'data': {'name': 'name'}}, 'response': {'info': {'one': {'two': 1}}}}
Function Call:
second(""+str(first('one', 'two'))+"")
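The first approach can be run end to end as a small sketch. The sample data merges the two JSON examples from the question into one dict purely for illustration (in the question they appear to come from separate responses). The point it shows is that JSON object keys are strings, so the integer id returned by first() has to be converted with str() before it is used as a key.

```python
# Sample data combining both JSON examples from the question (an assumption
# made only so the sketch is self-contained).
api_data = {
    "response": {"info": [{"id": 1, "desc": "desc"}]},
    "1": {"data": {"name": "Name"}},
}


def first(index):
    """Return the integer id of the info entry at `index`."""
    return api_data["response"]["info"][index]["id"]


def second(key):
    """Look up the name stored under a string key."""
    return api_data[key]["data"]["name"]


# first(0) returns the int 1, while the lookup key is the string "1".
name = second(str(first(0)))
print(name)
```

Passing the integer directly, as in second(first(0)), would raise KeyError because 1 and "1" are different keys.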

The problem only occurs when I use PyQt; if I run the code without PyQt it works fine, so the question is no longer unsolved.
def first(one,two):
    api = requests.get("url")
    api_data = json.loads(api.text)
    return api_data["response"]["info"][one][two]
def second(id):
    api = requests.get("another-url")
    api_data = json.loads(api.text)
    return api_data[id]["data"]["name"]
print(second(str(first("1","id"))))

error in accessing a nested json within a list in a json in python

I have this sample JSON data that comes back as the response to a POST request (response is the name of the variable holding it):
{
    "statusCode": "OK",
    "data": [
        {
            "id": -199,
            "result": {
                "title": "test1",
                "group": "test_grp2"
            }
        },
        {
            "id": -201,
            "result": {
                "title": "test2",
                "group": "test_grp2"
            }
        }
    ]
}
Now what I want to do is form a list of lists where each inner list holds the id, title, and group values from the JSON data above. Here is what my code looks like:
def get_list():
    # code to get the response from POST request
    json_data = json.loads(response.text)
    group_list = []
    if json_data['statusCode'] == 'OK':
        for data in json_data['data']:
            print(data)
            for result_data in data['result']:
                title = result_data['title']
                group = result_data['group']
            group_list.append([data['id'],title,group])
    return(group_list)
When I execute this I get the error "list indices must be int not str" at the line title = result_data['title']. How can I form the list of lists described above?

data['result'] in your case is a dictionary. When you iterate over it using for result_data in data['result'], you get its keys. In other words, result_data is a string, and a string cannot be indexed with a string key.
Instead, you meant:
for data in json_data['data']:
    result_data = data['result']
    title = result_data['title']
    group = result_data['group']
Or, alternatively, you can make use of a list comprehension:
group_list = [(data['id'], data['result']['title'], data['result']['group'])
              for data in json_data['data']]
print(group_list)

You can extract the list of lists directly using a list comprehension:
>>> [[d['id'], d['result']['title'], d['result']['group']] for d in json_data['data']]
[[-199, 'test1', 'test_grp2'], [-201, 'test2', 'test_grp2']]
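Put together with the statusCode check from the question, a self-contained sketch could look like the following. The sample dict stands in for json.loads(response.text), and get_group_list is a made-up name; returning an empty list for a non-OK status is an assumption, since the question does not say what should happen in that case.

```python
# Stand-in for json.loads(response.text) from the question.
json_data = {
    "statusCode": "OK",
    "data": [
        {"id": -199, "result": {"title": "test1", "group": "test_grp2"}},
        {"id": -201, "result": {"title": "test2", "group": "test_grp2"}},
    ],
}


def get_group_list(payload):
    """Return [id, title, group] for each item, or [] if the status is not OK."""
    if payload.get("statusCode") != "OK":
        return []
    return [
        [item["id"], item["result"]["title"], item["result"]["group"]]
        for item in payload["data"]
    ]


group_list = get_group_list(json_data)
print(group_list)
```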
