Extract specific JSON field from Twitter streaming API using Python

I am using Twitter's streaming API code (found here). I can get the output I want, which is a series of filtered results. However, I need to assign the 'text' field of each JSON result to a variable, and I can't work out the right way to do it.
I have isolated the part of the code that receives the streaming data and prints it to the terminal when I run it:
for response_line in response.iter_lines():
    if response_line:
        json_response = json.loads(response_line)
        print(json.dumps(json_response, indent=4, sort_keys=True))
What I need is just the text of the tweet that is returned. Here is an example of the output; I only need to set a variable, twitterVariable, to the "text" value:
{
"data": {
"id": "125855555555",
"text": "hello this is a test"
},
"matching_rules": [
{
"id": 1234567890,
"tag": ""
}
]
}

Since you have already loaded the response into a Python dict, you can index it by key to get the text field:
twitter_variable = json_response['data']['text']
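The lookup above raises a KeyError if a line has no "data" or "text" key. A minimal sketch of a more defensive version follows; the helper name extract_text and the sample line are assumptions made for illustration, not part of the original code.

```python
import json


def extract_text(response_line):
    """Return the tweet text from one raw stream line, or None if absent."""
    if not response_line:
        # Blank lines carry no tweet; the original loop skips them too.
        return None
    json_response = json.loads(response_line)
    # .get() avoids a KeyError when a line has no "data" or "text" key.
    return json_response.get("data", {}).get("text")


# Sample line shaped like the output in the question (bytes, as iter_lines() yields).
sample_line = b'{"data": {"id": "125855555555", "text": "hello this is a test"}}'
twitter_variable = extract_text(sample_line)
print(twitter_variable)
```

Inside the streaming loop, this would be called once per response_line.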

Related

How can I have only one api value?

Hi everyone,
I started programming in Python yesterday for a project. It involves fetching data from an API using the Requests library.
So far I have had no trouble getting familiar with the library, but I can't get the specific result I'm looking for.
My goal is to get only the name of the account.
Here is the code:
import requests
user = 'example'
payload = {'data': 'username'}
r = requests.get('https://api.imvu.com/user/user-'+user, params=payload)
json = r.json()
print(json)
Of all the data that can be returned, I want only the name of the account.
The code runs fine, but it gives me all of the account data.
For example:
{
"https://api.imvu.com/user/user-x?data=created": {
"data": {
"created": "2020-11-30T17:56:31Z",
"registered": "x",
"gender": "f",
"display_name": "‏‏‎ ‎",
"age": "None",
"country": "None",
"state": "None",
"avatar_image": "x",
"avatar_portrait_image": "https://......",
"is_vip": false,
"is_ap": true,
"is_creator": false,
"is_adult": true,
"is_ageverified": true,
"is_staff": false,
"is_greeter": false,
"greeter_score": 0,
"badge_level": 0,
"username": "=== ONLY THIS I NEED ==="
}
}
}
As you can see, I only need one thing from all that data.
Sorry for bothering and I hope I can learn from your answers. Thanks so much for reading

Unless the API lets you specify exactly which data to return (some do), you have no control over its behavior, or over what data a given endpoint returns and how. The publicly exposed API is all you have to work with; sometimes you get a lot of data you don't need, and there is basically nothing you can do about that.

To get a specific item from the JSON, you only need a few changes to your code:
r = requests.get('https://api.imvu.com/user/user-'+user, params=payload)
json = r.json()
username = json["https://api.imvu.com/user/user-x?data=created"]["data"]["username"]
print(username)
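The lookup above hardcodes the URL that the response uses as its top-level key. A minimal sketch that takes the first top-level value instead, assuming the response always has the single-entry shape shown in the question; the helper name extract_username and the trimmed sample data are illustrative only.

```python
def extract_username(payload):
    """Return the "username" field from a response shaped like the example."""
    # The top-level dict has a single entry keyed by the request URL,
    # so take its first value instead of spelling the URL out.
    entry = next(iter(payload.values()))
    return entry["data"]["username"]


# Trimmed-down stand-in for the response shown in the question.
sample = {
    "https://api.imvu.com/user/user-x?data=created": {
        "data": {"gender": "f", "is_vip": False, "username": "example_name"}
    }
}
print(extract_username(sample))
```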

You might check whether there is an alternative REST method that returns only the username.
You cannot modify the REST response, since it is sent by the server, so you need to parse it, e.g. as described here:
Extract value from json response python?

How to validate JSON request body before sending PUT request in python

When I send a PUT request with a JSON request body to my API endpoint from Python, the endpoint sometimes receives an empty request body, because the data contains special characters that are not supported by JSON.
How can I sanitize my JSON before sending the request?
I've tried serializing and re-parsing the JSON before sending the request:
profile = json.loads(json.dumps(profile))
My example invalid json is:
{
"url": "https://www.example.com/edmund-chand/",
"name": "Edmund Chand",
"current_location": "FrankfurtAmMainArea, Germany",
"education": [],
"skills": []
}
and my expected validated JSON is:
{
"url": "https://www.example.com/edmund-chand/",
"name": "Edmund Chand",
"current_location": "Frankfurt Am Main Area, Germany",
"education": [],
"skills": []
}

If you're looking for something quick to sanitize JSON data for a limited set of fields, e.g. current_location, you can try something like the following:
def sanitize(profile):
    profile['current_location'] = ', '.join([val.strip() for val in profile['current_location'].split(',')])
    return profile
profile = sanitize(profile)
The idea is that you write the sanitizing code for each field in that function, then send the result to your API, or raise an exception if the data is invalid.
For more robust validation, consider the jsonschema package. More details here.
With that package you can validate strings against a JSON schema more flexibly.
Example based on the package readme:
from jsonschema import validate
# A sample schema, like what we'd get from json.load()
schema = {
    "type": "object",
    "properties": {
        "url": {"type": "string", "format": "uri"},
        "current_location": {"type": "string", "maxLength": 25, "pattern": "your_regex_pattern"},
    },
}
# If no exception is raised by validate(), the instance is valid.
validate(instance=profile, schema=schema)
You can find more information on the types of validation available for strings here.

Thank you @Rithin for your solution, but it seems tied to a single field of the JSON.
I found a solution that works for any field, using the example code below:
profile = json.loads(json.dumps(profile).replace("\t", " "))
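One point worth checking with the one-liner above: json.dumps writes a tab inside a string as the two-character escape \t, so replacing a literal tab character in the dumped text may find nothing to replace. A minimal sketch of an alternative that walks the parsed structure instead; the helper name replace_tabs and the sample profile containing tab characters are assumptions for illustration, not part of the original posts.

```python
def replace_tabs(value):
    """Recursively replace tab characters in every string inside value."""
    if isinstance(value, str):
        return value.replace("\t", " ")
    if isinstance(value, dict):
        return {key: replace_tabs(item) for key, item in value.items()}
    if isinstance(value, list):
        return [replace_tabs(item) for item in value]
    # Numbers, booleans and None pass through unchanged.
    return value


# Hypothetical profile whose location contains real tab characters.
profile = {
    "name": "Edmund Chand",
    "current_location": "Frankfurt\tAm\tMain\tArea, Germany",
    "education": [],
    "skills": [],
}
profile = replace_tabs(profile)
print(profile["current_location"])
```

The same walk could be extended to other control characters if the data turns out to contain them.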

How to save dictionary data in list ? and return that list in Python?

I have a JSON file; I need to retrieve data from it and then insert it into another API.
Workflow: External Feed -> Parsing -> Insert into Another API
Code:
Function defined in a parsing class:
def parsed_items(self):
    self.get_response()
    items = self.soup.find_all('item')
    self.payload = []
    for item in items:
        self.payload.append({
            'title': item.find('title').text,
            'description': item.find('description').text,
            'status': '3',
        })
    return self.payload
Code in the main class that consumes the values returned by this function:
for items in parser.parsed_items():
    response2 = requests.request('POST', settings.BASE_URL,
                                 json=(items['title'], items['description'], items['status']),
                                 headers=headers())
Sample of JSON:
{ Data:
{
"title": "ipsum",
"description": "lorem"
}
{
"title": "ipsum1",
"description": "lorem1"
}
{
"title": "ipsum2",
"description": "lorem2"
}
{
"title": "ipsum3",
"description": "lorem3"
}
}
Error:
{"errors":[{"status":"400","source":"non_field_errors","detail":"Invalid data. Expected a dictionary, but got list."}]}
What I need to know:
Q1: What is the best way to handle such scenarios? Please point me to any tutorial that would help.
Q2: How do I retrieve the list of values from the payload? Is there an example you can refer me to?
Q3: How can the list returned by parsed_items() be converted into a dictionary and passed to the request as the value of the json parameter?
I need to fetch the "title" and "description" values from the JSON and POST them to a local API. (Note: authentication with the local API succeeds.)
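The error message suggests the API expects one JSON object per request, whereas a tuple of values is serialized as a JSON array. A minimal sketch of that difference using only the json module; the payload list below is a stand-in for what parsed_items() returns, and the reading of the error is an assumption rather than a confirmed diagnosis.

```python
import json

# Stand-in for the list returned by parsed_items() in the question.
payload = [
    {"title": "ipsum", "description": "lorem", "status": "3"},
    {"title": "ipsum1", "description": "lorem1", "status": "3"},
]

for item in payload:
    # A tuple of values is encoded as a JSON array ("got list" in the error).
    as_tuple = json.dumps((item["title"], item["description"], item["status"]))
    # The dict itself is encoded as a JSON object with named fields.
    as_dict = json.dumps(item)
    print(as_tuple, "->", as_dict)
```

Under that assumption, passing the dict itself (json=items) in the POST call, rather than a tuple of its values, would send an object.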

400 Error while trying to POST to JIRA issue

I am trying to set the 'transition' property of a JIRA issue from whatever it currently is to completed (which according to the docs is 10000). According to the documentation, this error is returned 'If there is no transition specified.'
I have also used ?expand=transitions.fields to verify that 10000 is the id for complete.
I am using these docs:
https://docs.atlassian.com/jira/REST/latest/#api/2/issue-doTransition
https://jira.atlassian.com/plugins/servlet/restbrowser#/resource/api-2-issue-issueidorkey-transitions/POST
Here is my request
url = 'http://MYURL/rest/api/2/issue/ISSUE-ID/transitions'
payload1 = open('data3.json', 'r').read()
payload = json.loads(payload1)
textFile = requests.post(url, auth=('username', 'password'), json=payload)
The contents of my data3.json file are:
{
"transition": 10000
}
edit: I also changed my JSON to this and I get a 500 error
{
"transition": {
"id": "10000"
}
}
The error I get
{"errorMessages":["Can not instantiate value of type [simple type,classcom.atlassian.jira.rest.v2.issue.TransitionBean] from JSON integral number;no single-int-arg constructor/factory method (through reference chain:com.atlassian.jira.rest.v2.issue.IssueUpdateBean[\"transition\"])"]}400
I'm pretty confident that my issue is in my JSON file, since I have used GET in the code above this snippet multiple times, but I could be wrong.
Possible cause - https://jira.atlassian.com/browse/JRA-32132

I believe the issue I was having was a process-flow one. I cannot jump straight from my issue being opened to 'completed'. However, I can go from the issue being created to 'Done'.
{
"transition": {
"name": "Done",
"id": "151"
}
}
As this does what I need, I will use it. If I find out how to make the ticket complete, I will post back.
Also, I think the fact that we customize our JIRA led to my getting 'Completed' as a valid transition even though it wasn't.

Yes, you're right that the JSON is the problem. {"transition": 10000} is syntactically valid JSON, but the error message shows that JIRA cannot build the transition from a bare number; it expects an object. The doc says:
The fields that can be set on transtion, in either the fields
parameter or the update parameter can be determined using the
/rest/api/2/issue/{issueIdOrKey}/transitions?expand=transitions.fields
resource.
So you need to send a GET request to /rest/api/2/issue/{issueIdOrKey}/transitions?expand=transitions.fields to get the list of possible values, and then set one of them in the JSON:
{
"transition": {
"id" : "an_id_from_response"
}
}
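A minimal sketch of that second step: picking a transition by name from the GET response and building the POST body. The response shape used here (a "transitions" list of objects with "id" and "name") and the helper name build_transition_payload are assumptions for illustration; check them against what your JIRA instance actually returns.

```python
def build_transition_payload(transitions_response, name):
    """Build the POST body for the transition called `name`, or return None."""
    for transition in transitions_response.get("transitions", []):
        if transition.get("name") == name:
            # The id is sent as a string inside an object, not as a bare number.
            return {"transition": {"id": str(transition["id"])}}
    return None


# Assumed shape of the GET .../transitions response; ids and names are examples.
sample_response = {
    "transitions": [
        {"id": "11", "name": "Start Progress"},
        {"id": "151", "name": "Done"},
    ]
}
print(build_transition_payload(sample_response, "Done"))
```

Returning None for an unknown name makes it easy to detect a transition that is not available from the issue's current state before sending the POST.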

adding new dict to json output python

I'm very new to JSON, so please forgive me if I use the wrong terms. I am trying to create a JSON file every x minutes with updated Twitter info. I know I could just use the Twitter API and grab what I need, but I want to experiment a bit. My problem is getting a new object (key/dict, or whatever it is called) for each new item added in a for loop.
What I'm trying to get:
[
{
"name": "thename",
"date": "thedate",
"text": "thetext"
}, <--- trying to get that little comma
{
"name": "thename",
"date": "thedate",
"text": "thetext"
}
]
I am getting the data I want, but not that comma. Everything is output, just without the one character that makes it valid:
[
{
"name": "thename",
"date": "thedate",
"text": "thetext"
} <--- I get this instead for each new object
{
"name": "thename",
"date": "thedate",
"text": "thetext"
}
]
Here is just a snippet of the code; the rest should be self-explanatory, since it's just Twitter OAuth stuff.
for player in users:
    statuses = api.GetUserTimeline(screen_name=player, count=10)
    print("[")
    for s in statuses:
        print(json.dumps({'timestamp': s.created_at, 'username': s.user.name, 'status': s.text}))
    print("]")
Also, is there a better way to do the [ ] at the start and end? I know that is ugly and surely improper. As I said, I'm a newbie with JSON/Python, but it's a good learning experience.

Instead of trying to build the JSON array yourself, you should just let the JSON encoder do that for you too. To do that, you just need to pass a list of objects to json.dumps. So instead of printing each JSON object on its own, just collect them in a list, and dump that:
allstatuses = []
for player in users:
    statuses = api.GetUserTimeline(screen_name=player, count=10)
    for s in statuses:
        allstatuses.append({'timestamp': s.created_at, 'username': s.user.name, 'status': s.text})
print(json.dumps(allstatuses))
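Since the question mentions creating a JSON file, here is a minimal sketch of writing the collected list to disk with json.dump. The stand-in records and the temporary output path are assumptions for illustration; in the answer above the records come from api.GetUserTimeline().

```python
import json
import os
import tempfile

# Stand-in records; in the answer these come from api.GetUserTimeline().
allstatuses = [
    {"timestamp": "thedate", "username": "thename", "status": "thetext"},
    {"timestamp": "thedate2", "username": "thename2", "status": "thetext2"},
]

# Hypothetical output path in a temporary directory.
out_path = os.path.join(tempfile.mkdtemp(), "statuses.json")
with open(out_path, "w", encoding="utf-8") as handle:
    # indent=4 pretty-prints; the encoder adds the brackets and commas.
    json.dump(allstatuses, handle, indent=4)

# Read the file back to confirm it parses to the same list.
with open(out_path, encoding="utf-8") as handle:
    print(json.load(handle) == allstatuses)
```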
