Python and Vend JSON Queries - python

Just trying to make some sense of the JSON outputs I'm getting from the Vend JSON API:
Item pagination
Item {u'pages': 10, u'results': 487, u'page_size': 50, u'page': 1}
Item customers
Item [{u'custom_field_3': u'', u'customer_code': u'WALKIN', u'custom_field_1': u'', u'balance': u'0', u'customer_group_id': u'xxx', u'custom_field_2': u'',
Is an example.
I'm trying to isolate a number of fields, such as 'customer_code', from the JSON output, but I haven't worked out how yet.
My code:
response = requests.get('http://xxx.vendhq.com/api/customers',
                        auth=('xxx', 'yyy'))
data = response.json()
for item in data.items():
    print 'Item', item[0]
    print 'Item', item[1]
Ideally I'd like to "walk" the JSON output and pick out just the fields that are pertinent.

According to the output and the given code, the structure of the data is:
{
    'pagination': {u'pages': 10, u'results': 487, u'page_size': 50, u'page': 1},
    'customers': [{u'custom_field_3': u'', u'customer_code': u'WALKIN',
                   u'custom_field_1': u'', u'balance': u'0',
                   u'customer_group_id': u'xxx', u'custom_field_2': u'', ..]
}
To get the customer_code list, you need to access a dict entry with the key customers and iterate it:
customer_codes = [customer['customer_code'] for customer in data['customers']]
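To pull several fields at once, the same idea extends to a small helper. This is only a minimal sketch in Python 3, assuming data has the shape shown above; the sample values and the pick_fields helper name are made up for illustration and are not part of the Vend API.

```python
# Sample data mirroring the structure above (values are placeholders).
data = {
    'pagination': {'pages': 10, 'results': 487, 'page_size': 50, 'page': 1},
    'customers': [
        {'customer_code': 'WALKIN', 'balance': '0', 'customer_group_id': 'xxx'},
        {'customer_code': 'ACME', 'balance': '12.50'},  # no customer_group_id
    ],
}


def pick_fields(records, fields, default=None):
    """Return one dict per record containing only the requested fields.

    Hypothetical helper: missing fields are filled with `default`
    instead of raising KeyError.
    """
    return [{f: rec.get(f, default) for f in fields} for rec in records]


wanted = pick_fields(data['customers'], ['customer_code', 'balance'])
for row in wanted:
    print(row['customer_code'], row['balance'])
```

Note that data['pagination'] reports more than one page, so a single response only covers the first page of customers. How further pages are requested depends on the API and is not shown here.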

Related

Remove nested keys and move value to main dictionary keys

I have a problem when I try to merge two dictionaries so that I can POST the result later. For some reason the values I get are nested and I'm not sure how to clean that up. Tips on tidying up the code would be great as well; right now it looks a bit messy.
for network in networks:
    post_dict = {e1:e2 for e1,e2 in network['extattrs'].iteritems() if e1 not in keys }
    pprint (post_dict['Stuff-Name']['value'])
    post_dict['name'] = post_dict.pop('Stuff-Name')
    post_dict['sid'] = post_dict.pop('Stuff-id')
    dict_to_post = merge_two_dicts(post_dict, default_keys)
network:
{u'_ref': u'ref number',
u'comment': u'Name of object',
u'extattrs': {u'Network-Type': {u'value': u'Internal'},
u'Stuff-Id': {u'value': 110},
u'Stuff-Name': {u'value': u'Name of object'}},
u'network': u'Subnet-A',
u'network_view': u'default'}
default_keys:
default_keys = {'status':'Active',
'group':None,
'site':'City-A',
'role':'Production',
'description':None,
'custom_fields':None,
'tenant':None}
post_dict:
{'name': {u'value': u'Name of object'},
'sid': {u'value': 110}}
So what I want to achieve is to get rid of the nested keys (within the keys "name" and "sid"), so that the key and value pairs become "name: Name of object" and "sid: 110".
The post function is not yet defined.
As I understand it, your case is really specific, so I would probably go for a quick and dirty solution. First of all, have you tried this:
post_dict['name'] = (post_dict.pop('Stuff-Name'))['value']
Secondly, how about making use of the "filter and renaming" and collapse the indexing there? This is not advisable, but if you are trying to do a lazy work-around it will suffice. I recommend you go with my first suggestion, as I'm pretty confident that it will solve your issue.
To get the first value of any nested dictionary you could use this (note that popitem() also removes that entry from the nested dictionary):
d = {'custom_fields': None, 'description': None, 'group': None, 'name':
     {'value': 'Name of object'}, 'role': 'Production', 'site': 'City-A',
     'status': 'Active', 'tenant': None, 'sid': {'value': 110}}
for key in d.keys():
    if isinstance(d[key], dict):
        d[key] = d[key].popitem()[1]
It returns
{'custom_fields': None, 'description': None, 'group': None, 'name': 'Name of
object', 'role': 'Production', 'site': 'City-A', 'status': 'Active',
'tenant': None, 'sid': 110}
I think it's this step that's causing the dictionaries to be nested in the first place
post_dict['name'] = post_dict.pop('Stuff-Name')
post_dict['sid'] = post_dict.pop('Stuff-id')
You could try popitem()[1] here if you'll only ever need value of that dictionary and not the key.
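Putting the suggestions together, the unwrapping and renaming can be done in one pass. This is a hedged sketch in Python 3 rather than the asker's actual code: flatten_value_wrappers and the rename mapping are hypothetical names, the sample data spells the key 'Stuff-Id' as in the network dump above, and the final merge (defaults first, extracted values on top) is an assumption about what merge_two_dicts is meant to do.

```python
def flatten_value_wrappers(extattrs):
    """Replace each {'value': x} wrapper with x; leave other values untouched."""
    flat = {}
    for key, val in extattrs.items():
        if isinstance(val, dict) and 'value' in val:
            flat[key] = val['value']
        else:
            flat[key] = val
    return flat


network = {
    'extattrs': {
        'Network-Type': {'value': 'Internal'},
        'Stuff-Id': {'value': 110},
        'Stuff-Name': {'value': 'Name of object'},
    },
}
rename = {'Stuff-Name': 'name', 'Stuff-Id': 'sid'}  # old key -> new key
default_keys = {'status': 'Active', 'site': 'City-A', 'role': 'Production'}

flat = flatten_value_wrappers(network['extattrs'])
# Keep only the attributes we want to post, under their new names.
post_dict = {new: flat[old] for old, new in rename.items()}
# Assumed merge order: defaults first, extracted values override them.
dict_to_post = dict(default_keys, **post_dict)
print(dict_to_post)
```

Looking up 'value' explicitly is a little safer than popitem()[1], which returns whichever entry happens to be last if a wrapper ever carries more than one key.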

Using the Twitter API to get Arabic trends, I get symbols instead of the actual trends

I am using this piece of code to get the trends for Egypt:
`Egypt_WOE_ID = 23424802
Egypt_trends = twitter_api.trends.place(_id=Egypt_WOE_ID)
print Egypt_trends`
The problem is that instead of the actual hashtags and trends I get symbols that don't mean anything. This is part of the output:
[{u'created_at': u'2017-02-20T12:41:44Z', u'trends': [{u'url': u'http://twitter.com/search?q=%23%D9%85%D8%B0%D8%A8%D8%AD%D9%87_%D8%A8%D9%88%D8%B1%D8%B3%D8%B9%D9%8A%D8%AF', u'query': u'%23%D9%85%D8%B0%D8%A8%D8%AD%D9%87_%D8%A8%D9%88%D8%B1%D8%B3%D8%B9%D9%8A%D8%AF', u'tweet_volume': None, u'name': u'#\u0645\u0630\u0628\u062d\u0647_\u0628\u0648\u0631\u0633\u0639\u064a\u062f', u'promoted_content': None}, {u'url': u'/search?q=%23JFT74', u'query': u'%23JFT74', u'tweet_volume': None, u'name': u'#JFT74', u'promoted_content': None}, {u'url': u'/search?q=%23%D8%A8%D9%84%D8%A7%D9%87%D8%A7_%D9%84%D8%AD%D9%88%D9%85_%D9%81%D8%B1%D8%A7%D8%AE_%D8%B3%D9%85%D9%83', u'query': u'%23%D8%A8%D9%84%D8%A7%D9%87%D8%A7_%D9%84%D8%AD%D9%88%D9%85_%D9%81%D8%B1%D8%A7%D8%AE_%D8%B3%D9%85%D9%83', u'tweet_volume': None, u'name': u'#\u0628\u0644\u0627\u0647\u0627_\u0644\u062d\u0648\u0645_\u0641\u0631\u0627\u062e_\u0633\u0645\u0643', u'promoted_content': None}, {u'url': u'/search?q=%23%D8%A7%D9%85_%D8%AE%D8%AF%D8%A7%D8%B4_%D8%AA%D9%85%D8%A7%D8%B1%D8%B3_%D8%A7%D9%84%D8%AC%D9%86%D8%B3', u'query': u'%23%D8%A7%D9%85_%D8%AE%D8%AF%D8%A7%D8%B4_%D8%AA%D9%85%D8%A7%D8%B1%D8%B3_%D8%A7%D9%84%D8%AC%D9%86%D8%B3', u'tweet_volume': 14030, u'name': u'#\u0627\u0645_\u062e\u062f\u0627\u0634_\u062a\u0645\u0627\u0631\u0633_\u0627\u0644\u062c\u0646\u0633', u'promoted_content': None}]
Thanks in advance, and please forgive me if my English is bad. I will update the question with anything I find, or with any suggestions I receive, to make it clearer.
Your strings containing % are url encoded. You can convert them with:
# Python 3
import urllib.parse
s='%23%D9%85%D8%B0%D8%A8%D8%AD%D9%87_%D8%A8%D9%88%D8%B1%D8%B3%D8%B9%D9%8A%D8%AF'
urllib.parse.unquote(s)
# '#مذبحه_بورسعيد'
# Python 2
import urllib
s='%23%D9%85%D8%B0%D8%A8%D8%AD%D9%87_%D8%A8%D9%88%D8%B1%D8%B3%D8%B9%D9%8A%D8%AF'
print urllib.unquote(s)
# #مذبحه_بورسعيد  (unquote returns a UTF-8 byte string here, so print it; its repr shows \x escapes)
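It is also worth noting that the name field already contains the readable text. The \uXXXX sequences only show up because printing the whole list displays the repr of each unicode string; printing an individual name shows the Arabic characters, provided the terminal can render them. A small Python 3 sketch using the first trend from the question (the trend dict below is copied from that output and trimmed to two keys):

```python
from urllib.parse import unquote

# First trend from the output in the question, trimmed to two keys.
trend = {
    'query': '%23%D9%85%D8%B0%D8%A8%D8%AD%D9%87_%D8%A8%D9%88%D8%B1%D8%B3%D8%B9%D9%8A%D8%AF',
    'name': '#\u0645\u0630\u0628\u062d\u0647_\u0628\u0648\u0631\u0633\u0639\u064a\u062f',
}

# Percent-decoding; unquote assumes UTF-8 by default in Python 3.
decoded_query = unquote(trend['query'])

print(decoded_query)   # readable hashtag decoded from the URL-encoded query
print(trend['name'])   # the same hashtag, already stored as text
```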

extracting hashtags out of Twitter trending topics data with Python Tweepy

I have the following problem:
using the Twitter API and tweepy module, I want to monitor the trending topics and extract hashtags out of the data.
This code:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import tweepy, json
CONSUMER_KEY = 'key'
CONSUMER_SECRET = 'secret'
ACCESS_KEY = 'key'
ACCESS_SECRET = 'secret'
auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
auth.set_access_token(ACCESS_KEY, ACCESS_SECRET)
api = tweepy.API(auth)
trends1 = api.trends_place(1)
print trends1
gives me data about globally trending topics that is structured like this:
[{u'created_at': u'2014-04-16T12:13:15Z', u'trends': [{u'url': u'http://twitter.com/search?q=%22South+Korea%22', u'query': u'%22South+Korea%22', u'name': u'South Korea', u'promoted_content': None}, {u'url': u'http://twitter.com/search?q=%23FETUSONEDIRECTIONDAY', u'query': u'%23FETUSONEDIRECTIONDAY', u'name': u'#FETUSONEDIRECTIONDAY', u'promoted_content': None}, {u'url': u'http://twitter.com/search?q=%23PrayForSouthKorea', u'query': u'%23PrayForSouthKorea', u'name': u'#PrayForSouthKorea', u'promoted_content': None}, {u'url': u'http://twitter.com/search?q=%23GaraGaraRP', u'query': u'%23GaraGaraRP', u'name': u'#GaraGaraRP', u'promoted_content': None}, {u'url': u'http://twitter.com/search?q=%23%D8%A5%D8%B3%D9%85_%D8%A3%D9%85%D9%8A_%D8%A8%D8%AC%D9%88%D8%A7%D9%84%D9%8A', u'query': u'%23%D8%A5%D8%B3%D9%85_%D8%A3%D9%85%D9%8A_%D8%A8%D8%AC%D9%88%D8%A7%D9%84%D9%8A', u'name': u'#\u0625\u0633\u0645_\u0623\u0645\u064a_\u0628\u062c\u0648\u0627\u0644\u064a', u'promoted_content': None}, {u'url': u'http://twitter.com/search?q=%23Kad%C4%B1nlarKamyon%C5%9Eof%C3%B6r%C3%BCOlursa', u'query': u'%23Kad%C4%B1nlarKamyon%C5%9Eof%C3%B6r%C3%BCOlursa', u'name': u'#Kad\u0131nlarKamyon\u015eof\xf6r\xfcOlursa', u'promoted_content': None}, {u'url': u'http://twitter.com/search?q=%22Dear+My+BestFriend%22', u'query': u'%22Dear+My+BestFriend%22', u'name': u'Dear My BestFriend', u'promoted_content': None}, {u'url': u'http://twitter.com/search?q=%22%D0%A1%D0%B0%D0%BC%D0%BE%D0%BE%D0%B1%D0%BE%D1%80%D0%BE%D0%BD%D0%B0+100%22', u'query': u'%22%D0%A1%D0%B0%D0%BC%D0%BE%D0%BE%D0%B1%D0%BE%D1%80%D0%BE%D0%BD%D0%B0+100%22', u'name': u'\u0421\u0430\u043c\u043e\u043e\u0431\u043e\u0440\u043e\u043d\u0430 100', u'promoted_content': None}, {u'url': u'http://twitter.com/search?q=%22If+I+Stay%22', u'query': u'%22If+I+Stay%22', u'name': u'If I Stay', u'promoted_content': None}, {u'url': u'http://twitter.com/search?q=Gabashvili', u'query': u'Gabashvili', u'name': u'Gabashvili', u'promoted_content': None}], u'as_of': 
u'2014-04-16T12:20:29Z', u'locations': [{u'woeid': 1, u'name': u'Worldwide'}]}]
Is this a python list, containing several dictionaries? How can I extract hashtags out of that data and save them into new variables?
I'm new to python so please explain your choices.
Thanks!
In your example the list has a single entry: a dict whose 'trends' key holds a list of dicts. In each of those dicts the key you are interested in is 'name', in particular when its value starts with '#' (temp below holds the data shown in the question):
In [180]:
[x for x in temp[0]['trends'] if x['name'].find('#') ==0]
Out[180]:
[{'name': '#FETUSONEDIRECTIONDAY',
'promoted_content': None,
'query': '%23FETUSONEDIRECTIONDAY',
'url': 'http://twitter.com/search?q=%23FETUSONEDIRECTIONDAY'},
{'name': '#PrayForSouthKorea',
'promoted_content': None,
'query': '%23PrayForSouthKorea',
'url': 'http://twitter.com/search?q=%23PrayForSouthKorea'},
{'name': '#GaraGaraRP',
'promoted_content': None,
'query': '%23GaraGaraRP',
'url': 'http://twitter.com/search?q=%23GaraGaraRP'},
{'name': '#إسم_أمي_بجوالي',
'promoted_content': None,
'query': '%23%D8%A5%D8%B3%D9%85_%D8%A3%D9%85%D9%8A_%D8%A8%D8%AC%D9%88%D8%A7%D9%84%D9%8A',
'url': 'http://twitter.com/search?q=%23%D8%A5%D8%B3%D9%85_%D8%A3%D9%85%D9%8A_%D8%A8%D8%AC%D9%88%D8%A7%D9%84%D9%8A'},
{'name': '#KadınlarKamyonŞoförüOlursa',
'promoted_content': None,
'query': '%23Kad%C4%B1nlarKamyon%C5%9Eof%C3%B6r%C3%BCOlursa',
'url': 'http://twitter.com/search?q=%23Kad%C4%B1nlarKamyon%C5%9Eof%C3%B6r%C3%BCOlursa'}]
EDIT
To get just the hashtags:
In [181]:
[x['name'] for x in temp[0]['trends'] if x['name'].find('#') ==0]
Out[181]:
['#FETUSONEDIRECTIONDAY',
'#PrayForSouthKorea',
'#GaraGaraRP',
'#إسم_أمي_بجوالي',
'#KadınlarKamyonŞoförüOlursa']
You can use startswith instead of find:
[x['name'] for x in temp[0]['trends'] if x['name'].startswith('#')]
Your data is a list containing one dictionary. One of the keys in this dictionary is called trends. The value for this key is a list of dictionaries. Each of these dictionaries contains a key called name, which holds the trend's name as a string; the hashtags are the names that start with '#'. Here's an example of accessing your data:
hashtags = []
trends = data[0]['trends']
for trend in trends:
    name = trend['name']
    if name.startswith('#'):
        hashtags.append(name)
This can be compacted to:
hashtags = [trend['name'] for trend in data[0]['trends'] if trend['name'].startswith('#')]
First three lines of output:
>>> for hashtag in hashtags:
...     print(hashtag)
#FETUSONEDIRECTIONDAY
#PrayForSouthKorea
#GaraGaraRP
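If you want this as a reusable function, something along these lines should work. It is a sketch in Python 3, not part of tweepy: extract_hashtags is a hypothetical name, the sample list is a trimmed copy of the data in the question, and .get() is used so that an entry without 'trends' or 'name' is skipped instead of raising KeyError.

```python
def extract_hashtags(trends_response):
    """Collect trend names starting with '#' from every entry in the response list."""
    return [
        trend['name']
        for entry in trends_response
        for trend in entry.get('trends', [])
        if trend.get('name', '').startswith('#')
    ]


# Trimmed copy of the structure shown in the question.
sample = [{
    'trends': [
        {'name': 'South Korea', 'query': '%22South+Korea%22'},
        {'name': '#PrayForSouthKorea', 'query': '%23PrayForSouthKorea'},
        {'name': '#GaraGaraRP', 'query': '%23GaraGaraRP'},
    ],
    'locations': [{'woeid': 1, 'name': 'Worldwide'}],
}]

hashtags = extract_hashtags(sample)
print(hashtags)
```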

Parsing complex and changing JSON data in Python, several levels deep

I am trying to parse changing JSON data; however, the JSON data is a bit complex and changes with each iteration.
The JSON data is being parsed inside a loop so each time the loop runs, the json data is different. I'm focused right now on the education data.
THE JSON DATA:
First one might look like this:
{u'gender': u'female', u'id': u'15394'}
Next one might be:
{
u'gender': u'male', u'birthday': u'12/10/1983', u'location': {u'id': '12', u'name': u'Mexico City, Mexico'}, u'hometown': {u'id': u'19', u'name': u'Mexico City, Mexico'},
u'education': [
{
u'school': {u'id': u'22', u'name': u'Institut Saint Dominique de Rome'},
u'type': u'High School',
u'year': {u'id': u'33', u'name': u'2002'}
},
{
u'school': {u'id': u'44', u'name': u'Instituto Cumbres'},
u'type': u'High School',
u'year': {u'id': u'55', u'name': u'1999'}
},
{
u'school': {u'id': u'66', u'name': u'Chantemerle International School'},
u'type': u'High School',
u'year': {u'id': u'77', u'name': u'1998'}
},
{
u'school': {u'id': u'88', u'name': u'Columbia University'},
u'type': u'College',
u'concentration':
[{u'id': u'91', u'name': u'Economics'},
{u'id': u'92', u'name': u'Film Studies'}]
}
],
u'id': u'100384'}
I am trying to return all the values for school name, school id and school type, so essentially I want [education][school][id], [education][school][name] and [education][type] in one line. However, every person has a different number of schools listed, different types of schools, or no schools at all. I want to return each school with its associated name, id and type on a new line within my existing loop.
IDEAL OUTPUT:
1 34 Boston Latin School High School
1 26 Harvard University College
1 22 University of Michigan Graduate School
The one in this case refers to a friend_id, which I have already set up to append to the list as the first item in each loop.
I've tried:
friend_data = response.read()
friend_json = json.loads(friend_data)
#This below is inside a loop pulling data for each friend:
try:
    for school_id in friend_json['education']:
        school_id = school_id['school']['id']
        friendedu.append(school_id)
    for school_name in friend_json['education']:
        school_name = school_name['school']['name']
        friendedu.append(school_name)
    for school_type in friend_json['education']:
        school_type = school_type['type']
        friendedu.append(school_type)
except:
    school_id = "NULL"
print friendedu
writer.writerow(friendedu)
CURRENT OUTPUT:
[u'22', u'44', u'66', u'88', u'Institut Saint Dominique de Rome', u'Instituto Cumbres', u'Chantemerle International School', u'Columbia University', u'High School', u'High School', u'High School', u'College']
This output is just a list of the values it has pulled, instead I'm trying to organize the output as shown above. I think that perhaps another for-loop is called for since for one person I want each school to be on its own line. Right now, the friendedu list is appending all the education info for one person into each line of the list. I want each education item in a new line and then move on to the next person and continue to write rows for the next person.
How about this:
friend_data = response.read()
friend_json = json.loads(friend_data)
if 'education' in friend_json:
    for entry in friend_json['education']:
        # start a fresh row for every school
        friendedu = []
        try:
            friendedu.append(entry['school']['id'])
            friendedu.append(entry['school']['name'])
            # 'type' sits next to 'school', not inside it
            friendedu.append(entry['type'])
        except KeyError:
            friendedu.append('School ID, NAME, or type not found')
        print(" ".join(friendedu))
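Another approach uses dict.get, so that missing keys fall back to a default instead of raising:
import csv
import json
import requests

def student_schools(student, default=None):
    # one (id, name, type) row per education entry
    for entry in student.get("education", []):
        school = entry.get("school", {})
        yield (school.get("id", default),
               school.get("name", default),
               entry.get("type", default))

def main():
    # STUDENT_URL and OUTPUT are placeholders that you need to define
    res = requests.get(STUDENT_URL).content
    students = json.loads(res)
    # "wb" is for Python 2; on Python 3 use open(OUTPUT, "w", newline="")
    with open(OUTPUT, "wb") as outf:
        outcsv = csv.writer(outf)
        for student in students["results"]: # or whatever the root label is
            outcsv.writerows(student_schools(student))

if __name__=="__main__":
    main()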
You certainly don't need more for loops.
One will do:
friendedu = []
for entry in friend_json['education']:
    friendedu.append("{id} {name} {type}".format(
        id=entry['school']['id'],
        name=entry['school']['name'],
        type=entry['type']))
Or a list comprehension:
friendedu = ["{id} {name} {type}".format(
    id=entry['school']['id'],
    name=entry['school']['name'],
    type=entry['type']) for entry in friend_json['education']]
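To get the layout asked for in the question (one row per school, prefixed with the friend id, and nothing at all for friends without education data), the pieces can be combined as below. This is a sketch in Python 3 under a few assumptions: the friends dict, its keys '1' and '2', and the education_rows helper are made up for illustration, and an in-memory buffer stands in for the real output file.

```python
import csv
import io


def education_rows(friend_id, friend_json):
    """Yield [friend_id, school id, school name, type] for each education entry."""
    for entry in friend_json.get('education', []):
        school = entry.get('school', {})
        yield [
            friend_id,
            school.get('id', 'NULL'),
            school.get('name', 'NULL'),
            entry.get('type', 'NULL'),  # 'type' is a sibling of 'school'
        ]


# Hypothetical friend_id -> parsed JSON mapping, trimmed from the question's data.
friends = {
    '1': {'gender': 'female', 'id': '15394'},  # no 'education' key at all
    '2': {'id': '100384', 'education': [
        {'school': {'id': '22', 'name': 'Institut Saint Dominique de Rome'},
         'type': 'High School'},
        {'school': {'id': '88', 'name': 'Columbia University'},
         'type': 'College'},
    ]},
}

out = io.StringIO()          # stands in for the real CSV file
writer = csv.writer(out)
rows = []
for friend_id, friend_json in friends.items():
    for row in education_rows(friend_id, friend_json):
        rows.append(row)
        writer.writerow(row)  # one CSV line per school

print(out.getvalue())
```

Because each school produces its own row, a friend with four schools yields four lines and a friend with none yields no lines, which avoids the need for a try/except around the whole loop.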

JSON array and python

I am having trouble parsing a JSON object that I get back when I GET a URL:
[{"id":1,"version":23,"external_id":"2312","url":"https://example.com/432","type":"typeA","date":"2","notes":"notes","title":"title","abstract":"dsadasdas","details":"something","accuracy":0,"reliability":0,"severity":12,"thing":"32132","other":["aaaaaaaaaaaaaaaaaa","bbbbbbbbbbbbbb","cccccccccccccccc","dddddddddddddd","eeeeeeeeee"],"nana":8},{"id":2,"version":23,"external_id":"2312","url":"https://example.com/432","type":"typeA","date":"2","notes":"notes","title":"title","abstract":"dsadasdas","details":"something","accuracy":0,"reliability":0,"severity":12,"thing":"32132","other":["aaaaaaaaaaaaaaaaaa","bbbbbbbbbbbbbb","cccccccccccccccc","dddddddddddddd","eeeeeeeeee"],"nana":8}]
As you can see, the JSON starts with "[" and ends with "]".
I am using this code:
import json
import urllib2
data = json.load(urllib2.urlopen('http://someurl/path/to/json'))
print data
And I get this:
[{u'severity': 12, u'title': u'title', u'url': u'https://example.com/432', u'external_id': u'2312', u'notes': u'notes', u'abstract': u'dsadasdas', u'other': [u'aaaaaaaaaaaaaaaaaa', u'bbbbbbbbbbbbbb', u'cccccccccccccccc', u'dddddddddddddd', u'eeeeeeeeee'], u'thing': u'32132', u'version': 23, u'nana': 8, u'details': u'something', u'date': u'2', u'reliability': 0, u'type': u'typeA', u'id': 1, u'accuracy': 0}, {u'severity': 12, u'title': u'title', u'url': u'https://example.com/432', u'external_id': u'2312', u'notes': u'notes', u'abstract': u'dsadasdas', u'other': [u'aaaaaaaaaaaaaaaaaa', u'bbbbbbbbbbbbbb', u'cccccccccccccccc', u'dddddddddddddd', u'eeeeeeeeee'], u'thing': u'32132', u'version': 23, u'nana': 8, u'details': u'something', u'date': u'2', u'reliability': 0, u'type': u'typeA', u'id': 2, u'accuracy': 0}]
If the JSON is too large I don't get the full info.
What am I doing wrong?
Thank you
There is nothing wrong with [] in json. It simply means a list. To pretty print your json try this:
import json
import urllib2
data = json.load(urllib2.urlopen('http://someurl/path/to/json'))
print json.dumps(data, sort_keys=True, indent=4, separators=(',', ': '))
To find a particular object, for example the one whose id is 2 (None is returned if there is no match), just do this:
obj = next((obj for obj in data if obj["id"] == 2), None)
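Since a top-level JSON array simply becomes a Python list, you can loop over it or index it like any other list. A minimal Python 3 sketch, using a shortened version of the array from the question as an inline string instead of fetching a URL (the by_id name is just for illustration):

```python
import json

# Shortened version of the array from the question.
raw = ('[{"id": 1, "title": "title", "severity": 12},'
       ' {"id": 2, "title": "title", "severity": 12}]')

data = json.loads(raw)  # a JSON array becomes a Python list of dicts

print(len(data))
for obj in data:
    print(obj['id'], obj['title'])

# If you look objects up by id repeatedly, build an index once.
by_id = {obj['id']: obj for obj in data}
match = by_id.get(2)      # the dict whose id is 2
missing = by_id.get(99)   # None, because no object has id 99
```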
