How to get the profile pictures of tweets using tweepy - python

I am getting the description of tweets. I just want to know whether I can get the profile pictures of the tweets using the tweepy API.
This is the user data I get back for a tweet:
{
'protected': False,
'followers_count': 503785,
'friends_count': 57994,
'listed_count': 212,
'created_at': 'Tue Sep 09 19:10:54 +0000 2014',
'favourites_count': 463435,
'utc_offset': None,
'time_zone': None,
'geo_enabled': True,
'verified': False,
'statuses_count': 105191,
'lang': 'en',
'contributors_enabled': False,
'is_translator': False,
'is_translation_enabled': False,
'profile_background_color': 'C0DEED',
'profile_background_image_url': 'http://abs.twimg.com/images/themes/theme1/bg.png',
'profile_background_image_url_https': 'https://abs.twimg.com/images/themes/theme1/bg.png',
'profile_background_tile': False,
'profile_image_url': 'http://pbs.twimg.com/profile_images/1109808930392363008/OFM92rn__normal.jpg',
'profile_image_url_https': 'https://pbs.twimg.com/profile_images/1109808930392363008/OFM92rn__normal.jpg',
'profile_banner_url': 'https://pbs.twimg.com/profile_banners/2800434769/1552662493',
'profile_link_color': '1DA1F2',
'profile_sidebar_border_color': 'C0DEED',
'profile_sidebar_fill_color': 'DDEEF6',
'profile_text_color': '333333',
'profile_use_background_image': True,
'has_extended_profile': True,
'default_profile': True,
'default_profile_image': False,
'following': None,
'follow_request_sent': None,
'notifications': None,
'translator_type': 'none'
},

The answer is simply to iterate over the results and read the URL from each tweet's user object (tweepy exposes the JSON keys as attributes):
for x in results:
    print(x.user.profile_image_url)
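If you are working with the raw dictionaries instead of tweepy objects, the same lookup can be sketched like this. The sample values are copied from the examples on this page, profile_image_url is a hypothetical helper name, and the full-size trick relies on the assumption that dropping the "_normal" suffix from the URL gives the original upload.

```python
import re

# Sample tweets as plain dicts (the shape tweepy exposes as status._json).
tweets = [
    {"user": {"screen_name": "StephenBurke_",
              "profile_image_url_https": "https://pbs.twimg.com/profile_images/727223698332200961/bGPjGjHK_normal.jpg"}},
    {"user": {"screen_name": "dog_rates",
              "profile_image_url_https": "https://pbs.twimg.com/profile_images/936608706107772929/GwbLQRxf_normal.jpg"}},
]

def profile_image_url(tweet, full_size=False):
    """Return the author's avatar URL for one tweet dict (hypothetical helper)."""
    url = tweet["user"]["profile_image_url_https"]
    if full_size:
        # Assumption: removing the "_normal" suffix yields the original image.
        url = re.sub(r"_normal(\.\w+)$", r"\1", url)
    return url

# Map each author to the full-size avatar URL.
avatars = {t["user"]["screen_name"]: profile_image_url(t, full_size=True) for t in tweets}
for name, url in avatars.items():
    print(name, url)
```

With tweepy objects the equivalent read would be x.user.profile_image_url_https, as in the loop above.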

Related

Invalid Syntax on ast.literal_eval() from data streaming result

I'm using tweepy for streaming and json.loads to get the data. I saved it as txt files.
def on_data(self, data):
    all_data = json.loads(data)
    save_file.write(str(all_data)+"\n")
Now I want to extract several properties from the data, but when I use ast.literal_eval() to get around the quote and comma errors, I get another error.
Traceback (most recent call last):
File "C:\Users\RandomScientist\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 2910, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-27-ffea75fc7446>", line 3, in <module>
data = ast.literal_eval(data)
File "C:\Users\RandomScientist\Anaconda3\lib\ast.py", line 48, in literal_eval
node_or_string = parse(node_or_string, mode='eval')
File "C:\Users\RandomScientist\Anaconda3\lib\ast.py", line 35, in parse
return compile(source, filename, mode, PyCF_ONLY_AST)
File "<unknown>", line 2
{'created_at': 'Thu Apr 04 07:00:10 +0000 2019', 'id': 1113697753530392577, 'id_str': '1113697753530392577', 'text': 'Karena kita adalah suratan terbuka kasih-Nya untuk dunia \n#iamthemessenjah (link)', 'source': 'Facebook', 'truncated': False, 'in_reply_to_status_id': None, 'in_reply_to_status_id_str': None, 'in_reply_to_user_id': None, 'in_reply_to_user_id_str': None, 'in_reply_to_screen_name': None, 'user': {'id': 234355404, 'id_str': '234355404', 'name': 'Messenjah Clothing', 'screen_name': 'MessenjahCloth', 'location': 'YOGYAKARTA', 'url': 'http://www.messenjahclothing.com', 'description': 'THE WORLD CHANGER pages : http://www.facebook.com/messenjahclothingdotcom pin: 578CD443 WA: +6285 727 386 267 IG: #the_messenjah IG product #messenjahstore', 'translator_type': 'none', 'protected': False, 'verified': False, 'followers_count': 3405, 'friends_count': 190, 'listed_count': 7, 'favourites_count': 204, 'statuses_count': 58765, 'created_at': 'Wed Jan 05 13:13:10 +0000 2011', 'utc_offset': None, 'time_zone': None, 'geo_enabled': True, 'lang': 'en', 'contributors_enabled': False, 'is_translator': False, 'profile_background_color': '7E808A', 'profile_background_image_url': 'http://abs.twimg.com/images/themes/theme3/bg.gif', 'profile_background_image_url_https': 'https://abs.twimg.com/images/themes/theme3/bg.gif', 'profile_background_tile': False, 'profile_link_color': '0400DB', 'profile_sidebar_border_color': '000000', 'profile_sidebar_fill_color': '252429', 'profile_text_color': '666666', 'profile_use_background_image': True, 'profile_image_url': 'http://pbs.twimg.com/profile_images/882803510932156417/KenYVq-i_normal.jpg', 'profile_image_url_https': 'https://pbs.twimg.com/profile_images/882803510932156417/KenYVq-i_normal.jpg', 'profile_banner_url': 'https://pbs.twimg.com/profile_banners/234355404/1523949182', 'default_profile': False, 'default_profile_image': False, 'following': None, 'follow_request_sent': None, 'notifications': None}, 'geo': None, 'coordinates': 
None, 'place': None, 'contributors': None, 'is_quote_status': False, 'quote_count': 0, 'reply_count': 0, 'retweet_count': 0, 'favorite_count': 0, 'entities': {'hashtags': [{'text': 'iamthemessenjah', 'indices': [60, 76]}], 'urls': [{'url': 'https: (link)', 'expanded_url': 'https://www.facebook.com/messenjahclothingdotcom/posts/2575823355778752', 'display_url': 'facebook.com/messenjahcloth…', 'indices': [77, 100]}], 'user_mentions': [], 'symbols': []}, 'favorited': False, 'retweeted': False, 'possibly_sensitive': False, 'filter_level': 'low', 'lang': 'in', 'timestamp_ms': '1554361210602'}
^
SyntaxError: invalid syntax
Here's my code:
with open('pre-process.txt','r') as file:
    data = file.read()
    data = ast.literal_eval(data)
    print(data)
I've read several answers, such as "Python request using ast.literal_eval error Invalid syntax?" and "Python ast.literal_eval on dictionary string not working (SyntaxError: invalid syntax)", but didn't find a suitable solution. Any ideas? Thanks in advance.
It appears that you have one value per line in the file, so you need to read it a line at a time and call ast.literal_eval() on the line, not try to evaluate the entire file at once.
with open('pre-process.txt','r') as file:
    for line in file:
        data = ast.literal_eval(line)
        print(data)
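To see the difference without the original file, here is a minimal sketch with two made-up records written the same way as in the question (str(dict) plus a newline). The record contents are invented for illustration.

```python
import ast

# Two records written the way the question does it: str(dict) + "\n".
records = [
    {"id": 1, "text": "first line\nsecond line", "geo": None},
    {"id": 2, "text": "hello", "geo": None},
]
file_text = "".join(str(r) + "\n" for r in records)

# Parsing the whole file at once fails: two dict literals in a row
# are not a single Python expression.
try:
    ast.literal_eval(file_text)
    whole_file_ok = True
except SyntaxError:
    whole_file_ok = False

# Parsing line by line works, because each line is one complete literal.
parsed = [ast.literal_eval(line) for line in file_text.splitlines() if line.strip()]
print(whole_file_ok, parsed == records)
```

Note that the newline inside the tweet text is stored escaped by str(), so each record still occupies exactly one line of the file.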

TypeError: an integer is required when select subset of rows dataframe pandas

{'contributors': None,
'coordinates': None,
'created_at': 'Tue Aug 02 19:51:58 +0000 2016',
'entities': {'hashtags': [],
'symbols': [],
'urls': [],
'user_mentions': [{'id': 873491544,
'id_str': '873491544',
'indices': [0, 13],
'name': 'Kenel M',
'screen_name': 'KxSweaters13'}]},
'favorite_count': 1,
'favorited': False,
'geo': None,
'id': 760563814450491392,
'id_str': '760563814450491392',
'in_reply_to_screen_name': 'KxSweaters13',
'in_reply_to_status_id': None,
'in_reply_to_status_id_str': None,
'in_reply_to_user_id': 873491544,
'in_reply_to_user_id_str': '873491544',
'is_quote_status': False,
'lang': 'en',
'metadata': {'iso_language_code': 'en', 'result_type': 'recent'},
'place': {'attributes': {},
'bounding_box': {'coordinates': [[[-71.813501, 42.4762],
[-71.702186, 42.4762],
[-71.702186, 42.573956],
[-71.813501, 42.573956]]],
'type': 'Polygon'},
'contained_within': [],
'country': 'Australia',
'country_code': 'AUS',
'full_name': 'Melbourne, V',
'id': 'c4f1830ea4b8caaf',
'name': 'Melbourne',
'place_type': 'city',
'url': 'https://api.twitter.com/1.1/geo/id/c4f1830ea4b8caaf.json'},
'retweet_count': 0,
'retweeted': False,
'source': 'Twitter for Android',
'text': '#KxSweaters13 are you the kenelx13 I see owning leominster for team valor?',
'truncated': False,
'user': {'contributors_enabled': False,
'created_at': 'Thu Apr 21 17:09:52 +0000 2011',
'default_profile': False,
'default_profile_image': False,
'description': "Arbys when it's cold. Kimballs when it's warm. #Ally__09 all year. Comp sci classes sometimes.",
'entities': {'description': {'urls': []}},
'favourites_count': 1106,
'follow_request_sent': None,
'followers_count': 167,
'following': None,
'friends_count': 171,
'geo_enabled': True,
'has_extended_profile': False,
'id': 285715182,
'id_str': '285715182',
'is_translation_enabled': False,
'is_translator': False,
'lang': 'en',
'listed_count': 2,
'location': 'MA',
'name': 'Steve',
'notifications': None,
'profile_background_color': '131516',
'profile_background_image_url': 'http://abs.twimg.com/images/themes/theme14/bg.gif',
'profile_background_image_url_https': 'https://abs.twimg.com/images/themes/theme14/bg.gif',
'profile_background_tile': True,
'profile_banner_url': 'https://pbs.twimg.com/profile_banners/285715182/1462218226',
'profile_image_url': 'http://pbs.twimg.com/profile_images/727223698332200961/bGPjGjHK_normal.jpg',
'profile_image_url_https': 'https://pbs.twimg.com/profile_images/727223698332200961/bGPjGjHK_normal.jpg',
'profile_link_color': '4A913C',
'profile_sidebar_border_color': 'FFFFFF',
'profile_sidebar_fill_color': 'EFEFEF',
'profile_text_color': '333333',
'profile_use_background_image': True,
'protected': False,
'screen_name': 'StephenBurke_',
'statuses_count': 5913,
'time_zone': 'Eastern Time (US & Canada)',
'url': None,
'utc_offset': -14400,
'verified': False}}
I have a JSON file which contains a list of JSON objects (each with the structure shown above).
So I read it into a dataframe:
df = pd.read_json('data.json')
and then I try to get all the rows which are of the 'city' type with:
df = df[df['place']['place_type'] == 'city']
but then I get "TypeError: an integer is required", followed by "During handling of the above exception, another exception occurred: KeyError: 'place_type'".
Then I tried:
df['place'].head(3)
=>
0 {'id': '01864a8a64df9dc4', 'url': 'https://api...
1 {'id': '01864a8a64df9dc4', 'url': 'https://api...
2 {'id': '0118c71c0ed41109', 'url': 'https://api...
Name: place, dtype: object
So df['place'] returns a Series whose keys are the row indexes, and that's why I got the TypeError.
I've also tried to select the place_type of the first row and it works just fine:
df.iloc[0]['place']['place_type']
=>
city
The question is how can I filter out the rows in this case?
Solution:
Okay, so the problem lies in the fact that pd.read_json cannot deal with nested JSON structures, so what I did was normalize the JSON objects:
with open('data.json') as jsonfile:
    data = json.load(jsonfile)
# pd.io.json.json_normalize in older pandas; pd.json_normalize since pandas 1.0
df = pd.json_normalize(data)
df = df[df['place.place_type'] == 'city']
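To make the 'place.place_type' column name less mysterious, here is a stdlib-only sketch of the flattening idea. The flatten helper is hypothetical and only approximates what json_normalize does with nested dictionaries; the sample tweets are trimmed-down inventions.

```python
def flatten(record, parent_key="", sep="."):
    """Flatten nested dicts into one level with dotted keys (hypothetical helper)."""
    flat = {}
    for key, value in record.items():
        full_key = parent_key + sep + key if parent_key else key
        if isinstance(value, dict):
            # Recurse so {'place': {'place_type': ...}} becomes 'place.place_type'.
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat

# Trimmed-down tweets; tweets without a location carry 'place': None.
tweets = [
    {"id": 1, "place": {"place_type": "city", "name": "Melbourne"}},
    {"id": 2, "place": None},
    {"id": 3, "place": {"place_type": "admin", "name": "Victoria"}},
]

rows = [flatten(t) for t in tweets]
# .get() avoids a KeyError for rows where 'place' was None and never got flattened.
city_ids = [row["id"] for row in rows if row.get("place.place_type") == "city"]
print(city_ids)
```

Once every nested value has its own flat key, filtering on it is an ordinary column comparison, which is why the DataFrame filter above works after normalizing.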
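You can use a list comprehension to do the filtering you need. Run it over the list of tweet dictionaries loaded from the file (data above) rather than over the DataFrame:
cities = [d for d in data if d['place'] and d['place']['place_type'] == 'city']
This will give you a list whose elements all have a place_type of 'city'.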
I don't know whether you have to use place_type as the index to show all the rows that contain city.
"and then I try to get all the rows which are the city type by:"
This way you can get all the rows that contain city in the column place, provided the column holds plain strings rather than nested dictionaries:
df = df[(df['place'] == 'city')]

Reading Json objects from text file into pandas

I extracted JSON objects from an API library and wrote them to a text file. I am now stuck on how to read the JSON structure saved in the .txt file back into pandas.
There are many resources that walk through importing a JSON file into pandas, but since this is a text file and I'm new to programming and to working with JSON, I'm not sure how to do this efficiently.
There are numerous JSON objects in the text file. I would share an example, but it contains a bunch of URL shorteners that prevent me from posting this question, so unless someone really needs to see the structure I'll hold off. I already tried pd.read_csv() and pd.read_json(), but since this is a JSON structure in a .txt file, neither has worked properly so far.
Here is my best guess so far for getting the data back into Python:
data = []
with open('tweet_json.txt') as f:
    for line in f:
        data.append(json.loads(line))
But I got the following error message when I tried that: JSONDecodeError: Extra data: line 1 column 4626 (char 4625)
Here are two tweets that you can copy and save to a .txt file to replicate:
{'contributors': None,
'coordinates': None,
'created_at': 'Tue Aug 01 16:23:56 +0000 2017',
'display_text_range': [0, 85],
'entities': {'hashtags': [],
'media': [{'display_url': 'pic.twitter.com/MgUWQ76dJU',
'expanded_url': 'https://twitter.com/dog_rates/status/892420643555336193/photo/1',
'id': 892420639486877696,
'id_str': '892420639486877696',
'indices': [86, 109],
'media_url': 'http://pbs.twimg.com/media/DGKD1-bXoAAIAUK.jpg',
'media_url_https': 'https://pbs.twimg.com/media/DGKD1-bXoAAIAUK.jpg',
'sizes': {'large': {'h': 528, 'resize': 'fit', 'w': 540},
'medium': {'h': 528, 'resize': 'fit', 'w': 540},
'small': {'h': 528, 'resize': 'fit', 'w': 540},
'thumb': {'h': 150, 'resize': 'crop', 'w': 150}},
'type': 'photo',
'url': na}],
'symbols': [],
'urls': [],
'user_mentions': []},
'extended_entities': {'media': [{'display_url': 'pic.twitter.com/MgUWQ76dJU',
'expanded_url': 'https://twitter.com/dog_rates/status/892420643555336193/photo/1',
'id': 892420639486877696,
'id_str': '892420639486877696',
'indices': [86, 109],
'media_url': 'http://pbs.twimg.com/media/DGKD1-bXoAAIAUK.jpg',
'media_url_https': 'https://pbs.twimg.com/media/DGKD1-bXoAAIAUK.jpg',
'sizes': {'large': {'h': 528, 'resize': 'fit', 'w': 540},
'medium': {'h': 528, 'resize': 'fit', 'w': 540},
'small': {'h': 528, 'resize': 'fit', 'w': 540},
'thumb': {'h': 150, 'resize': 'crop', 'w': 150}},
'type': 'photo',
'url': na}]},
'favorite_count': 39311,
'favorited': False,
'full_text': "This is Phineas. He's a mystical boy. Only ever appears in the hole of a donut. 13/10 na ",
'geo': None,
'id': 892420643555336193,
'id_str': '892420643555336193',
'in_reply_to_screen_name': None,
'in_reply_to_status_id': None,
'in_reply_to_status_id_str': None,
'in_reply_to_user_id': None,
'in_reply_to_user_id_str': None,
'is_quote_status': False,
'lang': 'en',
'place': None,
'possibly_sensitive': False,
'possibly_sensitive_appealable': False,
'retweet_count': 8778,
'retweeted': False,
'source': 'Twitter for iPhone',
'truncated': False,
'user': {'contributors_enabled': False,
'created_at': 'Sun Nov 15 21:41:29 +0000 2015',
'default_profile': False,
'default_profile_image': False,
'description': 'Only Legit Source for Professional Dog Ratings STORE: #ShopWeRateDogs | IG, FB & SC: WeRateDogs | MOBILE APP: #GoodDogsGame Business: dogratingtwitter#gmail.com',
'entities': {'description': {'urls': []},
'url': {'urls': [{'display_url': 'weratedogs.com',
'expanded_url': 'http://weratedogs.com',
'indices': [0, 23],
'url': na }]}},
'favourites_count': 126135,
'follow_request_sent': False,
'followers_count': 4730764,
'following': False,
'friends_count': 109,
'geo_enabled': True,
'has_extended_profile': True,
'id': 4196983835,
'id_str': '4196983835',
'is_translation_enabled': False,
'is_translator': False,
'lang': 'en',
'listed_count': 3700,
'location': 'DM YOUR DOGS. WE WILL RATE',
'name': 'WeRateDogs™',
'notifications': False,
'profile_background_color': '000000',
'profile_background_image_url': 'http://abs.twimg.com/images/themes/theme1/bg.png',
'profile_background_image_url_https': 'https://abs.twimg.com/images/themes/theme1/bg.png',
'profile_background_tile': False,
'profile_banner_url': 'https://pbs.twimg.com/profile_banners/4196983835/1510812288',
'profile_image_url': 'http://pbs.twimg.com/profile_images/936608706107772929/GwbLQRxf_normal.jpg',
'profile_image_url_https': 'https://pbs.twimg.com/profile_images/936608706107772929/GwbLQRxf_normal.jpg',
'profile_link_color': 'F5ABB5',
'profile_sidebar_border_color': '000000',
'profile_sidebar_fill_color': '000000',
'profile_text_color': '000000',
'profile_use_background_image': False,
'protected': False,
'screen_name': 'dog_rates',
'statuses_count': 6301,
'time_zone': None,
'translator_type': 'none',
'url': n/a,
'utc_offset': None,
'verified': True}}
{'contributors': None,
'coordinates': None,
'created_at': 'Tue Aug 01 00:17:27 +0000 2017',
'display_text_range': [0, 138],
'entities': {'hashtags': [],
'media': [{'display_url': 'pic.twitter.com/0Xxu71qeIV',
'expanded_url': 'https://twitter.com/dog_rates/status/892177421306343426/photo/1',
'id': 892177413194625024,
'id_str': '892177413194625024',
'indices': [139, 162],
'media_url': 'http://pbs.twimg.com/media/DGGmoV4XsAAUL6n.jpg',
'media_url_https': 'https://pbs.twimg.com/media/DGGmoV4XsAAUL6n.jpg',
'sizes': {'large': {'h': 1600, 'resize': 'fit', 'w': 1407},
'medium': {'h': 1200, 'resize': 'fit', 'w': 1055},
'small': {'h': 680, 'resize': 'fit', 'w': 598},
'thumb': {'h': 150, 'resize': 'crop', 'w': 150}},
'type': 'photo',
'url': na}],
'symbols': [],
'urls': [],
'user_mentions': []},
'extended_entities': {'media': [{'display_url': 'pic.twitter.com/0Xxu71qeIV',
'expanded_url': 'https://twitter.com/dog_rates/status/892177421306343426/photo/1',
'id': 892177413194625024,
'id_str': '892177413194625024',
'indices': [139, 162],
'media_url': 'http://pbs.twimg.com/media/DGGmoV4XsAAUL6n.jpg',
'media_url_https': 'https://pbs.twimg.com/media/DGGmoV4XsAAUL6n.jpg',
'sizes': {'large': {'h': 1600, 'resize': 'fit', 'w': 1407},
'medium': {'h': 1200, 'resize': 'fit', 'w': 1055},
'small': {'h': 680, 'resize': 'fit', 'w': 598},
'thumb': {'h': 150, 'resize': 'crop', 'w': 150}},
'type': 'photo',
'url': na}]},
'favorite_count': 33662,
'favorited': False,
'full_text': "This is Tilly. She's just checking pup on you. Hopes you're doing ok. If not, she's available for pats, snugs, boops, the whole bit. 13/10 na,
'geo': None,
'id': 892177421306343426,
'id_str': '892177421306343426',
'in_reply_to_screen_name': None,
'in_reply_to_status_id': None,
'in_reply_to_status_id_str': None,
'in_reply_to_user_id': None,
'in_reply_to_user_id_str': None,
'is_quote_status': False,
'lang': 'en',
'place': None,
'possibly_sensitive': False,
'possibly_sensitive_appealable': False,
'retweet_count': 6431,
'retweeted': False,
'source': 'Twitter for iPhone',
'truncated': False,
'user': {'contributors_enabled': False,
'created_at': 'Sun Nov 15 21:41:29 +0000 2015',
'default_profile': False,
'default_profile_image': False,
'description': 'Only Legit Source for Professional Dog Ratings STORE: #ShopWeRateDogs | IG, FB & SC: WeRateDogs | MOBILE APP: #GoodDogsGame Business: dogratingtwitter#gmail.com',
'entities': {'description': {'urls': []},
'url': {'urls': [{'display_url': 'weratedogs.com',
'expanded_url': 'http://weratedogs.com',
'indices': [0, 23],
'url': na}]}},
'favourites_count': 126135,
'follow_request_sent': False,
'followers_count': 4730865,
'following': False,
'friends_count': 109,
'geo_enabled': True,
'has_extended_profile': True,
'id': 4196983835,
'id_str': '4196983835',
'is_translation_enabled': False,
'is_translator': False,
'lang': 'en',
'listed_count': 3728,
'location': 'DM YOUR DOGS. WE WILL RATE',
'name': 'WeRateDogs™',
'notifications': False,
'profile_background_color': '000000',
'profile_background_image_url': 'http://abs.twimg.com/images/themes/theme1/bg.png',
'profile_background_image_url_https': 'https://abs.twimg.com/images/themes/theme1/bg.png',
'profile_background_tile': False,
'profile_banner_url': 'https://pbs.twimg.com/profile_banners/4196983835/1510812288',
'profile_image_url': 'http://pbs.twimg.com/profile_images/936608706107772929/GwbLQRxf_normal.jpg',
'profile_image_url_https': 'https://pbs.twimg.com/profile_images/936608706107772929/GwbLQRxf_normal.jpg',
'profile_link_color': 'F5ABB5',
'profile_sidebar_border_color': '000000',
'profile_sidebar_fill_color': '000000',
'profile_text_color': '000000',
'profile_use_background_image': False,
'protected': False,
'screen_name': 'dog_rates',
'statuses_count': 6301,
'time_zone': None,
'translator_type': 'none',
'url': na,
'utc_offset': None,
'verified': True}}
Update
The following code produces this error: JSONDecodeError: Expecting ',' delimiter: line 1 column 4627 (char 4626)
with open('tweet_json.txt', 'r') as f:
    datastore = json.load(f)
This post is the closest I've found so far to help me solve my issue:
Python json.loads shows ValueError: Expecting , delimiter: line 1
Thanks everyone for the feedback. I had to adjust the code regarding how I was extracting the data from the API and then it was pretty straight-forward to get the data into a list of dictionaries after that.
with open('tweet_json.txt', 'a+', encoding='utf-8') as file:
    for tweet_id in twitter_archive_df['tweet_id']:
        try:
            tweet = api.get_status(id = tweet_id, tweet_mode='extended')
            # json.dumps needs a dict; if get_status returns a Status object,
            # write tweet._json instead
            file.write(json.dumps(tweet))
            file.write('\n')
        except Exception:
            # skip tweets that can no longer be fetched
            pass
Then I ran the following code to import the JSON objects from the .txt file into a list of dictionaries:
with open('tweet_json.txt') as file:
    status = []
    for line in file:
        status.append(json.loads(line))
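For anyone hitting the same JSONDecodeError, here is a small sketch of why the first attempt failed and why one json.dumps() per line fixes it. The two tweets are cut down to a few fields, and an in-memory buffer stands in for tweet_json.txt so nothing is written to disk.

```python
import io
import json

tweets = [
    {"id": 892420643555336193, "place": None, "favorited": False},
    {"id": 892177421306343426, "place": None, "favorited": False},
]

# str()/pprint output is Python syntax (single quotes, None, False), not JSON,
# so json.loads rejects it.
try:
    json.loads(str(tweets[0]))
    repr_is_json = True
except json.JSONDecodeError:
    repr_is_json = False

# Writing one json.dumps() result per line gives a file that reads back cleanly.
buffer = io.StringIO()  # stands in for open('tweet_json.txt', 'w')
for tweet in tweets:
    buffer.write(json.dumps(tweet) + "\n")

buffer.seek(0)
loaded = [json.loads(line) for line in buffer if line.strip()]
print(repr_is_json, loaded == tweets)
```

The resulting list of dictionaries can then be handed to pandas, for example with pd.DataFrame(loaded), if a dataframe is the end goal.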

Python Tweepy no response on Stream

Hello, I am trying to listen to a tweet stream using Python with the Tweepy library.
I use Python 2.7.11 and installed Tweepy using pip. When I run the following code I get no response and no error. Can you tell me what the problem is and how I can fix it?
from tweepy import Stream
from tweepy import OAuthHandler
from tweepy.streaming import StreamListener
import time
import json
#EDITED 13:25
from tweepy.auth import API
# Twitter Credentials
ckey = 'Consumer Key (API Key)'
csecret = 'Consumer Secret (API Secret)'
atoken = 'Access Token'
asecret = 'Access Token Secret'
class listener(StreamListener):
    def on_data(self, data):
        try:
            tweet = json.loads(data)
            if tweet["lang"] == "nl":
                print tweet["id"]
            return True
        except BaseException, e:
            print 'failed on_data,', str(e)
            time.sleep(5)
    def on_error(self, status):
        print status
auth = OAuthHandler(ckey, csecret)
auth.set_access_token(atoken, asecret)
twitterStream = Stream(auth, listener())
#EDITED 13:25
api = API(auth)
print api.verify_credentials()
# twitterStream.filter( track=lstZoekwaarde, languages="nl" )
twitterStream.filter(track='christmas', languages="nl")
CONSOLE: api.verify_credentials()
User(follow_request_sent=False, has_extended_profile=False, profile_use_background_image=True, _json={u'follow_request_sent': False, u'has_extended_profile': False, u'profile_use_background_image': True, u'default_profile_image': False, u'id': 169505005, u'profile_background_image_url_https': u'https://abs.twimg.com/images/themes/theme1/bg.png', u'verified': False, u'translator_type': u'none', u'profile_text_color': u'333333', u'profile_image_url_https': u'https://pbs.twimg.com/profile_images/1425063736/image_normal.jpg', u'profile_sidebar_fill_color': u'DDEEF6', u'entities': {u'description': {u'urls': []}}, u'followers_count': 7, u'profile_sidebar_border_color': u'C0DEED', u'id_str': u'169505005', u'profile_background_color': u'C0DEED', u'listed_count': 0, u'status': {u'contributors': None, u'truncated': False, u'text': u'aan het werk bij Alfam', u'is_quote_status': False, u'in_reply_to_status_id': None, u'id': 541894460343582720, u'favorite_count': 1, u'source': u'Twitter for Android', u'retweeted': False, u'coordinates': {u'type': u'Point', u'coordinates': [5.207323, 52.0616799]}, u'entities': {u'symbols': [], u'user_mentions': [], u'hashtags': [], u'urls': []}, u'in_reply_to_screen_name': None, u'in_reply_to_user_id': None, u'retweet_count': 0, u'id_str': u'541894460343582720', u'favorited': False, u'geo': {u'type': u'Point', u'coordinates': [52.0616799, 5.207323]}, u'in_reply_to_user_id_str': None, u'lang': u'nl', u'created_at': u'Mon Dec 08 09:58:01 +0000 2014', u'in_reply_to_status_id_str': None, u'place': {u'full_name': u'Bunnik, Nederland', u'url': u'https://api.twitter.com/1.1/geo/id/ef77325fbde0f5ad.json', u'country': u'The Netherlands', u'place_type': u'city', u'bounding_box': {u'type': u'Polygon', u'coordinates': [[[5.1532516, 51.9976555], [5.2803233, 51.9976555], [5.2803233, 52.0801935], [5.1532516, 52.0801935]]]}, u'contained_within': [], u'country_code': u'NL', u'attributes': {}, u'id': u'ef77325fbde0f5ad', u'name': u'Bunnik'}}, 
u'is_translation_enabled': False, u'utc_offset': None, u'statuses_count': 186, u'description': u'', u'friends_count': 7, u'location': u'', u'profile_link_color': u'1DA1F2', u'profile_image_url': u'http://pbs.twimg.com/profile_images/1425063736/image_normal.jpg', u'following': False, u'geo_enabled': True, u'profile_background_image_url': u'http://abs.twimg.com/images/themes/theme1/bg.png', u'screen_name': u'ehoeven', u'lang': u'en', u'profile_background_tile': False, u'favourites_count': 1, u'name': u'Erik Hoeven', u'notifications': False, u'url': None, u'created_at': u'Thu Jul 22 14:12:09 +0000 2010', u'contributors_enabled': False, u'time_zone': None, u'protected': False, u'default_profile': True, u'is_translator': False}, time_zone=None, id=169505005, description=u'', _api=<tweepy.api.API object at 0x7efdf2d5a510>, verified=False, profile_text_color=u'333333', profile_image_url_https=u'https://pbs.twimg.com/profile_images/1425063736/image_normal.jpg', profile_sidebar_fill_color=u'DDEEF6', is_translator=False, geo_enabled=True, entities={u'description': {u'urls': []}}, followers_count=7, protected=False, id_str=u'169505005', default_profile_image=False, listed_count=0, status=Status(contributors=None, truncated=False, text=u'aan het werk bij Alfam', is_quote_status=False, in_reply_to_status_id=None, id=541894460343582720, favorite_count=1, _api=<tweepy.api.API object at 0x7efdf2d5a510>, source=u'Twitter for Android', _json={u'contributors': None, u'truncated': False, u'text': u'aan het werk bij Alfam', u'is_quote_status': False, u'in_reply_to_status_id': None, u'id': 541894460343582720, u'favorite_count': 1, u'source': u'Twitter for Android', u'retweeted': False, u'coordinates': {u'type': u'Point', u'coordinates': [5.207323, 52.0616799]}, u'entities': {u'symbols': [], u'user_mentions': [], u'hashtags': [], u'urls': []}, u'in_reply_to_screen_name': None, u'in_reply_to_user_id': None, u'retweet_count': 0, u'id_str': u'541894460343582720', u'favorited': False, 
u'geo': {u'type': u'Point', u'coordinates': [52.0616799, 5.207323]}, u'in_reply_to_user_id_str': None, u'lang': u'nl', u'created_at': u'Mon Dec 08 09:58:01 +0000 2014', u'in_reply_to_status_id_str': None, u'place': {u'full_name': u'Bunnik, Nederland', u'url': u'https://api.twitter.com/1.1/geo/id/ef77325fbde0f5ad.json', u'country': u'The Netherlands', u'place_type': u'city', u'bounding_box': {u'type': u'Polygon', u'coordinates': [[[5.1532516, 51.9976555], [5.2803233, 51.9976555], [5.2803233, 52.0801935], [5.1532516, 52.0801935]]]}, u'contained_within': [], u'country_code': u'NL', u'attributes': {}, u'id': u'ef77325fbde0f5ad', u'name': u'Bunnik'}}, coordinates={u'type': u'Point', u'coordinates': [5.207323, 52.0616799]}, entities={u'symbols': [], u'user_mentions': [], u'hashtags': [], u'urls': []}, in_reply_to_screen_name=None, id_str=u'541894460343582720', retweet_count=0, in_reply_to_user_id=None, favorited=False, source_url=u'http://twitter.com/download/android', geo={u'type': u'Point', u'coordinates': [52.0616799, 5.207323]}, in_reply_to_user_id_str=None, lang=u'nl', created_at=datetime.datetime(2014, 12, 8, 9, 58, 1), in_reply_to_status_id_str=None, place=Place(_api=<tweepy.api.API object at 0x7efdf2d5a510>, country_code=u'NL', url=u'https://api.twitter.com/1.1/geo/id/ef77325fbde0f5ad.json', country=u'The Netherlands', place_type=u'city', bounding_box=BoundingBox(_api=<tweepy.api.API object at 0x7efdf2d5a510>, type=u'Polygon', coordinates=[[[5.1532516, 51.9976555], [5.2803233, 51.9976555], [5.2803233, 52.0801935], [5.1532516, 52.0801935]]]), contained_within=[], full_name=u'Bunnik, Nederland', attributes={}, id=u'ef77325fbde0f5ad', name=u'Bunnik'), retweeted=False), lang=u'en', utc_offset=None, statuses_count=186, profile_background_color=u'C0DEED', friends_count=7, profile_link_color=u'1DA1F2', profile_image_url=u'http://pbs.twimg.com/profile_images/1425063736/image_normal.jpg', notifications=False, default_profile=True, 
profile_background_image_url_https=u'https://abs.twimg.com/images/themes/theme1/bg.png', profile_background_image_url=u'http://abs.twimg.com/images/themes/theme1/bg.png', name=u'Erik Hoeven', is_translation_enabled=False, profile_background_tile=False, favourites_count=1, screen_name=u'ehoeven', url=None, created_at=datetime.datetime(2010, 7, 22, 14, 12, 9), contributors_enabled=False, location=u'', profile_sidebar_border_color=u'C0DEED', translator_type=u'none', following=False)
Try this syntax; track and languages both expect lists of strings:
twitterStream.filter(track=['christmas'], languages=['nl'])
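Why the list matters can be shown without touching the network. As far as I recall, tweepy 3.x builds the request by comma-joining whatever is passed as track and languages. Treat that as an assumption about the library internals; build_filter_params below is a hypothetical stand-in written for this illustration, not tweepy's code.

```python
def build_filter_params(track, languages):
    """Mimic comma-joining of filter arguments (hypothetical, not tweepy's code)."""
    return {
        "track": ",".join(track),
        "language": ",".join(map(str, languages)),
    }

# Passing plain strings: iterating a string yields single characters,
# so each letter becomes its own search term.
wrong = build_filter_params(track="christmas", languages="nl")

# Passing lists: each element is one term, as intended.
right = build_filter_params(track=["christmas"], languages=["nl"])

print(wrong)
print(right)
```

Under that assumption the original call was filtering on single letters with the languages "n" and "l", which would explain a stream that connects fine but never delivers anything.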

Python Json loads() returning string instead of dictionary?

I'm trying to do some simple JSON parsing using Python 3's built in JSON module, and from reading a bunch of other questions on SO and googling, it seems this is supposed to be pretty straightforward. However, I think I'm getting a string returned instead of the expected dictionary.
Firstly, here is the JSON I am trying to get values from. It's just some output from Twitter's API
[{'in_reply_to_status_id_str': None, 'in_reply_to_screen_name': None, 'retweeted': False, 'in_reply_to_status_id': None, 'contributors': None, 'favorite_count': 0, 'in_reply_to_user_id': None, 'coordinates': None, 'source': 'Twitter Web Client', 'geo': None, 'retweet_count': 0, 'text': 'Tweeting a url \nhttp://t.co/QDVYv6bV90', 'created_at': 'Mon Sep 01 19:36:25 +0000 2014', 'entities': {'symbols': [], 'user_mentions': [], 'urls': [{'expanded_url': 'http://www.isthereanappthat.com', 'display_url': 'isthereanappthat.com', 'url': 'http://t.co/QDVYv6bV90', 'indices': [16, 38]}], 'hashtags': []}, 'id_str': '506526005943865344', 'in_reply_to_user_id_str': None, 'truncated': False, 'favorited': False, 'lang': 'en', 'possibly_sensitive': False, 'id': 506526005943865344, 'user': {'profile_text_color': '333333', 'time_zone': None, 'entities': {'description': {'urls': []}}, 'url': None, 'profile_background_image_url': 'http://abs.twimg.com/images/themes/theme1/bg.png', 'profile_background_image_url_https': 'https://abs.twimg.com/images/themes/theme1/bg.png', 'protected': False, 'default_profile_image': True, 'utc_offset': None, 'default_profile': True, 'screen_name': 'KickzWatch', 'follow_request_sent': False, 'following': False, 'profile_background_color': 'C0DEED', 'notifications': False, 'description': '', 'profile_sidebar_border_color': 'C0DEED', 'geo_enabled': False, 'verified': False, 'friends_count': 40, 'created_at': 'Mon Sep 01 16:29:18 +0000 2014', 'is_translator': False, 'profile_sidebar_fill_color': 'DDEEF6', 'statuses_count': 4, 'location': '', 'id_str': '2784389341', 'followers_count': 4, 'favourites_count': 0, 'contributors_enabled': False, 'is_translation_enabled': False, 'lang': 'en', 'profile_image_url': 'http://abs.twimg.com/sticky/default_profile_images/default_profile_6_normal.png', 'profile_image_url_https': 'https://abs.twimg.com/sticky/default_profile_images/default_profile_6_normal.png', 'id': 2784389341, 'profile_use_background_image': True, 
'listed_count': 0, 'profile_background_tile': False, 'name': 'Maktub Destiny', 'profile_link_color': '0084B4'}, 'place': None}]
I assigned this String to a variable named json_string like so:
json_string = json.dumps(output)
jason = json.loads(json_string)
Then, when I try to get a specific key from the "jason" dictionary:
print(jason['hashtags'])
I'm getting an error:
TypeError: string indices must be integers
I want to be able to convert the json output to a dictionary, then use jason[key_name] call to get values using specified keys. Is there something obvious that I'm missing here?
This is my first time working with Python, after coming from Java. I absolutely love the language and think it's very powerful. So, any help on this would be greatly appreciated!
Ok first you should print your object so that you can read it:
>>> from pprint import pprint
>>> output = [{'in_reply_to_status_id_str': None, 'in_reply_to_screen_name': None, 'retweeted': False, 'in_reply_to_status_id': None, 'contributors': None, 'favorite_count': 0, 'in_reply_to_user_id': None, 'coordinates': None, 'source': 'Twitter Web Client', 'geo': None, 'retweet_count': 0, 'text': 'Tweeting a url \nhttp://t.co/QDVYv6bV90', 'created_at': 'Mon Sep 01 19:36:25 +0000 2014', 'entities': {'symbols': [], 'user_mentions': [], 'urls': [{'expanded_url': 'http://www.isthereanappthat.com', 'display_url': 'isthereanappthat.com', 'url': 'http://t.co/QDVYv6bV90', 'indices': [16, 38]}], 'hashtags': []}, 'id_str': '506526005943865344', 'in_reply_to_user_id_str': None, 'truncated': False, 'favorited': False, 'lang': 'en', 'possibly_sensitive': False, 'id': 506526005943865344, 'user': {'profile_text_color': '333333', 'time_zone': None, 'entities': {'description': {'urls': []}}, 'url': None, 'profile_background_image_url': 'http://abs.twimg.com/images/themes/theme1/bg.png', 'profile_background_image_url_https': 'https://abs.twimg.com/images/themes/theme1/bg.png', 'protected': False, 'default_profile_image': True, 'utc_offset': None, 'default_profile': True, 'screen_name': 'KickzWatch', 'follow_request_sent': False, 'following': False, 'profile_background_color': 'C0DEED', 'notifications': False, 'description': '', 'profile_sidebar_border_color': 'C0DEED', 'geo_enabled': False, 'verified': False, 'friends_count': 40, 'created_at': 'Mon Sep 01 16:29:18 +0000 2014', 'is_translator': False, 'profile_sidebar_fill_color': 'DDEEF6', 'statuses_count': 4, 'location': '', 'id_str': '2784389341', 'followers_count': 4, 'favourites_count': 0, 'contributors_enabled': False, 'is_translation_enabled': False, 'lang': 'en', 'profile_image_url': 'http://abs.twimg.com/sticky/default_profile_images/default_profile_6_normal.png', 'profile_image_url_https': 'https://abs.twimg.com/sticky/default_profile_images/default_profile_6_normal.png', 'id': 2784389341, 'profile_use_background_image': 
True, 'listed_count': 0, 'profile_background_tile': False, 'name': 'Maktub Destiny', 'profile_link_color': '0084B4'}, 'place': None}]
>>> from pprint import pprint
>>> pprint(output)
[{'contributors': None,
'coordinates': None,
'created_at': 'Mon Sep 01 19:36:25 +0000 2014',
'entities': {'hashtags': [],
'symbols': [],
'urls': [{'display_url': 'isthereanappthat.com',
'expanded_url': 'http://www.isthereanappthat.com',
'indices': [16, 38],
'url': 'http://t.co/QDVYv6bV90'}],
'user_mentions': []},
'favorite_count': 0,
'favorited': False,
'geo': None,
'id': 506526005943865344,
'id_str': '506526005943865344',
'in_reply_to_screen_name': None,
'in_reply_to_status_id': None,
'in_reply_to_status_id_str': None,
'in_reply_to_user_id': None,
'in_reply_to_user_id_str': None,
'lang': 'en',
'place': None,
'possibly_sensitive': False,
'retweet_count': 0,
'retweeted': False,
'source': 'Twitter Web Client',
'text': 'Tweeting a url \nhttp://t.co/QDVYv6bV90',
'truncated': False,
'user': {'contributors_enabled': False,
'created_at': 'Mon Sep 01 16:29:18 +0000 2014',
'default_profile': True,
'default_profile_image': True,
'description': '',
'entities': {'description': {'urls': []}},
'favourites_count': 0,
'follow_request_sent': False,
'followers_count': 4,
'following': False,
'friends_count': 40,
'geo_enabled': False,
'id': 2784389341,
'id_str': '2784389341',
'is_translation_enabled': False,
'is_translator': False,
'lang': 'en',
'listed_count': 0,
'location': '',
'name': 'Maktub Destiny',
'notifications': False,
'profile_background_color': 'C0DEED',
'profile_background_image_url': 'http://abs.twimg.com/images/themes/theme1/bg.png',
'profile_background_image_url_https': 'https://abs.twimg.com/images/themes/theme1/bg.png',
'profile_background_tile': False,
'profile_image_url': 'http://abs.twimg.com/sticky/default_profile_images/default_profile_6_normal.png',
'profile_image_url_https': 'https://abs.twimg.com/sticky/default_profile_images/default_profile_6_normal.png',
'profile_link_color': '0084B4',
'profile_sidebar_border_color': 'C0DEED',
'profile_sidebar_fill_color': 'DDEEF6',
'profile_text_color': '333333',
'profile_use_background_image': True,
'protected': False,
'screen_name': 'KickzWatch',
'statuses_count': 4,
'time_zone': None,
'url': None,
'utc_offset': None,
'verified': False}}]
From looking at this you can see that output is a list which contains a single dict. To access this you need:
>>> first_elem = output[0]
You will also see that the hashtags key in the first_elem is contained in a second level dict under the key entities:
>>> entities = first_elem['entities']
>>> pprint(entities)
{'hashtags': [],
'symbols': [],
'urls': [{'display_url': 'isthereanappthat.com',
'expanded_url': 'http://www.isthereanappthat.com',
'indices': [16, 38],
'url': 'http://t.co/QDVYv6bV90'}],
'user_mentions': []}
Now you are able to access hashtags:
>>> entities['hashtags']
[]
Which just happens to be the empty list.
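If you have many tweets rather than one, the same lookup can be wrapped in a small helper. This is only a sketch, assuming each tweet is a dict shaped like the one above and that each hashtag entry carries a 'text' key; the helper name hashtag_texts and the sample data are made up for illustration. It uses .get so that a tweet without an entities key gives an empty list instead of a KeyError:

```python
def hashtag_texts(tweet):
    """Return the hashtag strings of one decoded tweet dict (hypothetical helper)."""
    # Fall back to an empty dict when 'entities' is missing or None.
    entities = tweet.get('entities') or {}
    return [tag['text'] for tag in entities.get('hashtags', [])]


# Made-up sample data, trimmed to the keys used here.
tweets = [
    {'entities': {'hashtags': []}},
    {'entities': {'hashtags': [{'text': 'python', 'indices': [0, 7]}]}},
    {'text': 'no entities key at all'},
]

for tweet in tweets:
    print(hashtag_texts(tweet))
```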
To convert to JSON, note the comment:
>>> import json
>>> # Make sure output is the list object not a string representing the object
>>> json_string = json.dumps(output)
>>> jason = json.loads(json_string)
>>> jason[0]['entities']['hashtags']
[]
I think your problem is that output was already a string when you passed it to json.dumps, which means json.loads will hand back that string rather than a list of dicts.
And @Dan's answer is correct: this is not valid JSON. It is, however, a valid Python literal (a list containing a dict), and I'm assuming that you got it from Twitter using Python and then printed it.
I did json.loads(json.loads(string)) and was able to get the dictionary. You can check it out. The first call doesn't just return the same string: it decodes the outer layer (e.g. removes the escaping \\ characters), and the second call then parses the result.
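Why two json.loads calls can be needed is easier to see with a small experiment. This is a sketch with made-up data, not the asker's actual file: it encodes a list twice, so the first decode returns a string and only the second returns the list.

```python
import json

# Made-up data standing in for a decoded tweet list.
data = [{'entities': {'hashtags': []}}]

encoded_once = json.dumps(data)           # JSON text describing the list
encoded_twice = json.dumps(encoded_once)  # JSON text describing a *string*

first = json.loads(encoded_twice)
print(type(first).__name__)   # str: only the outer layer was decoded

second = json.loads(first)
print(type(second).__name__)  # list: now the real structure is back
print(second[0]['entities']['hashtags'])
```

If the data was only encoded once, a single json.loads is enough and the second call would fail with a TypeError, so it is worth checking the type of the first result before decoding again.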
First off, your JSON example is not valid JSON; the Twitter API would not output this, because it would break every conforming JSON consumer.
jsonlint shows the first, obvious syntax error: single-quoted rather than double quoted strings.
Secondly, you have None where JSON requires null, False instead of false, and True instead of true.
Your alleged "JSON" example appears to have been pre-decoded into Python :). When I use a snippet of real JSON, it works exactly as expected:
import json
json_string = r"""
[{"actual_json_key":"actual_json_value"}]
"""
jason = json.loads(json_string)
print(jason[0]["actual_json_key"])
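To see the difference between a printed Python object and real JSON side by side, here is a rough sketch using a made-up three-key dict. It assumes the problem text was produced by calling str() on a decoded object: json.loads rejects that text, ast.literal_eval can read it back, and json.dumps produces text that json.loads accepts.

```python
import ast
import json

# Made-up sample standing in for a decoded tweet.
tweet = {'text': 'hello', 'retweeted': False, 'place': None}

python_repr = str(tweet)       # single quotes, False, None
json_text = json.dumps(tweet)  # double quotes, false, null

# The Python repr is not JSON, so json.loads raises an error on it.
try:
    json.loads(python_repr)
    repr_is_json = True
except json.JSONDecodeError:
    repr_is_json = False

print(repr_is_json)
print(ast.literal_eval(python_repr) == tweet)  # the repr is a valid Python literal
print(json.loads(json_text) == tweet)          # real JSON round-trips
```

If you control the code that writes the file, writing json.dumps(obj) instead of str(obj) avoids the problem at the source.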
