Twitter API search returns truncated tweets - python

I'm building a Python program that retrieves tweets matching a certain keyword. The search itself works, but the tweets come back truncated.
How can I get the full text of a tweet?
Code (using the python-twitter module; sample output follows the code):
import json
import twitter

api = twitter.Api(consumer_key=CONSUMER_KEY,
                  consumer_secret=CONSUMER_SECRET,
                  access_token_key=ACCESS_TOKEN,
                  access_token_secret=ACCESS_SECRET)
results = api.GetSearch(term="car", since="2018-04-11", until="2018-04-12", count=5)
for twt in results:
    # this relies on str(twt) producing a JSON string
    tempTweet = str(twt)
    tweet = json.loads(tempTweet)
    for key in tweet:
        print(str(key) + ": " + str(tweet[key]))
    print("#############################################")
SAMPLE OUTPUT:
created_at: Wed Apr 11 20:55:25 +0000 2018
favorite_count: 1573
hashtags: []
id: 984173096566341632
id_str: 984173096566341632
lang: en
retweet_count: 1480
source: TweetDeck
**text**: Caution: Disturbing video. Car speeds through red light, striking pedestrian during vigil Wednesday for cyclist kil… **SHORTENEDURLHERE**
truncated: True
urls: [{'expanded_url':'https://twitter.com/i/web/status/984173096566341632', 'url':**SHORTENEDURLHERE**}]
user: {'created_at': 'Wed Nov 14 17:43:42 +0000 2007', 'description': 'KTLA has been keeping Southern California informed since 1947. \n\nHave great video, photos or story tips? Share with us using #ktla.', 'favourites_count': 1078, 'followers_count': 717397, 'friends_count': 769, 'geo_enabled': True, 'id': 10252962, 'id_str': '10252962', 'lang': 'en', 'listed_count': 3885, 'location': 'Los Angeles, CA', 'name': 'KTLA', 'profile_background_color': '040718', 'profile_background_image_url': 'http://pbs.twimg.com/profile_background_images/507323957578436608/olqcU4MS.jpeg', 'profile_background_image_url_https': 'https://pbs.twimg.com/profile_background_images/507323957578436608/olqcU4MS.jpeg', 'profile_banner_url': 'https://pbs.twimg.com/profile_banners/10252962/1369959990', 'profile_image_url': 'http://pbs.twimg.com/profile_images/809849913240481792/YQ0aT9hv_normal.jpg', 'profile_image_url_https': 'https://pbs.twimg.com/profile_images/809849913240481792/YQ0aT9hv_normal.jpg', 'profile_link_color': '24009C', 'profile_sidebar_border_color': 'FFFFFF', 'profile_sidebar_fill_color': '95E8EC', 'profile_text_color': '3C3940', 'profile_use_background_image': True, 'screen_name': 'KTLA', 'statuses_count': 144937, 'time_zone': 'Pacific Time (US & Canada)', 'url': '**SHORTENEDURLHERE**', 'utc_offset': -25200, 'verified': True}
user_mentions: []
#############################################

You just need to pass tweet_mode='extended' when initialising Api. The untruncated text then comes back in the full_text field instead of text.
import json
import twitter

api = twitter.Api(consumer_key=CONSUMER_KEY,
                  consumer_secret=CONSUMER_SECRET,
                  access_token_key=ACCESS_TOKEN,
                  access_token_secret=ACCESS_SECRET,
                  tweet_mode='extended')
results = api.GetSearch(term="car", since="2018-04-11", until="2018-04-12", count=5)
for twt in results:
    tempTweet = str(twt)
    tweet = json.loads(tempTweet)
    print(tweet)
This prints:
{u'lang': u'en', u'full_text': u'Have you ever been in so much trouble that you\u2019ve narrowed your options down to a. Winning the lottery b. Wrapping your car around a telephone pole and c. Giving the creepy neighborhood millionaire the date he keeps pestering for at Cheescake Factory? \nPffffffttt. Me either. <twitter link>', u'media': [{u'expanded_url': u'https://twitter.com/_jkate/status/984219542061703168/photo/1', u'sizes': {u'large': {u'h': 1280, u'w': 719, u'resize': u'fit'}, u'small': {u'h': 680, u'w': 382, u'resize': u'fit'}, u'medium': {u'h': 1200, u'w': 674, u'resize': u'fit'}, u'thumb': {u'h': 150, u'w': 150, u'resize': u'crop'}}, u'url': u'<twitter link>', u'media_url_https': u'https://pbs.twimg.com/media/Daimc02VQAAulWe.jpg', u'display_url': u'pic.twitter.com/ZfCeeZN4g0', u'type': u'photo', u'id': 984219532733530112, u'media_url': u'http://pbs.twimg.com/media/Daimc02VQAAulWe.jpg'}], u'created_at': u'Wed Apr 11 23:59:59 +0000 2018', u'hashtags': [], u'user_mentions': [], u'source': u'Twitter for iPhone', u'id_str': u'984219542061703168', u'urls': [], u'retweet_count': 2, u'id': 984219542061703168, u'favorite_count': 83, u'user': {u'profile_use_background_image': True, u'id': 492519212, u'profile_image_url_https': u'https://pbs.twimg.com/profile_images/985737787306422273/aJykLNLj_normal.jpg', u'profile_sidebar_fill_color': u'F3F3F3', u'profile_text_color': u'333333', u'followers_count': 13069, u'location': u'United States', u'profile_background_color': u'EBEBEB', u'id_str': u'492519212', u'utc_offset': -21600, u'statuses_count': 1543, u'description': u'Illegitimate love child of digital marketing and \u2615\ufe0f. 
Instagram: <twitter link>', u'friends_count': 11470, u'profile_link_color': u'990000', u'profile_image_url': u'http://pbs.twimg.com/profile_images/985737787306422273/aJykLNLj_normal.jpg', u'profile_background_image_url_https': u'https://abs.twimg.com/images/themes/theme7/bg.gif', u'profile_banner_url': u'https://pbs.twimg.com/profile_banners/492519212/1509938057', u'profile_background_image_url': u'http://abs.twimg.com/images/themes/theme7/bg.gif', u'screen_name': u'_jkate', u'lang': u'en', u'favourites_count': 8060, u'name': u'\U0001f319J Kate \U0001f4ab', u'created_at': u'Tue Feb 14 20:23:54 +0000 2012', u'time_zone': u'Mountain Time (US & Canada)', u'profile_sidebar_border_color': u'DFDFDF', u'listed_count': 55}}
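Since the two outputs above use different keys (text in the default mode, full_text in extended mode), code that consumes the parsed dictionaries may want to handle both. A minimal sketch, assuming the tweet has already been parsed into a dict; the helper name tweet_text and the sample dictionaries are made up for illustration:

```python
def tweet_text(tweet):
    """Return full_text when present, else fall back to the (possibly truncated) text."""
    return tweet.get("full_text") or tweet.get("text", "")

# Hand-written stand-ins for the two payload shapes (not real API output).
compat = {"text": "Caution: Disturbing video.", "truncated": True}
extended = {"full_text": "Have you ever been in so much trouble?", "lang": "en"}

texts = [tweet_text(compat), tweet_text(extended)]
for t in texts:
    print(t)
```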

Related

Invalid Syntax on ast.literal_eval() from data streaming result

I'm using tweepy for streaming and json.loads to parse the data, which I then save to a text file.
def on_data(self, data):
    all_data = json.loads(data)
    save_file.write(str(all_data)+"\n")
Now I want to extract several properties from the data. When I use ast.literal_eval() to get around the quote and comma errors, I get another error:
Traceback (most recent call last):
File "C:\Users\RandomScientist\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 2910, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-27-ffea75fc7446>", line 3, in <module>
data = ast.literal_eval(data)
File "C:\Users\RandomScientist\Anaconda3\lib\ast.py", line 48, in literal_eval
node_or_string = parse(node_or_string, mode='eval')
File "C:\Users\RandomScientist\Anaconda3\lib\ast.py", line 35, in parse
return compile(source, filename, mode, PyCF_ONLY_AST)
File "<unknown>", line 2
{'created_at': 'Thu Apr 04 07:00:10 +0000 2019', 'id': 1113697753530392577, 'id_str': '1113697753530392577', 'text': 'Karena kita adalah suratan terbuka kasih-Nya untuk dunia \n#iamthemessenjah (link)', 'source': 'Facebook', 'truncated': False, 'in_reply_to_status_id': None, 'in_reply_to_status_id_str': None, 'in_reply_to_user_id': None, 'in_reply_to_user_id_str': None, 'in_reply_to_screen_name': None, 'user': {'id': 234355404, 'id_str': '234355404', 'name': 'Messenjah Clothing', 'screen_name': 'MessenjahCloth', 'location': 'YOGYAKARTA', 'url': 'http://www.messenjahclothing.com', 'description': 'THE WORLD CHANGER pages : http://www.facebook.com/messenjahclothingdotcom pin: 578CD443 WA: +6285 727 386 267 IG: #the_messenjah IG product #messenjahstore', 'translator_type': 'none', 'protected': False, 'verified': False, 'followers_count': 3405, 'friends_count': 190, 'listed_count': 7, 'favourites_count': 204, 'statuses_count': 58765, 'created_at': 'Wed Jan 05 13:13:10 +0000 2011', 'utc_offset': None, 'time_zone': None, 'geo_enabled': True, 'lang': 'en', 'contributors_enabled': False, 'is_translator': False, 'profile_background_color': '7E808A', 'profile_background_image_url': 'http://abs.twimg.com/images/themes/theme3/bg.gif', 'profile_background_image_url_https': 'https://abs.twimg.com/images/themes/theme3/bg.gif', 'profile_background_tile': False, 'profile_link_color': '0400DB', 'profile_sidebar_border_color': '000000', 'profile_sidebar_fill_color': '252429', 'profile_text_color': '666666', 'profile_use_background_image': True, 'profile_image_url': 'http://pbs.twimg.com/profile_images/882803510932156417/KenYVq-i_normal.jpg', 'profile_image_url_https': 'https://pbs.twimg.com/profile_images/882803510932156417/KenYVq-i_normal.jpg', 'profile_banner_url': 'https://pbs.twimg.com/profile_banners/234355404/1523949182', 'default_profile': False, 'default_profile_image': False, 'following': None, 'follow_request_sent': None, 'notifications': None}, 'geo': None, 'coordinates': 
None, 'place': None, 'contributors': None, 'is_quote_status': False, 'quote_count': 0, 'reply_count': 0, 'retweet_count': 0, 'favorite_count': 0, 'entities': {'hashtags': [{'text': 'iamthemessenjah', 'indices': [60, 76]}], 'urls': [{'url': 'https: (link)', 'expanded_url': 'https://www.facebook.com/messenjahclothingdotcom/posts/2575823355778752', 'display_url': 'facebook.com/messenjahcloth…', 'indices': [77, 100]}], 'user_mentions': [], 'symbols': []}, 'favorited': False, 'retweeted': False, 'possibly_sensitive': False, 'filter_level': 'low', 'lang': 'in', 'timestamp_ms': '1554361210602'}
^
SyntaxError: invalid syntax
Here's my code:
with open('pre-process.txt','r') as file:
    data = file.read()
    data = ast.literal_eval(data)
    print(data)
I've read several answers, such as "Python request using ast.literal_eval error Invalid syntax?" and "Python ast.literal_eval on dictionary string not working (SyntaxError: invalid syntax)", but didn't find a suitable solution. Any ideas? Thanks in advance.
It appears that you have one value per line in the file, so you need to read it a line at a time and call ast.literal_eval() on the line, not try to evaluate the entire file at once.
import ast

with open('pre-process.txt','r') as file:
    for line in file:
        data = ast.literal_eval(line)
        print(data)
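To see why the whole-file call fails while the per-line call works, here is a small self-contained sketch. The record is invented for illustration; it only mimics the shape of what str(all_data) writes. It also shows that such lines are Python literals rather than JSON, which is why json.loads cannot read them back:

```python
import ast
import json

# A made-up record standing in for one streamed tweet.
record = {"id": 1, "text": "it's a test", "truncated": False, "geo": None}

repr_line = str(record)         # what write(str(all_data)) stores
json_line = json.dumps(record)  # what a JSON-based writer would store instead

# The repr form uses single quotes, False and None, so it is not valid JSON.
try:
    json.loads(repr_line)
    repr_is_json = True
except json.JSONDecodeError:
    repr_is_json = False

# Each line on its own is a valid Python literal.
from_repr = ast.literal_eval(repr_line)
from_json = json.loads(json_line)

# Two records on separate lines are not a single expression.
two_lines = repr_line + "\n" + repr_line + "\n"
try:
    ast.literal_eval(two_lines)
    whole_file_ok = True
except SyntaxError:
    whole_file_ok = False

# Evaluating line by line works.
parsed = [ast.literal_eval(line) for line in two_lines.splitlines() if line.strip()]
```

If the writer can be changed, storing json.dumps(all_data) per line and reading with json.loads avoids evaluating Python literals altogether.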

Reading Json objects from text file into pandas

I extracted JSON objects with an API library and wrote them to a text file. I'm now stuck on how to read the JSON saved in the .txt file back into pandas.
There are many resources that walk through importing a JSON file into pandas, but since this is a text file, and I'm new to programming and to working with JSON, I'm not sure how to do this efficiently.
There are numerous JSON objects in the text file. I would share an example, but it contains a bunch of shortened URLs that prevent me from posting this question, so unless someone really needs to see the structure I'll hold off. I already tried pd.read_csv() and pd.read_json(), but neither has worked properly on this file so far.
Here is my best guess so far for getting the data back into Python:
data = []
with open('tweet_json.txt') as f:
    for line in f:
        data.append(json.loads(line))
But I got the following error message when I tried that: JSONDecodeError: Extra data: line 1 column 4626 (char 4625)
Here are two tweets that you can copy and save to a .txt file to replicate:
{'contributors': None,
'coordinates': None,
'created_at': 'Tue Aug 01 16:23:56 +0000 2017',
'display_text_range': [0, 85],
'entities': {'hashtags': [],
'media': [{'display_url': 'pic.twitter.com/MgUWQ76dJU',
'expanded_url': 'https://twitter.com/dog_rates/status/892420643555336193/photo/1',
'id': 892420639486877696,
'id_str': '892420639486877696',
'indices': [86, 109],
'media_url': 'http://pbs.twimg.com/media/DGKD1-bXoAAIAUK.jpg',
'media_url_https': 'https://pbs.twimg.com/media/DGKD1-bXoAAIAUK.jpg',
'sizes': {'large': {'h': 528, 'resize': 'fit', 'w': 540},
'medium': {'h': 528, 'resize': 'fit', 'w': 540},
'small': {'h': 528, 'resize': 'fit', 'w': 540},
'thumb': {'h': 150, 'resize': 'crop', 'w': 150}},
'type': 'photo',
'url': na}],
'symbols': [],
'urls': [],
'user_mentions': []},
'extended_entities': {'media': [{'display_url': 'pic.twitter.com/MgUWQ76dJU',
'expanded_url': 'https://twitter.com/dog_rates/status/892420643555336193/photo/1',
'id': 892420639486877696,
'id_str': '892420639486877696',
'indices': [86, 109],
'media_url': 'http://pbs.twimg.com/media/DGKD1-bXoAAIAUK.jpg',
'media_url_https': 'https://pbs.twimg.com/media/DGKD1-bXoAAIAUK.jpg',
'sizes': {'large': {'h': 528, 'resize': 'fit', 'w': 540},
'medium': {'h': 528, 'resize': 'fit', 'w': 540},
'small': {'h': 528, 'resize': 'fit', 'w': 540},
'thumb': {'h': 150, 'resize': 'crop', 'w': 150}},
'type': 'photo',
'url': na}]},
'favorite_count': 39311,
'favorited': False,
'full_text': "This is Phineas. He's a mystical boy. Only ever appears in the hole of a donut. 13/10 na ",
'geo': None,
'id': 892420643555336193,
'id_str': '892420643555336193',
'in_reply_to_screen_name': None,
'in_reply_to_status_id': None,
'in_reply_to_status_id_str': None,
'in_reply_to_user_id': None,
'in_reply_to_user_id_str': None,
'is_quote_status': False,
'lang': 'en',
'place': None,
'possibly_sensitive': False,
'possibly_sensitive_appealable': False,
'retweet_count': 8778,
'retweeted': False,
'source': 'Twitter for iPhone',
'truncated': False,
'user': {'contributors_enabled': False,
'created_at': 'Sun Nov 15 21:41:29 +0000 2015',
'default_profile': False,
'default_profile_image': False,
'description': 'Only Legit Source for Professional Dog Ratings STORE: #ShopWeRateDogs | IG, FB & SC: WeRateDogs | MOBILE APP: #GoodDogsGame Business: dogratingtwitter#gmail.com',
'entities': {'description': {'urls': []},
'url': {'urls': [{'display_url': 'weratedogs.com',
'expanded_url': 'http://weratedogs.com',
'indices': [0, 23],
'url': na }]}},
'favourites_count': 126135,
'follow_request_sent': False,
'followers_count': 4730764,
'following': False,
'friends_count': 109,
'geo_enabled': True,
'has_extended_profile': True,
'id': 4196983835,
'id_str': '4196983835',
'is_translation_enabled': False,
'is_translator': False,
'lang': 'en',
'listed_count': 3700,
'location': 'DM YOUR DOGS. WE WILL RATE',
'name': 'WeRateDogs™',
'notifications': False,
'profile_background_color': '000000',
'profile_background_image_url': 'http://abs.twimg.com/images/themes/theme1/bg.png',
'profile_background_image_url_https': 'https://abs.twimg.com/images/themes/theme1/bg.png',
'profile_background_tile': False,
'profile_banner_url': 'https://pbs.twimg.com/profile_banners/4196983835/1510812288',
'profile_image_url': 'http://pbs.twimg.com/profile_images/936608706107772929/GwbLQRxf_normal.jpg',
'profile_image_url_https': 'https://pbs.twimg.com/profile_images/936608706107772929/GwbLQRxf_normal.jpg',
'profile_link_color': 'F5ABB5',
'profile_sidebar_border_color': '000000',
'profile_sidebar_fill_color': '000000',
'profile_text_color': '000000',
'profile_use_background_image': False,
'protected': False,
'screen_name': 'dog_rates',
'statuses_count': 6301,
'time_zone': None,
'translator_type': 'none',
'url': n/a,
'utc_offset': None,
'verified': True}}
{'contributors': None,
'coordinates': None,
'created_at': 'Tue Aug 01 00:17:27 +0000 2017',
'display_text_range': [0, 138],
'entities': {'hashtags': [],
'media': [{'display_url': 'pic.twitter.com/0Xxu71qeIV',
'expanded_url': 'https://twitter.com/dog_rates/status/892177421306343426/photo/1',
'id': 892177413194625024,
'id_str': '892177413194625024',
'indices': [139, 162],
'media_url': 'http://pbs.twimg.com/media/DGGmoV4XsAAUL6n.jpg',
'media_url_https': 'https://pbs.twimg.com/media/DGGmoV4XsAAUL6n.jpg',
'sizes': {'large': {'h': 1600, 'resize': 'fit', 'w': 1407},
'medium': {'h': 1200, 'resize': 'fit', 'w': 1055},
'small': {'h': 680, 'resize': 'fit', 'w': 598},
'thumb': {'h': 150, 'resize': 'crop', 'w': 150}},
'type': 'photo',
'url': na}],
'symbols': [],
'urls': [],
'user_mentions': []},
'extended_entities': {'media': [{'display_url': 'pic.twitter.com/0Xxu71qeIV',
'expanded_url': 'https://twitter.com/dog_rates/status/892177421306343426/photo/1',
'id': 892177413194625024,
'id_str': '892177413194625024',
'indices': [139, 162],
'media_url': 'http://pbs.twimg.com/media/DGGmoV4XsAAUL6n.jpg',
'media_url_https': 'https://pbs.twimg.com/media/DGGmoV4XsAAUL6n.jpg',
'sizes': {'large': {'h': 1600, 'resize': 'fit', 'w': 1407},
'medium': {'h': 1200, 'resize': 'fit', 'w': 1055},
'small': {'h': 680, 'resize': 'fit', 'w': 598},
'thumb': {'h': 150, 'resize': 'crop', 'w': 150}},
'type': 'photo',
'url': na}]},
'favorite_count': 33662,
'favorited': False,
'full_text': "This is Tilly. She's just checking pup on you. Hopes you're doing ok. If not, she's available for pats, snugs, boops, the whole bit. 13/10 na,
'geo': None,
'id': 892177421306343426,
'id_str': '892177421306343426',
'in_reply_to_screen_name': None,
'in_reply_to_status_id': None,
'in_reply_to_status_id_str': None,
'in_reply_to_user_id': None,
'in_reply_to_user_id_str': None,
'is_quote_status': False,
'lang': 'en',
'place': None,
'possibly_sensitive': False,
'possibly_sensitive_appealable': False,
'retweet_count': 6431,
'retweeted': False,
'source': 'Twitter for iPhone',
'truncated': False,
'user': {'contributors_enabled': False,
'created_at': 'Sun Nov 15 21:41:29 +0000 2015',
'default_profile': False,
'default_profile_image': False,
'description': 'Only Legit Source for Professional Dog Ratings STORE: #ShopWeRateDogs | IG, FB & SC: WeRateDogs | MOBILE APP: #GoodDogsGame Business: dogratingtwitter#gmail.com',
'entities': {'description': {'urls': []},
'url': {'urls': [{'display_url': 'weratedogs.com',
'expanded_url': 'http://weratedogs.com',
'indices': [0, 23],
'url': na}]}},
'favourites_count': 126135,
'follow_request_sent': False,
'followers_count': 4730865,
'following': False,
'friends_count': 109,
'geo_enabled': True,
'has_extended_profile': True,
'id': 4196983835,
'id_str': '4196983835',
'is_translation_enabled': False,
'is_translator': False,
'lang': 'en',
'listed_count': 3728,
'location': 'DM YOUR DOGS. WE WILL RATE',
'name': 'WeRateDogs™',
'notifications': False,
'profile_background_color': '000000',
'profile_background_image_url': 'http://abs.twimg.com/images/themes/theme1/bg.png',
'profile_background_image_url_https': 'https://abs.twimg.com/images/themes/theme1/bg.png',
'profile_background_tile': False,
'profile_banner_url': 'https://pbs.twimg.com/profile_banners/4196983835/1510812288',
'profile_image_url': 'http://pbs.twimg.com/profile_images/936608706107772929/GwbLQRxf_normal.jpg',
'profile_image_url_https': 'https://pbs.twimg.com/profile_images/936608706107772929/GwbLQRxf_normal.jpg',
'profile_link_color': 'F5ABB5',
'profile_sidebar_border_color': '000000',
'profile_sidebar_fill_color': '000000',
'profile_text_color': '000000',
'profile_use_background_image': False,
'protected': False,
'screen_name': 'dog_rates',
'statuses_count': 6301,
'time_zone': None,
'translator_type': 'none',
'url': na,
'utc_offset': None,
'verified': True}}
Update
The following code produces this error: JSONDecodeError: Expecting ',' delimiter: line 1 column 4627 (char 4626)
with open('tweet_json.txt', 'r') as f:
    datastore = json.load(f)
This post is the closest I've found so far to help me solve my issue:
Python json.loads shows ValueError: Expecting , delimiter: line 1
Thanks everyone for the feedback. I had to change how I extracted the data from the API, writing each tweet as one JSON document per line. After that it was straightforward to get the data into a list of dictionaries.
with open('tweet_json.txt', 'a+', encoding='utf-8') as file:
    for tweet_id in twitter_archive_df['tweet_id']:
        try:
            tweet = api.get_status(id = tweet_id, tweet_mode='extended')
            # json.dumps needs a plain dict; if get_status returns a tweepy
            # Status object instead, its _json attribute holds that dict
            file.write(json.dumps(tweet))
            file.write('\n')
        except Exception:
            # skip tweets that cannot be fetched or serialized
            pass
# no file.close() needed: the with block closes the file
Then I ran the following code to import the JSON objects from the .txt file into a list of dictionaries:
with open('tweet_json.txt') as file:
    status = []
    for line in file:
        status.append(json.loads(line))
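The write-then-read pattern above can be tried without API access. The following is only a sketch under stated assumptions: the two records and the temporary file path are invented, and the flattening step at the end is one possible way to prepare rows for a table, not something the original post did.

```python
import json
import os
import tempfile

# Two made-up records standing in for API responses.
tweets = [
    {"id": 1, "full_text": "first tweet", "user": {"screen_name": "a"}},
    {"id": 2, "full_text": "second tweet", "user": {"screen_name": "b"}},
]

path = os.path.join(tempfile.mkdtemp(), "tweet_json.txt")

# Write one JSON document per line.
with open(path, "w", encoding="utf-8") as f:
    for tweet in tweets:
        f.write(json.dumps(tweet) + "\n")

# Read it back line by line; blank lines are skipped defensively.
with open(path, encoding="utf-8") as f:
    status = [json.loads(line) for line in f if line.strip()]

# Keep only the flat columns wanted in a table.
rows = [
    {"id": t["id"], "text": t["full_text"], "screen_name": t["user"]["screen_name"]}
    for t in status
]
```

A list of flat dictionaries like rows can then be passed to pandas.DataFrame(rows) to get one row per tweet.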

Python :: nested JSON result in Spotify

I'm having a really hard time getting a track ID from the Spotify search endpoint.
The result is deeply nested.
So, if I do this:
results = sp.search(q='artist:' + 'Nirvana' + ' track:' + 'Milk it', type='track')
pprint.pprint(results)
I get:
{u'tracks': {u'href': u'https://api.spotify.com/v1/search?query=artist%3ANirvana+track%3AMilk+it&type=track&offset=0&limit=10',
u'items': [{u'album': {u'album_type': u'album',
u'artists': [{u'external_urls': {u'spotify': u'https://open.spotify.com/artist/6olE6TJLqED3rqDCT0FyPh'},
u'href': u'https://api.spotify.com/v1/artists/6olE6TJLqED3rqDCT0FyPh',
u'id': u'6olE6TJLqED3rqDCT0FyPh',
u'name': u'Nirvana',
u'type': u'artist',
u'uri': u'spotify:artist:6olE6TJLqED3rqDCT0FyPh'}],
u'available_markets': [u'CA',
u'MX',
u'US'],
u'external_urls': {u'spotify': u'https://open.spotify.com/album/7wOOA7l306K8HfBKfPoafr'},
u'href': u'https://api.spotify.com/v1/albums/7wOOA7l306K8HfBKfPoafr',
u'id': u'7wOOA7l306K8HfBKfPoafr',
u'images': [{u'height': 640,
u'url': u'https://i.scdn.co/image/3dd2699f0fcf661c35d45745313b64e50f63f91f',
u'width': 640},
{u'height': 300,
u'url': u'https://i.scdn.co/image/a6c604a82d274e4728a8660603ef31ea35e9e1bd',
u'width': 300},
{u'height': 64,
u'url': u'https://i.scdn.co/image/f52728b0ecf5b6bfc998dfd0f6e5b6b5cdfe73f1',
u'width': 64}],
u'name': u'In Utero - 20th Anniversary Remaster',
u'type': u'album',
u'uri': u'spotify:album:7wOOA7l306K8HfBKfPoafr'},
u'artists': [{u'external_urls': {u'spotify': u'https://open.spotify.com/artist/6olE6TJLqED3rqDCT0FyPh'},
u'href': u'https://api.spotify.com/v1/artists/6olE6TJLqED3rqDCT0FyPh',
u'id': u'6olE6TJLqED3rqDCT0FyPh',
u'name': u'Nirvana',
u'type': u'artist',
u'uri': u'spotify:artist:6olE6TJLqED3rqDCT0FyPh'}],
u'available_markets': [u'CA', u'MX', u'US'],
u'disc_number': 1,
u'duration_ms': 234746,
u'explicit': False,
u'external_ids': {u'isrc': u'USGF19960708'},
u'external_urls': {u'spotify': u'https://open.spotify.com/track/4rtZtLpriBscg7zta3TZxp'},
u'href': u'https://api.spotify.com/v1/tracks/4rtZtLpriBscg7zta3TZxp',
u'id': u'4rtZtLpriBscg7zta3TZxp',
u'name': u'Milk It',
u'popularity': 43,
u'preview_url': None,
u'track_number': 8,
u'type': u'track',
-----> u'uri':u'spotify:track:4rtZtLpriBscg7zta3TZxp'},
QUESTION:
Now, how do I fetch the last 'uri' (u'uri': u'spotify:track:4rtZtLpriBscg7zta3TZxp'), the one under the name 'Milk It'?
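results['tracks']['items'] is a list of tracks, so index its first element and read the 'uri' key:
>>> print(results['tracks']['items'][0]['uri'])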
spotify:track:4rtZtLpriBscg7zta3TZxp
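Indexing [0] assumes the wanted track is the first hit. A slightly more defensive sketch, using a trimmed, hand-written stand-in for the response shown in the question (the helper name first_track_uri is hypothetical), looks the track up by name and returns None when nothing matches:

```python
# Trimmed, hand-written stand-in for the search response.
results = {
    "tracks": {
        "items": [
            {
                "name": "Milk It",
                "id": "4rtZtLpriBscg7zta3TZxp",
                "uri": "spotify:track:4rtZtLpriBscg7zta3TZxp",
            }
        ]
    }
}

def first_track_uri(search_results, wanted_name):
    """Return the uri of the first track whose name matches (case-insensitive), else None."""
    for item in search_results.get("tracks", {}).get("items", []):
        if item.get("name", "").lower() == wanted_name.lower():
            return item["uri"]
    return None

uri = first_track_uri(results, "milk it")
missing = first_track_uri({"tracks": {"items": []}}, "milk it")
print(uri)
```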

Python Tweepy no response on Stream

Hello, I'm trying to listen to a tweet stream using Python with the Tweepy library.
I use Python 2.7.11 and installed Tweepy using pip. When I run the following code I get no response and no error. Can you tell me what the problem is and how I can fix it?
from tweepy import Stream
from tweepy import OAuthHandler
from tweepy.streaming import StreamListener
import time
import json
#EDITED 13:25
from tweepy.auth import API
# Twitter Credentials
ckey = 'Consumer Key (API Key)'
csecret = 'Consumer Secret (API Secret)'
atoken = 'Access Token'
asecret = 'Access Token Secret'
class listener(StreamListener):
    def on_data(self, data):
        try:
            tweet = json.loads(data)
            if tweet["lang"] == "nl":
                print tweet["id"]
            return True
        except BaseException, e:
            print 'failed on_data,', str(e)
            time.sleep(5)
    def on_error(self, status):
        print status
auth = OAuthHandler(ckey, csecret)
auth.set_access_token(atoken, asecret)
twitterStream = Stream(auth, listener())
#EDITED 13:25
api = API(auth)  # not in the original post; needed for the call below
print api.verify_credentials()
# twitterStream.filter( track=lstZoekwaarde, languages="nl" )
twitterStream.filter(track='christmas', languages="nl")
CONSOLE: api.verify_credentials()
User(follow_request_sent=False, has_extended_profile=False, profile_use_background_image=True, _json={u'follow_request_sent': False, u'has_extended_profile': False, u'profile_use_background_image': True, u'default_profile_image': False, u'id': 169505005, u'profile_background_image_url_https': u'https://abs.twimg.com/images/themes/theme1/bg.png', u'verified': False, u'translator_type': u'none', u'profile_text_color': u'333333', u'profile_image_url_https': u'https://pbs.twimg.com/profile_images/1425063736/image_normal.jpg', u'profile_sidebar_fill_color': u'DDEEF6', u'entities': {u'description': {u'urls': []}}, u'followers_count': 7, u'profile_sidebar_border_color': u'C0DEED', u'id_str': u'169505005', u'profile_background_color': u'C0DEED', u'listed_count': 0, u'status': {u'contributors': None, u'truncated': False, u'text': u'aan het werk bij Alfam', u'is_quote_status': False, u'in_reply_to_status_id': None, u'id': 541894460343582720, u'favorite_count': 1, u'source': u'Twitter for Android', u'retweeted': False, u'coordinates': {u'type': u'Point', u'coordinates': [5.207323, 52.0616799]}, u'entities': {u'symbols': [], u'user_mentions': [], u'hashtags': [], u'urls': []}, u'in_reply_to_screen_name': None, u'in_reply_to_user_id': None, u'retweet_count': 0, u'id_str': u'541894460343582720', u'favorited': False, u'geo': {u'type': u'Point', u'coordinates': [52.0616799, 5.207323]}, u'in_reply_to_user_id_str': None, u'lang': u'nl', u'created_at': u'Mon Dec 08 09:58:01 +0000 2014', u'in_reply_to_status_id_str': None, u'place': {u'full_name': u'Bunnik, Nederland', u'url': u'https://api.twitter.com/1.1/geo/id/ef77325fbde0f5ad.json', u'country': u'The Netherlands', u'place_type': u'city', u'bounding_box': {u'type': u'Polygon', u'coordinates': [[[5.1532516, 51.9976555], [5.2803233, 51.9976555], [5.2803233, 52.0801935], [5.1532516, 52.0801935]]]}, u'contained_within': [], u'country_code': u'NL', u'attributes': {}, u'id': u'ef77325fbde0f5ad', u'name': u'Bunnik'}}, 
u'is_translation_enabled': False, u'utc_offset': None, u'statuses_count': 186, u'description': u'', u'friends_count': 7, u'location': u'', u'profile_link_color': u'1DA1F2', u'profile_image_url': u'http://pbs.twimg.com/profile_images/1425063736/image_normal.jpg', u'following': False, u'geo_enabled': True, u'profile_background_image_url': u'http://abs.twimg.com/images/themes/theme1/bg.png', u'screen_name': u'ehoeven', u'lang': u'en', u'profile_background_tile': False, u'favourites_count': 1, u'name': u'Erik Hoeven', u'notifications': False, u'url': None, u'created_at': u'Thu Jul 22 14:12:09 +0000 2010', u'contributors_enabled': False, u'time_zone': None, u'protected': False, u'default_profile': True, u'is_translator': False}, time_zone=None, id=169505005, description=u'', _api=<tweepy.api.API object at 0x7efdf2d5a510>, verified=False, profile_text_color=u'333333', profile_image_url_https=u'https://pbs.twimg.com/profile_images/1425063736/image_normal.jpg', profile_sidebar_fill_color=u'DDEEF6', is_translator=False, geo_enabled=True, entities={u'description': {u'urls': []}}, followers_count=7, protected=False, id_str=u'169505005', default_profile_image=False, listed_count=0, status=Status(contributors=None, truncated=False, text=u'aan het werk bij Alfam', is_quote_status=False, in_reply_to_status_id=None, id=541894460343582720, favorite_count=1, _api=<tweepy.api.API object at 0x7efdf2d5a510>, source=u'Twitter for Android', _json={u'contributors': None, u'truncated': False, u'text': u'aan het werk bij Alfam', u'is_quote_status': False, u'in_reply_to_status_id': None, u'id': 541894460343582720, u'favorite_count': 1, u'source': u'Twitter for Android', u'retweeted': False, u'coordinates': {u'type': u'Point', u'coordinates': [5.207323, 52.0616799]}, u'entities': {u'symbols': [], u'user_mentions': [], u'hashtags': [], u'urls': []}, u'in_reply_to_screen_name': None, u'in_reply_to_user_id': None, u'retweet_count': 0, u'id_str': u'541894460343582720', u'favorited': False, 
u'geo': {u'type': u'Point', u'coordinates': [52.0616799, 5.207323]}, u'in_reply_to_user_id_str': None, u'lang': u'nl', u'created_at': u'Mon Dec 08 09:58:01 +0000 2014', u'in_reply_to_status_id_str': None, u'place': {u'full_name': u'Bunnik, Nederland', u'url': u'https://api.twitter.com/1.1/geo/id/ef77325fbde0f5ad.json', u'country': u'The Netherlands', u'place_type': u'city', u'bounding_box': {u'type': u'Polygon', u'coordinates': [[[5.1532516, 51.9976555], [5.2803233, 51.9976555], [5.2803233, 52.0801935], [5.1532516, 52.0801935]]]}, u'contained_within': [], u'country_code': u'NL', u'attributes': {}, u'id': u'ef77325fbde0f5ad', u'name': u'Bunnik'}}, coordinates={u'type': u'Point', u'coordinates': [5.207323, 52.0616799]}, entities={u'symbols': [], u'user_mentions': [], u'hashtags': [], u'urls': []}, in_reply_to_screen_name=None, id_str=u'541894460343582720', retweet_count=0, in_reply_to_user_id=None, favorited=False, source_url=u'http://twitter.com/download/android', geo={u'type': u'Point', u'coordinates': [52.0616799, 5.207323]}, in_reply_to_user_id_str=None, lang=u'nl', created_at=datetime.datetime(2014, 12, 8, 9, 58, 1), in_reply_to_status_id_str=None, place=Place(_api=<tweepy.api.API object at 0x7efdf2d5a510>, country_code=u'NL', url=u'https://api.twitter.com/1.1/geo/id/ef77325fbde0f5ad.json', country=u'The Netherlands', place_type=u'city', bounding_box=BoundingBox(_api=<tweepy.api.API object at 0x7efdf2d5a510>, type=u'Polygon', coordinates=[[[5.1532516, 51.9976555], [5.2803233, 51.9976555], [5.2803233, 52.0801935], [5.1532516, 52.0801935]]]), contained_within=[], full_name=u'Bunnik, Nederland', attributes={}, id=u'ef77325fbde0f5ad', name=u'Bunnik'), retweeted=False), lang=u'en', utc_offset=None, statuses_count=186, profile_background_color=u'C0DEED', friends_count=7, profile_link_color=u'1DA1F2', profile_image_url=u'http://pbs.twimg.com/profile_images/1425063736/image_normal.jpg', notifications=False, default_profile=True, 
profile_background_image_url_https=u'https://abs.twimg.com/images/themes/theme1/bg.png', profile_background_image_url=u'http://abs.twimg.com/images/themes/theme1/bg.png', name=u'Erik Hoeven', is_translation_enabled=False, profile_background_tile=False, favourites_count=1, screen_name=u'ehoeven', url=None, created_at=datetime.datetime(2010, 7, 22, 14, 12, 9), contributors_enabled=False, location=u'', profile_sidebar_border_color=u'C0DEED', translator_type=u'none', following=False)
Try passing track and languages as lists instead of plain strings:
twitterStream.filter(track=['christmas'], languages=['nl'])
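One plausible reason the string form returns nothing: if the client builds the request by comma-joining the items of track (an assumption about the library's internals, not something stated in the post), a bare string is iterated character by character. The sketch below reproduces only that joining step in plain Python; build_track_param is a hypothetical stand-in, not a tweepy function.

```python
# Assumption: the client turns `track` into a comma-separated string
# by joining its items, as ",".join(track) would.
def build_track_param(track):
    return ",".join(track)

as_string = build_track_param("christmas")   # a str is iterated character by character
as_list = build_track_param(["christmas"])   # a list is iterated item by item

print(as_string)
print(as_list)
```

Under that assumption, the string form would ask the stream for single letters rather than the word itself.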

Why do I get a pymongo.cursor.Cursor when trying to query my mongodb db via pymongo?

I have stored a bunch of tweets in a MongoDB database and would like to query them using pymongo, for example by screen_name. However, when I try to do this, Python does not return a tweet but a message about pymongo.cursor.Cursor. Here is my code:
import sys
import pymongo
from pymongo import Connection
connection = Connection()
db = connection.test
tweets = db.tweets
list(tweets.find())[:1]
I get a JSON, which looks like this:
{u'_id': ObjectId('51c8878fadb68a0b96c6ebf1'),
u'contributors': None,
u'coordinates': {u'coordinates': [-75.24692983, 43.06183036],
u'type': u'Point'},
u'created_at': u'Mon Jun 24 17:53:19 +0000 2013',
u'entities': {u'hashtags': [],
u'symbols': [],
u'urls': [],
u'user_mentions': []},
u'favorite_count': 0,
u'favorited': False,
u'filter_level': u'medium',
u'geo': {u'coordinates': [43.06183036, -75.24692983], u'type': u'Point'},
u'id': 349223725943623680L,
u'id_str': u'349223725943623680',
u'in_reply_to_screen_name': None,
u'in_reply_to_status_id': None,
u'in_reply_to_status_id_str': None,
u'in_reply_to_user_id': None,
u'in_reply_to_user_id_str': None,
u'lang': u'en',
u'place': {u'attributes': {},
u'bounding_box': {u'coordinates': [[[-79.76259, 40.477399],
[-79.76259, 45.015865],
[-71.777491, 45.015865],
[-71.777491, 40.477399]]],
u'type': u'Polygon'},
u'country': u'United States',
u'country_code': u'US',
u'full_name': u'New York, US',
u'id': u'94965b2c45386f87',
u'name': u'New York',
u'place_type': u'admin',
u'url': u'http://api.twitter.com/1/geo/id/94965b2c45386f87.json'},
u'retweet_count': 0,
u'retweeted': False,
u'source': u'Twitter for iPhone',
u'text': u'Currently having a heat stroke',
u'truncated': False,
u'user': {u'contributors_enabled': False,
u'created_at': u'Fri Oct 28 02:04:05 +0000 2011',
u'default_profile': False,
u'default_profile_image': False,
u'description': u'young and so mischievious',
u'favourites_count': 1798,
u'follow_request_sent': None,
u'followers_count': 368,
u'following': None,
u'friends_count': 335,
u'geo_enabled': True,
u'id': 399801173,
u'id_str': u'399801173',
u'is_translator': False,
u'lang': u'en',
u'listed_count': 0,
u'location': u'Upstate New York',
u'name': u'Joe Catanzarita',
u'notifications': None,
u'profile_background_color': u'D6640D',
u'profile_background_image_url': u'http://a0.twimg.com/profile_background_images/702001815/f87508e73bbfab8c8c85ebe10b29fcf6.png',
u'profile_background_image_url_https': u'https://si0.twimg.com/profile_background_images/702001815/f87508e73bbfab8c8c85ebe10b29fcf6.png',
u'profile_background_tile': True,
u'profile_banner_url': u'https://pbs.twimg.com/profile_banners/399801173/1367200323',
u'profile_image_url': u'http://a0.twimg.com/profile_images/378800000012256721/d8b5f801fb331de6ead4aed42dc77a46_normal.jpeg',
u'profile_image_url_https': u'https://si0.twimg.com/profile_images/378800000012256721/d8b5f801fb331de6ead4aed42dc77a46_normal.jpeg' ,
u'profile_link_color': u'140DE0',
u'profile_sidebar_border_color': u'FFFFFF',
u'profile_sidebar_fill_color': u'E0F5A6',
u'profile_text_color': u'120212',
u'profile_use_background_image': True,
u'protected': False,
u'screen_name': u'JoeCatanzarita',
u'statuses_count': 6402,
u'time_zone': u'Quito',
u'url': None,
u'utc_offset': -18000,
u'verified': False}}
However, when I try to query for this screen_name, I get:
tweets.find({"screen_name": "JoeCatanzarita"})
<pymongo.cursor.Cursor at 0x52c02f0>
And when I then try to count the number of tweets which have "screen_name": "name", I get:
tweets.find({"screen_name": "name"}).count()
0
Any idea what I am doing wrong/how I can get pymongo to return the tweets I am looking for?
Thanks!
PyMongo's find() method returns a Cursor. To actually execute the query on the server and retrieve results, iterate the cursor with list or a for loop:
for doc in tweets.find({'screen_name': 'name'}):
    print(doc)
# Or:
docs = list(tweets.find({'screen_name': 'name'}))
If tweets.find({"screen_name": "name"}).count() returns 0, it means no documents match your query.
Edit: now that you've posted an example document, I see you want to query like:
list(tweets.find({'user.screen_name': 'name'}))
... since the screen_name field is embedded in the user sub-document.
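To make the dot notation concrete without a running database, here is a plain-Python illustration of what a dotted key refers to. It is not PyMongo code: get_path and matches are hypothetical helpers that only imitate an equality filter on nested dictionaries, and the document is a trimmed stand-in for the tweet in the question.

```python
def get_path(document, dotted_key):
    """Follow a dotted key such as 'user.screen_name' through nested dicts."""
    current = document
    for part in dotted_key.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current

def matches(document, query):
    """Equality-only match, loosely imitating a simple find() filter."""
    return all(get_path(document, key) == value for key, value in query.items())

# Trimmed stand-in for the stored tweet.
tweet = {"text": "Currently having a heat stroke",
         "user": {"screen_name": "JoeCatanzarita"}}

top_level = get_path(tweet, "screen_name")    # no such top-level field
nested = get_path(tweet, "user.screen_name")  # found inside the user sub-document
```

This mirrors why the query on screen_name counted 0 documents while the query on user.screen_name finds the tweet.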
I think the problem is that "screen_name" is inside a sub-document. If you can provide the document structure, I may be able to help you.
OK, now I see what your problem is:
If you look carefully at your document, you will notice that "screen_name" is inside the sub-document user, so to access it all you have to do is the following:
tweets.find({"user.screen_name": "JoeCatanzarita"}) #for example.
Whenever the element you are trying to find is inside a sub-document, as in this situation, or inside an array, use this dot-notation syntax.
I had this same problem with a collection.find() call.
I checked the type of the items the result yields and they are Python dicts, so I iterated over the result, even though there was only one item, and it works like a charm.
myResult = db.find({}, {<!-- blah blah blah for the fields you want -->}).sort("_id", 1).limit(1)
for item in myResult:
    print(item)
I know this was ages ago but I spent some time surfing this and couldn't find an easy explanation.
Hope this helps.
