How to access Twitter included data object?

I extracted the following Twitter data using Tweepy, but I cannot fetch values from the includes object. Specifically, I am trying to fetch the image URL and the user description. The json_response shows that both the URL and the description are present.
My data has the following structure:
{
"data": [
{
"attachments": {
"media_keys": [
"3_1376989039262195713"
]
},
"author_id": "964661980551266304",
"created_at": "2021-03-30T20:05:45.000Z",
"id": "1376989044039544836",
"text": "#RichardGrenell I also want to speak out against this FB group who blocked me (after asking me to invite all my friends) for making the point that this recall not be made a MAGA one. \n\nI didn\u2019t stump on the ground for Trump, I did it for my children."
},
{
"attachments": {
"media_keys": [
"3_1376986160963145736",
"3_1376986160988368898",
"3_1376986160963198980",
"3_1376986160954757129"
]
},
"author_id": "1000347213145563136",
"created_at": "2021-03-30T19:54:20.000Z",
"id": "1376986169704071171",
"text": "#Bobbrock8013 #irishson19161 #RandPaul It's ok to question the election of Trump, but if you question Biden's win you are a \"domestic terrorist.\" Does the Biden Admin welcome a discussion of opposing views on policies regarding lockdowns, masks and vaccines? Why is Big Tech censoring conservatives? Fascists censor."
},
{
"attachments": {
"media_keys": [
"3_1376961169450221571"
]
},
"author_id": "328673472",
"created_at": "2021-03-30T18:15:00.000Z",
"id": "1376961171841036291",
"text": "#ByronYork Newsworthy, but Democrats via their minions will likely censor Trump's statement from Twitter, Facebook, CNN, MSNBC, Washington Post, NY Times etc You know our free speech rules now are based on the Democrats' version of what they will ALLOW us Deplorables to say let alone think."
},
{
"author_id": "18774517",
"created_at": "2021-03-30T10:31:58.000Z",
"id": "1376844643837566986",
"text": "RT #BrexitBuster: #EditingMike #LauraHa15799415 I\u2019m old enough to remember when Piers Morgan was Donald J Trump\u2019s number one fanboy. Are yo\u2026"
},
{
"author_id": "52405628",
"created_at": "2021-03-30T10:30:33.000Z",
"id": "1376844286646480899",
"text": "RT #BrexitBuster: #EditingMike #LauraHa15799415 I\u2019m old enough to remember when Piers Morgan was Donald J Trump\u2019s number one fanboy. Are yo\u2026"
},
{
"author_id": "848911132496723969",
"created_at": "2021-03-30T10:30:11.000Z",
"id": "1376844194921250818",
"text": "RT #BrexitBuster: #EditingMike #LauraHa15799415 I\u2019m old enough to remember when Piers Morgan was Donald J Trump\u2019s number one fanboy. Are yo\u2026"
},
{
"attachments": {
"media_keys": [
"3_1376836461601898499",
"3_1376836461614542853"
]
},
"author_id": "848911132496723969",
"created_at": "2021-03-30T09:59:37.000Z",
"id": "1376836504308305921",
"text": "#EditingMike #LauraHa15799415 I\u2019m old enough to remember when Piers Morgan was Donald J Trump\u2019s number one fanboy. Are you?\n\nThen he praised Joe Biden\u2019s speech... until he was offered the chance to pen a vicious hatchet piece for the Daily Mail! Pointing this out earned me a block.\n#shapeshiftingcreep"
},
{
"attachments": {
"media_keys": [
"3_1376821889004363777"
]
},
"author_id": "31308988",
"created_at": "2021-03-30T09:01:34.000Z",
"id": "1376821895811715073",
"text": "A lady sent this to my messenger right before she blocked me because she was mad I typed the names of Trump's sex assault victims"
},
{
"attachments": {
"media_keys": [
"3_1376704749379145731"
]
},
"author_id": "198202008",
"created_at": "2021-03-30T01:16:05.000Z",
"id": "1376704753145643014",
"text": "#moondancer34 #MrCrispyMAGA #lonelymilkshake #EFMoriarty #CBSNews Who is this person who blocked me? A MAGA lover? Guess that\u2019s why. But how ironic that he\u2019s a Trump supporter yet a WA fan when Woody is about as liberal as they get. In fact he donated to Hillary\u2019s campaign so she\u2019d win against Trump. Whatever! \ud83d\ude02\ud83e\udd37\u200d\u2640\ufe0f"
}
],
"includes": {
"media": [
{
"media_key": "3_1376989039262195713",
"type": "photo",
"url": "https://pbs.twimg.com/media/ExwMPFDUYAEHKn0.jpg"
},
{
"media_key": "3_1376986160963145736",
"type": "photo",
"url": "https://pbs.twimg.com/media/ExwJnijWUAgfPlb.jpg"
},
{
"media_key": "3_1376986160988368898",
"type": "photo",
"url": "https://pbs.twimg.com/media/ExwJnipXMAIHmJp.jpg"
},
{
"media_key": "3_1376986160963198980",
"type": "photo",
"url": "https://pbs.twimg.com/media/ExwJnijXIAQ4F_x.jpg"
},
{
"media_key": "3_1376986160954757129",
"type": "photo",
"url": "https://pbs.twimg.com/media/ExwJnihWUAkr8bi.jpg"
},
{
"media_key": "3_1376961169450221571",
"type": "photo",
"url": "https://pbs.twimg.com/media/Exvy416WQAMRlO0.jpg"
},
{
"media_key": "3_1376836461601898499",
"type": "photo",
"url": "https://pbs.twimg.com/media/ExuBd4-WQAMgTTR.jpg"
},
{
"media_key": "3_1376836461614542853",
"type": "photo",
"url": "https://pbs.twimg.com/media/ExuBd5BXMAU2-p_.jpg"
},
{
"media_key": "3_1376821889004363777",
"type": "photo",
"url": "https://pbs.twimg.com/media/Ext0Np0WYAEUBXy.jpg"
},
{
"media_key": "3_1376704749379145731",
"type": "photo",
"url": "https://pbs.twimg.com/media/ExsJrOtWUAMgVxk.jpg"
}
],
"users": [
{
"created_at": "2018-02-17T00:45:13.000Z",
"description": "Congressional Candidate for CA-28 Proud Angeleno/Catholic/Californio by marriage Localist\u2022Centrist\u2022Pragmatist\u2022Realist",
"id": "964661980551266304",
"name": "Beatrice Cardenas",
"username": "RealBetyCardens"
},
{
"created_at": "2018-05-26T12:05:35.000Z",
"description": "Following President Trump .... KAG 2020 \ud83c\uddfa\ud83c\uddf8",
"id": "1000347213145563136",
"name": "Joseph Fong",
"username": "JosephEugeneFo1"
},
{
"created_at": "2011-07-03T20:29:43.000Z",
"description": "Husband, Dad, Granddad, Christian,Army MP Sgt vet, I.U. grad, former banker & retired City Finance Director, Reagan guy. Cancer survivor. \u271d\ufe0f\ud83c\uddfa\ud83c\uddf8",
"id": "328673472",
"name": "Steve B",
"username": "Stevebfrs"
},
{
"created_at": "2009-01-08T19:06:29.000Z",
"description": "a younger Victor Meldrew but interesting - I hope - nice sometimes !",
"id": "18774517",
"name": "NORBET",
"username": "NORBET"
},
{
"created_at": "2009-06-30T14:17:41.000Z",
"description": "Tanglewood and Gretsch",
"id": "52405628",
"name": "FSociety Tom \ud83c\uddea\ud83c\uddfa #FBPE ANTIFA #RESIST #FBPPR #BLM",
"username": "thebdaman"
},
{
"created_at": "2017-04-03T14:52:40.000Z",
"description": "We are the Remain Resistance... popping Brexit bubbles one at a time. Mostly sarcasm, occasionally deadly serious. Love the UK & the EU. Detest racism & Nazis.",
"id": "848911132496723969",
"name": "Brexit Buster",
"username": "BrexitBuster"
},
{
"created_at": "2009-04-15T02:18:58.000Z",
"description": "No DMs !!! \ud83c\udf0a \ud83c\udf0a\nBLM ,Trans lives matter, LGBT \ud83c\udf08\nAlly of all marginalized",
"id": "31308988",
"name": "Stephy Pachuco (Her, She) \ud83c\udf0a\ud83c\udf0a",
"username": "Stephaniespc"
},
{
"created_at": "2010-10-03T16:56:45.000Z",
"description": "How'd you know I was looking at you if you weren't looking at me? \ud83d\udde3Mike Patton \u2615\ufe0fCoffee \ud83d\ude0eWeekends \ud83c\udf0aPolitics \ud83d\ude0dNYC \ud83e\udd96Museum Employee",
"id": "198202008",
"name": "Patti\ud83d\uddfd",
"username": "PattiFromNYC"
}
]
},
"meta": {
"newest_id": "1376989044039544836",
"next_token": "b26v89c19zqg8o3fosqtjm19orv2gber5hh7b0fu7uem5",
"oldest_id": "1376704753145643014",
"result_count": 9
}
}
I can successfully fetch 'id', 'text', 'created_at', and 'author_id' from the data object using the following code. However, the code does not retrieve 'url' and 'description' from the includes object, which leaves me with two empty columns.
import csv
import dateutil.parser

# Create file
csvFile = open("data.csv", "a", newline="", encoding='utf-8')
csvWriter = csv.writer(csvFile)
# Create headers for the data
csvWriter.writerow(
    ['author id', 'created_at', 'id', 'tweet', 'bio', 'image_url'])
csvFile.close()

def append_to_csv(json_response, fileName):
    # A counter variable
    counter = 0
    # Open OR create the target CSV file
    csvFile = open(fileName, "a", newline="", encoding='utf-8')
    csvWriter = csv.writer(csvFile)
    # Loop through each tweet
    for tweet in json_response['data']:
        # We will create a variable for each since some of the keys might not exist for some tweets
        # So we will account for that
        # 1. Author ID
        author_id = tweet['author_id']
        # 2. Time created
        created_at = dateutil.parser.parse(tweet['created_at'])
        # 3. Tweet ID
        tweet_id = tweet['id']
        # 4. Tweet text
        text = tweet['text']
        # 5. description
        if('description' in tweet):
            bio = tweet['users']['description']
        else:
            bio = " "
        # 6. image url
        if ('url' in tweet):
            image_url = tweet['media']['url']
        else:
            image_url = " "
        # Assemble all data in a list
        res = [author_id, created_at, tweet_id, text, bio, image_url]
        # Append the result to the CSV file
        csvWriter.writerow(res)
        counter += 1
    # When done, close the CSV file
    csvFile.close()
    # Print the number of tweets for this iteration
    print("# of Tweets added from this response: ", counter)
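One possible approach, sketched under the assumption that the response always has the shape shown above: the tweet objects in data never carry 'description' or 'url' themselves. In the includes object, users are identified by id (matching a tweet's author_id) and media by media_key (matching attachments.media_keys), so the lookups have to go through json_response['includes']. The function name build_rows, the sample values, and the choice to join several image URLs with ';' are illustrative assumptions, not part of the original code.

```python
def build_rows(json_response):
    """Return one CSV-ready row per tweet, joined with bio and image URLs."""
    includes = json_response.get("includes", {})
    # Lookup tables: user id -> user object, media_key -> media object
    users_by_id = {user["id"]: user for user in includes.get("users", [])}
    media_by_key = {m["media_key"]: m for m in includes.get("media", [])}

    rows = []
    for tweet in json_response.get("data", []):
        # The bio lives on the author's user object, not on the tweet
        author = users_by_id.get(tweet["author_id"], {})
        bio = author.get("description", "")
        # Tweets without media have no "attachments" key at all
        media_keys = tweet.get("attachments", {}).get("media_keys", [])
        urls = [
            media_by_key[key]["url"]
            for key in media_keys
            if "url" in media_by_key.get(key, {})
        ]
        rows.append([
            tweet["author_id"], tweet["created_at"], tweet["id"],
            tweet["text"], bio, ";".join(urls),
        ])
    return rows


# Made-up sample with the same shape as the response above
sample = {
    "data": [
        {"attachments": {"media_keys": ["3_1", "3_2"]}, "author_id": "10",
         "created_at": "2021-03-30T20:05:45.000Z", "id": "100", "text": "first"},
        {"author_id": "20", "created_at": "2021-03-30T10:31:58.000Z",
         "id": "200", "text": "second"},
    ],
    "includes": {
        "media": [
            {"media_key": "3_1", "type": "photo", "url": "https://example.com/a.jpg"},
            {"media_key": "3_2", "type": "photo", "url": "https://example.com/b.jpg"},
        ],
        "users": [
            {"id": "10", "description": "bio A", "username": "a"},
            {"id": "20", "description": "bio B", "username": "b"},
        ],
    },
}

rows = build_rows(sample)
for row in rows:
    print(row)
```

Each returned row can be passed to csvWriter.writerow in place of the res list.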

Related

filter data based on a condition in json

I am working on a requirement where I need to filter JSON data into a data frame in Python when a condition is satisfied. When I use the code below, I run into the following error. The condition here is arbitrary: I check whether the country code is US, and the data frame should then be populated with all records where the country is the USA.
Code:
import json
data = {
"demographic": [
{
"id": 1,
"country": {
"code": "AU",
"name": "Australia"
},
"state": {
"name": "New South Wales"
},
"location": {
"time_zone": {
"name": "(UTC+10:00) Canberra, Melbourne, Sydney",
"standard_name": "AUS Eastern Standard Time",
"symbol": "AUS Eastern Standard Time"
}
},
"address_info": {
"address_1": "",
"address_2": "",
"city": "",
"zip_code": ""
}
},
{
"id": 2,
"country": {
"code": "AU",
"name": "Australia"
},
"state": {
"name": "New South Wales"
},
"location": {
"time_zone": {
"name": "(UTC+10:00) Canberra, Melbourne, Sydney",
"standard_name": "AUS Eastern Standard Time",
"symbol": "AUS Eastern Standard Time"
}
},
"address_info": {
"address_1": "",
"address_2": "",
"city": "",
"zip_code": ""
}
},
{
"id": 3,
"country": {
"code": "US",
"name": "United States"
},
"state": {
"name": "Illinois"
},
"location": {
"time_zone": {
"name": "(UTC-06:00) Central Time (US & Canada)",
"standard_name": "Central Standard Time",
"symbol": "Central Standard Time"
}
},
"address_info": {
"address_1": "",
"address_2": "",
"city": "",
"zip_code": "60611"
}
}
]
}
jd = json.loads(data)
df = [cnt for cnt in jd["demographic"] if cnt["country"]["code"] == "US"]
print(df)
Error:
TypeError: string indices must be integers

You don't need to parse a JSON string into a Python dict, because the data variable is already a Python dict.
Remove this line:
jd = json.loads(data)
Your code then becomes:
df = [cnt for cnt in data["demographic"] if cnt["country"]["code"] == "US"]
print(df)
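To make the distinction concrete, here is a small sketch: json.loads is only needed when the data arrives as a JSON string, while a dict can be filtered directly. The trimmed records and the helper name filter_by_country are assumptions for illustration.

```python
import json


def filter_by_country(records, code):
    """Keep only the records whose country code equals `code`."""
    return [rec for rec in records if rec["country"]["code"] == code]


# Already a Python dict: no parsing step required
data = {"demographic": [
    {"id": 1, "country": {"code": "AU", "name": "Australia"}},
    {"id": 3, "country": {"code": "US", "name": "United States"}},
]}
from_dict = filter_by_country(data["demographic"], "US")

# The same content as a JSON string: this is the case json.loads is for
json_text = json.dumps(data)
from_text = filter_by_country(json.loads(json_text)["demographic"], "US")

print(from_dict)
```

If a DataFrame is the end goal, the filtered list can then be handed to pandas.json_normalize, which flattens the nested dicts into columns.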

Nested json files - Python

Good afternoon all,
I've been reading through the various posts on reading .json files with pandas, but so far I've not been successful in extracting what I need.
I need to read a specific 'score' from the JSON file, and then iterate through all the JSON files I have, as the label is the same in each.
In the excerpt below, how would I read the 'score'? I've tried using the normalise function, but regardless of the argument I pass in I cannot get any closer.
Part of the json file:
"template_id": "template_fe61177cb0eb4642901b1eae9488fbb4",
"audit_id": "audit_1a0e9ef4a7914286808accb3dcb0700b",
"archived": false,
"created_at": "2022-10-07T08:00:14.021Z",
"modified_at": "2022-10-07T08:05:56.594Z",
"audit_data": {
"score": 10,
"total_score": 11,
"score_percentage": 90.909,
"name": "7 Oct 2022 / Test",
"duration": 240,
"authorship": {
"device_id": "user_65c3799b0f1a48549cacbceca244e1db",
"owner": "test",
"owner_id": "user_65c3799b0f1a48549cacbceca244e1db",
"author": "test",
"author_id": "user_65c3799b0f1a48549cacbceca244e1db"
},
"date_completed": "2022-10-07T08:05:55.860Z",
"date_modified": "2022-10-07T08:05:56.594Z",
"date_started": "2022-10-07T08:00:13.000Z",
"site": {
"name": "Blue Warehouse"
}
},
"template_data": {
"authorship": {
"device_id": "user_4bb896b5308341f7a7543a32f6c1f3ec",
"owner": "test",
"owner_id": "user_4bb896b5308341f7a7543a32f6c1f3ec",
"author": "test",
"author_id": "user_4bb896b5308341f7a7543a32f6c1f3ec"
},
"metadata": {
"description": "",
"name": "RCS",
"image": {
"date_created": "2022-04-12T13:27:18.852Z",
"file_ext": "png",
"label": "Go \u0026 See icon.PNG",
"media_id": "cf944a4b-7589-47e6-b42a-8d17f06b7031",
"href": "https://1"
}
},
"response_sets": {
"5b69aee5-0532-46a4-b2f5-d020d4d5381d": {
"id": "5b69aee5-0532-46a4-b2f5-d020d4d5381d",
"type": "question",
"responses": [
{
"id": "ef4abf51-3361-46f5-ba04-70c23c85ca20",
"label": "Good",
"colour": "19,133,95",
***"score": 1,***
"enable_score": true
},
Thanks for your help.
Rob.

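This can be done without pandas. Judging by the excerpt, response_sets is nested under template_data, so the path has to start there:
import json
with open("my_file.json", 'r') as f:
    my_dict = json.load(f)
# Path follows the excerpt: template_data -> response_sets -> <set id> -> responses
score = my_dict["template_data"]["response_sets"]["5b69aee5-0532-46a4-b2f5-d020d4d5381d"]["responses"][0]["score"]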
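If the response-set id is not known in advance, or differs between files, looping over every response set avoids hard-coding it. The sketch below assumes, based on the excerpt, that response_sets sits under template_data; the helper name collect_scores and the inline sample are hypothetical.

```python
def collect_scores(audit):
    """Return (response_set_id, label, score) for every scored response."""
    results = []
    response_sets = audit.get("template_data", {}).get("response_sets", {})
    for set_id, response_set in response_sets.items():
        for response in response_set.get("responses", []):
            # Some responses may carry no score; skip those
            if "score" in response:
                results.append((set_id, response.get("label"), response["score"]))
    return results


# Made-up sample mirroring the nesting of the excerpt
sample = {
    "audit_data": {"score": 10, "total_score": 11},
    "template_data": {
        "response_sets": {
            "set-1": {
                "id": "set-1",
                "type": "question",
                "responses": [
                    {"id": "r1", "label": "Good", "score": 1, "enable_score": True},
                    {"id": "r2", "label": "N/A", "enable_score": False},
                ],
            }
        }
    },
}

scores = collect_scores(sample)
# The overall audit score sits one level down, under audit_data
overall = sample["audit_data"]["score"]
print(scores, overall)
```

To process many files, the result of json.load for each file can be passed to collect_scores in turn.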

Extract urls from elements in a list

I have a list json_response containing Twitter data, including image URLs. I am trying to extract the URL from the ['includes']['media'] object. However, most elements in the list do not have ['media'], which I believe causes the loop to fail. Running the code, I get KeyError: 'media', even though I expected that setting row['image_url'] = None in the loop would account for list elements without ['media'].
I have provided a sample of the json_response. However, the actual URLs have been replaced due to Stack Overflow's restrictions on posting URLs.
print(json.dumps(json_response[10:13], indent=4, sort_keys=True)) # look at json_response object.
[
{
"data": [
{
"author_id": "125700232",
"created_at": "2021-12-31T07:13:04.000Z",
"id": "1476813641265549317",
"text": "You can\u2019t be a democrat or a liberal or progressive & besties with racists who radicalize people like this. \n\nI\u2019ve never publicly named him but since he blocked me years ago for holding him accountable, maybe I will."
},
{
"author_id": "800464894361382912",
"created_at": "2021-12-27T12:17:25.000Z",
"id": "1475440681258737673",
"text": "For $9 an hour, I was told to kill myself over a confusing sale sign, I'd been called worthless and stupid weekly. I've had things thrown at me, been spat on. A customer blocked me from coming on a bus so I couldn't go home. If people were kind to begin with, more would \"show up\""
},
{
"attachments": {
"media_keys": [
"3_1474448924249407490"
]
},
"author_id": "1390363055150845959",
"created_at": "2021-12-24T18:36:32.000Z",
"id": "1474448926891782149",
"text": "Blocked by China boy. Spy banging snowflake #RepSwalwell"
},
{
"author_id": "196428643",
"created_at": "2021-12-21T22:22:15.000Z",
"id": "1473418564430229505",
"text": "I replied to an Eric Swalwell lame tweet with a Fang Fang reference yesterday and he blocked me. Then suddenly my account was hacked and my account linked email was changed from a Manhattan ISP. I don't think it was a coincidence."
},
{
"attachments": {
"media_keys": [
"3_1462187451292819458",
"3_1462187494385065994"
]
},
"author_id": "25871358",
"created_at": "2021-11-20T22:40:05.000Z",
"id": "1462189029919805450",
"text": "Pearl clutch elsewhere about #RepSwalwell unfollowing you, when I told you you were a lying gaslighting jackwagon and you blocked me. Truth hurts."
},
{
"author_id": "1251510910390337536",
"created_at": "2021-10-30T01:40:32.000Z",
"id": "1454261909759406086",
"text": "Eric Swalwell blocked me tonight \ud83d\ude02"
},
{
"author_id": "15790644",
"created_at": "2021-07-23T20:11:58.000Z",
"id": "1418665211221925889",
"text": "Twitter won't allow me to follow anyone.\n\nAlso, tried to retweet Eric Swalwell's tweet and it blocked me. And other tweets...\n\nGuess I'm like a mosquito buzzing around the head of Jack."
},
{
"attachments": {
"media_keys": [
"3_1411309517317586945"
]
},
"author_id": "107575508",
"created_at": "2021-07-03T13:03:05.000Z",
"id": "1411309521251745796",
"text": "This tweet was blocked by Twitter for retweets and quotes. In summary a team member of Eric Swalwell illegally entered Mo Brooks home to serve papers and assaulted Brooks wife. There is security camera footage. Papers being serve claim Brooks caused Jan. 6 \u2018riot\u2019."
},
{
"author_id": "26182604",
"created_at": "2021-06-07T03:58:01.000Z",
"id": "1401750267121524738",
"text": "I can't # him because my words hurt his feeling and he blocked me. LOL! CNN : Democratic Rep. Eric Swalwell's suit seeks to hold Brooks, ex-President Trump and others liable for the January 6 attack."
},
{
"author_id": "258617217",
"created_at": "2021-04-06T20:37:13.000Z",
"id": "1379533675772186630",
"text": "George Webb Blocked me.\nI guess because I pointed out that in one of his books he connected a lady FANG FANG from Wuhan as the same Fang Fang CCP agent that tried to seduce Eric Swalwell, \n\n2 different ladies.\nThat wasn't nice Mr. Webb"
}
],
"includes": {
"media": [
{
"media_key": "3_1474448924249407490",
"type": "photo",
"url": "url here"
},
{
"media_key": "3_1462187451292819458",
"type": "photo",
"url": "url here"
},
{
"media_key": "3_1462187494385065994",
"type": "photo",
"url": "url here"
},
{
"media_key": "3_1411309517317586945",
"type": "photo",
"url": "url here"
}
],
"users": [
{
"created_at": "2010-03-23T15:50:53.000Z",
"description": "founder of Melanated Mingle|licensed psychotherapist|psych prof|Latin\u00e8 Ph.D|chingona",
"id": "125700232",
"name": "Dr. Lisa Xochitl Vallejos, Ph.D., LPC",
"username": "realdocv"
},
{
"created_at": "2016-11-20T22:24:37.000Z",
"description": "31, she/her",
"id": "800464894361382912",
"name": "Fluke \ud83d\udc99",
"username": "flukefancy"
},
{
"created_at": "2021-05-06T17:49:30.000Z",
"description": "",
"id": "1390363055150845959",
"name": "kbark",
"username": "kbark23500486"
},
{
"created_at": "2010-09-29T02:29:17.000Z",
"description": "",
"id": "196428643",
"name": "Bird Dog \u00d3 S\u00failleabh\u00e1in",
"username": "AntiqueSully"
},
{
"created_at": "2009-03-22T20:12:14.000Z",
"description": "I don't know what a Hoosier is, either.",
"id": "25871358",
"name": "Misty",
"username": "mialynneb"
},
{
"created_at": "2020-04-18T14:00:37.000Z",
"description": "#Rachlsbored #KittyKattKeee #Rebeccahansonn #BasedHabits #bxtchbabyy #Sexxcel #lizhomlesvoice #psychoness_xo #VOLTRON4444 #jessiprincey #MartinaMarkota: CEO",
"id": "1251510910390337536",
"name": "\u1587\u15e9\u15ea\u15ea\u01b3\u26a1\ufe0f\u26a1\ufe0f",
"username": "_RadicalReality"
},
{
"created_at": "2008-08-09T17:27:58.000Z",
"description": "Really. There are conservatives in New York! #MAGA",
"id": "15790644",
"name": "SueDinNY",
"username": "SueDinNY"
},
{
"created_at": "2010-01-23T01:24:09.000Z",
"description": "Army Vet, served in M.I. Unit. If you disagree with me, it\u2019s because you haven\u2019t seen what I\u2019ve seen. Proud Supporter of #realDonaldTrump #maga",
"id": "107575508",
"name": "Margaret Briem",
"username": "LivethLifeULove"
},
{
"created_at": "2009-03-24T05:09:49.000Z",
"description": "Dad. IT guy. Linux geek. TTRPG fan. Critter. Drone Pilot. Sometimes I go outside. (he/him)",
"id": "26182604",
"name": "Wayne Edgar",
"username": "zerovertex"
},
{
"created_at": "2011-02-28T03:06:07.000Z",
"description": "Author-Film Maker-Researcher-Artist-Peace Seeker\n",
"id": "258617217",
"name": "F\u04e8\u042fBIDD\u03a3\u041f FI\u1102\u03a3\u01a7 \u01acV",
"username": "TMV_intel"
}
]
},
"meta": {
"newest_id": "1476813641265549317",
"next_token": "b26v89c19zqg8o3fosqt4kos8ff8dfq3on3e08qcqvngd",
"oldest_id": "1379533675772186630",
"result_count": 10
}
},
{
"data": [
{
"attachments": {
"media_keys": [
"3_1311261760222101505"
]
},
"author_id": "395236271",
"created_at": "2020-09-30T11:10:19.000Z",
"id": "1311262093992132610",
"text": "Steve Knight didn't like it when I pointed out that his \"Trump didn't say nazis are fine people\" run against all visual evidence we have on the Unite the Right rally, and so the \"Free Speech Champion\" blocked me.\n\nFucking snowflake."
}
],
"includes": {
"media": [
{
"media_key": "3_1311261760222101505",
"type": "photo",
"url": "url here"
}
],
"users": [
{
"created_at": "2011-10-21T10:49:03.000Z",
"description": "Harmless but a bit insane.",
"id": "395236271",
"name": "Lu\u00eds Dias",
"username": "lmldias"
}
]
},
"meta": {
"newest_id": "1311262093992132610",
"next_token": "b26v89c19zqg8o3fn0mljncu0v5ci7xlbm3agsunyikxp",
"oldest_id": "1311262093992132610",
"result_count": 1
}
},
{
"data": [
{
"attachments": {
"media_keys": [
"3_1471578541368221703"
]
},
"author_id": "1442527297773326344",
"created_at": "2021-12-16T20:30:40.000Z",
"id": "1471578543385677830",
"text": "Hahaha #NancyPelosi #SpeakerPelosi staff has blocked me from tweeting to them! Why are they so afraid of the truth?"
},
{
"attachments": {
"media_keys": [
"3_1469091211038404613"
]
},
"author_id": "864826449601019905",
"created_at": "2021-12-09T23:48:02.000Z",
"id": "1469091500264935424",
"text": "I'm blocked by Elizabeth Warren, Nancy Pelosi and now Karlyn. Interesting pattern. ;)"
},
{
"attachments": {
"media_keys": [
"3_1465403503354990595"
]
},
"author_id": "1083551928821448710",
"created_at": "2021-11-29T19:33:16.000Z",
"id": "1465403505045393418",
"text": "I just realized that I've been blocked by Nancy Pelosi's daughter \ud83d\ude02"
},
{
"attachments": {
"media_keys": [
"3_1462066354568273930"
]
},
"author_id": "844569319409405958",
"created_at": "2021-11-20T14:32:40.000Z",
"id": "1462066368921100293",
"text": "Tried to tag Drunk Nancy Pelosi. She Blocked me. Or shall I say her assistant blocked me. LMAO. They don\u2019t want the truth out. I don\u2019t care who she is. I front her out."
},
{
"author_id": "3921070047",
"created_at": "2021-11-03T04:26:18.000Z",
"id": "1455753176322347009",
"text": "\"Nancy Pelosi is not going to change your lifestyle, I can, but you've blocked me and hald of mules...\""
},
{
"author_id": "345120618",
"created_at": "2021-10-03T02:04:28.000Z",
"id": "1444483458164670467",
"text": "Blocked by Nancy Pelosi? I'm jealous."
},
{
"author_id": "1227381497277095937",
"created_at": "2021-10-03T00:49:28.000Z",
"id": "1444464586506350595",
"text": "Gosh. I can only dream of being blocked by a trash receptacle like Nancy Pelosi. What a badge of honor \ud83c\udf96 it would be. I'll just have to keep trying.\ud83d\ude0e\ud83c\uddfa\ud83c\uddf8"
},
{
"attachments": {
"media_keys": [
"3_1444377454018138115"
]
},
"author_id": "918169011602386944",
"created_at": "2021-10-02T19:03:40.000Z",
"id": "1444377560385658880",
"text": "Anybody else blocked by Nancy Pelosi? \n\nI thought it was illegal for government people to block us?"
},
{
"author_id": "783746267222462464",
"created_at": "2021-08-09T21:23:09.000Z",
"id": "1424843718042001411",
"text": "\" This page has been blocked by Microsoft Edge\"\n--\nSidney Powell Discusses the FBI &amp; Nancy Pelosi\u2019s Role In The January 6th FALSE FLAG"
},
{
"author_id": "158064102",
"created_at": "2021-08-08T12:52:20.000Z",
"id": "1424352780777508864",
"text": "Nancy Pelosi's daughter blocked me?? sweet old little me!!"
},
{
"author_id": "9484732",
"created_at": "2021-08-02T18:30:09.000Z",
"id": "1422263466396733441",
"text": "Democratic leadership didn't have the votes for an extension of the eviction moratorium and were blocked by Republicans from attempting to get around their internal divisions by passing a shorter-term extension through Oct. 18. via #siobhanehughes"
},
{
"author_id": "1381073800624660484",
"created_at": "2021-07-31T22:47:32.000Z",
"id": "1421603462643535873",
"text": "Joe Biden\n> is spending a lot on defense that could be used to create a debt free design\n> is hiding behind Nancy Pelosi and other women in his life \n> can cancel student debt\n> if he's being blocked by the DoD then he actually can't do it"
},
{
"author_id": "1278119139601715201",
"created_at": "2021-07-27T18:41:31.000Z",
"id": "1420091998590099458",
"text": "What?? I got blocked by her because I said victim blaming about Elise stefanik blaming Nancy pelosi for Jan 6th. I went to answer her and I\u2019m blocked. Ppl are seriously reactionary. Geez!"
},
{
"author_id": "1367196589291364357",
"created_at": "2021-07-16T13:58:01.000Z",
"id": "1416034388400353281",
"text": "They were blocked by Nancy Pelosi"
},
{
"author_id": "3001635726",
"created_at": "2021-07-09T04:06:48.000Z",
"id": "1413348887855828997",
"text": "Blocked by Nancy Pelosi who then staged her laptop to be stolen"
},
{
"author_id": "19845473",
"created_at": "2021-07-02T03:54:13.000Z",
"id": "1410809005258256393",
"text": "Fox News #ChadPergram blocked me. Don't worry he didn't fail to ask Nancy Pelosi about 49ers. News."
},
{
"author_id": "979513121541967873",
"created_at": "2021-06-01T01:37:11.000Z",
"id": "1399540499552378881",
"text": "Unarmed Ashli Babbitt... Behind doors that were blocked by furniture.... what threat did she pose\u2049\ufe0f\nZero.... Zero... Zero Threat\u203c\ufe0f A scared, slimy POS backed by Nancy Pelosi took her life & has been protected\u203c\ufe0f"
},
{
"author_id": "1394830598087249924",
"created_at": "2021-05-31T17:56:40.000Z",
"id": "1399424603605323783",
"text": "Nancy Pelosi blocked me. Badge of honor"
},
{
"author_id": "969989169186557953",
"created_at": "2021-05-21T11:15:17.000Z",
"id": "1395699716659286018",
"text": "Nancy Pelosi\u2019s daughter blocked me on Twitter"
},
{
"attachments": {
"media_keys": [
"3_1385809440830484481"
]
},
"author_id": "803311702032850944",
"created_at": "2021-04-24T04:14:52.000Z",
"id": "1385809443015831557",
"text": "Just realized that big #danrodimer blocked me. How can he stand up to Nancy pelosi when he can't even stand up to me posting his old campaign video? #txlege #TXpolitics "
},
{
"author_id": "1213210549732855808",
"created_at": "2021-04-11T08:26:15.000Z",
"id": "1381161660593758212",
"text": "thinking about the fact that on my old account I was blocked by Nancy Pelosi's daughter"
},
{
"author_id": "2836412739",
"created_at": "2021-03-29T21:30:42.000Z",
"id": "1376648033430044679",
"text": "Lol, corrupt scumbag Nancy Pelosi blocked me. #Corruption She doesn\u2019t want her sleepy followers to see the truth."
},
{
"attachments": {
"media_keys": [
"3_1373636857561513987"
]
},
"author_id": "969989169186557953",
"created_at": "2021-03-21T14:05:23.000Z",
"id": "1373636860635983873",
"text": "Nancy Pelosi\u2019s daughter blocked me too. I honestly need to make a Hall of Fame for those who have blocked me"
},
{
"author_id": "1169798579768180736",
"created_at": "2021-03-18T00:50:21.000Z",
"id": "1372349621033443329",
"text": "FellowAMERICANS #BlackLivesMatter #NAACP_LDF #African #Muslim We #UMMAABroadcasting BLOCKED_by #Facebook #Gmail We_DEMAND #HumanRights of Work_Class(80% #USA One_BillionAfrican #Blacks 2.5Billion #Muslims )& #JoeBiden #KamalaHarris #NancyPelosi #POTUS #VP #SpeakerPelosi MUST_ACT"
}
],
"includes": {
"media": [
{
"media_key": "3_1471578541368221703",
"type": "photo",
"url": "url here"
},
{
"media_key": "3_1469091211038404613",
"type": "photo",
"url": "url here"
},
{
"media_key": "3_1465403503354990595",
"type": "photo",
"url": "url here"
},
{
"media_key": "3_1462066354568273930",
"type": "photo",
"url": "url here"
},
{
"media_key": "3_1444377454018138115",
"type": "photo",
"url": "url here"
},
{
"media_key": "3_1385809440830484481",
"type": "photo",
"url": "url here"
},
{
"media_key": "3_1373636857561513987",
"type": "photo",
"url": "url here"
}
],
"users": [
{
"created_at": "2021-09-27T16:31:37.000Z",
"description": "Don't Tread On Me! Trump 2024. Patriot, Anti-Socialist, Pro-1st & 2nd Amendment. Pro-FREEDOM. I AM MAGA! #IamMAGA\nMelting Snowflake Brains with my Salty Tweets!",
"id": "1442527297773326344",
"name": "Patriot USA \ud83c\uddfa\ud83c\uddf8",
"username": "I_am_MAGA_USA"
},
{
"created_at": "2017-05-17T12:54:28.000Z",
"description": "#LetsGoBrandon #FJB #WitchesForTrump #MagicalPersistence #LibertariansForTrump #PeaceLoveLiberty #PatriotPaganPride #Cult45",
"id": "864826449601019905",
"name": "The\u26e4Tower\u26e4Falls",
"username": "Gwenhwyfar7Aine"
},
{
"created_at": "2019-01-11T02:31:26.000Z",
"description": "GETTR - #TheJohnD \n\nGAB - #John_Deplorable",
"id": "1083551928821448710",
"name": "John D \u2022",
"username": "RedWingGrips"
},
{
"created_at": "2015-10-10T19:42:38.000Z",
"description": "\u3134\u3147\u3139 \u3142\u3137\u3145\u314c\u3134\u314c\u3139",
"id": "3921070047",
"name": "\u2728\ud83e\udd88\u2622\ud83d\udd1e",
"username": "Dystar924"
},
{
"created_at": "2011-07-30T02:54:55.000Z",
"description": "Living the good life in sunny Scottsdale, Arizona.",
"id": "345120618",
"name": "Bill Deegan",
"username": "RealBillDeegan"
},
{
"created_at": "2017-10-11T17:38:45.000Z",
"description": "Don't most of us rely on a single strand for happiness?\n\nAfter being a Single Mom & CFO, I was ready to LIVE!\n\nA drunk driver stole that \ud83d\udcaf",
"id": "918169011602386944",
"name": "Caren R \ud83c\uddfa\ud83c\uddf8\ud83c\uddee\ud83c\uddf1\ud83c\uddec\ud83c\udde7",
"username": "BritishCaren"
},
{
"created_at": "2015-01-15T17:32:12.000Z",
"description": "I did stuff in special education. I\u2019ll always defend the public schools. Progressive feminist and registered Democrat since before you were born.",
"id": "2984412230",
"name": "Kay D\u2019Antonio",
"username": "KayDA26"
},
{
"created_at": "2016-10-05T19:10:46.000Z",
"description": "GETTR handle: Murt32_1943 #murt32\n\nForever America First. Always MAGA\n\nAdjectives: Brilliant/Gorgeous \n\nSupports the LGBFJB community\n\nSorry, I don't do DM's.",
"id": "783746267222462464",
"name": "murt32\ud83c\uddfa\ud83c\uddf8 \ud83c\udf40",
"username": "murt32_1943"
},
{
"created_at": "2014-08-04T22:58:39.000Z",
"description": "Finance |Filmmaker\ud83c\udfac| 2A Advocate\ud83e\uddf9| Content Creator \ud83c\udf9e|Political Commentary\ud83d\udce1| Senior Director \u270f|Engineer | Humility is a journey we must all take.",
"id": "2733732880",
"name": "Somebody's Uncle",
"username": "Dariusr0berts"
},
{
"created_at": "2020-01-03T21:28:46.000Z",
"description": "18,Will buy Origami Angel Merch DM me!!!! (He/Him) Private // #SadHammyFan",
"id": "1213210549732855808",
"name": "Mess",
"username": "punk_matthew"
},
{
"created_at": "2014-10-18T19:41:25.000Z",
"description": "Musician, composer, luthier, digital warrior, Patriot, #MAGA\ud83c\uddfa\ud83c\uddf8\ud83c\uddfa\ud83c\uddf8\ud83c\uddfa\ud83c\uddf8 Q, #Trump 2020!, Save the children from the Peds!",
"id": "2836412739",
"name": "Truth Hurts",
"username": "TruthHurtu2"
},
{
"created_at": "2019-09-06T02:26:30.000Z",
"description": "JOURNALIST in MEMPHIS; Our WatsApp & 2Facebooks BLOCKED, by ENEMIES of our US Constitution.",
"id": "1169798579768180736",
"name": "Arshad Khan, UMMAA Broadcasting, Rolla, MO, USA",
"username": "arshad_usa"
}
]
},
"meta": {
"newest_id": "1471578543385677830",
"next_token": "b26v89c19zqg8o3fosqrfh7sqsqc9rs7aukssfoknvuyl",
"oldest_id": "1372349621033443329",
"result_count": 36
}
}
]
Code that should retrieve the URLs from ['includes']['media']:
for each_dict in json_lite:
    row = {} # empty dict for data
    # 3. loop for user object
    row['image_url'] = None # assuming user has no image url
    for user in each_dict['includes']['media']:
        # 5. user url
        # check for url of the current user only
        if 'url' in user['url']:
            row['image_url'] = user.get('url') # if user has url
            break # break the loop, as url is found
    url_df = url_df.append(row, ignore_index=True) # append data to empty url_df

Not quite the way you asked for, but you might consider just using regex:
import re
urls = re.findall('"url": "([^"]*)"', json.dumps(data))
Output:
['url here',
'url here',
'url here',
'url here',
'url here',
'url here',
'url here',
'url here',
'url here',
'url here',
'url here',
'url here']

The error message KeyError: 'media' indicates that you should check whether each_dict['includes'] contains a 'media' key. You could also use the dict's get method to skip the elements that lack a 'media' key. Try replacing
for user in each_dict['includes']['media']:
with
for user in each_dict['includes'].get('media',[]):
which should prevent your error.
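
Putting that suggestion together, a minimal sketch of the lookup with the .get fallback might look like the following. The sample list is made-up data shaped like the response in the question, and first_media_url is a hypothetical helper name, not part of Tweepy. It also guards against a missing 'includes' key and against media entries without a 'url' field, which is an assumption about what the data may contain.

```python
def first_media_url(response):
    """Return the first media URL in one response dict, or None if there is none."""
    # .get() with a default avoids KeyError when 'includes' or 'media' is absent.
    for media in response.get('includes', {}).get('media', []):
        url = media.get('url')  # some media entries may have no 'url' key
        if url:
            return url
    return None


# Made-up sample: one page with media, one with only users, one with no includes.
sample = [
    {'includes': {'media': [{'media_key': '3_1', 'type': 'photo',
                             'url': 'https://example.com/a.jpg'}]}},
    {'includes': {'users': [{'id': '1', 'description': 'no media here'}]}},
    {'meta': {'result_count': 0}},
]

rows = [{'image_url': first_media_url(page)} for page in sample]
print(rows)
```

The resulting list of dicts can be passed to pandas.DataFrame(rows) in one step instead of appending row by row.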

Remove list element based on part of a string

I have a long list json_response containing Twitter data. Some of the 293 elements in the list do not contain any tweets, as indicated by 'result_count': 0, and I want to delete those elements from json_response.
The following should remove all elements containing 'result_count': 0. However, nothing happens when the code is executed:
json_response = [element for element in json_response if element != "'result_count': 0"]
A sample of json_response, where only the second of the four elements contains tweets:
print(json.dumps(json_response[0:4], indent=4, sort_keys=True))
[
{
"meta": {
"next_token": "b26v89c19zqg8o3fo77fw18ex7m9tkxtn5jx8qokz8y2l",
"result_count": 0
}
},
{
"data": [
{
"author_id": "751651375407181824",
"created_at": "2019-12-16T02:10:22.000Z",
"id": "1206396117425852417",
"text": "Tarkanian libel lawsuit against Jacky Rosen, 2016 opponent, blocked by Nevada Supreme Court"
},
{
"author_id": "7568942",
"created_at": "2019-12-15T04:41:00.000Z",
"id": "1206071638166507520",
"text": "Tarkanian libel lawsuit against Jacky Rosen, 2016 opponent, blocked by Nevada Supreme Court Dismissed thanks to NV's anti-SLAPP law"
},
{
"author_id": "2404787642",
"created_at": "2019-12-13T18:40:32.000Z",
"id": "1205558134317568000",
"text": "Tarkanian libel lawsuit against Jacky Rosen, 2016 opponent, blocked by Nevada Supreme Court"
},
{
"author_id": "245630545",
"created_at": "2019-12-13T18:06:29.000Z",
"id": "1205549565513883648",
"text": "Attacks lobbed in the heat of a campaign don't end with the campaign, Part 2: Supreme Court puts an end to Danny Tarkanian's libel lawsuit against Jacky Rosen for ads from a 2016 congressional campaign, also via #RileySnyder:"
},
{
"author_id": "56440142",
"created_at": "2019-12-12T22:26:06.000Z",
"id": "1205252514070839296",
"text": ".#DannyTarkanian libel lawsuit against #SenJackyRosen, 2016 opponent, blocked by Nevada Supreme Court via #RileySnyder\u200b"
},
{
"author_id": "794407888567476224",
"created_at": "2019-12-12T22:08:08.000Z",
"id": "1205247991029755905",
"text": "Tarkanian libel lawsuit against Jacky Rosen, 2016 opponent, blocked by Supreme Court\nVia #RileySnyder\n"
}
],
"includes": {
"users": [
{
"created_at": "2016-07-09T05:37:07.000Z",
"description": "Towanda! from Fried Green Tomatoes",
"id": "751651375407181824",
"name": "Karen Gruber",
"username": "mail4ufromme1"
},
{
"created_at": "2007-07-18T20:09:04.000Z",
"description": "Full-time software engineering manager, part-time educator, constant student, backpacker and disliker of the Oxford comma.",
"id": "7568942",
"name": "Justin Yost",
"username": "justinyost"
},
{
"created_at": "2014-03-22T17:05:36.000Z",
"description": "",
"id": "2404787642",
"name": "James Egan",
"username": "JamesEganLaw"
},
{
"created_at": "2011-02-01T03:39:40.000Z",
"description": "Assistant editor and reporter #TheNVIndy covering statehouse elections and more. Co-host of #nvindyespanol's Cafecito. Email me: michelle#thenvindy.com",
"id": "245630545",
"name": "Michelle Rindels",
"username": "MichelleRindels"
},
{
"created_at": "2009-07-13T17:49:46.000Z",
"description": "Curious about Congress and the beautiful game. Following the Nevada delegation for #TheNVIndy",
"id": "56440142",
"name": "Humberto Sanchez",
"username": "hsanchez128"
},
{
"created_at": "2016-11-04T05:16:14.000Z",
"description": "Nonprofit news outlet reporting on Nevada politics, policy and people since 2017 | Your State. Your News. Your Voice. | ideas#thenvindy.com",
"id": "794407888567476224",
"name": "Nevada Independent",
"username": "TheNVIndy"
}
]
},
"meta": {
"newest_id": "1206396117425852417",
"next_token": "b26v89c19zqg8o3fn0po9zgvw98j7w7sec5wgoh0s0rr1",
"oldest_id": "1205247991029755905",
"result_count": 6
}
},
{
"meta": {
"next_token": "b26v89c19zqg8o3fosns35qj7v5486697crmsdhl6kku5",
"result_count": 0
}
},
{
"meta": {
"next_token": "b26v89c19zqg8o3fo77h5ma6xw9tghoz8z8l6hgq0shod",
"result_count": 0
}
}
]

Since your input is ultimately just a list of dictionaries with <key, dictionary> pairs, this should do it:
json_response = [element for element in json_response
                 if element['meta']['result_count'] > 0]
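
The original attempt changes nothing because it compares each element (a dict) with a string, and a dict is never equal to a string, so every element passes the filter. A small sketch with made-up data shows the difference. The .get calls are an optional safeguard, assuming some elements might lack a 'meta' key:

```python
json_response = [
    {'meta': {'next_token': 'a', 'result_count': 0}},
    {'data': [{'id': '1', 'text': 'hello'}], 'meta': {'result_count': 1}},
    {'meta': {'next_token': 'b', 'result_count': 0}},
]

# A dict never equals a string, so this keeps every element.
unchanged = [e for e in json_response if e != "'result_count': 0"]

# Look up the nested value; .get() treats a missing key as zero results.
non_empty = [e for e in json_response
             if e.get('meta', {}).get('result_count', 0) > 0]

print(len(unchanged), len(non_empty))
```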

How to fetch next page from REST API responses

EDIT - Here is the 1st object/record in the response. I get a total of 20 records, including this:
Note: The response has been edited to remove the unicode prefix (u') and to replace ' with " so that it is valid JSON.
{
"jobs": {
"_total": 1811,
"_count": 20,
"_start": 0,
"values": [{
"siteJobUrl": "xxx",
"company": {
"id": 21836,
"name": "CyberCoders"
},
"postingDate": {
"year": 2013,
"day": 10,
"month": 6
},
"descriptionSnippet": "Software Engineer- Hadoop, HDFS, HBaseWe are a well known consumer product development company and we are looking to add a Hadoop Engineer to our Engineering team. You will be working with the latest and greatest technologies to design, develop, and implement connectivity products that allow efficient exchange of data between our core database engine and the Hadoop ecosystem.What you need for thi",
"expirationDate": {
"year": 2013,
"day": 10,
"month": 7
},
"position": {
"industries": {
"_total": 1,
"values": [{
"code": "4",
"id": 4,
"name": "Computer Software"
}]
},
"title": "Software Engineer- Hadoop, HDFS, HBase",
"experienceLevel": {
"code": "2",
"name": "Entry level"
},
"location": {
"country": {
"code": "us"
},
"name": "Greater Pittsburgh Area"
},
"jobFunctions": {
"_total": 1,
"values": [{
"code": "it",
"name": "Information Technology"
}]
},
"jobType": {
"code": "F",
"name": "Full-time"
}
},
"customerJobCode": "CCW-ssehadooppaccw",
"locationDescription": "Pittsburgh, PA",
"jobPoster": {
"headline": "Senior Executive Technical Recruiter at CyberCoders (949)885-5121 chelsea.whalen#cybercoders.com",
"lastName": "W.",
"id": "y2zfe5j76F",
"firstName": "Chelsea"
},
"id": 6007298
},
I am trying to use the Python library below to make a job search API call to the LinkedIn REST API.
https://github.com/ozgur/python-linkedin/
I can access the first page of output just fine. But when I increment start to point to the next page, I still get the same response. What am I missing here?
Here is my code snippet:
authentication = LinkedInAuthentication(API_KEY, API_SECRET, RETURN_URL,
PERMISSIONS.enums.values())
.......
application = LinkedInApplication(authentication)
........
# This is my request
response = application.search_job(
selectors=[{'jobs':
['id',
'customer-job-code',
'posting-date','expiration-date',
{'company':['id','name']},
{'position':['title',
'location',
'job-functions.',
'industries',
'job-type',
'experience-level']},
'skills-and-experience',
'description-snippet',
'salary',
{'job-poster':['id',
'first-name',
'last-name',
'headline']},
'referral-bonus',
'site-job-url',
'location-description']}],
params={'keywords': 'hadoop','count': 20},
headers={'start':20})
I also posted this question on the LinkedIn developer forum - http://developer.linkedin.com/forum/new-python-client-oauth-20
but got no response. I am new to Python, which makes this even more difficult for me.
I appreciate your help.
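
The _start, _count, and _total fields in the response suggest offset-based paging. Below is a minimal sketch of such a loop, assuming that start has to be sent as a query parameter (alongside count in params) rather than in headers. That assumption has not been verified against the API. fetch_page is a hypothetical stand-in for the real application.search_job call, and it returns fake data only so the loop can run offline.

```python
def fetch_page(start, count, total=45):
    """Hypothetical stand-in for the API call; returns one page of fake jobs."""
    ids = list(range(start, min(start + count, total)))
    return {'jobs': {'_total': total, '_start': start, '_count': len(ids),
                     'values': [{'id': i} for i in ids]}}


def fetch_all(count=20):
    """Collect every record by advancing 'start' until '_total' is reached."""
    results, start = [], 0
    while True:
        jobs = fetch_page(start, count)['jobs']
        values = jobs.get('values', [])
        results.extend(values)
        start += len(values)
        # Stop on an empty page or once all records have been seen.
        if not values or start >= jobs['_total']:
            return results


all_jobs = fetch_all()
print(len(all_jobs))
```

With the real client, the assumed equivalent would be to call search_job with params={'keywords': 'hadoop', 'count': 20, 'start': start} inside the loop and leave headers unset.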
