How to get the ID of the parent comment (facebook Graph API)? - python

When sending a request
https://graph.facebook.com/v2.1/123456898765432/comments?access_token=TOKEN&pretty=1&filter=stream&limit=1&summary=1
I get an answer
{
"data": [
{
"created_time": "2015-06-17T10:32:04+0000",
"from": {
"name": "First Name",
"id": "12345678987654"
},
"message": "Message",
"can_remove": true,
"like_count": 0,
"user_likes": false,
"id": "123456898765432_123456789765433"
}
],
"paging": {
"cursors": {
"before": "...",
"after": "..."
},
},
"summary": {
"order": "chronological",
"total_count": 2532
}
}
But if the comment of the second level, I do not know the ID of the parent comment, and I can not answer it programmatically.
Maybe there are some arguments that can be specified, and additional data comment?
I found that there is still an argument metadata = 1
But it shows additional information counter on the object, and there is also no parent ID

I just had this problem and it seems that you can get the parent comment.
request
111791572258237_803009099803144?fields=comments.filter(stream).limit(50){message,id,from,parent}
(the ACCESS_TOKEN was omitted)
response
{
"message": "Sim, está acontecendo com várias pessoas. A Valve vai arrumar logo, provavelmente",
"id": "803009099803144_803075496463171",
"from": {
"name": "Jonathan Gouvea",
"id": "1218897258138073"
},
"parent": {
"created_time": "2016-01-28T19:58:39+0000",
"from": {
"name": "César Rodryguês",
"id": "552640601571460"
},
"message": "Dota ta fechando o de vcs quando vai entra na partida ?",
"id": "803009099803144_803068649797189"
}
},

Related

How to extract a a specific object from JSON file using python json

Using Python I need to extract all the API (just the API endpoint names) endpoints by reading from a JSON file.
Below is the sample JSON code,
{
"swagger": "2.0",
"info": {
"title": "None",
"description": "some thing over here sample content\n",
"version": "0.1"
},
"produces": [
"application/json"
],
"basePath": "/busrouting",
"schemes": [
"https"
],
"definitions": {
"OAuth2": {
"description": "some thing over here sample content\n",
"type": "coauthor",
"flow": "implicit",
"authorization": "NONE",
"scopes": {
"Zero": "three",
"One": "two"
}
}
},
"paths": {
"/service-provider/details": {
"get": {
"description": "some thing over here sample content",
"security": [
{
"OAuth2": [
"None",
"None"
]
}
],
"parameters": [
{
"name": "service provider",
"in": "query",
"description": "List of service-provider ID's",
"required": false,
"type": "array",
"items": {
"type": "string"
}
},
{
"name": "limit",
"in": "query",
"description": "Record limit. Default is 20",
"required": false,
"type": "integer"
},
{
"name": "offset",
"in": "query",
"description": "Record offset. Default is 0.",
"required": false,
"type": "integer"
}
],
"tags": [
"Service Providers"
],
"responses": {
"200": {
"description": "OK.",
"headers": {
"Link": {
"description": "some thing over here sample content\n",
"type": "array",
"items": {
"type": "string"
}
}
},
"schema": {
"type": "object",
"properties": {
"names": {
"type": "array",
"items": {
"$ref": "#/definitions/service-provider"
}
}
} #Sample add
}
},
"400": {
"description": "Bad Request"
},
"401": {
"description": "Unauthorized"
},
"403": {
"description": "Forbidden"
},
"404": {
"description": "Zero"
},
"500": {
"description": "Internal Server Error",
"schema": {
"$ref": "#/definitions/Error"
}
}
}
}
},
"/Bus/{bus-id}/names": {
"get": {
"description": "some thing over here sample content\n",
"security": [
{
"OAuth2": [
"None",
"None"
]
}
],
"parameters": [
{
"name": "Bus-id",
"in": "path",
"required": true,
"type": "string"
}
],
"tags": [
"BusNames"
],
"responses": {
"200": {
"description": "OK.",
"schema": {
"$ref": "#/definitions/busnames"
}
},
"400": {
"description": "Bad Request"
},
"401": {
"description": "Unauthorized"
},
"403": {
"description": "Forbidden"
},
"404": {
"description": "Balances not found"
},
"500": {
"description": "Internal Server Error",
"schema": {
"$ref": "#/definitions/Error"
}
}
}
}
},
"/busoperator/{bus-id}/names/routes": {
"get": {
"description": "some thing over here sample content.\n",
"security": [
{
"OAuth2": [
"None",
"None"
]
}
],
"parameters": [
{
"name": "bus-id",
"in": "path",
"required": true,
"type": "string"
},
{
"name": "limit",
"in": "query",
"description": "Record limit. Default is 10",
"required": false,
"type": "integer"
},
{
"name": "offset",
"in": "query",
"description": "Record offset. Default is 0.",
"required": false,
"type": "integer"
}
],
"tags": [
"bus route mapping"
],
"responses": {
"200": {
"description": "Recent Route",
"headers": {
"Link": {
"description": "some thing over here sample content.\n",
"type": "array",
"items": {
"type": "string"
}
}
},
"schema": {
"type": "object",
"properties": {
"Route details": {
"type": "array",
"items": {
"$ref": "#/definitions/Route"
}
}
}
}
},
"400": {
"description": "Bad Request"
},
"401": {
"description": "Unauthorized"
},
"403": {
"description": "Not entitled to this account and its transactions"
},
"404": {
"description": "User has no recent transactions"
},
"500": {
"description": "Internal Server Error",
"schema": {
"$ref": "#/definitions/Error"
}
}
}
}
},
"/operator/{id}/route/path": {
"get": {
"description": "some thing over here sample content.\n",
"security": [
{
"OAuth2": [
"None",
"None"
]
}
],
"parameters": [
{
"name": "bus-id",
"in": "path",
"required": true,
"type": "string"
},
{
"name": "limit",
"in": "query",
"description": "Record limit. Default is 10",
"required": false,
"type": "integer"
},
{
"name": "offset",
"in": "query",
"description": "Record offset. Default is 0.",
"required": false,
"type": "integer"
}
],
"tags": [
"Sample Reporting"
],
"responses": {
"200": {
"description": "sample actions",
"headers": {
"Link": {
"description": "some thing over here sample content\n",
"type": "array",
"items": {
"type": "string"
}
}
},
"schema": {
"type": "object",
"properties": {
"Boisterous": {
"type": "array",
"items": {
"$ref": "#/definitions/salesperson"
}
}
}
}
},
"400": {
"description": "Bad Request"
},
"401": {
"description": "Unauthorized"
},
"403": {
"description": "Not entitled to this account and its transactions"
},
"404": {
"description": "User has no recent transactions"
},
"500": {
"description": "Internal Server Error",
"schema": {
"$ref": "#/definitions/Error"
}
}
}
}
}
},
"definitions": {
"sample": {
"required": [
"demo",
"request"
],
"description": "Holds identifying attributes for an sample\n",
"properties": {
"request": {
"description": "some thing over here sample content\n",
"type": "string"
},
"sample one": {
"type": "string"
},
"sample two": {
"type": "string"
},
"number": {
"description": "some thing over here sample content",
"type": "string"
},
"name": {
"type": "string",
"description": "some thing over here sample content"
}
}
},
"dummy": {
"description": "some thing over here sample content\n",
"properties": {
"dummy": {
"$ref": "#/definitions/samples"
},
"data": {
"type": "string",
"description": "sample data "
},
"country": {
"type": "string",
"description": "some thing over here sample content"
},
"dataone": {
"$ref": "#/definitions/samples"
},
"error": {
"$ref": "#/definitions/Error"
},
"links": {
"description": "some thing over here sample content",
"type": "array",
"items": {
"$ref": "#/definitions/Links"
}
}
}
},
"reference": {
"description": "only description\n",
"properties": {
"available": {
"type": "number",
"format": "double",
"description": "some thing over here sample content"
},
"availableFormatted": {
"type": "string",
"description": "some thing over here sample content"
},
"held": {
"type": "number",
"format": "double",
"description": "some thing over here sample content."
},
"formatted": {
"type": "string",
"description": "some thing over here sample content"
},
"asst": {
"type": "string",
"format": "date-time",
"description": "Timestamp of , in UTC"
},
"sampler": {
"type": "string",
"format": "date-time",
"description": "Timestamp of , in users preferred TZ"
}
}
},
"dummy": {
"description": "Transaction done on an account",
"required": [
"because",
"versioned",
"postdates"
],
"properties": {
"tintype": {
"description": "sample content",
"type": "string"
},
"cringed": {
"description": "some thing over here sample content",
"type": "string",
"menu": [
"D",
"C"
]
},
"reference": {
"description": "Reference information",
"type": "string"
},
"amount": {
"description": "some thing over here sample content\n",
"type": "number",
"format": "double"
},
"formatted": {
"description": "formatted according to user's preferences\n"
},
"post Date": {
"type": "string",
"format": "date-time",
"description": "Date-time in UTC"
},
"postdate": {
"type": "string",
"format": "date-time",
"description": "Post Date-time in user-preferred TZ"
},
"narrative": {
"type": "string"
}
}
},
"Links": {
"description": "Related Links for the resource\n",
"properties": {
"rel": {
"description": "relationship to the resource",
"type": "string"
},
"ref": {
"description": "URL of the related link",
"type": "string"
}
}
},
"Error": { Sample
"description": "Error descriptor",
"properties": {
"code": {
"type": "integer",
"format": "intent"
},
"message": {
"type": "string"
},
"description": {
"type": "string"
},
"severity": {
"type": "string"
},
"location": {
"type": "string"
}
}
}
}
}
I have tried the below code:
import json
with open("example.json", "r") as reading:
data = json.load(reading)
print(data["paths"])
From here i need to go further the code to capture all the API endpoint name only.
In the sample json file, Under paths i need to capture all the endpoints and method type of an API as below,
In addition I would also like to capture the values for the below keys from the sample JSON code.
Under "parameters" i also need to capture the below keys,
name:
in:
required:
Expected output (just an example):
/service-provider/details
get
bus (comment - refers to the key - name under 'parameters')
query (comment - refers to the key - in under 'parameters')
false (comment - refers to the key - required under 'parameters')
/Bus/{bus-id}/names
get
bus-id (comment - refers to the key - name under 'parameters')
query (comment - refers to the key - in under 'parameters')
false (comment - refers to the key - required under 'parameters')
/busoperator/{bus-id}/names/routes
get
routes (comment - refers to the key - name under 'parameters')
query (comment - refers to the key - in under 'parameters')
false (comment - refers to the key - required under 'parameters')
/operator/{id}/route/path
get
id (comment - refers to the key - name under 'parameters')
query (comment - refers to the key - in under 'parameters')
false (comment - refers to the key - required under 'parameters')
Try this:
with open('example.json', 'r') as f:
data = json.load(f)
for path, values in data['paths'].items():
print(path)
for value in values:
print(value)
This should also get all the endpoints functions in a path if there are multiple.
To get a list of two-tuples (endpoint, type) use:
[(endpoint, value.keys[0]) for endpoint, value in data["paths"].items()]

Loading JSON from a Beautifulsoup Object

I am currently making a scraper app, but before going full out with the app, using other frameworks like Discord.py, I had to first scrape the site first. It proved quite difficult to scrape the site. The site that I am trying to scrape from is Fiverr. Anyways, long story short, I had to get some cookies to login with Python Requests. The big issue now is that the data I need to scrape comes in the form of JSON, which I don't know much about. I managed to select the javascript in question, but once I load it it gives an error: "TypeError: the JSON object must be str, bytes or bytearray, not Tag". I specifically need the "rows" part which is part of the JSON data.
I'm not quite certain how to fix this and have read and tried some similar questions here. I will appreciate any help.
import requests
from bs4 import BeautifulSoup
import re
import json
# Irrelevant to the question
class JobClass:
def __init__(self, date=None, buyer=None, request=None, duration=None, budget=None, link="https://www.fiverr.com/users/myusername/requests", id=None):
self.date = date
self.buyer = buyer
self.request = request
self.duration = duration
self.budget = budget
self.link = link
self.id = id
# Irrelevant to the question
duplicateSet = set()
scrapedSet = set()
jobObjArr = []
headers = {
# Some private cookies. To get them you just need to use a site like https://curl.trillworks.com/ it is really a life saver
# This is used to tell the site who you are to be logged in (which is why I deleted this part out of the code)
}
# Please note that I used "myusername" in the URL. This is going to be different depending on user
# Using the requests module, we use the "get" function
# provided to access the webpage provided as an
# argument to this function:
result = requests.get(
'https://www.fiverr.com/users/myusername/requests', headers=headers)
# Now, let us store the page content of the website accessed
# from requests to a variable:
src = result.content
# Now that we have the page source stored, we will use the
# BeautifulSoup module to parse and process the source.
# To do so, we create a BeautifulSoup object based on the
# source variable we created above:
soup = BeautifulSoup(src, "lxml")
data = soup.select("[type='text/javascript']")[1]
print(data)
# TypeError: the JSON object must be str, bytes or bytearray, not Tag
jsonObject = json.loads(data)
# Here is the output of print(data):
<script type="text/javascript">
document.viewData = {
"dds": {
"subCats": {
"current": {
"text": "All Subcategories",
"val": "-1"
},
"options": [{
"text": "Web \u0026 Mobile Design",
"val": 151
}, {
"text": "Web Programming",
"val": 140
}]
}
},
"results": {
"rows": [{
"type": "none",
"identifier": "5cf132b55e08360011efe633",
"cells": [{
"text": "May 31, 2019",
"type": "date",
"withText": true
}, {
"userPict": "\u003cspan class=\"missing-image-user \"\u003ec\u003c/span\u003e",
"type": "profile-40",
"cssClass": "height95"
}, {
"hintBottom": false,
"text": "My website was hacked and deleted. Need to have it recreated ",
"type": "text-wide",
"tags": [],
"attachment": false
}, {
"text": 1,
"type": "applications",
"alignCenter": true
}, {
"text": "3 days",
"type": "hidden-action",
"actionVisible": false,
"alignCenter": true,
"withText": true,
"buttons": [{
"type": "span",
"text": "3 days",
"class": "duration"
}, {
"type": "button",
"text": "Remove Request",
"class": "remove-request js-remove-request",
"meta": {
"requestId": "5cf132b55e08360011efe633",
"isProfessional": false
}
}]
}, {
"text": "---",
"type": "hidden-action",
"actionVisible": false,
"alignCenter": true,
"withText": true,
"buttons": [{
"type": "span",
"text": "---",
"class": "budget"
}, {
"type": "button",
"text": "Send Offer",
"class": "btn-standard btn-green-grad js-send-offer",
"meta": {
"username": "conto217",
"category": 3,
"subCategory": 151,
"requestId": "5cf132b55e08360011efe633",
"requestText": "My website was hacked and deleted. Need to have it recreated ",
"userPict": "\u003cspan class=\"missing-image-user \"\u003ec\u003c/span\u003e",
"isProfessional": false,
"buyerId": 32969684
}
}]
}]
}, {
"type": "none",
"identifier": "5cf12f641b6e99000edf1b60",
"cells": [{
"text": "May 31, 2019",
"type": "date",
"withText": true
}, {
"userPict": "\u003cimg src=\"https://fiverr-res.cloudinary.com/t_profile_small,q_auto,f_auto/attachments/profile/photo/648ceb417a85844b25e8bf070a70d9a0-254781561534997516.9743/MyFileName\" alt=\"muazamkhokher\" width=\"40\" height=\"40\"\u003e",
"type": "profile-40",
"cssClass": "height95"
}, {
"hintBottom": false,
"text": "Need mobile ui/ux designer from marvel wireframes",
"type": "text-wide",
"tags": [],
"attachment": false
}, {
"text": 4,
"type": "applications",
"alignCenter": true
}, {
"text": "5 days",
"type": "hidden-action",
"actionVisible": false,
"alignCenter": true,
"withText": true,
"buttons": [{
"type": "span",
"text": "5 days",
"class": "duration"
}, {
"type": "button",
"text": "Remove Request",
"class": "remove-request js-remove-request",
"meta": {
"requestId": "5cf12f641b6e99000edf1b60",
"isProfessional": false
}
}]
}, {
"text": "$50",
"type": "hidden-action",
"actionVisible": false,
"alignCenter": true,
"withText": true,
"buttons": [{
"type": "span",
"text": "$50",
"class": "budget"
}, {
"type": "button",
"text": "Send Offer",
"class": "btn-standard btn-green-grad js-send-offer",
"meta": {
"username": "muazamkhokher",
"category": 3,
"subCategory": 151,
"requestId": "5cf12f641b6e99000edf1b60",
"requestText": "Need mobile ui/ux designer from marvel wireframes",
"userPict": "\u003cimg src=\"https://fiverr-res.cloudinary.com/t_profile_small,q_auto,f_auto/attachments/profile/photo/648ceb417a85844b25e8bf070a70d9a0-254781561534997516.9743/MyFileName\" alt=\"muazamkhokher\" width=\"100\" height=\"100\"\u003e",
"isProfessional": false,
"buyerId": 25478156
}
}]
}]
....
I expect the JSON to be loaded in jsonObject, but I get an error: "TypeError: the JSON object must be str, bytes or bytearray, not Tag"
Edit: Here is some code at the end of the print statement. It randomly cuts off for some reason with no ending script tag:
}, {
"type": "none",
"identifier": "5cf1236a959aa5000f1ce094",
"cells": [{
"text": "May 31, 2019",
"type": "date",
"withText": true
}, {
"userPict": "\u003cimg src=\"https://fiverr-res.cloudinary.com/t_profile_small,q_auto,f_auto/profile/photos/30069758/original/Universalco_2a_Cloud.png\" alt=\"clarky2000\" width=\"40\" height=\"40\"\u003e",
"type": "profile-40",
"cssClass": "height95"
}, {
"hintBottom": false,
"text": "Slider revolution slider. 3 slides for a music festival. I can supply a copy what each slide should look like (see attached) and all the individual objects. Anyone can create basic RS slides, but I want this to be dynamic as its for a music festival. We are using the free version of RS if were are required to use the paid version of SL for addons please let us know. Bottom line this must be 3 dynamic slides (using the same background) for a music festival audience. Unlimited revisions is a must.",
"type": "see-more",
"tags": [{
"text": "Graphic UI"
}, {
"text": "Landing Pages"
}],
"attachment": {
"url": "/download/file/1559260800%2Fgig_requests%2Fattachment_f2a5f51b9fb473e8fc7f498929f39e3f",
"name": "Outwith Rotator_1920x1080_1.jpg",
"size": "2.68 MB"
}
}, {
"text": 2,
"type": "applications",
"alignCenter": true
}, {
"text": "24 hours",
"type": "hidden-action",
"actionVisible": false,
"alignCenter": true,
"withText": true,
"buttons": [{
"type": "span",
"text": "24 hours",
"class": "duration"
}, {
"type": "button",
"text": "Remove Request",
"class": "remove-request js-remove-request",
"meta": {
"requestId": "5cf1236a959aa5000f1ce094",
"isProfessional": false
}
}]
}, {
"text": "$23",
"type": "hidden-action",
"actionVisible": false,
"alignCenter": true,
"withText": true,
"buttons": [{
"type": "span",
"text": "$23",
"class": "budget"
}, {
"type": "button",
"text": "Send Of

django - iterate between json response objects

I have a response object that I am receiving from an api call. The response has several objects that are returned in a single call. What I want to do is grab information from each of the objects returned and store them in varialbes to use them within the application. I know to grab info from a json response when it returns a single objects but I am getting confused with multiples objects... I know how to automate the iteration process through something like a forloop... it wont iterate.
here is a sample response that I am getting:
I want to grab the _id from both items.
{
'user':"<class 'synapse_pay_rest.models.users.user.User'>(id=..622d)",
'json':{
'_id':'..6e80',
'_links':{
'self':{
'href':'https://uat-api.synapsefi.com/v3.1/users/..22d/nodes/..56e80'
}
},
'allowed':'CREDIT-AND-DEBIT',
'client':{
'id':'..26a34',
'name':'Charlie Brown LLC'
},
'extra':{
'note':None,
'other':{
},
'supp_id':''
},
'info':{
'account_num':'8902',
'address':'PO BOX 85139, RICHMOND, VA, US',
'balance':{
'amount':'750.00',
'currency':'USD'
},
'bank_long_name':'CAPITAL ONE N.A.',
'bank_name':'CAPITAL ONE N.A.',
'class':'SAVINGS',
'match_info':{
'email_match':'not_found',
'name_match':'not_found',
'phonenumber_match':'not_found'
},
'name_on_account':' ',
'nickname':'SynapsePay Test Savings Account - 8902',
'routing_num':'6110',
'type':'BUSINESS'
},
<class 'synapse_pay_rest.models.nodes.ach_us_node.AchUsNode'>({
'user':"<class 'synapse_pay_rest.models.users.user.User'>(id=..622d)",
'json':{
'_id':'..56e83',
'_links':{
'self':{
'href':'https://uat-api.synapsefi.com/v3.1/users/..d622d/nodes/..6e83'
}
},
'allowed':'CREDIT-AND-DEBIT',
'client':{
'id':'599378ec6aef1b0021026a34',
'name':'Charlie Brown LLC'
},
'extra':{
'note':None,
'other':{
},
'supp_id':''
},
'info':{
'account_num':'8901',
'address':'PO BOX 85139, RICHMOND, VA, US',
'balance':{
'amount':'800.00',
'currency':'USD'
},
'bank_long_name':'CAPITAL ONE N.A.',
'bank_name':'CAPITAL ONE N.A.',
'class':'CHECKING',
'match_info':{
'email_match':'not_found',
'name_match':'not_found',
'phonenumber_match':'not_found'
},
'name_on_account':' ',
'nickname':'SynapsePay Test Checking Account - 8901',
'routing_num':'6110',
'type':'BUSINESS'
},
})
Here is the code that I have:
It wont grab any values...
the iteration needs to be done to the nodes variable which is hte json response object.
def listedLinkAccounts(request):
currentUser = loggedInUser(request)
currentProfile = Profile.objects.get(user = currentUser)
user_id = currentProfile.synapse_id
synapseUser = SynapseUser.by_id(client, str(user_id))
options = {
'page':1,
'per_page':20,
'type': 'ACH-US',
}
nodes = Node.all(synapseUser, **options)
print(nodes)
response = nodes
_id = response["_id"]
print(_id)
return nodes
here is a sample api response from the api documenation:
{
"error_code": "0",
"http_code": "200",
"limit": 20,
"node_count": 5,
"nodes": [
{
"_id": "594e5c694d1d62002f17e3dc",
"_links": {
"self": {
"href": "https://uat-api.synapsefi.com/v3.1/users/594e0fa2838454002ea317a0/nodes/594e5c694d1d62002f17e3dc"
}
},
"allowed": "CREDIT-AND-DEBIT",
"client": {
"id": "589acd9ecb3cd400fa75ac06",
"name": "SynapseFI"
},
"extra": {
"other": {},
"supp_id": "ABC124"
},
"info": {
"account_num": "7443",
"address": "PLACE DE LA REPUBLIQUE 4 CROIX 59170 FR",
"balance": {
"amount": "0.00",
"currency": "USD"
},
"bank_long_name": "3 SUISSES INTERNATIONAL",
"bank_name": "3 SUISSES INTERNATIONAL",
"name_on_account": " ",
"nickname": "Some Account"
},
"is_active": true,
"timeline": [
{
"date": 1498307689471,
"note": "Node created."
},
{
"date": 1498307690130,
"note": "Unable to send micro deposits as node type is not ACH-US."
}
],
"type": "WIRE-INT",
"user_id": "594e0fa2838454002ea317a0"
},
{
...
},
{
...
},
...
],
"page": 1,
"page_count": 1,
"success": true
}

Convert deeply nested json from facebook to dataframe in python

I am trying to get user details of persons who has put likes, comments on Facebook posts. I am using python facebook-sdk package. Code is as follows.
import facebook as fi
import json
graph = fi.GraphAPI('Access Token')
data = json.dumps(graph.get_object('DSIfootcandy/posts'))
From the above, I am getting a highly nested json. Here I will put only a json string for one post in the fb.
{
"paging": {
"next": "https://graph.facebook.com/v2.0/425073257683630/posts?access_token=&limit=25&until=1449201121&__paging_token=enc_AdD0DL6sN3aDZCwfYY25rJLW9IZBZCLM1QfX0venal6rpjUNvAWZBOoxTjbOYZAaFiBImzMqiv149HPH5FBJFo0nSVOPqUy78S0YvwZDZD",
"previous": "https://graph.facebook.com/v2.0/425073257683630/posts?since=1450843741&access_token=&limit=25&__paging_token=enc_AdCYobFJpcNavx6STzfPFyFe6eQQxRhkObwl2EdulwL7mjbnIETve7sJZCPMwVm7lu7yZA5FoY5Q4sprlQezF4AlGfZCWALClAZDZD&__previous=1"
},
"data": [
{
"picture": "https://fbcdn-photos-e-a.akamaihd.net/hphotos-ak-xfa1/v/t1.0-0/p130x130/1285_5066979392443_n.png?oh=b37a42ee58654f08af5abbd4f52b1ace&oe=570898E7&__gda__=1461440649_aa94b9ec60f22004675c4a527e8893f",
"is_hidden": false,
"likes": {
"paging": {
"cursors": {
"after": "MTU3NzQxODMzNTg0NDcwNQ==",
"before": "MTU5Mzc1MjA3NDE4ODgwMA=="
}
},
"data": [
{
"id": "1593752074188800",
"name": "Maduri Priyadarshani"
},
{
"id": "427605680763414",
"name": "Darshi Mashika"
},
{
"id": "599793563453832",
"name": "Shakeer Nimeshani Shashikala"
},
{
"id": "1577418335844705",
"name": "Däzlling Jalali Muishu"
}
]
},
"from": {
"category": "Retail and Consumer Merchandise",
"name": "Footcandy",
"category_list": [
{
"id": "2239",
"name": "Retail and Consumer Merchandise"
}
],
"id": "425073257683630"
},
"name": "Timeline Photos",
"privacy": {
"allow": "",
"deny": "",
"friends": "",
"description": "",
"value": ""
},
"is_expired": false,
"comments": {
"paging": {
"cursors": {
"after": "WTI5dGJXVnVkRjlqZFhKemIzSUVXdNVFExTURRd09qRTBOVEE0TkRRNE5EVT0=",
"before": "WTI5dGJXVnVkRjlqZFhKemIzNE16Y3dNVFExTVRFNE9qRTBOVEE0TkRRME5UVT0="
}
},
"data": [
{
"from": {
"name": "NiFû Shafrà",
"id": "1025030640553"
},
"like_count": 0,
"can_remove": false,
"created_time": "2015-12-23T04:20:55+0000",
"message": "wow lovely one",
"id": "50018692683829_500458145118",
"user_likes": false
},
{
"from": {
"name": "Shamnaz Lukmanjee",
"id": "160625809961884"
},
"like_count": 0,
"can_remove": false,
"created_time": "2015-12-23T04:27:25+0000",
"message": "Nice",
"id": "500186926838929_500450145040",
"user_likes": false
}
]
},
"actions": [
{
"link": "https://www.facebook.com/425073257683630/posts/5001866838929",
"name": "Comment"
},
{
"link": "https://www.facebook.com/42507683630/posts/500186926838929",
"name": "Like"
}
],
"updated_time": "2015-12-23T04:27:25+0000",
"link": "https://www.facebook.com/DSIFootcandy/photos/a.438926536298302.1073741827.4250732576630/50086926838929/?type=3",
"object_id": "50018692838929",
"shares": {
"count": 3
},
"created_time": "2015-12-23T04:09:01+0000",
"message": "Reach new heights in the cute and extremely comfortable \"Silviar\" www.focandy.lk",
"type": "photo",
"id": "425077683630_50018926838929",
"status_type": "added_photos",
"icon": "https://www.facebook.com/images/icons/photo1.gif"
}
]
}
Now I need to get this data into a dataframe as follows(no need to get all).
item | Like_id |Like_username | comments_userid |comments_username|comment(msg)|
-----+---------+--------------+-----------------+-----------------+------------+
Bag | 45546 | noel | 641 | James | nice work |
-----+---------+--------------+-----------------+-----------------+------------+
Any Help will be Highly Appreciated.
Not exactly like your intended format, but here is the making of a solution :
import pandas
DictionaryObject_as_List = str(mydict).replace("{","").replace("}","").replace("[","").replace("]","").split(",")
newlist = []
for row in DictionaryObject_as_List :
row = row.replace('https://',' ').split(":")
exec('newlist.append ( ' + "[" + " , ".join(row)+"]" + ')')
DataFrame_Object = pandas.DataFrame(newlist)
print DataFrame_Object

Python - Find value anywhere within JSON and return location

In Python I'm currently working with a very large JSON file with some deep dictionaries and arrays. I'm having an issue where it's not constant. For example that's below, it's essentially countries, with regions/states, cities, and suburbs. The issue is that if there is only one suburb, it'll return a dictionary, though if there's more than one, it's a array with a dictionary making me have to add another line of code to go deeper. Sure, can ifelse/for it, but this is only a very small portion of the inconstancy and it's just not proper going ifelse all the time.
What I'd like to do is simply search anything within Belgium for the dictionary entry "code": "8400" and return it's location within the JSON file. What would be my best approach in order to do something like this? Thanks!
***SNIP***
{
"code": "BE",
"name": "Belgium",
"regions": {
"region": [
{
"code": "45",
"name": "Flanders",
"places": {
"place": [
{
"code": "1790",
"name": "Affligem"
},
{
"code": "8570",
"name": "Anzegem"
},
{
"code": "8630",
"name": "Diksmuide"
},
{
"code": "9600",
"name": "Ronse"
}
]
},
"subregions": {
"subregion": [
{
"code": "46",
"name": "Coast",
"places": {
"place": [
{
"code": "8300",
"name": "Knokke-Heist"
},
{
"code": "8400",
"name": "Oostende",
"subplaces": {
"subplace": {
"code": "8450",
"name": "Bredene"
}
}
},
{
"code": "8420",
"name": "De Haan"
},
{
"code": "8430",
"name": "Middelkerke"
},
{
"code": "8434",
"name": "Westende-Bad"
},
{
"code": "8490",
"name": "Jabbeke"
},
{
"code": "8660",
"name": "De Panne"
},
{
"code": "8670",
"name": "Oostduinkerke"
}
]
}
},
{
"code": "47",
"name": "Cities",
"places": {
"place": [
{
"code": "1000",
"name": "Brussels"
},
{
"code": "2000",
"name": "Antwerp"
},
{
"code": "8000",
"name": "Bruges"
},
{
"code": "8340",
"name": "Damme"
},
{
"code": "9000",
"name": "Gent"
}
]
}
},
{
"code": "48",
"name": "Interior",
"places": {
"place": [
{
"code": "2260",
"name": "Westerlo"
},
{
"code": "2400",
"name": "Mol"
},
{
"code": "2590",
"name": "Berlaar"
},
{
"code": "8500",
"name": "Kortrijk",
"subplaces": {
"subplace": {
"code": "8940",
"name": "Wervik"
}
}
},
{
"code": "8610",
"name": "Handzame"
},
{
"code": "8755",
"name": "Ruiselede"
},
{
"code": "8900",
"name": "Ieper"
},
{
"code": "8970",
"name": "Poperinge"
}
]
}
},
EDIT:
I was asked to show how I'm currently getting through this JSON file. Root is a dictionary containing numbers that equal the city/suburb I'm trying to search for. It doesn't define whether it is a city or suburb before hand. Below is my lazyly coded search while I was trying to learn how to dig through this JSON file, until I realized how complicated it was getting and got a bit stuck.
SNIP
for k in dataDict['countries']['country']:
if k['code'] == root['country']:
for y in k['regions']['region']['places']['place']:
if y['code'] == root['place']:
city = y['name']
else:
try:
for p in y['subplaces']['subplace']:
if p['code'] == root['place']:
city = p['name']
except:
pass
If I understand well, each dictionary has the following structure:
{"code": # some int
"name": # some str
none / "country" / "place" / whatever # some dict or list
You can write a recursive function that handle one and only one dict:
def foo(my_dict):
if my_dict['code'] == root['place']:
city = my_dict['name']
elif "country" in my_dict:
city = foo(my_dict['country'])
elif "place" in my_dict:
#
# and so on...
else:
city = None
return city
Hope this example will help you.

Categories

Resources