Parse JSON .txt file using RegEx in Python

I need help understanding how to parse JSON text for my assignment. I am not permitted to use the json library, so I can only use 're'.
This is part of the file (wrapped for legibility; originally it is all on one line):
{
"unit": {
"title": "Web Development courses",
"raw_title": "{} courses",
"source_objects": [{
"type": "subcategory",
"id": 8,
"description": "Learn web development skills to build fully functioning websites.",
"url": "/courses/development/web-development/",
"title": "Web Development"
}],
"item_type": "course",
"items": [{
"_class": "course",
"id": 625204,
"title": "The Web Developer Bootcamp 2021",
"url": "/course/the-web-developer-bootcamp/",
"is_paid": true,
"visible_instructors": [{
"image_100x100": "https://img-c.udemycdn.com/user/100x100/4466306_6fd8_3.jpg",
"id": 4466306,
"url": "/user/coltsteele/",
"initials": "CS",
"display_name": "Colt Steele",
"image_50x50": "https://img-c.udemycdn.com/user/50x50/4466306_6fd8_3.jpg",
"job_title": "Developer and Bootcamp Instructor",
"name": "Colt",
"_class": "user",
"title": "Colt Steele"
}],
"image_125_H": "https://img-c.udemycdn.com/course/125_H/625204_436a_3.jpg",
"image_240x135": "https://img-c.udemycdn.com/course/240x135/625204_436a_3.jpg",
"is_practice_test_course": false,
"image_480x270": "https://img-c.udemycdn.com/course/480x270/625204_436a_3.jpg",
"published_title": "the-web-developer-bootcamp",
"tracking_id": "2zX-e8qKRUGgpwfF5AQx3w",
"headline": "COMPLETELY REDONE - The only course you need to learn web development - HTML, CSS, JS, Node, and More!",
"num_subscribers": 713744,
"avg_rating": 4.6973553,
"avg_rating_recent": 4.6947985,
"rating": 4.6947985,
"num_reviews": 215075,
"is_wishlisted": false,
"num_published_lectures": 610,
"num_published_practice_tests": 0,
"image_50x50": "https://img-c.udemycdn.com/course/50x50/625204_436a_3.jpg",
"image_100x100": "https://img-c.udemycdn.com/course/100x100/625204_436a_3.jpg",
"image_304x171": "https://img-c.udemycdn.com/course/304x171/625204_436a_3.jpg",
"image_750x422": "https://img-c.udemycdn.com/course/750x422/625204_436a_3.jpg",
"is_in_user_subscription": false,
"locale": {
"english_title": "English (US)",
"locale": "en_US",
"_class": "locale",
"simple_english_title": "English",
"title": "English (US)"
},
"has_closed_caption": true,
"caption_languages": ["English [Auto]", "French [Auto]", "German [Auto]", "Italian [Auto]", "Polish [Auto]", "Portuguese [Auto]", "Spanish [Auto]"],
"created": "2015-09-28T21:32:19Z",
"instructional_level": "All Levels",
"instructional_level_simple": "All Levels",
"content_length_practice_test_questions": 0,
"is_user_subscribed": false,
"buyable_object_type": "course",
"published_time": "2015-11-02T21:13:27Z",
"objectives_summary": ["The ins and outs of HTML5, CSS3, and Modern JavaScript for 2021", "Make REAL web applications using cutting-edge technologies", "Create responsive, accessible, and beautiful layouts"],
"is_recently_published": false,
"last_update_date": "2021-09-09",
"preview_url": "/course/625204/preview/",
"learn_url": "/course/the-web-developer-bootcamp/learn/",
"predictive_score": null,
"relevancy_score": null,
"input_features": null,
"lecture_search_result": null,
"curriculum_lectures": [],
"order_in_results": null,
"curriculum_items": [],
"instructor_name": null,
"content_info": "63.5 total hours",
"content_info_short": "63.5 hours",
"bestseller_badge_content": null,
"badges": [],
"free_course_subscribe_url": null,
"context_info": {
"subcategory": null,
"category": {
"tracking_object_type": "cat",
"id": 288,
"url": "/courses/development/",
"title": "Development"
},
"label": {
"id": 8322,
"display_name": "Web Development",
"topic_channel_url": "/topic/web-development/",
"url": "/topic/web-development/",
"title": "Web Development",
"tracking_object_type": "cl"
}
}
},
{
"_class": "course",
"title": "etc",
}]
}
}
I have about 200 more lines like these, each filled with course, student, and similar info, and I need to extract the values of certain keys from this JSON file.
For example, for course information I need to extract course_id, the 100x100 image, course_title, subscribers, etc.; similarly, I need to extract further information for students and instructors.
My questions are:
Since the three classes (course, students, instructors) share some keys such as id, how do I pull courseID, studentID, and instructorID separately?
For the course ID, I checked another link on Stack Overflow and tried this, but it didn't work, so please help me out with this:
import re, os
raw_data_read = open('raw_data.txt', 'r')
regex_pattern = r'.*id\":\"([^\"]+)\".*'
match = re.match(regex_pattern, str(raw_data_read))
courseId= match.group(1)
print('courseId: {}'.format(courseId))
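One reason the attempt above finds nothing: str(raw_data_read) is the repr of the file object rather than the file's text, and the pattern expects a quoted value while the ids in the sample are bare numbers. Below is a minimal sketch of one way to separate the ids using only re. It assumes that a course object always has "_class": "course" directly before its "id", and that an instructor object has "id" before "_class": "user" with no nested braces in between. Both hold in the excerpt, but they are assumptions about the rest of the file. The shortened sample string and the names COURSE_ID and INSTRUCTOR_ID are illustrative:

```python
import re

# Shortened sample mirroring the structure in the question. For the real
# file, the text would come from open('raw_data.txt').read().
sample = (
    '{"unit": {"items": [{"_class": "course", "id": 625204, '
    '"title": "The Web Developer Bootcamp 2021", '
    '"visible_instructors": [{"image_100x100": '
    '"https://img-c.udemycdn.com/user/100x100/4466306_6fd8_3.jpg", '
    '"id": 4466306, "display_name": "Colt Steele", "_class": "user"}], '
    '"num_subscribers": 713744}]}}'
)

# Course ids: in the sample, "_class": "course" is immediately followed by "id".
COURSE_ID = re.compile(r'"_class":\s*"course",\s*"id":\s*(\d+)')

# Instructor ids: "id" precedes "_class": "user" inside the same object.
# [^{}]*? stops the match from crossing into another object.
INSTRUCTOR_ID = re.compile(r'"id":\s*(\d+),[^{}]*?"_class":\s*"user"')

course_ids = [int(m) for m in COURSE_ID.findall(sample)]
instructor_ids = [int(m) for m in INSTRUCTOR_ID.findall(sample)]

print('course ids:', course_ids)
print('instructor ids:', instructor_ids)
```

String values that themselves contain braces, such as "raw_title": "{} courses", can defeat the [^{}] guard, so the results are worth spot-checking against the file.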

Related

Problems importing, printing and analyzing JSON file in python

I have to write a Jupyter Notebook to perform an analysis of the Q&A threads from the category Profile. The data to analyze are contained in a JSON file. This file is very big and contains more than one discussion. When I import the file and try to print it, I get this error:
JSONDecodeError: Extra data: line 855 column 1 (char 26418)
Analyzing the imported file, I noticed that line 855 corresponds to the end of one discussion and the beginning of the next.
Below is part of the JSON file I'm talking about.
{
"Title": "How to get a new badge?",
"Number": "18294",
"Category": "Profile",
"Author": "deeperwhales",
"Date": "2022-06-10T18:42:30Z",
"State": "Answered",
"Answered_by": "wavescats",
"Body": "How to get a new badge?",
"Upvotes": 140,
"Labels": [
"Profile"
],
"Participants": 31,
"Answer_count": 80,
"Reply_count": 502,
"Answers": [
{
"IsOffTopic": false,
"Author": "wavescats",
"Date": "2022-06-10T18:43:14Z",
"Body": "After answering two discussions, You will get Galaxy Brain badge More details here: https://github.com/Schweinepriester/github-profile-achievements",
"Upvotes": 59,
"Accepted": true,
"Reply_count": 239,
"Replies": [
{
"Author": "pajeeh",
"Body": "Use the most number of languages.",
"Date": "2022-10-08T09:06:56Z",
"IsAuthor": false,
"Sentiment": "neutral"
},
{
"Author": "Khairul989",
"Body": "thanks",
"Date": "2022-10-09T16:05:54Z",
"IsAuthor": false,
"Sentiment": "positive"
},
{
"Author": "ibrahimmemonn",
"Body": "Thanks",
"Date": "2022-10-18T10:10:48Z",
"IsAuthor": false,
"Sentiment": "positive"
}
],
"Sentiment": "positive"
},
{
"IsOffTopic": false,
"Author": "akbar-ardiansyah",
"Date": "2022-06-10T19:44:45Z",
"Body": "pull shrark was opened when you opened pull requests that have been merged.",
"Upvotes": 6,
"Accepted": false,
"Reply_count": 16,
"Replies": [
{
"Author": "deividepaulino1",
"Body": "thanks",
"Date": "2022-07-08T19:33:35Z",
"IsAuthor": false,
"Sentiment": "positive"
},
{
"Author": "darkhorse-coder",
"Body": "Exactly, if you approach 100+ pr merged, you will get Silver Pull Shark. ;)",
"Date": "2022-07-20T16:55:13Z",
"IsAuthor": false,
"Sentiment": "neutral"
},
{
"Author": "Splayfery",
"Body": "How can I get different levels of this achievment?",
"Date": "2022-07-20T18:24:17Z",
"IsAuthor": false,
"Sentiment": "neutral"
},
{
"Author": "wizardigor",
"Body": "Quais outros emblemas est\u00e3o disponiveis?",
"Date": "2022-08-25T13:08:52Z",
"IsAuthor": false,
"Sentiment": "neutral"
},
{
"Author": "burhancan-stack",
"Body": "thanks.",
"Date": "2022-09-20T12:09:56Z",
"IsAuthor": false,
"Sentiment": "positive"
}
],
"Sentiment": "neutral"
}
],
"Sentiment": "neutral"
} ********************************Line 855*********************************
{
"Title": "feed back on achievement badges",
"Number": "21073",
"Category": "Profile",
"Author": "SteveALee",
"Date": "2022-07-22T13:51:48Z",
"State": "Unanswered",
"Answered_by": null,
"Body": "Please turn these off by default. Gamification has no place here. Useless twaddle.",
"Upvotes": 32,
"Labels": [
"Profile"
],
"Participants": 13,
"Answer_count": 13,
"Reply_count": 10,
"Answers": [
{
"IsOffTopic": false,
"Author": "jgmac1106",
"Date": "2022-07-22T16:04:07Z",
"Body": "i agree on off by default. Always default to privacy. I disagree on utility. If the achievements had useful metadata that complied with current industry recommendations the information could be ingested to track role based training requirements of developers, aid in portfolio reviews, and allow users to control their learning data outside of employers. Granted the achievements (little disappointing) are just images for now, but it could be easily extendable to allow parsing, ignestiong, and recording in an immutable ledger.",
"Upvotes": 3,
"Accepted": false,
"Reply_count": 2,
"Replies": [
{
"Author": "SteveALee",
"Body": "That's an interesting idea but a big \"if\" to get the badges representing meaningful development metrics rather than feel good.",
"Date": "2022-07-22T16:18:47Z",
"IsAuthor": true,
"Sentiment": "positive"
},
{
"Author": "seek-dev",
"Body": "Gamifying something just to sell more metadata to corporations who parasitize human privacy is inherently exploitation. The purpose is to manifest addictive behaviour with immaterial rewards. Which is a manipulative function of industrial psychology, though common practice in today's ecosystems of consumer spyware and vacuous social media.",
"Date": "{{datetime}}",
"IsAuthor": false,
"Sentiment": "negative"
}
],
"Sentiment": "negative"
},
{
"IsOffTopic": false,
"Author": "MrSarno",
"Date": "2022-07-22T16:50:03Z",
"Body": "I'm not sure whether you're aware, but there is a setting to disable them here. I don't feel strongly about your suggestion one way or the other. I think people are more likely to check for settings to disable features they dislike than they are to search for features they might hypothetially like and wish to enable. In addition, some people may not think to look for settings, and so there would likely be a significant number of people reporting the lack of achievements as a bug, and / or creating discussions to ask why they don't appear to be working. At least the setting's there for those who want it.",
"Upvotes": 13,
"Accepted": false,
"Reply_count": 2,
"Replies": [
{
"Author": "MlgmXyysd",
"Body": "I agree with this point",
"Date": "2022-07-24T17:37:27Z",
"IsAuthor": false,
"Sentiment": "neutral"
},
{
"Author": "mark-i-m",
"Body": "Thanks for this! I would never have found that setting on my own.",
"Date": "2022-07-25T17:49:57Z",
"IsAuthor": false,
"Sentiment": "neutral"
}
],
"Sentiment": "neutral"
}
This is the code I wrote
import json
file_json = open("/content/drive/MyDrive/Lab_SC/gh_discussions_badges.json")
data = json.load(file_json)
print (data)
This is the error
JSONDecodeError: Extra data: line 855 column 1 (char 26418)

Your data are JSON objects separated by a newline. We need to turn them into a JSON array. Maybe this will work?
import json
with open("/content/drive/MyDrive/Lab_SC/gh_discussions_badges.json") as f:
    text = f.read()
# Find every boundary between two objects, where the next object starts with "Title"
text = text.replace('}\n{\n"Title"','},\n{\n"Title"')
# Wrap everything in an array
json_text = '[' + text + ']'
data = json.loads(json_text)
print(data)

Your file is not JSON.
You're essentially doing the following:
import json
json.loads("{}{}")
Which gives
JSONDecodeError: Extra data: line 1 column 3 (char 2)
Either you need an array of objects [{},{}] or a single object {}.
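If editing the file into an array is not convenient, the standard library can also decode the objects one after another. A minimal sketch using json.JSONDecoder.raw_decode, which parses one value and reports where it ended. The helper name iter_json_objects and the two-object sample are illustrative, not taken from the real file:

```python
import json

def iter_json_objects(text):
    """Yield each top-level JSON value from a string of concatenated values."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so step past it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        obj, pos = decoder.raw_decode(text, pos)
        yield obj

# Two objects back to back, as in the file described above (shortened).
sample = (
    '{"Title": "How to get a new badge?", "Upvotes": 140}\n'
    '{"Title": "feed back on achievement badges", "Upvotes": 32}\n'
)

discussions = list(iter_json_objects(sample))
for discussion in discussions:
    print(discussion["Title"], discussion["Upvotes"])
```

For the real file, the text would come from reading the .json file in full and passing it to the helper instead of sample.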

Loading JSON from a Beautifulsoup Object

I am currently making a scraper app, but before going all out with the app and using other frameworks like Discord.py, I first had to scrape the site, which proved quite difficult. The site I am trying to scrape is Fiverr. Long story short, I had to get some cookies to log in with Python Requests. The big issue now is that the data I need comes in the form of JSON, which I don't know much about. I managed to select the JavaScript in question, but once I load it, it gives an error: "TypeError: the JSON object must be str, bytes or bytearray, not Tag". I specifically need the "rows" part of the JSON data.
I'm not quite certain how to fix this, and I have read and tried some similar questions here. I would appreciate any help.
import requests
from bs4 import BeautifulSoup
import re
import json
# Irrelevant to the question
class JobClass:
    def __init__(self, date=None, buyer=None, request=None, duration=None, budget=None, link="https://www.fiverr.com/users/myusername/requests", id=None):
        self.date = date
        self.buyer = buyer
        self.request = request
        self.duration = duration
        self.budget = budget
        self.link = link
        self.id = id
# Irrelevant to the question
duplicateSet = set()
scrapedSet = set()
jobObjArr = []
headers = {
# Some private cookies. To get them you just need to use a site like https://curl.trillworks.com/ it is really a life saver
# This is used to tell the site who you are to be logged in (which is why I deleted this part out of the code)
}
# Please note that I used "myusername" in the URL. This is going to be different depending on user
# Using the requests module, we use the "get" function
# provided to access the webpage provided as an
# argument to this function:
result = requests.get(
'https://www.fiverr.com/users/myusername/requests', headers=headers)
# Now, let us store the page content of the website accessed
# from requests to a variable:
src = result.content
# Now that we have the page source stored, we will use the
# BeautifulSoup module to parse and process the source.
# To do so, we create a BeautifulSoup object based on the
# source variable we created above:
soup = BeautifulSoup(src, "lxml")
data = soup.select("[type='text/javascript']")[1]
print(data)
# TypeError: the JSON object must be str, bytes or bytearray, not Tag
jsonObject = json.loads(data)
# Here is the output of print(data):
<script type="text/javascript">
document.viewData = {
"dds": {
"subCats": {
"current": {
"text": "All Subcategories",
"val": "-1"
},
"options": [{
"text": "Web \u0026 Mobile Design",
"val": 151
}, {
"text": "Web Programming",
"val": 140
}]
}
},
"results": {
"rows": [{
"type": "none",
"identifier": "5cf132b55e08360011efe633",
"cells": [{
"text": "May 31, 2019",
"type": "date",
"withText": true
}, {
"userPict": "\u003cspan class=\"missing-image-user \"\u003ec\u003c/span\u003e",
"type": "profile-40",
"cssClass": "height95"
}, {
"hintBottom": false,
"text": "My website was hacked and deleted. Need to have it recreated ",
"type": "text-wide",
"tags": [],
"attachment": false
}, {
"text": 1,
"type": "applications",
"alignCenter": true
}, {
"text": "3 days",
"type": "hidden-action",
"actionVisible": false,
"alignCenter": true,
"withText": true,
"buttons": [{
"type": "span",
"text": "3 days",
"class": "duration"
}, {
"type": "button",
"text": "Remove Request",
"class": "remove-request js-remove-request",
"meta": {
"requestId": "5cf132b55e08360011efe633",
"isProfessional": false
}
}]
}, {
"text": "---",
"type": "hidden-action",
"actionVisible": false,
"alignCenter": true,
"withText": true,
"buttons": [{
"type": "span",
"text": "---",
"class": "budget"
}, {
"type": "button",
"text": "Send Offer",
"class": "btn-standard btn-green-grad js-send-offer",
"meta": {
"username": "conto217",
"category": 3,
"subCategory": 151,
"requestId": "5cf132b55e08360011efe633",
"requestText": "My website was hacked and deleted. Need to have it recreated ",
"userPict": "\u003cspan class=\"missing-image-user \"\u003ec\u003c/span\u003e",
"isProfessional": false,
"buyerId": 32969684
}
}]
}]
}, {
"type": "none",
"identifier": "5cf12f641b6e99000edf1b60",
"cells": [{
"text": "May 31, 2019",
"type": "date",
"withText": true
}, {
"userPict": "\u003cimg src=\"https://fiverr-res.cloudinary.com/t_profile_small,q_auto,f_auto/attachments/profile/photo/648ceb417a85844b25e8bf070a70d9a0-254781561534997516.9743/MyFileName\" alt=\"muazamkhokher\" width=\"40\" height=\"40\"\u003e",
"type": "profile-40",
"cssClass": "height95"
}, {
"hintBottom": false,
"text": "Need mobile ui/ux designer from marvel wireframes",
"type": "text-wide",
"tags": [],
"attachment": false
}, {
"text": 4,
"type": "applications",
"alignCenter": true
}, {
"text": "5 days",
"type": "hidden-action",
"actionVisible": false,
"alignCenter": true,
"withText": true,
"buttons": [{
"type": "span",
"text": "5 days",
"class": "duration"
}, {
"type": "button",
"text": "Remove Request",
"class": "remove-request js-remove-request",
"meta": {
"requestId": "5cf12f641b6e99000edf1b60",
"isProfessional": false
}
}]
}, {
"text": "$50",
"type": "hidden-action",
"actionVisible": false,
"alignCenter": true,
"withText": true,
"buttons": [{
"type": "span",
"text": "$50",
"class": "budget"
}, {
"type": "button",
"text": "Send Offer",
"class": "btn-standard btn-green-grad js-send-offer",
"meta": {
"username": "muazamkhokher",
"category": 3,
"subCategory": 151,
"requestId": "5cf12f641b6e99000edf1b60",
"requestText": "Need mobile ui/ux designer from marvel wireframes",
"userPict": "\u003cimg src=\"https://fiverr-res.cloudinary.com/t_profile_small,q_auto,f_auto/attachments/profile/photo/648ceb417a85844b25e8bf070a70d9a0-254781561534997516.9743/MyFileName\" alt=\"muazamkhokher\" width=\"100\" height=\"100\"\u003e",
"isProfessional": false,
"buyerId": 25478156
}
}]
}]
....
I expect the JSON to be loaded in jsonObject, but I get an error: "TypeError: the JSON object must be str, bytes or bytearray, not Tag"
Edit: Here is the end of the printed output. It cuts off at a seemingly random point, with no closing script tag:
}, {
"type": "none",
"identifier": "5cf1236a959aa5000f1ce094",
"cells": [{
"text": "May 31, 2019",
"type": "date",
"withText": true
}, {
"userPict": "\u003cimg src=\"https://fiverr-res.cloudinary.com/t_profile_small,q_auto,f_auto/profile/photos/30069758/original/Universalco_2a_Cloud.png\" alt=\"clarky2000\" width=\"40\" height=\"40\"\u003e",
"type": "profile-40",
"cssClass": "height95"
}, {
"hintBottom": false,
"text": "Slider revolution slider. 3 slides for a music festival. I can supply a copy what each slide should look like (see attached) and all the individual objects. Anyone can create basic RS slides, but I want this to be dynamic as its for a music festival. We are using the free version of RS if were are required to use the paid version of SL for addons please let us know. Bottom line this must be 3 dynamic slides (using the same background) for a music festival audience. Unlimited revisions is a must.",
"type": "see-more",
"tags": [{
"text": "Graphic UI"
}, {
"text": "Landing Pages"
}],
"attachment": {
"url": "/download/file/1559260800%2Fgig_requests%2Fattachment_f2a5f51b9fb473e8fc7f498929f39e3f",
"name": "Outwith Rotator_1920x1080_1.jpg",
"size": "2.68 MB"
}
}, {
"text": 2,
"type": "applications",
"alignCenter": true
}, {
"text": "24 hours",
"type": "hidden-action",
"actionVisible": false,
"alignCenter": true,
"withText": true,
"buttons": [{
"type": "span",
"text": "24 hours",
"class": "duration"
}, {
"type": "button",
"text": "Remove Request",
"class": "remove-request js-remove-request",
"meta": {
"requestId": "5cf1236a959aa5000f1ce094",
"isProfessional": false
}
}]
}, {
"text": "$23",
"type": "hidden-action",
"actionVisible": false,
"alignCenter": true,
"withText": true,
"buttons": [{
"type": "span",
"text": "$23",
"class": "budget"
}, {
"type": "button",
"text": "Send Of
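The TypeError arises because soup.select(...) returns Tag objects, while json.loads needs text. The text of a script tag is available as data.string (or data.get_text()), but it still begins with the JavaScript assignment document.viewData = , which is not JSON. Below is a minimal sketch of stripping that prefix with the standard library. It assumes the script text is complete: the output shown above is cut off, and a truncated object will not parse. extract_view_data and the shortened script_text are illustrative names, not part of BeautifulSoup:

```python
import json
import re

def extract_view_data(script_text):
    """Return the object assigned to document.viewData in a script's text."""
    match = re.search(r'document\.viewData\s*=\s*', script_text)
    if match is None:
        raise ValueError("document.viewData assignment not found")
    # raw_decode parses one JSON value starting at the given index and
    # ignores whatever follows it (for example a trailing semicolon).
    obj, _end = json.JSONDecoder().raw_decode(script_text, match.end())
    return obj

# Stand-in for data.string from the question, shortened for illustration.
script_text = '''
document.viewData = {
  "results": {
    "rows": [{"type": "none", "identifier": "5cf132b55e08360011efe633"}]
  }
};
'''

view_data = extract_view_data(script_text)
rows = view_data["results"]["rows"]
print(rows[0]["identifier"])
```

With the real page, script_text would be data.string from the selected script tag.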

Getting Deeper Level JSON Values in Python

I have a Python script that makes an API call to retrieve data from Zendesk (using Python 3.x). The JSON object has a structure like this:
{
"id": 35436,
"url": "https://company.zendesk.com/api/v2/tickets/35436.json",
"external_id": "ahg35h3jh",
"created_at": "2009-07-20T22:55:29Z",
"updated_at": "2011-05-05T10:38:52Z",
"type": "incident",
"subject": "Help, my printer is on fire!",
"raw_subject": "{{dc.printer_on_fire}}",
"description": "The fire is very colorful.",
"priority": "high",
"status": "open",
"recipient": "support#company.com",
"requester_id": 20978392,
"submitter_id": 76872,
"assignee_id": 235323,
"organization_id": 509974,
"group_id": 98738,
"collaborator_ids": [35334, 234],
"forum_topic_id": 72648221,
"problem_id": 9873764,
"has_incidents": false,
"due_at": null,
"tags": ["enterprise", "other_tag"],
"via": {
"channel": "web"
},
"custom_fields": [
{
"id": 27642,
"value": "745"
},
{
"id": 27648,
"value": "yes"
}
],
"satisfaction_rating": {
"id": 1234,
"score": "good",
"comment": "Great support!"
},
"sharing_agreement_ids": [84432]
}
Where I am running into issues is in the "custom_fields" section specifically. I have a particular custom field inside of each ticket I need the value for, and I only want that particular value.
To spare you too many specifics of the Python code, I am reading through each value below for each ticket and adding it to an output variable before writing that output variable to a .csv. Here is the particular place the breakage is occurring:
output += str(ticket['custom_fields'][id:23825198]).replace(',', '')+','
All the replace nonsense is to make sure that since it is going into a comma delimited file, any commas inside of the values are removed. Anyway, here is the error I am getting:
output += str(ticket['custom_fields'][id:int(23825198)]).replace(',', '')+','
TypeError: slice indices must be integers or None or have an __index__ method
As you can see I have tried a couple different variations of this to try and resolve the issue, and have yet to find a fix. I could use some help!
Thanks...

Are you using json.loads()? If so, you can get the keys and test them with an if statement. An example of how to get the keys and their respective values is shown below.
import json
some_json = """{
"id": 35436,
"url": "https://company.zendesk.com/api/v2/tickets/35436.json",
"external_id": "ahg35h3jh",
"created_at": "2009-07-20T22:55:29Z",
"updated_at": "2011-05-05T10:38:52Z",
"type": "incident",
"subject": "Help, my printer is on fire!",
"raw_subject": "{{dc.printer_on_fire}}",
"description": "The fire is very colorful.",
"priority": "high",
"status": "open",
"recipient": "support#company.com",
"requester_id": 20978392,
"submitter_id": 76872,
"assignee_id": 235323,
"organization_id": 509974,
"group_id": 98738,
"collaborator_ids": [35334, 234],
"forum_topic_id": 72648221,
"problem_id": 9873764,
"has_incidents": false,
"due_at": null,
"tags": ["enterprise", "other_tag"],
"via": {
"channel": "web"
},
"custom_fields": [
{
"sid": 27642,
"value": "745"
},
{
"id": 27648,
"value": "yes"
}
],
"satisfaction_rating": {
"id": 1234,
"score": "good",
"comment": "Great support!"
},
"sharing_agreement_ids": [84432]
}"""
# load the json object
zenJSONObj = json.loads(some_json)
# Shows a list of all custom fields
print("All the custom field data")
print(zenJSONObj['custom_fields'])
print("----")
# Shows every key and value in each custom field
print("Show keys and the values")
for custom_field in zenJSONObj['custom_fields']:
    print("----")
    for key in custom_field.keys():
        print("key:",key," value: ",custom_field[key])
You can then modify the JSON object by doing something like
print(zenJSONObj['custom_fields'][0])
zenJSONObj['custom_fields'][0]['value'] = 'something new'
print(zenJSONObj['custom_fields'][0])
Then re-encode it using the following:
newJSONObject = json.dumps(zenJSONObj, sort_keys=True, indent=4)
I hope this is of some help.
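Since custom_fields is a list of dictionaries, a slice such as [id:23825198] cannot select by id; the list has to be searched instead. A minimal sketch of that lookup, with the csv module handling commas inside values so they do not need to be stripped. The helper name custom_field_value and the trimmed ticket are illustrative:

```python
import csv
import io

# Trimmed ticket with the same shape as the JSON in the question.
ticket = {
    "id": 35436,
    "custom_fields": [
        {"id": 27642, "value": "745"},
        {"id": 27648, "value": "yes"},
    ],
}

def custom_field_value(ticket, field_id, default=None):
    """Return the value of the custom field with the given id, if present."""
    for field in ticket.get("custom_fields", []):
        if field.get("id") == field_id:
            return field.get("value")
    return default

value = custom_field_value(ticket, 27648)
missing = custom_field_value(ticket, 23825198)  # id not in this sample

# csv.writer quotes any value that contains a comma.
buffer = io.StringIO()
csv.writer(buffer).writerow([ticket["id"], value])
print(buffer.getvalue())
```

In the real script, the io.StringIO buffer would be replaced by the open .csv file.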

How to fetch next page from REST API responses

EDIT - Here is the first object/record in the response. I get a total of 20 records, including this one:
Note: The response has been edited to remove the unicode prefix (u') and to replace ' with " so that it is valid JSON.
{
"jobs": {
"_total": 1811,
"_count": 20,
"_start": 0,
"values": [{
"siteJobUrl": "xxx",
"company": {
"id": 21836,
"name": "CyberCoders"
},
"postingDate": {
"year": 2013,
"day": 10,
"month": 6
},
"descriptionSnippet": "Software Engineer- Hadoop, HDFS, HBaseWe are a well known consumer product development company and we are looking to add a Hadoop Engineer to our Engineering team. You will be working with the latest and greatest technologies to design, develop, and implement connectivity products that allow efficient exchange of data between our core database engine and the Hadoop ecosystem.What you need for thi",
"expirationDate": {
"year": 2013,
"day": 10,
"month": 7
},
"position": {
"industries": {
"_total": 1,
"values": [{
"code": "4",
"id": 4,
"name": "Computer Software"
}]
},
"title": "Software Engineer- Hadoop, HDFS, HBase",
"experienceLevel": {
"code": "2",
"name": "Entry level"
},
"location": {
"country": {
"code": "us"
},
"name": "Greater Pittsburgh Area"
},
"jobFunctions": {
"_total": 1,
"values": [{
"code": "it",
"name": "Information Technology"
}]
},
"jobType": {
"code": "F",
"name": "Full-time"
}
},
"customerJobCode": "CCW-ssehadooppaccw",
"locationDescription": "Pittsburgh, PA",
"jobPoster": {
"headline": "Senior Executive Technical Recruiter at CyberCoders (949)885-5121 chelsea.whalen#cybercoders.com",
"lastName": "W.",
"id": "y2zfe5j76F",
"firstName": "Chelsea"
},
"id": 6007298
},
I am trying to use the Python library below to make a job search API call to the LinkedIn REST API.
https://github.com/ozgur/python-linkedin/
I can access the first page of output just fine, but when I increment start to point to the next page, I still get the same response. What am I missing here?
Here is my code snippet:
authentication = LinkedInAuthentication(API_KEY, API_SECRET, RETURN_URL,
PERMISSIONS.enums.values())
.......
application = LinkedInApplication(authentication)
........
# This is my request
response = application.search_job(
selectors=[{'jobs':
['id',
'customer-job-code',
'posting-date','expiration-date',
{'company':['id','name']},
{'position':['title',
'location',
'job-functions.',
'industries',
'job-type',
'experience-level']},
'skills-and-experience',
'description-snippet',
'salary',
{'job-poster':['id',
'first-name',
'last-name',
'headline']},
'referral-bonus',
'site-job-url',
'location-description']}],
params={'keywords': 'hadoop','count': 20},
headers={'start':20})
I posted the query on the LinkedIn developer forum too - http://developer.linkedin.com/forum/new-python-client-oauth-20
but got no response. I am new to Python, which makes this even more difficult for me.
I appreciate your help.
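The response reports _total, _count and _start, so paging is a matter of advancing the start offset by count until total is reached. Below is a sketch of that arithmetic. It assumes, which the question does not confirm, that the API reads start from the query parameters alongside count rather than from the headers. page_params is a hypothetical helper, not part of python-linkedin:

```python
def page_params(total, count, keywords):
    """Yield one params dict per page, advancing start by count each time."""
    for start in range(0, total, count):
        # Assumption: start travels with count in the query parameters.
        yield {"keywords": keywords, "count": count, "start": start}

# _total and _count taken from the response shown in the question.
pages = list(page_params(total=1811, count=20, keywords="hadoop"))

print(len(pages))   # number of requests needed to cover every record
print(pages[0])
print(pages[1])
```

Each dictionary could then be passed as params= to application.search_job in place of the fixed params above, if the assumption about start holds.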

How to Convert Json Value of Http Post Parameter to Python Dict in Django?

I am using Django to receive and process push notifications from the foursquare real-time api. Each checkin is pushed as a POST request to my server containing a single parameter named checkin. I am trying to grab the value of the checkin parameter and convert it to a python dict. However, calling json.loads always results in the following error:
NameError: name 'true' is not defined
I know the json is valid, so I must be doing something wrong.
The code is:
import json
def push(request):
    if request.is_secure():
        checkin_json = request.POST['checkin']
        checkin = json.load(request.POST)
The body of the post request is:
"checkin =
{
"id": "4e6fe1404b90c00032eeac34",
"createdAt": 1315955008,
"type": "checkin",
"timeZone": "America/New_York",
"user": {
"id": "1",
"firstName": "Jimmy",
"lastName": "Foursquare",
"photo": "https://foursquare.com/img/blank_boy.png",
"gender": "male",
"homeCity": "New York, NY",
"relationship": "self"
},
"venue": {
"id": "4ab7e57cf964a5205f7b20e3",
"name": "foursquare HQ",
"contact": {
"twitter": "foursquare"
},
"location": {
"address": "East Village",
"lat": 40.72809214560253,
"lng": -73.99112284183502,
"city": "New York",
"state": "NY",
"postalCode": "10003",
"country": "USA"
},
"categories": [
{
"id": "4bf58dd8d48988d125941735",
"name": "Tech Startup",
"pluralName": "Tech Startups",
"shortName": "Tech Startup",
"icon": "https://foursquare.com/img/categories/building/default.png",
"parents": [
"Professional & Other Places",
"Offices"
],
"primary": true
}
],
"verified": true,
"stats": {
"checkinsCount": 7313,
"usersCount": 565,
"tipCount": 128
},
"url": "http://foursquare.com"
}
}"

Try json.loads(checkin_json) instead of json.load(request.POST). Notice the extra 's'.

Change checkin = json.load(request.POST) to checkin = json.loads(checkin_json).

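In Python, Boolean values are capitalized (first letter is uppercase): True/False.
Check this.
EDIT:
Pay attention to these lines:
"primary": true
}
],
"verified": true,
Both "true" values are lowercase. That is valid JSON but not valid Python, so the NameError suggests the text is being evaluated as Python code rather than parsed by json.loads.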
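For reference, json.loads takes a string (json.load takes a file-like object) and accepts lowercase true in JSON text, converting it to Python's True. A minimal sketch with a shortened stand-in for the checkin parameter. The literal string below is illustrative; in the view it would be request.POST['checkin']:

```python
import json

# Shortened stand-in for the value of the checkin POST parameter.
checkin_json = (
    '{"id": "4e6fe1404b90c00032eeac34", '
    '"venue": {"verified": true, "stats": {"checkinsCount": 7313}}}'
)

# json.loads parses a str; lowercase true in the JSON becomes Python True.
checkin = json.loads(checkin_json)

print(checkin["venue"]["verified"])
print(checkin["venue"]["stats"]["checkinsCount"])
```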
