I've looked over this many times and can't seem to find the problem with it.
I am trying to pull 3 fields from a JSON response (engagements, shares, comments), sum them together, then print the sum.
It seems to be returning the fields correctly, but it returns zero in the final print.
I'm very new to this stuff, but would appreciate any help anyone can give me. I'm guessing there is something fundamental I am missing here.
import urllib2,time,csv,json,requests,urlparse,pdb

SEARCH_URL = urllib2.unquote("http://soyuz.elastic.tubularlabs.net:9200/intelligence/video/_search?q=channel_youtube_id:""%s""%20AND%20published:%3E20150715T000000Z%20AND%20published:%3C20150716T000000Z")

reader = csv.reader(open('input.csv', 'r+U'), delimiter=',', quoting=csv.QUOTE_NONE)
cookie = {"user": "2|1:0|10:1438908462|4:user|36:eyJhaWQiOiA1Njk3LCAiaWQiOiA2MzQ0fQ==|b5c4b3adbd96e54833bf8656625aedaf715d4905f39373b860c4b4bc98655e9e"}

# idsToProcess = []
# for row in reader:
#     if len(row)>0:
#         idsToProcess.append(row[0])
idsToProcess = ['qdGW_m8Rim4FeMM29keDEg']

for userID in idsToProcess:
    # print "fetching for %s.." % fbid
    url = SEARCH_URL % userID
    soyuzResponse = None
    response = requests.request("GET", url, cookies=cookie)
    ret = response.json()
    soyuzResponse = ret['hits']['hits'][0]['_source']['youtube']
    print soyuzResponse
    totalDelta = 0
    totalEngagementsVal = 0
    totalSharesVal = 0
    totalCommentsVal = 0
    valuesArr = []
    for entry in valuesArr:
        arrEngagements = entry['engagements']
        arrShares = entry['shares']
        arrComments = entry['comments']
        if len(arrEngagements)>0:
            totalEngagementsVal = arrEngagements
        elif len(arrShares)>0:
            totalSharesVal = arrShares
        elif len(arrComments)>0:
            totalCommentsVal = arrComments
    print "%s,%s" % (userID,totalEngagementsVal+totalSharesVal+totalCommentsVal)
    totalDelta += totalEngagementsVal+totalSharesVal+totalCommentsVal
    time.sleep(0)
    print "%s,%s" % (userID,totalDelta)

exit()
Here is the JSON I am parsing:
{
took: 371,
timed_out: false,
_shards: {
total: 128,
successful: 128,
failed: 0
},
hits: {
total: 1,
max_score: 9.335125,
hits: [
{
_index: "intelligence_v2",
_type: "video",
_id: "jW7mjVdzR_U",
_score: 9.335125,
_source: {
claim: [
"Blucollection%2Buser"
],
topics: [
{
title_non: "Toy",
topic_id: "/m/0138tl",
title: "Toy"
},
{
title_non: "Sofia the First",
topic_id: "/m/0ncq483",
title: "Sofia the First"
}
],
likes: 1045,
duration: 318,
channel_owner_type: "influencer",
category: "Entertainment",
imported: "20150809T230652Z",
title: "Princess Sofia Cash Register Toy Surprise - Play Doh Caja Registradora Disney Sofia the First",
audience_location: [
{
country: "US",
value: 100
}
],
comments: 10,
twitter: {
tweets: 6,
engagements: 6
},
description: "Disney Princess "Sofia Cash Register" toy unboxing review by DisneyCollector. This is the authentic Royal toy of Sofia fit for a little Princess of Enchantia. Young Girls learn early on how mathematics is important in our lives, and learn to do math, developing creativity with a super fun game! Thx 4 watching this "Disney Princess Sofia cash register" unboxing review. In this video i also used Disney Frozen Princess Anna, Nickelodeon Peppa Pig blind bag and plastilina Play-Doh. Revisión del juguete Princesita Sofía Caja Registradora Real para niños y niñas. Las niñas aprenden desde muy temprano cómo las matemáticas es importante en nuestras vidas, y aprenden a hacer matemáticas, el desarrollo de la creatividad con un juego súper divertido! Here's how to say Princess in other languages: printzesa, 公主, prinses, prenses, printsess, princesse, Prinzessin, puteri, banphrionsa, Principesse, principessa, プリンセス, princese, puteri, prinsessa,prinsesse, princesa, công chúa, tywysoges, Princesses Disney, Prinzessinen, 공주, Princesas Disney, Disney πριγκίπισσες, Дисней принцесс, 디즈니 공주, ディズニーのお姫様, Vorstin, koningsdochter, Fürstin, πριγκίπισσα, księżniczka, królewna, принцесса. Here's how register is called in other languages: Caja Registradora de Princesa Sofía, Caisse Enregistreuse Princesse Sofia, Kassa, Registrierkasse Sofia die Erste Auf einmal Prinzessin, Registratore di Cassa di La Principessa Sofia, Caixa Registadora da Princesa Sofia, ηλεκτρονική ταμειακή μηχανή Σοφία η Πριγκίπισσα, 電子式金銭登録機 ちいさなプリンセス ソフィア, София Прекрасная кассовый аппарат, 디즈니주니어 리틀 프린세스 소피아 전자 금전 등록기, máy tính tiền điện tử, daftar uang elektronik, elektronik yazarkasa, Sofia den första kassaapparat leksak, Jej Wysokość Zosia kasa zabawki, Sofia het prinsesje kassa speelgoed, София Първа касов апарат играчка, casa de marcat jucărie Sofia Întâi. Princess Sofia SLEEPOVER Slumber Party - Princesita Sofía Pijamada Real. https://www.youtube.com/watch?v=WSa-Tp7HfyQ Princesita Sofía Castillo Mágico Parlante juguete de niñas. https://www.youtube.com/watch?v=ALQm_3uhIyg Sofia the First Magical Talking Castle Royal Prep Academy. https://www.youtube.com/watch?v=gcUiY0Suzrc Play-Doh Meal Makin' Kitchen with Princess Sofia the First. https://www.youtube.com/watch?v=x_-OxnRXj6g Sofia the First Royal Prep Academy Dolls Character Collection. https://www.youtube.com/watch?v=_kNY6AkSp9g Peppa Pig Picnic Adventure Car With Princess Sofia the First. https://www.youtube.com/watch?v=KIPH3txlq1o Watch "Sofia the First Talking Magic Castle" talking Clover: https://www.youtube.com/watch?v=ALQm_3uhIyg Play-Doh Sofia the First Magic Talking Castle w/ Peppa Pig: https://www.youtube.com/watch?v=-slXqMiDrY0 Play-Doh Sofia the First Going to School Portable Classroom http://www.youtube.com/watch?v=0R-dkVAIUlA",
views: 941726,
channel_network: null,
channel_subscribers: 5054024,
youtube_id: "jW7mjVdzR_U",
facebook: {
engagements: 9,
likes: 2,
shares: 7,
comments: 0
},
location_demo_count: 1,
is_public: true,
engagements: 1070,
channel_country: "US",
demo_count: null,
monetizable: true,
youtube: {
engagements: 1055,
likes: 1045,
comments: 10
},
published: "20150715T100003Z",
channel_youtube_id: "qdGW_m8Rim4FeMM29keDEg"
}
}
]
}
}
Response from terminal after running script:
{u'engagements': 1055, u'likes': 1045, u'comments': 10}
qdGW_m8Rim4FeMM29keDEg,0
qdGW_m8Rim4FeMM29keDEg,0
Your problem is these two lines:
valuesArr = []
for entry in valuesArr:
Because valuesArr is empty, the for loop never iterates, so the body where your totals are summed never executes and the totals stay at zero.
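A minimal sketch of one possible fix (an assumption on my part, based only on the sample response above): sum the counters straight off each hit's _source.youtube dict instead of looping over an empty list. Note the counters are ints, not lists, and shares is absent from the youtube object in your sample, so each field is read with a default of 0:

for userID in idsToProcess:
    response = requests.request("GET", SEARCH_URL % userID, cookies=cookie)
    ret = response.json()
    totalDelta = 0
    for hit in ret['hits']['hits']:
        stats = hit['_source']['youtube']
        # .get() falls back to 0 when a field (e.g. shares) is missing
        totalDelta += stats.get('engagements', 0) + stats.get('shares', 0) + stats.get('comments', 0)
    print "%s,%s" % (userID, totalDelta)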
I know there are a lot of questions similar to mine, but none of them worked for me.
My JSON file has arrays for actors, directors and genre. I'm having difficulty dealing with these arrays while building the XML.
This is the json file:
[
{
"title":"The Kissing Booth",
"year":"2018",
"actors":[
"Megan du Plessis",
"Lincoln Pearson",
"Caitlyn de Abrue",
"Jack Fokkens",
"Stephen Jennings",
"Chloe Williams",
"Michael Miccoli",
"Juliet Blacher",
"Jesse Rowan-Goldberg",
"Chase Dallas",
"Joey King",
"Joel Courtney",
"Jacob Elordi",
"Carson White",
"Hilton Pelser"
],
"genre":[
"Comedy",
"Romance"
],
"description":"A high school student is forced to confront her secret crush at a kissing booth.",
"directors":[
"Vince Marcello"
]
},
{
"title":"Dune",
"year":"2020",
"actors":[
"Rebecca Ferguson",
"Zendaya",
"Jason Momoa",
"Timoth\u00e9e Chalamet",
"Dave Bautista",
"Josh Brolin",
"Oscar Isaac",
"Stellan Skarsg\u00e5rd",
"Javier Bardem",
"Charlotte Rampling",
"David Dastmalchian",
"Stephen McKinley Henderson",
"Sharon Duncan-Brewster",
"Chen Chang",
"Babs Olusanmokun"
],
"genre":[
"Adventure",
"Drama",
"Sci-Fi"
],
"description":"Feature adaptation of Frank Herbert's science fiction novel, about the son of a noble family entrusted with the protection of the most valuable asset and most vital element in the galaxy.",
"directors":[
"Denis Villeneuve"
]
},
{
"title":"Parasite",
"year":"2019",
"actors":[
"Kang-ho Song",
"Sun-kyun Lee",
"Yeo-jeong Jo",
"Woo-sik Choi",
"So-dam Park",
"Jeong-eun Lee",
"Hye-jin Jang",
"Myeong-hoon Park",
"Ji-so Jung",
"Hyun-jun Jung",
"Keun-rok Park",
"Jeong Esuz",
"Jo Jae-Myeong",
"Ik-han Jung",
"Kim Gyu Baek"
],
"genre":[
"Comedy",
"Drama",
"Thriller"
],
"description":"Greed and class discrimination threaten the newly formed symbiotic relationship between the wealthy Park family and the destitute Kim clan.",
"directors":[
"Bong Joon Ho"
]
},
{
"title":"Money Heist",
"year":null,
"actors":[
"\u00darsula Corber\u00f3",
"\u00c1lvaro Morte",
"Itziar Itu\u00f1o",
"Pedro Alonso",
"Miguel Herr\u00e1n",
"Jaime Lorente",
"Esther Acebo",
"Enrique Arce",
"Darko Peric",
"Alba Flores",
"Fernando Soto",
"Mario de la Rosa",
"Juan Fern\u00e1ndez",
"Rocco Narva",
"Paco Tous",
"Kiti M\u00e1nver",
"Hovik Keuchkerian",
"Rodrigo De la Serna",
"Najwa Nimri",
"Luka Peros",
"Roberto Garcia",
"Mar\u00eda Pedraza",
"Fernando Cayo",
"Antonio Cuellar Rodriguez",
"Anna Gras",
"Aitana Rinab Perez",
"Olalla Hern\u00e1ndez",
"Carlos Su\u00e1rez",
"Mari Carmen S\u00e1nchez",
"Antonio Romero",
"Pep Munn\u00e9"
],
"genre":[
"Action",
"Crime",
"Mystery",
"Thriller"
],
"description":"An unusual group of robbers attempt to carry out the most perfect robbery in Spanish history - stealing 2.4 billion euros from the Royal Mint of Spain."
},
{
"title":"The Vampire Diaries",
"year":null,
"actors":[
"Paul Wesley",
"Ian Somerhalder",
"Kat Graham",
"Candice King",
"Zach Roerig",
"Michael Trevino",
"Nina Dobrev",
"Steven R. McQueen",
"Matthew Davis",
"Michael Malarkey"
],
"genre":[
"Drama",
"Fantasy",
"Horror",
"Mystery",
"Romance",
"Thriller"
],
"description":"The lives, loves, dangers and disasters in the town, Mystic Falls, Virginia. Creatures of unspeakable horror lurk beneath this town as a teenage girl is suddenly torn between two vampire brothers."
}
]
I want to convert my JSON file to XML; here is my Python code:
import json as j
import xml.etree.ElementTree as ET

with open("imdb_movie_sample.json") as json_format_file:
    data = j.load(json_format_file)

root = ET.Element("movie")
ET.SubElement(root,"title").text = data["title"]
ET.SubElement(root,"year").text = str(data["year"])

actors = ET.SubElement(root,"actors") #.text = data["actors"]
actors.text = ''
for i in jsondata[0]['movie'][0]['actors']:
    actors.text = actors.text + '\n\t\t' + i

genre = ET.SubElement(root,"genre") #.text = data["genre"]
genre.text = ''
for i in jsondata[0]['movie'][0]['genre']:
    genre.text = genre.text + '\n\t\t' + i

ET.SubElement(root,"description").text = data["description"]

directors = ET.SubElement(root,"directors") #.text = data["directors"]
directors.text = ''
for i in jsondata[0]['movie'][0]['directors']:
    directors.text = directors.text + '\n\t\t' + i

tree = ET.ElementTree(root)
tree.write("imdb_sample.xml")
Does anyone know how to help me with this? Thanks.
I found this on PyPI. I would always look on PyPI to see what exists before asking others. It's an awesome resource with Python packages created by tons of developers.
https://pypi.org/project/json2xml/
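A minimal sketch of how the package might be used, going by its PyPI README (verify the exact API against the version you install; the file name is the one from your question):

from json2xml import json2xml
from json2xml.utils import readfromjson

# read the JSON file and convert the whole structure to XML
data = readfromjson("imdb_movie_sample.json")
print(json2xml.Json2xml(data).to_xml())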
In the future (maybe still far away, since I'm still a novice) I want to do data analysis based on the content of the news I get from the Google News RSS feed, but for that I need access to that content, and that is my problem.
Using the URL "https://news.google.cl/news/rss" I have access to data like the title, and the URL of each news item, but the URL is in a format that does not allow me to scrape it (https://news.google.com/__i/rss/rd/articles/CBMilgFod...).
from urllib.request import urlopen
import urllib.request
from bs4 import BeautifulSoup as soup

news_url="https://news.google.cl/news/rss"
Client=urlopen(news_url)
xml_page=Client.read()
Client.close()

soup_page=soup(xml_page,"xml")
news_list=soup_page.findAll("item")

for news in news_list:
    print(news.title.text)
    print("-"*60)
    response = urllib.request.urlopen(news.link.text)
    html = response.read()
    soup = soup(html,"html.parser")
    text = soup.get_text(strip=True)
    print(text)
The last print(text) prints some code like:
if(typeof bbclAM === 'undefined' || !bbclAM.isAM()) {
googletag.display('div-gpt-ad-1418416256666-0');
} else {
document.getElementById('div-gpt-ad-1418416256666-0').style.display = 'none'
}
});(function(s, p, d) {
var h=d.location.protocol, i=p+"-"+s,
e=d.getElementById(i), r=d.getElementById(p+"-root"),
u=h==="https:"?"d1z2jf7jlzjs58.cloudfront.net"
:"static."+p+".com";
if (e) return;
I expect to print the title and the content of each news item from the RSS
This script can get you something to start with (it prints the title, URL, short description and content from each site). The content parsing is only basic - every site has a different format/styling etc.:
import textwrap
import requests
from bs4 import BeautifulSoup

news_url="https://news.google.cl/news/rss"
rss_text=requests.get(news_url).text
soup_page=BeautifulSoup(rss_text,"xml")

def get_items(soup):
    for news in soup.findAll("item"):
        s = BeautifulSoup(news.description.text, 'lxml')
        a = s.select('a')[-1]
        a.extract()   # extract last 'See more on Google News..' link

        html = requests.get(news.link.text)
        soup_content = BeautifulSoup(html.text,"lxml")

        # perform basic sanitization:
        for t in soup_content.select('script, noscript, style, iframe, nav, footer, header'):
            t.extract()

        yield news.title.text.strip(), html.url, s.text.strip(), str(soup_content.select_one('body').text)

width = 80
for (title, url, shorttxt, content) in get_items(soup_page):
    title = '\n'.join(textwrap.wrap(title, width))
    url = '\n'.join(textwrap.wrap(url, width))
    shorttxt = '\n'.join(textwrap.wrap(shorttxt, width))
    content = '\n'.join(textwrap.wrap(textwrap.shorten(content, 1024), width))
    print(title)
    print(url)
    print('-' * width)
    print(shorttxt)
    print()
    print(content)
    print()
Prints:
WWF califica como inaceptable y condenable adulteración de información sobre
salmones de Nova Austral - El Mostrador
https://m.elmostrador.cl/dia/2019/06/30/wwf-califica-como-inaceptable-y-
condenable-adulteracion-de-informacion-sobre-salmones-de-nova-austral/
--------------------------------------------------------------------------------
El MostradorLa organización pide investigar los centros de cultivo de la
salmonera de capitales noruegos y abrirá un proceso formal de quejas. La empresa
ubicada en la ...
01:41:28 WWF califica como inaceptable y condenable adulteración de información
sobre salmones de Nova Austral - El Mostrador País PAÍS WWF califica como
inaceptable y condenable adulteración de información sobre salmones de Nova
Austral por El Mostrador 30 junio, 2019 La organización pide investigar los
centros de cultivo de la salmonera de capitales noruegos y abrirá un proceso
formal de quejas. La empresa ubicada en la Patagonia chilena es acusada de
falsear información oficial ante Sernapesca. 01:41:28 Compartir esta Noticia
Enviar por mail Rectificar Tras una investigación periodística de varios meses,
El Mostrador accedió a abundante información reservada, que incluye correos
electrónicos de la gerencia de producción de la compañía salmonera Nova Austral
–de capitales noruegos– a sus jefes de área, donde se instruye manipular las
estadísticas de mortalidad de los salmones para ocultar las verdaderas cifras a
Sernapesca –la entidad fiscalizadora–, a fin de evitar multas y ver disminuir
las [...]
...and so on.
Clone this project:
git clone git@github.com:philipperemy/google-news-scraper.git gns
cd gns
sudo pip install -r requirements.txt
python main_no_vpn.py
The output will be:
{
"content": "............",
"datetime": "...",
"keyword": "...",
"link": "...",
"title": "..."
},
{
"content": "............",
"datetime": "...",
"keyword": "...",
"link": "...",
"title": "..."
}
Source : Here
In order to access data such as the title, you first need to collect all the news items in a list. Each news item is located in an item tag, and they are inside the channel tag. So let's use this selector:
soup.channel.find_all('item')
After that, you can extract the necessary data for each news.
for result in soup.channel.find_all('item'):
    title = result.title.text
    link = result.link.text
    date = result.pubDate.text
    source = result.source.get("url")
    print(title, link, date, source, sep='\n', end='\n\n')
Also, make sure you send a user-agent request header to act like a "real" user visit. The default requests user-agent is python-requests, and websites understand that it's most likely a script sending the request. Check what your user-agent is.
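As a quick sanity check, this sketch echoes back whatever user-agent requests sends by default (httpbin.org is just a convenient echo service; any equivalent works):

import requests

# httpbin returns the User-Agent header it received
print(requests.get("https://httpbin.org/user-agent").text)
# e.g. {"user-agent": "python-requests/2.28.1"}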
Code and full example in online IDE:
from bs4 import BeautifulSoup
import requests
# https://docs.python-requests.org/en/master/user/quickstart/#passing-parameters-in-urls
params = {
    "hl": "en-US",  # language
    "gl": "US",     # country of the search, US -> USA
    "ceid": "US:en",
}
# https://docs.python-requests.org/en/master/user/quickstart/#custom-headers
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.4896.88 Safari/537.36",
}

html = requests.get("https://news.google.com/rss", params=params, headers=headers, timeout=30)
soup = BeautifulSoup(html.text, "xml")

for result in soup.channel.find_all('item'):
    title = result.title.text
    link = result.link.text
    date = result.pubDate.text
    source = result.source.get("url")
    print(title, link, date, source, sep='\n', end='\n\n')
Output:
UK and Europe Heat Wave News: Live Updates - The New York Times
https://news.google.com/__i/rss/rd/articles/CBMiRGh0dHBzOi8vd3d3Lm55dGltZXMuY29tL2xpdmUvMjAyMi8wNy8xOS93b3JsZC91ay1ldXJvcGUtaGVhdC13ZWF0aGVy0gEA?oc=5
Tue, 19 Jul 2022 11:56:58 GMT
https://www.nytimes.com
... other results
Another way to achieve the same thing is to scrape Google News from the HTML instead.
I want to demonstrate how to scrape Google News using pagination. One of the ways is to use the start URL parameter, which is 0 by default. 0 means the first page, 10 the second, and so on.
Also, a default search returns about ~10-15 pages. To increase the number of returned pages, you need to set the filter parameter to 0 and pass it in the URL, which will return 10+ pages. Basically, this parameter defines the filters for Similar Results and Omitted Results.
While the next button exists, you need to increment the ["start"] parameter value by 10 to access the next page; otherwise, break out of the while loop.
And here is the code:
from bs4 import BeautifulSoup
import requests, lxml
# https://docs.python-requests.org/en/master/user/quickstart/#passing-parameters-in-urls
params = {
    "q": "Elon Musk",
    "hl": "en-US",  # language
    "gl": "US",     # country of the search, US -> USA
    "tbm": "nws",   # google news
    "start": 0,     # page number; 0 is the first page
    # "filter": 0   # shows more than 10 pages. By default up to ~10-15 if filter = 1.
}
# https://docs.python-requests.org/en/master/user/quickstart/#custom-headers
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.4896.88 Safari/537.36",
}

page_num = 0
while True:
    page_num += 1
    print(f"{page_num} page:")

    html = requests.get("https://www.google.com/search", params=params, headers=headers, timeout=30)
    soup = BeautifulSoup(html.text, "lxml")

    for result in soup.select(".WlydOe"):
        source = result.select_one(".NUnG9d").text
        title = result.select_one(".mCBkyc").text
        link = result.get("href")
        try:
            snippet = result.select_one(".GI74Re").text
        except AttributeError:
            snippet = None
        date = result.select_one(".ZE0LJd").text
        print(source, title, link, snippet, date, sep='\n', end='\n\n')

    if soup.select_one('.d6cvqb a[id=pnnext]'):
        params["start"] += 10
    else:
        break
Output:
1 page:
BuzzFeed News
Elon Musk’s Viral Shirtless Photos Have Sparked A Conversation Around
Body-Shaming After Some People Argued That He “Deserves” To See The Memes
Mocking His Physique
https://www.buzzfeednews.com/article/leylamohammed/elon-musk-shirtless-yacht-photos-memes-body-shaming
None
18 hours ago
People
Elon Musk Soaks Up Sun While Spending Time with Pals Aboard Luxury Yacht in
Greece
https://people.com/human-interest/elon-musk-spends-time-with-friends-aboard-luxury-yacht-in-greece/
None
2 days ago
New York Post
Elon Musk jokes shirtless pictures in Mykonos are 'good motivation' to hit
gym
https://nypost.com/2022/07/21/elon-musk-jokes-shirtless-pics-in-mykonos-are-good-motivation/
None
14 hours ago
... other results from the 1st and subsequent pages.
10 page:
Vanity Fair
A Reminder of Just Some of the Terrible Things Elon Musk Has Said and Done
https://www.vanityfair.com/news/2022/04/elon-musk-twitter-terrible-things-hes-said-and-done
... yesterday's news with “shock and dismay,” a lot of people are not
enthused about the idea of Elon Musk buying the social media network.
Apr 26, 2022
CNBC
Elon Musk is buying Twitter. Now what?
https://www.cnbc.com/2022/04/27/elon-musk-just-bought-twitter-now-what.html
Elon Musk has finally acquired Twitter after a weekslong saga during which
he first became the company's largest shareholder, then offered...
Apr 27, 2022
New York Magazine
11 Weird and Upsetting Facts About Elon Musk
https://nymag.com/intelligencer/2022/04/11-weird-and-upsetting-facts-about-elon-musk.html
3. Elon allegedly said some pretty awful things to his first wife · While
dancing at their wedding reception, Musk told Justine, “I am the alpha...
Apr 30, 2022
... other results from 10th page.
If you need more information about Google News, have a look at the Web Scraping Google News with Python blog post.
I have the following Python code to read a JSON file:
import json
from pprint import pprint
with open('traveladvisory.json') as json_data:
    print 'json data ', json_data
    d = json.load(json_data)
    json_data.close()
Below is a piece of the 'traveladvisory.json' file opened with this code. The variable 'd' does print out all the JSON data. But I can't seem to get the syntax correct to read all of the 'country-eng' and 'advisory-text' fields and their data and print it out. Can someone assist? Here's a piece of the json data (sorry, can't get it pretty printed):
{
"metadata":{
"generated":{
"timestamp":1475854624,
"date":"2016-10-07 11:37:04"
}
},
"data":{
"AF":{
"country-id":1000,
"country-iso":"AF",
"country-eng":"Afghanistan",
"country-fra":"Afghanistan",
"advisory-state":3,
"date-published":{
"timestamp":1473866215,
"date":"2016-09-14 11:16:55",
"asp":"2016-09-14T11:16:55.000000-04:00"
},
"has-advisory-warning":1,
"has-regional-advisory":0,
"has-content":1,
"recent-updates-type":"Editorial change",
"eng":{
"name":"Afghanistan",
"url-slug":"afghanistan",
"friendly-date":"September 14, 2016 11:16 EDT",
"advisory-text":"Avoid all travel",
"recent-updates":"The Health tab was updated - travel health notices (Public Health Agency of Canada)."
},
"fra":{
"name":"Afghanistan",
"url-slug":"afghanistan",
"friendly-date":"14 septembre 2016 11:16 HAE",
"advisory-text":"\u00c9viter tout voyage",
"recent-updates":"L'onglet Sant\u00e9 a \u00e9t\u00e9 mis \u00e0 jour - conseils de sant\u00e9 aux voyageurs (Agence de la sant\u00e9 publique du Canada)."
}
},
"AL":{
"country-id":4000,
"country-iso":"AL",
"country-eng":"Albania",
"country-fra":"Albanie",
"advisory-state":0,
"date-published":{
"timestamp":1473350931,
"date":"2016-09-08 12:08:51",
"asp":"2016-09-08T12:08:51.8301256-04:00"
},
"has-advisory-warning":0,
"has-regional-advisory":1,
"has-content":1,
"recent-updates-type":"Editorial change",
"eng":{
"name":"Albania",
"url-slug":"albania",
"friendly-date":"September 8, 2016 12:08 EDT",
"advisory-text":"Exercise normal security precautions (with regional advisories)",
"recent-updates":"An editorial change was made."
},
"fra":{
"name":"Albanie",
"url-slug":"albanie",
"friendly-date":"8 septembre 2016 12:08 HAE",
"advisory-text":"Prendre des mesures de s\u00e9curit\u00e9 normales (avec avertissements r\u00e9gionaux)",
"recent-updates":"Un changement mineur a \u00e9t\u00e9 apport\u00e9 au contenu."
}
},
"DZ":{
"country-id":5000,
"country-iso":"DZ",
"country-eng":"Algeria",
"country-fra":"Alg\u00e9rie",
"advisory-state":1,
"date-published":{
"timestamp":1475593497,
"date":"2016-10-04 11:04:57",
"asp":"2016-10-04T11:04:57.7727548-04:00"
},
"has-advisory-warning":0,
"has-regional-advisory":1,
"has-content":1,
"recent-updates-type":"Full TAA review",
"eng":{
"name":"Algeria",
"url-slug":"algeria",
"friendly-date":"October 4, 2016 11:04 EDT",
"advisory-text":"Exercise a high degree of caution (with regional advisories)",
"recent-updates":"This travel advice was thoroughly reviewed and updated."
},
"fra":{
"name":"Alg\u00e9rie",
"url-slug":"algerie",
"friendly-date":"4 octobre 2016 11:04 HAE",
"advisory-text":"Faire preuve d\u2019une grande prudence (avec avertissements r\u00e9gionaux)",
"recent-updates":"Les pr\u00e9sents Conseils aux voyageurs ont \u00e9t\u00e9 mis \u00e0 jour \u00e0 la suite d\u2019un examen minutieux."
}
},
}
}
Assuming d contains the JSON data. Note that iterating over d["data"] yields the country codes (the keys), so you have to index back into the dict to reach each country's record:
for code in d["data"]:
    country = d["data"][code]
    print "Country :", code
    # country.get() gets the value of the key. The second argument is
    # the value returned in case the key is not present.
    print "country-eng : ", country.get("country-eng", 0)
    print "advisory-text(eng) :", country["eng"].get("advisory-text", 0)
    print "advisory-text(fra) :", country["fra"].get("advisory-text", 0)
This worked for me:
for item in d['data']:
    print d['data'][item]['country-eng'], d['data'][item]['eng']['advisory-text']
If I am understanding your question, here's how to do it:
import json

with open('traveladvisory.json') as json_data:
    d = json.load(json_data)

# print(json.dumps(d, indent=4))  # pretty-print data read
for country in d['data']:
    print(country)
    print('  country-eng: {}'.format(d['data'][country]['country-eng']))
    print('  advisory-state: {}'.format(d['data'][country]['advisory-state']))
Output:
DZ
country-eng: Algeria
advisory-state: 1
AL
country-eng: Albania
advisory-state: 0
AF
country-eng: Afghanistan
advisory-state: 3
You have to use the load function from the json module. The code below works for Python 2 (if you need the Python 3 version, please ask).
#-*- coding: utf-8 -*-
import json  # import the module we need

with open("traveladvisory.json") as f:  # f for file
    d = json.load(f)  # d is a dictionary

for key in d['data']:
    print d['data'][key]['country-eng']
    print d['data'][key]['eng']['advisory-text']
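For reference, a Python 3 sketch of the same loop (the only change is print as a function):

# -*- coding: utf-8 -*-
import json

with open("traveladvisory.json") as f:
    d = json.load(f)

for key in d['data']:
    print(d['data'][key]['country-eng'])
    print(d['data'][key]['eng']['advisory-text'])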
It is good practice to use the with statement when dealing with file objects.
This has the advantage that the file is properly closed after its suite finishes, even if an exception is raised on the way.
Also, the JSON is wrong: you have to remove the comma from line 98 (the trailing comma after the closing brace of the "DZ" block, just before the end of "data"). The corrected file:
{
"metadata":{
"generated":{
"timestamp":1475854624,
"date":"2016-10-07 11:37:04"
}
},
"data":{
"AF":{
"country-id":1000,
"country-iso":"AF",
"country-eng":"Afghanistan",
"country-fra":"Afghanistan",
"advisory-state":3,
"date-published":{
"timestamp":1473866215,
"date":"2016-09-14 11:16:55",
"asp":"2016-09-14T11:16:55.000000-04:00"
},
"has-advisory-warning":1,
"has-regional-advisory":0,
"has-content":1,
"recent-updates-type":"Editorial change",
"eng":{
"name":"Afghanistan",
"url-slug":"afghanistan",
"friendly-date":"September 14, 2016 11:16 EDT",
"advisory-text":"Avoid all travel",
"recent-updates":"The Health tab was updated - travel health notices (Public Health Agency of Canada)."
},
"fra":{
"name":"Afghanistan",
"url-slug":"afghanistan",
"friendly-date":"14 septembre 2016 11:16 HAE",
"advisory-text":"\u00c9viter tout voyage",
"recent-updates":"L'onglet Sant\u00e9 a \u00e9t\u00e9 mis \u00e0 jour - conseils de sant\u00e9 aux voyageurs (Agence de la sant\u00e9 publique du Canada)."
}
},
"AL":{
"country-id":4000,
"country-iso":"AL",
"country-eng":"Albania",
"country-fra":"Albanie",
"advisory-state":0,
"date-published":{
"timestamp":1473350931,
"date":"2016-09-08 12:08:51",
"asp":"2016-09-08T12:08:51.8301256-04:00"
},
"has-advisory-warning":0,
"has-regional-advisory":1,
"has-content":1,
"recent-updates-type":"Editorial change",
"eng":{
"name":"Albania",
"url-slug":"albania",
"friendly-date":"September 8, 2016 12:08 EDT",
"advisory-text":"Exercise normal security precautions (with regional advisories)",
"recent-updates":"An editorial change was made."
},
"fra":{
"name":"Albanie",
"url-slug":"albanie",
"friendly-date":"8 septembre 2016 12:08 HAE",
"advisory-text":"Prendre des mesures de s\u00e9curit\u00e9 normales (avec avertissements r\u00e9gionaux)",
"recent-updates":"Un changement mineur a \u00e9t\u00e9 apport\u00e9 au contenu."
}
},
"DZ":{
"country-id":5000,
"country-iso":"DZ",
"country-eng":"Algeria",
"country-fra":"Alg\u00e9rie",
"advisory-state":1,
"date-published":{
"timestamp":1475593497,
"date":"2016-10-04 11:04:57",
"asp":"2016-10-04T11:04:57.7727548-04:00"
},
"has-advisory-warning":0,
"has-regional-advisory":1,
"has-content":1,
"recent-updates-type":"Full TAA review",
"eng":{
"name":"Algeria",
"url-slug":"algeria",
"friendly-date":"October 4, 2016 11:04 EDT",
"advisory-text":"Exercise a high degree of caution (with regional advisories)",
"recent-updates":"This travel advice was thoroughly reviewed and updated."
},
"fra":{
"name":"Alg\u00e9rie",
"url-slug":"algerie",
"friendly-date":"4 octobre 2016 11:04 HAE",
"advisory-text":"Faire preuve d\u2019une grande prudence (avec avertissements r\u00e9gionaux)",
"recent-updates":"Les pr\u00e9sents Conseils aux voyageurs ont \u00e9t\u00e9 mis \u00e0 jour \u00e0 la suite d\u2019un examen minutieux."
}
}
}
}
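If it helps, the parser itself points at errors like this. A quick check (a sketch using the question's file name; in Python 3 the exception is json.JSONDecodeError):

import json

try:
    with open('traveladvisory.json') as f:
        json.load(f)
except ValueError as e:
    # the error message includes the line/column of the offending
    # comma, e.g. "Expecting property name: line 98 column 5"
    print e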
How could I print which rooms are connected to "de Lobby"? The things I tried returned string errors or other errors.
kamers = {
1 : { "naam" : "de Lobby" ,
"trap" : 2,
"gangrechtdoor" : 3 } ,
2 : { "naam" : "de Trap" ,
"lobby" : 1,
"note" : "Terwijl je de trap oploopt hoor je in de verte Henk van Ommen schreeuwen" } ,
3 : { "naam" : "de Gang rechtdoor" ,
"lobby" : 1,
"gymzaal" : 4,
"concergie" : 5,
"gangaula" : 6 } ,
This prints where you are, but as you can see, not which rooms are connected.
print("Hier ben je: " + kamers[currentKamer]["naam"])
print("hier kan je naartoe: ")
Does this do what you want?
kamers = {
1: {"naam": "de Lobby",
"trap": 2,
"gangrechtdoor": 3},
2: {"naam": "de Trap",
"lobby": 1,
"note": "Terwijl je de trap oploopt hoor je in de verte Henk van Ommen schreeuwen"},
3: {"naam": "de Gang rechtdoor",
"lobby": 1,
"gymzaal": 4,
"concergie": 5,
"gangaula": 6}}
def find_connected_rooms(room_name, rooms):
    room_number = next(room_number for room_number, props in rooms.items() if props['naam'] == room_name)
    for room_props in rooms.values():
        if room_number in room_props.values():
            yield room_props['naam']

if __name__ == '__main__':
    for connected_room in find_connected_rooms('de Lobby', kamers):
        print(connected_room)
Output
de Trap
de Gang rechtdoor
The question is not quite clear, but I assume you are looking for the items that have a lobby key, or any key whose value is 1.
kamers[1] is the lobby, and its "naam" is "de Lobby".
So this selects the items whose values contain 1 (the lobby's key):
[i for i in kamers.values() if 1 in i.values()]
or you can check if the key 'lobby' exists
[i for i in kamers.values() if i.get('lobby',None) ]
To get the names of the rooms, replace "i" with i['naam']:
[i['naam'] for i in kamers.values() if i.get('lobby',None) ]
which returns
['de Trap', 'de Gang rechtdoor']
Initially I had to create a function that receives a person's attributes and returns a structure that looks like this:
Team:
    Name: Real Madrid
    President:
        Name: Florentino Perez
        Age: 70
        Country: Spain
        Office: 001
    Coach:
        Name: Carlo Ancelotti
        Age: 55
        Country: Italy
        Office: 006
        Coach License: 456789545678
    Players:
        - Name: Cristiano Ronaldo
          Age: 30
          Country: Portugal
          Number: 7
          Position: Forward
          Golden Balls: 1
        - Name: Chicharito
          Age: 28
          Country: Mexico
          Number: 14
          Position: Forward
        - Name: James Rodriguez
          Age: 22
          Country: Colombia
          Number: 10
          Position: Midfielder
        - Name: Lucas Modric
          Age: 28
          Country: Croatia
          Number: 19
          Position: Midfielder
This structure also contains info about other clubs. I managed to do this with the following function:
def create_person(name, age, country, **kwargs):
    info = {"Name": name, "Age": age, "Country": country}
    for k, v in kwargs.iteritems():
        info[k] = v
    return info
I used this function to create a list of nested dictionaries and display the right structure for each team. Example:
teams = [
    {
        "Club Name": "Real Madrid",
        "Club President": create_person("Florentino Perez", 70, "Spain", Office="001"),
        "Club's Coach": create_person("Carlo Angelotii", 60, "Italy", Office="006", CoachLicense="456789545678"),
        "Players": {
            "Real_Player1": create_person("Cristiani Ronaldo", 30, "Portugal", Number="7", Position="Forward", GoldenBalls="1"),
            "Real_Player2": create_person("Chicharito", 28, "Mexic", Number="14", Position="Forward"),
            "Real_Player3": create_person("James Rodriguez", 22, "Columbia", Number="10", Position="Midfilder"),
            "Real_Player4": create_person("Lucas Modric", 28, "Croatia", Number="19", Position="Midfilder")
        }
    },
    {
        "Club Name": "Barcelona",
        "Club President": create_person("Josep Maria Bartolomeu", 60, "Spain", Office="B123"),
        "Club's Coach": create_person("Luis Enrique Martinez", 43, "Spain", Office="B405", CoachLicense="22282321231"),
        "Players": {
            "Barcelona_Player1": create_person("Lionel Messi", 28, "Argentina", Number="10", Position="Forward", GoldenBalls="3"),
            "Barcelona_Player2": create_person("Xavi Hernandez", 34, "Spain", Number="6", Position="Midfilder"),
            "Barcelona_Player3": create_person("Dani Alvez", 28, "Brasil", Number="22", Position="Defender"),
            "Barcelona_Player4": create_person("Gerard Pique", 29, "Spain", Number="22", Position="Defender")
        }
    }
]
Everything is fine so far.
The part where I got stuck is this: create a function print_president that receives the team name and prints the following output:
Team: Real Madrid
President: Florentino Perez
Age: 70
Country: Spain
Office: 001
I could use a variable to display this, but I need a function and I don't know how to work around this. Please help!
When you're trying to solve a problem (or ask a question), first simplify as much as you can. Your print_president() function takes a team name and then prints various pieces of information about that team. Each team is a dictionary with various attributes. So a simplified version of the problem might look like this:
teams = [
    {
        'name': 'Real Madrid',
        'pres': 'Florentino',
    },
    {
        'name': 'Barcelona',
        'pres': 'Josep',
    },
]

def print_president(team_name):
    for t in teams:
        # Now, you finish the rest. What should we check here?
        ...

print_president('Barcelona')
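For completeness, here is one possible way to finish that exercise; the equality check on the simplified 'name' key is the part the answer leaves for you to fill in (my assumption about the intended solution):

def print_president(team_name):
    for t in teams:
        if t['name'] == team_name:  # is this the team we were asked about?
            print 'Team:', t['name']
            print 'President:', t['pres']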
I can't think of a way to do this with just a team name, as you will have to know which dict to look at. I think something like this:
def print_president(team):
    print 'Team: {team} President: {president} Age: {age} Country: {country} Office: {office}'.format(
        team=team['Club Name'],
        president=team['Club President']['Name'],
        age=team['Club President']['Age'],
        country=team['Club President']['Country'],
        office=team['Club President']['Office']
    )
If you are thinking of looking through all the teams in the list, then pass in two arguments: teams_list and team_name:
def print_president(teams_list, team_name):
    for team in teams_list:
        if team_name in team.values():
            print 'Team: {team} President: {president} Age: {age} Country: {country} Office: {office}'.format(
                team=team['Club Name'],
                president=team['Club President']['Name'],
                age=team['Club President']['Age'],
                country=team['Club President']['Country'],
                office=team['Club President']['Office']
            )
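A quick usage sketch with the teams list defined in the question (this works because the club name is stored under "Club Name", so it shows up in team.values()):

print_president(teams, "Real Madrid")
# Team: Real Madrid President: Florentino Perez Age: 70 Country: Spain Office: 001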