Scrape only selected text from tables using Python/Beautiful Soup/pandas - python

I am new to Python and am using Beautiful Soup for web scraping on a project.
I am hoping to get only parts of the text into a list/dictionary. I started with the following code:
import requests
from bs4 import BeautifulSoup

url = "http://eng.mizon.co.kr/productlist.asp"
response = requests.get(url)
soup = BeautifulSoup(response.text, 'html.parser')
tables = soup.find_all('table')
This helped me parse the data into tables, and ONE of the items from the table looked as below:
<table border="0" cellpadding="0" cellspacing="0" width="235">
<tr>
<td align="center" height="238"><img alt="LL IN ONE SNAIL REPAIR CREAM, SNAIL REPAIR BLEMISH BALM, WATERMAX MOISTURE B.B CREAM, WATERMAX AQUA GEL CREAM, CORRECT COMBO CREAM, GOLD STARFISH ALL IN ONE CREAM, S-VENOM WRINKLE TOX CREAM, BLACK SNAIL ALL IN ONE CREAM, APPLE SMOOTHIE PEELING GEL, REAL SOYBEAN DEEP CLEANSING OIL, COLLAGEN POWER LIFTING CREAM, SNAIL RECOVERY GEL CREAM" border="0" src="http://www.mizon.co.kr/images/upload/product/20150428113514_3.jpg" width="240"/></td>
</tr>
<tr>
<td align="center" height="43" valign="middle"><a href="javascript:fnMoveDetail(7499)" onfocus="this.blur()"><span class="style3">ENJOY VITAL-UP TIME Lift Up Mask <br/>
Volume:25ml</span></a></td>
</tr>
</table>
For each such item in the table, I would like to extract only the following from the last data cell above:
1) the four-digit number in href="javascript:fnMoveDetail(7499)"
2) the name under class "style3"
3) the volume under class "style3"
The next lines in my code were as follows:
df = pd.read_html(str(tables), skiprows={0}, flavor="bs4")[0]
a_links = soup.find_all('a', attrs={'class': 'style3'})
stnid_dict = {}
for a_link in a_links:
    cid = a_link['href'].split("javascript:fnMoveDetail(")[1].split(")")[0]
    stnid_dict[a_link.text] = cid
My objective is to use the numbers to go to individual links and then match the info scraped on this page to each link.
What would be the best way to approach this?

Use the <a> tag containing the JavaScript href as an anchor: find all the spans, then get each one's parent tag.
import requests
from bs4 import BeautifulSoup

url = "http://eng.mizon.co.kr/productlist.asp"
response = requests.get(url)
soup = BeautifulSoup(response.text, 'html.parser')
spans = soup.select('td > a[href*="javascript:fnMoveDetail"] > span')
for span in spans:
    href = span.find_parent('a').get('href').strip('javascript:fnMoveDetail()')
    name, volume = span.get_text(strip=True).split('Volume:')
    print(name, volume, href)
out:
Dust Clean up Peeling Toner 150ml 8235
Collagen Power Lifting EX Toner 150ml 8067
Collagen Power Lifting EX Emulsion 150ml 8068
Barrier Oil Toner 150ml 8059
Barrier Oil Emulsion 150ml 8060
BLACK CLEAN UP PORE WATER FINISHER 150ml 7650
Vita Lemon Sparkling Toner 150ml 7356
INTENSIVE SKIN BARRIER TONER 150ml 7110
INTENSIVE SKIN BARRIER EMULSION 150ml 7111
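One caveat with the answer above: `str.strip('javascript:fnMoveDetail()')` removes any of those characters from both ends rather than stripping a literal prefix and suffix, so it works here only because the IDs are digits. A regex capture is a safer way to pull the ID out of the href; here is a small offline sketch on a fragment copied from the question (no network call needed):

```python
import re
from bs4 import BeautifulSoup

# Static fragment mirroring one product cell from the page, so this runs offline.
html = '''<td><a href="javascript:fnMoveDetail(7499)" onfocus="this.blur()">
<span class="style3">ENJOY VITAL-UP TIME Lift Up Mask <br/>Volume:25ml</span></a></td>'''

soup = BeautifulSoup(html, 'html.parser')
products = {}
for span in soup.select('a[href*="fnMoveDetail"] > span'):
    href = span.find_parent('a')['href']
    product_id = re.search(r'fnMoveDetail\((\d+)\)', href).group(1)  # capture the ID alone
    name, volume = span.get_text(strip=True).split('Volume:')
    products[product_id] = (name.strip(), volume.strip())

print(products)
```

The resulting `products` dict keyed by ID can then be used to build the per-product detail requests the asker describes.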


Extracting values from HTML in python

My project involves web scraping using Python. In my project I need to get data about a vehicle given its registration. I have managed to get the HTML from the site into Python, but I am struggling to extract the values.
I am using this website: https://www.carcheck.co.uk/audi/N18CTN
from bs4 import BeautifulSoup
import requests
url = "https://www.carcheck.co.uk/audi/N18CTN"
r= requests.get(url)
soup = BeautifulSoup(r.text)
print(soup)
I need to get this information about the vehicle
<tr>
<th>Make</th>
<td>AUDI</td>
</tr>
<tr>
<th>Model</th>
<td>A3</td>
</tr>
<tr>
<th>Colour</th>
<td>Red</td>
</tr>
<tr>
<th>Year of manufacture</th>
<td>2017</td>
</tr>
<tr>
<th>Top speed</th>
<td>147 mph</td>
</tr>
<tr>
<th>Gearbox</th>
<td>6 speed automatic</td>
How would I go about doing this?
Since you don't have extensive experience with BeautifulSoup, an easy approach is to match the table containing the car information with a CSS selector, then extract the header and data cells and combine them into a dictionary:
import requests
from bs4 import BeautifulSoup
url = "https://www.carcheck.co.uk/audi/N18CTN"
soup = BeautifulSoup(requests.get(url).text, "lxml")
# Select the table containing the car information using CSS selector
table = soup.select_one("div.page:nth-child(2) > div:nth-child(4) > div:nth-child(1) > table:nth-child(1)")
# Extract the header cells (<th>) from the table and store their text in a list
headers = [th.text for th in table.select("th")]
# Extract the data cells (<td>) from the table and store their text in a list
data = [td.text for td in table.select("td")]
# Combine headers and data into a dictionary using a dict comprehension
car_info = {key: value for key, value in zip(headers, data)}
print(car_info)
Output:
{'Make': 'AUDI', 'Model': 'A3', 'Colour': 'Red', 'Year of manufacture': '2017', 'Top speed': '147 mph', 'Gearbox': '6 speed automatic'}
In order to obtain the CSS selector pattern of the table, you can use your browser's devtools.
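As a side note, long `:nth-child` chains copied from devtools are brittle and break as soon as the page layout shifts. Pairing each `<th>` with the `<td>` in the same row is sturdier; a sketch on a static fragment shaped like the carcheck table (assuming one header and one data cell per row):

```python
from bs4 import BeautifulSoup

# Static fragment with the assumed shape of the carcheck table, so this runs offline.
html = """<table>
<tr><th>Make</th><td>AUDI</td></tr>
<tr><th>Model</th><td>A3</td></tr>
<tr><th>Colour</th><td>Red</td></tr>
</table>"""

soup = BeautifulSoup(html, "html.parser")
# Pair each header cell with the data cell in the same row
car_info = {row.th.get_text(strip=True): row.td.get_text(strip=True)
            for row in soup.select("tr:has(th):has(td)")}
print(car_info)
```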
You can use this example to get you started on extracting information from this page:
import requests
import pandas as pd
from bs4 import BeautifulSoup
url = 'https://www.carcheck.co.uk/audi/N18CTN'
soup = BeautifulSoup(requests.get(url).content, 'html.parser')
all_data = []
for row in soup.select('tr:has(th):has(td):not(:has(table))'):
    header = row.find_previous('h1').text.strip()
    title = row.th.text.strip()
    text = row.td.text.strip()
    all_data.append((header, title, text))
df = pd.DataFrame(all_data, columns=['Header', 'Title', 'Value'])
print(df.head(20).to_markdown(index=False))
Prints:
| Header                    | Title                   | Value             |
|:--------------------------|:------------------------|:------------------|
| General information       | Make                    | AUDI              |
| General information       | Model                   | A3                |
| General information       | Colour                  | Red               |
| General information       | Year of manufacture     | 2017              |
| General information       | Top speed               | 147 mph           |
| General information       | Gearbox                 | 6 speed automatic |
| Engine & fuel consumption | Power                   | 135 kW / 184 HP   |
| Engine & fuel consumption | Engine capacity         | 1.968 cc          |
| Engine & fuel consumption | Cylinders               | 4                 |
| Engine & fuel consumption | Fuel type               | Diesel            |
| Engine & fuel consumption | Consumption city        | 42.0 mpg          |
| Engine & fuel consumption | Consumption extra urban | 52.3 mpg          |
| Engine & fuel consumption | Consumption combined    | 48.0 mpg          |
| Engine & fuel consumption | CO2 emission            | 129 g/km          |
| Engine & fuel consumption | CO2 label               | D                 |
| MOT history               | MOT expiry date         | 2023-10-27        |
| MOT history               | MOT pass rate           | 83 %              |
| MOT history               | MOT passed              | 5                 |
| MOT history               | Failed MOT tests        | 1                 |
| MOT history               | Total advice items      | 11                |
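Why the `:not(:has(table))` part of the selector matters: pages like this nest tables, and the outer wrapper rows would otherwise match too, since `:has()` looks at all descendants. A minimal offline illustration on hypothetical nested markup:

```python
from bs4 import BeautifulSoup

# Hypothetical nested-table fragment: the outer row only wraps the inner table.
html = '''<table><tr><td><table>
<tr><th>Make</th><td>AUDI</td></tr>
</table></td></tr></table>'''

soup = BeautifulSoup(html, 'html.parser')
# The outer <tr> also "has" a th and a td as descendants, but it also has a
# nested <table>, so :not(:has(table)) filters it out, keeping the inner row.
rows = soup.select('tr:has(th):has(td):not(:has(table))')
print([(r.th.get_text(), r.td.get_text()) for r in rows])
```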

Script produces wrong results when linebreak comes into play

I've written a script in Python to scrape some disorganized content located within b tags and their next_sibling from a webpage. The thing is, my script fails when line breaks come in between. I'm trying to extract the titles and their corresponding descriptions from that page, starting from CHIEF COMPLAINT: Bright red blood per rectum to just before Keywords:.
Website address
I've tried so far with:
import requests
from bs4 import BeautifulSoup
url = 'https://www.mtsamples.com/site/pages/sample.asp?Type=24-Gastroenterology&Sample=941-BloodperRectum'
res = requests.get(url)
soup = BeautifulSoup(res.text,'lxml')
for item in soup.select_one("hr").find_next_siblings('b'):
    print(item.text, item.next_sibling)
The portion of the output giving me unwanted results looks like:
LABS: <br/>
CBC: <br/>
CHEM 7: <br/>
How can I get the titles and their corresponding descriptions?
Here's a scraper that's more robust compared to yesterday's solutions.
How to loop through scraping multiple documents on multiple web pages using BeautifulSoup?
How can I grab the entire body text from a web page using BeautifulSoup?
It extracts the title, description, and all sections properly.
import re
import copy
import requests
from bs4 import BeautifulSoup, Tag, Comment, NavigableString
from pprint import pprint

BASE_URL = 'https://www.mtsamples.com'

def make_soup(url: str) -> BeautifulSoup:
    res = requests.get(url)
    res.raise_for_status()
    html = res.text
    soup = BeautifulSoup(html, 'html.parser')
    return soup

def clean_soup(soup: BeautifulSoup) -> BeautifulSoup:
    soup = copy.copy(soup)
    h1 = soup.select_one('h1')
    kw_re = re.compile('.*Keywords.*', flags=re.IGNORECASE)
    kw = soup.find('b', text=kw_re)
    # drop everything before the <h1> and everything after the Keywords marker
    for el in (*h1.previous_siblings, *kw.next_siblings):
        el.extract()
    kw.extract()
    for ad in soup.select('[id*="ad"]'):
        ad.extract()
    for script in soup.find_all('script'):
        script.extract()
    for c in h1.parent.children:
        if isinstance(c, Comment):
            c.extract()
    return h1.parent

def extract_meta(soup: BeautifulSoup) -> dict:
    h1 = soup.select_one('h1')
    title = h1.text.strip()
    desc_parts = []
    desc_re = re.compile('.*Description.*', flags=re.IGNORECASE)
    desc = soup.find('b', text=desc_re)
    hr = soup.select_one('hr')
    # collect everything between the Description marker and the <hr>
    for s in desc.next_siblings:
        if s is hr:
            break
        if isinstance(s, NavigableString):
            desc_parts.append(str(s).strip())
        elif isinstance(s, Tag):
            desc_parts.append(s.text.strip())
    description = '\n'.join(p.strip() for p in desc_parts if p.strip())
    return {
        'title': title,
        'description': description
    }

def extract_sections(soup: BeautifulSoup) -> list:
    titles = [b for b in soup.select('b') if b.text.isupper()]
    parts = []
    for t in titles:
        title = t.text.strip(': ').title()
        text_parts = []
        for s in t.next_siblings:
            # walk forward until we see another title
            if s in titles:
                break
            if isinstance(s, Comment):
                continue
            if isinstance(s, NavigableString):
                text_parts.append(str(s).strip())
            if isinstance(s, Tag):
                text_parts.append(s.text.strip())
        text = '\n'.join(p for p in text_parts if p.strip())
        p = {
            'title': title,
            'text': text
        }
        parts.append(p)
    return parts

def extract_page(url: str) -> dict:
    soup = make_soup(url)
    clean = clean_soup(soup)
    meta = extract_meta(clean)
    sections = extract_sections(clean)
    return {
        **meta,
        'sections': sections
    }

url = 'https://www.mtsamples.com/site/pages/sample.asp?Type=24-Gastroenterology&Sample=941-BloodperRectum'
page = extract_page(url)
pprint(page, width=2000)
output:
{'description': 'Status post colonoscopy. After discharge, experienced bloody bowel movements and returned to the emergency department for evaluation.\n(Medical Transcription Sample Report)',
'sections': [{'text': 'Bright red blood per rectum', 'title': 'Chief Complaint'},
# some elements removed for brevity
{'text': '', 'title': 'Labs'},
{'text': 'WBC count: 6,500 per mL\nHemoglobin: 10.3 g/dL\nHematocrit:31.8%\nPlatelet count: 248 per mL\nMean corpuscular volume: 86.5 fL\nRDW: 18%', 'title': 'Cbc'},
{'text': 'Sodium: 131 mmol/L\nPotassium: 3.5 mmol/L\nChloride: 98 mmol/L\nBicarbonate: 23 mmol/L\nBUN: 11 mg/dL\nCreatinine: 1.1 mg/dL\nGlucose: 105 mg/dL', 'title': 'Chem 7'},
{'text': 'PT 15.7 sec\nINR 1.6\nPTT 29.5 sec', 'title': 'Coagulation Studies'},
{'text': 'The patient receive ... ula.', 'title': 'Hospital Course'}],
'title': 'Sample Type / Medical Specialty: Gastroenterology\nSample Name: Blood per Rectum'}
Code:
from urllib.request import urlopen
from bs4 import BeautifulSoup
url = 'https://www.mtsamples.com/site/pages/sample.asp?Type=24-Gastroenterology&Sample=941-BloodperRectum'
res = urlopen(url)
html = res.read()
soup = BeautifulSoup(html,'html.parser')
# Cut out the division containing the required text; Right Click > Inspect element
# in the browser was used to find the respective div/tag
sampletext_div = soup.find('div', {'id': "sampletext"})
print(sampletext_div.find('h1').text)  # To print the header
Output:
Sample Type / Medical Specialty: Gastroenterology
Sample Name: Blood per Rectum
Code:
# Find all the <b> tags
b_all = sampletext_div.findAll('b')
for b in b_all[4:]:
    print(b.text, b.next_sibling)
Output:
CHIEF COMPLAINT: Bright red blood per rectum
HISTORY OF PRESENT ILLNESS: This 73-year-old woman had a recent medical history significant for renal and bladder cancer, deep venous thrombosis of the right lower extremity, and anticoagulation therapy complicated by lower gastrointestinal bleeding. Colonoscopy during that admission showed internal hemorrhoids and diverticulosis, but a bleeding site was not identified. Five days after discharge to a nursing home, she again experienced bloody bowel movements and returned to the emergency department for evaluation.
REVIEW OF SYMPTOMS: No chest pain, palpitations, abdominal pain or cramping, nausea, vomiting, or lightheadedness. Positive for generalized weakness and diarrhea the day of admission.
PRIOR MEDICAL HISTORY: Long-standing hypertension, intermittent atrial fibrillation, and hypercholesterolemia. Renal cell carcinoma and transitional cell bladder cancer status post left nephrectomy, radical cystectomy, and ileal loop diversion 6 weeks prior to presentation, postoperative course complicated by pneumonia, urinary tract infection, and retroperitoneal bleed. Deep venous thrombosis 2 weeks prior to presentation, management complicated by lower gastrointestinal bleeding, status post inferior vena cava filter placement.
MEDICATIONS: Diltiazem 30 mg tid, pantoprazole 40 mg qd, epoetin alfa 40,000 units weekly, iron 325 mg bid, cholestyramine. Warfarin discontinued approximately 10 days earlier.
ALLERGIES: Celecoxib (rash).
SOCIAL HISTORY: Resided at nursing home. Denied alcohol, tobacco, and drug use.
FAMILY HISTORY: Non-contributory.
PHYSICAL EXAM: <br/>
LABS: <br/>
CBC: <br/>
CHEM 7: <br/>
COAGULATION STUDIES: <br/>
HOSPITAL COURSE: The patient received 1 liter normal saline and diltiazem (a total of 20 mg intravenously and 30 mg orally) in the emergency department. Emergency department personnel made several attempts to place a nasogastric tube for gastric lavage, but were unsuccessful. During her evaluation, the patient was noted to desaturate to 80% on room air, with an increase in her respiratory rate to 34 breaths per minute. She was administered 50% oxygen by nonrebreadier mask, with improvement in her oxygen saturation to 89%. Computed tomographic angiography was negative for pulmonary embolism.
Keywords:
gastroenterology, blood per rectum, bright red, bladder cancer, deep venous thrombosis, colonoscopy, gastrointestinal bleeding, diverticulosis, hospital course, lower gastrointestinal bleeding, nasogastric tube, oxygen saturation, emergency department, rectum, thrombosis, emergency, department, gastrointestinal, blood, bleeding, oxygen,
NOTE : These transcribed medical transcription sample reports and examples are provided by various users and
are for reference purpose only. MTHelpLine does not certify accuracy and quality of sample reports.
These transcribed medical transcription sample reports may include some uncommon or unusual formats;
this would be due to the preference of the dictating physician. All names and dates have been
changed (or removed) to keep confidentiality. Any resemblance of any type of name or date or
place or anything else to real world is purely incidental.
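The stray <br/> lines in the output above come from the root cause of the question: b.next_sibling returns only the immediately following node, which for those titles is the <br/> itself. Walking next_siblings until the next <b> collects the whole description instead; a sketch on a small hypothetical fragment:

```python
from bs4 import BeautifulSoup, Tag

# Hypothetical fragment: one title followed by <br/> (the problem case), one plain title.
html = "<div><b>LABS:</b> <br/>CBC within normal limits<b>HOSPITAL COURSE:</b> Stable.</div>"
soup = BeautifulSoup(html, 'html.parser')

sections = {}
for b in soup.find_all('b'):
    parts = []
    for sib in b.next_siblings:
        if isinstance(sib, Tag) and sib.name == 'b':  # stop at the next title
            break
        text = sib.get_text() if isinstance(sib, Tag) else str(sib)
        if text.strip():
            parts.append(text.strip())
    sections[b.get_text().rstrip(':')] = ' '.join(parts)

print(sections)
```

The empty <br/> nodes contribute no text and are simply skipped, so each title ends up paired with its full description.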

Extract <label><span> tags from HTML with Python

I would like to extract from a webpage like:
https://www.glassdoor.com/Overview/Working-at-Apple-EI_IE1138.11,16.htm, and I would like to return the result in the following format.
Website Headquarters Size Revenue Type
www.apple.com Cupertino, CA 10000+ employees $10+ billion (USD) per year Company - Public (AAPL)
I then use the following code with BeautifulSoup to get this:
all_href = com_soup.find_all('span', {'class': re.compile('value')})
all_href = list(set(all_href))
It returns tags with <span>, but it doesn't show the tags under <label>:
[<span class="value"> Computer Hardware & Software</span>,
<span class="value"> Company - Public (AAPL) </span>,
<span class="value">10000+ employees</span>,
<span class="value"> $10+ billion (USD) per year</span>,
<span class="value-title" title="4.0"></span>,
<span class="value">Cupertino, CA</span>,
<span class="value"> 1976</span>,
<span class="value-title" title="5.0"></span>,
<span class="value website"><a class="link" href="http://www.apple.com" rel="nofollow noreferrer" target="_blank">www.apple.com</a></span>]
Your BeautifulSoup pull is too specific: you're catching all the "span" tags where class = "value".
When you look at the HTML, you can find that section quickly by searching for the text of some of the fields. What you should do is grab everything inside the div tags where class = 'infoEntity', which contain all 7 fields you're interested in from that "Overview" section.
Within each of those, there is a label tag whose text corresponds to the labels you want above.
So, start with:
from bs4 import BeautifulSoup
data = """
<div class="eep-pill"><p class="tightVert h2 white"><strong>Enhanced</strong> Profile <span class="round ib"><i class="icon-star-white"></i></span></p></div></header><section class="center flex-grid padVertLg eepModal"><h2>Try Enhanced Profile Free for a Month</h2><p>Explore the many benefits of having a premium branded profile on Glassdoor, like increased influence and advanced analytics.</p><div class="margBot"><i class="feaIllustration"></i></div>www.apple.com</span></div><div class='infoEntity'><label>Headquarters</label><span class='value'>Cupertino, CA</span></div><div class='infoEntity'><label>Size</label><span class='value'>10000+ employees</span></div><div class='infoEntity'><label>Founded</label><span class='value'> 1976</span></div><div class='infoEntity'><label>Type</label><span class='value'> Company - Public (AAPL) </span></div><div class='infoEntity'><label>Industry</label><span class='value'> Computer Hardware & Software</span></div><div class='infoEntity'><label>Revenue</label><span class='value'> $10+ billion (USD) per year</span></div></div></div><div class=''><div data-full="We&rsquo;re a diverse collection of thinkers and doers, continually reimagining what&rsquo;s possible to help us all do what we love in new ways. The people who work here have reinvented entire industries with the Mac, iPhone, iPad, and Apple Watch, as well as with services, including iTunes, the App Store, Apple Music, and Apple Pay. And the same passion for innovation that goes into our products also applies to our practices &mdash; strengthening our commitment to leave the world better than we found it." class='margTop empDescription'> We’re a diverse collection of thinkers and doers, continually reimagining what’s possible to help us all do what we love in new ways. The people who work here have reinvented entire industries with the Mac, iPhone, iPad, and Apple Watch, as well as with ... 
<span class='link minor moreLink' id='ExpandDesc'>Read more</span></div><div class='hr'><hr/></div><h3 class='margTop'>Glassdoor Awards</h3>
"""
items = []
soup = BeautifulSoup(data, 'lxml')
get_info = soup.find_all("div", {"class": "infoEntity"})
for item in get_info:
    label = item.find("label")
    value = item.find("span")
    items.append((label.string, value.string))
With that, you get a list of tuples in items, which prints out as:
[('Website', 'www.apple.com'), ('Headquarters', 'Cupertino, CA'), ('Size', '10000+ employees'), ('Founded', ' 1976'), ('Type', ' Company - Public (AAPL) '), ('Industry', ' Computer Hardware & Software'), ('Revenue', ' $10+ billion (USD) per year')]
From there, you can print out that list in any format you like.
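For instance, to reproduce the one-row layout from the question, the tuples can be loaded into a dict (stripping the stray whitespace) and printed in a fixed column order; a sketch using the pairs shown above:

```python
# The (label, value) pairs as produced above; values carry stray whitespace
# from the page, so they are stripped before printing.
items = [('Website', 'www.apple.com'), ('Headquarters', 'Cupertino, CA'),
         ('Size', '10000+ employees'), ('Founded', ' 1976'),
         ('Type', ' Company - Public (AAPL) '),
         ('Industry', ' Computer Hardware & Software'),
         ('Revenue', ' $10+ billion (USD) per year')]

info = {label: value.strip() for label, value in items}
columns = ['Website', 'Headquarters', 'Size', 'Revenue', 'Type']
print('  '.join(columns))
print('  '.join(info[c] for c in columns))
```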
As I notice in https://www.glassdoor.com/Overview/Working-at-Apple-EI_IE1138.11,16.htm
You should find the <div class="infoEntity"> instead of <span class="value"> so as to get what you want.
# find_all returns a ResultSet, which has no find_all of its own, so loop over each matching div
all_href = [tag for div in com_soup.find_all('div', {'class': re.compile('infoEntity')})
            for tag in div.find_all(['span', 'label'])]
all_href = list(set(all_href))
That will return all the <span> and <label> tags you want.
If you want each <span> and <label> to come together, then change it to:
all_href = [x.decode_contents(formatter="html") for x in com_soup.find_all('div', {'class': re.compile('infoEntity')})]
#or
all_href = [[x.find('span'), x.find('label')] for x in com_soup.find_all('div', {'class': re.compile('infoEntity')})]

Further Probing in BeautifulSoup

So I'm pretty new to BeautifulSoup and web scraping in general. I am currently running the code:
attraction_names_full = soup.find_all('td', class_='alt2', align = 'right', height = '28')
Which is returning a list comprising of objects which look like this:
<td align="right" class="alt2" height="28">
A Pirate's Adventure - Treasures of the Seven Seas
<br/>
<span style="font-size: 9px; color: #627DAD; font-style: italic;">
12:00pm to 6:00pm
</span>
</td>
What I am trying to get from this is just the line containing the text, which in this case would be
A Pirate's Adventure - Treasures of the Seven Seas
However, I'm not sure how to go about this, as there don't seem to be any tags surrounding just the text.
I have attempted to see if I can interact with the elements as strings; however, the object type seems to be:
<class 'bs4.element.Tag'>
which I'm not sure how to manipulate, and I am sure there must be a much more efficient way of achieving this.
Any ideas on how to achieve this? For reference, the webpage I'm looking at is:
url = 'https://www.thedibb.co.uk/forums/wait_times.php?a=MK'
You could extract the <span> element and then get the stripped text as follows:
from bs4 import BeautifulSoup
import requests
html = requests.get('https://www.thedibb.co.uk/forums/wait_times.php?a=MK').content
soup = BeautifulSoup(html, "html.parser")
for attraction in soup.find_all('td', class_='alt2', align='right', height='28'):
    attraction.span.extract()
    print(attraction.get_text(strip=True))
Which would give you output starting:
A Pirate's Adventure - Treasures of the Seven Seas
Captain Jack Sparrow's Pirate Tutorial
Jungle Cruise
Meet Characters from Aladdin in Adventureland
Pirates of the Caribbean
Swiss Family Treehouse
import urllib.request
from bs4 import BeautifulSoup

html = urllib.request.urlopen("https://www.thedibb.co.uk/forums/wait_times.php?a=MK").read()
soup = BeautifulSoup(html, 'html.parser')
listElem = list(soup.find_all('td', class_='alt2', align='right', height='28'))
print(listElem[1].contents[0])
You can use .contents; it works for me, and the output is "Captain Jack Sparrow's Pirate Tutorial".
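If you'd rather not mutate the tree with extract(), note that the attraction name is simply the <td>'s first text node, while the time lives in the <span>; a sketch on the fragment from the question (inlined so it runs offline):

```python
from bs4 import BeautifulSoup

# The <td> fragment from the question, inlined as a static string.
html = '''<td align="right" class="alt2" height="28">
A Pirate's Adventure - Treasures of the Seven Seas
<br/>
<span style="font-size: 9px; color: #627DAD; font-style: italic;">
12:00pm to 6:00pm
</span>
</td>'''

soup = BeautifulSoup(html, 'html.parser')
td = soup.td
name = td.find(string=True).strip()    # first text node: the attraction name
hours = td.span.get_text(strip=True)   # the showtime lives in the <span>
print(name, '|', hours)
```

This leaves the parsed tree intact, which matters if you need to revisit the same cells later.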

Scrapy or BeautifulSoup to scrape links and text from various websites

I am trying to scrape the links from an inputted URL, but it's only working for one URL (http://www.businessinsider.com). How can it be adapted to scrape from any URL inputted? I am using BeautifulSoup, but is Scrapy better suited for this?
def WebScrape():
    linktoenter = input('Where do you want to scrape from today?: ')
    url = linktoenter
    html = urllib.request.urlopen(url).read()
    soup = BeautifulSoup(html, "lxml")
    if linktoenter in url:
        print('Retrieving your links...')
        links = {}
        n = 0
        link_title = soup.findAll('a', {'class': 'title'})
        n += 1
        links[n] = link_title
        for eachtitle in link_title:
            print(eachtitle['href'] + "," + eachtitle.string)
    else:
        print('Please enter another Website...')
You could make a more generic scraper, searching for all tags and all links within those tags. Once you have the list of all links, you can use a regular expression or similar to find the links that match your desired structure.
import requests
from bs4 import BeautifulSoup
import re

response = requests.get('http://www.businessinsider.com')
soup = BeautifulSoup(response.content, 'lxml')
# find all tags
tags = soup.find_all()
links = []
# iterate over all tags and extract links
for tag in tags:
    # find all href links under this tag
    tmp = tag.find_all(href=True)
    # append the master links list with each non-empty link
    links.extend(x['href'] for x in tmp if x['href'])
# example: keep only careerbuilder links (list comprehension instead of
# map/filter, which are lazy in Python 3 and would never run)
careerbuilder_links = [x for x in links if re.search(r'[w]{3}\.careerbuilder\.com', x)]
code:
import urllib.request
import bs4

def WebScrape():
    url = input('Where do you want to scrape from today?: ')
    html = urllib.request.urlopen(url).read()
    soup = bs4.BeautifulSoup(html, "lxml")
    title_tags = soup.findAll('a', {'class': 'title'})
    url_titles = [(tag['href'], tag.text) for tag in title_tags]
    if title_tags:
        print('Retrieving your links...')
        for url_title in url_titles:
            print(*url_title)
out:
Where do you want to scrape from today?: http://www.businessinsider.com
Retrieving your links...
http://www.businessinsider.com/trump-china-drone-navy-2016-12 Trump slams China's capture of a US Navy drone as 'unprecedented' act
http://www.businessinsider.com/trump-thank-you-rally-alabama-2016-12 'This is truly an exciting time to be alive'
http://www.businessinsider.com/how-smartwatch-pioneer-pebble-lost-everything-2016-12 How the hot startup that stole Apple's thunder wound up in Silicon Valley's graveyard
http://www.businessinsider.com/china-will-return-us-navy-underwater-drone-2016-12 Pentagon: China will return US Navy underwater drone seized in South China Sea
http://www.businessinsider.com/what-google-gets-wrong-about-driverless-cars-2016-12 Here's the biggest thing Google got wrong about self-driving cars
http://www.businessinsider.com/sheriff-joe-arpaio-still-wants-to-investigate-obamas-birth-certificate-2016-12 Sheriff Joe Arpaio still wants to investigate Obama's birth certificate
http://www.businessinsider.com/rents-dropping-in-new-york-bubble-pop-2016-12 Rents are finally dropping in New York City, and a bubble might be about to pop
http://www.businessinsider.com/trump-david-friedman-ambassador-israel-2016-12 Trump's ambassador pick could drastically alter 2 of the thorniest issues in the US-Israel relationship
http://www.businessinsider.com/can-hackers-be-caught-trump-election-russia-2016-12 Why Trump's assertion that hackers can't be caught after an attack is wrong
http://www.businessinsider.com/theres-a-striking-commonality-between-trump-and-nixon-2016-12 There's a striking commonality between Trump and Nixon
http://www.businessinsider.com/tesla-year-in-review-2016-12 Tesla's biggest moments of 2016
http://www.businessinsider.com/heres-why-using-uber-to-fill-public-transportation-gaps-is-a-bad-idea-2016-12 Here's why using Uber to fill public transportation gaps is a bad idea
http://www.businessinsider.com/useful-hard-adopt-early-morning-rituals-productive-exercise-2016-12 4 morning rituals that are hard to adopt but could really pay off
http://www.businessinsider.com/most-expensive-champagne-bottles-money-can-buy-2016-12 The 11 most expensive Champagne bottles money can buy
http://www.businessinsider.com/innovations-in-radiology-2016-11 5 innovations in radiology that could impact everything from the Zika virus to dermatology
http://www.businessinsider.com/ge-healthcare-mr-freelium-technology-2016-11 A new technology is being developed using just 1% of the finite resource needed for traditional MRIs
