I'm trying to scrape a list from EDGAR.
The information I need is in "td" elements with classes such as "entity-name". However, the code I currently have doesn't return anything. I would appreciate any help. Thanks in advance!
from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from bs4 import BeautifulSoup
s = Service('/PATH/chromedriver')
driver = webdriver.Chrome(service=s)
driver.get("https://www.sec.gov/edgar/search/#/q=%2522cyber%2520insurance%2522&dateRange=custom&category=form-cat1&startdt=2011-01-01&enddt=2022-03-12&filter_forms=10-K")
try:
    WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.CLASS_NAME, 'entity-name')))
except TimeoutException:
    print('Page timed out after 10 secs.')
page = BeautifulSoup(driver.page_source,'html.parser')
print(page)
To extract the text from the entity-name column, induce WebDriverWait for visibility_of_all_elements_located() instead of presence_of_element_located(). You can use either of the following locator strategies:
Using CSS_SELECTOR and text attribute:
driver.get('https://www.sec.gov/edgar/search/#/q=%2522cyber%2520insurance%2522&dateRange=custom&category=form-cat1&startdt=2011-01-01&enddt=2022-03-12&filter_forms=10-K')
print([my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "td.entity-name")))])
Using XPATH and get_attribute("innerHTML"):
driver.get('https://www.sec.gov/edgar/search/#/q=%2522cyber%2520insurance%2522&dateRange=custom&category=form-cat1&startdt=2011-01-01&enddt=2022-03-12&filter_forms=10-K')
print([my_elem.get_attribute("innerHTML") for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//td[@class='entity-name']")))])
Console Output:
['Excel Corp ', 'PROGRESSIVE CORP/OH/ (PGR) ', 'Electromed, Inc. (ELMD) ', 'HOOKER FURNITURE CORP (HOFT) ', 'HOOKER FURNITURE CORP (HOFT) ', 'SOUTHERN CO (SO, SOJA, SOJB, SOJC, SOJD, SOLN) <br> ALABAMA POWER CO (ALPVN, APRCP, APRDM, APRDN, APRDO, APRDP, ALP-PQ) <br> GEORGIA POWER CO (GPJA) <br> MISSISSIPPI POWER CO <br> SOUTHERN Co GAS <br> SOUTHERN POWER CO ', 'HOOKER FURNITURE CORP (HOFT) ', 'SOUTHERN CO (SO, SOJA, SOJB, SOJC, SOJD, SOLN) <br> ALABAMA POWER CO (ALPVN, APRCP, APRDM, APRDN, APRDO, APRDP, ALP-PQ) <br> GEORGIA POWER CO (GPJA) <br> MISSISSIPPI POWER CO <br> SOUTHERN Co GAS <br> SOUTHERN POWER CO ', 'BENCHMARK ELECTRONICS INC (BHE) ', 'MARRIOTT INTERNATIONAL INC /MD/ (MAR) ', 'Sprouts Farmers Market, Inc. (SFM) ', 'CF BANKSHARES INC. (CFBK) ', 'Repay Holdings Corp (RPAY) ', 'Sprouts Farmers Market, Inc. (SFM) ', 'MARRIOTT INTERNATIONAL INC /MD/ (MAR) ', 'Sprouts Farmers Market, Inc. (SFM) ', 'Albertsons Companies, Inc. (ACI) ', 'MARRIOTT INTERNATIONAL INC /MD/ (MAR) ', 'MARRIOTT INTERNATIONAL INC /MD/ (MAR) ', 'HENNESSY ADVISORS INC (HNNA) ', 'Repay Holdings Corp (RPAY, RPAYW) ', 'Repay Holdings Corp (RPAY, RPAYW, TBRGU) ', 'Arlo Technologies, Inc. (ARLO) ', 'Repay Holdings Corp (RPAY, RPAYW) ', 'NATIONAL HEALTH INVESTORS INC (NHI) ', 'MOTORCAR PARTS AMERICA INC (MPAA) ', 'RGC RESOURCES INC (RGCO) ', 'Arlo Technologies, Inc. (ARLO) ', 'CRYOLIFE INC (CRY) ', 'Mimecast Ltd (MIME) ', 'RGC RESOURCES INC (RGCO) ', 'MOTORCAR PARTS AMERICA INC (MPAA) ', 'NOODLES & Co (NDLS) ', 'PAPA JOHNS INTERNATIONAL INC (PZZA) ', 'MOTORCAR PARTS AMERICA INC (MPAA) ', 'MOTORCAR PARTS AMERICA INC (MPAA) ', 'PAPA JOHNS INTERNATIONAL INC (PZZA) ', 'MOTORCAR PARTS AMERICA INC (MPAA) ', 'Sprouts Farmers Market, Inc. (SFM) ', 'MOTORCAR PARTS AMERICA INC (MPAA) ', 'GARMIN LTD (GRMN) ', 'Sprouts Farmers Market, Inc. (SFM) ', 'nDivision Inc. (NDVN) ', 'nDivision Inc. (NDVN) ', 'nDivision Inc. 
(NDVN) ', 'WEYCO GROUP INC (WEYS) ', 'DiamondRock Hospitality Co (DRH) ', 'Pebblebrook Hotel Trust (PEB, PEB-PC, PEB-PD, PEB-PE, PEB-PF) ', 'Sprouts Farmers Market, Inc. (SFM) ', 'MYR GROUP INC. (MYRG) ', 'Chatham Lodging Trust (CLDT, CLDT-PA) ', 'WEYCO GROUP INC (WEYS) ', 'INFINITE GROUP INC (IMCI) ', 'DiamondRock Hospitality Co (DRH) ', 'Pebblebrook Hotel Trust (PEB, PEB-PC, PEB-PD, PEB-PE, PEB-PF) ', 'DiamondRock Hospitality Co (DRH, DRH-PA) ', 'Pebblebrook Hotel Trust (PEB, PEB-PC, PEB-PD, PEB-PE, PEB-PF) ', 'DLH Holdings Corp. (DLHC) ', 'Summit Hotel Properties, Inc. (INN) ', 'BOYD GAMING CORP (BYD) ', 'Summit Hotel Properties, Inc. (INN) ', 'DiamondRock Hospitality Co (DRH, DRH-PA) ', 'CINCINNATI FINANCIAL CORP (CINF) ', 'Summit Hotel Properties, Inc. (INN) ', 'Pebblebrook Hotel Trust (PEB, PEB-PC, PEB-PD, PEB-PE, PEB-PF) ', 'ARTIVION, INC. (AORT) ', 'STAR GROUP, L.P. (SGU) ', 'Pebblebrook Hotel Trust (PEB, PEB-PE, PEB-PF, PEB-PG, PEB-PH) ', 'RGC RESOURCES INC (RGCO) ', 'INFINITE GROUP INC (IMCI) ', 'LEGGETT & PLATT INC (LEG) ', 'RGC RESOURCES INC (RGCO) ', 'COSTCO WHOLESALE CORP /NEW (COST) ', 'DLH Holdings Corp. (DLHC) ', 'CANTERBURY PARK HOLDING CORP ', 'WEYCO GROUP INC (WEYS) ', 'DLH Holdings Corp. (DLHC) ', 'WEYCO GROUP INC (WEYS) ', 'Canterbury Park Holding Corp (CPHC) ', 'RGC RESOURCES INC (RGCO) ', 'IEC ELECTRONICS CORP (IEC) ', 'INFINITE GROUP INC (IMCI) ', 'Canterbury Park Holding Corp (CPHC) ', 'WEYCO GROUP INC (WEYS) ', 'Canterbury Park Holding Corp (CPHC) ', 'AMERICAN STATES WATER CO (AWR) <br> Golden State Water CO ', 'LEGGETT & PLATT INC (LEG) ', 'Vy Global Growth (VYGG, VYGG-UN, VYGG-WT) ', 'Summit Hotel Properties, Inc. (INN) ', 'Vy Global Growth (VYGG, VYGG-UN, VYGG-WT) ', 'Sunstone Hotel Investors, Inc. (SHO, SHO-PE, SHO-PF) ', 'CRYOLIFE INC (CRY) ', 'BOYD GAMING CORP (BYD) ', 'Sunstone Hotel Investors, Inc. (SHO, SHO-PE, SHO-PF) ', 'Summit Hotel Properties, Inc. (INN, INN-PE, INN-PF) ', 'Green Bancorp, Inc. 
(GNBC) ', 'TELKONET INC (TKOI) ', 'COHEN & STEERS INC (CNS) ', 'Sunstone Hotel Investors, Inc. (SHO, SHO-PE, SHO-PF) ', 'Green Bancorp, Inc. (GNBC) ']
Note: You have to add the following imports:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
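The innerHTML variant returns raw markup, so cells that list several entities still contain `<br>` separators, as in the console output above. A minimal sketch of one way to split such cells into individual names, using only the standard library. The helper name `split_entities` is made up for illustration, and `html.unescape` is only a precaution in case the markup contains entities such as `&amp;`:

```python
import html
import re


def split_entities(cell_html):
    """Split one innerHTML string on <br> tags and tidy each entity name."""
    parts = re.split(r"<br\s*/?>", cell_html, flags=re.IGNORECASE)
    # Drop empty fragments, decode HTML entities, trim surrounding whitespace.
    return [html.unescape(part).strip() for part in parts if part.strip()]


# Two cells copied from the console output above.
cells = [
    "Excel Corp ",
    "AMERICAN STATES WATER CO (AWR) <br> Golden State Water CO ",
]

entities = [name for cell in cells for name in split_entities(cell)]
print(entities)
```

With the `.text` variant this step should not be needed, since Selenium returns the rendered text rather than the markup.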
I am working on a web scraping project. My goal is to scrape the Shanghai university ranking to get the name, country and rank of each university. Right now I am only focusing on the name.
import requests
from bs4 import BeautifulSoup
arwu = open('arwu.txt', 'a')
arwu.truncate()
universities = []
#Gets the url from which it should web scrape
url = 'https://www.shanghairanking.com/rankings/arwu/2021.html'
response = requests.get(url)
#initializes the bs4 html parser
soup = BeautifulSoup(response.text, "html.parser")
#retrieves all the university names that are displayed and formats them
def find_universities():
    for university in range(len(soup.findAll(class_='global-univ'))):
        one_a_tag = str(soup.findAll(class_='global-univ')[university].text)
        one_a_tag = one_a_tag[len(one_a_tag)//2+16:]
        universities.append(str(one_a_tag))
    return universities
universities = find_universities()
for x in range(len(universities)):
    arwu.write(universities[x] + "\n")
arwu.close()
As of right now, this only retrieves the first 30 universities displayed on the first page. How can I access the other pages?
The data on the following pages is loaded dynamically by JavaScript, which is why BeautifulSoup alone can't parse it. To grab the data from the next pages, you need an automation tool such as Selenium. Here I use Selenium together with BeautifulSoup to extract the data from the next pages, and it works fine.
import time
from selenium import webdriver
from selenium.webdriver.common.by import By
from bs4 import BeautifulSoup
import pandas as pd
driver = webdriver.Chrome('chromedriver.exe')
driver.maximize_window()
time.sleep(8)
url = 'https://www.shanghairanking.com/rankings/arwu/2021'
driver.get(url)
time.sleep(4)
universities = []
while True:  # loop over the pages
    # parse whatever the browser has rendered for the current page
    soup = BeautifulSoup(driver.page_source, 'lxml')
    for university in soup.select('div.link-container > a'):
        un = university.select_one('span.univ-name')
        versity = un.get_text(strip=True) if un else None
        print(versity)
    print("-" * 85)
    # next page (find_element(By.XPATH, ...) replaces the find_element_by_xpath helper removed in Selenium 4)
    next_page = driver.find_element(By.XPATH, '(//a[@class="ant-pagination-item-link"])[3]')
    if next_page:
        next_page.click()
        time.sleep(2)
    else:
        break
Output:
Harvard University
Stanford University
University of Cambridge
Massachusetts Institute of Technology (MIT)
University of California, Berkeley
Princeton University
University of Oxford
Columbia University
California Institute of Technology
University of Chicago
Yale University
Cornell University
Paris-Saclay University
University of California, Los Angeles
University of Pennsylvania
Johns Hopkins University
University College London
University of California, San Diego
University of Washington
University of California, San Francisco
ETH Zurich
University of Toronto
Washington University in St. Louis
The University of Tokyo
Imperial College London
New York University
Tsinghua University
University of North Carolina at Chapel Hill
University of Copenhagen
None
-------------------------------------------------------------------------------------
University of Wisconsin - Madison
Duke University
The University of Melbourne
Northwestern University
Sorbonne University
The University of Manchester
Kyoto University
PSL University
The University of Edinburgh
University of Minnesota, Twin Cities
The University of Texas at Austin
Karolinska Institute
Rockefeller University
University of British Columbia
Peking University
University of Colorado at Boulder
King's College London
The University of Texas Southwestern Medical Center at Dallas
University of Munich
Utrecht University
The University of Queensland
Technical University of Munich
Zhejiang University
University of Zurich
University of Illinois at Urbana-Champaign
University of Maryland, College Park
Heidelberg University
University of California, Santa Barbara
Shanghai Jiao Tong University
University of Geneva
None
-------------------------------------------------------------------------------------
University of Oslo
University of Southern California
University of Science and Technology of China
University of Groningen
The University of New South Wales
Vanderbilt University
McGill University
The University of Texas M. D. Anderson Cancer Center
University of Sydney
University of California, Irvine
Aarhus University
Ghent University
University of Paris
Stockholm University
National University of Singapore
The Australian National University
Fudan University
University of Bristol
Uppsala University
Monash University
Nanyang Technological University
University of Helsinki
Leiden University
Nagoya University
University of Bonn
Purdue University - West Lafayette
KU Leuven
University of Basel
Sun Yat-sen University
The Hebrew University of Jerusalem
None
-------------------------------------------------------------------------------------
Swiss Federal Institute of Technology Lausanne
McMaster University
Weizmann Institute of Science
Technion-Israel Institute of Technology
Boston University
The University of Western Australia
Carnegie Mellon University
Moscow State University
University of Florida
University of California, Davis
Aix Marseille University
Arizona State University
Brown University
Case Western Reserve University
Emory University
Erasmus University Rotterdam
Georgia Institute of Technology
Huazhong University of Science and Technology
Icahn School of Medicine at Mount Sinai
Indiana University Bloomington
King Abdulaziz University
King Saud University
Mayo Clinic Alix School of Medicine
Michigan State University
Nanjing University
Norwegian University of Science and Technology - NTNU
Pennsylvania State University - University Park
Radboud University Nijmegen
Rice University
Rutgers, The State University of New Jersey - New Brunswick
None
-------------------------------------------------------------------------------------
Seoul National University
The Chinese University of Hong Kong
The Ohio State University - Columbus
The University of Adelaide
The University of Hong Kong
The University of Sheffield
Tokyo Institute of Technology
Université Grenoble Alpes
Université libre de Bruxelles (ULB)
University of Alberta
University of Amsterdam
University of Arizona
University of Bern
University of Birmingham
University of Freiburg
University of Goettingen
University of Gothenburg
University of Lausanne
University of Leeds
University of Liverpool
University of Montreal
University of Nottingham
University of Pittsburgh
University of Sao Paulo
University of Strasbourg
University of Utah
University of Warwick
Vrije Universiteit Amsterdam
Wageningen University & Research
Xi'an Jiaotong University
None
-------------------------------------------------------------------------------------
University of Houston
University of Illinois at Chicago
University of Innsbruck
University of Iowa
University of Kansas
University of Kiel
University of Leipzig
University of Lisbon
University of Lorraine
University of Mainz
University of Massachusetts Amherst
University of Massachusetts Medical School - Worcester
University of Miami
University of Missouri - Columbia
University of Nebraska - Lincoln
University of Ottawa
University of Science and Technology Beijing
University of South Florida
University of Technology Sydney
University of Tennessee - Knoxville
University of Tsukuba
University of Turin
University of Wollongong
University of Wuerzburg
Virginia Commonwealth University
Virginia Polytechnic Institute and State University
Vrije Universiteit Brussel (VUB)
Western University
Xiamen University
Yonsei University
None
-------------------------------------------------------------------------------------
Indian Institute of Science
Istanbul University
Jagiellonian University
Jinan University
Kansas State University
King Fahd University of Petroleum & Minerals
Kobe University
Kyung Hee University
Mahidol University
Medical University of Innsbruck
Nanjing Normal University
Nanjing University of Information Science & Technology
National University of Ireland, Galway
National Yang Ming Chiao Tung University
Northern Arizona University
Okayama University
Pohang University of Science and Technology
Pompeu Fabra University
Pusan National University
Qingdao University
Queen's University Belfast
Rensselaer Polytechnic Institute
Saint Louis University
Scuola Normale Superiore - Pisa
Shandong University of Science and Technology
ShanghaiTech University
South China Agricultural University
Southern Medical University
None
-------------------------------------------------------------------------------------
University of Kent
University of Konstanz
University of Ljubljana
University of Navarra
University of Nevada - Reno
University of New Hampshire
University of Oklahoma - Norman
University of Palermo
University of Parma
University of Plymouth
University of Portsmouth
University of Regensburg
University of Rennes 1
University of Roma - Tor Vergata
University of Rostock
University of Salerno
University of Sherbrooke
University of Siena
Tampere University
University of Tromso
University of Ulsan
University of Verona
University of Vigo
University of Zaragoza
University Rovira i Virgili
Waseda University
Wenzhou Medical University
Yunnan University
Zhejiang University of Technology
None
-------------------------------------------------------------------------------------
Dalian Maritime University
Dokuz Eylul University
Federal University of Sao Carlos
Fluminense Federal University
Fujian Agriculture and Forestry University
Fujian Medical University
Fujian Normal University
Graz University of Technology
Guangxi University
Hacettepe University
Henan University
Indian Institute of Technology Delhi
Indian Institute of Technology Kharagpur
Indian Institute of Technology Madras
INHA University
Jawaharlal Nehru University
Kanazawa University
Kaohsiung Medical University
Kindai University
Kunming University of Science and Technology
Lincoln University
Mansoura University
Massey University
Medical University of Warsaw
National Research Nuclear University MEPhI (Moscow Engineering Physics Institute)
New Jersey Institute of Technology
New Mexico State University
Ningbo University
North China Electric Power University
None
-------------------------------------------------------------------------------------
Uniformed Services University of the Health Sciences
Universidad Andrés Bello
Universidad de Las Palmas de Gran Canaria
Universidad Pablo de Olavide
Université Gustave Eiffel
University of Agriculture Faisalabad
University of Alcalá
University of Cagliari
University of Concepcion
University of Cordoba
University of Engineering and Technology (UET)
University of Girona
University of Greenwich
University of Hull
University of L'Aquila
University of North Carolina at Greensboro
University of Savoy
University of St. Gallen
University of Stirling
University of Tabriz
University of the Punjab
University of Thessaly
University of Urbino
University of Veterinary Medicine Vienna
Vellore Institute of Technology
Warsaw University of Life Sciences
Westlake University
Wroclaw Medical University
Yanshan University
Zagazig University
None
-------------------------------------------------------------------------------------
University of Ulster
University of Valladolid
University of Wuppertal
University Paris Est Creteil
Vilnius Gediminas Technical University
Warsaw University of Technology
Williams College
Wroclaw University of Science and Technology
Wuhan University of Science and Technology
Yantai University
None
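The `None` lines in the output above most likely come from anchors under `div.link-container` that have no `span.univ-name` child, since the loop prints `None` in that case. A minimal sketch of skipping them and collecting the names into a list instead of printing. The HTML snippet is made-up markup standing in for `driver.page_source` (the real page may differ), and `extract_names` is a hypothetical helper name:

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for driver.page_source.
sample_html = """
<div class="link-container"><a href="/a"><span class="univ-name">Harvard University</span></a></div>
<div class="link-container"><a href="/b"><span class="univ-name"> Stanford University </span></a></div>
<div class="link-container"><a href="/c">logo only</a></div>
"""


def extract_names(page_source):
    """Return the university names found in one rendered page."""
    soup = BeautifulSoup(page_source, "html.parser")
    names = []
    for link in soup.select("div.link-container > a"):
        span = link.select_one("span.univ-name")
        if span is None:
            # Anchor without a name span: skip it instead of recording None.
            continue
        names.append(span.get_text(strip=True))
    return names


names = extract_names(sample_html)
print(names)
```

Inside the pagination loop, something like `universities.extend(extract_names(driver.page_source))` would then fill the `universities` list that the answer declares but never uses.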
I'm having a weird problem that I'm not sure how to approach. I am calling a JSON endpoint and I get this:
{'Address': 'One Apple Park Way, Cupertino, CA, United States, 95014',
'AddressData': {'City': 'Cupertino',
'Country': 'United States',
'State': 'CA',
'Street': 'One Apple Park Way',
'ZIP': '95014'},
'CIK': '0000320193',
'CUSIP': '037833100',
'Code': 'AAPL',
'CountryISO': 'US',
'CountryName': 'USA',
'CurrencyCode': 'USD',
'CurrencyName': 'US Dollar',
'CurrencySymbol': '$',
'Description': 'Apple Inc. designs, manufactures, and markets smartphones, personal computers, tablets, wearables, and accessories worldwide. It also sells various related services. The company offers iPhone, a line of smartphones; Mac, a line of personal computers; iPad, a line of multi-purpose tablets; and wearables, home, and accessories comprising AirPods, Apple TV, Apple Watch, Beats products, HomePod, iPod touch, and other Apple-branded and third-party accessories. It also provides AppleCare support services; cloud services store services; and operates various platforms, including the App Store, that allow customers to discover and download applications and digital content, such as books, music, video, games, and podcasts. In addition, the company offers various services, such as Apple Arcade, a game subscription service; Apple Music, which offers users a curated listening experience with on-demand radio stations; Apple News+, a subscription news and magazine service; Apple TV+, which offers exclusive original content; Apple Card, a co-branded credit card; and Apple Pay, a cashless payment service, as well as licenses its intellectual property. The company serves consumers, and small and mid-sized businesses; and the education, enterprise, and government markets. It sells and delivers third-party applications for its products through the App Store. The company also sells its products through its retail and online stores, and direct sales force; and third-party cellular network carriers, wholesalers, retailers, and resellers. Apple Inc. was founded in 1977 and is headquartered in Cupertino, California.',
'EmployerIdNumber': '94-2404110',
'Exchange': 'NASDAQ',
'FiscalYearEnd': 'September',
'FullTimeEmployees': 147000,
'GicGroup': 'Technology Hardware & Equipment',
'GicIndustry': 'Technology Hardware, Storage & Peripherals',
'GicSector': 'Information Technology',
'GicSubIndustry': 'Technology Hardware, Storage & Peripherals',
'HomeCategory': 'Domestic',
'IPODate': '1980-12-12',
'ISIN': 'US0378331005',
'Industry': 'Consumer Electronics',
'InternationalDomestic': 'International/Domestic',
'IsDelisted': False,
'Listings': {'0': {'Code': '0R2V', 'Exchange': 'LSE', 'Name': '0R2V'},
'1': {'Code': 'AAPL', 'Exchange': 'BA', 'Name': 'Apple Inc. CEDEAR'},
'2': {'Code': 'AAPL34', 'Exchange': 'SA', 'Name': 'Apple Inc'}},
'LogoURL': '/img/logos/US/aapl.png',
'Name': 'Apple Inc',
'Officers': {'0': {'Name': 'Mr. Timothy D. Cook',
'Title': 'CEO & Director',
'YearBorn': '1961'},
'1': {'Name': 'Mr. Luca Maestri',
'Title': 'CFO & Sr. VP',
'YearBorn': '1964'},
'2': {'Name': 'Mr. Jeffrey E. Williams',
'Title': 'Chief Operating Officer',
'YearBorn': '1964'},
'3': {'Name': 'Ms. Katherine L. Adams',
'Title': 'Sr. VP, Gen. Counsel & Sec.',
'YearBorn': '1964'},
'4': {'Name': "Ms. Deirdre O'Brien",
'Title': 'Sr. VP of People & Retail',
'YearBorn': '1967'},
'5': {'Name': 'Mr. Chris Kondo',
'Title': 'Sr. Director of Corp. Accounting',
'YearBorn': 'NA'},
'6': {'Name': 'Mr. James Wilson',
'Title': 'Chief Technology Officer',
'YearBorn': 'NA'},
'7': {'Name': 'Ms. Mary Demby',
'Title': 'Chief Information Officer',
'YearBorn': 'NA'},
'8': {'Name': 'Ms. Nancy Paxton',
'Title': 'Sr. Director of Investor Relations & Treasury',
'YearBorn': 'NA'},
'9': {'Name': 'Mr. Greg Joswiak',
'Title': 'Sr. VP of Worldwide Marketing',
'YearBorn': 'NA'}},
'Phone': '408-996-1010',
'Sector': 'Technology',
'Type': 'Common Stock',
'UpdatedAt': '2021-02-25',
'WebURL': 'http://www.apple.com'}
You can see it is mostly flat, except that a few keys have nested values (AddressData, Listings and Officers). When I convert it to a dataframe I get:
Code Type Name Exchange CurrencyCode CurrencyName CurrencySymbol CountryName CountryISO ISIN CUSIP CIK EmployerIdNumber FiscalYearEnd IPODate InternationalDomestic Sector Industry GicSector GicGroup GicIndustry GicSubIndustry HomeCategory IsDelisted Description Address AddressData Listings Officers Phone WebURL LogoURL FullTimeEmployees UpdatedAt
Street AAPL Common Stock Apple Inc NASDAQ USD US Dollar $ USA US US0378331005 037833100 0000320193 94-2404110 September 1980-12-12 International/Domestic Technology Consumer Electronics Information Technology Technology Hardware & Equipment Technology Hardware, Storage & Peripherals Technology Hardware, Storage & Peripherals Domestic False Apple Inc. designs, manufactures, and markets ... One Apple Park Way, Cupertino, CA, United Stat... One Apple Park Way NaN NaN 408-996-1010 http://www.apple.com /img/logos/US/aapl.png 147000 2021-02-25
City AAPL Common Stock Apple Inc NASDAQ USD US Dollar $ USA US US0378331005 037833100 0000320193 94-2404110 September 1980-12-12 International/Domestic Technology Consumer Electronics Information Technology Technology Hardware & Equipment Technology Hardware, Storage & Peripherals Technology Hardware, Storage & Peripherals Domestic False Apple Inc. designs, manufactures, and markets ... One Apple Park Way, Cupertino, CA, United Stat... Cupertino NaN NaN 408-996-1010 http://www.apple.com /img/logos/US/aapl.png 147000 2021-02-25
State AAPL Common Stock Apple Inc NASDAQ USD US Dollar $ USA US US0378331005 037833100 0000320193 94-2404110 September 1980-12-12 International/Domestic Technology Consumer Electronics Information Technology Technology Hardware & Equipment Technology Hardware, Storage & Peripherals Technology Hardware, Storage & Peripherals Domestic False Apple Inc. designs, manufactures, and markets ... One Apple Park Way, Cupertino, CA, United Stat... CA NaN NaN 408-996-1010 http://www.apple.com /img/logos/US/aapl.png 147000 2021-02-25
Country AAPL Common Stock Apple Inc NASDAQ USD US Dollar $ USA US US0378331005 037833100 0000320193 94-2404110 September 1980-12-12 International/Domestic Technology Consumer Electronics Information Technology Technology Hardware & Equipment Technology Hardware, Storage & Peripherals Technology Hardware, Storage & Peripherals Domestic False Apple Inc. designs, manufactures, and markets ... One Apple Park Way, Cupertino, CA, United Stat... United States NaN NaN 408-996-1010 http://www.apple.com /img/logos/US/aapl.png 147000 2021-02-25
ZIP AAPL Common Stock Apple Inc NASDAQ USD US Dollar $ USA US US0378331005 037833100 0000320193 94-2404110 September 1980-12-12 International/Domestic Technology Consumer Electronics Information Technology Technology Hardware & Equipment Technology Hardware, Storage & Peripherals Technology Hardware, Storage & Peripherals Domestic False Apple Inc. designs, manufactures, and markets ... One Apple Park Way, Cupertino, CA, United Stat... 95014 NaN NaN 408-996-1010 http://www.apple.com /img/logos/US/aapl.png 147000 2021-02-25
0 AAPL Common Stock Apple Inc NASDAQ USD US Dollar $ USA US US0378331005 037833100 0000320193 94-2404110 September 1980-12-12 International/Domestic Technology Consumer Electronics Information Technology Technology Hardware & Equipment Technology Hardware, Storage & Peripherals Technology Hardware, Storage & Peripherals Domestic False Apple Inc. designs, manufactures, and markets ... One Apple Park Way, Cupertino, CA, United Stat... NaN {'Code': '0R2V', 'Exchange': 'LSE', 'Name': '0... {'Name': 'Mr. Timothy D. Cook', 'Title': 'CEO ... 408-996-1010 http://www.apple.com /img/logos/US/aapl.png 147000 2021-02-25
1 AAPL Common Stock Apple Inc NASDAQ USD US Dollar $ USA US US0378331005 037833100 0000320193 94-2404110 September 1980-12-12 International/Domestic Technology Consumer Electronics Information Technology Technology Hardware & Equipment Technology Hardware, Storage & Peripherals Technology Hardware, Storage & Peripherals Domestic False Apple Inc. designs, manufactures, and markets ... One Apple Park Way, Cupertino, CA, United Stat... NaN {'Code': 'AAPL', 'Exchange': 'BA', 'Name': 'Ap... {'Name': 'Mr. Luca Maestri', 'Title': 'CFO & ... 408-996-1010 http://www.apple.com /img/logos/US/aapl.png 147000 2021-02-25
2 AAPL Common Stock Apple Inc NASDAQ USD US Dollar $ USA US US0378331005 037833100 0000320193 94-2404110 September 1980-12-12 International/Domestic Technology Consumer Electronics Information Technology Technology Hardware & Equipment Technology Hardware, Storage & Peripherals Technology Hardware, Storage & Peripherals Domestic False Apple Inc. designs, manufactures, and markets ... One Apple Park Way, Cupertino, CA, United Stat... NaN {'Code': 'AAPL34', 'Exchange': 'SA', 'Name': '... {'Name': 'Mr. Jeffrey E. Williams', 'Title': '... 408-996-1010 http://www.apple.com /img/logos/US/aapl.png 147000 2021-02-25
3 AAPL Common Stock Apple Inc NASDAQ USD US Dollar $ USA US US0378331005 037833100 0000320193 94-2404110 September 1980-12-12 International/Domestic Technology Consumer Electronics Information Technology Technology Hardware & Equipment Technology Hardware, Storage & Peripherals Technology Hardware, Storage & Peripherals Domestic False Apple Inc. designs, manufactures, and markets ... One Apple Park Way, Cupertino, CA, United Stat... NaN NaN {'Name': 'Ms. Katherine L. Adams', 'Title': 'S... 408-996-1010 http://www.apple.com /img/logos/US/aapl.png 147000 2021-02-25
4 AAPL Common Stock Apple Inc NASDAQ USD US Dollar $ USA US US0378331005 037833100 0000320193 94-2404110 September 1980-12-12 International/Domestic Technology Consumer Electronics Information Technology Technology Hardware & Equipment Technology Hardware, Storage & Peripherals Technology Hardware, Storage & Peripherals Domestic False Apple Inc. designs, manufactures, and markets ... One Apple Park Way, Cupertino, CA, United Stat... NaN NaN {'Name': 'Ms. Deirdre O'Brien', 'Title': 'Sr.... 408-996-1010 http://www.apple.com /img/logos/US/aapl.png 147000 2021-02-25
5 AAPL Common Stock Apple Inc NASDAQ USD US Dollar $ USA US US0378331005 037833100 0000320193 94-2404110 September 1980-12-12 International/Domestic Technology Consumer Electronics Information Technology Technology Hardware & Equipment Technology Hardware, Storage & Peripherals Technology Hardware, Storage & Peripherals Domestic False Apple Inc. designs, manufactures, and markets ... One Apple Park Way, Cupertino, CA, United Stat... NaN NaN {'Name': 'Mr. Chris Kondo', 'Title': 'Sr. Dir... 408-996-1010 http://www.apple.com /img/logos/US/aapl.png 147000 2021-02-25
6 AAPL Common Stock Apple Inc NASDAQ USD US Dollar $ USA US US0378331005 037833100 0000320193 94-2404110 September 1980-12-12 International/Domestic Technology Consumer Electronics Information Technology Technology Hardware & Equipment Technology Hardware, Storage & Peripherals Technology Hardware, Storage & Peripherals Domestic False Apple Inc. designs, manufactures, and markets ... One Apple Park Way, Cupertino, CA, United Stat... NaN NaN {'Name': 'Mr. James Wilson', 'Title': 'Chief ... 408-996-1010 http://www.apple.com /img/logos/US/aapl.png 147000 2021-02-25
7 AAPL Common Stock Apple Inc NASDAQ USD US Dollar $ USA US US0378331005 037833100 0000320193 94-2404110 September 1980-12-12 International/Domestic Technology Consumer Electronics Information Technology Technology Hardware & Equipment Technology Hardware, Storage & Peripherals Technology Hardware, Storage & Peripherals Domestic False Apple Inc. designs, manufactures, and markets ... One Apple Park Way, Cupertino, CA, United Stat... NaN NaN {'Name': 'Ms. Mary Demby', 'Title': 'Chief In... 408-996-1010 http://www.apple.com /img/logos/US/aapl.png 147000 2021-02-25
8 AAPL Common Stock Apple Inc NASDAQ USD US Dollar $ USA US US0378331005 037833100 0000320193 94-2404110 September 1980-12-12 International/Domestic Technology Consumer Electronics Information Technology Technology Hardware & Equipment Technology Hardware, Storage & Peripherals Technology Hardware, Storage & Peripherals Domestic False Apple Inc. designs, manufactures, and markets ... One Apple Park Way, Cupertino, CA, United Stat... NaN NaN {'Name': 'Ms. Nancy Paxton', 'Title': 'Sr. Di... 408-996-1010 http://www.apple.com /img/logos/US/aapl.png 147000 2021-02-25
9 AAPL Common Stock Apple Inc NASDAQ USD US Dollar $ USA US US0378331005 037833100 0000320193 94-2404110 September 1980-12-12 International/Domestic Technology Consumer Electronics Information Technology Technology Hardware & Equipment Technology Hardware, Storage & Peripherals Technology Hardware, Storage & Peripherals Domestic False Apple Inc. designs, manufactures, and markets ... One Apple Park Way, Cupertino, CA, United Stat... NaN NaN {'Name': 'Mr. Greg Joswiak', 'Title': 'Sr. VP... 408-996-1010 http://www.apple.com /img/logos/US/aapl.png 147000 2021-02-25
Basically it looks like each key in the nested keys is creating a new row in the dataframe. Here's my code:
import json
import requests
import pandas as pd
companyData = requests.get(url="https://eodhistoricaldata.com/api/fundamentals/AAPL.US?api_token=OeAFFmMliFG5orCUuwAKQ8l4WWFQ67YX").json()
General = pd.DataFrame.from_dict(companyData['General'])
General
In this case, my goal is for everything to be in just one row, with any nesting showing up as JSON in the relevant column, rather than creating new duplicate rows for every item in the nested JSON.
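Try pd.json_normalize() with the max_level argument. DataFrame.from_dict() uses the keys of the nested dicts as row labels, which is where the extra rows come from; with max_level=0 each nested dict stays as a single cell value.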
df = pd.json_normalize(companyData['General'],max_level=0)
print(df)
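A small self-contained sketch of that behaviour, using a trimmed-down stand-in for `companyData['General']` (the dict below is abbreviated from the question's output, and the variable names are only illustrative):

```python
import pandas as pd

# Abbreviated stand-in for companyData['General'] from the question.
general = {
    "Code": "AAPL",
    "Name": "Apple Inc",
    "AddressData": {"City": "Cupertino", "State": "CA"},
    "Officers": {
        "0": {"Name": "Mr. Timothy D. Cook", "Title": "CEO & Director"},
        "1": {"Name": "Mr. Luca Maestri", "Title": "CFO & Sr. VP"},
    },
}

# max_level=0 stops flattening at the top level, so each nested dict
# is kept as one cell value instead of being expanded.
flat = pd.json_normalize(general, max_level=0)

print(flat.shape)                    # one row, one column per top-level key
print(type(flat.loc[0, "Officers"]))  # the nested dict is preserved as-is
```

If the nested columns should end up as JSON strings rather than Python dicts, applying `json.dumps` to those columns afterwards is one option.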
I am trying to scrape the titles, descriptions, partners etc. from this search result using requests and BeautifulSoup in Python. But the response object doesn't contain the data I need, which is shown when I visit the URL in the browser.
Here is what I have so far:
import requests
from bs4 import BeautifulSoup as bs
url = 'https://partneredge.sap.com/content/partnerfinder/search.html#/search/results?itemsPerPage=10&sortBy=shortname&sortOrder=asc'
count = 0
response = requests.get(url)
if response.ok:
    response = response.text
    content = bs(response, 'lxml')
    results = content.find_all('li', class_='search-results__item')
    for each in results:
        count += 1
        title = each.find('header').find('a').text.strip()
        link = each.find('header').find('a').get('href')
        print('********************* ' + str(count) + ' *********************')
        print('Title: {}\nLink: {}\n'.format(title, link))
The website renders its data dynamically with JavaScript once the page loads.
The requests library cannot render JavaScript, so you can use selenium or requests_html instead; there are many other modules that can do that as well.
There is another option: tracking down where the data comes from. I was able to locate the XHR request that retrieves the data from the back-end API and renders it on the user's side.
You can find the XHR request by opening Developer Tools, going to the Network tab, and checking the XHR/JS requests made, depending on the type of call (such as fetch).
Below is a simple call where the limit is a maximum of 600, configured by the size parameter.
So you will need to loop in increments of 600 until you reach the total of 4803, which is the maximum number of results for SAP. The call below returns a valid JSON dict, which you can access using its keys.
import requests
headers = {
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:74.0) Gecko/20100101 Firefox/74.0",
}
data = "{\"returnCount\":{},\"repository\":\"partnerfinder\",\"type\":\"content\",\"filters\":[{\"field\":\"AUTHOR\",\"values\":[\"PWP\"],\"type\":\"exact\"}],\"returnResults\":{\"sort\":[{\"field\":\"SHORTNAME\",\"order\":\"asc\"}],\"page\":{\"size\":600},\"outputFields\":[\"TITLE\",\"EMAIL\",\"PARTNERID\",\"DESCRIPTION\",\"PHONE\",\"AWARDID\",\"FOCUSAREA\",\"LEVEL\",\"URL\",\"SHORTNAME\"]}}"
def main(url):
    with requests.Session() as req:
        r = req.post(url, data=data, headers=headers).json()
        # print(r.keys())
        # humanview = json.dumps(r, indent=4)
        for item in r["result"]["results"]["results"]:
            print(item["TITLE"])
main("https://partneredge.sap.com/bin/fiji/es/search-results")
Sample of output:
IBM World Trade Corporation - Egypt Branch
Origo hf.
2BM A/S
Accenture (China) Co. Limited
Accenture (UK) Ltd.
ACCENTURE - CONSULTORES DE GESTÃO, S.A.
Accenture AB
Accenture AG
Accenture AS
Accenture Australia Pty Ltd
Accenture B.V.
Accenture Chile Asesorías y Serv. Ltda.
Accenture Company Limited
Accenture do Brasil Ltda.
Accenture GmbH
Accenture GmbH
Accenture Inc.
Accenture Japan Ltd.
Accenture Limited LAGOS
Accenture LLC
Accenture Middle East BV
ACCENTURE OUTSOURCING SERVICES, S.A.
Accenture Pte Ltd
Accenture S.A.
Accenture S.p.A.
Accenture S.R.L.
ACCENTURE SAS
Accenture Saudi Arabia Limited
Accenture Sdn Bhd
Accenture Solutions Private Limited
Accenture Solutions Sdn Bhd
Accenture Sp. z o.o.
Accenture Technology Solutions Oy
Accenture Technology Ventures B.V.branch
Accenture, Inc.
Accenture, S.L.
Accenture
Accenture
addIT Dienstleistungen GmbH & Co KG
advanced business Consulting GmbH
Ageless LLC
All for One Group AG
All for One Steeb GmbH
All Tax Platform - Soluções Tributárias S.A.
AMAZON WEB SERVICES INC
AO Deloitte and Touche CIS
Apigee Corporation
Atlantconsult LLC
Atos (Taiwan) Ltd
Atos AG
Atos Belgium SA
Atos Bilisim Danismanlik ve Müsteri Hizmetleri San. A.S.
Atos Bulgaria Competency Center EOOD
Atos Inc.
Atos India Pvt. Ltd.
Atos Information Technology (S) Pte Ltd
Atos Information Technology GmbH
Atos Information Technology (China) Co., Ltd.
Atos IT Services UK Limited
Atos It Services
Atos IT Solutions and Services A/S
Atos IT Solutions and Services AB
Atos IT Solutions and Services GmbH
ATOS IT SOLUTIONS AND SERVICES IBERIA SL
Atos IT Solutions and Services LLC
Atos IT Solutions and Services Oy
Atos IT Solutions and Services, Inc.
Atos IT Solutions and Services, s.r.o.
Atos IT Solutions and Services s.r.o.
ATOS IT SOLUTIONS AND SERVICES D.O.O. BEOGRAD
Atos IT Solutions and Services
Atos IT Solutions Romania SRL
Atos IT Solutions and Services Limited
Atos Italia S.p.A.
Atos Nederland B.V.
Atos Origin FZ LLC
Atos Polska S.A.
Atos Polska S.A.
Atos Saudi Arabia
Atos Serviços de Tecnologia da Informaçao do Brasil Ltda.
AUGUSTA REEVES
Axxis Consulting (S) Pte Ltd
B4B Solutions GmbH
B4B Solutions GmbH
Bluekey Seidor (Pty) Ltd.
BSG Partners Co., Ltd.
BULL SAS
CAPGEMINI (CHINA) CO., LTD
Capgemini AB
Capgemini Argentina S.A.
Capgemini Australia Pty Limited
Capgemini Belgium N.V.
Capgemini Brasil S.A.
CAPGEMINI ESPAÑA, S.L.
Capgemini Italia S.p.A.
Capgemini Mexico, S. de R.L. de C.V.
Capgemini Nederland B.V.
Capgemini Norge AS
CAPGEMINI PORTUGAL - SERVIÇOS DE CONSULTORIA E INFORMÁTICA, S.A.
Capgemini Services Malaysia Sdn Bhd
Capgemini Singapore Pte Ltd
Capgemini Technologies LLC
Capgemini Technologies LLC
Capgemini Technology Services India Ltd. Block 2&3,Plot no IT3,IT4, 2nd &
CAPGEMINI TECHNOLOGY SERVICES
Capgemini UK Plc
Celonis SE
CenturyLink CenturyTel LLC
CIDEON Software & Services GmbH & Co.KG
Cisco Systems, Inc.
COMPAGNIE IBM FRANCE SAS
CPM BRAXIS TECNOLOGIA LTDA.
Crystal Solutions S.A.
CSC Computer Sciences Brasil S.A.
CTAC België NV
Ctac N.V.
DC Extended Delivery S. de R.L. de C.V.
Deloitte & Co. S.A.
Deloitte & Touche Consulting Group ICS Pte Ltd.
Deloitte & Touche S.R.L.
Deloitte AB
Deloitte Asesores y Consultores Ltda
Deloitte Assessoria e Consultoria Ltda
Deloitte Business Solutions S.A
Deloitte Consulting & Advisory Deloitte Entity
Deloitte Consulting (Pty) Ltd
Deloitte Consulting (SEA) Sdn Bhd
Deloitte Consulting B.V.
DELOITTE CONSULTING CR, S.A.
Deloitte Consulting ehf.
Deloitte Consulting Ltd.
Deloitte Consulting Product Services LLC
Deloitte Consulting Pte Ltd
Deloitte Consulting S.A.
Deloitte Consulting S.r.l
DELOITTE CONSULTING VIETNAM CO., LTD
DELOITTE CONSULTORES, S.A.
Deloitte Inc.
DELOITTE LIMITED
Deloitte LLP
Deloitte Servicios Profesionales Ltda
Deloitte Tax & Consulting
Deloitte Technology Limited
Deloitte Tohmatsu Consulting LLC
Deloitte Touche Tohmatsu
Deloitte
Detay Danismanlik Bilgisayar Hizmetleri San. ve Dis Tic.A.S.
Deutsche Telekom Healthcare and Security Solutions GmbH
DSC Software AG
Dunn Solutions Group
DXC Technologies
DXC Technology (Middle East) FZ LLC
DXC Technology Austria GmbH
DXC Technology Czech Republic s.r.o.
DXC Technology Deutschland GmbH
DXC TECHNOLOGY FRANCE
DXC Technology Japan, Ltd.
DXC Technology Services LLC
DXC Technology Services Singapore Pte. Ltd.
DXC Technology Services Vietnam Company Limited
DXC Technology Spain, S.A.
DXD APPLICATIONS AND IT SOLUTIONS, S.L.
DynaSys Solutions Ltd.
Edenhouse Solutions Limited
Edraky LLC Edraky
EIT Services India Private Limited
Enterprise IT Services MiddleEast FZ LLC
ENTERPRISE SERVICES BRASIL SERVICOS DE TECNOLOGIA LTDA
Enterprise Services d.o.o. Beograd
EntServ Enterprise Services Mexico S. de R.L. de C.V.
ENTSERV PHILIPPINES,INC
EPAM Systems, Inc.
Everis Chile S.A.
Everis México, S. de R.L. de C.V.
Everis Peru Sociedad Anonima Cerrada
Excellence Delivered ExD Pvt. Ltd
EY Brightree Sdn. Bhd.
Fasttrack ERP Solutions Inc.
FastTrack Solutions, Inc.
FUJITSU LIMITED
GE Digital
Hewlett Packard Colombia LTDA.
Hewlett Packard Enterprise Company
Hewlett Packard Enterprise India Private Limited
Hewlett Packard Enterprise Polska sp. z o.o.
HEWLETT PACKARD FRANCE SAS
Hewlett-Packard (M) Sdn Bhd
Hewlett-Packard (Schweiz) GmbH
Hewlett-Packard (Thailand) Ltd.
Hewlett-Packard HK SAR Limited
Hitachi Vantara Corporation
Huawei Technologies Co., Ltd
IBM (China) Company Limited
IBM ARGENTINA SRL
IBM Australia Limited
IBM Belgium B.V.B.A.
IBM Brasil - Indústria, Máquinas e Serviços Limitada.
IBM BULGARIA EOOD
IBM Canada Limited
IBM Ceska republika, spol. s r.o.
IBM China / Hong Kong Limited (Partner)
IBM Corporation International Technical
IBM Danmark ApS
IBM de Chile S.A.C.
IBM De Colombia & Cia. S.C.A.
IBM del Peru S.A.C.
IBM Deutschland GmbH
IBM Eastern Europa/Asia Ltd.
IBM Eesti OÜ
IBM Engineering Technology (Shanghai) Co Ltd.
IBM GLOBAL SERVICES ESPAÑA, S.A.
IBM Global Services ve Teknoloji Hizmetleri
IBM Hellas S.A.
IBM HRVATSKA D.O.O.
IBM India Pvt Ltd
IBM Int. Business Machines AS
IBM International Business Machines doo
IBM Ireland Ltd
IBM Italia S.p.A.
IBM Italia S.p.A
IBM Italia SpA (Pakistan)
IBM Japan Services Company Ltd.
IBM Japan, Ltd.
IBM Korea, INC.
IBM Malaysia Sdn Bhd
IBM Middle East FZ LLC
IBM Middle East FZ-LLC SaudiArabiaBranch
IBM Polska Sp. z o.o.
IBM Qatar SSC
IBM Romania SRL
IBM Schweiz AG
IBM Singapore Pte Ltd
IBM Slovenija d.o.o.
IBM South Africa (Pty) Ltd
IBM Svenska AB
IBM Taiwan Corporation
IBM Thailand Co., Ltd.
IBM United Kingdom Limited
IBM World Trade Corporation
ICM.S S.r.l.
Illumiti Corp.
Illumiti Inc
Illumiti One Inc
Implema AB
In Cloud Solutions Ltd
Intelligroup Saudi Arabia Company Ltd
IPS Co., Ltd.
ISS Consulting (Thailand) Ltd.
Itelis d.o.o.
itelligence a/s
Itelligence AB
Itelligence AG, Niederlassung Wien
itelligence AG
itelligence AG
Itelligence AS
Itelligence Benelux Holding BV
Itelligence Bilgi Sistemleri A.S.
Itelligence Business Solutions (UK) Limited
Itelligence Business Solutions
itelligence Business Solutions Canada, Inc.
itelligence Consulting (Shanghai) Ltd
ITELLIGENCE FRANCE
itelligence Hungary Informatika Kft.
Itelligence India Software Solutions Private Limited
Itelligence LLC
itelligence Outsourcing MSC Sdn Bhd
itelligence Slovakia, s.r.o.
Itelligence Software Solutions
itelligence Sp. z o.o.
itelligence, a.s.
Itelligence, Inc.
itelligence
KBJ S.A.
KWP Austria GmbH
KWP INSIDE HR GmbH
Lenovo Global Technology HK Limited
MIBCON a.s.
MSS Seidor Colombia S.A.S.
MSS Seidor Peru SAC
MSS Seidor, S.L.
Myanmar Information Technology Pte Ltd.
NetApp, Inc.
NTT DATA Business Solutions Malaysia Sdn Bhd
NTT DATA Business Solutions Australia Pty Ltd
NTT DATA Business Solutions Singapore Pte Ltd
NTT DATA Romania SA
NTT DATA VIETNAM CO. LTD
Nutanix Inc
OOO T-Systems CIS
Open Text Corporation
ORBIT Gesellschaft für Applikations- und Informationssysteme mbH
OSC Smart Integration GmbH
OSIsoft, LLC
Oxygen Business Solutions Ltd
Oxygen Business Solutions Pty Ltd
Pearl Norge AS
Price Waterhouse & Co Asesores de Empresas S.R.L
PricewaterhouseCoopers Advisory N.V.
PricewaterhouseCoopers Advisory S.p.A.
PricewaterhouseCoopers Business Solutions SA
PricewaterhouseCoopers Consulting (Australia) Pty Limited
PricewaterhouseCoopers Consulting (Singapore) PTE LTD
PricewaterhouseCoopers Consulting (Thailand) Ltd.
PriceWaterhouseCoopers Consultores Auditores SpA
PricewaterhouseCoopers GmbH Wirtschaftsprüfungsgesellschaft
PricewaterhouseCoopers Inc
PricewaterhouseCoopers LLP
PriceWaterhouseCoopers LLP
PricewaterhouseCoopers LLP
PricewaterhouseCoopers Management Consulting (Shanghai) Limited
Pricewaterhousecoopers Management Consultants SRL
PricewaterhouseCoopers Private Limited
Pricewaterhousecoopers, S.C.
PricewaterhouseCoopers Česká republika s.r.o.
Process Partner AG
Proquire LLC
PT Accenture
PT Deloitte Consulting
PT IBM Indonesia
PT Soltius Indonesia
Pure Storage Inc.
PwC Consulting LLC
PWC Consulting Services (M) Sdn Bhd.
PwC Enterprise Advisory cvba/scrl
PwC Société coopérative
Questionmark Computing Limited
QZing Technology (Beijing) Company Limited
Real Consulting IT Business Solutions SA
Red Hat, Inc.
RED POINT SOFTWARE SOLUTIONS SRL
Redwood Alliances B.V.
SAS Institute Inc.
SAVIC Technologies Private Limited
Seidor Chile S.A.
Seidor Colombia SAS
Seidor Consulting Peru SAC
Seidor Consulting SA
SEIDOR CONSULTING, S.L.
Seidor Crystalis - Tecnologia da Informação S.A.
Seidor Crystalis Costa Rica
Seidor Maroc, SARL
SEIDOR MEXICO SAPI de CV
Seidor Middle East & North Africa FZ-LLC
Seidor Portugal, Lda
Seidor S.A.
Seidor UK Ltd
Seidor Uruguay Informatica S.A
Seidor USA Corp
Servicios Informáticos Itelligence, S.A
Shanghai Acloudear Info. Tech. Co., Ltd.
SOA PEOPLE SA/NV
SOA PEOPLE SAS
Société Conseil Groupe LGS.
Sofigate Business Technologies Oy
SOFTINSA - ENGENHARIA DE SOFTWARE AVANÇADO, LDA
SUSE Software Solutions Germany GmbH
Syniti
Systex Corporation
T-Systems Austria GesmbH
T-Systems do Brasil Ltda
T-Systems International GmbH
T-SYSTEMS ITC IBERIA, S.A.U.
T-Systems Malaysia Sdn. Bhd.
T-Systems Multimedia Solutions GmbH
T-Systems México, S.A. de C.V.
T-Systems Nederland B.V.
T-Systems North America, Inc.
T-Systems P.R.China Ltd.
T-Systems Polska Spolka z o.o.
T-Systems Schweiz AG
T-Systems Singapore Pte Ltd
T-Systems South Africa (Pty) Ltd
TalentChamp Consulting GmbH
TEAMIDEA GROUP LLC
TECNILÓGICA ECOSISTEMAS, S.A.U.
Telekom Deutschland GmbH
Terralink Technologies LLC
The Hackett Group d/b/a Answerthink, Inc.
unit-IT Dienstleistungs GmbH & Co KG
Utopia Global, Inc.
VIEWNEXT S.A.
Vistex, Inc.
Westrocon Seidor (PTY) Ltd
Öhrlings PricewaterhouseCoopers AB
3Hold Technologies, SL
4process AG
Aareon Deutschland GmbH
AB Consulting Group Zrt.
Abacus Cambridge Partners (Middle East) Ltd.
Abacus Cambridge Partners Saudia LLC
Abacus Consulting Technology Pvt Ltd
AbacusConsulting
abat AG
abc Consulting S.A.
ABeam Consulting (Malaysia) Sdn Bhd
ABeam Consulting (Shanghai) Co., Ltd.
ABeam Consulting (Thailand) Limited
ABeam Consulting Ltd.
Abeo Vietnam Co., Ltd
Abide Consult AG
abilis GmbH
ABM Global Solutions, Inc.
Abracon GmbH
Absoft Limited
Acando Consulting AB
Accely Consulting India Private Limited
Acclimation Pty Ltd
ACEteK Software Limited
AchieveIT Solutions, Inc.
ACI Holdings
ACJ Consulting Co., Ltd.
Acorel VAR B.V.
ACRON Bilisim A.S.
Actualisap Consultores Bolivia S.A.
Acuma Solutions Ltd
Adesso SE
ADP Consultores S.R.L.
AdS Consulting, Beratung für angewandte Informationsverarbeitung GmbH
Advanced Applications GmbH
Advanced Business Software
Advanced Business Solutions
Advanced Electronics Company
Advantech Technologies Ltd
Advantic Consultores de sistemas de información S.L.
AEKANSA S.A. SYPSOFT360
AFON Technologies Pte Ltd
AG Consultancy & Apps Ltd
Agentil SA
AGILITA AG
Agion
AICOMP Cloud GmbH
AKT Solutions Ltd
AL BILAD ARABIA CO. LTD.
ALEXANDER MOORE S.A.
Alfa Sistemas de Gestao LTDA
Alfacloud LTD
all4cloud GmbH & Co. KG
Allgeier ES France
Allos S.r.l.
alogis AG
Altab S.A.
Altaflux Corporation
ALTEA UP S.r.l.
Altevie Technologies S.r.l.
ALTIM TECNOLOGIAS DE LA INFORMACIÓN,S.L.
AMS Advanced Management Solutions Ltd
AMS Solutions Limited
AMT - CONSULTING, S.A.
Anda Srl
ANSI Information Systems, Inc.
anthesis GmbH
AO "ECOPSY"
AODYS
APPCENTRIC Solutions Inc.
applied international informatics GmbH
APPTechnology Experts, Inc.
APSIA
apsolut GmbH
Arago Consulting Iberia
ARAGO CONSULTING SAS
Arete Bilgisayar, Otomasyon Egitim ve Danismanlik Hizm. San. Tic. Ltd. Sti.
Arete Global Company Branch FZ-LLC
Arineo GmbH
Arinso Iberica S.A.U.
Arithnea GmbH
Arvato Systems GmbH
AS Emergn
ASAP soft
ASAR AMERICA INC.
Ascarii Ltd
Ascentium Corporation DBA SMITH
AScorpi GmbH
Asecom B.V.
ASG Group Limited
Aspire HR Inc.
ASPN CO.,LTD.
Attune Consulting USA, Inc.
Aubay Italia S.p.A.
avantum Consult AG
Avectris AG
Avtenta, napredne poslovne rešitve, d.o.o.
Axians ICT Austria GmbH
Axians NEO Solutions & Technology GmbH
Axianseu II Digital CONSULTING SA
Axxiom Soluções Tecnológicas S.A
Ayesa Advanced Technologies, S.A.
b1 consulting GmbH
Babiel GmbH
Backoffice Associates
BAITCON S.A.
Baraka IT Solutions (Pty) Ltd
BC SKILLS CONSULTING
BCI Consulting SRL
BCS Business Consulting Services Kft.
BDO Unicon Business Solutions AO
BDO Ziv Haft Consulting # Management Ltd
be one solutions Americas Inc.
be one solutions Deutschland GmbH
be one solutions Japan K.K.
Be1Eye GmbH
BearingPoint GmbH
Beijing AVA Technology Inc.
Beijing Ether Electronics Group Co., Ltd.
Beijing Faujor Technology Co., Ltd.
Beijing Pactera Services Limited
Beijing Shunshiheng Technology Development Co,. Ltd.
BENOY LLC
Bestcom Infotech Corporation
Beyond Technologies Consulting Inc.
BEYOND TECHNOLOGIES
BGP Management Consulting S.p.A.
BH Consulting Co., Ltd.
Bilot Oyj
Birchman Solutions Ltd
BizTech Partners Co., Ltd
Blend IT Consultoria e Serviços em Informática Ltda
Blue Pencil Consulting Pty Ltd
Bluekey Seidor (K) Limited
BluePrint Technologies Private Limited
Bluetree Solutions Pty Ltd
BluLeader Pty Ltd
BMS Global Services LLC
bneXt Inc.
BosCloud Jiangsu Science and Technology Co., Ltd.
Boyum IT A/S
Bramasol, Inc.
Brave New World Consulting Pty Ltd t/as BNW Consulting
bridgX
Bright Business Partners
Britehouse a Division of Dimension Data (pty) Ltd
BS&C
BSGOne Co., Ltd. BSGOne
BTC Bilisim Hizmetleri A.S.
BTC Business Technology Consulting AG
BUSINESS AT WORK
Business Process Solutions SA de CV Xamai
Business Service Center LLC
BXT Solution Co., Ltd
BXTI Soluciones en Tecnología de Información S.A. de C.V.
C2E Teknoloji Servisleri Ticaret Anonim Sirketi
Caleo Consulting GmbH
Camelot ITLab GmbH
CANCOM Managed Services GmbH
CAPLAN Corporation
Castaliaz Technologies Private Limited
CBL CONSULTING
CCelera s.r.l.
CEGB Corporation
Celeritech Mexico SAPI de CV
CEO Consultoría, S.R.L
CEREALOG
CGI Suomi Oy
CGI Sverige AB
Chain Services TI S.A.C. Csti S.A.C.
Chengdu Biz-United Information Technology Co., Ltd.
China National Software and Service Co., Ltd.
Chinasoft Technology (Shenzhen) Corporation Limited
cHReative Consultoria LTDA
Cibernetica, S.A.
Citek Technology Joint Stock Company
Clarex Srl
Clariba Consulting S.L.U.
Clientis AG
Clients First Business Solutions LLC
Clockwork Business Solutions Private Limited
Clockwork Inc.
Cloudera Limited
Cloudway Consulting Private Limited
CM CONSULTING
CNBM Technology Co., Ltd
CNT Management Consulting AG
Codestone Solutions Ltd
CODILOG - ELIANCE
Cogniscient Business Solutions Private Limited
Cognitus Consulting LLC
Columbus Systems GmbH
COMLINE Computer + Softwarelösungen SE
COMMON MANAGEMENT SOLUTIONS, S.L.
Complete Business Solutions
COMPTA - EQUIPAMENTOS E SERVIÇOS DE INFORMÁTICA, S.A.
Compuage Infocom Limited
CompuNet, S.A.
CompuTec S.A.
Computer Systems Pvt Ltd
ComSol AG Commercial Solutions
CON.SE s.r.l.
ConCorn LLC.
CONPLUS Mittelstandslösungen GmbH
CONSEILS PLUS
Conseils-Plus
Consensus International, LLC
Consensus S.A.S.
Consilio GmbH
consolut.gmbh
Consulting 4U, s.r.o.
Consultoría Organizacional S.A.S.
ConVista Consulting AG
Cormeta AG
Corponet Implements, S.A. de C.V.
Corporacion Saratella, S.A. Vivo Consulting
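The paging loop described in the answer above (steps of 600 until the total of 4803 is covered) could be sketched roughly as follows. The answer does not say which field selects the page, so the "from" key here is a hypothetical placeholder; the real name should be read from the XHR request in the browser's Network tab. The payload is also trimmed for brevity and no request is sent.

```python
import json

PAGE_SIZE = 600   # maximum page size, per the answer above
TOTAL = 4803      # total number of results, per the answer above


def page_offsets(total=TOTAL, size=PAGE_SIZE):
    """Return the start offset of every page needed to cover `total` results."""
    return list(range(0, total, size))


def build_payload(offset, size=PAGE_SIZE):
    """Build a (trimmed) POST body for one page.

    "from" is an ASSUMED offset field, not confirmed by the answer.
    """
    body = {
        "repository": "partnerfinder",
        "type": "content",
        "returnResults": {
            "sort": [{"field": "SHORTNAME", "order": "asc"}],
            "page": {"size": size, "from": offset},  # "from" is hypothetical
            "outputFields": ["TITLE"],
        },
    }
    return json.dumps(body)


offsets = page_offsets()
payloads = [build_payload(o) for o in offsets]
print(len(offsets), "pages, last offset", offsets[-1])
```

Each payload would then be passed as data= to the same req.post(...) call used in main(), once the real paging field is known.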
You need to add a header with User-Agent string to your request. You can do that like this:
url = 'https://partneredge.sap.com/content/partnerfinder/search.html#/search/results?itemsPerPage=10&sortBy=shortname&sortOrder=asc'
header = {
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36'
}
response = requests.get(url, headers = header)
You can try this:
results = content.find_all('li', {'class': ['search-results__item']})
I have a column in my dataframe that contains the following:
Wal-Mart Stores, Inc., Clinton, IA 52732
Benton Packing, LLC, Clearfield, UT 84016
North Coast Iron Corp, Seattle, WA 98109
Messer Construction Co. Inc., Amarillo, TX 79109
Ocean Spray Cranberries, Inc., Henderson, NV 89011
W R Derrick & Co. Lexington, SC 29072
I am having trouble capturing it with regex. So far my regex works only for the first 2 lines:
[A-Z][A-za-z-\s]+,\s{1}(Inc.|LLC)
How do I split the column into 4 additional columns, i.e. Column 1 = Company Name, Column 2 = City, Column 3 = State, Column 4 = Zipcode?
Example of the output is shown below:
Company_Name City State ZipCode
Wal-Mart Stores, Inc. Clinton IA 52732
The names are probably the trickiest part, but if you know that the structure of city, state, zip will always be the same (i.e. no extra commas), you could use rsplit to split the strings from the right. pandas has a str.rsplit method as well.
df
Address
0 Wal-Mart Stores, Inc., Clinton, IA 52732
1 Benton Packing, LLC, Clearfield, UT 84016
2 North Coast Iron Corp, Seattle, WA 98109
3 Messer Construction Co. Inc., Amarillo, TX 79109
df['Zip'] = df.Address.map(lambda x: x.rsplit(' ', 1)[-1])
df['Name'], df['City'], df['State']= zip(*df.Address.map(lambda x: x.rsplit(' ', 1)[0].rsplit(',', 2)))
df
Address Zip \
0 Wal-Mart Stores, Inc., Clinton, IA 52732 52732
1 Benton Packing, LLC, Clearfield, UT 84016 84016
2 North Coast Iron Corp, Seattle, WA 98109 98109
3 Messer Construction Co. Inc., Amarillo, TX 79109 79109
Name City State
0 Wal-Mart Stores, Inc. Clinton IA
1 Benton Packing, LLC Clearfield UT
2 North Coast Iron Corp Seattle WA
3 Messer Construction Co. Inc. Amarillo TX
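The same idea can be written with the pandas str.rsplit method mentioned above. This is a sketch under the same assumption (every row ends in ", City, ST ZIP"); the output column names are taken from the question, and "StateZip" is a temporary helper column introduced here. It splits on ", " rather than "," so that City and State do not keep a leading space.

```python
import pandas as pd

df = pd.DataFrame({"Address": [
    "Wal-Mart Stores, Inc., Clinton, IA 52732",
    "Benton Packing, LLC, Clearfield, UT 84016",
    "North Coast Iron Corp, Seattle, WA 98109",
    "Messer Construction Co. Inc., Amarillo, TX 79109",
]})

# Split from the right on ", " at most twice: [name, city, "ST ZIP"].
parts = df["Address"].str.rsplit(", ", n=2, expand=True)
parts.columns = ["Company_Name", "City", "StateZip"]

# Split "ST ZIP" on the single space between state and zip code.
parts[["State", "ZipCode"]] = parts["StateZip"].str.split(" ", n=1, expand=True)

result = parts.drop(columns="StateZip")
print(result)
```

Rows that do not follow the pattern, such as "W R Derrick & Co. Lexington, SC 29072" with no comma before the city, would need separate handling.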