I'm scraping a news article with BeautifulSoup and want to return only the text body of the article itself, not all the additional "noise". Is there an easy way to do this?
import bs4
import requests
url = 'https://www.cnn.com/2018/01/22/us/puerto-rico-privatizing-state-power-authority/index.html'
res = requests.get(url)
soup = bs4.BeautifulSoup(res.text,'html.parser')
element = soup.select_one('div.pg-rail-tall__body #body-text').text
print(element)
I'm trying to exclude some of the returned content, such as:
{CNN.VideoPlayer.handleUnmutePlayer = function
handleUnmutePlayer(containerId, dataObj) {'use strict';var
playerInstance,playerPropertyObj,rememberTime,unmuteCTA,unmuteIdSelector
= 'unmute_' +
The noise, as you call it, is the text inside the <script>...</script> tags (JavaScript code). You can remove those tags with .extract():
for s in soup.find_all('script'):
    s.extract()
You can use this:
import requests
from bs4 import BeautifulSoup

r = requests.get('https://edition.cnn.com/2018/01/22/us/puerto-rico-privatizing-state-power-authority/index.html')
soup = BeautifulSoup(r.text, 'html.parser')
# Remove every <script> tag, together with the JavaScript inside it
for x in soup.find_all('script'):
    x.extract()
element = soup.find('div', class_='pg-rail-tall__body')
print(element.text)
Partial Output:
(CNN)Puerto Rico Gov. Ricardo Rosselló announced Monday that the
commonwealth will begin privatizing the Puerto Rico Electric Power
Authority, or PREPA. In comments published on Twitter, the governor
said the assets sale would transform the island's power generation
system into a "modern" and "efficient" one that would be less
expensive for citizens.He said the system operates "deficiently" and
that the improved infrastructure would respond more "agilely" to
natural disasters. The privatization process will begin "in the next
few days" and occur in three phases over the next 18 months, the
governor said.JUST WATCHEDSchool cheers as power returns after 112
daysReplayMore Videos ...MUST WATCHSchool cheers as power returns
after 112 days 00:48San Juan Mayor Carmen Yulin Cruz, known for her
criticisms of the Trump administration's response to Puerto Rico after
Hurricane Maria, spoke out against the move.Cruz, writing on her
official Twitter account, said PREPA's privatization would put the
commonwealth's economic development into "private hands" and that the
power authority will begin to "serve other interests.
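To check the script-stripping step without hitting the network, here is a minimal sketch on a made-up HTML string. The markup and the article_text helper are assumptions for illustration, not CNN's real page. It uses decompose(), which removes a tag the way extract() does but also destroys it, and get_text() with a separator so that paragraphs do not run into each other.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for a real article page.
SAMPLE_HTML = """
<div id="body-text">
  <p>First paragraph.</p>
  <script>var player = 'unmute_' + 1;</script>
  <style>.x { color: red; }</style>
  <p>Second paragraph.</p>
</div>
"""


def article_text(html):
    """Return the visible text of #body-text with script/style content removed."""
    soup = BeautifulSoup(html, 'html.parser')
    # Drop every <script> and <style> tag together with its contents.
    for tag in soup.find_all(['script', 'style']):
        tag.decompose()
    body = soup.select_one('#body-text')
    # separator=' ' puts a space between text fragments; strip=True trims them.
    return body.get_text(separator=' ', strip=True)


print(article_text(SAMPLE_HTML))
```

On a real page the selector would need to match whatever container the site actually uses for the article body.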
Try this:
import bs4
import requests
url = 'https://www.cnn.com/2018/01/22/us/puerto-rico-privatizing-state-power-au$'
res = requests.get(url)
soup = bs4.BeautifulSoup(res.text, 'html.parser')
elementd = soup.findAll('div', {'class': 'zn-body__paragraph'})
elementp = soup.findAll('p', {'class': 'zn-body__paragraph'})
for i in elementp:
    print(i.text)
for i in elementd:
    print(i.text)
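One thing to watch with two separate findAll calls is that the p and div paragraphs are printed in two batches rather than in page order. A possible variation, sketched on made-up markup (only the class name zn-body__paragraph comes from the answer above; body_paragraphs is a hypothetical helper), is a single class selector that matches both tag types in document order:

```python
from bs4 import BeautifulSoup

# Hypothetical markup; only the class name comes from the answer above.
SAMPLE_HTML = """
<section>
  <p class="zn-body__paragraph">Lead paragraph.</p>
  <div class="zn-body__paragraph">Second paragraph.</div>
  <div class="ad">Advertisement</div>
  <div class="zn-body__paragraph">Third paragraph.</div>
</section>
"""


def body_paragraphs(html):
    """Return the text of every element carrying the paragraph class."""
    soup = BeautifulSoup(html, 'html.parser')
    # One class selector matches <p> and <div> alike, in document order.
    return [el.get_text(strip=True) for el in soup.select('.zn-body__paragraph')]


print('\n'.join(body_paragraphs(SAMPLE_HTML)))
```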
I am trying to extract the link and text from anchor tags using Beautiful Soup.
Below is the content from which I have to extract the anchor-tag data, i.e. the link and the text:
Mumbai: Vaccination figures surge in private hospitals, stagnate in government centres
Chennai: Martial arts instructor arrested following allegations of sexual assault
Mumbai Metro lines 2A and 7: Here is everything you need to know
**Python code to extract the content from the above code.**
#app.get('/indian_express', response_class=HTMLResponse)
async def dna_india(request: Request):
    print("1111111111111111")
    dict={}
    URL="https://indianexpress.com/latest-news/"
    page=requests.get(URL)
    soup=BS(page.content, 'html.parser')
    results=soup.find_all('div', class_="nation")
    for results_element in results:
        results_element_1 = soup.find_all('div', class_="title")
        for results_element_2 in results_element_1:
            for results_element_3 in results_element_2:
                print(results_element_3)  # this print produces the HTML shown above
                print(" ")
                link_element=results_element_3.find_all('a', class_="title", href=True)  # I get an empty [] when I print this
                # print(link_element)
                # title_elem = results_element_3.find('a')['href']
                # link_element=results_element_3.find('a').contents[0]
                # print(title_elem)
                # print(link_element)
                # for index,(title,link) in enumerate(zip(title_elem, link_element)):
                # dict[str(title.text)]=str(link['href'])
    json_compatible_item_data = jsonable_encoder(dict)
    return templates.TemplateResponse("display.html", {"request":request, "json_data":json_compatible_item_data})
#app.get('/deccan_chronicle', response_class=HTMLResponse)
async def deccan_chronicle(request: Request):
    dict={}
    URL="https://www.news18.com/india/"
    page=requests.get(URL)
    soup=BS(page.content, 'html.parser')
    main_div = soup.find("div", class_="blog-list")
    for i in main_div:
        #link_data = i.find("div", class_="blog-list-blog").find("a")
        link_data=i.find("div",class_="blog-list-blog").find("a")
        text_data = link_data.text
        dict[str(text_data)] = str(link_data.attrs['href'])
    json_compatible_item_data = jsonable_encoder(dict)
    return templates.TemplateResponse("display.html", {"request":request, "json_data":json_compatible_item_data})
Please help me out with this code
First find the main div (main_div), which holds all the news records. Inside it, each article div contains the data you need: iterate over the articles and, in each one, locate the a tag inside the title div. That tag carries both the title text and the href.
import requests
from bs4 import BeautifulSoup
res=requests.get("https://indianexpress.com/latest-news/")
soup=BeautifulSoup(res.text,"html.parser")
main_div=soup.find("div",class_="nation")
articles=main_div.find_all("div",class_="articles")
for i in articles:
    href=i.find("div",class_="title").find("a")
    print(href.attrs['href'])
    text_data=href.text
    print(text_data)
Output:
https://indianexpress.com/article/business/banking-and-finance/banks-cant-cite-2018-rbi-circular-to-caution-clients-on-virtual-currencies-7338628/
Banks can’t cite 2018 RBI circular to caution clients on virtual currencies
https://indianexpress.com/article/india/supreme-court-stays-delhi-high-court-order-on-levy-of-igst-on-imported-oxygen-concentrators-for-personal-use-7339478/
Supreme Court stays Delhi High Court order on levy of IGST on imported oxygen concentrators for personal use
...
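Since the question wanted the results in a dict for the template, here is a rough sketch of the same traversal that collects title/link pairs and skips entries where a tag is missing. The markup is invented for illustration (only the class names nation, articles and title come from the code above), and collect_links is a hypothetical helper name.

```python
from bs4 import BeautifulSoup

# Hypothetical markup modelled on the class names used above.
SAMPLE_HTML = """
<div class="nation">
  <div class="articles"><div class="title"><a href="https://example.com/a">Story A</a></div></div>
  <div class="articles"><div class="title">No link here</div></div>
  <div class="articles"><div class="title"><a href="https://example.com/b">Story B</a></div></div>
</div>
"""


def collect_links(html):
    """Map each headline text to its link, skipping incomplete entries."""
    soup = BeautifulSoup(html, 'html.parser')
    links = {}
    main_div = soup.find('div', class_='nation')
    if main_div is None:  # layout changed or the request was blocked
        return links
    for article in main_div.find_all('div', class_='articles'):
        anchor = article.select_one('div.title a[href]')
        if anchor is None:  # no link inside this article block
            continue
        links[anchor.get_text(strip=True)] = anchor['href']
    return links


print(collect_links(SAMPLE_HTML))
```

The None checks matter on live pages, where find() returns None as soon as one block deviates from the expected layout and the chained .find("a") call would otherwise raise an AttributeError.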
2nd Method
Don't make it so complex. Just look at which tags contain the data you need: here I found the main tag (main_div), and the text and links both sit in the h4 tags inside it, so you can iterate over those.
from bs4 import BeautifulSoup
import requests
res=requests.get("https://www.news18.com/india/")
soup=BeautifulSoup(res.text,"html.parser")
main_div = soup.find("div", class_="blog-list")
data=main_div.find_all("h4")
for i in data:
    print(i.find("a")['href'])
    print(i.find("a").text)
Output:
https://www.news18.com/news/india/2-killed-six-injured-after-portion-of-two-storey-building-collapses-in-varanasi-pm-assures-help-3799610.html
2 Killed, Six Injured After Portion of Two-Storey Building Collapses in Varanasi; PM Assures Help
https://www.news18.com/news/india/dont-compel-citizens-to-move-courts-again-again-follow-national-litigation-policy-hc-tells-centre-3799598.html
Don't Compel Citizens to Move Courts Again & Again, Follow National Litigation Policy, HC Tells Centre
...
I want to scrape data from the Yahoo News and Bing News pages. The data I want are the headlines and/or the text below the headlines (whatever can be scraped) and the dates (times) when they were posted.
I have written some code, but it does not return anything. The problem seems to be my URL, since I'm getting a 404 response.
Can you please help me with it?
This is the code for 'Bing'
from bs4 import BeautifulSoup
import requests
term = 'usa'
url = 'http://www.bing.com/news/q?s={}'.format(term)
response = requests.get(url)
print(response)
soup = BeautifulSoup(response.text, 'html.parser')
print(soup)
And this is for Yahoo:
term = 'usa'
url = 'http://news.search.yahoo.com/q?s={}'.format(term)
response = requests.get(url)
print(response)
soup = BeautifulSoup(response.text, 'html.parser')
print(soup)
Please help me generate these URLs and understand the logic behind them; I'm still a noob :)
Basically, your URLs are just wrong. The URLs you have to use are the same ones you see in the address bar when using a regular browser. Most search engines and aggregators use the q parameter for the search term. Most of the other parameters are usually not required (sometimes they are, e.g. for specifying the result page number).
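As a small illustration of that URL logic, the query string can be built with the standard library instead of str.format, which also takes care of escaping spaces and special characters in the search term. build_search_url is a hypothetical helper, and the page parameter in the second call is a made-up name, not something either site is known to accept.

```python
from urllib.parse import urlencode


def build_search_url(base, term, **extra):
    """Append the search term (and any extra parameters) as a query string."""
    params = {'q': term, **extra}
    # urlencode escapes spaces and other special characters.
    return '{}?{}'.format(base, urlencode(params))


print(build_search_url('https://www.bing.com/news/search', 'usa news'))
# 'page' is a placeholder parameter name used only for illustration.
print(build_search_url('https://news.search.yahoo.com/search', 'usa', page=2))
```

With requests you can get the same encoding by passing the dictionary directly: requests.get(base, params={'q': term}).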
Bing
from bs4 import BeautifulSoup
import requests
import re
term = 'usa'
url = 'https://www.bing.com/news/search?q={}'.format(term)
response = requests.get(url)
soup = BeautifulSoup(response.text, 'html.parser')
for news_card in soup.find_all('div', class_="news-card-body"):
    title = news_card.find('a', class_="title").text
    time = news_card.find(
        'span',
        attrs={'aria-label': re.compile(".*ago$")}
    ).text
    print("{} ({})".format(title, time))
Output
Jason Mohammed blitzkrieg sinks USA (17h)
USA Swimming held not liable by California jury in sexual abuse case (1d)
United States 4-1 Canada: USA secure payback in Nations League (1d)
USA always plays the Dalai Lama card in dealing with China, says Chinese Professor (1d)
...
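The aria-label lookup in the Bing snippet relies on BeautifulSoup accepting a compiled regular expression as an attribute filter. A minimal offline sketch of that technique follows, on invented markup (only the class names come from the snippet above; headlines_with_age is a hypothetical helper). It also guards against cards that lack a matching span, which would otherwise raise an AttributeError on .text.

```python
import re

from bs4 import BeautifulSoup

# Hypothetical markup; class names follow the Bing snippet above.
SAMPLE_HTML = """
<div class="news-card-body">
  <a class="title" href="#">Headline one</a>
  <span aria-label="17 hours ago">17h</span>
</div>
<div class="news-card-body">
  <a class="title" href="#">Headline two</a>
  <span aria-label="Source name">Example Daily</span>
</div>
"""


def headlines_with_age(html):
    """Return (title, age) pairs; age is None when no '... ago' label exists."""
    soup = BeautifulSoup(html, 'html.parser')
    results = []
    for card in soup.find_all('div', class_='news-card-body'):
        title = card.find('a', class_='title')
        # The compiled pattern is searched for in the attribute value.
        age = card.find('span', attrs={'aria-label': re.compile(r'ago$')})
        if title is None:
            continue
        age_text = age.get_text(strip=True) if age else None
        results.append((title.get_text(strip=True), age_text))
    return results


print(headlines_with_age(SAMPLE_HTML))
```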
Yahoo
from bs4 import BeautifulSoup
import requests
term = 'usa'
url = 'https://news.search.yahoo.com/search?q={}'.format(term)
response = requests.get(url)
soup = BeautifulSoup(response.text, 'html.parser')
for news_item in soup.find_all('div', class_='NewsArticle'):
    title = news_item.find('h4').text
    time = news_item.find('span', class_='fc-2nd').text
    # Clean time text
    time = time.replace('·', '').strip()
    print("{} ({})".format(title, time))
Output
USA Baseball will return to Arizona for second Olympic qualifying chance (52 minutes ago)
Prized White Sox prospect Andrew Vaughn wraps up stint with USA Baseball (28 minutes ago)
Mexico defeats USA in extras for Olympic berth (13 hours ago)
...
I'm trying to parse a website and get some info with the find_all() method, but it doesn't find them all.
This is the code:
#!/usr/bin/python3
from bs4 import BeautifulSoup
from urllib.request import urlopen
page = urlopen ("http://mangafox.me/directory/")
# print (page.read ())
soup = BeautifulSoup (page.read ())
manga_img = soup.findAll ('a', {'class' : 'manga_img'}, limit=None)
for manga in manga_img:
    print (manga['href'])
It only prints half of them...
Different HTML parsers deal differently with broken HTML. That page serves broken HTML, and the lxml parser is not dealing very well with it:
>>> import requests
>>> from bs4 import BeautifulSoup
>>> r = requests.get('http://mangafox.me/directory/')
>>> soup = BeautifulSoup(r.content, 'lxml')
>>> len(soup.find_all('a', class_='manga_img'))
18
The standard library html.parser has less trouble with this specific page:
>>> soup = BeautifulSoup(r.content, 'html.parser')
>>> len(soup.find_all('a', class_='manga_img'))
44
Translating that to your specific code sample using urllib, you would specify the parser thus:
soup = BeautifulSoup(page, 'html.parser') # BeautifulSoup can do the reading
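If you want to compare parsers on your own page, a sketch like the following runs the same query under each parser that happens to be installed. The sample markup is well-formed and invented, so all parsers should agree on it; the differences only show up on broken HTML like the page in the question. count_links is a hypothetical helper.

```python
from bs4 import BeautifulSoup, FeatureNotFound

# Hypothetical, well-formed markup; real pages may be broken in many ways.
SAMPLE_HTML = (
    '<div>'
    '<a class="manga_img" href="/manga/a/">A</a>'
    '<a class="manga_img" href="/manga/b/">B</a>'
    '</div>'
)


def count_links(html, parsers=('html.parser', 'lxml', 'html5lib')):
    """Count matching links under each parser that is installed."""
    counts = {}
    for name in parsers:
        try:
            soup = BeautifulSoup(html, name)
        except FeatureNotFound:
            continue  # this parser is not installed, skip it
        counts[name] = len(soup.find_all('a', class_='manga_img'))
    return counts


print(count_links(SAMPLE_HTML))
```

If the counts differ on your real input, the markup is probably malformed and it is worth pinning the parser explicitly rather than relying on the default.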
A quick way to grab all the links is a CSS selector that picks every a tag whose href attribute contains /manga. Note that *= matches the substring anywhere in the value; ^= would restrict the match to the start of the href.
The output will contain all links that start with /manga/"title" (check this in dev tools using the inspector):
import requests
from bs4 import BeautifulSoup
import lxml
html = requests.get('http://fanfox.net/directory/').text
soup = BeautifulSoup(html, 'lxml')
for a_tag in soup.select('a[href*="/manga"]'):
    link = a_tag['href']
    link = link[1:]
    print(f'http://fanfox.net/{link}')
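A possible refinement of the selector approach, sketched on invented markup: a[href^="/manga/"] only matches links whose href starts with /manga/, and urllib.parse.urljoin builds the absolute URL instead of slicing off the leading slash by hand. manga_links is a hypothetical helper.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

BASE_URL = 'http://fanfox.net/'

# Hypothetical markup for illustration only.
SAMPLE_HTML = """
<a href="/manga/one_piece/">One Piece</a>
<a href="/directory/manga-updates/">Updates</a>
<a href="/manga/naruto/">Naruto</a>
"""


def manga_links(html, base=BASE_URL):
    """Return absolute URLs for links whose href starts with /manga/."""
    soup = BeautifulSoup(html, 'html.parser')
    # ^= anchors the match to the start of href; *= would match anywhere,
    # including the "/manga-updates/" link in the sample above.
    anchors = soup.select('a[href^="/manga/"]')
    # urljoin resolves the relative path against the base URL.
    return [urljoin(base, a['href']) for a in anchors]


for url in manga_links(SAMPLE_HTML):
    print(url)
```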
Alternative method:
To scrape other pages, change the URL passed to requests.get (e.g. directory/2.html).
Here's the working code (it works for pages 2, 3, 4, 5, 6, ... as well):
import requests
from bs4 import BeautifulSoup
import lxml
html = requests.get('http://fanfox.net/directory/').text
soup = BeautifulSoup(html, 'lxml')
for manga in soup.select('.line'):
    title = manga.select('.manga-list-1-item-title a')
    for t in title:
        print(t.text)
    for i in manga.findAll('img', class_='manga-list-1-cover'):
        img = i['src']
        print(img)
    for l in manga.findAll('p', class_='manga-list-1-item-title'):
        link = l.a['href']
        link = link[1:]
        print(f'http://fanfox.net/{link}')
Output (could be prettier), all in order:
A Story About Treating a Female Knig...
Tales of Demons and Gods
Martial Peak
Onepunch-Man
One Piece
Star Martial God Technique
Solo Leveling
The Last Human
Kimetsu no Yaiba
Versatile Mage
Boku no Hero Academia
Apotheosis
Black Clover
Tensei Shitara Slime Datta Ken
Kingdom
Tate no Yuusha no Nariagari
Tomo-chan wa Onna no ko!
Goblin Slayer
Yakusoku no Neverland
God of Martial Arts
Kaifuku Jutsushi no Yarinaoshi
Re:Monster
Mushoku Tensei - Isekai Ittara Honki...
Nanatsu no Taizai
Battle Through the Heavens
Shingeki no Kyojin
Iron Ladies
Monster Musume no Iru Nichijou
World’s End Harem
Bleach
Parallel Paradise
Shokugeki no Soma
Spirit Sword Sovereign
Horimiya
Dungeon ni Deai o Motomeru no wa Mac...
Dr. Stone
Berserk
The New Gate
Akatsuki no Yona
Naruto
Overlord
Death March kara Hajimaru Isekai Kyo...
Tsuki ga Michibiku Isekai Douchuu
Eternal Reverence
Minamoto-kun Monogatari
Beastars
Jujutsu Kaisen
Hajime no Ippo
Kaguya-sama wa Kokurasetai - Tensai-...
Domestic na Kanojo
The Legendary Moonlight Sculptor
The Gamer
Kumo desu ga, nani ka?
Bokutachi wa Benkyou ga Dekinai
Enen no Shouboutai
Tsuyokute New Saga
Fairy Tail
Komi-san wa Komyushou Desu.
Kenja no Mago
Soul Land
Boruto: Naruto Next Generations
Hunter X Hunter
History’s Strongest Disciple Kenichi
Phoenix against the World
LV999 no Murabito
Gate - Jietai Kare no Chi nite, Kaku...
Kengan Asura
Konjiki no Moji Tsukai - Yuusha Yoni...
Please don’t bully me, Nagatoro
Isekai Maou to Shoukan Shoujo Dorei ...
http://fmcdn.mfcdn.net/store/manga/27418/cover.jpg?token=64e5c0c930644528cba6eb2f2f5f5a2f3762188d&ttl=1616839200&v=1615891672
http://fmcdn.mfcdn.net/store/manga/16627/cover.jpg?token=33f5ea4c1ba1a013c5bdcfdac87209fe472cf6d5&ttl=1616839200&v=1616396463
http://fmcdn.mfcdn.net/store/manga/27509/cover.jpg?token=ce2b16e8e867a8ce13ad0bee9940b68eef324cac&ttl=1616839200&v=1616737688
http://fmcdn.mfcdn.net/store/manga/11362/cover.jpg?token=1a5876d8a767fd27b26f0287bbb36eb82f9cf811&ttl=1616839200&v=1615796703
http://fmcdn.mfcdn.net/store/manga/106/cover.jpg?token=5313fc0dae53f33fcd1284cd4858603fc47ffa04&ttl=1616839200&v=1616748903
http://fmcdn.mfcdn.net/store/manga/22443/cover.jpg?token=89760754754a63efc875aa7e2de0536a5238bed3&ttl=1616839200&v=1616396922
http://fmcdn.mfcdn.net/store/manga/29037/cover.jpg?token=e8b496db4ad520f002040761c5887bc1e17af63a&ttl=1616839200&v=1616653683
http://fmcdn.mfcdn.net/store/manga/28343/cover.jpg?token=71c1b201e4d714f893efb7ac984c9787dd8df915&ttl=1616839200&v=1616748232
http://fmcdn.mfcdn.net/store/manga/19287/cover.jpg?token=803eb8beab4dc6aa8d73f5137a6e3331c0034d24&ttl=1616839200&v=1609900224
http://fmcdn.mfcdn.net/store/manga/27761/cover.jpg?token=6c11f2bddb31b460fccc9a158cc13b9593fb1ad2&ttl=1616839200&v=1616740672
http://fmcdn.mfcdn.net/store/manga/14356/cover.jpg?token=93638c7ec630de193299caa8d513e045818b35ce&ttl=1616839200&v=1616170144
http://fmcdn.mfcdn.net/store/manga/27118/cover.jpg?token=9c876792ad8e6e5f9777386184ea8e6f409aa9fd&ttl=1616839200&v=1616654344
http://fmcdn.mfcdn.net/store/manga/15291/cover.jpg?token=e0a3195fcc88e397703e8bdf6580a62a0d856816&ttl=1616839200&v=1616345844
http://fmcdn.mfcdn.net/store/manga/15975/cover.jpg?token=e07844bb607a3d53ababab51683ee6fa06906d7c&ttl=1616839200&v=1616733843
http://fmcdn.mfcdn.net/store/manga/8198/cover.jpg?token=bc135016049bb63e5b65ec87207e0c91bb0c62c8&ttl=1616839200&v=1616335864
http://fmcdn.mfcdn.net/store/manga/14036/cover.jpg?token=c13dab07379e88fb871d3d833999ead13bfaf0fc&ttl=1616839200&v=1615393923
http://fmcdn.mfcdn.net/store/manga/16159/cover.jpg?token=cdf538f92f729999bcb9fcae7fb31b7a8c306c92&ttl=1616839200&v=1569492366
http://fmcdn.mfcdn.net/store/manga/20569/cover.jpg?token=f9c08cde2f0a6bd646dc87dc4a8dee6fa44eca3c&ttl=1616839200&v=1616680427
http://fmcdn.mfcdn.net/store/manga/21271/cover.jpg?token=062fd439c18afaf178d3408c64b2b305f679e91a&ttl=1616839200&v=1611285077
http://fmcdn.mfcdn.net/store/manga/26916/cover.jpg?token=cda99bf9831ada1322045bf82893a9ed1ad868d5&ttl=1616839200&v=1615188784
http://fmcdn.mfcdn.net/store/manga/26841/cover.jpg?token=055e9ff117c28b3a7c3089c4d691228adeba1f55&ttl=1616839200&v=1616201299
http://fmcdn.mfcdn.net/store/manga/13895/cover.jpg?token=e7661738326d62d38b5f93771105898cb95adaba&ttl=1616839200&v=1612570263
http://fmcdn.mfcdn.net/store/manga/14217/cover.jpg?token=3263f009d5b42e441a09e14c44e3fd7d12a83089&ttl=1616839200&v=1615259584
http://fmcdn.mfcdn.net/store/manga/11374/cover.jpg?token=ab9d85a9efdd5b41391db5249bcf0011ce07070f&ttl=1616839200&v=1600762925
http://fmcdn.mfcdn.net/store/manga/14225/cover.jpg?token=e8912699841e28f9ca8b40eb8fe1d37d2a6ce3e3&ttl=1616839200&v=1616097340
http://fmcdn.mfcdn.net/store/manga/9011/cover.jpg?token=eaca757d4352b66d4ef69812ec5c265b5a2f7a28&ttl=1616839200&v=1614982324
http://fmcdn.mfcdn.net/store/manga/29235/cover.jpg?token=23b3338eaa8984bad9c17a2d604c60c909282715&ttl=1616839200&v=1614666974
http://fmcdn.mfcdn.net/store/manga/10348/cover.jpg?token=c4209cc06013a704c9f7a0e942b8ae55a7546941&ttl=1616839200&v=1616082423
http://fmcdn.mfcdn.net/store/manga/20107/cover.jpg?token=699e867d86e4957b8ef4d3eee5200f80cdbbea88&ttl=1616839200&v=1610529669
http://fmcdn.mfcdn.net/store/manga/9/cover.jpg?token=a4894a5ce212a490dda9c6cf73b717bbfbf015c3&ttl=1616839200&v=1616593028
http://fmcdn.mfcdn.net/store/manga/24693/cover.jpg?token=d968c24525bc6fe467f40c9ad2ff087ebfb60e4a&ttl=1616839200&v=1615325943
http://fmcdn.mfcdn.net/store/manga/11529/cover.jpg?token=1a3ab38ba3f212d5c95138bb690b155f38390aab&ttl=1616839200&v=1615891829
http://fmcdn.mfcdn.net/store/manga/28001/cover.jpg?token=1769a66a83df9adfed58a36dc9275f202d1f8f37&ttl=1616839200&v=1615891671
http://fmcdn.mfcdn.net/store/manga/11147/cover.jpg?token=e6d602fcd4b438ec299c955738487127cef7a3bf&ttl=1616839200&v=1616264399
http://fmcdn.mfcdn.net/store/manga/12978/cover.jpg?token=7e9094f238fcbd19717ffeeb4dcfe686a99dba4b&ttl=1616839200&v=1568611983
http://fmcdn.mfcdn.net/store/manga/24445/cover.jpg?token=0f77d7a743c0f613ff773f3e430f688e3aa77239&ttl=1616839200&v=1616345762
http://fmcdn.mfcdn.net/store/manga/176/cover.jpg?token=e8e87528092cd5b902767d7564e035486b8535f2&ttl=1616839200&v=1611297351
http://fmcdn.mfcdn.net/store/manga/14588/cover.jpg?token=469da1dfa4953459e08efdeb24561f78f7a68b47&ttl=1616839200&v=1615891989
http://fmcdn.mfcdn.net/store/manga/9126/cover.jpg?token=53689bb06b90c163b58b0410e80252941b27aff6&ttl=1616839200&v=1616083893
http://fmcdn.mfcdn.net/store/manga/8/cover.jpg?token=8e5cbd08bd42f0684f36f107fc991c75b56bbed2&ttl=1616839200&v=1615891989
http://fmcdn.mfcdn.net/store/manga/14765/cover.jpg?token=8a8e0582258d852b4c9d017567dd6820958f5a67&ttl=1616839200&v=1615042503
http://fmcdn.mfcdn.net/store/manga/16457/cover.jpg?token=7e59859f7af131902006c3eb8ed55745ef14573f&ttl=1616839200&v=1613139843
http://fmcdn.mfcdn.net/store/manga/16675/cover.jpg?token=cbb268f1326b704b1bb11accadc35ae3b7222e39&ttl=1616839200&v=1615891829
http://fmcdn.mfcdn.net/store/manga/26261/cover.jpg?token=d83f514efe719b2dd301c2ecc8d672e9d935084c&ttl=1616839200&v=1613384403
http://fmcdn.mfcdn.net/store/manga/9518/cover.jpg?token=76170cb8b2defc468a817a69bf6e799900c4fd9f&ttl=1616839200&v=1596437944
http://fmcdn.mfcdn.net/store/manga/24547/cover.jpg?token=b99d7b791e14ec290054d57ead4dcf9fb61b4d7a&ttl=1616839200&v=1615891989
http://fmcdn.mfcdn.net/store/manga/27861/cover.jpg?token=d14b0f3f2362869830c2971007e86ca43637bb85&ttl=1616839200&v=1616345044
http://fmcdn.mfcdn.net/store/manga/231/cover.jpg?token=53c2dc9eb6bf5c6f635de12496088a27b28e04f7&ttl=1616839200&v=1616418784
http://fmcdn.mfcdn.net/store/manga/17825/cover.jpg?token=f1b7954fba32d3146282b2b5bba4e1419578d65b&ttl=1616839200&v=1616677923
http://fmcdn.mfcdn.net/store/manga/14099/cover.jpg?token=7b7a61b4e544a65a75394e4cabf04831cf0c5d7a&ttl=1616839200&v=1611909666
http://fmcdn.mfcdn.net/store/manga/15177/cover.jpg?token=4442c2f4cf7e5c69d3449e7b358960930ff19e11&ttl=1616839200&v=1605145143
http://fmcdn.mfcdn.net/store/manga/13088/cover.jpg?token=d8ab36b3d0f4d9c6263a4f482f98c4d99809eb36&ttl=1616839200&v=1616641226
http://fmcdn.mfcdn.net/store/manga/18225/cover.jpg?token=ea670a4bc8d1aa0312f5427b24bf5702c12ef3a3&ttl=1616839200&v=1615470603
http://fmcdn.mfcdn.net/store/manga/23945/cover.jpg?token=9e078e0cb6da91194a6f86c814ae03922e8460d0&ttl=1616839200&v=1615891671
http://fmcdn.mfcdn.net/store/manga/17045/cover.jpg?token=3026e40a21e490f37c656a778e9227c6c891cade&ttl=1616839200&v=1615891829
http://fmcdn.mfcdn.net/store/manga/13930/cover.jpg?token=f773694a746e2015b4ca5c46afcc801d9795393c&ttl=1616839200&v=1616100123
http://fmcdn.mfcdn.net/store/manga/246/cover.jpg?token=3926211df393a0d50e58c0285c05f067c1ad64e5&ttl=1616839200&v=1615891989
http://fmcdn.mfcdn.net/store/manga/17189/cover.jpg?token=f9ffcf2a07bb8d1f7a49eac36c1f6c4fcd7e5622&ttl=1616839200&v=1616514627
http://fmcdn.mfcdn.net/store/manga/20299/cover.jpg?token=121f6571e072381a545e9e3790b4bf1723865859&ttl=1616839200&v=1615891671
http://fmcdn.mfcdn.net/store/manga/13841/cover.jpg?token=86245cf3afab622c35a41f4e2bf388ac48713906&ttl=1616839200&v=1615891672
http://fmcdn.mfcdn.net/store/manga/19939/cover.jpg?token=563a2963a0a153ac1c53779712f48af5630e0377&ttl=1616839200&v=1616714152
http://fmcdn.mfcdn.net/store/manga/44/cover.jpg?token=febabec452a05c1415f02bf8387a0a8f16c20137&ttl=1616839200&v=1548837372
http://fmcdn.mfcdn.net/store/manga/107/cover.jpg?token=3dcce47a3a6760b9b81b7b576711980d36cf7be1&ttl=1616839200&v=1543561843
http://fmcdn.mfcdn.net/store/manga/24241/cover.jpg?token=b4a1834d714f0476c2d99c5ffb905351c7a4d72f&ttl=1616839200&v=1616176266
http://fmcdn.mfcdn.net/store/manga/25773/cover.jpg?token=7bf8a8e9346a02250bb24cd8e6e4da0933e6a05f&ttl=1616839200&v=1616655977
http://fmcdn.mfcdn.net/store/manga/10956/cover.jpg?token=db3b74dc959adedbd847142cd3a079caca6b25d1&ttl=1616839200&v=1612043463
http://fmcdn.mfcdn.net/store/manga/15593/cover.jpg?token=caceb80b7266f438bdedae8cf69653ab7911fe68&ttl=1616839200&v=1606188363
http://fmcdn.mfcdn.net/store/manga/14916/cover.jpg?token=0dab5e6797f4cc915a035632ed0d02a2492afbcc&ttl=1616839200&v=1609752363
http://fmcdn.mfcdn.net/store/manga/26771/cover.jpg?token=77a6aa9bbb7ebcd3df15cd4cc65b4e3915e96ed4&ttl=1616839200&v=1615891829
http://fmcdn.mfcdn.net/store/manga/16569/cover.jpg?token=e5815ac1520ad179ad2d6f798e4b6ead6790cd33&ttl=1616839200&v=1614957071
http://fanfox.net/manga/a_story_about_treating_a_female_knight_who_has_never_been_treated_as_a_woman_as_a_woman/
http://fanfox.net/manga/tales_of_demons_and_gods/
http://fanfox.net/manga/martial_peak/
http://fanfox.net/manga/onepunch_man/
http://fanfox.net/manga/one_piece/
http://fanfox.net/manga/star_martial_god_technique/
http://fanfox.net/manga/solo_leveling/
http://fanfox.net/manga/the_last_human/
http://fanfox.net/manga/kimetsu_no_yaiba/
http://fanfox.net/manga/versatile_mage/
http://fanfox.net/manga/boku_no_hero_academia/
http://fanfox.net/manga/apotheosis/
http://fanfox.net/manga/black_clover/
http://fanfox.net/manga/tensei_shitara_slime_datta_ken/
http://fanfox.net/manga/kingdom/
http://fanfox.net/manga/tate_no_yuusha_no_nariagari/
http://fanfox.net/manga/tomo_chan_wa_onna_no_ko/
http://fanfox.net/manga/goblin_slayer/
http://fanfox.net/manga/yakusoku_no_neverland/
http://fanfox.net/manga/god_of_martial_arts/
http://fanfox.net/manga/kaifuku_jutsushi_no_yarinaoshi/
http://fanfox.net/manga/re_monster/
http://fanfox.net/manga/mushoku_tensei_isekai_ittara_honki_dasu/
http://fanfox.net/manga/nanatsu_no_taizai/
http://fanfox.net/manga/battle_through_the_heavens/
http://fanfox.net/manga/shingeki_no_kyojin/
http://fanfox.net/manga/iron_ladies/
http://fanfox.net/manga/monster_musume_no_iru_nichijou/
http://fanfox.net/manga/world_s_end_harem/
http://fanfox.net/manga/bleach/
http://fanfox.net/manga/parallel_paradise/
http://fanfox.net/manga/shokugeki_no_soma/
http://fanfox.net/manga/spirit_sword_sovereign/
http://fanfox.net/manga/horimiya/
http://fanfox.net/manga/dungeon_ni_deai_o_motomeru_no_wa_machigatte_iru_darou_ka/
http://fanfox.net/manga/dr_stone/
http://fanfox.net/manga/berserk/
http://fanfox.net/manga/the_new_gate/
http://fanfox.net/manga/akatsuki_no_yona/
http://fanfox.net/manga/naruto/
http://fanfox.net/manga/overlord/
http://fanfox.net/manga/death_march_kara_hajimaru_isekai_kyousoukyoku/
http://fanfox.net/manga/tsuki_ga_michibiku_isekai_douchuu/
http://fanfox.net/manga/eternal_reverence/
http://fanfox.net/manga/minamoto_kun_monogatari/
http://fanfox.net/manga/beastars/
http://fanfox.net/manga/jujutsu_kaisen/
http://fanfox.net/manga/hajime_no_ippo/
http://fanfox.net/manga/kaguya_sama_wa_kokurasetai_tensai_tachi_no_renai_zunousen/
http://fanfox.net/manga/domestic_na_kanojo/
http://fanfox.net/manga/the_legendary_moonlight_sculptor/
http://fanfox.net/manga/the_gamer/
http://fanfox.net/manga/kumo_desu_ga_nani_ka/
http://fanfox.net/manga/bokutachi_wa_benkyou_ga_dekinai/
http://fanfox.net/manga/enen_no_shouboutai/
http://fanfox.net/manga/tsuyokute_new_saga/
http://fanfox.net/manga/fairy_tail/
http://fanfox.net/manga/komi_san_wa_komyushou_desu/
http://fanfox.net/manga/kenja_no_mago/
http://fanfox.net/manga/soul_land/
http://fanfox.net/manga/boruto_naruto_next_generations/
http://fanfox.net/manga/hunter_x_hunter/
http://fanfox.net/manga/history_s_strongest_disciple_kenichi/
http://fanfox.net/manga/phoenix_against_the_world/
http://fanfox.net/manga/lv999_no_murabito/
http://fanfox.net/manga/gate_jietai_kare_no_chi_nite_kaku_tatakeri/
http://fanfox.net/manga/kengan_asura/
http://fanfox.net/manga/konjiki_no_moji_tsukai_yuusha_yonin_ni_makikomareta_unique_cheat/
http://fanfox.net/manga/please_don_t_bully_me_nagatoro/
http://fanfox.net/manga/isekai_maou_to_shoukan_shoujo_dorei_majutsu/
I've found that the best way (for me) to work with the results of .find_all()/.findAll() is simply a for loop; the same goes for .select().
In some cases .select() gives better results.
Check out SelectorGadget to quickly find a CSS selector.
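To make the output "prettier", one option is to pair each cover with its title link so that every manga ends up as one record instead of three separate lists. This is only a sketch on invented markup: the class names come from the code above, the surrounding ul/li structure is an assumption, and manga_records is a hypothetical helper.

```python
from bs4 import BeautifulSoup

# Hypothetical markup; only the class names come from the answer above.
SAMPLE_HTML = """
<ul class="line">
  <li><img class="manga-list-1-cover" src="http://img.test/1.jpg">
      <p class="manga-list-1-item-title"><a href="/manga/one_piece/">One Piece</a></p></li>
  <li><img class="manga-list-1-cover" src="http://img.test/2.jpg">
      <p class="manga-list-1-item-title"><a href="/manga/naruto/">Naruto</a></p></li>
</ul>
"""


def manga_records(html):
    """Return one dict per manga with its title, cover image and link."""
    soup = BeautifulSoup(html, 'html.parser')
    covers = soup.select('img.manga-list-1-cover')
    titles = soup.select('p.manga-list-1-item-title a')
    # zip pairs the n-th cover with the n-th title link.
    return [
        {'title': a.get_text(strip=True), 'cover': img['src'], 'link': a['href']}
        for img, a in zip(covers, titles)
    ]


for record in manga_records(SAMPLE_HTML):
    print(record)
```

Note that zip assumes the two lists line up one-to-one; if some entries lack a cover, selecting per item container would be safer.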
I am trying to scrape the links from an inputted URL, but it's only working for one URL (http://www.businessinsider.com). How can it be adapted to scrape from any URL inputted? I am using BeautifulSoup, but is Scrapy better suited for this?
def WebScrape():
    linktoenter = input('Where do you want to scrape from today?: ')
    url = linktoenter
    html = urllib.request.urlopen(url).read()
    soup = BeautifulSoup(html, "lxml")
    if linktoenter in url:
        print('Retrieving your links...')
        links = {}
        n = 0
        link_title=soup.findAll('a',{'class':'title'})
        n += 1
        links[n] = link_title
        for eachtitle in link_title:
            print(eachtitle['href']+","+eachtitle.string)
    else:
        print('Please enter another Website...')
You could make a more generic scraper, searching for all tags and all links within those tags. Once you have the list of all links, you can use a regular expression or similar to find the links that match your desired structure.
import re

import requests
from bs4 import BeautifulSoup

response = requests.get('http://www.businessinsider.com')
soup = BeautifulSoup(response.content, 'html.parser')
# find every tag that carries an href attribute and collect the link
# (a list comprehension is used because map() is lazy in Python 3)
links = [tag['href'] for tag in soup.find_all(href=True) if tag['href']]
# example: filter only careerbuilder links
careerbuilder_links = [x for x in links if re.search(r'w{3}\.careerbuilder\.com', x)]
print(careerbuilder_links)
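Instead of a regular expression, the filtering step could also compare the host part of each link. The sketch below does that on invented markup, resolving relative links against the page URL first. links_for_host is a hypothetical helper and the sample links are made up.

```python
from urllib.parse import urljoin, urlparse

from bs4 import BeautifulSoup

# Hypothetical markup for illustration only.
SAMPLE_HTML = """
<a href="http://www.careerbuilder.com/jobs">Jobs</a>
<link href="/static/site.css" rel="stylesheet">
<a href="/news/today">Today</a>
<a>No href</a>
"""


def links_for_host(html, base_url, host):
    """Return absolute links from any tag with an href that point at host."""
    soup = BeautifulSoup(html, 'html.parser')
    found = []
    for tag in soup.find_all(href=True):
        # Resolve relative links such as "/news/today" against the page URL.
        absolute = urljoin(base_url, tag['href'])
        if urlparse(absolute).netloc == host:
            found.append(absolute)
    return found


print(links_for_host(SAMPLE_HTML, 'http://www.businessinsider.com', 'www.careerbuilder.com'))
```

Comparing netloc avoids false positives from URLs that merely mention the domain somewhere in their path or query string.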
code:
import urllib.request

import bs4


def WebScrape():
    url = input('Where do you want to scrape from today?: ')
    html = urllib.request.urlopen(url).read()
    soup = bs4.BeautifulSoup(html, "lxml")
    title_tags = soup.findAll('a', {'class': 'title'})
    # pair each link with its visible text
    url_titles = [(tag['href'], tag.text) for tag in title_tags]
    if title_tags:
        print('Retrieving your links...')
        for url_title in url_titles:
            print(*url_title)
out:
Where do you want to scrape from today?: http://www.businessinsider.com
Retrieving your links...
http://www.businessinsider.com/trump-china-drone-navy-2016-12 Trump slams China's capture of a US Navy drone as 'unprecedented' act
http://www.businessinsider.com/trump-thank-you-rally-alabama-2016-12 'This is truly an exciting time to be alive'
http://www.businessinsider.com/how-smartwatch-pioneer-pebble-lost-everything-2016-12 How the hot startup that stole Apple's thunder wound up in Silicon Valley's graveyard
http://www.businessinsider.com/china-will-return-us-navy-underwater-drone-2016-12 Pentagon: China will return US Navy underwater drone seized in South China Sea
http://www.businessinsider.com/what-google-gets-wrong-about-driverless-cars-2016-12 Here's the biggest thing Google got wrong about self-driving cars
http://www.businessinsider.com/sheriff-joe-arpaio-still-wants-to-investigate-obamas-birth-certificate-2016-12 Sheriff Joe Arpaio still wants to investigate Obama's birth certificate
http://www.businessinsider.com/rents-dropping-in-new-york-bubble-pop-2016-12 Rents are finally dropping in New York City, and a bubble might be about to pop
http://www.businessinsider.com/trump-david-friedman-ambassador-israel-2016-12 Trump's ambassador pick could drastically alter 2 of the thorniest issues in the US-Israel relationship
http://www.businessinsider.com/can-hackers-be-caught-trump-election-russia-2016-12 Why Trump's assertion that hackers can't be caught after an attack is wrong
http://www.businessinsider.com/theres-a-striking-commonality-between-trump-and-nixon-2016-12 There's a striking commonality between Trump and Nixon
http://www.businessinsider.com/tesla-year-in-review-2016-12 Tesla's biggest moments of 2016
http://www.businessinsider.com/heres-why-using-uber-to-fill-public-transportation-gaps-is-a-bad-idea-2016-12 Here's why using Uber to fill public transportation gaps is a bad idea
http://www.businessinsider.com/useful-hard-adopt-early-morning-rituals-productive-exercise-2016-12 4 morning rituals that are hard to adopt but could really pay off
http://www.businessinsider.com/most-expensive-champagne-bottles-money-can-buy-2016-12 The 11 most expensive Champagne bottles money can buy
http://www.businessinsider.com/innovations-in-radiology-2016-11 5 innovations in radiology that could impact everything from the Zika virus to dermatology
http://www.businessinsider.com/ge-healthcare-mr-freelium-technology-2016-11 A new technology is being developed using just 1% of the finite resource needed for traditional MRIs
I'm trying to parse a website and get some info with the find_all() method, but it doesn't find them all.
This is the code:
#!/usr/bin/python3
from bs4 import BeautifulSoup
from urllib.request import urlopen
page = urlopen ("http://mangafox.me/directory/")
# print (page.read ())
soup = BeautifulSoup (page.read ())
manga_img = soup.findAll ('a', {'class' : 'manga_img'}, limit=None)
for manga in manga_img:
print (manga['href'])
It only prints half of them...
Different HTML parsers deal differently with broken HTML. That page serves broken HTML, and the lxml parser is not dealing very well with it:
>>> import requests
>>> from bs4 import BeautifulSoup
>>> r = requests.get('http://mangafox.me/directory/')
>>> soup = BeautifulSoup(r.content, 'lxml')
>>> len(soup.find_all('a', class_='manga_img'))
18
The standard library html.parser has less trouble with this specific page:
>>> soup = BeautifulSoup(r.content, 'html.parser')
>>> len(soup.find_all('a', class_='manga_img'))
44
Translating that to your specific code sample using urllib, you would specify the parser thus:
soup = BeautifulSoup(page, 'html.parser') # BeatifulSoup can do the reading
The quick way to grab all href elements is to use CSS Selector which will select all a tags with an href element that contains /manga at the beginning link.
Output will contain all links that starts with /manga/"title"(check this in dev tools using inspector):
import requests
from bs4 import BeautifulSoup
import lxml
html = requests.get('http://fanfox.net/directory/').text
soup = BeautifulSoup(html, 'lxml')
for a_tag in soup.select('a[href*="/manga"]'):
    link = a_tag['href']
    link = link[1:]
    print(f'http://fanfox.net/{link}')
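For comparison, here is a small sketch of the stricter ^= (starts-with) selector, with urllib.parse.urljoin building the absolute URL instead of slicing off the leading slash. SAMPLE_HTML is invented markup used only for illustration, and manga_links is a hypothetical helper name.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

BASE_URL = 'http://fanfox.net/'

# Invented sample markup (an assumption): the "news" link contains "/manga"
# in the middle of its href, so *= would match it but ^= does not.
SAMPLE_HTML = """
<a href="/manga/one_piece/">One Piece</a>
<a href="/directory/2.html">Next</a>
<a href="/news/manga-awards/">News</a>
<a href="/manga/naruto/">Naruto</a>
"""


def manga_links(html, base_url=BASE_URL):
    """Return absolute URLs for every link whose href starts with /manga/."""
    soup = BeautifulSoup(html, 'html.parser')
    return [urljoin(base_url, a['href'])
            for a in soup.select('a[href^="/manga/"]')]


if __name__ == '__main__':
    for url in manga_links(SAMPLE_HTML):
        print(url)
```

urljoin also copes with hrefs that are already absolute, which the link[1:] slice does not.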
Alternative method:
To scrape another page, change the URL passed to requests.get (for example, directory/2.html).
Here's the working code (it works for pages 2, 3, 4, 5, 6 and so on as well):
import requests
from bs4 import BeautifulSoup
import lxml
html = requests.get('http://fanfox.net/directory/').text
soup = BeautifulSoup(html, 'lxml')
for manga in soup.select('.line'):
    title = manga.select('.manga-list-1-item-title a')
    for t in title:
        print(t.text)
    for i in manga.findAll('img', class_='manga-list-1-cover'):
        img = i['src']
        print(img)
    for l in manga.findAll('p', class_='manga-list-1-item-title'):
        link = l.a['href']
        link = link[1:]
        print(f'http://fanfox.net/{link}')
Output (could be prettier), all in order: titles first, then cover image URLs, then links:
A Story About Treating a Female Knig...
Tales of Demons and Gods
Martial Peak
Onepunch-Man
One Piece
Star Martial God Technique
Solo Leveling
The Last Human
Kimetsu no Yaiba
Versatile Mage
Boku no Hero Academia
Apotheosis
Black Clover
Tensei Shitara Slime Datta Ken
Kingdom
Tate no Yuusha no Nariagari
Tomo-chan wa Onna no ko!
Goblin Slayer
Yakusoku no Neverland
God of Martial Arts
Kaifuku Jutsushi no Yarinaoshi
Re:Monster
Mushoku Tensei - Isekai Ittara Honki...
Nanatsu no Taizai
Battle Through the Heavens
Shingeki no Kyojin
Iron Ladies
Monster Musume no Iru Nichijou
World’s End Harem
Bleach
Parallel Paradise
Shokugeki no Soma
Spirit Sword Sovereign
Horimiya
Dungeon ni Deai o Motomeru no wa Mac...
Dr. Stone
Berserk
The New Gate
Akatsuki no Yona
Naruto
Overlord
Death March kara Hajimaru Isekai Kyo...
Tsuki ga Michibiku Isekai Douchuu
Eternal Reverence
Minamoto-kun Monogatari
Beastars
Jujutsu Kaisen
Hajime no Ippo
Kaguya-sama wa Kokurasetai - Tensai-...
Domestic na Kanojo
The Legendary Moonlight Sculptor
The Gamer
Kumo desu ga, nani ka?
Bokutachi wa Benkyou ga Dekinai
Enen no Shouboutai
Tsuyokute New Saga
Fairy Tail
Komi-san wa Komyushou Desu.
Kenja no Mago
Soul Land
Boruto: Naruto Next Generations
Hunter X Hunter
History’s Strongest Disciple Kenichi
Phoenix against the World
LV999 no Murabito
Gate - Jietai Kare no Chi nite, Kaku...
Kengan Asura
Konjiki no Moji Tsukai - Yuusha Yoni...
Please don’t bully me, Nagatoro
Isekai Maou to Shoukan Shoujo Dorei ...
http://fmcdn.mfcdn.net/store/manga/27418/cover.jpg?token=64e5c0c930644528cba6eb2f2f5f5a2f3762188d&ttl=1616839200&v=1615891672
http://fmcdn.mfcdn.net/store/manga/16627/cover.jpg?token=33f5ea4c1ba1a013c5bdcfdac87209fe472cf6d5&ttl=1616839200&v=1616396463
http://fmcdn.mfcdn.net/store/manga/27509/cover.jpg?token=ce2b16e8e867a8ce13ad0bee9940b68eef324cac&ttl=1616839200&v=1616737688
http://fmcdn.mfcdn.net/store/manga/11362/cover.jpg?token=1a5876d8a767fd27b26f0287bbb36eb82f9cf811&ttl=1616839200&v=1615796703
http://fmcdn.mfcdn.net/store/manga/106/cover.jpg?token=5313fc0dae53f33fcd1284cd4858603fc47ffa04&ttl=1616839200&v=1616748903
http://fmcdn.mfcdn.net/store/manga/22443/cover.jpg?token=89760754754a63efc875aa7e2de0536a5238bed3&ttl=1616839200&v=1616396922
http://fmcdn.mfcdn.net/store/manga/29037/cover.jpg?token=e8b496db4ad520f002040761c5887bc1e17af63a&ttl=1616839200&v=1616653683
http://fmcdn.mfcdn.net/store/manga/28343/cover.jpg?token=71c1b201e4d714f893efb7ac984c9787dd8df915&ttl=1616839200&v=1616748232
http://fmcdn.mfcdn.net/store/manga/19287/cover.jpg?token=803eb8beab4dc6aa8d73f5137a6e3331c0034d24&ttl=1616839200&v=1609900224
http://fmcdn.mfcdn.net/store/manga/27761/cover.jpg?token=6c11f2bddb31b460fccc9a158cc13b9593fb1ad2&ttl=1616839200&v=1616740672
http://fmcdn.mfcdn.net/store/manga/14356/cover.jpg?token=93638c7ec630de193299caa8d513e045818b35ce&ttl=1616839200&v=1616170144
http://fmcdn.mfcdn.net/store/manga/27118/cover.jpg?token=9c876792ad8e6e5f9777386184ea8e6f409aa9fd&ttl=1616839200&v=1616654344
http://fmcdn.mfcdn.net/store/manga/15291/cover.jpg?token=e0a3195fcc88e397703e8bdf6580a62a0d856816&ttl=1616839200&v=1616345844
http://fmcdn.mfcdn.net/store/manga/15975/cover.jpg?token=e07844bb607a3d53ababab51683ee6fa06906d7c&ttl=1616839200&v=1616733843
http://fmcdn.mfcdn.net/store/manga/8198/cover.jpg?token=bc135016049bb63e5b65ec87207e0c91bb0c62c8&ttl=1616839200&v=1616335864
http://fmcdn.mfcdn.net/store/manga/14036/cover.jpg?token=c13dab07379e88fb871d3d833999ead13bfaf0fc&ttl=1616839200&v=1615393923
http://fmcdn.mfcdn.net/store/manga/16159/cover.jpg?token=cdf538f92f729999bcb9fcae7fb31b7a8c306c92&ttl=1616839200&v=1569492366
http://fmcdn.mfcdn.net/store/manga/20569/cover.jpg?token=f9c08cde2f0a6bd646dc87dc4a8dee6fa44eca3c&ttl=1616839200&v=1616680427
http://fmcdn.mfcdn.net/store/manga/21271/cover.jpg?token=062fd439c18afaf178d3408c64b2b305f679e91a&ttl=1616839200&v=1611285077
http://fmcdn.mfcdn.net/store/manga/26916/cover.jpg?token=cda99bf9831ada1322045bf82893a9ed1ad868d5&ttl=1616839200&v=1615188784
http://fmcdn.mfcdn.net/store/manga/26841/cover.jpg?token=055e9ff117c28b3a7c3089c4d691228adeba1f55&ttl=1616839200&v=1616201299
http://fmcdn.mfcdn.net/store/manga/13895/cover.jpg?token=e7661738326d62d38b5f93771105898cb95adaba&ttl=1616839200&v=1612570263
http://fmcdn.mfcdn.net/store/manga/14217/cover.jpg?token=3263f009d5b42e441a09e14c44e3fd7d12a83089&ttl=1616839200&v=1615259584
http://fmcdn.mfcdn.net/store/manga/11374/cover.jpg?token=ab9d85a9efdd5b41391db5249bcf0011ce07070f&ttl=1616839200&v=1600762925
http://fmcdn.mfcdn.net/store/manga/14225/cover.jpg?token=e8912699841e28f9ca8b40eb8fe1d37d2a6ce3e3&ttl=1616839200&v=1616097340
http://fmcdn.mfcdn.net/store/manga/9011/cover.jpg?token=eaca757d4352b66d4ef69812ec5c265b5a2f7a28&ttl=1616839200&v=1614982324
http://fmcdn.mfcdn.net/store/manga/29235/cover.jpg?token=23b3338eaa8984bad9c17a2d604c60c909282715&ttl=1616839200&v=1614666974
http://fmcdn.mfcdn.net/store/manga/10348/cover.jpg?token=c4209cc06013a704c9f7a0e942b8ae55a7546941&ttl=1616839200&v=1616082423
http://fmcdn.mfcdn.net/store/manga/20107/cover.jpg?token=699e867d86e4957b8ef4d3eee5200f80cdbbea88&ttl=1616839200&v=1610529669
http://fmcdn.mfcdn.net/store/manga/9/cover.jpg?token=a4894a5ce212a490dda9c6cf73b717bbfbf015c3&ttl=1616839200&v=1616593028
http://fmcdn.mfcdn.net/store/manga/24693/cover.jpg?token=d968c24525bc6fe467f40c9ad2ff087ebfb60e4a&ttl=1616839200&v=1615325943
http://fmcdn.mfcdn.net/store/manga/11529/cover.jpg?token=1a3ab38ba3f212d5c95138bb690b155f38390aab&ttl=1616839200&v=1615891829
http://fmcdn.mfcdn.net/store/manga/28001/cover.jpg?token=1769a66a83df9adfed58a36dc9275f202d1f8f37&ttl=1616839200&v=1615891671
http://fmcdn.mfcdn.net/store/manga/11147/cover.jpg?token=e6d602fcd4b438ec299c955738487127cef7a3bf&ttl=1616839200&v=1616264399
http://fmcdn.mfcdn.net/store/manga/12978/cover.jpg?token=7e9094f238fcbd19717ffeeb4dcfe686a99dba4b&ttl=1616839200&v=1568611983
http://fmcdn.mfcdn.net/store/manga/24445/cover.jpg?token=0f77d7a743c0f613ff773f3e430f688e3aa77239&ttl=1616839200&v=1616345762
http://fmcdn.mfcdn.net/store/manga/176/cover.jpg?token=e8e87528092cd5b902767d7564e035486b8535f2&ttl=1616839200&v=1611297351
http://fmcdn.mfcdn.net/store/manga/14588/cover.jpg?token=469da1dfa4953459e08efdeb24561f78f7a68b47&ttl=1616839200&v=1615891989
http://fmcdn.mfcdn.net/store/manga/9126/cover.jpg?token=53689bb06b90c163b58b0410e80252941b27aff6&ttl=1616839200&v=1616083893
http://fmcdn.mfcdn.net/store/manga/8/cover.jpg?token=8e5cbd08bd42f0684f36f107fc991c75b56bbed2&ttl=1616839200&v=1615891989
http://fmcdn.mfcdn.net/store/manga/14765/cover.jpg?token=8a8e0582258d852b4c9d017567dd6820958f5a67&ttl=1616839200&v=1615042503
http://fmcdn.mfcdn.net/store/manga/16457/cover.jpg?token=7e59859f7af131902006c3eb8ed55745ef14573f&ttl=1616839200&v=1613139843
http://fmcdn.mfcdn.net/store/manga/16675/cover.jpg?token=cbb268f1326b704b1bb11accadc35ae3b7222e39&ttl=1616839200&v=1615891829
http://fmcdn.mfcdn.net/store/manga/26261/cover.jpg?token=d83f514efe719b2dd301c2ecc8d672e9d935084c&ttl=1616839200&v=1613384403
http://fmcdn.mfcdn.net/store/manga/9518/cover.jpg?token=76170cb8b2defc468a817a69bf6e799900c4fd9f&ttl=1616839200&v=1596437944
http://fmcdn.mfcdn.net/store/manga/24547/cover.jpg?token=b99d7b791e14ec290054d57ead4dcf9fb61b4d7a&ttl=1616839200&v=1615891989
http://fmcdn.mfcdn.net/store/manga/27861/cover.jpg?token=d14b0f3f2362869830c2971007e86ca43637bb85&ttl=1616839200&v=1616345044
http://fmcdn.mfcdn.net/store/manga/231/cover.jpg?token=53c2dc9eb6bf5c6f635de12496088a27b28e04f7&ttl=1616839200&v=1616418784
http://fmcdn.mfcdn.net/store/manga/17825/cover.jpg?token=f1b7954fba32d3146282b2b5bba4e1419578d65b&ttl=1616839200&v=1616677923
http://fmcdn.mfcdn.net/store/manga/14099/cover.jpg?token=7b7a61b4e544a65a75394e4cabf04831cf0c5d7a&ttl=1616839200&v=1611909666
http://fmcdn.mfcdn.net/store/manga/15177/cover.jpg?token=4442c2f4cf7e5c69d3449e7b358960930ff19e11&ttl=1616839200&v=1605145143
http://fmcdn.mfcdn.net/store/manga/13088/cover.jpg?token=d8ab36b3d0f4d9c6263a4f482f98c4d99809eb36&ttl=1616839200&v=1616641226
http://fmcdn.mfcdn.net/store/manga/18225/cover.jpg?token=ea670a4bc8d1aa0312f5427b24bf5702c12ef3a3&ttl=1616839200&v=1615470603
http://fmcdn.mfcdn.net/store/manga/23945/cover.jpg?token=9e078e0cb6da91194a6f86c814ae03922e8460d0&ttl=1616839200&v=1615891671
http://fmcdn.mfcdn.net/store/manga/17045/cover.jpg?token=3026e40a21e490f37c656a778e9227c6c891cade&ttl=1616839200&v=1615891829
http://fmcdn.mfcdn.net/store/manga/13930/cover.jpg?token=f773694a746e2015b4ca5c46afcc801d9795393c&ttl=1616839200&v=1616100123
http://fmcdn.mfcdn.net/store/manga/246/cover.jpg?token=3926211df393a0d50e58c0285c05f067c1ad64e5&ttl=1616839200&v=1615891989
http://fmcdn.mfcdn.net/store/manga/17189/cover.jpg?token=f9ffcf2a07bb8d1f7a49eac36c1f6c4fcd7e5622&ttl=1616839200&v=1616514627
http://fmcdn.mfcdn.net/store/manga/20299/cover.jpg?token=121f6571e072381a545e9e3790b4bf1723865859&ttl=1616839200&v=1615891671
http://fmcdn.mfcdn.net/store/manga/13841/cover.jpg?token=86245cf3afab622c35a41f4e2bf388ac48713906&ttl=1616839200&v=1615891672
http://fmcdn.mfcdn.net/store/manga/19939/cover.jpg?token=563a2963a0a153ac1c53779712f48af5630e0377&ttl=1616839200&v=1616714152
http://fmcdn.mfcdn.net/store/manga/44/cover.jpg?token=febabec452a05c1415f02bf8387a0a8f16c20137&ttl=1616839200&v=1548837372
http://fmcdn.mfcdn.net/store/manga/107/cover.jpg?token=3dcce47a3a6760b9b81b7b576711980d36cf7be1&ttl=1616839200&v=1543561843
http://fmcdn.mfcdn.net/store/manga/24241/cover.jpg?token=b4a1834d714f0476c2d99c5ffb905351c7a4d72f&ttl=1616839200&v=1616176266
http://fmcdn.mfcdn.net/store/manga/25773/cover.jpg?token=7bf8a8e9346a02250bb24cd8e6e4da0933e6a05f&ttl=1616839200&v=1616655977
http://fmcdn.mfcdn.net/store/manga/10956/cover.jpg?token=db3b74dc959adedbd847142cd3a079caca6b25d1&ttl=1616839200&v=1612043463
http://fmcdn.mfcdn.net/store/manga/15593/cover.jpg?token=caceb80b7266f438bdedae8cf69653ab7911fe68&ttl=1616839200&v=1606188363
http://fmcdn.mfcdn.net/store/manga/14916/cover.jpg?token=0dab5e6797f4cc915a035632ed0d02a2492afbcc&ttl=1616839200&v=1609752363
http://fmcdn.mfcdn.net/store/manga/26771/cover.jpg?token=77a6aa9bbb7ebcd3df15cd4cc65b4e3915e96ed4&ttl=1616839200&v=1615891829
http://fmcdn.mfcdn.net/store/manga/16569/cover.jpg?token=e5815ac1520ad179ad2d6f798e4b6ead6790cd33&ttl=1616839200&v=1614957071
http://fanfox.net/manga/a_story_about_treating_a_female_knight_who_has_never_been_treated_as_a_woman_as_a_woman/
http://fanfox.net/manga/tales_of_demons_and_gods/
http://fanfox.net/manga/martial_peak/
http://fanfox.net/manga/onepunch_man/
http://fanfox.net/manga/one_piece/
http://fanfox.net/manga/star_martial_god_technique/
http://fanfox.net/manga/solo_leveling/
http://fanfox.net/manga/the_last_human/
http://fanfox.net/manga/kimetsu_no_yaiba/
http://fanfox.net/manga/versatile_mage/
http://fanfox.net/manga/boku_no_hero_academia/
http://fanfox.net/manga/apotheosis/
http://fanfox.net/manga/black_clover/
http://fanfox.net/manga/tensei_shitara_slime_datta_ken/
http://fanfox.net/manga/kingdom/
http://fanfox.net/manga/tate_no_yuusha_no_nariagari/
http://fanfox.net/manga/tomo_chan_wa_onna_no_ko/
http://fanfox.net/manga/goblin_slayer/
http://fanfox.net/manga/yakusoku_no_neverland/
http://fanfox.net/manga/god_of_martial_arts/
http://fanfox.net/manga/kaifuku_jutsushi_no_yarinaoshi/
http://fanfox.net/manga/re_monster/
http://fanfox.net/manga/mushoku_tensei_isekai_ittara_honki_dasu/
http://fanfox.net/manga/nanatsu_no_taizai/
http://fanfox.net/manga/battle_through_the_heavens/
http://fanfox.net/manga/shingeki_no_kyojin/
http://fanfox.net/manga/iron_ladies/
http://fanfox.net/manga/monster_musume_no_iru_nichijou/
http://fanfox.net/manga/world_s_end_harem/
http://fanfox.net/manga/bleach/
http://fanfox.net/manga/parallel_paradise/
http://fanfox.net/manga/shokugeki_no_soma/
http://fanfox.net/manga/spirit_sword_sovereign/
http://fanfox.net/manga/horimiya/
http://fanfox.net/manga/dungeon_ni_deai_o_motomeru_no_wa_machigatte_iru_darou_ka/
http://fanfox.net/manga/dr_stone/
http://fanfox.net/manga/berserk/
http://fanfox.net/manga/the_new_gate/
http://fanfox.net/manga/akatsuki_no_yona/
http://fanfox.net/manga/naruto/
http://fanfox.net/manga/overlord/
http://fanfox.net/manga/death_march_kara_hajimaru_isekai_kyousoukyoku/
http://fanfox.net/manga/tsuki_ga_michibiku_isekai_douchuu/
http://fanfox.net/manga/eternal_reverence/
http://fanfox.net/manga/minamoto_kun_monogatari/
http://fanfox.net/manga/beastars/
http://fanfox.net/manga/jujutsu_kaisen/
http://fanfox.net/manga/hajime_no_ippo/
http://fanfox.net/manga/kaguya_sama_wa_kokurasetai_tensai_tachi_no_renai_zunousen/
http://fanfox.net/manga/domestic_na_kanojo/
http://fanfox.net/manga/the_legendary_moonlight_sculptor/
http://fanfox.net/manga/the_gamer/
http://fanfox.net/manga/kumo_desu_ga_nani_ka/
http://fanfox.net/manga/bokutachi_wa_benkyou_ga_dekinai/
http://fanfox.net/manga/enen_no_shouboutai/
http://fanfox.net/manga/tsuyokute_new_saga/
http://fanfox.net/manga/fairy_tail/
http://fanfox.net/manga/komi_san_wa_komyushou_desu/
http://fanfox.net/manga/kenja_no_mago/
http://fanfox.net/manga/soul_land/
http://fanfox.net/manga/boruto_naruto_next_generations/
http://fanfox.net/manga/hunter_x_hunter/
http://fanfox.net/manga/history_s_strongest_disciple_kenichi/
http://fanfox.net/manga/phoenix_against_the_world/
http://fanfox.net/manga/lv999_no_murabito/
http://fanfox.net/manga/gate_jietai_kare_no_chi_nite_kaku_tatakeri/
http://fanfox.net/manga/kengan_asura/
http://fanfox.net/manga/konjiki_no_moji_tsukai_yuusha_yonin_ni_makikomareta_unique_cheat/
http://fanfox.net/manga/please_don_t_bully_me_nagatoro/
http://fanfox.net/manga/isekai_maou_to_shoukan_shoujo_dorei_majutsu/
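The listing above prints titles, covers and links as three separate runs. A possible way to keep them together per entry is sketched below. The class names come from the code above, but the surrounding structure (one li per entry inside the .line container) is an assumption about the page, and SAMPLE_HTML and parse_items are made up for illustration.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

BASE_URL = 'http://fanfox.net/'

# Invented markup: assumes each entry is an <li> inside the .line container.
SAMPLE_HTML = """
<ul class="line">
  <li>
    <img class="manga-list-1-cover" src="http://example.com/covers/1.jpg">
    <p class="manga-list-1-item-title"><a href="/manga/one_piece/">One Piece</a></p>
  </li>
  <li>
    <img class="manga-list-1-cover" src="http://example.com/covers/2.jpg">
    <p class="manga-list-1-item-title"><a href="/manga/naruto/">Naruto</a></p>
  </li>
</ul>
"""


def parse_items(html, base_url=BASE_URL):
    """Return one dict per entry with its title, absolute URL and cover URL."""
    soup = BeautifulSoup(html, 'html.parser')
    items = []
    for entry in soup.select('.line li'):
        link = entry.select_one('.manga-list-1-item-title a')
        cover = entry.select_one('img.manga-list-1-cover')
        if link is None:
            continue  # skip entries without a title link
        items.append({
            'title': link.get_text(strip=True),
            'url': urljoin(base_url, link['href']),
            'cover': cover['src'] if cover else None,
        })
    return items


if __name__ == '__main__':
    for item in parse_items(SAMPLE_HTML):
        print(item['title'], item['url'], item['cover'])
```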
I found that the best way (for me) to work with the .find_all()/.findAll() methods is simply to loop over the results with a for loop; the same goes for the .select() method.
In some cases .select() gives better results.
Check out SelectorGadget to quickly find a CSS selector.
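As a rough illustration of how the two styles relate, the same query can be written with .find_all() or with .select(). The markup below is invented for the example.

```python
from bs4 import BeautifulSoup

# Invented markup for illustration only.
HTML = ('<p class="t"><a href="/a">A</a></p>'
        '<p class="other"><a href="/b">B</a></p>')

soup = BeautifulSoup(HTML, 'html.parser')

# find_all: filter <p> tags by class, then step into the child <a>.
via_find_all = [p.a['href'] for p in soup.find_all('p', class_='t')]

# select: express the same path as a single CSS selector.
via_select = [a['href'] for a in soup.select('p.t > a')]

print(via_find_all, via_select)
```

Both lists contain the same href here; the CSS form tends to be shorter once the path spans several levels.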