from urllib.request import urlopen
from bs4 import BeautifulSoup
import random
google = ""
page = urlopen(google).read()
soup = BeautifulSoup(page, "html.parser")
img_tags = soup.find_all('img')
Now I'm stuck.
I'm trying to download all the images from the link. Can someone help me?
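One possible way to continue from `img_tags`, as a rough sketch rather than a definitive answer: resolve each `src` against the page URL and write the bytes to disk. The helper names `image_urls` and `download_images`, the `images` folder, and the example.com markup are all made up for illustration. Only the URL-collecting part is exercised here, so the sketch runs without network access.

```python
import os
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen

from bs4 import BeautifulSoup


def image_urls(html, base_url):
    """Return an absolute URL for every <img> tag that has a src attribute."""
    soup = BeautifulSoup(html, "html.parser")
    # src=True skips <img> tags without a src; urljoin resolves relative paths.
    return [urljoin(base_url, tag["src"]) for tag in soup.find_all("img", src=True)]


def download_images(page_url, folder="images"):
    """Fetch page_url and save every image it references into folder."""
    os.makedirs(folder, exist_ok=True)
    html = urlopen(page_url).read()
    for index, url in enumerate(image_urls(html, page_url)):
        # Fall back to a numbered name when the URL path has no file name.
        name = os.path.basename(urlparse(url).path) or f"image_{index}"
        with urlopen(url) as response, open(os.path.join(folder, name), "wb") as out:
            out.write(response.read())


# Offline check with made-up markup; example.com is a placeholder domain.
sample_html = '<img src="/a.png"><img src="https://cdn.example.com/b.jpg"><img alt="no src">'
print(image_urls(sample_html, "https://example.com/gallery/"))
```

Calling `download_images(google)` with a real page URL would then attempt the actual downloads. Some sites reject requests without a browser-like User-Agent header, so that part may need adjusting.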
This is the code I used to get the data from a website that lists all the possible Wordle words. I'm trying to put them in a list so I can create a Wordle clone, but I get a weird output when I do this. Please help.
import requests
from bs4 import BeautifulSoup
url = ""
page = requests.get(url)
soup = BeautifulSoup(page.content, 'html.parser')
word_list = list(soup)
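You don't need BeautifulSoup here; simply split the text of the response:
import requests
url = ""
# split() with no arguments splits on any whitespace, including newlines
word_list = requests.get(url).text.split()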
Or, if you would like to do it with BeautifulSoup anyway:
import requests
from bs4 import BeautifulSoup
url = ""
page = requests.get(url)
soup = BeautifulSoup(page.content, 'html.parser')
# take the text of the parsed document and split it into words
word_list = soup.get_text().split()
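For a Wordle clone it may also help to clean the list after splitting. The following is only a sketch under the assumption that the page is plain text with one word per line or otherwise whitespace-separated. `parse_words` and the sample text are hypothetical and not taken from the site.

```python
def parse_words(text, length=5):
    """Split whitespace-separated text into lowercase words of a given length."""
    words = []
    for token in text.split():
        word = token.lower()
        # Keep only purely alphabetic tokens of the requested length.
        if len(word) == length and word.isalpha():
            words.append(word)
    return words


# Made-up sample standing in for the response text.
sample_text = "aback\nabase\nABATE\ntoolong\n12345\n"
word_list = parse_words(sample_text)
print(word_list)  # ['aback', 'abase', 'abate']
```

With a real response this would be called as `parse_words(page.text)`.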
Hey, I can't seem to scrape the images from this website.
I am using the following code:
product.find('img', {'class': 'css-1fxh5tw product-card__hero-image'})['src']
It returns this
Your code was not wrong. I was able to extract the images like this:
import requests
from bs4 import BeautifulSoup
url =""
response = requests.get(url)
soup = BeautifulSoup(response.text, 'lxml')
images = soup.find_all('img', {'class':'css-1fxh5tw product-card__hero-image'},src=True)
for i in images:
    # skip inline placeholder images embedded as data: URIs
    if 'data:image' not in i['src']:
        print(i['src'])
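To see why the `data:image` check matters, here is a small offline sketch. The markup is invented for illustration (only the class string comes from the question), and `hero_image_urls` is a hypothetical helper. It assumes some `<img>` tags carry an inline `data:` placeholder in `src` instead of a real URL.

```python
from bs4 import BeautifulSoup

# Invented markup: one placeholder image, one real image, one unrelated image.
sample_html = """
<img class="css-1fxh5tw product-card__hero-image" src="data:image/gif;base64,R0lGOD">
<img class="css-1fxh5tw product-card__hero-image" src="https://example.com/shoe.jpg">
<img class="logo" src="https://example.com/logo.png">
"""


def hero_image_urls(html):
    """Return the src of each hero image, skipping inline data: placeholders."""
    soup = BeautifulSoup(html, "html.parser")
    images = soup.find_all(
        "img", {"class": "css-1fxh5tw product-card__hero-image"}, src=True
    )
    return [img["src"] for img in images if not img["src"].startswith("data:image")]


print(hero_image_urls(sample_html))  # ['https://example.com/shoe.jpg']
```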
I am trying to download PDF files from this website.
I am new to Python and am currently learning the language. I have downloaded packages such as urllib and bs4. However, none of the URLs has a .pdf extension. Instead, each one has the following format: {.....}.
I tried using soup.find_all, but without success.
from urllib import request
from bs4 import BeautifulSoup
import re
import os
import urllib
response = request.urlopen(url).read()
soup= BeautifulSoup(response, "html.parser")
links = soup.find_all('a', href=re.compile(r'('))
This works for me:
import re
import requests
from bs4 import BeautifulSoup
url = ""
response = requests.get(url).content
soup = BeautifulSoup(response, "html.parser")
links = soup.find_all('a', href=re.compile(r'('))
links = [l['href'] for l in links]
The only differences are that I use requests, because I'm used to it, and that I take the href attribute of each Tag returned by BeautifulSoup.
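Since the real URL pattern is not shown here, the following is just a sketch of the same idea with a made-up pattern (`/download/` followed by digits) and made-up markup. `matching_links` and `is_pdf` are hypothetical helpers. Because the links have no .pdf extension, one option is to check the downloaded bytes instead: PDF files start with the marker `%PDF-`.

```python
import re
from urllib.parse import urljoin

from bs4 import BeautifulSoup


def matching_links(html, base_url, pattern):
    """Return absolute URLs of <a> tags whose href matches the regex pattern."""
    soup = BeautifulSoup(html, "html.parser")
    tags = soup.find_all("a", href=re.compile(pattern))
    return [urljoin(base_url, tag["href"]) for tag in tags]


def is_pdf(data):
    """Check the file signature rather than relying on the URL extension."""
    return data.startswith(b"%PDF-")


# Invented markup and pattern, for illustration only.
sample_html = '<a href="/files/download/123">Report</a><a href="/about">About</a>'
links = matching_links(sample_html, "https://example.com/reports", r"/download/\d+")
print(links)  # ['https://example.com/files/download/123']
```

Each link could then be fetched with `requests.get(link).content` and written to a `.pdf` file when `is_pdf` returns True.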
Please let me know how to find elements by class in the HTML returned from this URL:
from bs4 import BeautifulSoup as bs
import requests
r = requests.get('')
soup = bs(r.text, 'lxml')
soup.findAll('div', {'class': 'metadata-block validate'})
The result is an empty list.
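One thing worth checking, shown here as a sketch on invented markup: a class string containing a space is matched against the exact value of the `class` attribute, so a different order or an extra class makes the lookup miss. A CSS selector matches the classes individually.

```python
from bs4 import BeautifulSoup

# Invented markup: same two classes, but the second div has a different
# order and an extra class.
sample_html = """
<div class="metadata-block validate">first</div>
<div class="validate metadata-block extra">second</div>
"""
soup = BeautifulSoup(sample_html, "html.parser")

# Exact string match on the class attribute: finds only the first div.
exact = soup.find_all("div", {"class": "metadata-block validate"})

# CSS selector: finds every div that has both classes, in any order.
by_selector = soup.select("div.metadata-block.validate")

print(len(exact), len(by_selector))  # 1 2
```

If both lookups come back empty on the real page, it is possible that the element is added by JavaScript after the page loads, in which case it would not be in the HTML that requests receives. Printing `r.text` and searching for the class name is a quick way to check.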
I am trying to get the HTML source of a web page using BeautifulSoup.
import bs4 as bs
import requests
import urllib.request
I want the HTML source of the page. This is what I am getting now:
'"siteSettings", {"title":"PakWheels Forums","contact_email":"","contact_url":"","logo_url":"","logo_small_url":"/images/d-logo-sketch-small.png","mobile_logo_url":"'
Have a look at this code:
from urllib import request
from bs4 import BeautifulSoup
url_1 = ""
page = request.urlopen(url_1)
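# name the parser explicitly to avoid the "no parser was explicitly specified" warning
soup = BeautifulSoup(page, "html.parser")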
Import everything you need correctly. Read this.
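Once the soup is built, the HTML source is available as a string. A minimal offline sketch, using an invented page in place of the real response:

```python
from bs4 import BeautifulSoup

# Invented page content standing in for what urlopen(url_1) would return.
sample_page = b"<html><head><title>Forum</title></head><body><p>Hello</p></body></html>"
soup = BeautifulSoup(sample_page, "html.parser")

html_source = str(soup)      # the parsed document serialized back to HTML
pretty = soup.prettify()     # the same document, indented for reading
title = soup.title.string    # text of the <title> tag

print(title)
print(pretty)
```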