Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 14 days ago.
This post was edited and submitted for review 7 days ago.
I am using Selenium to scrape some Booking.com pages. Sometimes when I request a specific page it works fine, but other times the browser ends up at the wrong URL.
You can use the following code to reproduce it:
    from selenium import webdriver


    class TestURLGet:
        def __init__(self):
            options = webdriver.ChromeOptions()
            # Note: executable_path is deprecated in Selenium 4 in favor of a Service object.
            self.driver = webdriver.Chrome(options=options, executable_path='./../SeleniumDrivers/chromedriver')
            # Hide navigator.webdriver from page scripts on every new document.
            self.driver.execute_cdp_cmd("Page.addScriptToEvaluateOnNewDocument", {
                "source": """
                    Object.defineProperty(navigator, 'webdriver', {
                        get: () => undefined
                    })
                """
            })
            self.driver.execute_cdp_cmd("Network.enable", {})

        def getURL(self, url):
            self.driver.get(url)  # The browser navigates to this address
            return 0


    if __name__ == "__main__":  # This test runs when the script is executed on its own
        url = "https://www.booking.com/hotel/es/ruralgest-posada-rural-el-ajillo.es.html?aid=311090&label=ruralgest-posada-rural-el-ajillo-lUn4JdtwSBPwZTAJ%2AL59kwS575078511800%3Apl%3Ata%3Ap1%3Ap2%3Aac%3Aap%3Aneg%3Afi%3Atikwd-370732725344%3Alp9061047%3Ali%3Adec%3Adm&sid=e9ba4a6f7629809d9c3161e9671bbd73&dest_id=-380385;dest_type=city;dist=0;group_adults=2;group_children=0;hapos=1;hpos=1;no_rooms=1;req_adults=2;req_children=0;room1=A%2CA;sb_price_type=total;sr_order=popularity;srepoch=1675340918;srpvid=efba57ba41c50076;type=total;ucfs=1&#tab-reviews"
        test = TestURLGet()
        test.getURL(url)
If you manually paste the URL into a new tab of the Chrome session that Selenium opened, it works fine.
Thanks a lot.
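To pin down when the browser lands on the wrong page, it may help to compare the requested URL with `driver.current_url` once `get()` returns. The following is only a sketch using the standard library: `same_page` is a hypothetical helper, and treating two URLs as "the same page" when host and path match (ignoring query string and fragment) is an assumption.

```python
from urllib.parse import urlsplit


def same_page(requested, actual):
    """Return True if both URLs point at the same host and path.

    Query string and fragment are ignored on purpose, since sites
    often rewrite tracking parameters (an assumption of this sketch).
    """
    req, act = urlsplit(requested), urlsplit(actual)
    return (req.netloc.lower(), req.path) == (act.netloc.lower(), act.path)
```

Inside `getURL`, a check such as `same_page(url, self.driver.current_url)` right after `self.driver.get(url)` would show on which runs a redirect happened.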
How can I accept a dialog using Python Playwright? I have already tried the code below, but it doesn't seem to work for me. Any other solution would be appreciated. Thanks.
    from playwright.sync_api import sync_playwright


    def handle_dialog(dialog):
        print(dialog.message)
        dialog.dismiss()


    def run(playwright):
        chromium = playwright.chromium
        browser = chromium.launch()
        page = browser.new_page()
        page.on("dialog", handle_dialog)
        page.evaluate("alert('1')")
        browser.close()


    with sync_playwright() as playwright:
        run(playwright)
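Note that the handler above calls `dialog.dismiss()`, which cancels the dialog rather than accepting it. A minimal sketch of an accepting handler follows; `accept_dialog` is a hypothetical name, and it relies only on the `message` property and `accept()` method of Playwright's `Dialog` object.

```python
def accept_dialog(dialog):
    """Print the dialog text, then accept it (the equivalent of clicking OK)."""
    print(dialog.message)
    dialog.accept()
```

It would be registered the same way as before, with `page.on("dialog", accept_dialog)`, and the registration has to happen before the action that opens the dialog.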
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 5 months ago.
The site shows the following check (image: Check):
How can I get data from such a site?
As I already wrote in the comments, try the selenium library, which drives a real browser.
Before starting, install selenium and webdriver_manager (which makes working with drivers easier):
pip install -U selenium webdriver-manager
Here is example code for Chrome that should work for most sites:
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service
    from webdriver_manager.chrome import ChromeDriverManager  # downloads a matching ChromeDriver (swap for your browser)
    import time

    URL = 'YOUR LINK'

    # Note: this dict is never passed to the driver below; Chrome sends its own headers.
    headers = {
        "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36",
        "Accept": "text/html,application/xhtml+xml,application/xml; q=0.9,image/webp,image/apng,*/*;q=0.8"
    }

    # open the page and grab its HTML
    options = webdriver.ChromeOptions()
    # Selenium 4 takes the driver path through a Service object instead of executable_path
    driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()), options=options)
    driver.get(URL)
    time.sleep(6)  # wait 6 seconds so the site can finish loading
    html = driver.page_source
    print(html)  # print the HTML

    # save the HTML to a file
    with open('saving.html', 'wb+') as f:
        f.write(str.encode(html))

    driver.quit()  # close the browser and end the driver session
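A fixed `time.sleep(6)` either waits too long or not long enough. Selenium ships `WebDriverWait` for this purpose; the sketch below shows the same polling idea in plain Python so the logic is visible. `wait_until_loaded` is a hypothetical helper, and using `document.readyState == "complete"` as the "loaded" signal is an assumption that may not hold for pages that keep rendering content from JavaScript afterwards.

```python
import time


def wait_until_loaded(driver, timeout=10.0, poll=0.25):
    """Poll document.readyState until it is 'complete' or the timeout expires.

    Returns True if the page reported 'complete' in time, False otherwise.
    `driver` only needs an execute_script() method, as Selenium drivers have.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if driver.execute_script("return document.readyState") == "complete":
            return True
        time.sleep(poll)
    return False
```

In the script above, `wait_until_loaded(driver)` could take the place of the fixed sleep before reading `driver.page_source`.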
Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 12 months ago.
Please help. I don't understand what is going on here.
I've tried many different code variations and websites and still get the same error every time.
    from bs4 import BeautifulSoup
    from selenium import webdriver

    driver = webdriver.Firefox(executable_path = 'C:/Webdriver/geckodriver.exe')
    driver.get('https://www.dnes.bg/?cat=581')
    results =[]
    content = driver.page_source
    soup = BeautifulSoup(content, features="lxml")
    driver.quit()

    for element in soup.findAll(attrs='b2'):
        name = element.find('ttl')
        if name not in results:
            results.append(name).text

    print(results)
The Error:
Web Scrapper Studio Code\Main.py", line 15, in
results.append(name).text
AttributeError: 'NoneType' object has no attribute 'text'
Appending to a list returns None:
a = []
value = a.append("x")
print(value)
None
What you probably meant to do is:
results.append(name.text)
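Even with that fix, `element.find(...)` returns None when nothing matches, so `name.text` can still raise the same kind of error. A hedged sketch of the loop with that guard added; `collect_titles` is a hypothetical helper, and comparing the text (rather than the tag object) for duplicates is an assumption about the intent.

```python
def collect_titles(elements, tag="ttl"):
    """Collect the unique text of the first `tag` child of each element.

    Works with BeautifulSoup tags or any object whose find() returns
    either None or something with a .text attribute.
    """
    results = []
    for element in elements:
        name = element.find(tag)
        if name is None:
            continue  # no matching child: nothing to read
        if name.text not in results:
            results.append(name.text)
    return results
```

With the question's code this would be called as `collect_titles(soup.findAll(attrs='b2'))`.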
Closed. This question needs debugging details. It is not currently accepting answers.
Closed 6 years ago.
I am attempting to driver.get() this URL, but it just hangs.
Is there a way I can stop this page from loading and then grab the html that was loaded?
I tried manipulating the page in various ways to load dynamic content but no matter what I do the page still hangs.
Call set_page_load_timeout() on the driver, then handle the TimeoutException:
    from __future__ import print_function
    from selenium import webdriver
    from selenium.common.exceptions import TimeoutException


    def get_page(driver, page):
        try:
            driver.get(page)
        except TimeoutException:
            pass


    def main():
        driver = webdriver.Chrome()
        try:
            driver.set_page_load_timeout(5)
            get_page(driver, "http://your/long/url")
            print("Returned from page get")
            from pprint import pprint as pp
            pp(driver.page_source)
        finally:
            driver.quit()


    if __name__ == "__main__":
        main()
Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 9 years ago.
We are pretty new to Python and Selenium, so please bear with us. As part of our test automation effort for our website, we have used a helper class to show the Captcha used during new customer registration (works great). We are now trying to read that value, save it in memory (as a string), and then type the saved value into the correct page element. This is entirely new for us and we're at a loss. Here is what we have so far.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import Select
    from selenium.common.exceptions import NoSuchElementException
    import unittest, time, re
    import urllib, urllib2


    class NewAccountTests(unittest.TestCase):
        def setUp(self):
            self.driver = webdriver.Firefox()
            self.driver.implicitly_wait(5)
            self.base_url = "http://www.test.com/"
            self.verificationErrors = []
            self.accept_next_alert = True
            response = urllib.urlopen("http://www.test.com/")
            htmlSource = sock.read(id = "captcha")
            var.captcha = "htmlSource"
            sock.close()

        # SIGN UP NEW USER
        def test_00_sign_up(self):
            driver = self.driver
            driver.get(self.base_url + "/")
            driver.find_element_by_id("name").send_keys("Foo")
            driver.find_element_by_id("email").send_keys("test#me.com")
            driver.find_element_by_id("screenname").send_keys("1234")
            driver.find_element_by_id("password").send_keys("xxx")
            driver.find_element_by_id("password2").send_keys("xxx")
            driver.find_element_by_id("option1").click()
            driver.find_element_by_id("option2").click()
            driver.find_element_by_id("captcha").click()
            # >> I don't know how to send the var string to the element "captcha"
            driver.find_element_by_id("registration_button").click()
I am certain someone knows this all too easily, so any assistance would be greatly appreciated.
I think you need to use send_keys():
send_keys(*value)
Simulates typing into the element.
driver.find_element_by_id("captcha").send_keys("test")
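To type the value that is actually shown on the page instead of a fixed string, the text can be read from one element and passed to `send_keys()` on another. This is only a sketch: `copy_text_to_field` is a hypothetical helper, the element id `captcha_value` in the usage note is invented for illustration, and it assumes the helper class renders the captcha as plain text in the page. It uses `find_element(by, value)`, since recent Selenium 4 releases dropped the `find_element_by_*` shortcuts.

```python
def copy_text_to_field(driver, source_id, target_id):
    """Read the visible text of one element and type it into another.

    "id" is the locator strategy string that selenium's By.ID stands for,
    so no selenium import is needed in this helper.
    """
    value = driver.find_element("id", source_id).text
    field = driver.find_element("id", target_id)
    field.clear()            # remove anything already typed
    field.send_keys(value)   # simulate typing the captured text
    return value
```

In the test this might be called as `copy_text_to_field(driver, "captcha_value", "captcha")` just before clicking the registration button.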