I'm trying to scrape some data from the following website:
http://www.b3.com.br/pt_br/market-data-e-indices/servicos-de-dados/market-data/historico/renda-fixa/
It worked as expected for a while, but now it gets stuck loading the page at the driver.get(url) call below.
from selenium import webdriver

url = 'http://www.b3.com.br/pt_br/market-data-e-indices/servicos-de-dados/market-data/historico/renda-fixa/'
driver = webdriver.Chrome()
driver.get(url)
What is weird is that the page is in fact fully loaded, as I can browse through it without a problem, but Chrome keeps showing a "Connecting..." message at the bottom.
When Selenium finally gives up and raises the TimeoutException, the "Connecting..." message disappears and Chrome understands that the page is in fact fully loaded.
If I try to manually open the link in another tab, it does so in less than a second.
Is there a way I can override the built-in "wait until loaded" behavior and just move on to the next steps, since everything I need is already loaded?
http://www.b3.com.br/lumis/portal/controller/html/SetLocale.jsp?lumUserLocale=pt_BR
This link loads infinitely.
Report a bug and ask the developers to fix it.
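Since the DOM is already usable by the time the TimeoutException fires, one workaround is to set a short page load timeout yourself, swallow the exception, and cancel the remaining requests. A minimal sketch (the 10-second value is arbitrary):
from selenium import webdriver
from selenium.common.exceptions import TimeoutException

url = 'http://www.b3.com.br/pt_br/market-data-e-indices/servicos-de-dados/market-data/historico/renda-fixa/'
driver = webdriver.Chrome()
driver.set_page_load_timeout(10)  # stop waiting for the load event after 10 s
try:
    driver.get(url)
except TimeoutException:
    # The DOM is already usable; cancel whatever is still loading and move on.
    driver.execute_script("window.stop();")
# ...continue scraping here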
Related
Unfortunately, I cannot stop a page from loading using Selenium in Python.
I have tried:
driver.execute_script("window.stop();")
driver.set_page_load_timeout(10)
webdriver.ActionChains(driver).send_keys(Keys.ESCAPE).perform()
The page is a .cgi that loads constantly. I would like to scrape either data from a class on the page or the page title, but none of the three methods above makes that possible.
When I try to manually press ESC, or click the cross, it works perfectly.
Thank you for reading.
You didn't share your code or the page you are working on, so we can only guess.
So, in case you really tried all of the above correctly and it still didn't help, try adding the eager page load strategy to your driver options.
The eager page load strategy makes WebDriver wait only until the initial HTML document has been completely loaded and parsed (i.e., until the DOMContentLoaded event fires); it does not wait for stylesheets, images, and subframes.
With it, your code will look something like this:
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
options = Options()
options.page_load_strategy = 'eager'
driver = webdriver.Chrome(options=options)
# Navigate to url
driver.get(your_page_url)
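Since eager returns control before stylesheets, images, and subframes finish loading, it can help to pair it with an explicit wait for the element you actually need. A short sketch (the table selector here is just a placeholder for whatever you are scraping):
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Wait up to 20 s for the element you care about, not for the whole page.
WebDriverWait(driver, 20).until(
    EC.presence_of_element_located((By.CSS_SELECTOR, "table"))
)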
UPD
You are trying to upload a file with Selenium and doing it wrong.
To upload the file with Selenium you need to send a full file path to that element.
So, if the file you want to upload is located at C:/Model.lp, your code should be:
driver.find_element_by_xpath("//input[@name='field.1']").send_keys("C:/Model.lp")
The problem is that when I download a file through a click(), a new tab/window opens to initiate the download, and then the browser automatically closes that tab/window. However, when I try to get back to the previous page, where the download link was, I get the error "invalid session id".
I get the error "No such window exception" when using Safari for the automation.
If anyone knows how to deal with this issue, I would appreciate all the help I can get. Thank you all!
My code is below; the error comes after trying to click file_dl2:
attachments = browser.find_element_by_id('sAttachments')
attachments.click()
time.sleep(2)
files = browser.find_element_by_xpath('//*[@id="FileListFI"]/div[1]')
files.click()
file_dl = browser.find_element_by_xpath('//*[@id="ctl00_chpDialog_hrefFileLink"]/img')
file_dl.click()
browser.implicitly_wait(10)
file_dl2 = browser.find_element_by_xpath('//*[@id="ctl01_chpDialog_hrefFileLink"]/img')
file_dl2.click()
....
driver.get("<web-site>")
sleep(1)
....
This fetches the original/previous website again (add this code where you currently have the click() action).
You can append a new set of instructions after the sleep() call and work further.
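Another angle on the "no such window" error, assuming it happens because the tab Selenium was focused on got closed by the browser: remember the original window handle before the click and switch back to a surviving handle afterwards. A sketch reusing the browser and file_dl2 names from the question:
# Remember the handle of the window we started in.
original = browser.current_window_handle
file_dl2.click()  # may open (and auto-close) a download tab
time.sleep(2)     # crude wait for the download tab to come and go
# Switch back to the original window if it survived, else to any open one.
if original in browser.window_handles:
    browser.switch_to.window(original)
else:
    browser.switch_to.window(browser.window_handles[0])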
I'm trying to save a full-page screenshot of a Chrome Web Store page, using Selenium and Python 3.
I've searched online for different answers and I keep getting only the "header" part, no matter what I try, as if the page doesn't scroll to the next "section".
I tried clicking inside the page to verify it's in focus but that didn't help.
I tried answers with stitching, and imported Screenshot and Image.
My current code is:
from selenium import webdriver
from Screenshot import Screenshot_Clipping

ob = Screenshot_Clipping.Screenshot()
driver2 = webdriver.Chrome(executable_path=chromedriver)
url = "https://chrome.google.com/webstore/detail/online-game-zone-new-tab/abalcghoakdcaalbfadaacmapphamklh"
driver2.get(url)
img_url = ob.full_Screenshot(driver, save_path=r'.', image_name='Myimage.png')
print(img_url)
print('done')
driver2.close()
driver2.quit()
but that gives me a picture of just the header.
What am I doing wrong?
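One approach that avoids stitching entirely, sketched here on the assumption that headless Chrome is acceptable: resize the window to the full document height and take a single screenshot (url is the same variable as in the snippet above):
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

options = Options()
options.add_argument("--headless")  # headless mode allows very tall windows
driver = webdriver.Chrome(options=options)
driver.get(url)
# Resize the viewport to the full size of the document, then shoot once.
width = driver.execute_script("return document.body.scrollWidth")
height = driver.execute_script("return document.body.scrollHeight")
driver.set_window_size(width, height)
driver.save_screenshot("full_page.png")
driver.quit()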
A lot has been written about timeouts, Selenium, and page loads.
But almost none of it works in chromedriver.
And what does work is not exactly what I am looking for.
Note: I am not looking for set_page_load_timeout()
What do I want:
I say: driver.get("some-weird-slow-place")
chromedriver says: yes, yes... on my way
[15 sec later...] still on my way
[20 sec later...] okay sir... please do JavaScript window.stop();
But! Keep working as usual with whatever loaded elements you have.
Why I want this:
Because maybe I just want to get the url of the site and its title... and not the fancy huge background image or the crunchy brunchy punchy animated banners and multiple thousand jquery magics that are still loading.
What did I try:
driver.get(url)
driver.execute_script("setInterval(function(){ window.stop(); }, 20000);")
But it does not work, because driver.get() waits until the page is loaded before executing the script.
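One way to get roughly this behavior, assuming Selenium 4's page_load_strategy option is available: use the "none" strategy so driver.get() returns immediately, wait only for what you actually need (here, just the title), then stop the rest yourself.
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.ui import WebDriverWait

options = Options()
options.page_load_strategy = "none"  # get() returns without waiting for the load event
driver = webdriver.Chrome(options=options)
driver.get("some-weird-slow-place")
# Wait up to 20 s for just the title, then cancel everything still loading.
WebDriverWait(driver, 20).until(lambda d: d.title != "")
driver.execute_script("window.stop();")
print(driver.title, driver.current_url)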
Why do you need Selenium for this? How about https://www.crummy.com/software/BeautifulSoup/bs4/doc/
I have a pretty simple process where, within a loop, I get a URL, wait for certain elements to load using WebDriverWait, and then start processing the page source.
The issue I'm having is that, if I don't add explicit wait times after driver.get(url), WebDriver can return elements from the old page rather than the most recent page. If I add an explicit wait (for instance, 3 seconds) or call driver.get(url) twice, the issue gets resolved.
Has anyone experienced this issue, or does anyone know how to fix it?
Thanks.
Actually, I have encountered this on Firefox with geckodriver; it happens because you loaded another page by clicking an element on the old one. The driver object needs to be refreshed.
Reloading the page I needed to traverse worked for me; waiting doesn't help.
driver.get(driver.current_url)
Edit: Chrome doesn't seem to have this problem, and the driver can access elements from the clicked page directly.
It seems like a problem in the web browser. That is why, after driver.get(url), I usually check whether driver.get() was successful:
driver.get(url)
sleep(n_seconds)  # not strictly necessary
if driver.current_url == url:
    # do something
else:
    # try again to get the page, or wait longer
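Another pattern that often helps with this kind of stale-page race, offered as a general Selenium technique rather than a guaranteed fix for this case: grab a reference to an element on the old page before navigating, then wait for it to go stale, which signals that the new document has replaced it.
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

old_html = driver.find_element_by_tag_name("html")
driver.get(url)
# staleness_of fires once the old <html> node is detached from the DOM,
# i.e. once the new page has actually replaced the old one.
WebDriverWait(driver, 10).until(EC.staleness_of(old_html))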