Stop page loading Python/Selenium/PhantomJS [closed] - python

I'm trying to driver.get() this URL, but the call just hangs.
Is there a way I can stop this page from loading and then grab the html that was loaded?
I tried manipulating the page in various ways to load dynamic content but no matter what I do the page still hangs.

Set a page-load timeout with set_page_load_timeout(), then handle the resulting TimeoutException:
from __future__ import print_function
from pprint import pprint as pp
from selenium import webdriver
from selenium.common.exceptions import TimeoutException
def get_page(driver, page):
    try:
        driver.get(page)
    except TimeoutException:
        # The load timed out; keep whatever the browser has loaded so far.
        pass
def main():
    driver = webdriver.Chrome()
    try:
        # Give up on the page load after 5 seconds.
        driver.set_page_load_timeout(5)
        get_page(driver, "http://your/long/url")
        print("Returned from page get")
        pp(driver.page_source)
    finally:
        driver.quit()
if __name__ == "__main__":
    main()

Related

How do I make Selenium Driver in Python wait until .get() page loads fully to get entire page source? [duplicate]

This question already has answers here:
Wait until page is loaded with Selenium WebDriver for Python
My Selenium Driver is suddenly receiving None when requesting from a page. I want to store urlx inside dictionary data but my driver.get(x) is returning None.
self.data = {
    "google.com": page_source,
    "stackoverflow.com/....": page_source,
    etc...
}
I tried using
self.data[link] = self.driver.get(link)
try:
    self.data[link] = WebDriverWait(self.driver, 10).until(EC.presence_of_element_located(By.NAME, "?"))
finally:
    continue
but it doesn't seem to stop and wait. I also don't want to wait for a specific element; I want to wait for the entire page to finish loading.
I've also tried driver.implicitly_wait(10), but it doesn't seem to wait at all.
How do I do this?
See this answer to the question linked above. Note that driver.get() returns None; the HTML is read from driver.page_source after the call:
driver.get("https://stackoverflow.com")
pageSource = driver.page_source
fileToWrite = open("page_source.html", "w")
fileToWrite.write(pageSource)
fileToWrite.close()
fileToRead = open("page_source.html", "r")
print(fileToRead.read())
fileToRead.close()
driver.quit()
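If the goal is to wait for the whole document rather than for one element, one option is to poll document.readyState through execute_script before reading page_source. The sketch below is only an illustration under assumptions, not the linked answer's method: wait_for_page_ready and its parameters are hypothetical names, and a "complete" ready state does not guarantee that content injected later by JavaScript has arrived.

```python
import time


def wait_for_page_ready(driver, timeout=10.0, poll_interval=0.5):
    """Poll document.readyState until it is "complete" or the timeout expires.

    `driver` is any object offering Selenium's execute_script(script) method.
    Returns True if the page reported "complete" in time, otherwise False.
    """
    deadline = time.monotonic() + timeout
    while True:
        state = driver.execute_script("return document.readyState")
        if state == "complete":
            return True
        if time.monotonic() >= deadline:
            return False
        # Not ready yet: pause briefly before asking the browser again.
        time.sleep(poll_interval)
```

With a real driver this would be called as wait_for_page_ready(self.driver) right after self.driver.get(link), and self.data[link] would then be filled from self.driver.page_source.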

Guys my scrapping tool is constantly returning AttributeError: 'NoneType' object has no attribute 'text' [closed]

please help. I don't understand what is going on here...
I've tried so many different codes and websites and still get the same error all the time.
from bs4 import BeautifulSoup
from selenium import webdriver
driver = webdriver.Firefox(executable_path = 'C:/Webdriver/geckodriver.exe')
driver.get('https://www.dnes.bg/?cat=581')
results =[]
content = driver.page_source
soup = BeautifulSoup(content, features="lxml")
driver.quit()
for element in soup.findAll(attrs='b2'):
    name = element.find('ttl')
    if name not in results:
        results.append(name).text
print(results)
The Error:
Web Scrapper Studio Code\Main.py", line 15, in <module>
    results.append(name).text
AttributeError: 'NoneType' object has no attribute 'text'
list.append() modifies the list in place and returns None, so results.append(name).text tries to read .text on None:
a = []
value = a.append("x")
print(value)
None
What you probably meant to do is:
results.append(name.text)
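Putting that fix into the original loop, and additionally guarding against find() returning None when nothing matches, might look like the sketch below. collect_unique_texts is a hypothetical helper; it only assumes that each element offers a find(name) method whose result has a .text attribute, as BeautifulSoup tags do.

```python
def collect_unique_texts(elements, child_name):
    """Return the text of each element's first `child_name` child, without duplicates."""
    results = []
    for element in elements:
        child = element.find(child_name)
        if child is None:
            # find() returns None when nothing matches, so there is no .text to read.
            continue
        text = child.text
        if text not in results:
            # append() mutates the list and returns None; never chain off its result.
            results.append(text)
    return results
```

With the question's code this would be called as collect_unique_texts(soup.findAll(attrs='b2'), 'ttl').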

Scraping Flickr with Selenium/Beautiful soup in Python - ABSWP

I'm going through Automate the Boring Stuff with Python and I'm stuck at the chapter about downloading data from the internet. One of the tasks is to download photos for a given keyword from Flickr.
I have a massive problem with scraping this site. I've tried BeautifulSoup (which I think is not appropriate in this case, since the site relies on JavaScript) and Selenium. Looking at the HTML, I think that I should locate the 'overlay' class. However, no matter which option I use (find_element_by_class_name, ...by_text, ...by_partial_text), I am not able to find these elements (I get: ".
Could you please help me clarify what I'm doing wrong? I'd also be grateful for any materials that could help me understand such cases better. Thanks!
Here's my simple code:
import sys
search_keywords = sys.argv[1]
from selenium import webdriver
browser = webdriver.Firefox()
browser.get(f'https://www.flickr.com/search/?text={search_keywords}')
elems = browser.find_element_by_class_name("overlay")
print(elems)
elems.click()
Sample keywords I type in shell: "industrial design interior"
Are you getting any error message? With Selenium it's useful to surround your code in try/except blocks.
What exactly are you trying to do? Download the photos? With a bit of rewriting, I ended up with this:
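import time
from selenium import webdriver
try:
    options = webdriver.ChromeOptions()
    #options.add_argument('--headless')
    # Older Selenium releases spelled this keyword chrome_options.
    driver = webdriver.Chrome(options=options)
    search_keywords = "cars"
    driver.get(f'https://www.flickr.com/search/?text={search_keywords}')
    time.sleep(1)
except Exception as e:
    print("Error loading search results page: " + str(e))
try:
    # Newer Selenium 4 releases no longer provide find_element_by_*;
    # there, use driver.find_element("class name", "overlay") instead.
    elems = driver.find_element_by_class_name("overlay")
    print(elems)
    elems.click()
    time.sleep(5)
except Exception as e:
    print(str(e))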
It loads the page as expected and then clicks on the photo, taking us to that photo's page.
I would be able to help more if you could go into more detail of what you're wanting to accomplish.
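One detail worth checking in both snippets is how the keywords reach the URL: a phrase such as "industrial design interior" contains spaces, and sys.argv[1] only holds the whole phrase if it was quoted in the shell. A minimal sketch of building the search URL with the standard library follows. build_search_url is a hypothetical helper, and whether Flickr actually needs the encoding is an assumption, since browsers often tolerate raw spaces.

```python
import sys
from urllib.parse import quote_plus


def build_search_url(words):
    """Join the keyword words and percent-encode them for the query string."""
    keywords = " ".join(words)
    return "https://www.flickr.com/search/?text=" + quote_plus(keywords)


if __name__ == "__main__":
    # Accept the keywords quoted ("industrial design interior") or unquoted.
    print(build_search_url(sys.argv[1:]))
```

The resulting string can be passed straight to browser.get() in place of the f-string.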

Using Python (Selenium) to read, save, and then input a value from Web Page [closed]

We are pretty new to using Python and Selenium. So please bear with us. As part of our effort on test automation for our website, we have used a helper class to show the Captcha used during new customer registration (works great). We are now trying to read that value, save it to memory (a string) and then input that saved value in the correct page element. This is something entirely new for us and we're at a loss. Here is what we have so far.
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import Select
from selenium.common.exceptions import NoSuchElementException
import unittest, time, re
import urllib, urllib2
class NewAccountTests(unittest.TestCase):
    def setUp(self):
        self.driver = webdriver.Firefox()
        self.driver.implicitly_wait(5)
        self.base_url = "http://www.test.com/"
        self.verificationErrors = []
        self.accept_next_alert = True
        response = urllib.urlopen("http://www.test.com/")
        htmlSource = sock.read(id = "captcha")
        var.captcha = "htmlSource"
        sock.close()
    #SIGN UP NEW USER
    def test_00_sign_up(self):
        driver = self.driver
        driver.get(self.base_url + "/")
        driver.find_element_by_id("name").send_keys("Foo")
        driver.find_element_by_id("email").send_keys("test@me.com")
        driver.find_element_by_id("screenname").send_keys("1234")
        driver.find_element_by_id("password").send_keys("xxx")
        driver.find_element_by_id("password2").send_keys("xxx")
        driver.find_element_by_id("option1").click()
        driver.find_element_by_id("option2").click()
        driver.find_element_by_id("captcha").click()
        # >> I don't know how to send the var string to the element "captcha"
        driver.find_element_by_id("registration_button").click()
I am certain someone knows this all too easily, so any assistance would be greatly appreciated.
I think you need to use send_keys():
send_keys(*value)
Simulates typing into the element.
driver.find_element_by_id("captcha").send_keys("test")
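For the read-then-type step itself, a minimal sketch might look like this. It assumes the helper class renders the captcha text in an element whose id is captcha_text, which is a hypothetical name not given in the question; copy_captcha is likewise only illustrative. The locator strategy is passed as the plain string "id", which is the value of Selenium's By.ID constant.

```python
def copy_captcha(driver, source_id="captcha_text", target_id="captcha"):
    """Read the displayed captcha value and type it into the input field.

    `source_id` is a hypothetical element id; adjust it to the real page.
    """
    # Read the value and keep it in an ordinary Python string.
    captcha_value = driver.find_element("id", source_id).text.strip()
    # Type the saved string into the captcha input.
    field = driver.find_element("id", target_id)
    field.clear()
    field.send_keys(captcha_value)
    return captcha_value
```

In the test above, copy_captcha(driver) would replace the click on the "captcha" element, just before the registration button is clicked.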

How can I get html content written by JavaScript with Selenium/Python [duplicate]

This question already has answers here:
Get HTML source of WebElement in Selenium WebDriver using Python
I'm doing web crawling with Selenium, and I want to get an element (such as a link) that JavaScript writes into the page after Selenium simulates a click on a fake link.
I tried get_html_source(), but it doesn't include the content written by JavaScript.
Code I've written:
def test_comment_url_fetch(self):
    sel = self.selenium
    sel.open("/rmrb")
    url = sel.get_location()
    #print url
    if url.startswith('http://login'):
        sel.open("/rmrb")
    i = 1
    while True:
        try:
            if i == 1:
                sel.click("//div[@class='WB_feed_type SW_fun S_line2']/div/div/div[3]/div/a[4]")
                print "click"
            else:
                XPath = "//div[@class='WB_feed_type SW_fun S_line2'][%d]/div/div/div[3]/div/a[4]"%i
                sel.click(XPath)
                print "click"
        except Exception, e:
            print e
            break
        i += 1
    html = sel.get_html_source()
    html_file = open("tmp\\foo.html", 'w')
    html_file.write(html.encode('utf-8'))
    html_file.close()
I use a while loop to click a series of fake links that trigger JavaScript actions to show extra content, and that content is what I want. But sel.get_html_source() didn't return it.
Can anybody help? Thanks a lot.
Since I usually do post-processing on the fetched nodes I run JavaScript directly in the browser with execute_script. For example to get all a-tags:
js_code = "return document.getElementsByTagName('a')"
your_elements = sel.execute_script(js_code)
Edit: execute_script and get_eval are equivalent, except that get_eval performs an implicit return, whereas in execute_script the return has to be stated explicitly.
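The same call can also return the rendered markup itself. As a hedged sketch, assuming a WebDriver-style object with execute_script (not the older Selenium RC object used in the question), the current DOM, including nodes added by JavaScript after the clicks, could be serialized like this. rendered_html and link_targets are hypothetical helper names.

```python
def rendered_html(driver):
    """Return the current DOM serialized as HTML, including script-generated nodes."""
    return driver.execute_script("return document.documentElement.outerHTML")


def link_targets(driver):
    """Return the href of every <a> element currently in the DOM."""
    script = (
        "return Array.prototype.map.call("
        "document.getElementsByTagName('a'), "
        "function (a) { return a.href; });"
    )
    return driver.execute_script(script)
```

Returning plain strings rather than element handles keeps the results usable after the page changes or the browser is closed.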
Can't you just call the browser object inside your selenium environment? For example:
self.browser.find_elements_by_tag_name("div")
Should return you an array of divs. You can also find by class, id, and so on.
Edit: Below is the code to create your 'browser' object.
from selenium import webdriver
self.browser = webdriver.Firefox() # The browser object. I use Firefox, but Chrome, IE, and Safari should work too, I believe
Then you should be able to do as shown above with the find_elements_by_tag_name.
You would need to use a browser engine which can execute Javascript, such as PhantomJS. Javascript's changes are only visible to clients which can execute Javascript and provide a DOM/Runtime for events to be fired.
Also very close in relation to: Executing Javascript from Python
