Scrapy PhantomJs slow linux - python

I'm trying to scrape content from this page on my Linux machine. I want to display the full list of wines by clicking the "show more" button [around 600] until no more "show more" buttons appear. I'm using Selenium and PhantomJS to handle the JavaScript, and time.sleep() so that after each click the script pauses briefly until the next button appears. The problem is that the program clicks the button quickly at first, but after about 100-150 clicks the time taken to detect the button increases sharply and each click takes far too long. Below is the code that detects the "show more" button and clicks it.
def parse(self, response):
    sel = Selector(self.driver.get(response.url))
    self.driver.get(response.url)
    click = self.driver.find_elements_by_xpath("//*[@id='btn-more-wines']")
    try:
        while click:
            click[0].click()
            time.sleep(2)
    except Exception:
        print 'no clicks'

An Explicit Wait (instead of time.sleep()) can make a positive impact here:
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
wait = WebDriverWait(driver, 10)
click = wait.until(EC.element_to_be_clickable((By.XPATH, "//*[@id='btn-more-wines']")))
This would basically wait for the "Show More" button to become clickable.
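As a rough sketch of how that wait could drive the whole loop (Python 3; click_until_gone and make_waiter are hypothetical helper names, not part of Selenium, and the 10 second timeout is an assumption), the button is located again before every click and the loop stops once the wait times out:

```python
def click_until_gone(wait_for_button, max_clicks=1000):
    """Click the button until wait_for_button() returns None.

    wait_for_button is a callable that returns a clickable element,
    or None when the button did not show up in time.
    Returns the number of clicks performed.
    """
    clicks = 0
    while clicks < max_clicks:
        button = wait_for_button()  # look the button up again on every pass
        if button is None:
            break
        button.click()
        clicks += 1
    return clicks


def make_waiter(driver, timeout=10):
    """Build a wait_for_button callable backed by an explicit wait."""
    # Imported here so click_until_gone can be tried without Selenium installed.
    from selenium.common.exceptions import TimeoutException
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    wait = WebDriverWait(driver, timeout)
    locator = (By.XPATH, "//*[@id='btn-more-wines']")

    def wait_for_button():
        try:
            return wait.until(EC.element_to_be_clickable(locator))
        except TimeoutException:
            return None  # no button appeared: treat as "all wines loaded"

    return wait_for_button
```

Inside the spider this would be called as click_until_gone(make_waiter(self.driver)). Whether it removes the slowdown after 100-150 clicks is not certain, since that may come from the growing page itself rather than from the waiting strategy.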
Another possible improvement is switching to Chrome running headless with a virtual display; see:
How do I run Selenium in Xvfb?

Related

Timeout to a function Python on Windows

I use Selenium with Chrome WebDriver in mobile mode. I have an element that I need to click.
When I click it, a pop-up becomes visible, which is what I want, but the click call never returns.
It's stuck and doesn't finish.
Also, clicking the element does not load a new screen in the browser; it only makes an object visible, which is why the click call gets no response and doesn't finish.
I want to wrap this click call in a timeout so that it finishes after a couple of seconds.
This must run on Windows, and I cannot start a new process or thread for it; I just need the call to finish so the code can continue.
This is an idea of the code:
from selenium import webdriver
driver= webdriver.Chrome()
driver.get('https://webstie.com')
toggle_to_mobile() # THIS IS PRIVATE METHOD
element = driver.find_element_by_name("my_element")
element.click() # THIS IS STUCK
.
.
Thanks!

Python Selenium webdriver.Remote won't open a new tab

I am having an issue where every time I do
driver.execute_script("arguments[0].click();", element)
or
driver.execute_script("window.open('_blank');")
the browser blocks the pop-up and prevents the new tab from opening, but when I use this
element.click()
it opens a new tab without any issue. The problem is that I am using webdriver.Remote with a multilogin platform to automate some tasks; the execute_script method is very quick, whereas element.click() takes forever.
What is the difference between them, and why won't a new tab open with execute_script? I have tried this:
options = webdriver.ChromeOptions()
options.set_capability("profile.default_content_settings.popups", 0)
driver = webdriver.Remote(link, options=options)
I also added the same setting through experimental options and prefs, but nothing works.
Thanks for your time; any input is appreciated.

How to keep waiting to click a button while its XPath is not found in the web browser, using Python? I don't want to use the sleep function

I am writing a Python program that drives a web browser. My internet connection is unreliable: when it is slow, the program fails with an error (XPath not found) and stops. I am also using the sleep function.
How can I create a while loop around the XPath lookup?
If there are other methods, please explain them.
This is what I have so far:
browser.find_element_by_xpath('//*[@id="react-root"]/section/nav/div[2]/div/div/div[2]/div[1]/div/span').click()
time.sleep(3)
What I want:
when the internet is slow, the program should wait for the XPath to be found and then click.
Try using an explicit wait.
See this link: https://www.selenium.dev/documentation/en/webdriver/waits/#explicit-wait
You can and should use WebDriverWait.
It was introduced for exactly this purpose.
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
wait = WebDriverWait(driver, 20)
wait.until(EC.visibility_of_element_located((By.XPATH, '//*[@id="react-root"]/section/nav/div[2]/div/div/div[2]/div[1]/div/span'))).click()
You should also use better locators. A long positional path such as
//*[@id="react-root"]/section/nav/div[2]/div/div/div[2]/div[1]/div/span
is fragile and will break as soon as the page layout changes.
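To address the "while loop" part of the question directly: an explicit wait is essentially a polling loop with a deadline. A minimal pure-Python sketch of that idea follows; wait_for is a hypothetical helper written for illustration, and in real code WebDriverWait already does this for you.

```python
import time


def wait_for(find, timeout=20.0, poll=0.5, ignored=(Exception,)):
    """Call find() repeatedly until it succeeds or timeout seconds pass.

    find is any callable that returns the element or raises while the
    element is not there yet. Exceptions listed in ignored are swallowed
    until the deadline, after which TimeoutError is raised.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            return find()
        except ignored:
            if time.monotonic() >= deadline:
                raise TimeoutError("element not found within %.1f s" % timeout)
            time.sleep(poll)  # short pause between attempts, not a fixed wait
```

With Selenium, find would be something like lambda: browser.find_element(By.XPATH, your_xpath), and you would call .click() on the returned element. Unlike a fixed time.sleep(3), the loop returns as soon as the element is found and only waits longer when the connection is slow.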

Possible to show the animation of scrolling on Selenium Python

I have seen posts on using the sleep function before scrolling to the element, but I was wondering if it's possible to actually show the process of scrolling instead of hard cutting down to the element. I want the Python script to scroll as if I were using a mouse to scroll through the page.
So far I have the scroll line:
driver.implicitly_wait(5)
driver.execute_script("window.scrollTo(0, 1000)")
which works, but just hard cuts down the page.
The answer was a lot simpler than I thought.
from selenium import webdriver
path = r'D:\coding\chromedriver.exe'
driver = webdriver.Chrome(path)
driver.get('https://selenium-python.readthedocs.io/getting-started.html#simple-usage')  # any website can go here
for i in range(1000):
    # scroll one pixel at a time so the movement is visible
    driver.execute_script("window.scrollBy(0, 1)")
This seems to work perfectly. The only downside is that you cannot directly specify how long the scrolling takes, only the number of pixels.
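If the duration matters, one possible variation is to pause between the small scroll steps. The sketch below is an illustration under assumptions: smooth_scroll and its parameters are hypothetical names, and the real elapsed time will be somewhat longer than duration because each execute_script call takes time too.

```python
import time


def smooth_scroll(driver, distance, duration, steps=100, sleep=time.sleep):
    """Scroll down by distance pixels over roughly duration seconds.

    The scroll is split into equal steps, with a pause after each one.
    sleep is injectable so the function can be exercised without waiting.
    """
    step_px = distance / steps
    delay = duration / steps
    for _ in range(steps):
        # arguments[0] is the value passed after the script string
        driver.execute_script("window.scrollBy(0, arguments[0]);", step_px)
        sleep(delay)
```

For example, smooth_scroll(driver, 1000, 3) scrolls 1000 pixels in about three seconds; raising steps makes the motion finer at the cost of more calls to the browser.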

Selenium - How to adjust the mouse speed when moving a slider?

I've got this code to bypass captcha basically:
#!/usr/bin/python
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.action_chains import ActionChains
import time
import sys
try:
    driver = webdriver.Chrome()
    driver.get(sys.argv[1])
    time.sleep(2)
    slider = driver.find_element_by_id('nc_2_n1z')
    move = ActionChains(driver)
    move.click_and_hold(slider).move_by_offset(400, 0).release().perform()
    time.sleep(5)
    driver.close()
except:
    pass
Everything works, but when I execute this code it moves the slider very fast (probably in less than 1 second), so I can't get past the "Slide to verify" captcha. I want the move to take 3-5 seconds from start to finish so that it looks more like a human moving the slider. Is it possible to adjust the speed of the slider movement?
You can try splitting this line:
move.click_and_hold(slider).move_by_offset(400, 0).release().perform()
Click and hold for the desired number of seconds, then move and release:
move.click_and_hold(slider).perform()
time.sleep(2)
move.move_by_offset(400, 0).release().perform()
However, I am not sure this will get past the captcha, as most modern captchas can detect that a script is running.
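Note that holding before the move does not slow the move itself. A different approach, sketched here under assumptions, is to break the drag into small offsets with a pause after each one. split_offset and slow_drag are hypothetical helper names; click_and_hold, move_by_offset, pause, release and perform are ActionChains methods. The driver may add its own time to every pointer move, so the total can exceed duration.

```python
def split_offset(total, steps):
    """Split total pixels into steps integer offsets that sum to total."""
    base, extra = divmod(total, steps)
    # the first `extra` steps carry one additional pixel
    return [base + 1 if i < extra else base for i in range(steps)]


def slow_drag(driver, element, total_x, duration=3.0, steps=20):
    """Drag element horizontally by total_x pixels in small, paused steps."""
    # Imported here so split_offset can be tried without Selenium installed.
    from selenium.webdriver.common.action_chains import ActionChains

    chain = ActionChains(driver).click_and_hold(element)
    for dx in split_offset(total_x, steps):
        chain.move_by_offset(dx, 0).pause(duration / steps)
    chain.release().perform()
```

In the script above this would replace the single move_by_offset(400, 0) call with slow_drag(driver, slider, 400). It only changes the timing of the drag, so the caveat about script detection still applies.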
