Click on multiple elements using Selenium in Python

I would like to create a bot which will retweet every tweet loaded on an account.
First, I tried this method:
rt = self.driver.find_elements_by_xpath("//div[@data-testid='retweet']")
for i in rt:
    time.sleep(1)
    i.click()
    # Confirm the retweet
    time.sleep(1)
    self.driver.find_element_by_xpath("//div[@data-testid='retweetConfirm']").click()
But it didn't work: I was never able to wait long enough before clicking, even with a high time.sleep() value.
So instead, I tried this:
for i in rt:
    z = WebDriverWait(self.driver, 10).until(EC.element_to_be_clickable((By.XPATH, "//div[@data-testid='retweet']")))
    z.click()
    # Confirm the retweet
    b = WebDriverWait(self.driver, 10).until(EC.element_to_be_clickable((By.XPATH, "//div[@data-testid='retweetConfirm']")))
    b.click()
Unfortunately this code only retweets the first tweet. I should modify something in EC.element_to_be_clickable((By.XPATH, "//*[@data-testid='retweet']")) (and in the second EC.element_to_be_clickable) so that each iteration moves on to the next tweet, but I don't know what.
Does anybody know a way to iterate through all my tweets with this method (or another)? I have been thinking about getting the absolute path of every element in rt, but I don't know if that is possible using only Selenium. I could also use the Twitter API, but I want to be able to build bots for other websites too.
Thank you

Ok, I think I have found a solution to my issue.
For each click, I catch the ElementClickInterceptedException when it happens; otherwise the retweet is clicked.
It gives something like this:
for i in rt:
    first_click = False
    while not first_click:
        try:
            i.click()
        except ElementClickInterceptedException:
            pass
        else:
            first_click = True
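The same retry can be factored into a small helper; a minimal sketch (the helper name and the attempt/delay parameters are illustrative assumptions, not from the original code):

```python
import time

def click_with_retry(element, retry_on, attempts=10, delay=0.5):
    """Call element.click(), retrying while one of the exceptions in
    `retry_on` is raised. Returns True on success, False if every
    attempt was intercepted."""
    for _ in range(attempts):
        try:
            element.click()
        except retry_on:
            time.sleep(delay)
        else:
            return True
    return False

# With Selenium this would be called as:
#   click_with_retry(i, ElementClickInterceptedException)
```

The advantage over an open-ended `while not first_click` loop is that a permanently blocked element eventually makes the helper give up instead of hanging the bot forever.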

Related

Selenium Python Instagram Scraping All Images in a post not working

I am writing a small code to download all images/videos in a post. Here is my code:
import urllib.request as reqq
from selenium import webdriver
import time

browser = webdriver.Chrome("D:\\Python_Files\\Programs\\chromedriver.exe")
browser.maximize_window()

url_list = ['https://www.instagram.com/p/CE9CZmsghan/']
img_urls = []
vid_urls = []

for x in url_list:
    count = 0
    browser.get(x)
    while True:
        try:
            elements = browser.find_elements_by_class_name('_6CZji')
            elements[0].click()
            time.sleep(1)
        except:
            count += 1
            time.sleep(1)
            if count == 2:
                break
        try:
            vid_url = browser.find_element_by_class_name('_5wCQW').find_element_by_tag_name('video').get_attribute('src')
            vid_urls.append(vid_url)
        except:
            img_url = browser.find_element_by_class_name('KL4Bh').find_element_by_tag_name('img').get_attribute('src')
            img_urls.append(img_url)

for x in range(len(img_urls)):
    reqq.urlretrieve(img_urls[x], "D:\\instaimg" + str(x + 1) + ".jpg")
for x in range(len(vid_urls)):
    reqq.urlretrieve(vid_urls[x], "D:\\instavid" + str(x + 1) + ".mp4")
browser.close()
This code extracts all the images in the post except the last image. IMO, this code is right. Do you know why this code doesn't extract the last image? Any help would be appreciated. Thanks!
Go to the URL that you're using in the example and open the inspector, and very carefully watch how the DOM changes as you click between images. There are multiple page elements with class KL4Bh because it tracks the previous image, the current image, and the next image.
So doing find_element_by_class_name('KL4Bh') returns the first match on the page.
Ok, let's break down this loop and see what is happening:
first iteration
- page opens
- immediately click 'next' to the second photo
- grab the first element for class 'KL4Bh' from the DOM
- the first element for that class is the first image (now the 'previous' image)

[... iterations 2, 3, 4 same as 1 ...]

fifth iteration
- look for a "next" button to click
- find no next button
- `elements[0]` fails with an IndexError
- grab the first element for class 'KL4Bh' from the DOM
- the first element for that class is **still the fourth image**

sixth iteration
- look for a "next" button to click
- find no next button
- `elements[0]` fails with an IndexError
- the error count exceeds the threshold
- exit loop
try something like this:
n = 0
while True:
    try:
        elements = browser.find_elements_by_class_name('_6CZji')
        elements[0].click()
        time.sleep(1)
    except IndexError:
        n = 1
        count += 1
        time.sleep(1)
        if count == 2:
            break
    try:
        vid_url = browser.find_elements_by_class_name('_5wCQW')[n].find_element_by_tag_name('video').get_attribute('src')
        vid_urls.append(vid_url)
    except:
        img_url = browser.find_elements_by_class_name('KL4Bh')[n].find_element_by_tag_name('img').get_attribute('src')
        img_urls.append(img_url)
It will do the same thing as before, except that since it now uses find_elements_by_class_name and indexes into the resulting list, when it gets to the last image the IndexError from the failed button click also increments the index used for the image lookup. So on the last iteration of the loop it takes the second element (the current image).
There are still some serious problems with this code, but it does fix the bug you are seeing. One problem at a time :)
Edit
A few things that I think would improve this code:
When using try-except blocks to catch exceptions/errors, there are a few rules that should almost always be followed:
Name specific exceptions & errors to handle, don't use unqualified except. The reason for this is that by catching every possible error, we actually suppress and obfuscate the source of bugs. The only legitimate reason to do this is to generate a custom error message, and the last line of the except-block should always be raise to allow the error to propagate. It goes against how we typically think of software errors, but when writing code, errors are your friend.
The try-except blocks are also problematic because they are being used as a conditional control structure. Sometimes it seems easier to code like this, but it is usually a sign of incomplete understanding of the libraries being used. I am specifically referring to the block that is checking for a video versus an image, although the other one could be refactored too. As a rule, when doing conditional branching, use an if statement.
Using sleep with selenium is almost always incorrect, but it's by far the most common pitfall for new selenium users. What happens is that the developer starts getting errors about missing elements when trying to search the DOM. They correctly conclude that the page was not fully loaded in the browser before selenium tried to read it. But sleep is not the right approach, because waiting a fixed time makes no guarantee that the page will be fully loaded. Selenium has a built-in mechanism to handle this, called an explicit wait (along with implicit wait and fluent wait). Using an explicit wait guarantees that the page element is visible before your code is allowed to proceed.
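To illustrate what an explicit wait does under the hood, here is a minimal sketch of the poll-until-true loop that WebDriverWait implements (the function name and defaults are illustrative; in real Selenium code you would use WebDriverWait with an expected condition instead):

```python
import time

def wait_until(predicate, timeout=10.0, poll=0.5):
    """Repeatedly evaluate predicate() until it returns a truthy value,
    or raise TimeoutError once `timeout` seconds have elapsed. This is
    the same polling loop WebDriverWait(driver, 10).until(...) runs
    with an expected condition."""
    deadline = time.monotonic() + timeout
    while True:
        result = predicate()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %.1fs" % timeout)
        time.sleep(poll)

# The Selenium equivalent would be along the lines of:
#   WebDriverWait(driver, 10).until(
#       EC.presence_of_element_located((By.CLASS_NAME, "KL4Bh")))
```

Unlike a fixed sleep, the loop returns as soon as the condition holds and only fails after the full timeout, so it is both faster on good runs and more reliable on slow ones.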

Element sometimes appears and sometimes does not, how to continue script either way?

My Selenium (Python) script is doing some work on a website that is a dashboard, i.e. the actual webpage/link does not change as I interact with elements on it. I mention this because I considered simply hopping around different links via the driver.get() command to avoid some of the issues I am having right now.
At one point the script reaches a part of the dashboard that, for some reason, has a certain element during some of my test runs but not others. I have a few lines that interact with this element because it is in the way of another element that I would like to .click(), so when this element is not present my script stops working. This is a script that will repeat the same actions with small variations at the beginning, so I need to integrate some sort of 'if' logic.
My goal is to write something that does this:
- If this element is NOT present skip the lines of code that interact with it and jump to the next line of action. If the element is present obviously carry on.
day = driver.find_element_by_xpath('/html/body/div[4]/table/tbody/tr[4]/td[2]/a')
ActionChains(driver).move_to_element(day).click().perform()
driver.implicitly_wait(30)
lizard2 = driver.find_element_by_xpath('/html/body/div[5]/div/div[2]/div[1]/img')
ActionChains(driver).move_to_element(lizard2).perform()
x2 = driver.find_element_by_xpath('/html/body/div[5]/div/div[2]/div[2]')
ActionChains(driver).move_to_element(x2).click().perform()
driver.execute_script('window.scrollTo(0, document.body.scrollHeight)')
So the block that starts with lizard2 and runs through the following ActionChains line is the part that interacts with this element that is sometimes present and sometimes not as the script goes back and forth doing its task.
The top two lines are just the code that gets me to the part of the dashboard with this randomly appearing element, and the scroll command at the end is what follows. As mentioned, I need the middle part to be skipped, continuing to the scroll part, if the lizard2 and x2 elements are not found.
I apologize if this is confusing and not really concise, I am excited to hear what you guys think and provide any additional information/details, thank you!
You can simply perform your find_element_by_xpath within a try/except block like so:
from selenium.common.exceptions import NoSuchElementException
try:
    myElement = driver.find_element_by_xpath(...)
    # Element exists, do X
except NoSuchElementException:
    # Element doesn't exist, do Y
Edit: I'd just like to add that someone in the comments to your question suggested this method was 'hacky'; it's actually very Pythonic and very standard.
For reference, the same pattern in Java (a commonly quoted helper, not from the original question):
private boolean isElementPresent(By by) {
    try {
        driver.findElement(by);
        return true;
    } catch (NoSuchElementException e) {
        return false;
    }
}

How to make a while-loop run as long as an element is present on page using selenium in python?

I'm struggling with creating a while-loop which runs as long as a specific element is present on a website. What I currently have is by no means a solution I'm proud of, but it works. I would however very much appreciate some suggestions on how to change the below:
def spider():
    url = 'https://stackoverflow.com/questions/ask'
    driver.get(url)
    while True:
        try:
            unique_element = driver.find_element_by_class_name("uniqueclass")
            do_something()
        except NoSuchElementException:
            print_data_to_file(entries)
            break
        do_something_else()
As you can see, the first thing I do within the while-loop is to check for a unique element which is only present on pages containing data I'm interested in. Thus, when I reach a page without this information, I'll get the NoSuchElementException and break.
How can I achieve the above without having to make a while True?
driver.find_elements won't throw an error; if the returned list is empty, there aren't any more elements:
def spider():
    url = 'https://stackoverflow.com/questions/ask'
    driver.get(url)
    while len(driver.find_elements_by_class_name("uniqueclass")) > 0:
        do_something()
    do_something_else()
You could also use an explicit wait with the expected condition staleness_of; however, you wouldn't be able to execute do_something(), and it is meant for short waiting periods.
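The find_elements-based loop generalizes to a small helper; a minimal sketch (the helper name, the re-query callable, and the iteration cap are illustrative assumptions, not Selenium API):

```python
def while_present(query, action, max_iterations=10_000):
    """Run `action` on the first result of query() for as long as
    query() returns a non-empty list -- the same pattern as
    `while len(driver.find_elements_by_class_name(...)) > 0`.
    The iteration cap guards against an action that never removes
    the element from the page."""
    for _ in range(max_iterations):
        matches = query()
        if not matches:
            return
        action(matches[0])

# With Selenium this could be called as:
#   while_present(lambda: driver.find_elements_by_class_name("uniqueclass"),
#                 lambda el: do_something())
```

Re-querying on every iteration also sidesteps stale-element problems, since the element list is refreshed after each action.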

Stop when error

In python, I am using selenium to scrape a webpage. There is a button that I need to repeatedly click until there is no button anymore. So far, I have code like:
count = 20
while count:
    driver.find_elements_by_class_name('buttontext')[0].click()
    count -= 1
The problem is that I don't actually know how many times I need to do this - count = 20 is incorrect. What I actually need is for it to keep going until the command encounters an error (IndexError: list index out of range), and then stop. How can I do this?
Follow the EAFP approach - make an endless loop and break it once there is no element found:
from selenium.common.exceptions import NoSuchElementException

while True:
    try:
        button = driver.find_element_by_class_name("buttontext")
        button.click()
    except NoSuchElementException:
        break
You should use the try statement to handle exceptions. It will run your code until it hits an exception. Your code should look something like this:
try:
    while True:
        click_function()
except Exception:  # Exception being your error
    got_an_error()  # or just pass
I would follow the Selenium docs' recommendation and use .findElements().
findElement should not be used to look for non-present elements, use
findElements(By) and assert zero length response instead.
buttons = driver.find_elements_by_class_name("buttontext")
while buttons:
    buttons[0].click()
    # might need a slight pause here depending on what your click triggers
    buttons = driver.find_elements_by_class_name("buttontext")
Do you need this?
while True:
    try:
        driver.find_elements_by_class_name('buttontext')[0].click()
    except IndexError:
        break
try will try to run some code, and except captures the error you select.
If you don't select an error, except will capture all errors.

Learning to use Assert and Asserttrue in Python for selenium web driver

I'm trying to make a Python webdriver script that loads a webpage, then asserts true and runs a print command if a text or object is there, and otherwise just continues running my loop. I'm a noob to Python and have been learning from Learn Python the Hard Way and reading documentation. I've spent the last day or so trying to get my code to find text or elements, but it doesn't feed back any info. Here is my code so far, minus the top part about going to the webpage; I am just stuck on this count loop assert logic.
count = 1000.00
while count < 1000.03:
    driver.find_element_by_id("select").clear()
    driver.find_element_by_id("select").send_keys(str(count))
    time.sleep(1)
    driver.find_element_by_id("BP").click()
    driver.find_element_by_id("BP").click()
    count += 0.01  # increases count to test
    highervalue = driver.find_element_by_link_text("Select Error")
    assertTrue(driver.link_contains_match_for(""))  # could also be text_contains_match_for("ex") or driver.assertTrue(element in WEBPAGE)?
    print 'Someone is %.7r' % count
else:
    print 'I have %.7r' % count
    time.sleep(1)
Then the loop starts over again. The issue I am having is that I want to find "Select Error" on the webpage in some kind of form, link, or text; if it is there, print me a message, and if not, just continue my loop.
Is it better to use assert/assertTrue, or something like
def is_element_present(self, how, what):
    try:
        self.driver.find_element(by=how, value=what)
    except NoSuchElementException:
        return False
    return True
or
Some other examples I have searched about that could be used:
self.assertTrue(self.is_element_present(By.ID, "FOO"))
self.assertTrue(self.is_element_present(By.TEXT, "BAR"))
self.assertTrue(self.is_text_present("FOO"))
self.assertTrue(self.driver.is_text_present("FOO"))
Can someone let me know how I would write the part where I find something on the webpage and it gives me feedback if found?
Selenium contains methods known as waits to assist with this specific situation. I would read up on the documentation of explicit waits. Also, there is another Stack Overflow question which may be useful to look at.
In addition, and slightly off topic, it is atypical to use Asserts in the way you have in the code shown. In general, you should not think of using Asserts to check for conditions, but rather a condition that must be true for the test to pass, otherwise the test stops and there is no reason to continue on.
Cheers.
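As a non-assert alternative, the presence check can return a plain boolean that drives an if statement; a minimal sketch (the finder callable stands in for a driver.find_elements_* method, which returns an empty list when nothing matches):

```python
def is_present(find_elements, locator):
    """True if the locator matches at least one element.
    find_elements never raises for a missing element; it simply
    returns an empty list, so no try/except is needed."""
    return len(find_elements(locator)) > 0

# With Selenium this could look like:
#   if is_present(driver.find_elements_by_link_text, "Select Error"):
#       print('Someone is %.7r' % count)
#   else:
#       print('I have %.7r' % count)
```

This keeps asserts out of the control flow: the loop branches on an ordinary boolean, and asserts stay reserved for conditions that must hold for a test to pass.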
Assuming that your "Select Error" message is actually a link_text, you could try using a 'try/except' statement.
from selenium.common.exceptions import TimeoutException

try:
    WebDriverWait(driver, 3).until(
        expected_conditions.text_to_be_present_in_element((By.LINK_TEXT, "Select Error"), "Select Error")
    )
    print "Link Found..."
except TimeoutException:
    pass
This code waits up to 3 seconds for the "Select Error" link text, catches the TimeoutException that WebDriverWait raises if it never appears, and then continues with your code.
