Can I scroll on a side panel using Python Selenium?

Essentially the title - I've seen plenty of methods here for scrolling webpages, but not the side panel of a webpage. Here's the code - the scrolling part seems to have no effect:
import time
from selenium import webdriver

driver = webdriver.Chrome(chromedriver_path)  # chromedriver path
driver.get('https://www.google.com/maps/search/gas/@40,-75.2,12z/data=!4m2!2m1!6e2')
time.sleep(3)
driver.execute_script("window.scrollTo(0, 100)")
time.sleep(3)
Here's the page I'm trying to access. Any tips?

You can scroll to a specific element using:
els = driver.find_elements(By.CSS_SELECTOR, '.TFQHme')
driver.execute_script("arguments[0].scrollIntoView();", els[-1])
In this example, that is the last element loaded. The list will grow as you scroll down; to keep scrolling, just retrieve the list again.
You can implement an infinite scroll using the following:
from selenium.webdriver.common.by import By
from selenium.common.exceptions import StaleElementReferenceException

def infinite_scroll(driver):
    number_of_elements_found = 0
    while True:
        els = driver.find_elements(By.CSS_SELECTOR, '.TFQHme')
        if number_of_elements_found == len(els):
            # Reached the end of loadable elements
            break
        try:
            driver.execute_script("arguments[0].scrollIntoView();", els[-1])
            number_of_elements_found = len(els)
        except StaleElementReferenceException:
            # Possible to get a StaleElementReferenceException. Ignore it and retry.
            pass
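A minimal usage sketch (assuming driver is already on the Maps results page; the .TFQHme selector comes from the snippet above and may change whenever Google updates the page):
infinite_scroll(driver)
results = driver.find_elements(By.CSS_SELECTOR, '.TFQHme')
print("Loaded {} results".format(len(results)))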

How to break a loop if a certain element is disabled and get text from multiple pages in Selenium Python

I am a new learner of Python and Selenium. I have written code to extract data from multiple pages, but there is a certain problem in it.
I am not able to break the while loop that clicks on the next page while there is still one. The "Next page" element is disabled after reaching the last page, but the code still runs.
xpath: '//button[@aria-label="Next page"]'
Full SPAN: class="awsui_icon_h11ix_31bp4_98 awsui_size-normal-mapped-height_h11ix_31bp4_151 awsui_size-normal_h11ix_31bp4_147 awsui_variant-normal_h11ix_31bp4_219"
I am able to get the list of data I want to extract from the webpage, but I only get the last page's data when I close the webpage from my end, ending the while loop.
Full Code:
import pandas as pd
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from webdriver_manager.chrome import ChromeDriverManager

opts = webdriver.ChromeOptions()
opts.headless = True
driver = webdriver.Chrome(ChromeDriverManager().install(), options=opts)
base_url = "XYZ"
driver.maximize_window()
driver.get(base_url)
driver.set_page_load_timeout(50)
element = WebDriverWait(driver, 50).until(EC.presence_of_element_located((By.ID, 'all-my-groups')))
driver.find_element(by=By.XPATH, value='//*[@id="sim-issueListContent"]/div[1]/div/div/div[2]/div[1]/span/div/input').send_keys('No Stock')
dfs = []
page_counter = 0
while True:
    wait = WebDriverWait(driver, 30)
    wait.until(EC.visibility_of_all_elements_located((By.XPATH, "//div[contains(@class, 'alias-wrapper sim-ellipsis sim-list--shortId')]")))
    cards = driver.find_elements_by_xpath("//div[contains(@class, 'alias-wrapper sim-ellipsis sim-list--shortId')]")
    sims = []
    for card in cards:
        sims.append([card.text])
    df = pd.DataFrame(sims)
    dfs.append(df)
    print(page_counter)
    page_counter += 1
    try:
        wait.until(EC.element_to_be_clickable((By.XPATH, '//button[@aria-label="Next page"]'))).click()
    except:
        break
driver.close()
driver.quit()
I am also attaching an image of the class. Sorry, I cannot share the URL as it is a private domain.
The easiest option is to let your wait.until() fail via timeout when the "Next page" button is missing. Right now wait = WebDriverWait(driver, 30) sets the timeout to 30 seconds; assuming the page normally loads much faster than that, you could drop the timeout to 5 seconds, and the loop will then end sooner once you reach the last page. If your page load times are sometimes slow, make sure the timeout won't cut off too early; if they are consistently fast, you may get away with an even shorter interval.
Alternatively, you could look through the target webpage more carefully for some element that (a) is always present and (b) indicates whether you are on the final page. You could then read that element and decide whether to break the loop before trying to find the "Next page" button. This saves a few seconds of waiting on the final iteration (no timeout to wait out) but may not be worth the trouble.
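A minimal sketch of that two-timeout idea (selectors taken from the question; the 5-second value is an assumption to tune against your real load times):
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException

wait = WebDriverWait(driver, 30)       # generous wait for the page content
next_wait = WebDriverWait(driver, 5)   # short wait just for the "Next page" button

while True:
    wait.until(EC.visibility_of_all_elements_located(
        (By.XPATH, "//div[contains(@class, 'alias-wrapper sim-ellipsis sim-list--shortId')]")))
    # ... scrape the page as before ...
    try:
        next_wait.until(EC.element_to_be_clickable(
            (By.XPATH, '//button[@aria-label="Next page"]'))).click()
    except TimeoutException:
        break  # the button never became clickable, so this was the last page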
Change the below condition:
try:
    wait.until(EC.element_to_be_clickable((By.XPATH, '//button[@aria-label="Next page"]'))).click()
except:
    break
as shown below; the @disabled check is the diff that makes sure to exit the while loop if the button is disabled:
if len(driver.find_elements_by_xpath('//button[@aria-label="Next page"][@disabled]')) > 0:
    break
else:
    driver.find_element_by_xpath('//button[@aria-label="Next page"]').click()
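An equivalent check, sketched with the element's enabled state instead of an attribute selector (is_enabled() reflects the disabled attribute on form controls such as button; this would replace the try/except block inside the while loop):
from selenium.webdriver.common.by import By

next_button = driver.find_element(By.XPATH, '//button[@aria-label="Next page"]')
if not next_button.is_enabled():  # disabled attribute present: last page reached
    break
next_button.click()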

Right click save link as then save

Hello, I want to right-click a link, choose "Save link as", and then confirm on the save pop-up that Windows shows.
Here is an example:
https://www.who.int/data/gho/data/indicators/indicator-details/GHO/proportion-of-population-below-the-international-poverty-line-of-us$1-90-per-day-(-)
Go to this page; in the Data tab you can see "EXPORT DATA in CSV format: Right-click here & Save link".
If you right-click and choose "Save link as", it lets you save the data as a CSV.
I want to automate that. Can it be done using Selenium with Python, and if so, how?
I tried using ActionChains, but I'm not sure that's going to work.
I think you would be better off using the Data API (JSON) offered on the same page, but I have managed to download the file using the code below and Google Chrome.
There is a lot going on which I didn't want to go into (hence the lazy usage of an occasional sleep), but the basic principle is that the Export link is inside a frame inside a frame (and there are many iframes on the page). Once the correct iframe has been found and the link located, a right click brings up the system menu.
This menu cannot be accessed via Selenium (it is a native browser menu, not part of the page), so pyautogui is used to move down to "Save link as..." and also to click the "Save" button on the Save As dialog:
url = "https://www.who.int/data/gho/data/indicators/indicator-details/GHO/proportion-of-population-below-the-international-poverty-line-of-us$1-90-per-day-(-)"
driver.get(url)
driver.execute_script("window.scrollBy(0, 220);")
button = driver.find_element(By.CSS_SELECTOR, "button#dataButton")
button.click()
WebDriverWait(driver, 5).until(EC.text_to_be_present_in_element_attribute((By.CSS_SELECTOR, "button#dataButton"), "class", "active"))
time.sleep(10)
iframes = driver.find_elements(By.CSS_SELECTOR, "iframe[src*='https://app.powerbi.com/reportEmbed'")
for i in iframes:
try:
driver.switch_to.frame(i)
iframes2 = driver.find_elements(By.CSS_SELECTOR, "iframe[src*='cvSandboxPack.html']")
for i2 in iframes2:
try:
driver.switch_to.frame(i2)
downloads = driver.find_elements(By.CSS_SELECTOR, "a[download='data.csv']")
if len(downloads) > 0:
ActionChains(driver).context_click(downloads[0]).perform()
# Selenium cannot communicate with system dialogs
time.sleep(1)
pyautogui.typewrite(['down','down','down','down','enter'])
time.sleep(1)
pyautogui.press('enter')
time.sleep(2)
return
except StaleElementReferenceException as e:
continue
finally:
driver.switch_to.frame(i)
iframes2 = WebDriverWait(driver, 5).until(EC.presence_of_all_elements_located((By.TAG_NAME, "iframe")))
except StaleElementReferenceException as e:
continue
finally:
driver.switch_to.default_content()
iframes = WebDriverWait(driver, 5).until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, "iframe[src*='https://app.powerbi.com/reportEmbed']")))
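As mentioned at the top of this answer, the Data API is probably the more robust route. A rough sketch with requests (the GHO OData endpoint is real, but the exact indicator code is an assumption, so look it up via the /api/Indicator listing first):
import requests

# Search the GHO OData indicator list for the poverty-line indicator
# (the filter term is an assumption; adjust the search text as needed).
resp = requests.get(
    "https://ghoapi.azureedge.net/api/Indicator",
    params={"$filter": "contains(IndicatorName, 'poverty')"},
)
resp.raise_for_status()
for ind in resp.json()["value"]:
    print(ind["IndicatorCode"], "-", ind["IndicatorName"])

# Then fetch the values for the code you found, e.g. (hypothetical code shown):
# data = requests.get("https://ghoapi.azureedge.net/api/SI_POV_DAY1").json()["value"]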

ElementClickInterceptedException: element click intercepted. Other element would receive the click: Selenium Python

I have seen other questions about this error, but in my case the other element should receive the click. In detail: the webdriver is scrolling through Google search results and must click every website it finds, but the program is preventing that. How can I make it NOT click the site it clicked previously?
This is the function. The program loops it, and after the first loop it scrolls down and the error occurs:
import time
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

def get_info():
    browser.switch_to.window(browser.window_handles[2])
    description = WebDriverWait(browser, 10).until(
        EC.presence_of_element_located((By.TAG_NAME, "h3"))
    ).text
    site = WebDriverWait(browser, 10).until(
        EC.presence_of_element_located((By.TAG_NAME, "cite"))
    )
    site.click()
    url = browser.current_url
    # removes the https:// and the / of the url
    # to get just the domain of the website
    try:
        link = url.split("https://")
        link1 = link[1].split("/")
        link2 = link1[0]
        link3 = link2.split("www.")
        real_link = link3[1]
    except IndexError:
        link = url.split("https://")
        link1 = link[1].split("/")
        real_link = link1[0]
    time.sleep(3)
    screenshot = browser.save_screenshot("photos/" + "(" + real_link + ")" + ".png")
    global content
    content = []
    content.append(real_link)
    content.append(description)
    print(content)
    browser.back()
    time.sleep(5)
    browser.execute_script("window.scrollBy(0,400)", "")
    time.sleep(5)
You can keep a list of clicked websites and check each time whether that link has already been clicked. Here's the demo code:
clicked_website = []
url = browser.current_url
clicked_website.append(url)
# Now, when about to click
if <new_url> not in clicked_website:
    <element>.click()
This is just an idea of how to implement it; your code is a bit of a mess and I didn't follow it completely, so work it into your code yourself.
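A sketch of how that idea could plug into the function above (the visited set and the skip check are additions, not part of the original code):
visited = set()  # domains already clicked

def get_info():
    browser.switch_to.window(browser.window_handles[2])
    site = WebDriverWait(browser, 10).until(
        EC.presence_of_element_located((By.TAG_NAME, "cite"))
    )
    domain = site.text  # the visible cite text is roughly the site's domain
    if domain in visited:
        return  # this result was already handled, so skip the click entirely
    visited.add(domain)
    site.click()
    # ... rest of the original function ...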

web scraping contact list selenium python

How can I loop through the contacts in a group on Discord using Selenium in Python?
I tried the code below and I get this error:
selenium.common.exceptions.StaleElementReferenceException: Message: The element reference of is stale; either the element is no longer attached to the DOM, it is not in the current frame context, or the document has been refreshed
The problem is that the scroller and contacts are constantly updating...
I tried this code:
while True:
    num = 0
    try:
        users_list = driver.find_elements_by_css_selector("div.memberOnline-1CIh-0.member-3W1lQa")
        for user in users_list:
            num += 1
            user.click()
            driver.execute_script("arguments[0].scrollIntoView();", user)
            print('User number {}'.format(num))
    except (StaleElementReferenceException, ElementClickInterceptedException):
        print('bad')
        driver.execute_script("arguments[0].scrollTop = arguments[0].scrollHeight", users_list)
From your given code, you only scroll the element, so the likely reason for the stale exception is that you don't wait for the page to load completely, or at least don't wait for the contacts to finish loading.
For debug purposes you can simply add a long sleep before the loop, like sleep(15), and replace it with an explicit wait in production code, like:
WebDriverWait(driver, 10).until(
    EC.presence_of_element_located((By.ID, "myDynamicElement"))
)
Details of explicit waits are in the Selenium documentation.
If you call click() in the loop, you need to find the elements again inside the loop:
while True:
    num = 0
    try:
        time.sleep(15)
        users_list = driver.find_elements_by_css_selector("div.memberOnline-1CIh-0.member-3W1lQa")
        length = len(users_list)
        for num in range(0, length):
            user = users_list[num]
            user.click()
            time.sleep(15)
            driver.execute_script("arguments[0].scrollIntoView();", user)
            print('User number {}'.format(num + 1))
            # because the above `click` makes changes happen on the page,
            # selenium will treat it as a new page;
            # element references found on the `old` page cannot work on the `new` page,
            # so you need to find the elements belonging to the `old` page again on the `new` page
            # find users_list again from the `new` page
            users_list = driver.find_elements_by_css_selector("div.memberOnline-1CIh-0.member-3W1lQa")
    except (StaleElementReferenceException, ElementClickInterceptedException):
        print('bad')
        driver.execute_script("arguments[0].scrollTop = arguments[0].scrollHeight", users_list)
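For completeness, a sketch of the same loop with the debug sleeps replaced by explicit waits (the selector is the one from the question; the 10-second timeout is an assumption):
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import (
    StaleElementReferenceException, ElementClickInterceptedException)

wait = WebDriverWait(driver, 10)
selector = "div.memberOnline-1CIh-0.member-3W1lQa"

num = 0
while True:
    # Re-locate the list on every pass so the references are never stale
    users_list = wait.until(
        EC.presence_of_all_elements_located((By.CSS_SELECTOR, selector)))
    if num >= len(users_list):
        break  # no new contacts appeared, so we have reached the end
    try:
        users_list[num].click()
        driver.execute_script("arguments[0].scrollIntoView();", users_list[num])
        num += 1
    except (StaleElementReferenceException, ElementClickInterceptedException):
        pass  # the list changed mid-click; re-locate on the next pass and retry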

Unable to fetch all the necessary links during Iteration - Selenium Python

I am a newbie to Selenium with Python. I am trying to fetch profile URLs, which appear 10 per page. Without using while, I am able to fetch all 10 URLs, but only for the first page. When I use while, it iterates, but fetches only 3 or 4 URLs per page.
I need to fetch all 10 links and keep iterating through the pages. I think I must do something about StaleElementReferenceException.
Kindly help me solve this problem.
The code is given below.
def test_connect_fetch_profiles(self):
    driver = self.driver
    search_data = driver.find_element_by_id("main-search-box")
    search_data.clear()
    search_data.send_keys("Selenium Python")
    search_submit = driver.find_element_by_name("search")
    search_submit.click()
    noprofile = driver.find_elements_by_xpath("//*[text() = 'Sorry, no results containing all your search terms were found.']")
    self.assertFalse(noprofile)
    while True:
        wait = WebDriverWait(driver, 150)
        try:
            profile_links = wait.until(EC.presence_of_all_elements_located((By.XPATH, "//*[contains(@href,'www.linkedin.com/profile/view?id=')][text()='LinkedIn Member' or contains(@href,'Type=NAME_SEARCH')][contains(@class,'main-headline')]")))
            for each_link in profile_links:
                page_links = each_link.get_attribute('href')
                print(page_links)
                driver.implicitly_wait(15)
                appendFile = open("C:\\Users\\jayaramb\\Documents\\profile-links.csv", 'a')
                appendFile.write(page_links + "\n")
                appendFile.close()
                driver.implicitly_wait(15)
            next = wait.until(EC.visibility_of(driver.find_element_by_partial_link_text("Next")))
            if next.is_displayed():
                next.click()
            else:
                print("End of Page")
                break
        except ValueError:
            print("It seems no values to fetch")
        except NoSuchElementException:
            print("No Elements to Fetch")
        except StaleElementReferenceException:
            print("No Change in Element Location")
        else:
            break
Please let me know if there are any other effective ways to fetch the required profile URLs while iterating through the pages.
I created a similar setup which works alright for me. I've had some problems with Selenium trying to click the next-button and throwing a WebDriverException instead, likely because the next-button was not in view. Hence, instead of clicking the next-button, I get its href attribute and load the new page with driver.get(), avoiding an actual click and making the test more stable.
from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException

def test_fetch_google_links():
    links = []
    # Setup driver
    driver = webdriver.Firefox()
    driver.implicitly_wait(10)
    driver.maximize_window()
    # Visit google
    driver.get("https://www.google.com")
    # Enter search query
    search_data = driver.find_element_by_name("q")
    search_data.send_keys("test")
    # Submit search query
    search_button = driver.find_element_by_xpath("//button[@type='submit']")
    search_button.click()
    while True:
        # Find and collect all anchors
        anchors = driver.find_elements_by_xpath("//h3//a")
        links += [a.get_attribute("href") for a in anchors]
        try:
            # Find the next page button
            next_button = driver.find_element_by_xpath("//a[@id='pnnext']")
            location = next_button.get_attribute("href")
            driver.get(location)
        except NoSuchElementException:
            break
    # Do something with the links
    for l in links:
        print(l)
    print("Found {} links".format(len(links)))
    driver.quit()
