Scraping through multiple pages when url doesn't change


I need to scrape all the pages at https://mahabocw.in/safety-kit-benefits-distribution/, but the URL doesn't change when I move to the next page. I tried Selenium, but I'm stuck because I don't know how to click through to the next page. Any help or suggestions would be really appreciated.
Here is the code I have so far:
from selenium import webdriver
import time
url = "https://mahabocw.in/safety-kit-benefits-distribution/"
driver = webdriver.Chrome()
driver.get(url)
This is the button element I need to click:
<button type="button" class="ag-paging-button">Next</button>
Thanks a lot in advance.

You need to tell Selenium to click the Next button. Add this to your code and see if it works:
from selenium.webdriver.common.by import By
next_button = '/html/body/div/div[6]/div/article/div/div/div/div/div[2]/div/div/div[2]/div/div[4]/span[2]/div[3]/button'
click_next = driver.find_element(By.XPATH, next_button)
click_next.click()
Use .click() here rather than .submit(): the target is a plain type="button" element, and .submit() only applies to elements that belong to a form. The find_element_by_xpath helper was removed in Selenium 4.3, so the lookup goes through find_element(By.XPATH, ...). To get the value for next_button, inspect the page in your browser, find the element you want, and copy its XPath.
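To walk every page rather than click once, the click can go in a loop. This is a minimal sketch, not a tested scraper for this site: the NEXT_XPATH locator, the read_rows callback, and the assumption that the Next button reports itself as disabled on the last page are all guesses to check against the real markup. The locator strategy is passed as the string "xpath", which is the value of Selenium's By.XPATH.

```python
import time

# Assumed locator: a paging button whose visible text is "Next".
NEXT_XPATH = "//button[contains(@class, 'ag-paging-button') and normalize-space()='Next']"


def scrape_all_pages(driver, read_rows, max_pages=1000, pause=1.0):
    """Collect rows from each page, clicking Next until it is disabled.

    read_rows is a caller-supplied function (hypothetical here) that takes
    the driver and returns a list of rows for the page currently shown.
    """
    rows = []
    for _ in range(max_pages):  # hard cap so a broken stop condition cannot loop forever
        rows.extend(read_rows(driver))
        next_btn = driver.find_element("xpath", NEXT_XPATH)  # "xpath" == By.XPATH
        if not next_btn.is_enabled():  # assumption: disabled on the last page
            break
        next_btn.click()
        time.sleep(pause)  # crude pause while the grid re-renders
    return rows
```

Call it as scrape_all_pages(driver, read_rows) with your own read_rows function. If the grid marks its last page some other way (for example with a CSS class on a wrapper), swap the is_enabled() check for that condition.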

Related

SELENIUM (Python) : How to retrieve the URL to which an element redirects me to (opens a new tab) after clicking? Element has <a> tag but no href

I am trying to scrape a website with product listings that if clicked on redirect the user to a new tab with further information/contact the seller details. I am trying to retrieve said URL without actually having to click on each listing in the catalog and wait for the page to load as this would take a lot of time.
I searched the web inspector for an "href", but the only link available is the image source of each listing. However, I noticed that clicking an element sends a GET request to this URL: https://api.wallapop.com/api/v3/items/v6g2v4y045ze?language=es. It contains pretty much all the information I need. I'm not sure if it's of any use, but it's the furthest I've gotten.
UPDATE: I tried the suggested code (modified to look specifically for the 'href' attribute on the clickable elements), but it returns None. I have been looking for an 'onclick' attribute or something similar that might hold what I'm looking for, but so far it looks like the solution will be to click each element and extract the information from there.
elements123 = driver.find_elements(By.XPATH, '//a[contains(@class,"ItemCardList__item")]')
for e in elements123:
    print(e.get_attribute('href'))
I appreciate any insights, thank you in advance.

You need something like this:
from selenium import webdriver
from selenium.webdriver.common.by import By
driver = webdriver.Chrome()
driver.get("https://google.com")
# Get all the elements available with tag name 'a'
elements = driver.find_elements(By.TAG_NAME, 'a')
for e in elements:
    print(e.get_attribute('href'))
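Note that get_attribute('href') returns None for any element that has no href attribute, which matches the None output described in the question. A small sketch that skips those elements follows; collect_hrefs is a hypothetical helper name, and "tag name" is the string value of Selenium's By.TAG_NAME.

```python
def collect_hrefs(driver, by="tag name", value="a"):
    """Return the href of every matching element, skipping those without one."""
    links = []
    for el in driver.find_elements(by, value):
        href = el.get_attribute("href")  # None when the attribute is absent
        if href:
            links.append(href)
    return links
```

If this returns an empty list for the listing cards, the target URL is not in the markup and has to come from somewhere else, such as the API request mentioned in the question.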

Why won't Selenium in Python click the pop-up privacy button?

I am having some issues with Selenium not clicking the pop-up privacy button on https://www.transfermarkt.com/
Here is my code so far:
from selenium import webdriver
driver = webdriver.Chrome()
driver.get('https://www.transfermarkt.com/')
accept_button = driver.find_element_by_xpath('/html/body/div/div[2]/div[3]/div[2]/button')
accept_button.click()
It comes up saying:
NoSuchElementException: Message: no such element: Unable to locate element: {"method":"xpath","selector":"/html/body/div/div[2]/div[3]/div[2]/button"}
Anyone have any thoughts?

I believe the issue is that the element you are trying to click is inside an iframe. In order to click that element you'll have to first switch to that frame. I noticed the iframe has title="SP Consent Message" so I'll use a CSS Selector to identify it based on that. Your code with the added line:
from selenium import webdriver
driver = webdriver.Chrome()
driver.get('https://www.transfermarkt.com/')
driver.switch_to.frame(driver.find_element_by_css_selector('iframe[title="SP Consent Message"]'))
accept_button = driver.find_element_by_xpath('/html/body/div/div[2]/div[3]/div[2]/button')
accept_button.click()
Note that you may have to switch back to the default frame to continue your test. I'm not 100% sure, as that iframe goes away once the button is clicked.
Also, it seems that some folks don't get that popup when they hit the website. I'm not sure why, but it may be something you want to look into when deciding how to test this.
As @Prophet says, you should improve the XPath for the button (it also seems to have a unique title, so I would use the CSS selector 'button[title="ACCEPT ALL"]'), but this works for me to click it.
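The switch, click, and switch-back steps can be wrapped in one helper. This is only a sketch: click_in_iframe is a hypothetical name, the two selectors are the ones mentioned in this answer, and "css selector" is the string value of Selenium's By.CSS_SELECTOR.

```python
def click_in_iframe(driver, frame_css, button_css):
    """Switch into an iframe, click a button inside it, then switch back."""
    frame = driver.find_element("css selector", frame_css)
    driver.switch_to.frame(frame)
    try:
        driver.find_element("css selector", button_css).click()
    finally:
        # Always return to the top-level document, even if the click fails.
        driver.switch_to.default_content()
```

For example: click_in_iframe(driver, 'iframe[title="SP Consent Message"]', 'button[title="ACCEPT ALL"]').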

There are two issues here:
1. You need to add a wait / delay before accept_button = driver.find_element_by_xpath('/html/body/div/div[2]/div[3]/div[2]/button') to let the page load.
2. It's very bad practice to use absolute XPaths. Define a unique, short, and clear relative XPath expression instead.
Additionally, I see no pop-up privacy button when I open that page, so that element may simply not be there.
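Selenium ships WebDriverWait for this kind of wait. The sketch below spells out the same polling idea in plain Python so the mechanism is visible; wait_for_element is a hypothetical helper, not a Selenium API. It relies on find_elements returning an empty list (rather than raising) when nothing matches yet.

```python
import time


def wait_for_element(driver, by, value, timeout=10.0, poll=0.5):
    """Poll until an element matching the locator exists, or raise TimeoutError."""
    deadline = time.monotonic() + timeout
    while True:
        found = driver.find_elements(by, value)  # [] while the element is absent
        if found:
            return found[0]
        if time.monotonic() >= deadline:
            raise TimeoutError(f"no element for {by}={value!r} after {timeout}s")
        time.sleep(poll)
```

Something like wait_for_element(driver, "css selector", 'button[title="ACCEPT ALL"]').click() then replaces a fixed sleep. It also covers the case where the popup never appears: the TimeoutError can be caught and ignored.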

Scroll down when the Scrollbar has no tag with Selenium for python

I want to scroll down inside an element whose scrollbar doesn't have its own tag/element. I did quite a bit of research on this, but I couldn't find anything for this specific case. I'm using Selenium for Python.
The site is www.swap.gg, a site for CS:GO skins. To load every item from the bot, you need to scroll down in the "Bot inventory" element. [Screenshots of the site and of the element in the browser's F12 developer tools were included here.]
There is no key you can use to scroll down there, so the .send_keys() command doesn't work. I also tried touchactions.scroll_from_element(element, xoffset, yoffset) but had no luck either. I'm using Firefox as the webdriver, in case that matters.
The XPath of the element is "/html/body/div/div[1]/section[1]/div[2]/div[3]/div[2]/div[2]"
The CSS selector is "div.is-paddingless:nth-child(3)"
Any ideas?

Find the parent div:
element = driver.find_element(By.XPATH, '/html/body/div/div[1]/section[1]/div[2]/div[3]/div[2]/div[2]')
Scroll down to trigger the reload:
driver.execute_script("arguments[0].scrollBy(0, 500)", element)
Full script:
from selenium import webdriver
from selenium.webdriver.common.by import By
import time
driver = webdriver.Chrome()
driver.get('https://swap.gg/')
element = driver.find_element(By.XPATH, '/html/body/div/div[1]/section[1]/div[2]/div[3]/div[2]/div[2]')
for i in range(20):
    # Scroll the container by 500 px, then give the new items time to load
    driver.execute_script("arguments[0].scrollBy(0, 500)", element)
    time.sleep(2)
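A fixed count of 20 scrolls may be too few or too many. One variation, sketched here under the assumption that the container's scrollHeight grows whenever more items load, is to keep scrolling until that height stops changing. scroll_until_loaded is a hypothetical helper name.

```python
import time


def scroll_until_loaded(driver, element, pause=2.0, max_rounds=50):
    """Scroll a container to its bottom until its scrollHeight stops growing.

    Returns the number of scroll rounds performed.
    """
    last_height = driver.execute_script("return arguments[0].scrollHeight", element)
    for rounds in range(1, max_rounds + 1):
        # Jump to the current bottom of the container.
        driver.execute_script("arguments[0].scrollTop = arguments[0].scrollHeight", element)
        time.sleep(pause)  # give lazily loaded items time to arrive
        new_height = driver.execute_script("return arguments[0].scrollHeight", element)
        if new_height == last_height:
            return rounds  # nothing new was added, so we are done
        last_height = new_height
    return max_rounds
```

The pause has to be long enough for the site to fetch the next batch; if it is too short, the height check can stop early.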

How Do I Use Selenium WebDriver To Click A Non Anchor Tag?

I am using Selenium WebDriver to do automated testing of a website. I have been successful in clicking through numerous menus and links to a point.
At one point the website I am working with generates links that look like this:
<U onclick="HourglassSubmitItem(document.all('PageName').value, '00000001Responsibility Code')">Responsibility Code</U>
I am trying to use the webdriver's .click() method to click this link, with no success.
Using this:
page.find_element_by_xpath("//u[contains(text(),'Responsibility Code')]")
successfully finds the U tag above, but when I add .click() to the end of this call, the click is not performed, and no error is raised either. So my question is: can Selenium be used to simulate clicks on an HTML tag that is NOT an anchor (<a>) tag? If so, how?
I should add that I do not have control over the page I am working with, so changing the <U> to an <a> is not possible.
I would appreciate any guidance the community could provide.
Thank you for your help,
Chris

Sometimes using JavaScript could solve the "clicking" issue:
element = page.find_element_by_xpath("//u[contains(text(),'Responsibility Code')]")
page.execute_script('arguments[0].click();', element)

You can use JavaScript in this case. In Java:
WebElement element = driver.findElement(By.xpath("//u[contains(text(),'Responsibility Code')]"));
JavascriptExecutor executor = (JavascriptExecutor) driver;
executor.executeScript("arguments[0].click();", element);

Clicking JavaScript links in Selenium WebDriver and Python

I am using Selenium Webdriver in Python and I got stuck trying to activate a javascript button.
What I need to do here is to click the Go to Previous Month button twice so that I have August 2014.
And then I need to click on one of the days.
The images below show the code. Please tell me if I need to provide more info.
[Image: the "Go to previous month" button, with Inspect Element open]
[Image: the calendar after clicking the 1st of August, with Inspect Element open on "1"]
How do I do it?

First find your element with a CSS selector (you need to be familiar with how CSS selectors work; this is a prerequisite for most web development):
from selenium.webdriver.common.by import By
elem = driver.find_element(By.CSS_SELECTOR, "a[title='Go to previous month']")
Here driver is your WebDriver instance, and find_element returns a single element, so no [0] index is needed. See the related WebDriver documentation.
Once you have elem (you may need to deal with page loading time and similar issues first), you can click it:
elem.click()
See also: Wait for page load in Selenium
See the related click() documentation.
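Going back two months and then picking a day could look roughly like this. It is a sketch under assumptions: the screenshots from the question are not available, so treating each day cell as a link whose text is the day number is a guess, and both helper names are hypothetical. "css selector" and "link text" are the string values of By.CSS_SELECTOR and By.LINK_TEXT.

```python
import time

PREV_MONTH_CSS = "a[title='Go to previous month']"


def go_back_months(driver, months, pause=1.0):
    """Click the 'previous month' link the given number of times."""
    for _ in range(months):
        # Re-locate the link each time: the calendar may re-render after a
        # click, which would make an earlier element reference stale.
        driver.find_element("css selector", PREV_MONTH_CSS).click()
        time.sleep(pause)


def pick_day(driver, day):
    """Click the day link; assumes day cells are <a> elements showing the number."""
    driver.find_element("link text", str(day)).click()
```

For the case in the question that would be go_back_months(driver, 2) followed by pick_day(driver, 1). If the calendar also shows trailing days from adjacent months, the link-text lookup may hit the wrong cell and needs a more specific locator.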
