Chrome Options Doesn't Apply Upon Loading The Page In Selenium - python

I'm trying to scrape a French Amazon page using Selenium. I want this page to be translated from French to English upon loading. I have attempted to do that using the following code:
myoptions = webdriver.ChromeOptions()
prefs = {
"translate_whitelists": {"fr":"en"},
"translate": {"enabled":"true"}
}
myoptions.add_experimental_option("prefs", prefs)
path = r'C:\chromedriver.exe'
browser = webdriver.Chrome(executable_path=path, options=myoptions)
browser.get("https://www.amazon.fr/dp/0001002791")
However, when the page loads, it still shows up in French.
If I then navigate to any other link from this page, the translate option works and its icon appears in the address bar. The translation prompt also pops up.
From then on, every page gets translated, even the initial one.
Why didn't it work earlier when the page loaded? How do I fix this?

As described in this post, setting the language option should fix this:
myoptions.add_argument("--lang=en")
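For reference, here is a minimal sketch that combines the `--lang=en` argument with the translate prefs from the question. The helper names `build_chrome_settings` and `make_translating_driver` are made up for this illustration, and whether Chrome actually translates on first load may depend on the Chrome version, so treat it as a starting point rather than a guaranteed fix.

```python
def build_chrome_settings(source_lang="fr", target_lang="en"):
    """Return (arguments, prefs) to apply to a ChromeOptions object.

    The pref keys are the ones used in the question; the function name
    is hypothetical.
    """
    arguments = [f"--lang={target_lang}"]
    prefs = {
        "translate_whitelists": {source_lang: target_lang},
        "translate": {"enabled": "true"},
    }
    return arguments, prefs


def make_translating_driver(source_lang="fr", target_lang="en"):
    """Start Chrome with the settings above (needs selenium and Chrome)."""
    # Imported lazily so build_chrome_settings has no third-party dependency.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    arguments, prefs = build_chrome_settings(source_lang, target_lang)
    for argument in arguments:
        options.add_argument(argument)
    options.add_experimental_option("prefs", prefs)
    return webdriver.Chrome(options=options)
```

Calling `make_translating_driver()` and then `browser.get(...)` as in the question would apply both settings before the first page load.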

Related

Get the current url when it's not valid with Selenium Python

I'm a beginner learning web scraping with Selenium. Recently I ran into the problem that some button elements do not have an "href" attribute linking to the website they lead to. To obtain the link, or useful information from it, I need to click the button and read the current URL in the new window using the "current_url" attribute. However, this doesn't always work when the new URL is not valid. I'm asking for help with a solution.
To give you an example, say one wants to obtain the Spotify link to the song listed on https://www.what-song.com/Tvshow/100242/BoJack-Horseman/e/116712. After clicking on the Spotify button, instead of being directed to spotify web player, I see a new window popping up with this url "spotify:track:6ta5yavnnEfCE4faU0jebM". It's not valid probably due to some errors made by the website, but the identifier "6ta5yavnnEfCE4faU0jebM" is still useful so I want to obtain it.
However, when I try using "current_url", it gives me the original link "https://www.what-song.com/Tvshow/100242/BoJack-Horseman/e/116712" instead of the invalid URL. My code is attached below. Note that I already have a time.sleep call.
Specs: MacOS 12.6, chrome and webdriver version 106.something, Python 3.
s = Service('/web_scraping/chromedriver')
driver = webdriver.Chrome(service=s)
wait = WebDriverWait(driver, 3)
driver.get('https://www.what-song.com/Tvshow/100242/BoJack-Horseman/e/116712')
spotify_button_element = driver.find_element("xpath",'/html/body/div/div[2]/main/div[2]/div/div[1]/div[5]/div[1]/div[2]/div/div/div[2]/div/div[1]/button[3]')
driver.execute_script("arguments[0].click();", spotify_button_element)
time.sleep(3)
print(driver.current_url)
Any idea why this happened and how to fix it? Huge thanks in advance!
What you could do instead of finding the button to click and opening a new tab is to do the following:
import json
spotify_data_request = driver.find_element("id",'__NEXT_DATA__') # get the data stored in a script tag with id = '__NEXT_DATA__'
temp = json.loads(spotify_data_request.get_attribute('innerHTML')) # convert the string into a dict like object
print(temp['props']['pageProps']['episode']['songs'][0]['song']['spotifyId']) # get the Id attribute that you want instead of having to click the spotify button and retrieve it from the URL
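To collect the IDs for every song rather than only the first, the same lookup can be wrapped in a small helper. This is only a sketch: it assumes the JSON layout shown in the answer above, the function names are made up, and the `sample` payload is invented test data, not the real page content. A second helper shows how the ID could be cut out of a `spotify:track:...` URI if you do manage to capture one.

```python
import json


def spotify_ids_from_next_data(raw_json):
    """Return the spotifyId of every song in a __NEXT_DATA__ payload.

    Assumes the props -> pageProps -> episode -> songs layout used in the
    answer; songs without an ID yield None.
    """
    data = json.loads(raw_json)
    songs = data["props"]["pageProps"]["episode"]["songs"]
    return [entry["song"].get("spotifyId") for entry in songs]


def track_id_from_uri(uri):
    """Extract the track ID from a 'spotify:track:<id>' URI."""
    prefix = "spotify:track:"
    if not uri.startswith(prefix):
        raise ValueError(f"not a Spotify track URI: {uri!r}")
    return uri[len(prefix):]


# Invented sample payload that mimics the assumed structure.
sample = json.dumps({
    "props": {"pageProps": {"episode": {"songs": [
        {"song": {"spotifyId": "6ta5yavnnEfCE4faU0jebM"}},
        {"song": {}},
    ]}}}
})
```

With a live driver, `raw_json` would be the `innerHTML` of the `__NEXT_DATA__` element, as in the answer.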

Stopping Page Loading - Selenium Python

I unfortunately can not stop a page from loading using Selenium in Python.
I have tried:
driver.execute_script("window.stop();")
driver.set_page_load_timeout(10)
webdriver.ActionChains(driver).send_keys(Keys.ESCAPE).perform()
The page is a .cgi that constantly loads. I would like to either scrape data from a class on the page or the page title, however neither works with the 3 methods above.
When I try to manually press ESC, or click the cross, it works perfectly.
Thank you for reading.
You didn't share your code or the page you are working on, so we can only guess.
If you really tried all of the above correctly and it still didn't help, try adding the eager page load strategy to your driver options.
With the eager page load strategy, WebDriver waits only until the initial HTML document has been completely loaded and parsed (that is, until the DOMContentLoaded event has fired) and does not wait for stylesheets, images, and subframes.
With it, your code will look something like this:
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
options = Options()
options.page_load_strategy = 'eager'
driver = webdriver.Chrome(options=options)
# Navigate to url
driver.get(your_page_url)
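If the page still never finishes loading, the page load timeout and `window.stop()` from the question can be combined in one helper. This is a rough sketch, not a confirmed fix for that .cgi page: the name `load_with_deadline` is made up, and the exception types are passed in as a parameter so the helper itself does not import Selenium.

```python
def load_with_deadline(driver, url, seconds, timeout_errors):
    """Try to load url; on timeout, stop the page and keep the session.

    timeout_errors is a tuple of exception classes that signal a page
    load timeout. Returns True if the load finished, False if it was cut off.
    """
    driver.set_page_load_timeout(seconds)
    try:
        driver.get(url)
        return True
    except timeout_errors:
        # Abort whatever is still loading so the partial DOM can be read.
        driver.execute_script("window.stop();")
        return False
```

With a real driver you would import `TimeoutException` from `selenium.common.exceptions` and call `load_with_deadline(driver, your_page_url, 10, (TimeoutException,))`, then read `driver.title` or locate your element afterwards.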
UPD
You are trying to upload a file with Selenium and doing it wrong.
To upload the file with Selenium you need to send a full file path to that element.
So, if the file you want to upload is located by C:/Model.lp your code should be:
driver.find_element_by_xpath("//input[@name='field.1']").send_keys("C:/Model.lp")

Taking full page screenshot in chrome store SELENIUM PYTHON

I'm trying to save a full-page screenshot of a chrome store page, using selenium, and python 3.
I've searched online for different answers and I keep getting only the "header" part, no matter what I try. As if the page doesn't scroll for the next "section".
I tried clicking inside the page to verify it's in focus but that didn't help.
I also tried answers that stitch screenshots together, and imported Screenshot and Image.
My current code is:
ob = Screenshot_Clipping.Screenshot()
driver2 = webdriver.Chrome(executable_path=chromedriver)
url = "https://chrome.google.com/webstore/detail/online-game-zone-new-tab/abalcghoakdcaalbfadaacmapphamklh"
driver2.get(url)
img_url = ob.full_Screenshot(driver, save_path=r'.', image_name='Myimage.png')
print(img_url)
print('done')
driver2.close()
driver2.quit()
but that gives me this picture:
What am I doing wrong?

How to Scrape data for mobile reviews in flipkart?

How can I scrape the mobile review data from Flipkart?
I tried using the selenium package, but I was unable to extract all the reviews at once, only a single review. Can anyone help me with the code?
from selenium import webdriver
fk_path = 'https://www.flipkart.com/moto-g-turbo-white-16-gb/product-reviews/itmecc4uhbue7ve6?pid=MOBECC4UQTJ5QZFR'
browser = webdriver.Chrome('/home/subhasis/chromedriver')
browser.get(fk_path)
# Mimic clicking on 'Read More'
browser.find_element_by_xpath("//span[@class='_1EPkIx']/span").click()
# Expand all 'Read More' buttons
for p in browser.find_elements_by_xpath("//span[@class='_1EPkIx']/span"):
    p.click()
# Extract the text of the first review from its XPath
browser.find_element_by_xpath("//div[@class='_3DCdKt']//div[@class='qwjRop']/div").text
Try opening the page in a browser such as Firefox or Chrome and checking the XPath selection in the developer console:
$x('//div[@class="col"]')
$x('//div[@class="col"]/*/*/p/text()')
Also consider giving the browser some time to load all of the extra JavaScript before clicking through so quickly. This also helps avoid timeouts caused by being blocked for making too many requests in a short time. Consider pausing between "Read More" clicks:
time.sleep(1)
The reason is that clicking "Read More" appears to trigger a network request.
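The pause can be folded into the loop that expands the reviews. Below is a small sketch of that idea; `click_with_pause` is a made-up helper name, and the one second default is just the value suggested above, not a tuned number.

```python
import time


def click_with_pause(elements, pause_seconds=1.0, sleep=time.sleep):
    """Click each element in turn, sleeping after every click.

    The sleep function is a parameter so the delay can be replaced
    (for example in tests). Returns the number of elements clicked.
    """
    clicked = 0
    for element in elements:
        element.click()
        clicked += 1
        sleep(pause_seconds)  # give the page time to finish its request
    return clicked
```

In the question's code this would be called with the list of "Read More" elements found by the XPath query, in place of clicking them all in one go.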

Redirect the selenium link to another link

When I click a button using Selenium, the browser is redirected to a new page. I want Selenium to follow to that same page as well. How can I do it?
driver.find_element_by_xpath('//span[contains(text(),"Secure Login")]').click()
is the button I am clicking. It redirects me to a new page. I want Selenium to pick up that link and point to the new page.
WebDriver creates an interface between Selenium and your web browser. You don't need to handle anything separately in Selenium if everything happens in the same tab and it's just a navigation to a different link.
However, if your link opens in a different tab, you will need to switch tabs before interacting with the elements on the new page.
Why don't you switch to the new window? I'm not sure about Python, but in Java it looks like this:
String currentState = driver.getWindowHandle().toString();
for (String handles : driver.getWindowHandles())
{
    if (!handles.equalsIgnoreCase(currentState))
    {
        driver.switchTo().window(handles);
    }
}
You can do the same in Python.
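A rough Python equivalent of the Java loop is sketched below. It relies on the Selenium Python bindings' `current_window_handle`, `window_handles` and `switch_to.window`; the function name `switch_to_new_window` is made up here. Unlike the Java loop, it stops at the first handle that differs from the current one.

```python
def switch_to_new_window(driver):
    """Switch the driver to the first window that is not the current one.

    Returns the handle that was switched to, or None if no other
    window or tab is open.
    """
    current = driver.current_window_handle
    for handle in driver.window_handles:
        if handle != current:
            driver.switch_to.window(handle)
            return handle
    return None
```

You would call `switch_to_new_window(driver)` right after the click that opens the new tab, and then locate elements on the new page as usual.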
