Right click save link as then save - python

Hello, I want to right-click a link, choose "Save link as", and then confirm the Save dialog that Windows shows.
This is an example:
https://www.who.int/data/gho/data/indicators/indicator-details/GHO/proportion-of-population-below-the-international-poverty-line-of-us$1-90-per-day-(-)
On this page, in the Data tab, you can see "EXPORT DATA in CSV format: Right-click here & Save link".
So if you right-click it and choose "Save link as", it lets you save the data as CSV.
I want to automate that. Can it be done using Selenium with Python, and if so, how?
I tried using ActionChains but I'm not sure that's going to work.

I think you would be better off using the Data API (json) offered on the same page, but I have managed to download the file using the code below and Google Chrome.
There is a lot going on which I didn't want to go into (hence the lazy usage of an occasional sleep), but the basic principle is that the Export link is inside a frame inside a frame (and there are many iframes on the page). Once the correct iframe has been found and the link located, a right click brings up the system menu.
This menu cannot be accessed via Selenium (it is a native browser menu, not part of the page), so pyautogui is used to move down to "Save link as..." and also to click the "Save" button on the Save As dialog:
url = "https://www.who.int/data/gho/data/indicators/indicator-details/GHO/proportion-of-population-below-the-international-poverty-line-of-us$1-90-per-day-(-)"
driver.get(url)
driver.execute_script("window.scrollBy(0, 220);")
button = driver.find_element(By.CSS_SELECTOR, "button#dataButton")
button.click()
WebDriverWait(driver, 5).until(EC.text_to_be_present_in_element_attribute((By.CSS_SELECTOR, "button#dataButton"), "class", "active"))
time.sleep(10)
iframes = driver.find_elements(By.CSS_SELECTOR, "iframe[src*='https://app.powerbi.com/reportEmbed'")
for i in iframes:
try:
driver.switch_to.frame(i)
iframes2 = driver.find_elements(By.CSS_SELECTOR, "iframe[src*='cvSandboxPack.html']")
for i2 in iframes2:
try:
driver.switch_to.frame(i2)
downloads = driver.find_elements(By.CSS_SELECTOR, "a[download='data.csv']")
if len(downloads) > 0:
ActionChains(driver).context_click(downloads[0]).perform()
# Selenium cannot communicate with system dialogs
time.sleep(1)
pyautogui.typewrite(['down','down','down','down','enter'])
time.sleep(1)
pyautogui.press('enter')
time.sleep(2)
return
except StaleElementReferenceException as e:
continue
finally:
driver.switch_to.frame(i)
iframes2 = WebDriverWait(driver, 5).until(EC.presence_of_all_elements_located((By.TAG_NAME, "iframe")))
except StaleElementReferenceException as e:
continue
finally:
driver.switch_to.default_content()
iframes = WebDriverWait(driver, 5).until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, "iframe[src*='https://app.powerbi.com/reportEmbed']")))
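As for the Data API suggestion at the top of this answer, a minimal sketch of that route is below. It assumes the indicator is exposed through the public GHO OData endpoint (https://ghoapi.azureedge.net/api/), and the indicator code SI_POV_DAY1 is an assumption you should verify via the API link on the indicator page:

# Sketch: pull the same indicator through the GHO OData API instead of the Power BI export.
# SI_POV_DAY1 is an assumed indicator code -- confirm it on the page's Data API tab.
import csv

import requests

INDICATOR_CODE = "SI_POV_DAY1"  # assumption; check the indicator's API link

resp = requests.get(f"https://ghoapi.azureedge.net/api/{INDICATOR_CODE}", timeout=30)
resp.raise_for_status()
rows = resp.json()["value"]  # OData responses put the records under "value"

fieldnames = sorted({key for row in rows for key in row})
with open("data.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)

print(f"wrote {len(rows)} rows to data.csv")

This avoids the browser, the nested iframes, and pyautogui entirely.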

Related

Open all links in a new tab using xpath and run a command Selenium - Python

New to selenium. I am trying to have Selenium open a new tab for each/every link provided in a dynamic table. I would like to run the same command on each tab (input information). Below is my current code. The code allows me to search and access the data table using its xpath. In the xpath I put a * in the "tr[ ]" which allows me to select all the needed links in the table cells. However, the code only takes me to the first link in the data table and runs the command.
driver.get('https://www.#.com')
driver.maximize_window()
time.sleep(2)
link = driver.find_element('link text', 'Search')
link.click()
time.sleep(4)
table = driver.find_element('xpath', '//*[@id="TableBody"]/tr[*]/td[1]/div[2]/a')
table.click()
time.sleep(4)
#Enter text in the links page
high_field = driver.find_element('id', 'formBid')
high_field.send_keys('#')
low_field = driver.find_element('id', 'formBidMin')
low_field.send_keys('#')
low_field.send_keys(Keys.RETURN)
print('Text entered!')
time.sleep(10)
How can I solve the following:
Have each link open in a new tab
Run a command in each tab (same command)
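One possible approach (a sketch, untested because the site URL is redacted; it assumes Selenium 4 for switch_to.new_window and reuses the table XPath and form field ids from the question):

# Sketch: collect every link's href first, then open each one in its own tab
# and run the same fill-in steps. Locators and the '#' values come from the question.
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys

links = driver.find_elements(By.XPATH, '//*[@id="TableBody"]/tr[*]/td[1]/div[2]/a')
hrefs = [link.get_attribute('href') for link in links]  # grab hrefs before navigating away

original_tab = driver.current_window_handle
for href in hrefs:
    driver.switch_to.new_window('tab')   # opens a fresh tab and switches to it
    driver.get(href)
    # run the same command on each page
    driver.find_element(By.ID, 'formBid').send_keys('#')
    low_field = driver.find_element(By.ID, 'formBidMin')
    low_field.send_keys('#')
    low_field.send_keys(Keys.RETURN)

driver.switch_to.window(original_tab)    # back to the search results tab

Collecting the hrefs up front avoids stale element errors, since the original table elements are not touched again after navigation.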

OSU, Download links open beatmap page instead of downloading the beatmap file

I noticed that the beatmap packages available officially in osu! are about 98% songs I don't care to play. The same goes for the unofficial mega packs you can find, which have 20 GB of songs per year (2011, 2012, 2013, etc.).
I did find that the "most favourites" page in osu! (https://osu.ppy.sh/beatmapsets?sort=favourites_desc) has a good chunk of songs that I like or would play.
So I tried to create a Python script which would click the download button on every beatmap panel.
I learned a lot during this process: ActionChains move_to_element (hover menus), waiting until clickable, StaleElementReference exceptions, and scrolling the page with execute_script.
I kept having a hard time with elements disappearing from the page/DOM, which broke a "for element in elements" loop, so I decided to have it scroll multiple times to load more beatmaps and then scrape for href links containing the word "Download". This worked great for capturing most of the links; I captured at least 3000 unique links.
I put it in a text file and it looks like this:
...
https://osu.ppy.sh/beatmapsets/1457867/download
https://osu.ppy.sh/beatmapsets/881996/download
https://osu.ppy.sh/beatmapsets/779173/download
https://osu.ppy.sh/beatmapsets/10112/download
https://osu.ppy.sh/beatmapsets/996628/download
https://osu.ppy.sh/beatmapsets/415886/download
https://osu.ppy.sh/beatmapsets/490662/download
...
The "Download" button on each panel all have this HREF link. If you click the button you download the beatmap file which is a .osz filetype. However, if you "right-click -> copy-link" from the "Download" button and you open it from a new-page or new-tab it will re-direct to the beatmaps page and not download the file.
I make it work by using the Pandas module to read a .xlxs excel file for URLs and loop for each url. Once the url page is opened it clicks the Download button:
def read_excel():
    import pandas as pd
    df = pd.read_excel('book.xlsx')  # Get all the urls from the excel
    mylist = df['urls'].tolist()  # urls is the column name
    print(mylist)  # will print all the urls
    # now loop through each url & perform actions.
    for url in mylist:
        options = webdriver.ChromeOptions()
        options.add_experimental_option('excludeSwitches', ['enable-logging'])
        options.add_argument("user-data-dir=C:\\Users\\%UserName%\\AppData\\Local\\Google\\Chrome\\User Data\\Profile1")
        driver = webdriver.Chrome(executable_path=driver_path, chrome_options=options)
        driver.get(url)
        try:
            WebDriverWait(driver, 3).until(EC.alert_is_present(), 'Timed out waiting for alert.')
            alert = driver.switch_to.alert
            alert.accept()
            print("alert accepted")
        except TimeoutException:
            print("no alert")
        time.sleep(1)
        wait = WebDriverWait(driver, 10)
        try:
            wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "body > div.osu-layout__section.osu-layout__section--full.js-content.beatmaps_show > div > div > div:nth-child(2) > div.beatmapset-header > div > div.beatmapset-header__box.beatmapset-header__box--main > div.beatmapset-header__buttons > a:nth-child(2) > span"))).click()
            time.sleep(1)
        except Exception:
            print("Can't find the Element Download")
            time.sleep(10)
        download_file()
        driver.close()
This is a sequential, "one at a time" function; the download_file() function is a loop which checks the download folder to see whether a file is being downloaded, and if not, it goes on to the next URL.
This works. Of course the website has limitations: you can only download a maximum of 8 at a time, and after 100 to 200 downloads you can't download any more and have to wait a bit. But the loop keeps going and tries each URL unless you stop the script. Luckily, you can see the last beatmap that was downloaded, find where it is in the Excel spreadsheet, remove the rows above it, and start the script again. I'm sure I can code it so the loop stops when no new file shows up in the Downloads folder (see the sketch below).
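For reference, a minimal sketch of that kind of download-folder check. The helper name new_osz_appeared is hypothetical, the download directory is an assumption, and it relies on Chrome's .crdownload temp files:

# Sketch: stop the loop once no new .osz file appears within a timeout.
# DOWNLOAD_DIR is an assumption -- point it at the folder Chrome actually downloads to.
import glob
import os
import time

DOWNLOAD_DIR = os.path.expanduser("~/Downloads")

def new_osz_appeared(count_before, timeout=60):
    """Return True once the folder holds more .osz files than count_before and
    nothing is still downloading; False if the timeout expires first."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        in_progress = glob.glob(os.path.join(DOWNLOAD_DIR, "*.crdownload"))
        osz_files = glob.glob(os.path.join(DOWNLOAD_DIR, "*.osz"))
        if len(osz_files) > count_before and not in_progress:
            return True
        time.sleep(1)
    return False

# usage inside the loop, counting before each click:
# count_before = len(glob.glob(os.path.join(DOWNLOAD_DIR, "*.osz")))
# ...click the Download button...
# if not new_osz_appeared(count_before):
#     break  # no new file showed up, probably rate limited -- stop the loop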
Finally, the question: is there a way to open these download links and download the file without having to click the "Download" button after opening the page? It redirects to the beatmap page instead of downloading the file automatically. It must be some JavaScript/HTML behaviour I don't know about.

selenium.common.exceptions.ElementNotInteractableException: Message: element not interactable after switching the frame

I want to automate filling Google Forms with images from Google Drive, so I am using Python with Selenium.
I can automate the first image, but I get the error when I repeat the same process for the second image.
I created a sample Google Form to test:
https://forms.gle/fZ439KckkbpNU68v5
time.sleep(5)
driver.find_element_by_xpath('xpath to first add file button').click()
frame = driver.find_element_by_class_name('picker-frame')
driver.switch_to.frame(frame)
# the first time, this works
driver.switch_to_default_content()
time.sleep(5)
driver.find_element_by_xpath('xpath to second add file button').click()
frame = driver.find_element_by_class_name('picker-frame')
driver.switch_to.frame(frame)
# after this, it causes an error
I could not use the send_keys() method because this is a button-type element.
Full code:
from selenium import webdriver
import random,time
option=webdriver.ChromeOptions()
driver = webdriver.Chrome("path to chrome driver")
driver.get("https://forms.gle/RvBYTZG5xuWoD8QA7")
driver.find_element_by_xpath('//*[#id="identifierId"]').send_keys("gmail")
driver.find_element_by_xpath('//*[#id="identifierNext"]/div/button').click()
time.sleep(3)
driver.find_element_by_xpath('//*[#id="password"]/div[1]/div/div[1]/input').send_keys("password")
driver.find_element_by_xpath('//*[#id="passwordNext"]/div/button').click()
time.sleep(5)
driver.find_element_by_xpath('//*[#id="mG61Hd"]/div[2]/div/div[2]/div[1]/div/div/div[2]/div/div[2]/span').click()
frame1 = driver.find_element_by_class_name('picker-frame')
driver.switch_to.frame(frame1)
driver.find_element_by_xpath('//*[#id=":6"]').click()
time.sleep(5)
driver.find_element_by_xpath('//*[#id=":1a.docs.0.1oe7tESa9uNnUC75FdYuqzPtPbLYz7QFi"]/div[2]').click()
driver.find_element_by_xpath('//*[#id="picker:ap:2"]').click()
driver.switch_to_default_content()
driver.find_element_by_xpath('//*[#id="mG61Hd"]/div[2]/div/div[2]/div[2]/div/div/div[2]/div/div[2]').click()
frame2 = driver.find_element_by_class_name('picker-frame')
driver.switch_to.frame(frame2)
driver.find_element_by_xpath('//*[#id=":6"]').click()
time.sleep(3)
This is a guess since no HTML was provided, but maybe the element's location in the iframe isn't correct: if you give it a path relative to the wrong (outer) iframe it won't work. I suggest checking whether the button lives inside a specific iframe, switching to that iframe, and giving the path relative to it.
Or the element path isn't unique; that causes problems too.
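A rough sketch of that idea (only the class name picker-frame comes from the question; the other locators are placeholders you would have to inspect for):

# Sketch: switch into the picker iframe, and if the button lives in a nested iframe,
# step one level deeper before locating it. Placeholder locators must be replaced.
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

wait = WebDriverWait(driver, 10)

outer = wait.until(EC.presence_of_element_located((By.CLASS_NAME, 'picker-frame')))
driver.switch_to.frame(outer)

# if the button sits inside a further nested iframe, switch into that one too
inner_frames = driver.find_elements(By.TAG_NAME, 'iframe')
if inner_frames:
    driver.switch_to.frame(inner_frames[0])

wait.until(EC.element_to_be_clickable((By.XPATH, 'xpath to the button'))).click()

# always climb back out before touching elements of the main form again
driver.switch_to.default_content()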
I think you are typing it wrongly. You should try this code.
driver.switch_to.default_content()
The solution is to refresh after switching the frame, without having to use switch_to_default_content() at all.
driver.refresh()
driver.switch_to.alert.accept()
It works.

ElementClickInterceptedException: element click intercepted. Other element would receive the click: Selenium Python

I have seen other questions about this error, but in my case the other element should receive the click. In detail: the webdriver is scrolling through Google search results and must click every website it finds, but this error prevents that. How can I make it NOT go to the previous site it already clicked?
This is the function. The program loops over it; after the first loop it scrolls down and the error occurs:
def get_info():
    browser.switch_to.window(browser.window_handles[2])
    description = WebDriverWait(browser, 10).until(
        EC.presence_of_element_located((By.TAG_NAME, "h3"))
    ).text
    site = WebDriverWait(browser, 10).until(
        EC.presence_of_element_located((By.TAG_NAME, "cite"))
    )
    site.click()
    url = browser.current_url
    # removes the https:// and the / of the url
    # to get just the domain of the website
    try:
        link = url.split("https://")
        link1 = link[1].split("/")
        link2 = link1[0]
        link3 = link2.split("www.")
        real_link = link3[1]
    except IndexError:
        link = url.split("https://")
        link1 = link[1].split("/")
        real_link = link1[0]
    time.sleep(3)
    screenshot = browser.save_screenshot("photos/" + "(" + real_link + ")" + ".png")
    global content
    content = []
    content.append(real_link)
    content.append(description)
    print(content)
    browser.back()
    time.sleep(5)
    browser.execute_script("window.scrollBy(0,400)", "")
    time.sleep(5)
You can create a list of clicked websites and check each time whether that link has already been clicked. Here's the demo code:
clicked_website = []
url = browser.current_url
clicked_website.append(url)
# Now while clicking
if <new_url> not in clicked_website:
    <>.click()
This is just an idea of how to implement it. Your code is a mess and I didn't understand it clearly, so implement it in your own code yourself.
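A slightly fuller sketch of the same idea, using a set of already-visited URLs. The //a[h3] locator for Google result links is an assumption, and navigating with browser.get() instead of clicking the cite element sidesteps the intercepted click:

# Sketch: remember every URL that has already been opened and skip it next time.
# '//a[h3]' is a placeholder locator for result links -- adjust to the real page.
from selenium.webdriver.common.by import By

visited = set()

def click_next_unvisited(browser):
    for link in browser.find_elements(By.XPATH, "//a[h3]"):
        href = link.get_attribute("href")
        if href and href not in visited:
            visited.add(href)
            browser.get(href)   # navigating directly avoids the intercepted click
            return True         # handled one new site
    return False                # nothing new on this page; scroll and try again

Call it in the main loop instead of site.click(); when it returns False, scroll down to load more results and call it again.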

can not switch to a new page using python selenium webdriver?

I need to scrape the CSV result of the website (https://requestmap.webperf.tools/). But I can’t.
This is the process I need to do:
1- load the website (https://requestmap.webperf.tools/)
2- enter a new website as an input (for example, https://stackoverflow.com/)
3- click submit
4- in the new page which opens after the submit, download the csv file at the end of the page
But I think the driver is still on the main page and doesn't switch to the new page; that's why I can't download the CSV file.
Would you please tell me how to do it?
here is my code:
options = webdriver.ChromeOptions()
options.add_argument('-headless')
options.add_argument('-no-sandbox')
options.add_argument('-disable-dev-shm-usage')
driver = webdriver.Chrome('chromedriver',options=options)
driver.get("https://requestmap.webperf.tools/")
driver.find_element_by_id("url").send_keys("https://stackoverflow.com/")
driver.find_element_by_id("submitter").click()
I have tested different ways to solve my issue.
First of all, I got this code from here, but it doesn't work:
window_after = driver.window_handles[1]
driver.switch_to.window(window_after)
I also tried this and this, but they don't work either.
# wait to make sure there are two windows open
# it is not working
WebDriverWait(driver, 30).until(lambda d: len(d.window_handles) == 2)
# switch windows
# it is not working
driver.switch_to_window(driver.window_handles[1])
content = driver.page_source
soup = BeautifulSoup(content)
driver.find_elements_by_name("Download CSV")
So how can I get this issue solved?
Is there any other way in Python to do this and switch to the new window?
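One thing worth checking (a sketch under assumptions: the result may load in the same tab, which would explain why a second window handle never appears, and the CSV link is assumed to be findable by its visible text rather than by name):

# Sketch: handle both "new tab" and "same tab" cases after submitting, then wait
# for the CSV link. The PARTIAL_LINK_TEXT locator for "Download CSV" is an assumption.
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

original = driver.current_window_handle
driver.find_element(By.ID, "submitter").click()

try:
    WebDriverWait(driver, 30).until(lambda d: len(d.window_handles) > 1)
    new_handle = [h for h in driver.window_handles if h != original][0]
    driver.switch_to.window(new_handle)          # the result opened in a new tab
except TimeoutException:
    pass                                         # the result loaded in the same tab

# building the request map can take a while, so use a generous wait
csv_link = WebDriverWait(driver, 180).until(
    EC.element_to_be_clickable((By.PARTIAL_LINK_TEXT, "Download CSV")))
print(csv_link.get_attribute("href"))            # or csv_link.click() to download

Note that find_elements_by_name("Download CSV") in the attempt above matches a name="..." attribute, not the visible link text, which is likely why it returned nothing.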
