I have a web form that I fill in row by row with a Selenium script. Each time I click add_event_button, a new row is created, and I then autofill that row from date_field through remarks_field. All the fields within one row have similar ID attributes. The only difference is that each new row created by add_event_button has the number in its IDs incremented by 1 across all fields.
Here is what I have done:
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.support.ui import Select
import time
import pandas as pd
from time import sleep
# keep the element itself; click() returns None, so do not assign its result
add_event_button = wait.until(EC.element_to_be_clickable((By.XPATH, '//*[@id="tab-voyage-log-12780"]/div[2]/a[2]')))
add_event_button.click()
time.sleep(2)
date_field = wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, '#WGItem02_voyage_log-25_4')))
date_field.send_keys(date_time_row[1])
location_field = Select(driver.find_element(By.ID, 'WGItem09_voyage_log-2_4'))
location_field.select_by_index(8)
event_field = Select(driver.find_element(By.ID, 'WGItem04_voyage_log-2_4'))
event_field.select_by_index(4)
subevent_field = Select(driver.find_element(By.ID, 'WGItem05_voyage_log-2_4'))
subevent_field.select_by_index(3)
remarks_field = wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, '#WGItem06_voyage_log-2_4')))
remarks_field.send_keys(remarks_data_col[1])
How do I repeat the block of code above up to 20 times? For example, WGItem05_voyage_log-2_4 should advance until it stops at WGItem05_voyage_log-20_4, and remarks_field.send_keys(remarks_data_col[1]) should advance until remarks_field.send_keys(remarks_data_col[20]).
for i in range(2,21):
    remarks_field = wait.until(EC.element_to_be_clickable((By.XPATH, f"//*[@id='WGItem06_voyage_log-{i}_4']")))
    remarks_field.send_keys(remarks_data_col[i-1])
Did you mean looping like this, using an f-string to build the XPath of each row's element?
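A minimal sketch of how the whole row, not only the remarks field, could be driven from one loop. It only builds the element IDs for each row. The field prefixes come from the question; the names `FIELD_PREFIXES` and `row_ids` are hypothetical, and it assumes every field in a row carries the same row number (the date selector in the question shows -25_4, which may be a typo).

```python
# Field-ID prefixes as they appear in the question; the dict keys are
# illustrative names, not anything defined by the page.
FIELD_PREFIXES = {
    "date": "WGItem02",
    "location": "WGItem09",
    "event": "WGItem04",
    "subevent": "WGItem05",
    "remarks": "WGItem06",
}


def row_ids(row):
    """Return the element ID of every field in the given row number."""
    return {name: f"{prefix}_voyage_log-{row}_4"
            for name, prefix in FIELD_PREFIXES.items()}


# Rows 2..20, matching range(2, 21) in the loop above.
for row in range(2, 21):
    ids = row_ids(row)
    # With Selenium, each ID would be used like this (not executed here):
    #   Select(driver.find_element(By.ID, ids["location"])).select_by_index(8)
    #   wait.until(EC.element_to_be_clickable((By.ID, ids["remarks"]))) \
    #       .send_keys(remarks_data_col[row - 1])
    print(row, ids["remarks"])
```

Locating by By.ID instead of a CSS selector or XPath avoids having to quote the ID inside another expression.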
I'm trying to run the following piece of code:
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
driver = webdriver.Chrome('C:/Users/SoumyaPandey/Desktop/Galytix/Scrapers/data_ingestion/chromedriver.exe')
driver.get('https://www.cnhindustrial.com/en-us/media/press_releases/Pages/default.aspx')
years_urls = list()
#ctl00_ctl33_g_8893c127_d0ad_40f2_9856_d85936172f35_years --> id for the year filter
years_elements = driver.find_element_by_id('ctl00_ctl33_g_8893c127_d0ad_40f2_9856_d85936172f35_years').find_elements_by_tag_name('a')
for i in range(len(years_elements)):
    years_urls.append(years_elements[i].get_attribute('href'))
newslinks = list()
for k in range(len(years_urls)):
    url = years_urls[k]
    driver.get(url)
    #link-detailpage --> id for the newslinks in each year
    news = driver.find_elements_by_class_name('link-detailpage')
    for j in range(len(news)):
        newslinks.append(news[j].find_element_by_tag_name('a').get_attribute('href'))
When I run this code, the newslinks list is empty at the end of execution. But if I run it line by line, assigning the value of 'k' myself one at a time, it runs successfully.
Where am I going wrong in the logic? Please help.
There is a lot of redundant code. I would suggest using either a single XPath or a CSS selector to identify the elements.
However, on some of the pages the news links do not appear, so you need to handle that with try..except.
Since you need to navigate to each URL, I would suggest using an explicit wait, WebDriverWait().
Code:
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
driver=webdriver.Chrome("C:/Users/SoumyaPandey/Desktop/Galytix/Scrapers/data_ingestion/chromedriver.exe")
driver.get("https://www.cnhindustrial.com/en-us/media/press_releases/Pages/default.aspx")
allyears=WebDriverWait(driver,10).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR,"div#ctl00_ctl33_g_8893c127_d0ad_40f2_9856_d85936172f35_years a")))
yearsurl=[url.get_attribute("href") for url in allyears]
newslinks = list()
for yr in yearsurl:
    driver.get(yr)
    try:
        for element in WebDriverWait(driver,5).until(EC.presence_of_all_elements_located((By.CSS_SELECTOR,"div.link-detailpage >a"))):
            newslinks.append(element.get_attribute("href"))
    except:
        continue
print(newslinks)
Output:
['https://www.cnhindustrial.com/en-us/media/press_releases/2021/march/Pages/a-problem-solved-at-a-rate-of-knots-the-latest-Top-Story-available-on-CNHIndustrial-com.aspx', 'https://www.cnhindustrial.com/en-us/media/press_releases/2021/march/Pages/CNH-Industrial-acquires-a-minority-stake-in-Augmenta.aspx', 'https://www.cnhindustrial.com/en-us/media/press_releases/2021/march/Pages/CNH-Industrial-presents-YOUNIVERSE.aspx', 'https://www.cnhindustrial.com/en-us/media/press_releases/2021/march/Pages/Calling-of-the-Annual-General-Meeting.aspx', 'https://www.cnhindustrial.com/en-us/media/press_releases/2021/march/Pages/CNH-Industrial-completes-minority-investment-in-Monarch-Tractor.aspx', 'https://www.cnhindustrial.com/en-us/media/press_releases/2021/February/Pages/CNH-Industrial-N-V--announces-the-extension-by-one-additional-year-to-March-2026-of-its-syndicated-credit-facility.aspx', 'https://www.cnhindustrial.com/en-us/media/press_releases/2021/February/Pages/Working-for-a-safer-future-with-World-Class-Manufacturing.aspx', 'https://www.cnhindustrial.com/en-us/media/press_releases/2021/February/Pages/Behind-the-Wheel-CNH-Industrial-supports-the-growing-hemp-industry-in-North-America.aspx', 'https://www.cnhindustrial.com/en-us/media/press_releases/2021/February/Pages/CNH-Industrial-employees-in-Italy-to-receive-contractual-bonus-for-2020-results.aspx', 'https://www.cnhindustrial.com/en-us/media/press_releases/2021/February/Pages/2020-Fourth-Quarter-and-Full-Year-Results.aspx', 'https://www.cnhindustrial.com/en-us/media/press_releases/2021/january/Pages/The-Iveco-Defence-Vehicles-plant-in-Sete-Lagoas,-Brazil-and-the-New-Holland-Agriculture-facility-in-Croix,-France.aspx', 'https://www.cnhindustrial.com/en-us/media/press_releases/2021/january/Pages/CNH-Industrial-to-announce-2020-Fourth-Quarter-and-Full-Year-financial-results-on-February-3-2021.aspx', 
'https://www.cnhindustrial.com/en-us/media/press_releases/2021/january/Pages/CNH-Industrial-publishes-its-2021-Corporate-Calendar.aspx', 'https://www.cnhindustrial.com/en-us/media/press_releases/2021/january/Pages/Iveco-Defence-Vehicles-supplies-third-generation-protected-military-GTF8x8-(ZLK-15t)-trucks-to-the-German-Army.aspx', 'https://www.cnhindustrial.com/en-us/media/press_releases/2021/january/Pages/STEYR-New-Holland-Agriculture-CASE-Construction-Equipment-and-FPT-Industrial-win-prestigious-2020-Good-Design%C2%AE-Awards.aspx', 'https://www.cnhindustrial.com/en-us/media/press_releases/2021/january/Pages/CNH-Industrial-completes-the-acquisition-of-four-divisions-of-CEG-in-South-Africa.aspx',so on...]
Update:
If you don't want to use WebDriverWait, which is best practice, then use time.sleep(), since the page needs some time to load and the element should be visible before you interact with it.
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
import time
driver = webdriver.Chrome("C:/Users/SoumyaPandey/Desktop/Galytix/Scrapers/data_ingestion/chromedriver.exe")
driver.get('https://www.cnhindustrial.com/en-us/media/press_releases/Pages/default.aspx')
years_urls = list()
time.sleep(5)
#ctl00_ctl33_g_8893c127_d0ad_40f2_9856_d85936172f35_years --> id for the year filter
years_elements = driver.find_elements_by_xpath('//div[@id="ctl00_ctl33_g_8893c127_d0ad_40f2_9856_d85936172f35_years"]//a')
for i in range(len(years_elements)):
    years_urls.append(years_elements[i].get_attribute('href'))
print(years_urls)
newslinks = list()
for k in range(len(years_urls)):
    url = years_urls[k]
    driver.get(url)
    time.sleep(3)
    news = driver.find_elements_by_xpath('//div[@class="link-detailpage"]/a')
    for j in range(len(news)):
        newslinks.append(news[j].get_attribute('href'))
print(newslinks)
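The difference between a fixed sleep and an explicit wait can be sketched in plain Python: an explicit wait polls a condition until it returns something truthy or a timeout expires, so it continues as soon as the page is ready. `wait_until` is a hypothetical helper for illustration, not Selenium's implementation; WebDriverWait additionally passes the driver to the condition and ignores certain exceptions while polling.

```python
import time


def wait_until(condition, timeout=5.0, poll=0.1):
    """Call condition() repeatedly until it is truthy or timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} s")
        time.sleep(poll)


# Simulated page whose links only "appear" on the third check.
checks = {"count": 0}


def links_present():
    checks["count"] += 1
    return ["link-1", "link-2"] if checks["count"] >= 3 else []


links = wait_until(links_present, timeout=2.0, poll=0.01)
print(links, "after", checks["count"], "checks")
```

A fixed time.sleep(3) always costs three seconds and still fails if the page takes four; polling returns early on fast pages and tolerates slow ones up to the timeout.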
There is a popup asking you to accept cookies that you need to click beforehand.
Add this to your script:
WebDriverWait(driver, 5).until(EC.element_to_be_clickable((By.ID, "CybotCookiebotDialogBodyButtonAccept")))
driver.find_element_by_id("CybotCookiebotDialogBodyButtonAccept").click()
So the final result will be:
from selenium import webdriver
from selenium.webdriver.support.wait import WebDriverWait
from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
driver = webdriver.Chrome('C:/Users/SoumyaPandey/Desktop/Galytix/Scrapers/data_ingestion/chromedriver.exe')
driver.get('https://www.cnhindustrial.com/en-us/media/press_releases/Pages/default.aspx')
# this part is added, together with the necessary imports
WebDriverWait(driver, 5).until(EC.element_to_be_clickable((By.ID, "CybotCookiebotDialogBodyButtonAccept")))
driver.find_element_by_id("CybotCookiebotDialogBodyButtonAccept").click()
years_urls = list()
#ctl00_ctl33_g_8893c127_d0ad_40f2_9856_d85936172f35_years --> id for the year filter
# years_elements = driver.find_element_by_css_selector("#ctl00_ctl33_g_8893c127_d0ad_40f2_9856_d85936172f35_years")
years_elements = driver.find_element_by_id('ctl00_ctl33_g_8893c127_d0ad_40f2_9856_d85936172f35_years').find_elements_by_tag_name('a')
for i in range(len(years_elements)):
    years_urls.append(years_elements[i].get_attribute('href'))
newslinks = list()
for k in range(len(years_urls)):
    url = years_urls[k]
    driver.get(url)
    #link-detailpage --> id for the newslinks in each year
    news = driver.find_elements_by_class_name('link-detailpage')
    for j in range(len(news)):
        newslinks.append(news[j].find_element_by_tag_name('a').get_attribute('href'))
This is my first Selenium project and I am trying to submit this form:
https://docs.google.com/forms/d/e/1FAIpQLSdwD1y2eBOoJ-hvk97wDGptfI9oYga8SqtUz7u3nrFbWM7hxw/viewform
I succeeded in clicking the Multiple choice and Checkboxes items, but I can't select the Drop down. I tried the following with no success. All pointers are welcome:
import os
import requests
from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager
import time
from selenium.webdriver.common.action_chains import ActionChains
driver=webdriver.Chrome(ChromeDriverManager().install())
driver.maximize_window()
driver.get("https://docs.google.com/forms/d/e/1FAIpQLSdwD1y2eBOoJ-hvk97wDGptfI9oYga8SqtUz7u3nrFbWM7hxw/viewform")
multiFirst = driver.find_element_by_xpath("//*[@id=\"i5\"]/div[3]/div")
multiFirst.click()
checkBocSecond = driver.find_element_by_xpath("//*[@id=\"i22\"]/div[2]")
checkBocSecond.click()
dropdownThird = driver.find_element_by_xpath("//*[@id=\"mG61Hd\"]/div[2]/div/div[2]/div[3]/div/div/div[2]/div/div[2]/div[5]")
ActionChains(driver).move_to_element(dropdownThird).click(dropdownThird).perform()
dropdownThird = driver.find_element_by_xpath(
    '(//span[@class="quantumWizMenuPaperselectContent exportContent"])[1]')
dropdownThird.click()
WebDriverWait(driver, 5000).until(EC.presence_of_element_located(
    (By.XPATH, '(//*[@jscontroller="liFoG"]//*[contains(text(),"Option 1")])[2]'))).click()
Your locator is incorrect; use the locator above.
You have to import:
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
Selenium provides the Select class for selecting from drop-downs. Note that it works only on HTML <select> elements.
from selenium.webdriver.support.select import Select
selection = Select(driver.find_element_by_xpath("yourXPATH"))
There are 3 ways to select:
select_by_index
selection.select_by_index(yourIndex)
It selects by index starting from 0 to end of your drop-down.
select_by_visible_text
selection.select_by_visible_text('yourText')
It uses the text inside the drop down.
select_by_value
selection.select_by_value('Value')
It uses the value attribute of drop-down list.
I wrote a script to scrape hotel images on TripAdvisor and I am able to iterate through all of them. My problem is knowing when I have finished scrolling within the popup window. I am unable to create a condition that breaks out of my loop so that I can then parse all of the image URLs; the script stays inside the loop forever. What should my if condition be in order to leave the loop? Any help is greatly appreciated!
# import dependencies
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.wait import WebDriverWait
import re
import selenium
import io
import pandas as pd
import urllib.request
import urllib.parse
import requests
from bs4 import BeautifulSoup
import pandas as pd
from selenium.webdriver.common.action_chains import ActionChains
from selenium import webdriver
import time
from _datetime import datetime
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
options = webdriver.ChromeOptions()
options.headless=False
# set the preferences before creating the driver, and pass the options to it
prefs = {"profile.default_content_setting_values.notifications" : 2}
options.add_experimental_option("prefs", prefs)
driver = webdriver.Chrome("/Users/rishi/Downloads/chromedriver 3", options=options)
driver.maximize_window()
#open up website
driver.get(
    "https://www.tripadvisor.com/Hotel_Review-g28970-d84078-Reviews-Hyatt_Regency_Washington_on_Capitol_Hill-Washington_DC_District_of_Columbia.html#/media/84078/?albumid=101&type=2&category=101")
image_url = []
end = False
while not(end):
    old_image_length = len(image_url)
    #wait until element is found and then store all webelements into list
    images = WebDriverWait(driver, 20).until(
        EC.presence_of_all_elements_located(
            (By.XPATH, '//*[@class="media-viewer-dt-root-GalleryImageWithOverlay__galleryImage--1Drp0"]')))
    #iterate through visible images and add their url to list
    for index, image in enumerate(images):
        image_url.append(images[index].value_of_css_property("background-image"))
    new_image_length = len(image_url)
    #move to next visible images in the array which is the last one
    driver.execute_script("arguments[0].scrollIntoView();", images[-1])
    #wait one second
    time.sleep(1)
    #if the first and last image in the arrays are the same for visibility then get out
    if(old_image_length == new_image_length):
        end = True
#clean the list to provide clear links
for i in range(len(image_url)):
    start = image_url[i].find("url(\"") + len("url(\"")
    end = image_url[i].find("\")")
    print(image_url[i][start:end])
#print(image_url)
@Rishiraj Kanugo You can check the visibility of the last element in the popup to make sure that the popup is fully scrolled to the bottom, for example: if element.is_displayed(): print("popup is fully scrolled down")
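One way to express a stopping condition, sketched in plain Python under the assumption that the gallery shows no new image URLs once it is fully scrolled. The loop in the question appends every visible image on each pass, so the list length always grows; tracking already-seen values in a set avoids that. `collect_until_stable`, `get_visible` and `scroll` are hypothetical names: in the real script `get_visible` would return the background-image values of the located elements, and `scroll` would run the scrollIntoView call followed by the short sleep.

```python
import re


def extract_url(css_value):
    """Pull the URL out of a CSS value such as url("https://...")."""
    match = re.search(r"""url\(["']?(.*?)["']?\)""", css_value)
    return match.group(1) if match else None


def collect_until_stable(get_visible, scroll, max_rounds=1000):
    """Collect unique items, scrolling until a pass yields nothing new."""
    seen = set()
    ordered = []
    for _ in range(max_rounds):
        new_items = [item for item in get_visible() if item not in seen]
        if not new_items:
            break  # nothing new appeared, assume the end was reached
        for item in new_items:
            if item not in seen:
                seen.add(item)
                ordered.append(item)
        scroll()
    return ordered


# Simulated gallery: each "scroll" reveals the next batch of images.
pages = [['url("a.jpg")', 'url("b.jpg")'],
         ['url("b.jpg")', 'url("c.jpg")'],
         ['url("c.jpg")']]
state = {"page": 0}


def get_visible():
    return pages[state["page"]]


def scroll():
    state["page"] = min(state["page"] + 1, len(pages) - 1)


urls = [extract_url(v) for v in collect_until_stable(get_visible, scroll)]
print(urls)
```

If the site loads images slowly, a single pass without new items may not mean the end; requiring two or three empty passes in a row would be a more cautious variant.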
First, I want to click on the link in the table with the text A218012216.
It seems that the table/link code is hidden inside JS.
I tried several ways, but without success.
This is my code so far:
import time
from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import Select
url0 ="https://ccrecordse.tarrantcounty.com/AssumedNames/SearchEntry.aspx"
driver = webdriver.Chrome(executable_path=r"D:\Python\chromedriver.exe")
driver.get(url0)
time.sleep(3)
#fill the form # select by visible text
selectStart = driver.find_element_by_id('x:11265151.0:mkr:3')
selectStart.send_keys('09/05/2019')
selectEnd = driver.find_element_by_id('x:1246303050.0:mkr:3')
selectEnd.send_keys('09/05/2019')
#submit the form
driver.find_element_by_id("cphNoMargin_SearchButtons2_btnSearch__5").click()
time.sleep(3)
driver.find_element_by_link_text('A218012216').click()
How can I get that information?
The problem is that you are using find_element_by_link_text on an element that is not an <a> tag; link-text locators only work for <a> tags.
In my solution, you will see I am using WebDriverWait, which is best practice in Selenium.
Also, I used an XPath locator with text()=the_text. Note that the_text will need to change as the date changes (you are searching for the future date 09/05/2019, so the site shows the current date instead, and therefore the text will change).
The Solution:
import time
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
url0 ="https://ccrecordse.tarrantcounty.com/AssumedNames/SearchEntry.aspx"
driver = webdriver.Chrome(executable_path=r"D:\Python\chromedriver.exe")
driver.get(url0)
time.sleep(3)
#fill the form # select by visible text
selectStart = driver.find_element_by_id('x:11265151.0:mkr:3')
selectStart.send_keys('02/19/2019')
selectEnd = driver.find_element_by_id('x:1246303050.0:mkr:3')
selectEnd.send_keys('02/19/2019')
#submit the form
driver.find_element_by_id("cphNoMargin_SearchButtons2_btnSearch__5").click()
the_text = "A219002410"
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//*[text()='"+the_text +"']"))).click()
Hope this helps you!
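The text-based XPath from the solution can be wrapped in a small helper so the record number is easy to swap. This is a sketch; `xpath_literal` and `xpath_for_text` are hypothetical names, and the quoting branches only matter if the text itself contains quote characters, since XPath 1.0 has no escape sequence for quotes inside string literals.

```python
def xpath_literal(text):
    """Quote text as an XPath 1.0 string literal."""
    if "'" not in text:
        return f"'{text}'"
    if '"' not in text:
        return f'"{text}"'
    # Both quote types present: join single-quote-free pieces with concat().
    parts = text.split("'")
    return "concat(" + ", \"'\", ".join(f"'{p}'" for p in parts) + ")"


def xpath_for_text(text):
    """XPath matching any element whose text node equals text exactly."""
    return f"//*[text()={xpath_literal(text)}]"


print(xpath_for_text("A219002410"))
```

The result can be passed straight to the wait in the solution, for example EC.element_to_be_clickable((By.XPATH, xpath_for_text(the_text))).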
I've written a script in Python using Selenium to get the converted value of a certain amount. The page produces the converted value when the amount is entered in an input field, and the newly produced value appears next to the amount. When I enter an amount manually, I get the converted value accordingly, but when I do the same programmatically, the value remains unchanged and my scraper gets 0 as the value. How can I make it work?
Link to that webpage: weblink
The script I've tried with:
from selenium.webdriver import Chrome
from contextlib import closing
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.keys import Keys
with closing(Chrome()) as driver:
    wait = WebDriverWait(driver, 10)
    driver.get("find_the_link_above")
    wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, ".OrderForm_input-box_XkGmi input[name='amount']"))).send_keys(100)
    item = wait.until(EC.presence_of_element_located((By.CLASS_NAME,"OrderForm_total_6EL8d"))).text
    print(item)
When I enter an amount manually, the converted value changes (screenshot not preserved).
But when I do the same using the script, the value stays unchanged (screenshot not preserved).
I marked the values in black in the screenshots to show what I meant.
The problem is that you are sending the value too early, before the page is ready, so the converted value does not update after the amount is entered. Here I wait for the EUR SPREAD element to load before setting the amount. You can use the same element or any other of your choice, but make sure the page has loaded completely with that element before sending the amount.
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
chrome_path = r"path"
driver = webdriver.Chrome(chrome_path)
wait = WebDriverWait(driver, 10)
driver.get("https://www.gdax.com/trade/LTC-EUR")
wait.until(EC.presence_of_element_located((By.XPATH, "//div[@class='OrderBookPanel_text_33dbp']")))
wait.until(EC.element_to_be_clickable((By.XPATH, "//input[@name='amount']"))).send_keys(100)
item = wait.until(EC.presence_of_element_located((By.CLASS_NAME,"OrderForm_total_6EL8d"))).text
print(item)
Hope this will solve your problem.
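An alternative sketch, in case waiting for another element is not enough: wait until the total's text actually changes after typing the amount. WebDriverWait.until accepts any callable that takes the driver, so a custom condition can be written as a closure. `text_changed` is a hypothetical helper, and the idea that the total shows a fixed initial text before conversion is an assumption about the page. The fake driver and its values exist only so the sketch runs without a browser.

```python
def text_changed(locator, old_text):
    """Condition: the located element's text differs from old_text.

    Returns the new text (truthy) once it changed, otherwise False.
    """
    def condition(driver):
        current = driver.find_element(*locator).text
        return current if current != old_text else False
    return condition


# Stand-ins so the example runs without Selenium installed.
class FakeElement:
    def __init__(self, text):
        self.text = text


class FakeDriver:
    def __init__(self, texts):
        self._texts = iter(texts)

    def find_element(self, by, value):
        return FakeElement(next(self._texts))


locator = ("class name", "OrderForm_total_6EL8d")  # same value as By.CLASS_NAME
check = text_changed(locator, "0.00")
driver = FakeDriver(["0.00", "0.00", "4321.00"])  # made-up values
results = [check(driver) for _ in range(3)]
print(results)
```

With a real driver this would be used as wait.until(text_changed((By.CLASS_NAME, "OrderForm_total_6EL8d"), initial_text)), where initial_text is read from the element before sending the amount.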