I'm trying to click on an element, located by XPath, whose text comes from a list.
categories = (WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, 'div.sph-MarketGroupNavBarButton '))))
categorylist = []
for category in categories:
    text = category.text
    categorylist.append(text)
    print(text)
categoryindex = categories[3]
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, f"//div[text() ='{categoryindex}']"))).click()
print(categoryindex)
When I run it, the script builds the list and prints its contents. I then want to find the element whose text matches an item from the list and click it, but this part of the code doesn't work. It neither clicks nor prints the specified text; nothing happens, and I get a timeout error:
categoryindex = categories[3]
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, f"//div[text() ='{categoryindex}']"))).click()
print(categoryindex)
When I replace {categoryindex} with the actual text from the list, the click works.
Change
categoryindex = categories[3]
to
categoryindex = categorylist[3]
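The reason is that categories holds WebElement objects, while categorylist holds their text. Putting a WebElement into an f-string inserts its repr, not its visible text, so the XPath never matches and the wait times out. Here is a minimal sketch of the effect, assuming a hypothetical FakeElement stand-in class instead of a real Selenium WebElement (the category names are made up for illustration):

```python
class FakeElement:
    """Hypothetical stand-in for a Selenium WebElement (illustration only)."""

    def __init__(self, text):
        self.text = text

    def __repr__(self):
        # A real WebElement also prints as an object description, not as its text.
        return "<FakeElement (session=..., element=...)>"


# Invented category names; the real ones come from the page.
categories = [FakeElement(t) for t in ["Popular", "Football", "Tennis", "Basketball"]]
categorylist = [c.text for c in categories]

wrong_xpath = f"//div[text() ='{categories[3]}']"    # interpolates the object repr
right_xpath = f"//div[text() ='{categorylist[3]}']"  # interpolates the visible text

print(wrong_xpath)
print(right_xpath)
```

Only the second XPath contains the text that actually appears on the page, which is why switching to categorylist[3] fixes the click.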
Related
I want to get the locator of this element (5,126,601), but I can't seem to get it the usual way.
I think I would have to hover the mouse over the element to get its XPath, but I can't hover over it because it is an SVG element. Does anyone know a way to get the locator properly?
Here is the link to the website: https://fundingsocieties.com/progress
Well, this element is updated only by hovering over the chart.
This is the unique XPath locator for this element:
"//*[name()='text']//*[name()='tspan' and(contains(@style,'bold'))]"
The entire Selenium command can be:
total_text = driver.find_element(By.XPATH, "//*[name()='text']//*[name()='tspan' and(contains(@style,'bold'))]").text
This can also be done with this CSS Selector: text tspan[style*='bold'], so the Selenium command could be
total_text = driver.find_element(By.CSS_SELECTOR, "text tspan[style*='bold']").text
Well, CSS Selector looks much shorter :)
Clicking on each node in turn will lead to the accompanying text being placed in the highcharts-label element. This text can then be retrieved and the Quarter (1st tspan) be linked to the Total value (4th tspan) that you desire.
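from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

# assumes `driver` is an existing WebDriver instance
url = "https://fundingsocieties.com/progress"
driver.get(url)
chart = WebDriverWait(driver, 10).until(
    EC.presence_of_element_located((By.XPATH, "//div[@data-highcharts-chart='0']"))
)
# collect the data-point markers drawn on the chart
markers = chart.find_elements(By.XPATH, "//*[local-name()='g'][contains(@class,'highcharts-markers')]/*[local-name()='path']")
for m in markers:
    m.click()
    try:
        # after the click, the label element holds the text for this data point
        element = WebDriverWait(driver, 2).until(
            EC.presence_of_element_located((By.XPATH, "//*[local-name()='g'][contains(@class,'highcharts-label')]/*[local-name()='text']"))
        )
        tspans = element.find_elements(By.XPATH, "./*[local-name()='tspan']")
        if len(tspans) > 3:
            print("%s = %s" % (tspans[0].text, tspans[3].text))
    except TimeoutException:
        pass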
The output is as follows:
Q2-2015 = 2
Q3-2015 = 12
....
Q1-2022 = 5,076,978
Q2-2022 = 5,109,680
Q3-2022 = 5,122,480
Q4-2022 = 5,126,601
I am attempting to scrape data through multiple pages (36) from a website to gather the document number and the revision number for each available document and save it to two different lists. If I run the code block below for each individual page, it works perfectly. However, when I added the while loop to loop through all 36 pages, it will loop, but only the data from the first page is saved.
#sam.gov website
url = 'https://sam.gov/search/?index=sca&page=1&sort=-modifiedDate&pageSize=25&sfm%5Bstatus%5D%5Bis_active%5D=true&sfm%5BwdPreviouslyPerformedWrapper%5D%5BpreviouslyPeformed%5D=prevPerfNo%2F'
#webdriver
driver = webdriver.Chrome(options = options_, executable_path = r'C:/Users/439528/Python Scripts/Spyder/chromedriver.exe' )
driver.get(url)
#get rid of pop up window
WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.CSS_SELECTOR, '#sds-dialog-0 > button > usa-icon > i-bs > svg'))).click()
#list of revision numbers
revision_num = []
#empty list for all the WD links
WD_num = []
substring = '2015'
current_page = 0
while True:
    current_page += 1
    if current_page == 36:
        #find all elements on page named "field name". For each one, get the text. if the text is 'Revision Date'
        #then, get the 'sibling' element, which is the actual revision number. append the date text to the revision_num list.
        elements = driver.find_elements_by_class_name('sds-field__name')
        wd_links = driver.find_elements_by_class_name('usa-link')
        for i in elements:
            element = i.text
            if element == 'Revision Number':
                revision_numbers = i.find_elements_by_xpath("./following-sibling::div")
                for x in revision_numbers:
                    a = x.text
                    revision_num.append(a)
        #finding all links that have the partial text 2015 and putting the wd text into the WD_num list
        for link in wd_links:
            wd = link.text
            if substring in wd:
                WD_num.append(wd)
        print('Last Page Complete!')
        break
    else:
        #find all elements on page named "field name". For each one, get the text. if the text is 'Revision Date'
        #then, get the 'sibling' element, which is the actual revision number. append the date text to the revision_num list.
        elements = driver.find_elements_by_class_name('sds-field__name')
        wd_links = driver.find_elements_by_class_name('usa-link')
        for i in elements:
            element = i.text
            if element == 'Revision Number':
                revision_numbers = i.find_elements_by_xpath("./following-sibling::div")
                for x in revision_numbers:
                    a = x.text
                    revision_num.append(a)
        #finding all links that have the partial text 2015 and putting the wd text into the WD_num list
        for link in wd_links:
            wd = link.text
            if substring in wd:
                WD_num.append(wd)
        #click on next page
        click_icon = WebDriverWait(driver, 5, 0.25).until(EC.visibility_of_element_located([By.ID,'bottomPagination-nextPage']))
        click_icon.click()
        WebDriverWait(driver, 5).until(EC.presence_of_element_located((By.ID, 'main-container')))
Things I've tried:
I added the WebDriverWait calls to slow the script down so that the page can load and the elements can be located or become clickable.
I declared the empty lists outside the loop so that they are not overwritten on each iteration.
I have edited the while loop multiple times, either to count up to 36 (while current_page < 37) or to move the counter to the top or bottom of the loop.
Any ideas? TIA.
EDIT: added screenshot of 'field name'
I have refactored your code to make it much simpler.
driver = webdriver.Chrome(options = options_, executable_path = r'C:/Users/439528/Python Scripts/Spyder/chromedriver.exe' )
revision_num = []
WD_num = []
for page in range(1,37):
    url = 'https://sam.gov/search/?index=sca&page={}&sort=-modifiedDate&pageSize=25&sfm%5Bstatus%5D%5Bis_active%5D=true&sfm%5BwdPreviouslyPerformedWrapper%5D%5BpreviouslyPeformed%5D=prevPerfNo%2F'.format(page)
    driver.get(url)
    if page==1:
        # close the pop-up dialog on the first page
        WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.CSS_SELECTOR, '#sds-dialog-0 > button > usa-icon > i-bs > svg'))).click()
    # links whose text contains '2015' hold the WD numbers
    wd_links = WebDriverWait(driver, 10).until(EC.visibility_of_all_elements_located((By.XPATH,"//a[contains(@class,'usa-link') and contains(.,'2015')]")))
    # the div following each 'Revision Number' label holds the revision number
    revision_elements = WebDriverWait(driver, 10).until(EC.visibility_of_all_elements_located((By.XPATH,"//div[@class='sds-field__name' and text()='Revision Number']/following-sibling::div")))
    for wd_link in wd_links:
        WD_num.append(wd_link.text)
    for revision_element in revision_elements:
        revision_num.append(revision_element.text)
print(revision_num)
print(WD_num)
If you know there are only 36 pages to iterate over, you can pass the page number in the URL.
Wait for the elements to be visible using WebDriverWait.
Construct your XPath so that it identifies the elements uniquely, without needing extra if checks.
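The first point can be sketched without a browser: build one URL per page up front and then call driver.get() on each. This is only an illustration; BASE_URL is the search URL from the question with a named {page} placeholder, and page_urls is a hypothetical helper name.

```python
# Search URL from the question, with the page number turned into a placeholder.
BASE_URL = (
    "https://sam.gov/search/?index=sca&page={page}&sort=-modifiedDate&pageSize=25"
    "&sfm%5Bstatus%5D%5Bis_active%5D=true"
    "&sfm%5BwdPreviouslyPerformedWrapper%5D%5BpreviouslyPeformed%5D=prevPerfNo%2F"
)


def page_urls(last_page):
    """Return the URLs for pages 1..last_page (hypothetical helper)."""
    return [BASE_URL.format(page=n) for n in range(1, last_page + 1)]


urls = page_urls(36)
print(len(urls))
print(urls[0])
print(urls[-1])
```

Loading each URL directly avoids depending on the "next page" button, so a click that silently fails can no longer leave the script re-reading page 1.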
I am trying to give input to the 'Currency I have' text box. This input is in the variable 'currency'. After I give the input in the text box, it displays a number of options. I want to do a case insensitive match on the 3 letter currency code I enter and the options the text box displays in the drop-down, and select the correct one.
The page I am testing: https://www.oanda.com/fx-for-business/historical-rates
currency = currency_list.loc[i,'currency']
print(f'\nFETCHING DATA FOR : {currency}')
df=pd.DataFrame()
#input our currency in "currency I have" text-box
WebDriverWait(driver, 2).until(EC.presence_of_element_located((By.CSS_SELECTOR,'#havePicker > div'))).click()
currency_have = WebDriverWait(driver, 2).until(EC.presence_of_element_located((By.CSS_SELECTOR,'#havePicker > div > input')))
try:
    currency_have.clear()
except:
    pass
currency_have.send_keys(currency)
options = WebDriverWait(driver, 2).until(EC.presence_of_all_elements_located((By.CSS_SELECTOR,'#havePicker > div > ul li')))
div_tags = [li_tags.find_elements_by_tag_name('div') for li_tags in options]
for div_tag in div_tags:
    test = div_tag[1]
    if (test.text.casefold()) == (currency.casefold()):
        return test
    else:
        continue
The return statement in my code is incorrect. How do I proceed to achieve my goal?
Any help would be appreciated. I'm new to Selenium.
To wait for the dropdown results, perform your actions in the following order:
Clear the "Currency I have" dropdown.
Click it (I'm not completely sure this step is required; test it).
Wait for all values.
Send the value ("Australian dollar"), not AUD, because Saudi Riyal also appears in the search results for AUD.
This part of the code should look like this:
wait = WebDriverWait(driver, 10)
wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, '#havePicker > div'))).click()
currency_have = wait.until(
    EC.presence_of_element_located((By.CSS_SELECTOR, '#havePicker > div > input')))
currency_have.clear()
currency_have.click()
options = wait.until(
    EC.presence_of_all_elements_located((By.CSS_SELECTOR, '#havePicker > div > ul li')))
currency_have.send_keys("Australian dollar")
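Please note that searching for AUD returns two results. That's why I used the full currency name. If you want to match on just the code (aud), use split() to extract it from each option's text; the question "How to extract the substring between two markers?" shows how to use it.
Then improve your for loop so that it compares the extracted code with your input.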
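A rough sketch of such a loop, working on plain strings. It assumes each option's text contains the 3-letter code as a separate word; the sample texts below are invented and the real page may format them differently. find_option_index is a hypothetical helper name.

```python
import re


def find_option_index(option_texts, code):
    """Return the index of the first option whose text contains `code`
    as a whole word, ignoring case, or None if nothing matches."""
    wanted = code.casefold()
    for index, text in enumerate(option_texts):
        # Split the text into alphabetic words so "aud" does not match "Saudi".
        words = [w.casefold() for w in re.findall(r"[A-Za-z]+", text)]
        if wanted in words:
            return index
    return None


# Invented sample texts standing in for [li.text for li in options].
option_texts = ["Saudi Riyal SAR", "Australian Dollar AUD"]
print(find_option_index(option_texts, "aud"))
```

With Selenium you would build option_texts from the located li elements and then click options[index] when the result is not None. Matching on whole words rather than substrings is what keeps "Saudi" from being picked for "aud".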
I want to click the link tied to the label for this td.
I can use the onclick attribute to find one item's link, but the name changes (HemoGlobin A1C, HGB A1c, etc.) and the onclick has no unique ID to search for every time.
I'm using this now:
testname = 'A1c'
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//td[contains(@onclick, '%s' )]" % testname))).click()
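Please try this:
from selenium.common.exceptions import TimeoutException
testname = "a1c"
try:
    # the search text must be quoted inside the XPath, otherwise it is read as an element name
    elem = WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, '//td[contains(translate(text(), "AC", "ac"), "{}")]/following-sibling::td[3]//td'.format(testname))))
except TimeoutException:
    print("Element not found")
else:
    elem.click()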
Explanation:
//td[contains(translate(text(), "AC", "ac"), testname)]: First, find a td element whose text contains 'A1C' or 'A1c' (or 'a1C' or 'a1c'). Here translate() is an XPath function that replaces every 'A' and 'C' with 'a' and 'c'.
/following-sibling::td[3]//td: Then go to a sibling of that td element, which in your case is the third following td sibling, and find the td element nested inside it.
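If the label can differ in letters other than A and C, the same idea extends to the whole alphabet. Below is a small sketch that only builds the XPath string; ci_contains_xpath is a hypothetical helper name, and it assumes the search text contains no double quotes.

```python
import string


def ci_contains_xpath(tag, needle):
    """Build an XPath that matches <tag> elements whose text contains
    `needle`, ignoring ASCII letter case (hypothetical helper)."""
    upper = string.ascii_uppercase
    lower = string.ascii_lowercase
    # translate() lower-cases the element text; the needle is lower-cased here.
    return (
        f'//{tag}[contains(translate(text(), "{upper}", "{lower}"), '
        f'"{needle.lower()}")]'
    )


xpath = ci_contains_xpath("td", "A1c")
print(xpath)
```

The resulting string can be passed to By.XPATH in place of the shorter "AC" version, with the /following-sibling::td[3]//td part appended if you need the linked cell.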
Please try this one and check whether it works.
testname = "A1c"
element = WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//td[text()[contains(.,'" + testname + "')]]")))
element.click()
Please help. I'm locating elements, printing their text, and saving the printed element.text to a CSV file, but something isn't right. If I just print the elements without clicking through to a new page, all elements at the specified positions of the XPath are output; see the code below.
while True:
    with open('C:/Python34/email.csv','a') as f:
        z=csv.writer(f, delimiter='\t',lineterminator = '\n',)
        row = []
        for sub_list in (driver.find_elements_by_xpath("//*[@id='wrapper']/div[2]/div[2]/div/div[2]/div[1]/div[3]/div[1]/div[2]/div/div[2]/div/div[2]/div/div[position() = 1 or position() = 2 or position() = 3]")):
            print(sub_list.text,end=" ")
However, if I include clicks in the loop (see the code below), it prints the element at only one position per iteration: the first iteration prints only the element in div[1], the next iteration prints only the element in div[2], and so forth. Please enlighten me. Thanks.
for sub_list in (driver.find_elements_by_xpath("//*[@id='wrapper']/div[2]/div[2]/div/div[2]/div[1]/div[3]/div[1]/div[2]/div/div[2]/div/div[2]/div/div[position() = 1 or position() = 2 or position() = 3]")):
    print(sub_list.text,end=" ")
    z.writerow(sub_list.text)
    WebDriverWait(driver, 50).until(EC.visibility_of_element_located((By.XPATH,'//*[@id="detail-pagination-next-btn"]/span')))
    WebDriverWait(driver, 50).until(EC.visibility_of_element_located((By.XPATH,'//*[@id="detail-pagination-next-btn"]')))
    WebDriverWait(driver, 50).until(EC.visibility_of_element_located((By.ID,'detail-pagination-next-btn')))
    WebDriverWait(driver, 50).until(EC.element_to_be_clickable((By.ID,'detail-pagination-next-btn')))
    WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CLASS_NAME, "detail-company-selection")))
    time.sleep(5)
    c=driver.find_element_by_id('detail-pagination-next-btn')
    c.click()
    time.sleep(5)