I hope you're fine.
I'm scraping the logos of some websites, and I'm using the following code to locate them. I use * instead of a specific tag because the class or attribute containing the substring 'logo' is not always on a <div> or <a> tag.
driver.find_element(By.CSS_SELECTOR, "*[class*='logo']")
I have obtained some of them, but in some cases the 'class' doesn't contain the substring 'logo'. I've checked some websites, and the logo has attributes like 'id', 'alt' or 'name' that contain the substring 'logo'.
So I want to know whether there is some condition like OR that I can apply, so that if there is no match on 'class' it then checks 'id', etc.
I tried these options, but both raise an error:
driver.find_element(By.CSS_SELECTOR, "*[class*='logo'] | *[id*='logo']")
driver.find_element(By.CSS_SELECTOR, "*[class*='logo'] || *[id*='logo']")
In both cases the error is:
selenium.common.exceptions.InvalidSelectorException: Message: invalid selector: An invalid or illegal selector was specified
You can use a comma (,) to group multiple CSS selectors; the group matches an element if any one of its selectors matches.
driver.find_element(By.CSS_SELECTOR, "[class*='logo'], [id*='logo']")
If you are not tied to CSS_SELECTOR, you can use XPATH instead:
//*[@*='logo']
The code would be:
driver.find_element(By.XPATH, "//*[@*='logo']")
This XPath expression searches the entire DOM (//*) and checks every attribute (@*), such as class, id or name, for a value exactly equal to logo. To match 'logo' as a substring instead, use //*[@*[contains(., 'logo')]].
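A minimal sketch of both approaches, assuming the attribute list (class, id, alt, name) from the question. The helper names are hypothetical and not part of Selenium; they only build the selector strings, which can then be passed to find_element.

```python
def css_any_attr_contains(substring, attrs=("class", "id", "alt", "name")):
    """Build a CSS selector list; the comma between selectors acts as a logical OR."""
    return ", ".join(f"[{attr}*='{substring}']" for attr in attrs)


def xpath_any_attr_contains(substring):
    """Build an XPath matching elements where ANY attribute contains the substring."""
    return f"//*[@*[contains(., '{substring}')]]"


css = css_any_attr_contains("logo")
xpath = xpath_any_attr_contains("logo")
print(css)
print(xpath)

# With a live session (assumes `driver` and `By` are already set up):
# logo = driver.find_element(By.CSS_SELECTOR, css)
# logo = driver.find_element(By.XPATH, xpath)
```

Note that a selector list returns the first matching element in document order, whichever attribute matched. If 'class' should strictly take priority over 'id', try the selectors one at a time instead.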
Currently I have this line of code which correctly selects this type of object on the webpage I'm trying to manipulate with Selenium:
pointsObj = driver.find_elements(By.CLASS_NAME,'treeImg')
What I need to do is add in a partial string match condition as well which looks in the section "CLGV (AHU-01_ahu_ChilledWtrVlvOutVolts)" in the line below.
<span class="treeImg v65point" style="cursor:pointer;">CLGV (AHU-01_ahu_ChilledWtrVlvOutVolts)</span>
I found online there's the ChainedBy option but I can't think of how to reference that text in the span. Do I need to use XPath? I tried that for a second but I couldn't think of how to parse it.
Referring to both the CLASS_NAME and the innerText, you can use either of the following locator strategies:
xpath using the classname treeImg and partial innerText:
pointsObj = driver.find_elements(By.XPATH,"//span[contains(@class, 'treeImg') and contains(., 'AHU-01_ahu_ChilledWtrVlvOutVolts')]")
xpath using all the classnames and the entire innerText:
pointsObj = driver.find_elements(By.XPATH,"//span[@class='treeImg v65point' and text()='CLGV (AHU-01_ahu_ChilledWtrVlvOutVolts)']")
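As a hedged sketch, the same kind of locator can be built by a small helper. The function names below are hypothetical, not Selenium API. Two details are worth noting: contains(@class, 'treeImg') would also match a class such as treeImgLarge, so the sketch uses the common whole-token idiom, and text containing quotes has to be quoted carefully in XPath 1.0.

```python
def xpath_literal(text):
    """Quote text as an XPath 1.0 string literal, whichever quotes it contains."""
    if "'" not in text:
        return f"'{text}'"
    if '"' not in text:
        return f'"{text}"'
    # Both quote types are present: split on ' and glue the pieces with concat().
    parts = text.split("'")
    return "concat(" + ', "\'", '.join(f"'{part}'" for part in parts) + ")"


def by_class_token_and_text(tag, class_token, text):
    """XPath for <tag> elements with a whole class token and a partial text match."""
    class_test = (
        f"contains(concat(' ', normalize-space(@class), ' '), ' {class_token} ')"
    )
    return f"//{tag}[{class_test} and contains(., {xpath_literal(text)})]"


xpath = by_class_token_and_text("span", "treeImg", "AHU-01_ahu_ChilledWtrVlvOutVolts")
print(xpath)

# With a live session (assumes `driver` and `By` are already set up):
# pointsObj = driver.find_elements(By.XPATH, xpath)
```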
I'm trying to automate the Linguee dictionary using Selenium.
For instance, in the image I would like to get an array with [de, para, por, con]. In order to get this, I wrote the following code
from selenium import webdriver
from selenium.webdriver.firefox.service import Service
from selenium.webdriver.common.by import By
s=Service(r"C:\Users\Usuario\python\geckodriver.exe")
driver = webdriver.Firefox(service=s)
driver.get('https://www.linguee.fr/frances-espanol/')
consent_button=driver.find_element(By.CSS_SELECTOR, "div[class='sn-b-def sn-blue']")
consent_button.click()
search_input=driver.find_element(By.CSS_SELECTOR, "input[id='queryinput']")
search_input.send_keys("de")
buttonSearch=driver.find_element(By.CSS_SELECTOR, "button[alt='Rechercher dans le dictionnaire français-espagnol']")
buttonSearch.click()
wortType=driver.find_element(By.CSS_SELECTOR, "span[class='tag_wordtype']").text
print( wortType)
translations=driver.find_element(By.CSS_SELECTOR, "div[class='isMainTerm'] a[class='dictLink']").text
print(translations)
The code works correctly, but it returns only the first translation ("en" in the image), while in the browser console I get 4. I have been reading and trying different ways to fix the problem, such as changing the CSS_SELECTOR:
translations=driver.find_element(By.CSS_SELECTOR,"div[class='isMainTerm'] div[class='exact'] div[class='lemma featured'] div[class='lemma_content'] div[gid=0] div[class='translation_lines'] div[class='translation sortablemg featured'] a")
translations=driver.find_element(By.CSS_SELECTOR, "div[class='isMainTerm'] a[class='dictLink']").text
translations=driver.find_element(By.CSS_SELECTOR, "div[class='isMainTerm'] a[class='dictLink']")
translations=driver.find_element(By.CSS_SELECTOR,"div.isMainTerm div.lemma_content a.dictLink").text
translations=driver.find_element(By.CSS_SELECTOR, "div[class='isMainTerm']> a[class='dictLink featured']").text
I have also used XPath, but it also returns only one.
I have read different posts on Stack Overflow about similar problems, but the problem persists.
Sorry this is long-winded, but can someone explain why it's doing this and what my options are?
This is how the find_element method works: it returns the first matching element it finds in the DOM.
So if you pass it a locator that matches several elements on the page, only the first of them is returned.
To get all the elements matching that locator, use the find_elements method instead.
So, to get the text of every element matching the CSS selector div[class='isMainTerm'] a[class='dictLink'], instead of
translations=driver.find_element(By.CSS_SELECTOR, "div[class='isMainTerm'] a[class='dictLink']").text
print(translations)
You can use
translations = driver.find_elements(By.CSS_SELECTOR, "div[class='isMainTerm'] a[class='dictLink']")
for translation in translations:
    print(translation.text)
BTW, only the CSS selector div[class='isMainTerm'] a[class='dictLink featured'] matches the 4 elements you mentioned, so that looks like the locator to use here; div[class='isMainTerm'] a[class='dictLink'] matches 81 elements on that page.
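The difference between the two methods can be wrapped in a small helper. This is only a sketch: collect_texts is a hypothetical name, and it assumes nothing beyond a driver object exposing Selenium's find_elements(by, value) method.

```python
def collect_texts(driver, by, value):
    """Return the stripped, non-empty text of EVERY element matching the locator."""
    # find_elements returns a list (empty when nothing matches),
    # whereas find_element returns only the first match or raises.
    elements = driver.find_elements(by, value)
    return [el.text.strip() for el in elements if el.text.strip()]


# With a live session (assumes `driver` and `By` are already set up):
# words = collect_texts(
#     driver,
#     By.CSS_SELECTOR,
#     "div[class='isMainTerm'] a[class='dictLink featured']",
# )
# print(words)
```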
Hi, how can I get an element by an attribute and its value in Python Selenium?
For example I have class="class1 class2 class3".
Now I want to get the element whose class attribute carries the classes "class1 class2 class3".
Is this possible?
If I use xpath, I always need to add the element type, input, option,...
I try to avoid the element type since it varies sometimes.
While constructing locators considering css-selectors or xpath you have to use the different attributes and the attribute-values to identify the WebElement uniquely within the DOM Tree.
The generic way is:
Using css_selector:
button.classname[attributeA='attributeA_value'][attributeB='attributeB_value']
Using xpath and attributes:
//button[@attributeA='attributeA_value'][@attributeB='attributeB_value']
As an example, for an element like:
<button type="button" aria-hidden="true" class="close alert alert-close" data-notify="dismiss">Close</button>
You can identify the Close element using any of its attributes and the corresponding attribute values with any of these Locator Strategies:
Using css_selector:
button.close.alert.alert-close[data-notify='dismiss']
classes-> ^ ^^ ^^^ data-notify ^^^^ attribute
Using xpath and attributes:
//button[@class='close alert alert-close' and @data-notify='dismiss']
class attributes ^ ^^ ^^^ data-notify ^^^^ attribute
Using xpath and innerText:
//button[text()='Close']
innerText ^
The CSS selectors would be formatted like this:
'[attribute]'
'[attribute="value"]'
For example, the selector for the input field on google.com would be:
'input[name="q"]'
to answer this part
If I use xpath, I always need to add the element type, input,
option,... I try to avoid the element type since it varies sometimes.
You can use //* followed by the attribute name and the attribute value:
//*[@class='class1 class2 class3']
//* matches any element node, regardless of its tag.
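A short sketch of the tag-free CSS equivalents, with hypothetical helper names. One assumption worth stating: an exact attribute test such as [class='class1 class2 class3'] (or the XPath @class='...' form) only matches when the attribute string is identical, including class order, while chained class selectors match the classes in any order.

```python
def css_with_classes(classes, tag=""):
    """Match elements carrying ALL the given classes, in any order; tag is optional."""
    return tag + "".join(f".{name}" for name in classes)


def css_exact_attr(attr, value, tag=""):
    """Match elements whose attribute string equals value exactly; tag is optional."""
    return f"{tag}[{attr}='{value}']"


print(css_with_classes(["class1", "class2", "class3"]))  # .class1.class2.class3
print(css_exact_attr("class", "class1 class2 class3"))   # [class='class1 class2 class3']

# With a live session (assumes `driver` and `By` are already set up):
# element = driver.find_element(By.CSS_SELECTOR, css_with_classes(["class1", "class2", "class3"]))
```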
When I try to use :contains in Selenium's By.CSS_SELECTOR, such as
presence = EC.presence_of_element_located((By.CSS_SELECTOR, ".btn:contains('Continue Shopping')"))
or
presence = EC.presence_of_element_located((By.CSS_SELECTOR, ".btn\:contains('Continue Shopping')"))
or
presence = EC.presence_of_element_located((By.CSS_SELECTOR, ".btn\\:contains('Continue Shopping')"))
the Python program crashes with the error
Exception: Message: invalid selector: An invalid or illegal selector was specified
(Session info: chrome=95.0.4638.54)
Is it possible to use :contains in Selenium? The CSS selector
$('.btn:contains("Continue Shopping")')
works fine in Chrome's JS console.
Using Chrome 95.0.4638.54, ChromeDriver 95.0.4638.54, Python 3.10 on Ubuntu 20.04.
The selector :contains('text') is a jQuery selector, not a valid CSS selector like Selenium is expecting. I'm assuming the reason it works on the page via Chrome's DevTools console is because the page has jQuery defined on it.
Unfortunately, I do not believe you can directly select an element via its text using a CSS selector.
You have two options as far as I can see:
Alter your selector to be class or ID based (easiest)
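Create a Selenium utility to run a JS script that uses this jQuery selector (this only works when the page itself loads jQuery); e.g. driver.execute_script("return jQuery(arguments[0] + ':contains(' + JSON.stringify(arguments[1]) + ')').get(0);", selector, text)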
As mentioned by Aspok, your CSS locators are not valid CSS selectors.
To locate an element based on its text you can use an XPath locator, something like:
//*[contains(@class,'btn') and contains(text(),'Continue Shopping')]
In case btn is the only class name of that element, your XPath can be
//*[@class='btn' and contains(text(),'Continue Shopping')]
As explained by @aspok, it is not a valid CSS selector.
In case you would like an XPath for the same element, where btn is a class and the element has the text (or partial text)
Continue Shopping
you can try the XPath below:
//*[contains(text(),'Continue Shopping')]
or
//*[contains(text(), 'Continue Shopping') and contains(@class, 'btn')]
Please check in the dev tools (Google Chrome) whether it has a unique entry in the HTML DOM.
The XPath that you should check:
//*[contains(text(), 'Continue Shopping') and contains(@class, 'btn')]
Steps to check:
Press F12 in Chrome -> go to the Elements section -> press CTRL + F -> paste the XPath and see whether your desired element is highlighted with 1/1 matching node.
Also, note that //* can be replaced by the tag name if you find multiple matching nodes.
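The XPath approach above can be combined with an explicit wait, as in the question. This is a sketch with a hypothetical helper name. It matches on normalize-space(.) rather than text(), because in XPath 1.0 contains(text(), ...) only tests the first text node of the element, and it assumes the text contains no single quote.

```python
from typing import Tuple


def button_with_text_locator(class_token: str, text: str) -> Tuple[str, str]:
    """Return a (strategy, value) locator pair for find_element or expected conditions."""
    # Assumption: `text` contains no single quote, so plain quoting is safe here.
    xpath = (
        f"//*[contains(concat(' ', normalize-space(@class), ' '), ' {class_token} ')"
        f" and contains(normalize-space(.), '{text}')]"
    )
    return ("xpath", xpath)  # "xpath" is the string value behind By.XPATH


locator = button_with_text_locator("btn", "Continue Shopping")
print(locator[1])

# With a live session (assumes `driver` exists; the 10 second timeout is arbitrary):
# from selenium.webdriver.support.ui import WebDriverWait
# from selenium.webdriver.support import expected_conditions as EC
# button = WebDriverWait(driver, 10).until(EC.presence_of_element_located(locator))
```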
I am trying to select "seltarget" from a list of other options.
HTML:
<div class="filter-item" data-reactid=".3.1.0.0.1.0.0.3.0.0.0.3">seltarget</div>
When exported to a python file it recommends that I locate the element by xpath:
driver.find_element_by_xpath("//div[@id='quickswitcher']/div/nav/ul/li/div/div/menu/div[2]/div/div/div/div[5]").click()
This is working for the time being because "seltarget" just happens to be 5th in the list. The above line will select whichever option is 5th in the list. I want to be more specific and select the element "seltarget"
Locating by link_text works only with <a> tags. To locate this element you can use the class, if it is unique
driver.find_element_by_class_name('filter-item').click()
Or the data-reactid attribute
driver.find_element_by_css_selector('[data-reactid*="3.1.0.0.1.0.0.3.0.0.0.3"]').click()
CSS selectors cannot locate an element by its text: :contains is a jQuery extension, not valid CSS, and Selenium rejects it as an invalid selector.
To locate by the text, use xpath instead
driver.find_element_by_xpath('//div[contains(text(), "seltarget")]').click()
If you want to find the div by its text, you can use the following.
driver.find_element_by_xpath("//div[text()='seltarget']")
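In Selenium 4 the find_element_by_* helpers were deprecated and later removed, so with a current release the same lookup goes through find_element(by, value). A minimal sketch, with a hypothetical helper name and the assumption that the text contains no single quote:

```python
def click_by_exact_text(driver, text, tag="div"):
    """Click the first <tag> whose whitespace-normalized text equals `text`."""
    # Assumption: `text` contains no single quote, so plain quoting is safe here.
    xpath = f"//{tag}[normalize-space(.)='{text}']"
    element = driver.find_element("xpath", xpath)  # "xpath" is the value of By.XPATH
    element.click()
    return xpath  # returned so the caller can log which locator was used


# With a live session (assumes `driver` is already set up):
# click_by_exact_text(driver, "seltarget")
```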