I'm automating the process of filling some forms online. The problem is that there are many individual elements whose children have basically the same ID of the stuff I want to find and fill. So my idea was to first find the parent I needed using Selenium and then go from there.
for range in cards:
    cardID = driver.find_element(By.XPATH, "//a[contains(text(),'{cardis}/')]/ancestor::tr".format(cardis=allCards_NUMBER_List[range]))
    cardREG_PRICE = cardID.find_element(By.XPATH, "input[contains(id(), 'txt_preco_')]")
But when I run this, it only says that it can't find cardREG_PRICE. The ID name is correct, and from what I've read the XPath structure should work. How can I fix this?
Your XPath is incorrect; that is why it is failing.
Instead of this:
cardREG_PRICE = cardID.find_element(By.XPATH, "input[contains(id(), 'txt_preco_')]")
it should be:
cardREG_PRICE = cardID.find_element(By.XPATH, ".//input[contains(@id, 'txt_preco_')]")
First, id is an attribute, so it has to be written with @ (as @id) rather than called like a function. Second, // searches all descendants instead of only direct children. Third, the leading . makes the search start from the current element (the row you already found) instead of from the document root.
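To see the scoping effect of the leading dot without a browser, here is a minimal sketch using the standard library's xml.etree.ElementTree instead of Selenium. The table markup, the card numbers and the helper name price_input_for are made up for illustration. ElementTree only supports a subset of XPath and has no contains(), so the substring checks are done in Python here; in Selenium the corrected XPath above does the same thing in one expression.

```python
import xml.etree.ElementTree as ET

# Hypothetical table: every row repeats an input whose id starts with "txt_preco_".
TABLE = """
<table>
  <tr><td><a>101/A</a></td><td><input id="txt_preco_1" value="10"/></td></tr>
  <tr><td><a>202/B</a></td><td><input id="txt_preco_2" value="20"/></td></tr>
</table>
"""

def price_input_for(root, card_number):
    """Return the price input inside the row whose link text contains 'card_number/'."""
    for row in root.iter("tr"):
        link = row.find(".//a")
        if link is not None and (card_number + "/") in (link.text or ""):
            # The leading "." keeps the search inside this row only.
            for inp in row.findall(".//input"):
                if "txt_preco_" in inp.get("id", ""):
                    return inp
    return None

root = ET.fromstring(TABLE)
found = price_input_for(root, "202")
print(found.get("id"))
```

The point is only that the second lookup runs against the row, not the whole document, so rows with near-identical child ids do not interfere with each other.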
I'd like some advice on how to find an element by a certain value using XPath, in my case the time 7:30pm.
Then I want to click the add button at the same level (there are many other identical buttons on the page, each with a different time value). Please refer to the attached picture; I'd really appreciate any help.
Probably every web browser has a DevTools function to copy an element's XPath on right click.
But you could also try to find a unique ID or CLASS (or another unique value, e.g. style=...) for the element, or a unique value for its parent or grandparent.
If there are many similar elements, you can get all of them and then use an index, e.g. all_times[1], to work with only one element.
It seems your element has a unique headers="Times" attribute which you could use in the XPath.
You can also use a relative XPath (starting with a dot .) and search in steps: first find the tr, then find .//td[@headers="Times"] in this tr, and then find .//td[@headers="Avaliable"]/div/a in the same tr.
EDIT:
If I couldn't find unique elements, I would get all tr elements, because tr is the parent of both Times and Avaliable. Then I can use a relative XPath to find the one Times cell and the one Avaliable cell inside a single tr, and do this in a for-loop:
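from selenium.webdriver.common.by import By

# `driver` is assumed to be an existing Selenium WebDriver (lxml elements cannot be clicked)
all_tr = driver.find_elements(By.XPATH, "//tr")
for tr in all_tr:
    # relative xpath in `tr` (using `.` at start)
    header = tr.find_elements(By.XPATH, './/td[@headers="Times"]')
    if header and "7:30" in header[0].text:
        # relative xpath in `tr` (using `.` at start)
        avaliable = tr.find_elements(By.XPATH, './/td[@headers="Avaliable"]/div/a')
        avaliable[0].click()
        break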
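The row-by-row idea can be tried offline as well. This is a rough sketch with the standard library's xml.etree.ElementTree rather than Selenium, so nothing is clicked; it only locates the link that would be clicked. The table markup, the id values and the helper name add_link_for are assumptions made for the example; the headers values are the ones used above.

```python
import xml.etree.ElementTree as ET

# Hypothetical schedule table; only the headers="..." values come from the answer above.
SCHEDULE = """
<table>
  <tr><td headers="Times">7:00pm</td><td headers="Avaliable"><div><a id="add-1">Add</a></div></td></tr>
  <tr><td headers="Times">7:30pm</td><td headers="Avaliable"><div><a id="add-2">Add</a></div></td></tr>
</table>
"""

def add_link_for(root, wanted_time):
    """Return the 'add' link from the row whose Times cell contains wanted_time."""
    for tr in root.iter("tr"):
        # Both lookups are relative to this tr, so other rows cannot match.
        time_cell = tr.find("./td[@headers='Times']")
        if time_cell is not None and wanted_time in (time_cell.text or ""):
            return tr.find("./td[@headers='Avaliable']/div/a")
    return None

root = ET.fromstring(SCHEDULE)
link = add_link_for(root, "7:30")
print(link.get("id"))
```

With Selenium, the returned element would be the one to call click() on.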
This question has already been answered and one of the easiest ways is to get the tag name, if already known, within the element
child_elements = element.find_elements_by_tag_name("<tag name>")
However, for the element pasted below, only 9 out of 25 instances of the tag name are returned. I am a novice in JavaScript and thus have not been able to zero in on the reason. In this example, I am trying to get the dt tags within the ol element. The code snippet I am using for that is:
par_element = browser.find_element_by_class_name('search-results__result-list')
child_elements = par_element.find_elements_by_tag_name("dt")
The element skeleton/structure from the page source is shown in the image below
(the structure is the same for all the div tags; one is expanded as an example).
I have also tried getting the class name result-lockup__name directly, and it still returns only 9 out of the 25 instances. What could be the reason?
EDIT
Initially, not all the elements were loaded, so I had to scroll through the page with
browser.execute_script('window.scrollTo(0,document.body.scrollHeight)')
When the problem occurred once again and I was not able to figure it out, I raised this question. Apparently even the scroll is not helping, as certain elements look hidden.
After manually scrolling through them again while the code was paused, I was able to "enable" them.
Is this a type of mask to protect sites from being scraped? I now feel that I would probably have to scroll in increments to reveal them all, but is there a smarter way?
The elements are loaded dynamically, so you need to scroll the page slowly to get all the child elements. Try the code below; hopefully it will work. This is just a workaround.
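import time
from selenium.webdriver.common.keys import Keys

# `browser` is the existing WebDriver instance from the question
element_list = []
while True:
    # scroll down one small step and give the page time to load more items
    browser.find_element_by_tag_name("body").send_keys(Keys.DOWN)
    time.sleep(2)
    listlen_before = len(element_list)
    par_element = browser.find_element_by_class_name('search-results__result-list')
    child_elements = par_element.find_elements_by_tag_name("dt")
    for ele in child_elements:
        if ele.text in element_list:
            continue
        else:
            element_list.append(ele.text)
    listlen_after = len(element_list)
    # stop once a scroll step reveals nothing new
    if listlen_before == listlen_after:
        break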
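The stopping rule in the loop above (keep scrolling until a round adds nothing new) can be separated from the browser and tried on its own. The sketch below is only an illustration: collect_until_stable and FakePage are hypothetical names, and FakePage merely imitates a page that loads two more entries per scroll. In real use the two callables would wrap the send_keys and find_elements_by_tag_name calls.

```python
import time

def collect_until_stable(scroll_step, read_visible, pause=0.0, max_rounds=1000):
    """Scroll, read and de-duplicate until one round adds nothing new."""
    seen = []          # keeps first-seen order
    seen_set = set()   # fast membership test
    for _ in range(max_rounds):
        scroll_step()
        time.sleep(pause)
        before = len(seen)
        for text in read_visible():
            if text not in seen_set:
                seen_set.add(text)
                seen.append(text)
        if len(seen) == before:
            break  # this scroll step revealed nothing new
    return seen

class FakePage:
    """Hypothetical stand-in for a lazily loading page."""
    def __init__(self, items, initially_loaded=3, per_scroll=2):
        self.items = items
        self.loaded = initially_loaded
        self.per_scroll = per_scroll

    def scroll(self):
        self.loaded = min(self.loaded + self.per_scroll, len(self.items))

    def visible(self):
        return self.items[:self.loaded]

page = FakePage(["name-%d" % i for i in range(7)])
names = collect_until_stable(page.scroll, page.visible)
print(len(names))
```

One caveat of this rule: if a single scroll step happens to load nothing (for example because the pause was too short), the loop stops early, so the pause length and step size may need tuning on a real page.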
So I have this site and I'm trying to obtain the location and size of an element based on this xpath: "//div[@class='titlu']"
As you can see, it is visible and has nothing special about it.
Now the problem I've faced is that when I search for the xpath like this:
e = self.driver.find_element_by_xpath(xpath)
the location and size of e are both 0.
Also, for some reason, if I try to get the text like this:
e.text returns an empty string, and I have to get the actual text with e.get_attribute("textContent") instead.
So do you have any idea how I can get the location and size of this element?
There are two elements matching this xpath. driver.find_element_by_xpath returns the first one, while you are looking for the second one. Use the ancestor <div> with the id attribute to build a unique xpath:
"//div[@id='content-detalii']//div[@class='titlu']"
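A quick way to check whether an xpath is ambiguous is to count its matches. The following sketch does that offline with the standard library's xml.etree.ElementTree (not Selenium) on made-up markup; only the id content-detalii and the class titlu come from the answer, and the "header" div is an assumption standing in for whatever holds the first, empty match.

```python
import xml.etree.ElementTree as ET

# Hypothetical page: an empty first match and the real one further down.
PAGE = """
<body>
  <div id="header"><div class="titlu"></div></div>
  <div id="content-detalii"><div class="titlu">Visible title</div></div>
</body>
"""

root = ET.fromstring(PAGE)

# The loose expression matches both divs; a "find first" call would return the empty one.
loose = root.findall(".//div[@class='titlu']")
first_text = loose[0].text or ""

# Scoping by the ancestor with a unique id leaves exactly one match.
scoped = root.findall(".//div[@id='content-detalii']//div[@class='titlu']")
scoped_text = scoped[0].text

print(len(loose), len(scoped))
```

In Selenium the same check would be len(driver.find_elements_by_xpath(xpath)); anything above 1 means find_element_by_xpath may be handing back a different element than the one you see.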
I asked my previous question here:
Xpath pulling number in table but nothing after next span
This worked, and I managed to see the number I wanted in a Firefox plugin called XPath Checker. The results are shown below.
So I know I can find this number with this xpath, but when I run a Python script to find and save the number, it says it cannot find it.
try:
    views = browser.find_element_by_xpath("//div[@class='video-details-inside']/table//span[@class='added-time']/preceding-sibling::text()")
except NoSuchElementException:
    print("NO views")
    views = 'n/a'
    pass
I know that pass is not best practice, but I am just testing this at the moment, trying to find the number. I'm wondering if I need to change something at the end of the xpath, like .text, as the XPath Checker normally shows results a little differently. Like below:
I needed to use the xpath I gave rather than the one used in the above picture because I only want the number and not the date. You can see part of the source in my previous question.
Thanks in advance! I'm scratching my head here.
The xpath used in find_element_by_xpath() has to point to an element, not a text node and not an attribute. This is a critical thing here.
The easiest approach here would be to:
get the td's text (parent)
get the span's text (child)
remove child's text from parent's
Code:
span = browser.find_element_by_xpath("//div[@class='video-details-inside']/table//span[@class='added-time']")
td = span.find_element_by_xpath('..')
views = td.text.replace(span.text, '').strip()
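The same "parent text minus child text" trick can be checked without a browser. This is a sketch using the standard library's xml.etree.ElementTree on invented markup (the number 1234 and the date text are made up; only the class names come from the question). "".join(el.itertext()) is used as a rough stand-in for Selenium's .text, which also includes the text of child elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: the number is a bare text node in front of the span.
FRAGMENT = """
<div class="video-details-inside">
  <table><tr>
    <td>1234 <span class="added-time">2 days ago</span></td>
  </tr></table>
</div>
"""

root = ET.fromstring(FRAGMENT)

# Locate elements only: the td (via "..") and the span inside it.
td = root.find(".//span[@class='added-time']/..")
span = td.find("./span[@class='added-time']")

parent_text = "".join(td.itertext())    # text of td including its children
child_text = "".join(span.itertext())   # text of the span alone

# Remove the child's text from the parent's text to isolate the number.
views = parent_text.replace(child_text, "").strip()
print(views)
```

Note that replace() removes every occurrence of the span's text, so this relies on the date string not also appearing in the part you want to keep.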
I want to get a value via XPath from a Steam store page, e.g. http://store.steampowered.com/app/234160/. On the right side there are 2 boxes. The first one contains Title, Genre, Developer ... I just need the Genre here. The number of genres differs for every game: some have 4 genres, some just one. Then there is another block where the game features are listed (like Singleplayer, Multiplayer, Coop, Gamepad, ...).
I need all those values.
Also, sometimes there is an image in between (PEGI/USK):
http://store.steampowered.com/app/233290.
import requests
from lxml import html
page = requests.get('http://store.steampowered.com/app/234160/')
tree = html.fromstring(page.text)
blockone = tree.xpath(".//*[@id='main_content']/div[4]/div[3]/div[2]/div/div[1]")
blocktwo = tree.xpath(".//*[@id='main_content']/div[4]/div[3]/div[2]/div/div[2]")
print("Detailblock:", blockone)
print("Featureblock:", blocktwo)
This is the code I have so far. When I try it it just prints:
Detailblock: [<Element div at 0x2ce5868>]
Featureblock: [<Element div at 0x2ce58b8>]
How do I make this work?
xpath returns a list of matching elements. You're just printing out that list.
If you want the first element, you need blockone[0]. If you want all elements, you have to loop over them (e.g., with a comprehension).
And meanwhile, what do you want to print for each element? The direct inner text? The HTML for the whole subtree rooted at that element? Something else? Whatever you want, you need to use the appropriate method on the Element type to get it; lxml can't read your mind and figure out what you want, and neither can we.
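To make the "list of elements" versus "text" distinction concrete, here is a small sketch. It uses the standard library's xml.etree.ElementTree rather than lxml, and the markup and the class name details are invented; lxml elements behave similarly for .text and itertext().

```python
import xml.etree.ElementTree as ET

# Hypothetical detail block with mixed text and links.
PAGE = """
<div id="main_content">
  <div class="details">Genre: <a>Action</a>, <a>Indie</a></div>
</div>
"""

root = ET.fromstring(PAGE)

# A query returns a list of Element objects, not strings.
blocks = root.findall("./div[@class='details']")
first = blocks[0]

# Option 1: only the text directly inside the element, before its first child.
direct_text = first.text

# Option 2: all text in the subtree rooted at the element.
subtree_text = "".join(first.itertext())

# Option 3: the text of particular descendants.
link_texts = [a.text for a in first.findall(".//a")]

print(direct_text, "|", subtree_text, "|", link_texts)
```

Which of the three you want depends on the page; for the genre list, the third one is usually the useful one.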
It sounds like what you really want is just some elements deeper in the tree. You could xpath your way there. (Instead of going through all of the elements one by one and relying on index as you did, I'm just going to write the simplest way to get to what I think you're asking for.)
genres = [a.text for a in blockone[0].xpath('.//a')]
Or, really, why even get that blockone in the first place? Why not just xpath directly to the elements you wanted in the first place?
gtags = tree.xpath(".//*[@id='main_content']/div[4]/div[3]/div[2]/div/div[1]//a")
genres = [a.text for a in gtags]
Also, you could make this a lot simpler—and a lot more robust—if you used the information in the tags instead of finding them by explicitly walking the structure:
gtags = tree.xpath(".//div[@class='glance_tags popular_tags']//a")
Or, since there don't seem to be any other app_tag items anywhere, just:
gtags = tree.xpath(".//a[@class='app_tag']")
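Why the class-based lookup is more robust can be shown with two made-up page variants, one of which has an extra image block in front (as with the PEGI/USK image mentioned in the question). This sketch uses the standard library's xml.etree.ElementTree instead of lxml, and everything except the app_tag class name is an assumption for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical page without a rating image.
PAGE_PLAIN = """
<div id="main_content">
  <div class="details"><a class="app_tag">Action</a><a class="app_tag">Indie</a></div>
  <div class="features"><span>Singleplayer</span></div>
</div>
"""

# Hypothetical page where a rating image block shifts every position by one.
PAGE_WITH_RATING = """
<div id="main_content">
  <div class="rating"><img src="pegi.png"/></div>
  <div class="details"><a class="app_tag">Strategy</a></div>
  <div class="features"><span>Multiplayer</span></div>
</div>
"""

def genres_by_position(root):
    # Walks the structure by index: breaks when a block is inserted in front.
    return [a.text for a in root.findall("./div[1]//a")]

def genres_by_class(root):
    # Uses the information in the tags: independent of block order.
    return [a.text for a in root.findall(".//a[@class='app_tag']")]

plain = ET.fromstring(PAGE_PLAIN)
rated = ET.fromstring(PAGE_WITH_RATING)

print(genres_by_position(plain), genres_by_position(rated))
print(genres_by_class(plain), genres_by_class(rated))
```

The positional version silently returns an empty list on the second page, while the class-based one still finds the genre.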