find text box by web driver python - python

I'm new to Python, and to WebDriver in particular, and I'm trying to find a text box. The source code looks like this:
I've tried this:
box = driver.find_element_by_class_name('_3F6QL._2WovP')
but with no success.
I'll be happy to add more information if needed; as I said, I'm new here. I appreciate the help.

I think the problem is that the class is compound: it consists of two classes, _3F6QL and _2WovP.
Selenium doesn't allow finding elements by a compound class name.
Try this:
box = driver.find_element_by_xpath("//*[contains(@class, '_3F6QL') and contains(@class, '_2WovP')]")
or:
box = driver.find_element_by_xpath("//*[contains(@class, '_3F6QL') and contains(@tabindex, '-1')]")
(Not sure about the latter, though).
Also this should work:
box = driver.find_element_by_xpath("//*[contains(@class, '_1Plpp')]/div")
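
Another option, not mentioned in the answer above, is a CSS selector that chains both class names with dots. The following is only a minimal sketch, assuming the element really carries both classes. The helper name compound_class_selector is hypothetical, and the Selenium call appears only as a comment because it needs a live browser session.

```python
def compound_class_selector(class_attr, tag=""):
    """Turn a class attribute such as '_3F6QL _2WovP' into '._3F6QL._2WovP'."""
    classes = class_attr.split()
    if not classes:
        raise ValueError("class_attr must contain at least one class name")
    # Each class name gets a leading dot; chaining them means "has all of these".
    return tag + "".join("." + name for name in classes)


selector = compound_class_selector("_3F6QL _2WovP")
print(selector)

# With a running driver (assumed to exist), the lookup would look like:
# from selenium.webdriver.common.by import By
# box = driver.find_element(By.CSS_SELECTOR, selector)
```

Unlike contains(@class, ...), the dotted form matches whole class names rather than substrings.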

Related

Facing issue with selenium when I try to use "By.CSS_SELECTOR"

I'm trying to build a script that clicks the "Join" button of a Facebook group when certain conditions are met.
The script can already navigate to "https://www.facebook.com/search/groups/?q=nature_lover" using Selenium.
Image: https://i.stack.imgur.com/3QJhy.png
After navigating to that page, I used this code to process each group component's data.
all_group_elements = self.driver.find_elements(By.CSS_SELECTOR, "div[role=article]")
for group_element in all_group_elements:
    group_name = str(group_element.text.split('\n')[0])
    group_button = str(group_element.text.split('\n')[-1])
    if group_button == "Join":
        group_button_target = f"Join Group {group_name}"
    if group_button == "Follow Group":
        group_button_target = f"Follow Group {group_name}"
    # I used this code to target and click the "join" button.
    self.driver.find_element(By.CSS_SELECTOR, f"div[aria-label={group_button_target}]").click()
I'm also using "WebDriverWait" in the script. What is the issue here?
Your issue is with f"div[aria-label={group_button_target}]"
That translates to something like "div[aria-label=Join Group NAME]"
That's a problem, because the value of the attribute contains spaces and you need quotes around the value if there are spaces.
Eg:
Bad: 'TAG[ATTRIBUTE=SOME VALUE]'
Good: 'TAG[ATTRIBUTE="SOME VALUE"]'
Those quotes are important if the value contains spaces. You may want to change that line to:
self.driver.find_element(By.CSS_SELECTOR, f'div[aria-label="{group_button_target}"]').click()
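
If group names can themselves contain double quotes, the quoted form still breaks. Below is a minimal sketch of a helper that builds the selector and escapes the value; the function name attr_selector and the sample group names are made up for illustration, and it assumes the standard CSS rule that a backslash escapes a quote inside a quoted string.

```python
def attr_selector(tag, attribute, value):
    """Build TAG[ATTRIBUTE="VALUE"], escaping backslashes and double quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'{tag}[{attribute}="{escaped}"]'


# Hypothetical group name, used only to show the resulting selector.
print(attr_selector("div", "aria-label", "Join Group Nature Lovers"))

# With a running driver (assumed to exist), usage would look like:
# self.driver.find_element(By.CSS_SELECTOR, attr_selector("div", "aria-label", group_button_target)).click()
```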

How to find element in nested classes in Selenium (python)?

I am trying to scrape reviews from this website: https://www.goodreads.com/book/show/4865.How_to_Win_Friends_and_Influence_People?from_search=true&from_srp=true&qid=zsfs3jEPvd&rank=1
The reviews are nested inside many layers of elements. I am trying to reach them but am running into issues. I am fairly new to Selenium. So far, I tried:
'''
a = driver.find_element("class name", "BookPage__reviewsSection")
for i in a.find_element("xpath", "//*[@id='ReviewsSection']").find_elements("class name",'lazyload-wrapper '):
    print(i.find_element("xpath","//div[@class='ReviewsList']").text)
'''
The print statement outputs:
Friends & Following
Create a free account to discover what your friends think of this book!
Friends & Following
Create a free account to discover what your friends think of this book!
According to the output it just finds 'BookPage__reviewsSection' class and then 'ReviewsList' class which explains the output. Why doesn't it find 'lazyload-wrapper' class and then 'ReviewsList' class inside it?
I appreciate the help.
@nikhil bhati, you can try the following code. I took the XPath for all the review comments directly. Let me know if this helps or if you wanted some other output. Sorry, I have not tried your way of finding the element.
driver.get("https://www.goodreads.com/book/show/4865.How_to_Win_Friends_and_Influence_People?from_search=true&from_srp=true&qid=zsfs3jEPvd&rank=1")
allReviewTexts = driver.find_elements("xpath", "//div[@id='other_reviews']//div[@id='bookReviews']//span[contains(@id, 'reviewTextContainer')]//span[contains(@id, 'freeTextContainer')]")
print(len(allReviewTexts))
for i in allReviewTexts:
    print(i.text)
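
As for why the original loop likely printed the same text each time: an XPath that begins with // is evaluated against the whole document even when it is called on a child element, so every iteration matches the first ReviewsList on the page. Starting the expression with a dot (.//) keeps the search inside the current element. The sketch below illustrates the scoped form with the standard library's xml.etree.ElementTree on made-up markup (the structure is an assumption, not the real Goodreads page). In Selenium the equivalent would be i.find_element("xpath", ".//div[@class='ReviewsList']").

```python
import xml.etree.ElementTree as ET

# Made-up markup: two wrappers, each with its own ReviewsList.
page = ET.fromstring(
    "<div id='ReviewsSection'>"
    "<div class='lazyload-wrapper'><div class='ReviewsList'>first list</div></div>"
    "<div class='lazyload-wrapper'><div class='ReviewsList'>second list</div></div>"
    "</div>"
)

texts = []
for wrapper in page.findall("./div[@class='lazyload-wrapper']"):
    # The leading dot scopes the search to this wrapper only.
    inner = wrapper.find(".//div[@class='ReviewsList']")
    texts.append(inner.text)

print(texts)
```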

Python folium: Present content dependent on fields=['id'] in GeoJsonPopup

I created a map using Python folium in JupyterLab. On the map I display some GeoJSON files as shapes.
What works so far:
The shapes from the GeoJSON file are displayed nicely on the map. I can change the color of the shapes with a self-written style_function, which checks feature['properties']['id'] and adjusts the style accordingly.
I'm also able to attach a GeoJsonPopup that opens when a shape is clicked. The popup shows id and the content of the id property of that shape.
geo_popup = folium.GeoJsonPopup(fields=['id'])
my_json = folium.GeoJson(file.path, style_function=style_function, popup=geo_popup)
my_json.add_to(map)
What I want:
I want the popup to display some content based on the id. A very basic example: if id = 1, I want to display 'This is the region Alpha'; if id = 2, 'This area is beautiful'.
Alternatively, if that is not possible, I would like to present a link in the popup that opens a page with a parameter, so that it shows dedicated content for that id.
What I tried
I tried to derive a class from folium.GeoJsonPopup and somehow write content in the render function. However, I don't really understand how it works, and therefore nothing I did was successful. I probably took a wrong turn somewhere and the solution is pretty easy.
Thanks for advice!
I followed the sample linked in the comment on the question. To make it work, I had to add the needed dict entries to the features' properties.
For that step I can refer to this question: I used the .update call from the last comment on its solution to add the values.
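
The approach described above can be sketched roughly as follows. This is an illustration under assumptions, not the exact code from the linked sample: the lookup table, the property name description, the helper add_descriptions and the sample GeoJSON data are all made up. The folium calls are left as comments so the snippet runs without folium installed.

```python
# Hypothetical lookup table: id -> text to show in the popup.
descriptions = {1: "This is the region Alpha", 2: "This area is beautiful"}

# Hypothetical GeoJSON data standing in for the loaded file.
geo_data = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"id": 1}, "geometry": None},
        {"type": "Feature", "properties": {"id": 2}, "geometry": None},
        {"type": "Feature", "properties": {"id": 3}, "geometry": None},
    ],
}


def add_descriptions(geojson, lookup, default=""):
    """Add a 'description' property to every feature, chosen by its id."""
    for feature in geojson["features"]:
        props = feature["properties"]
        props.update({"description": lookup.get(props["id"], default)})
    return geojson


add_descriptions(geo_data, descriptions, default="No description")
print([f["properties"]["description"] for f in geo_data["features"]])

# With folium installed, the enriched dict could then be used like this:
# geo_popup = folium.GeoJsonPopup(fields=["id", "description"])
# folium.GeoJson(geo_data, style_function=style_function, popup=geo_popup).add_to(map)
```

The default value matters because the popup expects every listed field to be present on every feature.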

Can't spy on CheckListBox with AutoIt

I can't spy on a CheckListBox object (I think it is Delphi) in a window frame with AutoIt. It can't see anything in the area. I need to get the list of items from the area and possibly select one of the items.
I am using python and robotframework.
I also tried using ControlListView:
self.get_autoit().ControlListView("Setup - XXXXX", "Select the XXXX", "[CLASS:TNewCheckListBox; INSTANCE:1]", "GetText")
But it throws:
com_error: (-2147352561, 'Parameter not optional.', None, None)
The error seems to be an issue with pywinauto.
Anyway I can not get the list of items from this annoying object.
The result from autoit spy is in screenshot:
Can anyone please suggest a good way to access the list of items in this unidentified area?
I can see the inside items from inspect.exe:
Please see the detailed answer from Vasily in the comments. However to summarize:
In the original question, I was trying to get the list of items from the CheckListBox using pyautoit, but that was not working. So, as suggested by Vasily, I used pywinauto (another automation tool) in UIA mode, and the following worked for me:
# requires: from pywinauto.application import Application
self.Wizard = Application(backend="uia").connect(title=self.installerTitle)  # connect to the application
self.Wizard.InstallerDialog.TreeView.wait('visible', timeout=150)  # wait for the tree view to load
items = self.Wizard.InstallerDialog.TreeView.children()  # get the children of the tree view
for item in items:  # iterate through the items (radio buttons in this case)
    if item.window_text() == "item_name_to_select":
        item.click_input()  # click the radio button if the text is what we are looking for
        return
print("no item found with name: item_name_to_select")
The most helpful trick was to use print_control_identifiers() method in pywinauto to get the identifiers of the control. Also the inspect.exe in uia mode helped in identifying the objects.

python lxml xpath AttributeError (NoneType) with correct xpath and usually working

I am trying to migrate a forum to phpbb3 with python/xpath. Although I am pretty new to python and xpath, it is going well. However, I need help with an error.
(The source file has been downloaded and processed with tagsoup.)
Firefox/Firebug show xpath: /html/body/table[5]/tbody/tr[position()>1]/td/a[3]/b
(in my script without tbody)
Here is an abbreviated version of my code:
forumfile="morethread-alte-korken-fruchtweinkeller-89069-6046822-0.html"
XPOSTS = "/html/body/table[5]/tr[position()>1]"
t = etree.parse(forumfile)
allposts = t.xpath(XPOSTS)
XUSER = "td[1]/a[3]/b"
XREG = "td/span"
XTIME = "td[2]/table/tr/td[1]/span"
XTEXT = "td[2]/p"
XSIG = "td[2]/i"
XAVAT = "td/img[last()]"
XPOSTITEL = "/html/body/table[3]/tr/td/table/tr/td/div/h3"
XSUBF = "/html/body/table[3]/tr/td/table/tr/td/div/strong[position()=1]"
for p in allposts:
    unreg = 0
    username = None
    username = p.find(XUSER).text  # this is where it goes haywire
When the loop hits user "tompson" / position()=11 at the end of the file, I get
AttributeError: 'NoneType' object has no attribute 'text'
I've tried a lot of try except else finallys, but they weren't helpful.
I am getting much more information later in the script such as date of post, date of user registry, the url and attributes of the avatar, the content of the post...
The script works for hundreds of other files/sites of this forum.
This is no encode/decode problem. And it is not "limited" to the XUSER part. I tried to "hardcode" the username, then the date of registry will fail. If I skip those, the text of the post (code see below) will fail...
#text of getpost
text = etree.tostring(p.find(XTEXT),pretty_print=True)
Now, this whole error would make sense if my xpath were wrong. However, all the other files and the first few users in this file work. It is only this one at position()=11.
Is position() incapable of going above 10? I don't think so.
Am I missing something?
Question answered!
I have found the answer...
I must have been very tired when I tried to fix it and came here to ask for help. I did not see something quite obvious...
The way I posted my problem, it was not visible either.
The HTML I downloaded and processed with tagsoup had an additional tag at position 11. This was not visible on the website and broke my xpath.
(It is probably bad HTML generated by the forum, combined with tagsoup's attempt to make it parseable.)
Out of more than 20000 files, fewer than 20 are affected; this one just happened to be the first.
Additionally, the information is sometimes in table[4] and other times in table[5]. I had accounted for this and written a function to determine the correct table. Although I tested the function a lot and thought it was working correctly (hence I did not include it above), it was not.
So I made a better xpath:
'/html/body/table[tr/td[@width="20%"]]/tr[position()>1]'
And, although this is not related, I ran into another problem with unexpected encoding in the HTML file (not UTF-8), which was fixed by adding:
parser = etree.XMLParser(encoding='ISO-8859-15')
t = etree.parse(forumfile, parser)
I am now confident that, after adjusting for the strange additional and repeated tags, my code will work on all files.
Still, I will be looking into lxml.html, as I mentioned in the comment. I have never used it before, but if it is more robust and allows me to use the files without tagsoup, it might be a better fit. It could also save me extensive try/except statements and loops to handle the few files that break my current script.
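
The AttributeError itself comes from find returning None when nothing matches the path, so calling .text on the result fails. One way to keep a single malformed post from aborting the whole run is a small guard, sketched below. The helper name find_text and the sample markup are made up, and the standard library's xml.etree.ElementTree stands in for lxml here; lxml's find also returns None for a missing element.

```python
import xml.etree.ElementTree as ET


def find_text(node, path, default=None):
    """Return the text at path, or default when the element is missing."""
    found = node.find(path)
    return default if found is None else found.text


# Made-up rows: the second one lacks the <b> element, like the broken post.
table = ET.fromstring(
    "<table>"
    "<tr><td><a/><a/><a><b>alice</b></a></td></tr>"
    "<tr><td><a/><a/><a>no bold tag</a></td></tr>"
    "</table>"
)

names = [find_text(row, "td[1]/a[3]/b", default="<unknown>") for row in table.findall("tr")]
print(names)
```

Logging the file name whenever the default is returned would make it easy to list the few affected files for manual inspection.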
