Why can't I find some JavaScript objects with Selenium? - python

I want to read the InitialChatFriendsList from my Facebook profile with Selenium.
When I log in to Facebook manually and view the page source, I can easily find
the element. However, when I use Selenium, I cannot find the string:
source = browser.page_source
print 'InitialChatFriendsList' in source
# False
Why can't I find these JavaScript elements?

I think you did not configure the JavaScript files in Eclipse. Please remove all the Java/Eclipse (Selenium) files and configure them again.

Related

How to extract an embedded link from a webpage that has no iframe and shows nothing in the Network tab?

This Image shows my problem
In the above image, the link inside the tag is the clickable link; it triggers a prompt to download the pdf file whose actual source link is https://lms.nust.edu.pk/portal/pluginfile.php/1504453/mod_resource/content/0/APG-Mutual-Evaluation-Report-Pakistan-October%202019.pdf
I am using Selenium to find the links specified by XPath like this
bigger_tag = driver.find_elements(By.XPATH, "//div[@class='activityinstance']//a[@class='aalink'][contains(@href, 'https://lms.nust.edu.pk/portal/mod/resource/view.php?') or contains(@href, 'https://lms.nust.edu.pk/portal/mod/url/view.php')]")
How do I extract such links from the webpage?
Since the site I am trying to scrape is protected and requires login credentials, sharing the code would be fruitless here. I just want to know the standard procedure for a case where you can't find embedded links in the developer tools: no iframe, no server request visible in the Network tab, nothing.

Why does a list appear as a comment with Python Beautiful Soup?

I am trying to scrape the addresses of Dunkin' locations using this website: https://www.dunkindonuts.com/en/locations?location=10001. However, when I try to access the list of Dunkin' stores on the web page, it shows up as a comment. How do I access the list? I've never done web scraping before.
Here's my current code, I'm expecting a list of Dunkin' stores which I can then extract the addresses from.
requests.get() will return the raw HTML for a web page. This is only the beginning of the journey when you view this page in the browser. Your browser will parse that HTML to create the DOM. It will load other resources, such as images and scripts from other files. Then it will execute those scripts. In the modern web, those scripts will modify the DOM to give the page that you finally see in the browser. requests alone doesn't give you all that.
One solution is to use a library that loads the HTML into a browser and does all of the magic. selenium is one such library.
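Before reaching for a browser, it can be worth checking whether the data is already present in the raw response and merely wrapped in an HTML comment, as the question title suggests. The sketch below uses only the standard library. The sample markup and store addresses are made up for illustration and are not the real structure of the Dunkin' page, and the class and function names are hypothetical.

```python
from html.parser import HTMLParser

# Hypothetical raw HTML: the list is present, but wrapped in a comment.
RAW_HTML = """
<div id="store-list">
<!--
<ul>
  <li>Store A, 1 Example St</li>
  <li>Store B, 2 Example Ave</li>
</ul>
-->
</div>
"""


class CommentCollector(HTMLParser):
    """Collects the text of every HTML comment in the document."""

    def __init__(self):
        super().__init__()
        self.comments = []

    def handle_comment(self, data):
        self.comments.append(data)


class ListItemCollector(HTMLParser):
    """Collects the text content of every <li> element."""

    def __init__(self):
        super().__init__()
        self.items = []
        self._in_li = False

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            self._in_li = True
            self.items.append("")

    def handle_endtag(self, tag):
        if tag == "li":
            self._in_li = False

    def handle_data(self, data):
        if self._in_li:
            self.items[-1] += data


def list_items_hidden_in_comments(html):
    """Return the text of <li> elements found inside HTML comments."""
    comments = CommentCollector()
    comments.feed(html)
    comments.close()
    items = ListItemCollector()
    for comment in comments.comments:
        # Parse the comment body as HTML in its own right.
        items.feed(comment)
    items.close()
    return [item.strip() for item in items.items]


stores = list_items_hidden_in_comments(RAW_HTML)
print(stores)
```

If the comment turns out to be empty, or is only a template that scripts fill in later, the data really is produced in the browser and a tool such as selenium is the more likely fit.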

Get current HTML from browser tab with Python

I know there are plenty of ways to get the HTML source by passing the page URL.
But is there a way to get the current HTML of a page if it displays data after some action?
For example: a simple HTML page with a button (that's the source HTML) that displays random data when you click it.
Thanks
I believe you're looking for one of a class of tools collectively known as "headless browsers". The only one I've used that is available in Python (and that I can vouch for) is Selenium WebDriver, but there are plenty to choose from if you search for headless browsers for Python.
https://pypi.org/project/selenium
With this you should be able to programmatically load a web page, look up and click the button in the virtually rendered DOM, then look up the innerHTML property of the targeted element.

Extract embedded script from web page

I have a link I want to scrape the content from that looks like this:
https://www.whatever.com/getDescModuleAjax.htm?productId=32663684002&t=1478698394335
But when I try to open it with Selenium, it doesn't work. When I load it in a normal browser, it opens as plain text with the HTML wrapped in a string like this:
window.productDescription='<div style="clea....
#I want this
....n.jpg" width="950"/></p></div>'";
I was thinking I would download the source as plain text and extract the content I need using bs4, but this can't be the best solution. Is there a way to ignore the tags and load the web page normally using Selenium and Python?
If all of the source code is inside a JS variable:
window.variable="<div>...</div>" then you probably can't use bs4 to resolve it directly, since bs4 works on pure HTML DOM nodes.
Is there a way to ignore the tags and load the web page normally using selenium and python
Most likely Selenium can let the on-page JS execute and load the variable's content into the page's DOM. Try searching for where window.productDescription or productDescription is used (in which of the loaded .js files).
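The asker's fallback idea (fetch the response as text and pull the markup out of the assignment) can be sketched with the standard library alone. The response body, the helper names and the image file name below are hypothetical. The regular expression assumes a single window.<name>='...' assignment in the response and does not unescape JavaScript escape sequences.

```python
import re
from html.parser import HTMLParser

# Hypothetical response body shaped like the snippet in the question.
RESPONSE_TEXT = (
    "window.productDescription='<div style=\"clear:both\">"
    "<p><img src=\"example.jpg\" width=\"950\"/></p></div>';"
)


def extract_js_string(text, variable_name):
    """Return the single-quoted string assigned to window.<variable_name>."""
    # Greedy match: everything between the first quote after "=" and the
    # last single quote in the text.
    pattern = r"window\." + re.escape(variable_name) + r"\s*=\s*'(.*)'"
    match = re.search(pattern, text, re.DOTALL)
    if match is None:
        raise ValueError("no assignment to window.%s found" % variable_name)
    return match.group(1)


class ImageSourceCollector(HTMLParser):
    """Collects the src attribute of every <img> tag."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src is not None:
                self.sources.append(src)


html_fragment = extract_js_string(RESPONSE_TEXT, "productDescription")
collector = ImageSourceCollector()
collector.feed(html_fragment)
collector.close()
print(collector.sources)
```

Once the string has been pulled out of the assignment, it is ordinary HTML, so it could equally be handed to bs4 instead of the small parser class used here.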

How to add clickable links to open a folder/file in output of a print statement? (Python desktop application)

I want to add clickable links to Python output. I am printing a file path and its contents. Can someone please tell me how to make the paths clickable, so that the user can click on them to navigate there directly?
I found this question, but it's for Django, the web side of Python. I am looking for links in a desktop application:
How to add clickable links to a field in Django admin?
Thanks in advance,
Phani
Are you talking about plain Python, with no frameworks? Why not just print what you want?
print('example text')
I have the same issue, but only on Windows.
As mentioned above, you need to generate a link like 'file:///C:/your/path/':
def show_firm_url(self, obj):
    return '<a href="%s">%s</a>' % (obj.firm_url, obj.firm_url)
show_firm_url.allow_tags = True
For security reasons, the browser will not open a local folder in Windows Explorer, so you need some other software. I used the Local Explorer add-in for Chrome.
However, clicking the link can fail if your path contains special symbols or non-English characters, because HTML converts these characters into sequences like '%0H' and the resulting string no longer matches your local path. I do not have an answer for this.
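For the special-character problem, one possible approach is to percent-encode the path yourself when building the link and to decode it again before touching the file system. This is only a sketch using urllib.parse from the standard library. The helper names and the sample path are made up, and it handles only drive-letter paths of the form C:\...

```python
from urllib.parse import quote, unquote

FILE_URL_PREFIX = "file:///"


def windows_path_to_file_url(path):
    """Turn a Windows path such as C:\\your\\path into a file:/// URL."""
    posix_style = path.replace("\\", "/")
    # Keep "/" and the drive colon literal; percent-encode everything else
    # (spaces, non-ASCII characters) as UTF-8 bytes.
    return FILE_URL_PREFIX + quote(posix_style, safe="/:")


def file_url_to_windows_path(url):
    """Reverse the conversion: decode percent-escapes back into a path."""
    if not url.startswith(FILE_URL_PREFIX):
        raise ValueError("not a local file URL: %r" % url)
    return unquote(url[len(FILE_URL_PREFIX):]).replace("/", "\\")


# Hypothetical path containing a space and a non-ASCII character.
original_path = "C:\\Users\\me\\My Docs\\r\u00e9sum\u00e9.txt"
url = windows_path_to_file_url(original_path)
print(url)
print(file_url_to_windows_path(url) == original_path)
```

Decoding with unquote before comparing against the local path is what keeps the encoded link and the real file name in agreement.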
