Website - Load Data Intensive Content After the Entire Page Loads


I'll get right into it. I have a div that takes a while to load because it involves calling APIs and indexing content, so it takes much longer to load than the rest of the page. What I would like to do is load the entire page first and then, once the data has been fetched, load the div on its own, perhaps showing a loading animation in the meantime. What is the best way to accomplish this?
I'm not sure if this is relevant to the question, but I am using Google App Engine in the Python environment.
Thank you!!

Start fetching the data after the DOM loads, for example from a script placed at the end of the body:
<html>
<body>
<!-- page content -->
<script>
// load data here
</script>
</body>
</html>
Alternatively, you can use the jQuery DOM ready event to load your data after the DOM elements have loaded (this assumes jQuery is already included on the page):
<html>
<head>
<script>
$(document).ready(function() {
// DOM is loaded, get data here
});
</script>
</head>
<body>
</body>
</html>
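On the server side, one way to support this is to split the work into two routes: one that returns the page shell immediately and one that returns the slow data as JSON. The following is a minimal sketch using only the standard library's WSGI conventions, not a definitive implementation. The names `PAGE`, `slow_lookup`, `app`, the `/data` path, and the sample items are all assumptions made for illustration.

```python
import json
import time

# Page shell: everything except the slow div is sent immediately.
PAGE = """<html>
<body>
<h1>Fast content</h1>
<div id="slow">Loading...</div>
<script>
// Runs after the markup above has been parsed, so the page is already visible.
fetch("/data")
  .then(function (resp) { return resp.json(); })
  .then(function (data) {
    document.getElementById("slow").textContent = data.items.join(", ");
  });
</script>
</body>
</html>"""


def slow_lookup():
    """Hypothetical stand-in for the API calls and indexing."""
    time.sleep(0.1)
    return {"items": ["alpha", "beta"]}


def app(environ, start_response):
    """Serve the page shell at / and the slow data as JSON at /data."""
    path = environ.get("PATH_INFO", "/")
    if path == "/data":
        body = json.dumps(slow_lookup()).encode("utf-8")
        content_type = "application/json"
    elif path == "/":
        body = PAGE.encode("utf-8")
        content_type = "text/html; charset=utf-8"
    else:
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"not found"]
    headers = [("Content-Type", content_type),
               ("Content-Length", str(len(body)))]
    start_response("200 OK", headers)
    return [body]
```

To try it locally, `app` can be passed to `wsgiref.simple_server.make_server`. The "Loading..." text inside the div is where a loading animation would go.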

Related

Selenium raw page source

I am trying to get the source code of a particular site with Selenium, using this Python code:
driver.page_source
But it returns the source after it has been encoded.
The raw file:
<html>
<head>
<title>AAAAAAAA</title>
</head>
<body>
</body>
When I press 'View page source' in Chrome, I see the correct raw source without encoding.
How can this be achieved?

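You can try using JavaScript instead of the built-in page_source property to get the page source. Note that document.body.outerHTML returns only the <body> element, while document.documentElement.outerHTML returns the whole document:
javascriptPageSource = driver.execute_script("return document.documentElement.outerHTML;")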
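Both page_source and outerHTML return a serialization of the browser's current DOM rather than the bytes the server sent. If the unmodified response body is what is needed, a different technique is to request the URL again outside the browser. The sketch below uses the standard library for that. The helper name `fetch_raw_source` is hypothetical, and it assumes the page can be fetched without the browser's cookies or session.

```python
import urllib.request


def fetch_raw_source(url, encoding="utf-8"):
    """Return the response body for `url` as text, without going through a browser DOM."""
    with urllib.request.urlopen(url) as response:
        # Decode leniently; the real page encoding is an assumption here.
        return response.read().decode(encoding, errors="replace")
```

With a Selenium session, this could be called as `fetch_raw_source(driver.current_url)`. Because it is a separate request, the result may differ from what the browser received if the site depends on login state or request headers.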

How to load js scripts from internet mentioned in html file from flask

I have a script reference in my HTML file like this:
<script defer src="https://use.fontawesome.com/releases/v5.0.8/js/all.js">
When I render this page in Flask with render_template("test.html"), I think it is not able to load these files from the internet, because the page does not display properly. How can I make Flask load all JS and CSS from the internet correctly?
Please help.
Thanks a lot,
Sudip

Remove the defer attribute and put the script tag in the head. That makes sure the script loads before you need it. When it was deferred, your page loaded first, so the font was not available yet:
<head>
<script src="https://use.fontawesome.com/releases/v5.0.8/js/all.js"></script>
</head>

Fetch page with Scrapy, execute JS and extract variable

I have a project using the Python screen-scraping framework Scrapy. I created a spider that loads all <script> tags and processes the second one, because in the test data I gathered, the data I need was in the second <script> tag.
But now I have a problem: some pages contain the data I want in other script tags (#3 or #4). A further obstacle is that the JSON I want is usually on the second line of the second script tag, but depending on the page it can also be on the 3rd or 4th line.
Consider this simple HTML file:
<html>
<head>
<title> Test </title>
</head>
<body>
<p>
This is a text
</p>
<script type="text/javascript">
var myJSON = {
a: "a",
b: 42
}
</script>
</body>
</html>
I can access myJSON.b and get 42 if I open this page in my browser (Firefox), go to the developer tools, and run console.log(myJSON.b).
So my question is: how can I extract a JavaScript variable or JSON from a page fetched with Scrapy?

I ran into a similar issue before and solved it by extracting the text in the script tag using something like this (based on your sample HTML file):
response.xpath('//script/text()')
After that I used a regular expression to extract the required data in JSON format. So, using the selector above and your sample HTML, something close to:
pattern = r'i-suck-at-regular-expressions'
json_data = response.xpath('//script/text()').re_first(pattern)
Next, you should be able to use the json library to load the data as a Python dictionary like so:
import json
json.loads(json_data)
And it should return something similar to:
{"a": "a", "b": 42}
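One caveat with the sample page: the object literal uses unquoted keys (a, b), which is valid JavaScript but not valid JSON, so json.loads would reject it as-is. The sketch below shows one possible way to handle that with the standard library only, by searching the whole page text for the variable (so it does not matter which script tag or line holds it) and quoting bare keys before parsing. The helper name `extract_js_object` is hypothetical, and the regular expressions assume a flat object with no nested braces and no key-like text inside string values.

```python
import json
import re

SAMPLE_HTML = """<html>
<body>
<script type="text/javascript">
var myJSON = {
a: "a",
b: 42
}
</script>
</body>
</html>"""


def extract_js_object(page_text, var_name):
    """Find `var <var_name> = {...}` anywhere in the text and parse it as a dict."""
    pattern = r"var\s+" + re.escape(var_name) + r"\s*=\s*(\{.*?\})"
    match = re.search(pattern, page_text, re.DOTALL)
    if match is None:
        return None
    literal = match.group(1)
    # Quote bare identifiers used as keys so the literal becomes valid JSON.
    as_json = re.sub(r'([{,]\s*)([A-Za-z_]\w*)\s*:', r'\1"\2":', literal)
    return json.loads(as_json)


data = extract_js_object(SAMPLE_HTML, "myJSON")
```

In a spider, `response.text` could be passed in place of `SAMPLE_HTML`. For pages with nested objects or more complex JavaScript, a real JavaScript parser would be a safer choice than regular expressions.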

Open IE browser window using python when link is clicked

I need a URL to be opened specifically in an IE browser.
I know the following code is wrong, but I don't know what else to try.
How can I achieve that with Python?
template
Open in IE
urls.py
url(r'^open-ie$', views.open_in_ie, name='ie'),
views.py
import webbrowser

def open_in_ie(request):
    ie = webbrowser.get(webbrowser.iexplore)
    return ie.open('https://some-link.com')
Again, I know this is wrong, and it tries to open the IE browser at the server level. Any advice? Thank you!

Short answer: You can't.
Long answer: If the user is viewing your website in IE, you can open links in other browsers. But if the user is on any other browser (Firefox, Chrome, etc.), all links open in that same browser, and you cannot reach other browsers from it. So in your case the answer is no, because you are trying to open IE from some other browser.
Here is the code to open another browser from IE, if you are interested:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html lang="en">
<head>
<title>HTA Test</title>
<hta:application applicationname="HTA Test" scroll="yes" singleinstance="yes">
<script type="text/javascript">
function openURL()
{
var shell = new ActiveXObject("WScript.Shell");
shell.run("http://www.google.com");
}
</script>
</head>
<body>
<input type="button" onclick="openURL()" value="Open Google">
</body>
</html>
Code from here

using urllib and beautifulsoup to find values inside "hidden" tags

I want to know if it is possible to display the values of hidden tags. I'm using urllib and BeautifulSoup, but I can't seem to get what I want.
The HTML code I'm using is written below (saved as hiddentry.html):
<html>
<head>
<script type="text/javascript">
//change hidden elem value
function changeValue()
{
document.getElementById('hiddenElem').value = 'hello matey!';
}
//this will verify if i have successfully changed the hiddenElem's value
function printHidden()
{
document.getElementById('displayHere').innerHTML = document.getElementById('hiddenElem').value;
}
</script>
</head>
<body>
<div id="hiddenDiv" style="position: absolute; left: -1500px">
<!--i want to find the value of this element right here-->
<span id="hiddenElem"></span>
</div>
<span id="displayHere"></span>
<script type="text/javascript">
changeValue();
printHidden();
</script>
</body>
</html>
What I want to print is the value of the element with id hiddenElem.
To do this, I tried using a urllib and BeautifulSoup combo. The code I used is:
from BeautifulSoup import BeautifulSoup
import urllib2
import urllib
mysite = urllib.urlopen("http://localhost/hiddentry.html")
soup = BeautifulSoup(mysite)
print soup.prettify()
print '\n\n'
areUthere = soup.find(id="hiddenElem").find(text=True)
print areUthere
What I am getting as output, though, is None.
Any ideas? Is what I am trying to accomplish even possible?

BeautifulSoup parses the HTML that it gets from the server. If you want to see generated values, you need to execute the embedded JavaScript on the page before passing the string to BeautifulSoup, and then pass the modified DOM HTML to BeautifulSoup.
As far as browser emulation goes:
this combo from the creator of jQuery looks interesting
SO question: bringing the browser to the server
SO question: headless internet browser
Using browser emulation, you should be able to pull down the base HTML, execute the JavaScript, and then feed the modified DOM HTML into BeautifulSoup.
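To see why the static parse comes back empty, here is a small sketch that uses the standard library's html.parser in place of BeautifulSoup (a swapped-in parser, chosen only so the example has no third-party dependencies). The class `TextById`, the helper `text_by_id`, and the condensed `STATIC_HTML` sample are hypothetical names for illustration. The sketch also assumes the target element contains no void tags such as <br>.

```python
from html.parser import HTMLParser

# Condensed version of the page as the server sends it, before any script runs.
STATIC_HTML = """<html><body>
<div id="hiddenDiv"><span id="hiddenElem"></span></div>
<span id="displayHere"></span>
<script>document.getElementById('displayHere').innerHTML = 'hello matey!';</script>
</body></html>"""


class TextById(HTMLParser):
    """Collect the text found inside the element with a given id."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.depth = 0  # nesting depth inside the target element; 0 means outside
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if self.depth:
            self.depth += 1
        elif dict(attrs).get("id") == self.target_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


def text_by_id(html, target_id):
    parser = TextById(target_id)
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks)


# The script is never executed by the parser, so both spans are still empty.
static_hidden = text_by_id(STATIC_HTML, "hiddenElem")
static_display = text_by_id(STATIC_HTML, "displayHere")
```

Both values are empty strings because the parser only sees the markup as delivered. The text "hello matey!" exists only inside the script source until something actually runs the JavaScript.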
