Scrape a website that runs Meteor, using Python requests - python

There is a website called Edabit that I want to scrape without using Selenium.
I want to learn how.
What does Selenium do under the hood that lets it get further than a plain request? I make a request and get back HTML that references a Meteor runtime config. Once I get to that stage, where do I go from there?
import requests
url = "https://edabit.com/challenge/ARr5tA458o2tC9FTN"
r = requests.get(url)
print(r.text,r,sep="\n\n\t")
Output:
<!DOCTYPE html>
<html lang="en">
<head>
<link rel="stylesheet" type="text/css" class="__meteor-css__" href="/c039cef9b47481baf0e9a343e536154438171f0f.css?meteor_css_resource=true">
<link rel="stylesheet" type="text/css" class="__meteor-css__" href="/52f02f273b0d7fd6013f016f05c3645aa114c8e6.css?meteor_css_resource=true">
<meta name="fragment" content="!">
<link href="https://fonts.googleapis.com/css?family=Lato" rel="stylesheet">
<link href="https://edabit-fonts.s3-us-west-1.amazonaws.com/avenir.css?family=Avenir" rel="stylesheet">
<link rel="icon" type="image/png" href="https://s3.amazonaws.com/edabit-images/logo_main_medium.png">
<title>Edabit</title>
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<link href="https://fonts.googleapis.com/css?family=Lato" rel="stylesheet">
<link href="https://edabit-fonts.s3-us-west-1.amazonaws.com/avenir.css?family=Avenir" rel="stylesheet">
<link rel="icon" type="image/png" href="https://s3.amazonaws.com/edabit-images/logo_main_medium.png">
<script src="https://script.tapfiliate.com/tapfiliate.js" type="text/javascript" async></script>
<script type="text/javascript">
(function(t,a,p){t.TapfiliateObject=a;t[a]=t[a]||function(){
(t[a].q=t[a].q||[]).push(arguments)}})(window,'tap');
</script>
</head>
<body>
<script type="text/javascript">__meteor_runtime_config__ = JSON.parse(decodeURIComponent("%7B%22meteorRelease%22%3A%22METEOR%401.8.2%22%2C%22gitCommitHash%22%3A%22b231bf6df48b60606f5acf2f54427d52feb3711f%22%2C%22meteorEnv%22%3A%7B%22NODE_ENV%22%3A%22production%22%2C%22TEST_METADATA%22%3A%22%7B%7D%22%7D%2C%22PUBLIC_SETTINGS%22%3A%7B%22analyticsSettings%22%3A%7B%22Google%20Analytics%22%3A%7B%22trackingId%22%3A%22UA-91229704-1%22%7D%7D%2C%22hotjar%22%3A%7B%22hjid%22%3A%22399651%22%2C%22hjsv%22%3A%221%22%7D%2C%22pricing%22%3A%7B%22lifetime%22%3A299%2C%22monthly%22%3A39%2C%22yearly%22%3A120%7D%2C%22stripeKey%22%3A%22pk_live_OW5zXRZem0eb8x31ZSaET8xO%22%2C%22tap%22%3A%7B%22accountId%22%3A%2218539-483fb9%22%2C%22integration%22%3A%22stripe%22%7D%2C%22trialLimit%22%3A15%7D%2C%22ROOT_URL%22%3A%22https%3A%2F%2Fedabit.com%22%2C%22ROOT_URL_PATH_PREFIX%22%3A%22%22%2C%22kadira%22%3A%7B%22appId%22%3A%22Mz7xd9SXTmvc2Cyw5%22%2C%22endpoint%22%3A%22https%3A%2F%2Fapm-engine.meteor.com%22%2C%22clientEngineSyncDelay%22%3A10000%2C%22enableErrorTracking%22%3Atrue%7D%2C%22autoupdate%22%3A%7B%22versions%22%3A%7B%22web.browser%22%3A%7B%22version%22%3A%224c5f400ef296a1f363c2ac037bcca994a67c05a8%22%2C%22versionRefreshable%22%3A%22eabbe41e49f1cfb764e2a02a33e852728e7a26f0%22%2C%22versionNonRefreshable%22%3A%22799cb33096d2bcc3b00eb190b902cd6840ce9e86%22%7D%2C%22web.browser.legacy%22%3A%7B%22version%22%3A%22f45f0508d12468a843658947c98fad067be0fea6%22%2C%22versionRefreshable%22%3A%22eabbe41e49f1cfb764e2a02a33e852728e7a26f0%22%2C%22versionNonRefreshable%22%3A%225fdcb5621ec7f5351fe9c6d265dc52791133253a%22%7D%7D%2C%22autoupdateVersion%22%3Anull%2C%22autoupdateVersionRefreshable%22%3Anull%2C%22autoupdateVersionCordova%22%3Anull%2C%22appId%22%3A%226oe24v3kjymx1952geg%22%7D%2C%22appId%22%3A%226oe24v3kjymx1952geg%22%2C%22isModern%22%3Afalse%7D"))</script>
<script type="text/javascript" src="/fa82c61660f6e946bc1c9dfcc6f33af930712e50.js?meteor_js_resource=true"></script>
</body>
</html>
<Response [200]>
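As a first step from that stage, the runtime config in the output above can be decoded with the standard library alone. The following is a minimal sketch, not a complete scraper: the `html` string is a shortened stand-in for the real script line, and `extract_meteor_config` is a hypothetical helper name. Note that the config only carries settings (release, root URL, public keys); the challenge text itself is not in this HTML. The page body is built by the JavaScript bundle, which a browser driven by Selenium executes and `requests` does not, and Meteor clients normally load their data afterwards over a DDP (WebSocket) connection.

```python
import json
import re
from urllib.parse import unquote

# Shortened stand-in for the <script> line in the output above (assumption:
# the real page keeps the same JSON.parse(decodeURIComponent("...")) shape).
html = (
    '<script type="text/javascript">__meteor_runtime_config__ = '
    'JSON.parse(decodeURIComponent("%7B%22meteorRelease%22%3A%22METEOR%401.8.2%22'
    '%2C%22ROOT_URL%22%3A%22https%3A%2F%2Fedabit.com%22%2C%22isModern%22%3Afalse%7D"))'
    '</script>'
)


def extract_meteor_config(page_html):
    """Return the decoded __meteor_runtime_config__ dict, or None if absent."""
    match = re.search(
        r'__meteor_runtime_config__\s*=\s*JSON\.parse\(decodeURIComponent\("([^"]*)"\)\)',
        page_html,
    )
    if match is None:
        return None
    # The payload is percent-encoded JSON: undo the encoding, then parse it.
    return json.loads(unquote(match.group(1)))


config = extract_meteor_config(html)
print(config["meteorRelease"])  # METEOR@1.8.2
print(config["ROOT_URL"])       # https://edabit.com
```

With the real page, passing `r.text` to the helper instead of the stand-in string should give the full config dictionary.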

Related

What am I doing wrong? CSS files don't work in Django project

<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Japan</title>
<link rel="stylesheet" href="css/reset.css">
<link rel="stylesheet" href="css/style.css">
</head>
What must I do to make the CSS files work in a Django project?
You need to add the static URL (STATIC_URL normally ends with a slash, so no extra slash is needed after it):
<link rel="stylesheet" href="{{ STATIC_URL }}css/reset.css" />
<link rel="stylesheet" href="{{ STATIC_URL }}css/style.css" />

CSS and JS Not Working on Flask Framework

CSS and JavaScript are not working on my website.
I'm using the Flask framework on PythonAnywhere.
Directory structure:
home/dubspher/mysite/
- README.md
- app.py
- index.html
Static/
- css
- fonts
- images
- js
- sass
Original version of HTML:
<html>
<head>
<title>Dubspher.</title>
<meta http-equiv="content-type" content="text/html; charset=utf-8" />
<meta name="description" content="" />
<meta name="keywords" content="" />
<!--[if lte IE 8]><script src="css/ie/html5shiv.js"></script><![endif]-->
<script src="js/jquery.min.js"></script>
<script src="js/jquery.dropotron.min.js"></script>
<script src="js/jquery.scrollgress.min.js"></script>
<script src="js/jquery.scrolly.min.js"></script>
<script src="js/jquery.slidertron.min.js"></script>
<script src="js/skel.min.js"></script>
<script src="js/skel-layers.min.js"></script>
<script src="js/init.js"></script>
<noscript>
<link rel="stylesheet" href="css/skel.css" />
<link rel="stylesheet" href="css/style.css" />
<link rel="stylesheet" href="css/style-xlarge.css" />
</noscript>
<!--[if lte IE 9]><link rel="stylesheet" href="css/ie/v9.css" /><![endif]-->
<!--[if lte IE 8]><link rel="stylesheet" href="css/ie/v8.css" /><![endif]-->
</head>
Modified Version but not working:
<html>
<head>
<title>Dubspher.</title>
<meta http-equiv="content-type" content="text/html; charset=utf-8" />
<meta name="description" content="" />
<meta name="keywords" content="" />
<noscript>
<link href="{{ url_for('static', filename='css/style.css') }}" rel="stylesheet">
<link href="{{ url_for('static', filename='css/skel.css') }}" rel="stylesheet">
<link href="{{ url_for('static', filename='css/style-xlarge.css') }}" rel="stylesheet">
</noscript>
<!--[if lte IE 9]><link rel="stylesheet" href="css/ie/v9.css" /><![endif]-->
<!--[if lte IE 8]><link rel="stylesheet" href="css/ie/v8.css" /><![endif]-->
</head>
<body class="landing">
<!--[if lte IE 8]><script src="css/ie/html5shiv.js"></script><![endif]-->
<script type="text/javascript" src="{{ url_for('static', filename='js/jquery.min.js"') }}"></script>
<script type="text/javascript" src="{{ url_for('static', filename='js/jquery.dropotron.min.js"') }}"></script>
<script type="text/javascript" src="{{ url_for('static', filename='js/jquery.scrollgress.min.js"') }}"></script>
<script type="text/javascript" src="{{ url_for('static', filename='js/jquery.scrolly.min.js"') }}"></script>
<script type="text/javascript" src="{{ url_for('static', filename='js/jquery.slidertron.min.js"') }}"></script>
<script type="text/javascript" src="{{ url_for('static', filename='js/skel.min.js"') }}"></script>
<script type="text/javascript" src="{{ url_for('static', filename='js/skel-layers.min.js"') }}"></script>
<script type="text/javascript" src="{{ url_for('static', filename='js/init.js"') }}"></script>
As you can see, I moved the JS out of the head as suggested in another post, and I also moved all the stylesheet files to a new folder called static.
I cleared my browser cache, but I still see 404 errors in the inspector.
http://dubspher.pythonanywhere.com
App.py
from flask import Flask
# set the project root directory as the static folder, you can set others.
app = Flask(__name__, static_folder='/home/dubspher/mysite/')
@app.route('/')
def static_file():
    return app.send_static_file('index.html')
if __name__ == "__main__":
    app.run()
With the files laid out as you currently have them, move your index.html file into the root of the static subfolder.
home/dubspher/mysite/
- README.md
- app.py
Static/
- index.html
- css
- fonts
- images
- js
- sass
And use the following app.py
from flask import Flask
# serve the static subfolder under the /static URL path.
app = Flask(__name__, static_url_path="/static", static_folder='/home/dubspher/mysite/static')
@app.route('/')
def static_file():
    # index.html now lives inside the static folder
    return app.send_static_file('index.html')
if __name__ == "__main__":
    app.run()
And change your index.html links to:
<html>
<head>
<title>Dubspher.</title>
<meta http-equiv="content-type" content="text/html; charset=utf-8" />
<meta name="description" content="" />
<meta name="keywords" content="" />
<!--[if lte IE 8]><script src="css/ie/html5shiv.js"></script><![endif]-->
<script src="/static/js/jquery.min.js"></script>
<script src="/static/js/jquery.dropotron.min.js"></script>
<script src="/static/js/jquery.scrollgress.min.js"></script>
<script src="/static/js/jquery.scrolly.min.js"></script>
<script src="/static/js/jquery.slidertron.min.js"></script>
<script src="/static/js/skel.min.js"></script>
<script src="/static/js/skel-layers.min.js"></script>
<script src="/static/js/init.js"></script>
<noscript>
<link rel="stylesheet" href="/static/css/skel.css" />
<link rel="stylesheet" href="/static/css/style.css" />
<link rel="stylesheet" href="/static/css/style-xlarge.css" />
</noscript>
<!--[if lte IE 9]><link rel="stylesheet" href="/static/css/ie/v9.css" /><![endif]-->
<!--[if lte IE 8]><link rel="stylesheet" href="/static/css/ie/v8.css" /><![endif]-->
</head>
Your stylesheets are referenced in your javascript. Modify init.js so that it references the correct paths:
global: { href: '/static/css/style.css', containers: 1400, grid: { gutters: ['2em', 0] } },
xlarge: { media: '(max-width: 1680px)', href: '/static/css/style-xlarge.css', containers: 1200 },
large: { media: '(max-width: 1280px)', href: '/static/css/style-large.css', containers: 960, grid: { gutters: ['1.5em', 0] }, viewport: { scalable: false } },
medium: { media: '(max-width: 980px)', href: '/static/css/style-medium.css', containers: '90%', grid: { zoom: 2 } },
small: { media: '(max-width: 736px)', href: '/static/css/style-small.css', containers: '90%!', grid: { gutters: ['1.25em', 0], zoom: 3 } },
xsmall: { media: '(max-width: 480px)', href: '/static/css/style-xsmall.css' }

How should I program logging into a website and completing actions

Scenario: I want to log into a web page and request an export of data that will be sent to me via email. I'm most familiar with python so I start with this:
import mechanize
br = mechanize.Browser()
br.set_handle_robots(False)  # ignore robots
br.set_handle_refresh(False)  # can sometimes hang without this
br.addheaders = [('User-agent', 'Firefox')]
response = br.open("http://weightgurus.com/")
print(response.read())  # the text of the page
for form in br.forms():
    print("Form name:", form.name)
    print(form)
Here is the response HTML:
<!DOCTYPE html>
<html>
<head>
<meta name="viewport" content="width=device-width, height=device-height, initial-scale=1, user-scalable=0"/>
<link rel="stylesheet" type="text/css" href="css/login.css" />
<link rel="stylesheet" type="text/css" href="css/app.css" />
<link rel="stylesheet" type="text/css" href="css/colorbox.css" media="all" />
<link rel="stylesheet" type="text/css" href="css/animate.css" media="all" />
<link rel="stylesheet" type="text/css" href="css/datepicker.css" media="all" />
</head>
<body>
<script>
window.module = {}
</script>
<script type="text/javascript" src="js/dependencies.js"></script>
<script type="text/javascript" src="js/bootstrap-datepicker.js"></script>
<script type="text/javascript" src="js/components.js"></script>
<script type="text/javascript" src="js/app.js"></script>
<script type="text/javascript" src="js/login.js"></script>
<script>
(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
})(window,document,'script','//www.google-analytics.com/analytics.js','ga');
ga('create', 'UA-29982844-1', 'auto');
ga('send', 'pageview');
</script>
</body>
</html>
When I look at the HTML from a browser I get this section:
<div id="loginPrompt"><form id="form" class="centerText"><div class="formGroup">Email:<input id="email" type="email" class="field"></div><div class="formGroup">Password:<input id="password" type="password" class="field"></div><div id="forgotPassword">Forgot password?</div><input id="goCatcher" type="submit" style="position:absolute;margin-left:-10000px;"></form><div class="submitContainer"><div id="submit" class="h-medium popup-title text-center">Log in</div></div></div>
The problem is that no forms are listed for this website. How do I fix this?
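A likely explanation, judging from the two snippets above, is that the login form is not part of the HTML the server sends. It only appears after the page's scripts (js/login.js and the others) run in a browser. mechanize does not execute JavaScript, so br.forms() has nothing to iterate over. The sketch below illustrates the difference using only the standard library; `FormCounter` and `count_forms` are hypothetical names, and the two strings are trimmed copies of the snippets in the question.

```python
from html.parser import HTMLParser


class FormCounter(HTMLParser):
    """Count <form> start tags in an HTML document."""

    def __init__(self):
        super().__init__()
        self.forms = 0

    def handle_starttag(self, tag, attrs):
        if tag == "form":
            self.forms += 1


def count_forms(markup):
    parser = FormCounter()
    parser.feed(markup)
    parser.close()
    return parser.forms


# What the server returns: only script references, no form markup.
served_html = (
    '<html><body>'
    '<script type="text/javascript" src="js/login.js"></script>'
    '</body></html>'
)

# What the browser shows after the scripts have built the page.
rendered_html = (
    '<div id="loginPrompt"><form id="form" class="centerText">'
    '<input id="email" type="email" class="field">'
    '<input id="password" type="password" class="field">'
    '</form></div>'
)

print(count_forms(served_html))    # 0
print(count_forms(rendered_html))  # 1
```

If this is what is happening, the usual options are to drive a real browser, or to watch the browser's network tab while logging in and reproduce the request the login script sends.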

Trying to fix css paths in Django

I'm working locally on a Django project with Bootstrap. The structure screenshot is above:
I have the following in index.html:
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8" />
<meta http-equiv="X-UA-Compatible" content="IE=edge" />
<meta name="viewport" content="width=device-width, initial-scale=1" />
<meta name="description" content="" />
<meta name="author" content="" />
<title>Landing Page - Start Bootstrap Theme</title>
<!-- Bootstrap Core CSS -->
{% load staticfiles %}
<link href="{% static "css/bootstrap.min.css" %}" rel="stylesheet" />
<!-- Custom CSS -->
<link href="{% static "css/landing-page.css" %}" rel="stylesheet" />
<!-- Custom Fonts -->
<link href="{% static "font-awesome-4.2.0/css/font-awesome.min.css" %}" rel="stylesheet" type="text/css" />
<link href="http://fonts.googleapis.com/css?family=Lato:300,400,700,300italic,400italic,700italic" rel="stylesheet" type="text/css" />
<!-- HTML5 Shim and Respond.js IE8 support of HTML5 elements and media queries -->
<!-- WARNING: Respond.js doesn't work if you view the page via file:// -->
<!--[if lt IE 9]>
<script src="https://oss.maxcdn.com/libs/html5shiv/3.7.0/html5shiv.js"></script>
<script src="https://oss.maxcdn.com/libs/respond.js/1.4.2/respond.min.js"></script>
<![endif]-->
</head>
In my settings.py I have:
STATIC_URL = '/static/'
TEMPLATE_DIRS = (
    os.path.join(BASE_DIR, 'templates'),
)
When I open "http://127.0.0.1:8000/index/", I see the HTML but the Bootstrap CSS styling is not present. In dev tools I see:
How can I fix the CSS paths?
This seems to be a syntax error. If you move the css directory from static>app1>css to static>css, it should work just fine.
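To see why the folder location matters, it helps to look at what the static tag produces. The following is a rough sketch under stated assumptions, not the project's actual configuration: `BASE_DIR` here is a placeholder, the top-level "static" folder is assumed, and `static_url` is a hypothetical helper that only mimics the default behaviour of the tag (STATIC_URL followed by the relative path). STATICFILES_DIRS is the Django setting that lists extra folders for the development server to search.

```python
import os

# Placeholder project root; a real settings.py usually derives this from __file__.
BASE_DIR = os.path.abspath(".")

STATIC_URL = "/static/"

# Assumption: the project keeps a top-level "static" folder outside any app.
STATICFILES_DIRS = [os.path.join(BASE_DIR, "static")]


def static_url(relative_path):
    """Mimic the URL that {% static "..." %} yields with default settings."""
    return STATIC_URL + relative_path


# The template asks for css/bootstrap.min.css, so the file must sit at
# <static folder>/css/bootstrap.min.css for this URL to resolve.
print(static_url("css/bootstrap.min.css"))       # /static/css/bootstrap.min.css

# If the file stays under static/app1/css, the template would have to ask for:
print(static_url("app1/css/bootstrap.min.css"))  # /static/app1/css/bootstrap.min.css
```

Either the files move to match the paths in the template, as suggested above, or the paths in the template change to match where the files are.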

ng-Grid 'Selection' won't show up in browser using Flask

I'm using the Master/Details example on Plunker:
http://plnkr.co/edit/CncDWCktXTuBQdDVfuVv?p=preview
I copied everything to a local file, and when I just open the HTML
in Firefox or Chrome, everything works.
But when I try to serve the page via a Flask Python server, I
don't see the selection in the HTML output. It looks like the {{ ?? }}
expression is not showing up.
app.py: trimmed down flask app
from flask import Flask, render_template, request, json, Response
app = Flask(__name__)
DEBUG = True
@app.route("/index", methods=['GET', 'POST'])
def selected_version():
    return render_template("index.html")
if __name__ == "__main__":
    app.run(debug=True)
The index.html:
<!DOCTYPE html>
<html ng-app="myApp">
<head lang="en">
<meta charset="utf-8">
<title>Custom Plunker</title>
<link rel="stylesheet" type="text/css" href="http://angular-ui.github.com/ng-grid/css/ng-grid.css" />
<link rel="stylesheet" type="text/css" href="../static/css/style.css" />
<script src="https://ajax.googleapis.com/ajax/libs/jquery/1.8.0/jquery.min.js"></script>
<script src="https://ajax.googleapis.com/ajax/libs/angularjs/1.0.2/angular.min.js"></script>
<script type="text/javascript" src="http://angular-ui.github.com/ng-grid/lib/ng-grid.debug.js"></script>
<script type="text/javascript" src="../static/js/newAppMain.js"></script>
</head>
<body ng-controller="MyCtrl">
<div class="gridStyle" ng-grid="gridOptions"></div>
<div class="selectedItems"> {{mySelections}} </div>
</body>
</html>
The view source of index.html as served by Flask:
<!DOCTYPE html>
<html ng-app="myApp">
<head lang="en">
<meta charset="utf-8">
<title>Custom Plunker</title>
<link rel="stylesheet" type="text/css" href="http://angular-ui.github.com/ng-grid/css/ng-grid.css" />
<link rel="stylesheet" type="text/css" href="../static/css/style.css" />
<script src="https://ajax.googleapis.com/ajax/libs/jquery/1.8.0/jquery.min.js"></script>
<script src="https://ajax.googleapis.com/ajax/libs/angularjs/1.0.2/angular.min.js"></script>
<script type="text/javascript" src="http://angular-ui.github.com/ng-grid/lib/ng-grid.debug.js"></script>
<script type="text/javascript" src="../static/js/newAppMain.js"></script>
</head>
<body ng-controller="MyCtrl">
<div class="gridStyle" ng-grid="gridOptions"></div>
<div class="selectedItems"></div>
</body>
</html>
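The difference between the two listings above points to a probable cause: Flask's render_template runs the file through Jinja2, which uses the same {{ ... }} delimiters as AngularJS. Jinja2 sees {{mySelections}} as one of its own variables, finds it undefined, and renders an empty string before the page ever reaches the browser. The sketch below reproduces this with Jinja2 directly (the library Flask uses for templates); the variable names are illustrative only.

```python
from jinja2 import Template

# The AngularJS expression as written in index.html.
angular_markup = '<div class="selectedItems">{{mySelections}}</div>'

# Jinja2 treats {{mySelections}} as its own (undefined) variable and renders "".
rendered_plain = Template(angular_markup).render()
print(rendered_plain)  # <div class="selectedItems"></div>

# Wrapping the expression in {% raw %} passes the braces through untouched,
# so AngularJS can evaluate them in the browser.
escaped_markup = (
    '<div class="selectedItems">{% raw %}{{mySelections}}{% endraw %}</div>'
)
rendered_raw = Template(escaped_markup).render()
print(rendered_raw)  # <div class="selectedItems">{{mySelections}}</div>
```

Other common workarounds are serving index.html as a static file so it is never passed through Jinja2, or changing the AngularJS interpolation symbols.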
