Scenario: I want to log into a web page and request an export of data that will be sent to me via email. I'm most familiar with Python, so I started with this:
import mechanize

br = mechanize.Browser()
br.set_handle_robots(False)   # ignore robots.txt
br.set_handle_refresh(False)  # can sometimes hang without this
br.addheaders = [('User-agent', 'Firefox')]
response = br.open("http://weightgurus.com/")
print(response.read())  # the text of the page
for form in br.forms():
    print("Form name:", form.name)
    print(form)
Here is the response HTML:
<!DOCTYPE html>
<html>
<head>
<meta name="viewport" content="width=device-width, height=device-height, initial-scale=1, user-scalable=0"/>
<link rel="stylesheet" type="text/css" href="css/login.css" />
<link rel="stylesheet" type="text/css" href="css/app.css" />
<link rel="stylesheet" type="text/css" href="css/colorbox.css" media="all" />
<link rel="stylesheet" type="text/css" href="css/animate.css" media="all" />
<link rel="stylesheet" type="text/css" href="css/datepicker.css" media="all" />
</head>
<body>
<script>
window.module = {}
</script>
<script type="text/javascript" src="js/dependencies.js"></script>
<script type="text/javascript" src="js/bootstrap-datepicker.js"></script>
<script type="text/javascript" src="js/components.js"></script>
<script type="text/javascript" src="js/app.js"></script>
<script type="text/javascript" src="js/login.js"></script>
<script>
(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
})(window,document,'script','//www.google-analytics.com/analytics.js','ga');
ga('create', 'UA-29982844-1', 'auto');
ga('send', 'pageview');
</script>
</body>
</html>
When I look at the HTML from a browser I get this section:
<div id="loginPrompt"><form id="form" class="centerText"><div class="formGroup">Email:<input id="email" type="email" class="field"></div><div class="formGroup">Password:<input id="password" type="password" class="field"></div><div id="forgotPassword">Forgot password?</div><input id="goCatcher" type="submit" style="position:absolute;margin-left:-10000px;"></form><div class="submitContainer"><div id="submit" class="h-medium popup-title text-center">Log in</div></div></div>
The problem is that mechanize lists no forms for this website, even though the browser shows a login form. How do I fix this?
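One way to confirm what mechanize has to work with is to count the `<form>` tags in the raw response. The following is a minimal sketch using only the standard library; `FormCounter` and `count_forms` are hypothetical names, and the two HTML strings are trimmed versions of the snippets shown above. The HTML returned by the server contains no form at all. The login form seen in the browser is presumably injected afterwards by the page's scripts (an assumption, since the scripts themselves are not shown), and mechanize does not execute JavaScript.

```python
from html.parser import HTMLParser


class FormCounter(HTMLParser):
    """Counts <form> start tags in an HTML document."""

    def __init__(self):
        super().__init__()
        self.forms = 0

    def handle_starttag(self, tag, attrs):
        if tag == "form":
            self.forms += 1


def count_forms(html):
    """Return the number of <form> elements found in the given HTML text."""
    parser = FormCounter()
    parser.feed(html)
    parser.close()
    return parser.forms


# Trimmed from the server response shown above: scripts only, no form.
raw_html = '<html><body><script src="js/login.js"></script></body></html>'

# Trimmed from the markup seen in the browser after the scripts have run.
rendered_html = (
    '<div id="loginPrompt"><form id="form" class="centerText">'
    '<input id="email" type="email" class="field"></form></div>'
)

print(count_forms(raw_html))
print(count_forms(rendered_html))
```

If the raw count is zero, there is nothing for `br.forms()` to iterate over, so the login has to be driven some other way, for example with a tool that runs the page's JavaScript or by calling the login endpoint the scripts use.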
There is a website called Edabit that I want to scrape without using Selenium, and I want to learn how.
What does Selenium do under the hood that lets it get further than a plain request? I make a request and get HTML that references a Meteor runtime config. Once I get to that stage, where do I go from there?
import requests
url = "https://edabit.com/challenge/ARr5tA458o2tC9FTN"
r = requests.get(url)
print(r.text,r,sep="\n\n\t")
⇩⇩⇩⇩Outputs⇩⇩⇩⇩
<!DOCTYPE html>
<html lang="en">
<head>
<link rel="stylesheet" type="text/css" class="__meteor-css__" href="/c039cef9b47481baf0e9a343e536154438171f0f.css?meteor_css_resource=true">
<link rel="stylesheet" type="text/css" class="__meteor-css__" href="/52f02f273b0d7fd6013f016f05c3645aa114c8e6.css?meteor_css_resource=true">
<meta name="fragment" content="!">
<link href="https://fonts.googleapis.com/css?family=Lato" rel="stylesheet">
<link href="https://edabit-fonts.s3-us-west-1.amazonaws.com/avenir.css?family=Avenir" rel="stylesheet">
<link rel="icon" type="image/png" href="https://s3.amazonaws.com/edabit-images/logo_main_medium.png">
<title>Edabit</title>
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<link href="https://fonts.googleapis.com/css?family=Lato" rel="stylesheet">
<link href="https://edabit-fonts.s3-us-west-1.amazonaws.com/avenir.css?family=Avenir" rel="stylesheet">
<link rel="icon" type="image/png" href="https://s3.amazonaws.com/edabit-images/logo_main_medium.png">
<script src="https://script.tapfiliate.com/tapfiliate.js" type="text/javascript" async></script>
<script type="text/javascript">
(function(t,a,p){t.TapfiliateObject=a;t[a]=t[a]||function(){
(t[a].q=t[a].q||[]).push(arguments)}})(window,'tap');
</script>
</head>
<body>
<script type="text/javascript">__meteor_runtime_config__ = JSON.parse(decodeURIComponent("%7B%22meteorRelease%22%3A%22METEOR%401.8.2%22%2C%22gitCommitHash%22%3A%22b231bf6df48b60606f5acf2f54427d52feb3711f%22%2C%22meteorEnv%22%3A%7B%22NODE_ENV%22%3A%22production%22%2C%22TEST_METADATA%22%3A%22%7B%7D%22%7D%2C%22PUBLIC_SETTINGS%22%3A%7B%22analyticsSettings%22%3A%7B%22Google%20Analytics%22%3A%7B%22trackingId%22%3A%22UA-91229704-1%22%7D%7D%2C%22hotjar%22%3A%7B%22hjid%22%3A%22399651%22%2C%22hjsv%22%3A%221%22%7D%2C%22pricing%22%3A%7B%22lifetime%22%3A299%2C%22monthly%22%3A39%2C%22yearly%22%3A120%7D%2C%22stripeKey%22%3A%22pk_live_OW5zXRZem0eb8x31ZSaET8xO%22%2C%22tap%22%3A%7B%22accountId%22%3A%2218539-483fb9%22%2C%22integration%22%3A%22stripe%22%7D%2C%22trialLimit%22%3A15%7D%2C%22ROOT_URL%22%3A%22https%3A%2F%2Fedabit.com%22%2C%22ROOT_URL_PATH_PREFIX%22%3A%22%22%2C%22kadira%22%3A%7B%22appId%22%3A%22Mz7xd9SXTmvc2Cyw5%22%2C%22endpoint%22%3A%22https%3A%2F%2Fapm-engine.meteor.com%22%2C%22clientEngineSyncDelay%22%3A10000%2C%22enableErrorTracking%22%3Atrue%7D%2C%22autoupdate%22%3A%7B%22versions%22%3A%7B%22web.browser%22%3A%7B%22version%22%3A%224c5f400ef296a1f363c2ac037bcca994a67c05a8%22%2C%22versionRefreshable%22%3A%22eabbe41e49f1cfb764e2a02a33e852728e7a26f0%22%2C%22versionNonRefreshable%22%3A%22799cb33096d2bcc3b00eb190b902cd6840ce9e86%22%7D%2C%22web.browser.legacy%22%3A%7B%22version%22%3A%22f45f0508d12468a843658947c98fad067be0fea6%22%2C%22versionRefreshable%22%3A%22eabbe41e49f1cfb764e2a02a33e852728e7a26f0%22%2C%22versionNonRefreshable%22%3A%225fdcb5621ec7f5351fe9c6d265dc52791133253a%22%7D%7D%2C%22autoupdateVersion%22%3Anull%2C%22autoupdateVersionRefreshable%22%3Anull%2C%22autoupdateVersionCordova%22%3Anull%2C%22appId%22%3A%226oe24v3kjymx1952geg%22%7D%2C%22appId%22%3A%226oe24v3kjymx1952geg%22%2C%22isModern%22%3Afalse%7D"))</script>
<script type="text/javascript" src="/fa82c61660f6e946bc1c9dfcc6f33af930712e50.js?meteor_js_resource=true"></script>
</body>
</html>
<Response [200]>
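The `__meteor_runtime_config__` value in the output above is just percent-encoded JSON, so it can be decoded without a browser. This is a minimal sketch; `extract_meteor_config` is a hypothetical helper and `sample` is a shortened stand-in for the real page, keeping only two of its keys. Note that the config shown holds application settings only; the challenge content itself is presumably fetched later by the script bundle, which a plain `requests.get` never runs.

```python
import json
import re
from urllib.parse import unquote

# Captures the percent-encoded string passed to decodeURIComponent(...).
CONFIG_RE = re.compile(
    r'__meteor_runtime_config__\s*=\s*JSON\.parse\(decodeURIComponent\("([^"]*)"\)\)'
)


def extract_meteor_config(html):
    """Return the decoded runtime config as a dict, or None if it is absent."""
    match = CONFIG_RE.search(html)
    if match is None:
        return None
    return json.loads(unquote(match.group(1)))


# Shortened stand-in for r.text; the real page carries many more keys.
sample = (
    '<script type="text/javascript">__meteor_runtime_config__ = '
    'JSON.parse(decodeURIComponent("%7B%22meteorRelease%22%3A%22METEOR%401.8.2%22'
    '%2C%22ROOT_URL%22%3A%22https%3A%2F%2Fedabit.com%22%7D"))</script>'
)

config = extract_meteor_config(sample)
print(config["meteorRelease"], config["ROOT_URL"])
```

With the real response, `extract_meteor_config(r.text)` would be the equivalent call.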
I built my backend using Python Flask and tested it with basic HTML templates with no CSS, and it worked without problems. However, when I tried to add Bootstrap, the CSS would not load. I've placed the CSS and JS files in a static folder.
Here is the Bootstrap HTML markup:
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1">
<!-- The above 3 meta tags *must* come first in the head; any other head content must come *after* these tags -->
<meta name="description" content="">
<meta name="author" content="">
<link rel="icon" href="../../favicon.ico">
<title>Simple Safety</title>
<!-- Bootstrap core CSS -->
<link href="{{ url_for('static', filename ='bootstrap.min.css') }}" rel="stylesheet">
<!-- IE10 viewport hack for Surface/desktop Windows 8 bug -->
<link href="{{ url_for('static', filename ='ie10-viewport-bug-workaround.css') }}" rel="stylesheet">
<!-- Custom styles for this template -->
<link href="{{ url_for('static', filename ='style.css') }}" rel="stylesheet">
<!-- Just for debugging purposes. Don't actually copy these 2 lines! -->
<!--[if lt IE 9]><script src="../../assets/js/ie8-responsive-file-warning.js"></script><![endif]
<script type="text/javascript" src="{{ url_for('static', filename = 'ie-emulation-modes-warning.js') }}"></script>-->
<!-- HTML5 shim and Respond.js for IE8 support of HTML5 elements and media queries -->
<!--[if lt IE 9]>
<script src="https://oss.maxcdn.com/html5shiv/3.7.3/html5shiv.min.js"> </script>
<script src="https://oss.maxcdn.com/respond/1.4.2/respond.min.js"> </script>
<![endif]-->
</head>
<!--Main Content-->
<!-- Bootstrap core JavaScript
================================================== -->
<!-- Placed at the end of the document so the pages load faster -->
<script src="{{url_for('https://ajax.googleapis.com/ajax/libs/jquery/1.12.4/', filename='jquery.min.js') }}"></script>
<script>window.jQuery || document.write("<script type='text/javascript' src='{{ url_for('static', filename='jquery.min.js') }}''><\/script>")</script>
<script type="text/javascript" src="{{ url_for('static', filename ='bootstrap.min.js') }}"></script>
<!-- IE10 viewport hack for Surface/desktop Windows 8 bug -->
<script type="text/javascript" src="{{ url_for('static', filename='ie10-viewport-bug-workaround.js') }}"></script>
</body>
</html>
I realized that I had placed my 'static' folder inside the 'templates' folder, an old habit from front-end-only applications. The 'static' folder needs to sit in the same directory as the 'templates' folder, next to it rather than inside it.
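By default Flask looks for a folder named `static` next to the application module, which is also where it looks for `templates`. The layout mistake described above can be checked with a few lines of standard-library code. This is only a sketch: `static_is_misplaced` is a hypothetical helper, and the two throwaway directory trees exist purely to show the difference.

```python
import tempfile
from pathlib import Path


def static_is_misplaced(project_root):
    """True if 'static' sits inside 'templates' instead of beside it."""
    root = Path(project_root)
    nested = (root / "templates" / "static").is_dir()
    sibling = (root / "static").is_dir()
    return nested and not sibling


# Build two throwaway layouts to show the difference.
with tempfile.TemporaryDirectory() as tmp:
    wrong = Path(tmp) / "wrong"
    (wrong / "templates" / "static").mkdir(parents=True)

    right = Path(tmp) / "right"
    (right / "templates").mkdir(parents=True)
    (right / "static").mkdir()

    wrong_result = static_is_misplaced(wrong)
    right_result = static_is_misplaced(right)

print(wrong_result, right_result)
```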
CSS and JavaScript are not working on my website
I'm using the Flask framework on PythonAnywhere.
Directory structure:
home/dubspher/mysite/
- README.md
- app.py
- index.html
Static/
- css
- fonts
- images
- js
- sass
Original version of HTML:
<html>
<head>
<title>Dubspher.</title>
<meta http-equiv="content-type" content="text/html; charset=utf-8" />
<meta name="description" content="" />
<meta name="keywords" content="" />
<!--[if lte IE 8]><script src="css/ie/html5shiv.js"></script><![endif]-->
<script src="js/jquery.min.js"></script>
<script src="js/jquery.dropotron.min.js"></script>
<script src="js/jquery.scrollgress.min.js"></script>
<script src="js/jquery.scrolly.min.js"></script>
<script src="js/jquery.slidertron.min.js"></script>
<script src="js/skel.min.js"></script>
<script src="js/skel-layers.min.js"></script>
<script src="js/init.js"></script>
<noscript>
<link rel="stylesheet" href="css/skel.css" />
<link rel="stylesheet" href="css/style.css" />
<link rel="stylesheet" href="css/style-xlarge.css" />
</noscript>
<!--[if lte IE 9]><link rel="stylesheet" href="css/ie/v9.css" /><![endif]-->
<!--[if lte IE 8]><link rel="stylesheet" href="css/ie/v8.css" /><![endif]-->
</head>
Modified Version but not working:
<html>
<head>
<title>Dubspher.</title>
<meta http-equiv="content-type" content="text/html; charset=utf-8" />
<meta name="description" content="" />
<meta name="keywords" content="" />
<noscript>
<link href="{{ url_for('static', filename='css/style.css') }}" rel="stylesheet">
<link href="{{ url_for('static', filename='css/skel.css') }}" rel="stylesheet">
<link href="{{ url_for('static', filename='css/style-xlarge.css') }}" rel="stylesheet">
</noscript>
<!--[if lte IE 9]><link rel="stylesheet" href="css/ie/v9.css" /><![endif]-->
<!--[if lte IE 8]><link rel="stylesheet" href="css/ie/v8.css" /><![endif]-->
</head>
<body class="landing">
<!--[if lte IE 8]><script src="css/ie/html5shiv.js"></script><![endif]-->
<script type="text/javascript" src="{{ url_for('static', filename='js/jquery.min.js"') }}"></script>
<script type="text/javascript" src="{{ url_for('static', filename='js/jquery.dropotron.min.js"') }}"></script>
<script type="text/javascript" src="{{ url_for('static', filename='js/jquery.scrollgress.min.js"') }}"></script>
<script type="text/javascript" src="{{ url_for('static', filename='js/jquery.scrolly.min.js"') }}"></script>
<script type="text/javascript" src="{{ url_for('static', filename='js/jquery.slidertron.min.js"') }}"></script>
<script type="text/javascript" src="{{ url_for('static', filename='js/skel.min.js"') }}"></script>
<script type="text/javascript" src="{{ url_for('static', filename='js/skel-layers.min.js"') }}"></script>
<script type="text/javascript" src="{{ url_for('static', filename='js/init.js"') }}"></script>
As you can see, I moved the JS out of the head and into the body, as suggested in another post. I also moved all the stylesheet files to a new folder called static.
I cleared my browser cache, but I still see 404 errors in the inspector.
http://dubspher.pythonanywhere.com
App.py
from flask import Flask

# set the project root directory as the static folder, you can set others.
app = Flask(__name__, static_folder='/home/dubspher/mysite/')

@app.route('/')
def static_file():
    return app.send_static_file('index.html')

if __name__ == "__main__":
    app.run()
With the files laid out as you currently have them, move your index.html file into the root of the static subfolder.
home/dubspher/mysite/
- README.md
- app.py
Static/
- index.html
- css
- fonts
- images
- js
- sass
And use the following app.py
from flask import Flask

# serve everything under the static subfolder at the /static URL path.
app = Flask(__name__, static_url_path="/static", static_folder='/home/dubspher/mysite/static')

@app.route('/')
def static_file():
    return app.send_static_file('index.html')

if __name__ == "__main__":
    app.run()
And change your index.html links to:
<html>
<head>
<title>Dubspher.</title>
<meta http-equiv="content-type" content="text/html; charset=utf-8" />
<meta name="description" content="" />
<meta name="keywords" content="" />
<!--[if lte IE 8]><script src="css/ie/html5shiv.js"></script><![endif]-->
<script src="/static/js/jquery.min.js"></script>
<script src="/static/js/jquery.dropotron.min.js"></script>
<script src="/static/js/jquery.scrollgress.min.js"></script>
<script src="/static/js/jquery.scrolly.min.js"></script>
<script src="/static/js/jquery.slidertron.min.js"></script>
<script src="/static/js/skel.min.js"></script>
<script src="/static/js/skel-layers.min.js"></script>
<script src="/static/js/init.js"></script>
<noscript>
<link rel="stylesheet" href="/static/css/skel.css" />
<link rel="stylesheet" href="/static/css/style.css" />
<link rel="stylesheet" href="/static/css/style-xlarge.css" />
</noscript>
<!--[if lte IE 9]><link rel="stylesheet" href="/static/css/ie/v9.css" /><![endif]-->
<!--[if lte IE 8]><link rel="stylesheet" href="/static/css/ie/v8.css" /><![endif]-->
</head>
Your stylesheets are referenced in your javascript. Modify init.js so that it references the correct paths:
global: { href: '/static/css/style.css', containers: 1400, grid: { gutters: ['2em', 0] } },
xlarge: { media: '(max-width: 1680px)', href: '/static/css/style-xlarge.css', containers: 1200 },
large: { media: '(max-width: 1280px)', href: '/static/css/style-large.css', containers: 960, grid: { gutters: ['1.5em', 0] }, viewport: { scalable: false } },
medium: { media: '(max-width: 980px)', href: '/static/css/style-medium.css', containers: '90%', grid: { zoom: 2 } },
small: { media: '(max-width: 736px)', href: '/static/css/style-small.css', containers: '90%!', grid: { gutters: ['1.25em', 0], zoom: 3 } },
xsmall: { media: '(max-width: 480px)', href: '/static/css/style-xsmall.css' }
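Editing every path by hand is error-prone, so the same change can be scripted. The sketch below is one possible approach, not part of the original answer: `prefix_static` is a hypothetical helper, and it assumes the unmodified files use relative paths beginning with `css/` or `js/`, as the original index.html does. The relative `href` in `js_line` is likewise an assumed form of the original init.js entry.

```python
import re

# Matches href/src values that start with a relative css/ or js/ directory,
# in either HTML attribute form (href="css/...") or JS object form (href: 'css/...').
RELATIVE_ASSET = re.compile(r"""((?:href|src)\s*[=:]\s*)(["'])((?:css|js)/)""")


def prefix_static(text, prefix="/static/"):
    """Prepend `prefix` to relative css/ and js/ paths; other paths are left alone."""
    return RELATIVE_ASSET.sub(
        lambda m: m.group(1) + m.group(2) + prefix + m.group(3), text
    )


html_line = '<script src="js/jquery.min.js"></script>'
js_line = "xsmall: { media: '(max-width: 480px)', href: 'css/style-xsmall.css' }"
already_done = '<link rel="stylesheet" href="/static/css/skel.css" />'

print(prefix_static(html_line))
print(prefix_static(js_line))
print(prefix_static(already_done))
```

Paths that already start with `/static/` do not match the pattern, so running the helper twice on the same file leaves it unchanged.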
I'm working locally on a Django project with Bootstrap. The structure screenshot is above:
I have the following in index.html:
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8" />
<meta http-equiv="X-UA-Compatible" content="IE=edge" />
<meta name="viewport" content="width=device-width, initial-scale=1" />
<meta name="description" content="" />
<meta name="author" content="" />
<title>Landing Page - Start Bootstrap Theme</title>
<!-- Bootstrap Core CSS -->
{% load staticfiles %}
<link href="{% static "css/bootstrap.min.css" %}" rel="stylesheet" />
<!-- Custom CSS -->
<link href="{% static "css/landing-page.css" %}" rel="stylesheet" />
<!-- Custom Fonts -->
<link href="{% static "font-awesome-4.2.0/css/font-awesome.min.css" %}" rel="stylesheet" type="text/css" />
<link href="http://fonts.googleapis.com/css?family=Lato:300,400,700,300italic,400italic,700italic" rel="stylesheet" type="text/css" />
<!-- HTML5 Shim and Respond.js IE8 support of HTML5 elements and media queries -->
<!-- WARNING: Respond.js doesn't work if you view the page via file:// -->
<!--[if lt IE 9]>
<script src="https://oss.maxcdn.com/libs/html5shiv/3.7.0/html5shiv.js"></script>
<script src="https://oss.maxcdn.com/libs/respond.js/1.4.2/respond.min.js"></script>
<![endif]-->
</head>
In my settings.py I have:
STATIC_URL = '/static/'
TEMPLATE_DIRS = (
    os.path.join(BASE_DIR, 'templates'),
)
When I open "http://127.0.0.1:8000/index/", I see the HTML but the Bootstrap CSS styling is not present. In dev tools I see:
How can I fix the CSS paths?
This seems to be a syntax error. If you move the css directory from static>app1>css to static>css, it should work just fine.
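The reason the move helps can be seen by comparing the URL the template asks for with the URL the file would be served at. This is a rough sketch: `static_url` is a hypothetical function that only approximates what `{% static "..." %}` produces with the default storage, and the nested path assumes the file currently lives under static/app1/css as the answer implies.

```python
import posixpath

STATIC_URL = "/static/"  # same value as in the settings.py shown above


def static_url(path, static_url=STATIC_URL):
    """Approximate the URL that {% static "path" %} renders to."""
    return posixpath.join(static_url, path)


# What the template in index.html requests.
requested = static_url("css/bootstrap.min.css")

# Where the file would be served from while it sits in static/app1/css (assumed).
served_if_nested = static_url("app1/css/bootstrap.min.css")

print(requested)
print(served_if_nested)
print(requested == served_if_nested)
```

The two URLs differ, so the request for `css/bootstrap.min.css` misses. Either move the directory as suggested, or change the template to reference `app1/css/bootstrap.min.css`.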
I'm using the Master/Details example on Plunker:
http://plnkr.co/edit/CncDWCktXTuBQdDVfuVv?p=preview
I copied everything to local files and opened the HTML directly in Firefox and Chrome, and everything works.
But when I serve the page through a Flask (Python) server, the selection never appears in the HTML output. It looks like the {{ ?? }} expression is not showing up.
app.py: trimmed down flask app
from flask import Flask, render_template, request, json, Response

app = Flask(__name__)
DEBUG = True

@app.route("/index", methods=['GET', 'POST'])
def selected_version():
    return render_template("index.html")

if __name__ == "__main__":
    app.run(debug=True)
The index.html:
<!DOCTYPE html>
<html ng-app="myApp">
<head lang="en">
<meta charset="utf-8">
<title>Custom Plunker</title>
<link rel="stylesheet" type="text/css" href="http://angular-ui.github.com/ng-grid/css/ng-grid.css" />
<link rel="stylesheet" type="text/css" href="../static/css/style.css" />
<script src="https://ajax.googleapis.com/ajax/libs/jquery/1.8.0/jquery.min.js"></script>
<script src="https://ajax.googleapis.com/ajax/libs/angularjs/1.0.2/angular.min.js"></script>
<script type="text/javascript" src="http://angular-ui.github.com/ng-grid/lib/ng-grid.debug.js"></script>
<script type="text/javascript" src="../static/js/newAppMain.js"></script>
</head>
<body ng-controller="MyCtrl">
<div class="gridStyle" ng-grid="gridOptions"></div>
<div class="selectedItems"> {{mySelections}} </div>
</body>
</html>
The view source of index.html as served by Flask:
<!DOCTYPE html>
<html ng-app="myApp">
<head lang="en">
<meta charset="utf-8">
<title>Custom Plunker</title>
<link rel="stylesheet" type="text/css" href="http://angular-ui.github.com/ng-grid/css/ng-grid.css" />
<link rel="stylesheet" type="text/css" href="../static/css/style.css" />
<script src="https://ajax.googleapis.com/ajax/libs/jquery/1.8.0/jquery.min.js"></script>
<script src="https://ajax.googleapis.com/ajax/libs/angularjs/1.0.2/angular.min.js"></script>
<script type="text/javascript" src="http://angular-ui.github.com/ng-grid/lib/ng-grid.debug.js"></script>
<script type="text/javascript" src="../static/js/newAppMain.js"></script>
</head>
<body ng-controller="MyCtrl">
<div class="gridStyle" ng-grid="gridOptions"></div>
<div class="selectedItems"></div>
</body>
</html>
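Comparing the two listings, the only difference is that `{{mySelections}}` has vanished from the served page. Flask renders index.html through Jinja2, which uses the same `{{ ... }}` delimiters as AngularJS and by default renders an undefined variable as an empty string. One common remedy is to wrap the Angular expressions in Jinja2's `{% raw %}` ... `{% endraw %}` block. The sketch below automates that; `protect_angular` is a hypothetical helper, and it assumes the template contains no genuine Jinja2 expressions, because it wraps every `{{ ... }}` it finds.

```python
import re

# An AngularJS interpolation such as {{mySelections}} or {{ item.name }}.
ANGULAR_EXPR = re.compile(r"\{\{.*?\}\}")


def protect_angular(template_text):
    """Wrap every {{ ... }} in {% raw %}...{% endraw %} so Jinja2 leaves it alone."""
    return ANGULAR_EXPR.sub(
        lambda m: "{% raw %}" + m.group(0) + "{% endraw %}", template_text
    )


line = '<div class="selectedItems"> {{mySelections}} </div>'
print(protect_angular(line))
```

For a single expression it is simpler to add the `{% raw %}` markers to index.html by hand.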