Django project will not load in CSS files - python

I am following this tutorial on setting up a blog website with Django: https://www.youtube.com/watch?v=qDwdMDQ8oX4
When I try to load my CSS file, I get the same screen the presenter gets at 38:01, where he says the CSS isn't loading (I have restarted my server many times, so that isn't the issue). For reference, it should look like the screen at 38:51. Instead there are no colors and the header bar is mixed with the page header. I can open the CSS file from the page source in the browser, so it must be referenced correctly. Here is my relevant code.
The static file settings in settings.py (I originally had them exactly as in the video and changed them to this, with no effect):
STATIC_URL = '/static/'
STATIC_ROOT = os.path.join(BASE_DIR, 'static_files')
STATICFILES_DIRS = (
os.path.join(BASE_DIR, 'static'),
)
The template that tries to use the CSS file as a stylesheet:
{% load static %}
<!DOCTYPE html>
<html lang="en">
<head>
<!-- Required meta tags -->
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">
<!-- Bootstrap CSS -->
<link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/4.0.0/css/bootstrap.min.css" integrity="sha384-Gn5384xqQ1aoWXA+058RXPxPg6fy4IWvTNh0E263XmFcJlSAwiGgFAW/dAiS6JXm" crossorigin="anonymous">
<link rel="stytlesheet" type="text/css" href="{% static 'blog/main.css' %}">
My project directory
Here is the CSS code as well, in case it helps:
body {
background: #fafafa;
color: #333333;
margin-top: 5rem;
}
h1, h2, h3, h4, h5, h6 {
color: #00FBF9;
}
ul {
margin: 0;
}
.bg-steel {
background-color: #5f788a;
}
.site-header .navbar-nav .nav-link {
color: #cbd5db;
}
.site-header .navbar-nav .nav-link:hover {
color: #ffffff;
}
.site-header .navbar-nav .nav-link.active {
font-weight: 500;
}
.content-section {
background: #ffffff;
padding: 10px 20px;
border: 1px solid #dddddd;
border-radius: 3px;
margin-bottom: 20px;
}
.article-title {
color: #444444;
}
a.article-title:hover {
color: #428bca;
text-decoration: none;
}
.article-content {
white-space: pre-line;
}
.article-img {
height: 65px;
width: 65px;
margin-right: 16px;
}
.article-metadata {
padding-bottom: 1px;
margin-bottom: 4px;
border-bottom: 1px solid #e3e3e3
}
.article-metadata a:hover {
color: #333;
text-decoration: none;
}
.article-svg {
width: 25px;
height: 25px;
vertical-align: middle;
}
.account-img {
height: 125px;
width: 125px;
margin-right: 20px;
margin-bottom: 16px;
}
.account-heading {
font-size: 2.5rem;
}

Comment out STATIC_ROOT = 'static' and add the code below to your settings.py file:
BASE_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
STATIC_URL = '/static/'
STATICFILES_DIRS = (os.path.join(BASE_DIR, 'static'),)
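If it still does not work, run this command in a terminal (collectstatic needs STATIC_ROOT to be set, so restore that setting first, pointing it at a directory that is not listed in STATICFILES_DIRS):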
$ python manage.py collectstatic
For reference, go through the official Django documentation on static file configuration. It is a good starting point for those who are new to Django.
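One more thing worth checking, independent of the settings: in the template from the question, the second link tag has rel="stytlesheet" rather than rel="stylesheet". Browsers ignore link tags whose rel value they do not recognise, which would match the symptom that the file opens from the page source but no styles are applied. Below is a minimal sketch of a checker for this kind of typo, using only the standard library. The set KNOWN_REL_VALUES and the helper find_suspicious_links are hypothetical names chosen for illustration, and the set is deliberately incomplete.

```python
from html.parser import HTMLParser

# Assumption: a small, incomplete set of rel values used for illustration only.
KNOWN_REL_VALUES = {"stylesheet", "icon", "preload", "alternate", "canonical"}


class LinkRelChecker(HTMLParser):
    """Collects <link> tags whose rel attribute contains an unknown token."""

    def __init__(self):
        super().__init__()
        self.suspicious = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr_map = dict(attrs)
        rel = attr_map.get("rel") or ""
        for token in rel.lower().split():
            if token not in KNOWN_REL_VALUES:
                # Record the unknown token together with the href it applies to.
                self.suspicious.append((token, attr_map.get("href")))


def find_suspicious_links(html):
    """Return (rel_token, href) pairs for link tags with unrecognised rel tokens."""
    checker = LinkRelChecker()
    checker.feed(html)
    checker.close()
    return checker.suspicious
```

Feeding the rendered page source to find_suspicious_links would list the main.css link with its misspelled rel value, which is the quickest way to confirm whether the typo is present in what the browser actually receives.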

Related

python-requests-html Getting message: please enable JavaScript to continue using this application

Trying to access a website with requests, I'm getting this:
"Please enable JavaScript to continue using this application."
I'm trying to render the HTML so it executes the javascript:
from requests_html import HTMLSession
session = HTMLSession()
response = session.get("https://winbir180.com/tr")
response.html.render()
print(response.text)
Unfortunately, I'm getting the same output...
PS C:\Users\Adrian\Documents\GitHub\sitechecker> c:; cd 'c:\Users\Adrian\Documents\GitHub\sitechecker'; & 'C:\Users\Adrian\AppData\Local\Programs\Python\Python310\python.exe' 'c:\Users\Adrian\.vscode\extensions\ms-python.python-2022.12.0\pythonFiles\lib\python\debugpy\adapter/../..\debugpy\launcher' '13160' '--' 'c:\Users\Adrian\Documents\GitHub\sitechecker\winbir\main.py'
12/08/2022 16:36:21 Checking domain https://winbir180.com/tr
<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<title></title>
<base href="/">
<meta name="viewport" content="width=device-width, initial-scale=1">
<style type="text/css">@font-face {
font-family: 'Material Icons';
font-style: normal;
font-weight: 400;
src: url(https://fonts.gstatic.com/s/materialicons/v135/flUhRq6tzZclQEJ-Vdg-IuiaDsNa.woff) format('woff');
}
.material-icons {
font-family: 'Material Icons';
font-weight: normal;
font-style: normal;
font-size: 24px;
line-height: 1;
letter-spacing: normal;
text-transform: none;
display: inline-block;
white-space: nowrap;
word-wrap: normal;
direction: ltr;
font-feature-settings: 'liga';
}
/* fallback */
@font-face {
font-family: 'Material Icons';
font-style: normal;
font-weight: 400;
src: url(https://fonts.gstatic.com/s/materialicons/v135/flUhRq6tzZclQEJ-Vdg-IuiaDsNcIhQ8tQ.woff2) format('woff2');
}
.material-icons {
font-family: 'Material Icons';
font-weight: normal;
font-style: normal;
font-size: 24px;
line-height: 1;
letter-spacing: normal;
text-transform: none;
display: inline-block;
white-space: nowrap;
word-wrap: normal;
direction: ltr;
-webkit-font-feature-settings: 'liga';
-webkit-font-smoothing: antialiased;
}
</style>
<link rel="shortcut icon" href="favicon.ico"/>
<script>window.prerenderReady = false;</script>
<!--pwa-->
<link rel="stylesheet" href="styles.043585251e462b0b92ab.css"></head>
<body>
<app-root></app-root>
<noscript>Please enable JavaScript to continue using this application.</noscript>
<script src="runtime-es2015.b3cfabca7b61383db79a.js" type="module"></script><script src="runtime-es5.b3cfabca7b61383db79a.js" nomodule="" defer=""></script><script src="polyfills-es5.04780b623e528dbf95c3.js" nomodule="" defer=""></script><script src="polyfills-es2015.cd1663d4f2033cce4e98.js" type="module"></script><script src="scripts.a1eaeba2dc191ae84b8f.js" defer=""></script><script src="main-es2015.e61547b6166045b4396f.js" type="module"></script><script src="main-es5.e61547b6166045b4396f.js" nomodule="" defer=""></script></body>
</html>
<!-- Wed Aug 10 2022 17:59:17 GMT+0300 (GMT+03:00) -->
How can I access websites that block requests like this?
The URL I'm trying to read is https://winbir180.com/tr
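Two details may explain the output above. The text inside <noscript> is part of the page markup whether or not JavaScript runs, so seeing it in the printed HTML does not by itself mean the site blocked the request. Also, as far as I understand requests-html, response.text is the body as originally downloaded, while the markup produced by render() is exposed as response.html.html. Below is a minimal standard-library sketch for telling an unrendered application shell from a rendered page. ShellInspector and inspect_shell are hypothetical names, and the app-root tag is taken from the output above.

```python
from html.parser import HTMLParser


class ShellInspector(HTMLParser):
    """Tracks the <noscript> text and whether the app root element has content."""

    def __init__(self, root_tag="app-root"):
        super().__init__()
        self.root_tag = root_tag
        self.in_noscript = False
        self.in_root = False
        self.noscript_text = []
        self.root_has_content = False

    def handle_starttag(self, tag, attrs):
        if tag == "noscript":
            self.in_noscript = True
        elif tag == self.root_tag:
            self.in_root = True
        elif self.in_root:
            # Any element nested inside the root means something was rendered.
            self.root_has_content = True

    def handle_endtag(self, tag):
        if tag == "noscript":
            self.in_noscript = False
        elif tag == self.root_tag:
            self.in_root = False

    def handle_data(self, data):
        if self.in_noscript:
            self.noscript_text.append(data)
        elif self.in_root and data.strip():
            self.root_has_content = True


def inspect_shell(html, root_tag="app-root"):
    """Return (noscript_message, root_has_content) for the given markup."""
    inspector = ShellInspector(root_tag)
    inspector.feed(html)
    inspector.close()
    return "".join(inspector.noscript_text).strip(), inspector.root_has_content
```

If inspect_shell reports an empty root for response.text but a populated one for response.html.html, rendering worked and only the wrong attribute was printed. If both are empty, the application probably needs more time or is rejecting the headless browser.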

Background won't run with flask [duplicate]

I have this application with an animated background. When I open the HTML file directly in Chrome it works well, but when I run it with Flask the background is not applied; it is simply ignored.
It's probably something I missed, but I still can't understand why the background doesn't load.
HTML - 1:
<!doctype html>
<html>
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">
<link rel="stylesheet" href="https://stackpath.bootstrapcdn.com/bootstrap/4.3.1/css/bootstrap.min.css" integrity="sha384-ggOyR0iXCbMQv3Xipma34MD+dH/1fQ784/j6cY/iJTQUOhcWr7x9JvoRxT2MZw1T" crossorigin="anonymous">
<link rel="stylesheet" href="style.css">
<title>Covid-19</title>
</head>
<body>
<section>
<h1>Animated something</h1>
</section>
{% block content %}
{% endblock content %}
<script src="https://code.jquery.com/jquery-3.3.1.slim.min.js" integrity="sha384-q8i/X+965DzO0rT7abK41JStQIAqVgRVzpbzo5smXKp4YfRvH+8abtTE1Pi6jizo" crossorigin="anonymous"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/popper.js/1.14.7/umd/popper.min.js" integrity="sha384-UO2eT0CpHqdSJQ6hJty5KVphtPhzWj9WO1clHTMGa3JDZwrnQq4sF86dIHNDz0W1" crossorigin="anonymous"></script>
<script src="https://stackpath.bootstrapcdn.com/bootstrap/4.3.1/js/bootstrap.min.js" integrity="sha384-JjSmVgyd0p3pXB1rRibZUAYoIIy6OrQ6VrjIEaFf/nJGzIxFDsf4x0xIM+B07jRM" crossorigin="anonymous"></script>
</body>
</html>
CSS:
@import "https://fonts.googleapis.com/css?family=Lato:100";
*{
margin: 0;
padding: 0;
box-sizing: border-box;
}
html{
font-size: 10px;
font-family: "Lato", Arial, sans-serif;
}
section{
width: 100%;
height: 100vh;
color: #fff;
background: linear-gradient(-45deg, #EE7752, #E73C7E, #23A6D5, #23D5AB);
background-size: 400% 400%;
position: relative;
animation: change 10s ease-in-out infinite;
}
h1{
font-size: 5rem;
text-transform: uppercase;
letter-spacing: 2px;
border: 3px solid #fff;
position: absolute;
top: 50%;
left: 50%;
transform: translate(-50%, -50%);
padding: 5rem 10rem;
}
@keyframes change{
0%{
background-position: 0 50%;
}
50%{
background-position: 100% 50%;
}
100%{
background-position: 0 50%;
}
}
Python:
from flask import Flask, render_template, request


def create_app():
    app = Flask(__name__)

    @app.route('/', methods=['POST', 'GET'])
    def home():
        return render_template("base.html")

    return app
This happens because the server cannot locate the style.css file. To fix it, make sure your static files are inside a folder named 'static' next to your Flask application's Python file, something like this:
/app
    - app.py
    /templates
        - base.html
    /static
        - style.css
Then link the file in your templates using:
<link rel="stylesheet" href="{{ url_for('static',filename='style.css') }}">
Check Flask's docs for more details
Good luck :)
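To see why the original href="style.css" fails, it helps to compute the URL the browser actually requests. Below is a small sketch using urllib.parse.urljoin. It assumes the page is served at the root of Flask's development server on port 5000 and that the static folder is exposed under /static (Flask's default), and resolve_href is a hypothetical helper name.

```python
from urllib.parse import urljoin


def resolve_href(page_url, href):
    """Resolve an href the way a browser does: relative to the page URL."""
    return urljoin(page_url, href)


# Assumed page URL: the home route of a local Flask development server.
page_url = "http://127.0.0.1:5000/"

# The relative href from the question points at a URL that no route handles.
print(resolve_href(page_url, "style.css"))
# http://127.0.0.1:5000/style.css

# The path generated by url_for('static', filename='style.css') resolves
# to the location where Flask serves files from the static folder.
print(resolve_href(page_url, "/static/style.css"))
# http://127.0.0.1:5000/static/style.css
```

Opening the file directly in Chrome works because the browser then resolves style.css against the folder on disk, whereas under Flask it resolves against the URL of the route.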

Not Found: /style.css/ , django

I'm trying to create my own website on Django, but some problems stop me and I can't solve them myself.
I want to create a sidebar. I found a website with css and HTML code for it.
style.css:
@import url('https://fonts.googleapis.com/css?family=Montserrat:600|Open+Sans:600&display=swap');
*{
margin: 0;
padding: 0;
text-decoration: none;
}
.sidebar{
position: fixed;
width: 240px;
left: -240px;
height: 100%;
background: #1e1e1e;
transition: all .5s ease;
}
.sidebar header{
font-size: 28px;
color: white;
line-height: 70px;
text-align: center;
background: #1b1b1b;
user-select: none;
font-family: 'Montserrat', sans-serif;
}
.sidebar a{
display: block;
height: 65px;
width: 100%;
color: white;
line-height: 65px;
padding-left: 30px;
box-sizing: border-box;
border-bottom: 1px solid black;
border-top: 1px solid rgba(255,255,255,.1);
border-left: 5px solid transparent;
font-family: 'Open Sans', sans-serif;
transition: all .5s ease;
}
a.active,a:hover{
border-left: 5px solid #b93632;
color: #b93632;
}
.sidebar a i{
font-size: 23px;
margin-right: 16px;
}
.sidebar a span{
letter-spacing: 1px;
text-transform: uppercase;
}
#check{
display: none;
}
label #btn,label #cancel{
position: absolute;
cursor: pointer;
color: white;
border-radius: 5px;
border: 1px solid #262626;
margin: 15px 30px;
font-size: 29px;
background: #262626;
height: 45px;
width: 45px;
text-align: center;
line-height: 45px;
transition: all .5s ease;
}
label #cancel{
opacity: 0;
visibility: hidden;
}
#check:checked ~ .sidebar{
left: 0;
}
#check:checked ~ label #btn{
margin-left: 245px;
opacity: 0;
visibility: hidden;
}
#check:checked ~ label #cancel{
margin-left: 245px;
opacity: 1;
visibility: visible;
}
@media(max-width : 860px){
.sidebar{
height: auto;
width: 70px;
left: 0;
margin: 100px 0;
}
header,#btn,#cancel{
display: none;
}
span{
position: absolute;
margin-left: 23px;
opacity: 0;
visibility: hidden;
}
.sidebar a{
height: 60px;
}
.sidebar a i{
margin-left: -10px;
}
a:hover {
width: 200px;
background: inherit;
}
.sidebar a:hover span{
opacity: 1;
visibility: visible;
}
}
sidebar.html
<!DOCTYPE html>
<html lang="en" dir="ltr">
<head>
<meta charset="utf-8">
<title>Responsive Sidebar Menu</title>
<link rel="stylesheet" href='style.css'/>
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<script src="https://kit.fontawesome.com/a076d05399.js"></script>
</head>
<body>
<input type="checkbox" id="check">
<label for="check">
<i class="fas fa-bars" id="btn"></i>
<i class="fas fa-times" id="cancel"></i>
</label>
<div class="sidebar">
<header>My Menu</header>
<a href="#" class="active">
<i class="fas fa-qrcode"></i>
<span>Dashboard</span>
</a>
<a href="#">
<i class="fas fa-link"></i>
<span>Shortcuts</span>
</a>
<a href="#">
<i class="fas fa-stream"></i>
<span>Overview</span>
</a>
<a href="#">
<i class="fas fa-calendar"></i>
<span>Events</span>
</a>
<a href="#">
<i class="far fa-question-circle"></i>
<span>About</span>
</a>
<a href="#">
<i class="fas fa-sliders-h"></i>
<span>Services</span>
</a>
<a href="#">
<i class="far fa-envelope"></i>
<span>Contact</span>
</a>
</div>
</body>
</html>
All of this works if I run it in an online snippet editor, or if I open sidebar.html from PyCharm in Google Chrome, but when I start my site it doesn't work and gives an error: Not Found: /style.css/
Both files are in the same directory.
Easy fix, but you will need to dig into Django a bit more.
First, this tag <link rel="stylesheet" href=style.css/> will never work. The tag is wrong on a few levels. The big problem is that the location is relative: style.css is assumed to sit at the same directory level as whatever page or script is being requested. But wsgi.py or a similar file is actually the "root" that is "running" the site, and it has no idea where style.css exists and does not care.
Styles, images, JS, etc. are all stored as static assets in Django. They are referenced through a static template tag that resolves to the correct path per your configuration, for both local development and production. Take a look here: https://docs.djangoproject.com/en/3.1/howto/static-files/
I will not explain all the nuances; the Django site does a better job. Instead I will point out how Django differs from static sites or something like PHP. Django is an application running behind CGI (WSGI); there is only one "route" on the server, so to speak, and all data is served from this one entry point. PHP can, and typically does, serve data in a file-and-directory manner. In the PHP/static scenario the location of files is stable compared to Django. In Django the page, the URL, and the way data is served all come from one point. That means the relationship to static files will be different and not something you can or should control.
Django does not want you to ever serve static files through the CGI (WSGI); that wastes CPU and other resources and is slow for static content. So it has a static files system. When running locally with the configuration set up correctly and DEBUG=True, the static tag resolves to the necessary local path according to your configuration.
In production it is assumed that a CDN is used; in that case the static tag is replaced with the path (URL) to the static files on the CDN.
The approach Django uses is much more mature than, say, WordPress, where use of a CDN can be tricky (I have written custom CDN integrations for WP many times, not fun).
Walk through the link above, set up your configuration correctly and follow the rules. Django cares a lot about you following the rules; deviation will only cause pain. I have been primarily a Django dev for almost 10 years now. The problem you ran into got me real good in the beginning, but now CDN and static file management is second nature and definitely more productive than other, less mature systems.
Set static root on your 'settings.py'.
You will want to change <link rel="stylesheet" href=style.css/> to <link rel="stylesheet" href="style.css"/>. Also, the path in which it is trying to find your style.css file is like this: (whatever path to your sidebar.html)/sidebar.html/style.css. I doubt that it is located there. If it is located in the same directory as your sidebar.html file, try <link rel="stylesheet" href="./style.css"/>
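The answers above refer to the static files configuration without showing it. Below is a minimal sketch of the relevant values, assuming style.css has been moved into a static/ directory at the project root. The directory names and the helper static_settings are assumptions for illustration, not taken from the question.

```python
from pathlib import Path


def static_settings(base_dir):
    """Return static-file settings for a Django project rooted at base_dir."""
    base_dir = Path(base_dir)
    return {
        # URL prefix under which static files are exposed.
        "STATIC_URL": "/static/",
        # Extra directories searched for static files during development.
        "STATICFILES_DIRS": [base_dir / "static"],
        # Target of `python manage.py collectstatic`; it must not be one of
        # the directories listed in STATICFILES_DIRS.
        "STATIC_ROOT": base_dir / "staticfiles",
    }


if __name__ == "__main__":
    # Hypothetical project root, used only to show the resulting values.
    for name, value in static_settings("/srv/mysite").items():
        print(name, "=", value)
```

In settings.py the same values would normally be written as module-level assignments, with BASE_DIR derived from the location of settings.py. The template then starts with {% load static %} and links the file with href="{% static 'style.css' %}" instead of a bare relative path.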

Python terminal in html?

You may want to downvote this before you read it because of the title (Edit: thank you for understanding), but other questions only answer how to run Python programs in HTML, whereas what I want is to use a Python terminal in HTML.
So there is a big question in my head: how exactly can I use a Python program in HTML as if it were a terminal?
Here is my project, it will be an interactive dictionary that you can study on vocabulary.
<!DOCTYPE html>
<html>
<head>
<meta name="viewport" content="width=device-width, initial-scale=1">
<style>
body {
font-family: Arial
}
* {
box-sizing: border-box;
}
/* The browser window */
.container {
border: 3px solid #f1f1f1;
border-top-left-radius: 4px;
border-top-right-radius: 4px;
}
/* Container for columns and the top "toolbar" */
.row {
padding: 10px;
background: #f1f1f1;
border-top-left-radius: 4px;
border-top-right-radius: 4px;
}
.scriptcontainer {
padding: 10px;
background: #ffffff;
border-top-left-radius: 4px;
border-top-right-radius: 4px;
}
.row2 {
padding: 5px;
background: #cc0000;
border-top-left-radius: 4px;
border-top-right-radius: 4px;
}
/* Create three unequal columns that floats next to each other */
.column {
float: left;
}
.left {
width: 15%;
}
.right {
width: 10%;
}
.middle {
width: 75%;
}
/* Clear floats after the columns */
.row:after {
content: "";
display: table;
clear: both;
}
/* Three dots */
.dot {
margin-top: 4px;
height: 12px;
width: 12px;
background-color: #bbb;
border-radius: 50%;
display: inline-block;
}
/* Style the input field */
input[type=text] {
width: 100%;
border-radius: 3px;
border: none;
background-color: white;
margin-top: -8px;
height: 25px;
color: #666;
padding: 5px;
}
.bar {
width: 17px;
height: 3px;
background-color: #aaa;
margin: 3px 0;
display: block;
}
/* Page content */
.content {
padding: 10px;
}
</style>
</head>
<body>
<div class="container">
<div class="row">
<div class="column middle">
<h4><font size="6" type="Times">VocaDict</font></h4>
</div>
</div>
<div class="content">
<h3>=>Your dictionary:</h3>
<div class="row2"><div class="content"><div class="scriptcontainer"> <script>//This is where your dictionary will go!</script></div></div></div>
</div>
<div class="content">
<h3>=>Study lists:</h3>
<div class="row2"><div class="content"><div class="scriptcontainer"> <script>//This is where your dictionary will go!</script></div></div></div>
</div><br><br><br><br><br><br><br><br> <br><br><br><br><br><br><br> <br><br><br><br><br><br><br><br><br><br>
</div>
</body>
</html>
What I'm asking about is these places:
<div class="content"><div class="scriptcontainer"> <script>//This is the place I want python terminal to be!</script>
I want these places to act like a Python terminal. When the page loads it should run the program; it should not affect anything else and should behave independently, but stay on the page and remain interactive.
Is that possible, and if so, how?
Thank you!
What are you trying to do here exactly? If you really want to run a Python script with HTML, then go for CGI. As you can't run Python directly in the browser, you may have to use one of the following:
http://karrigell.sourceforge.net/en/pythoninsidehtml.html
or
http://www.skulpt.org/
But the best way would be to use Python CGI programming: https://www.tutorialspoint.com/python/python_cgi_programming.htm
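If the terminal is meant to run on the server (the CGI route suggested above), the page would send each typed line to Python and display whatever comes back. Below is a minimal sketch of that server-side half using the standard library's code.InteractiveConsole. WebConsoleSession and run_line are hypothetical names, the HTTP layer is left out, and executing code typed by visitors is unsafe without sandboxing, so treat this as an illustration only.

```python
import code
import contextlib
import io


class WebConsoleSession:
    """Keeps one interpreter state and evaluates input one line at a time."""

    def __init__(self):
        # Each session gets its own namespace, like a separate terminal window.
        self._console = code.InteractiveConsole()

    def run_line(self, line):
        """Run one line; return (captured_output, needs_more_input)."""
        buffer = io.StringIO()
        # Capture both normal output and tracebacks so they can be sent back
        # to the page instead of being written to the server's terminal.
        with contextlib.redirect_stdout(buffer), contextlib.redirect_stderr(buffer):
            more = self._console.push(line)
        return buffer.getvalue(), more


if __name__ == "__main__":
    session = WebConsoleSession()
    session.run_line("word = 'vocabulary'")
    output, _ = session.run_line("word.upper()")
    print(output, end="")
```

The needs_more_input flag lets the page show a continuation prompt while a multi-line statement such as a function definition is still being typed.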
It is impossible.
I had this question myself before. Clearly the intention of the type attribute was to make room for other scripting languages; at least that is how I see it:
<script type="text/javascript">
<script type="text/python">
It would require a browser that implements this. Maybe this engine: https://www.gnu.org/software/pythonwebkit/

How to Use Splash (JS Rendering Service) with a Proxy

A proxy is configured automatically in Scrapy, but not with curl or a normal request.
With curl, we can do this without any proxy:
http://<server_ip>:8050/render.html?url=http://www.example.com/?timeout=10&wait=0.5
How can we do it with a proxy?
I tried this:
http://<server_ip>:8050/render.html?url=http://www.example.com/?timeout=10&wait=0.5 --proxy myproxy:port
But I got:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>Lightspeed Systems - Web Access</title>
<style type="text/css">
html {
background: #13396b; /* Old browsers */
/* IE9 SVG, needs conditional override of 'filter' to 'none' */
background: url(data:image/svg+xml;base64,PD94bWwgdmVyc2lvbj0iMS4wIiA/Pgo8c3ZnIHhtbG5zPSJodHRwOi8vd3d3LnczLm9yZy8yMDAwL3N2ZyIgd2lkdGg9IjEwMCUiIGhlaWdodD0iMTAwJSIgdmlld0JveD0iMCAwIDEgMSIgcHJlc2VydmVBc3BlY3RSYXRpbz0ibm9uZSI+CiAgPGxpbmVhckdyYWRpZW50IGlkPSJncmFkLXVjZ2ctZ2VuZXJhdGVkIiBncmFkaWVudFVuaXRzPSJ1c2VyU3BhY2VPblVzZSIgeDE9IjAlIiB5MT0iMCUiIHgyPSIwJSIgeTI9IjEwMCUiPgogICAgPHN0b3Agb2Zmc2V0PSIwJSIgc3RvcC1jb2xvcj0iIzEzMzk2YiIgc3RvcC1vcGFjaXR5PSIxIi8+CiAgICA8c3RvcCBvZmZzZXQ9IjEwMCUiIHN0b3AtY29sb3I9IiMzZTY1OTkiIHN0b3Atb3BhY2l0eT0iMSIvPgogIDwvbGluZWFyR3JhZGllbnQ+CiAgPHJlY3QgeD0iMCIgeT0iMCIgd2lkdGg9IjEiIGhlaWdodD0iMSIgZmlsbD0idXJsKCNncmFkLXVjZ2ctZ2VuZXJhdGVkKSIgLz4KPC9zdmc+);
background: -moz-linear-gradient(top, #13396b 0%, #3e6599 100%); /* FF3.6+ */
background: -webkit-gradient(linear, left top, left bottom, color-stop(0%,#13396b), color-stop(100%,#3e6599)); /* Chrome,Safari4+ */
background: -webkit-linear-gradient(top, #13396b 0%,#3e6599 100%); /* Chrome10+,Safari5.1+ */
background: -o-linear-gradient(top, #13396b 0%,#3e6599 100%); /* Opera 11.10+ */
background: -ms-linear-gradient(top, #13396b 0%,#3e6599 100%); /* IE10+ */
background: linear-gradient(to bottom, #13396b 0%,#3e6599 100%); /* W3C */
filter: progid:DXImageTransform.Microsoft.gradient( startColorstr='#13396b', endColorstr='#3e6599',GradientType=0 ); /* IE6-8 */
height: 100%;
}
body {
width: 960px;
overflow: hidden;
margin: 50px auto;
font-family: "HelveticaNeue-Light", "Helvetica Neue Light", "Helvetica Neue", Helvetica, Arial, "Lucida Grande", sans-serif;
font-size: 14px;
color: #a2c3ef;
}
h1,h2 {
color: #fff;
}
h1 {
font-size: 32px;
font-weight: normal;
}
h2 {
font-size: 24px;
font-weight: lighter;
}
a {
color: #fff;
font-weight: bold;
}
#content {
margin: 20px 0 20px 30px;
}
blockquote#error, blockquote#data {
color: #fff;
font-size: 16px;
}
#footer p {
font-size: 12px;
padding: 7px 12px;
margin-top: 10px;
color: #fff;
text-align: right;
}
</style>
<!--[if gte IE 9]>
<style type="text/css">
.gradient {
filter: none;
}
</style>
<![endif]-->
</head>
<body id=ERR_ACCESS_DENIED>
<div id="titles">
<h1>ERROR</h1>
<h2>Unable to complete URL request</h2>
</div>
<hr>
<div id="content">
<p>An error has occurred while trying to access http://<server_ip>:8050/render.html?.</p>
<blockquote id="error">
<p><b>Access denied.</b></p>
</blockquote>
<p>Security permissions are not allowing the request attempt. Please contact your service provider if you feel this is incorrect.</p>
</div>
<hr>
<div id="footer">
</div>
</body>
</html>
C:\Users\Dr. Printer>curl "http://<server_ip>:8050/render.html?url=http://www.example.com/?timeout=30&wait=0.5"
{"description": "Timeout exceeded rendering page", "type": "GlobalTimeoutError", "info": {"timeout": 30.0}, "error": 504}
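If we want to use Crawlera as the proxy, we can do it with the Lua script below. It appears to have been copied from a Python format string, so the doubled braces {{ and }} stand for single literal braces and {0} is a placeholder for the wait time in seconds: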
function use_crawlera(splash)
-- Make sure you pass your Crawlera API key in the 'crawlera_user' arg.
-- Have a look at the file spiders/quotes-js.py to see how to do it.
-- Find your Crawlera credentials in https://app.scrapinghub.com/
local user = splash.args.crawlera_user
local host = 'proxy.crawlera.com'
local port = 8010
local session_header = 'X-Crawlera-Session'
local session_id = 'create'
splash:on_request(function (request)
-- The commented code below can be used to speed up the crawling
-- process. They filter requests to undesired domains and useless
-- resources. Uncomment the ones that make sense to your use case
-- and add your own rules.
-- Discard requests to advertising and tracking domains.
if string.find(request.url, 'doubleclick%.net') or
string.find(request.url, 'analytics%.google%.com') then
request.abort()
return
end
-- Avoid using Crawlera for subresources fetching to increase crawling
-- speed. The example below avoids using Crawlera for URLS starting
-- with 'static.' and the ones ending with '.png'.
if string.find(request.url, '://static%.') ~= nil or
string.find(request.url, '%.png$') ~= nil then
return
end
request:set_header('X-Crawlera-Cookies', 'disable')
request:set_header(session_header, session_id)
request:set_proxy{{host, port, username=user, password=''}}
end)
splash:on_response_headers(function (response)
if response.headers[session_header] ~= nil then
session_id = response.headers[session_header]
end
end)
end
function main(splash)
use_crawlera(splash)
splash:init_cookies(splash.args.cookies)
assert(splash:go{{
splash.args.url,
headers=splash.args.headers,
http_method=splash.args.http_method,
}})
assert(splash:wait({0}))
return {{
html = splash:html(),
cookies = splash:get_cookies(),
}}
end
Don't forget to install scrapy-crawlera and activate it in the settings. For more information, please refer to https://support.scrapinghub.com/support/solutions/articles/22000188428-using-crawlera-with-splash-scrapy
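Regarding the curl attempt in the question: curl's --proxy option sends curl's own request to Splash through the proxy, which is presumably why the proxy's "Access denied" page came back. To have Splash itself fetch the page through a proxy, the Splash HTTP API documents a proxy argument for render.html, as far as I know; check the documentation for your Splash version. The target URL should also be percent-encoded so that timeout and wait are read as Splash arguments instead of becoming part of url. Below is a sketch that builds such a request URL. The function build_splash_render_url, the host splash-host and the proxy address are placeholders.

```python
from urllib.parse import urlencode


def build_splash_render_url(splash_base, target_url, proxy=None, timeout=10, wait=0.5):
    """Build a render.html URL with every argument percent-encoded."""
    params = {"url": target_url, "timeout": timeout, "wait": wait}
    if proxy is not None:
        # Assumption: Splash accepts a proxy URL in the 'proxy' argument.
        params["proxy"] = proxy
    return f"{splash_base}/render.html?{urlencode(params)}"


if __name__ == "__main__":
    # Placeholder host names; substitute your own Splash server and proxy.
    print(build_splash_render_url(
        "http://splash-host:8050",
        "http://www.example.com/",
        proxy="http://myproxy:8080",
    ))
```

The printed URL can be passed to curl in double quotes, without --proxy, so that curl talks to Splash directly and Splash talks to the proxy.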
