I have built a website with the Django framework (www.example.com). While navigating the site, the URL changes to something like www.example.com/home or www.example.com/profile.
Is there some way to mask the current URL with a placeholder, e.g. so that www.example.com/home is shown as www.example.com?
This should work throughout the website.
The URL shown to the user would stay the same (www.example.com) wherever the user navigates on the site.
HIGHLY NOT RECOMMENDED
Keep a single URL pattern in your urls.py, matching r'^$'.
Handle all requests purely as POST requests.
Note: You'll have to hard-code every parameter and manually detect where each request is coming from.
To learn how to work with POST requests in detail, check the Django docs.
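As a rough illustration of what this costs you, here is a framework-free sketch of the manual dispatching such a setup would need. The field name `page` and the handler functions are hypothetical; in Django, the equivalent logic would sit in the one view behind r'^$' and read its values from request.POST.

```python
def render_home(post):
    """Hypothetical handler for the home page."""
    return "home page"


def render_profile(post):
    """Hypothetical handler for the profile page."""
    return "profile for %s" % post.get("user", "anonymous")


# Every "page" has to be registered by hand, because the URL no longer
# tells the server which page the user wants.
PAGES = {"home": render_home, "profile": render_profile}


def single_entry_point(post):
    """Pick a handler based on a POST field instead of the URL path."""
    page = post.get("page", "home")
    handler = PAGES.get(page, render_home)  # unknown pages fall back to home
    return handler(post)
```

Every link on the site would then have to become a form submission carrying the `page` field, which is also why bookmarks, the back button and plain links stop behaving as users expect.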
I made a project that needs a special token from the VK social network. I pass the token along with the link. It looks like this:
http://127.0.0.1:8000/vk/auth#access_token=7138dcd74f5da5e557943b955bbfbd9a62811da7874067e5fa0edef1ca8680216755be16&expires_in=86400&user_id=397697636
But the problem is that Django cannot see this part of the link. I tried looking for it in a POST request and in a GET request, but everything is empty there. I tried to make it come not as a request but as a link, like this:
http://127.0.0.1:8000/vk/auth #access_token=7138dcd74f5da5e557943b955bbfbd9a62811da7874067e5fa0edef1ca8680216755be16&expires_in=86400&user_id=397697636
But Django does not want to read the space. Who can help?
I think there is a confusion between a query string (GET parameters), which follows a ?, and a fragment (the text that follows a #).
What follows the # is not sent to the server (and thus not received by Django). It is only useful to the web browser and to the JavaScript executed in the browser, which can use it to update parts of the screen or as virtual URLs / bookmarks for single-page web applications.
The JavaScript can of course also trigger AJAX requests using that data, but that's up to the JavaScript.
However, if you write http://127.0.0.1:8000/vk/auth?access_token=7138dcd74f5da5e557943b955bbfbd9a62811da7874067e5fa0edef1ca8680216755be16&expires_in=86400&user_id=397697636 (replacing # with ?),
then you can receive the information in your Django view with
request.GET["access_token"], request.GET["expires_in"] and request.GET["user_id"]
If it really is a #, then your JavaScript should parse whatever follows the # and make the appropriate AJAX requests to the server to send / validate the token.
For another question about fragments, see for example "Is the URL fragment identifier sent to the server?"
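To make the difference concrete, here is a small sketch using Python's standard urllib.parse. The function name and the shortened token value `abc` are made up for illustration; the point is only that the fragment never becomes part of what is sent to the server.

```python
from urllib.parse import parse_qs, urlsplit


def what_the_server_sees(url):
    """Return (request target, query params, fragment params) for a URL.

    Browsers drop the fragment before sending the request, so only the
    path and the query string ever reach the server.
    """
    parts = urlsplit(url)
    target = parts.path + ("?" + parts.query if parts.query else "")
    return target, parse_qs(parts.query), parse_qs(parts.fragment)


with_fragment = what_the_server_sees(
    "http://127.0.0.1:8000/vk/auth#access_token=abc&expires_in=86400"
)
with_query = what_the_server_sees(
    "http://127.0.0.1:8000/vk/auth?access_token=abc&expires_in=86400"
)
```

With the #, the request target is just /vk/auth and the query parameters are empty, which matches the empty request.GET seen in the view. With the ?, the same values show up as query parameters.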
From this question, the last responder seems to think that it is possible to use Python to open a webpage, let me sign in manually and go through a bunch of menus, and then let Python parse the page once I get where I want. The website has a weird sign-in procedure, so using requests and passing a username and password will not be sufficient.
However, it seems from this question that it's not possible.
So the question is: is it possible? If so, do you know of some example code out there?
The way to approach this problem is to log in normally with the developer tools open next to you and see what the request is sending.
When logging in to bandcamp the XHR request that's being sent is the following:
From that response you can see that an identity cookie is being sent. That's probably how they identify that you are logged in. So when you've got that cookie set you would be authorized to view logged in pages.
So in your program you could log in normally using requests, save the cookie in a variable, and then apply the cookie to further requests made with requests.
Of course login procedures and how this authorization mechanism works may differ, but that's the general gist of it.
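The "save the cookie and send it again" step can be sketched with the standard library alone. This is only an illustration of the idea, not bandcamp's actual mechanism: the cookie value `abc123` and the URL https://example.com/private are placeholders. With requests, a requests.Session object does this bookkeeping for you across calls.

```python
from http.cookies import SimpleCookie
from urllib.request import Request


def cookie_header_from_set_cookie(set_cookie_value):
    """Turn a Set-Cookie response header into a Cookie request header value."""
    jar = SimpleCookie()
    jar.load(set_cookie_value)
    # Only name=value pairs go back to the server; attributes such as
    # Path are instructions for the client and are not echoed.
    return "; ".join("%s=%s" % (name, morsel.value) for name, morsel in jar.items())


def build_authenticated_request(url, set_cookie_value):
    """Build (but do not send) a request that carries the login cookie."""
    request = Request(url)
    request.add_header("Cookie", cookie_header_from_set_cookie(set_cookie_value))
    return request


# Placeholder values standing in for what the login response returned.
request = build_authenticated_request(
    "https://example.com/private", "identity=abc123; Path=/"
)
```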
So when do you actually need Selenium? You need it if a lot of the content is rendered by JavaScript. requests is only able to get the HTML. So if the menus and such are rendered with JavaScript, you won't ever be able to see that information using requests.
I am writing a web scraping application. When I enter the URL directly into a browser, it displays the JSON data I want.
However, if I use Python's requests library, or URLDownloadToFile in C++, it simply downloads the HTML for the login page.
The site I am trying to scrape it from (DraftKings.com) requires a login. The other sites I scrape from don't.
I am 100% sure this is related, since if I paste the url when I am logged out, I get the login page, rather than the JSON data. Once I log in, if I paste the URL again, I get the JSON data again.
The thing is that if I remain logged in and then use the Python script or C++ app to download the JSON data, as mentioned, it downloads the login HTML.
Does anyone know how I can fix this issue?
Please don't ask us to help with an activity that violates the terms of service of the site you are trying to (ab-)use:
Using automated means (including but not limited to harvesting bots, robots, parser, spiders or screen scrapers) to obtain, collect or access any information on the Website or of any User for any purpose.
Even if that kind of usage were allowed, the answer would be boring:
You'd need to implement the login functionality in your scraper.
I'm trying to scrape information from How Long to Beat. How can I make a request for a search without having to put the search term in the URL?
EDIT for clarity:
The problem I face is that the site doesn't use something like http://www.howlongtobeat.com/search.php?s=search-term, therefore I cannot do something like
url = 'http://www.howlongtobeat.com/search.php?s='
search_term = raw_input("Search: ")
r = requests.get(url + search_term)
In other words, when you type the search term in the search dialog, the site neither refreshes nor changes the URL, so I can't find a way to search from outside the site.
I'm sorry if I made grammar mistakes; English is not my first language.
This is because the page is driven by AJAX requests: it updates automatically without redirecting you to a visible URL.
If you open the developer tools in your browser (F12) and navigate to the Network panel, you will see that requests are indeed sent to the server. I typed "test2" and got the following:
As you can see, the request is sent to a URL that looks like this: http://www.howlongtobeat.com/search_main.php?t=games&page=1&sorthead=popular&sortd=Normal%20Order&plat=&detail=0.
I typed "test2", but it's nowhere to be seen.
That's because it was sent using a POST request, i.e. the parameters were embedded in the body of the HTTP request, not in the URL. When I navigated to the "Params" tab in the developer tools, I could indeed see my input:
queryString: "test2"
So in order to use this search form, you should send a POST request to that URL containing the variable "queryString" set to whatever value you need.
I strongly encourage asking the site owners about an API, though. Using publicly available forms that were designed for end users in an automated fashion is considered unethical.
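For reference, the POST request described above could be built roughly like this with the standard library. It is a sketch based only on what the developer tools showed at the time: the endpoint and the "queryString" field may have changed since, and the function name is made up. The request is constructed but deliberately not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# URL observed in the Network panel; the search term is not part of it.
SEARCH_URL = (
    "http://www.howlongtobeat.com/search_main.php"
    "?t=games&page=1&sorthead=popular&sortd=Normal%20Order&plat=&detail=0"
)


def build_search_request(search_term):
    """Put the search term in the POST body as the form field 'queryString'."""
    body = urlencode({"queryString": search_term}).encode("ascii")
    return Request(SEARCH_URL, data=body, method="POST")


request = build_search_request("test2")
# Sending it is left to the caller, e.g. urllib.request.urlopen(request).
```

With the requests library, the equivalent call would be requests.post(SEARCH_URL, data={"queryString": "test2"}).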
I have a Django template that calls a Python method, but I can't seem to find a way for the Python method to retrieve information from the template, such as user input. Could anybody tell me how to do this?
I think you're under a misconception (a reasonably common one) that because you can use Django variables in the HTML template, the template can "send" information back to Django/the database.
Instead, try to get into the habit of thinking in terms of request/response. The user requests a web page, the Django server builds up the response using content from the database and an HTML template (via the template language), and serves it to the user. If the response page has a form in it, that's not the "template" sending information back to the Django server; that's the response web page providing the user with tools to make ANOTHER request (this time a POST request).
The template is merely a generic container that you fill up with your dynamic content to build up the entire HTTP response.
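The two-request cycle can be sketched without any framework, using a plain WSGI callable instead of Django. The field name `username` and the helper `call` are invented for the example. In Django, the same branching would live in a view that checks request.method and reads request.POST.

```python
from html import escape
from io import BytesIO
from urllib.parse import parse_qs

# First response: a page whose form lets the user make a second request.
FORM_PAGE = b'<form method="post"><input name="username"><button>Send</button></form>'


def app(environ, start_response):
    """Serve the form on GET; read the submitted input on POST."""
    if environ["REQUEST_METHOD"] == "POST":
        size = int(environ.get("CONTENT_LENGTH") or 0)
        fields = parse_qs(environ["wsgi.input"].read(size).decode("utf-8"))
        name = fields.get("username", [""])[0]
        body = ("Hello, %s" % escape(name)).encode("utf-8")
    else:
        body = FORM_PAGE
    start_response("200 OK", [("Content-Type", "text/html; charset=utf-8")])
    return [body]


def call(method, payload=b""):
    """Tiny test helper that fakes a single request to the app."""
    environ = {
        "REQUEST_METHOD": method,
        "CONTENT_LENGTH": str(len(payload)),
        "wsgi.input": BytesIO(payload),
    }
    return b"".join(app(environ, lambda status, headers: None))
```

The user input only ever reaches Python as part of the second request's body; the page that displayed the form is long gone by then.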