I've been trying to get a cookie and post it to a URL for later use in the program, but I can't seem to get the cookie parameters to work.
Right now I have
response = requests.get("url")
But how exactly do I retrieve cookies from this URL and post them (the same cookies) to a new URL? The tutorial in requests is somewhat vague on the topic and gives examples I cannot test. Hope someone can help with further examples.
This is Python 2.7, by the way.
You want to use a session:
s = requests.session()
response = s.get('url')
You use the session just like the requests module (it has the same methods), but it'll retain cookies for you and send them along on future requests.
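For example, here is a minimal sketch (the URLs and form field are placeholders) of grabbing cookies from one page and sending the same cookies with a POST to another:
import requests

# The session keeps a cookie jar and reuses it for every request you make with it.
s = requests.session()

# Any Set-Cookie headers from this response are stored in s.cookies.
first = s.get('http://example.com/login')
print(s.cookies.get_dict())

# The stored cookies are sent automatically along with this POST.
second = s.post('http://example.com/submit', data={'field': 'value'})
print(second.status_code)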
How can I know if a website is using Apache, Nginx, or something else, and get this information in Python? Thanks in advance.
This information, if available, is given in the headers of the response to an HTTP request. With Python you can perform HTTP requests using the requests module.
Make a simple GET request to the site you are interested in and then print the headers attribute of the returned object.
import requests
r = requests.get(YOUR_SITE)
print(r.headers)
The output is a dictionary of keys and values; you have to look for the Server key:
server = r.headers['Server']
Be aware that not all websites return this information, for several reasons, so you may not find this key in the response headers.
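For example, a small sketch (httpbin.org is just a stand-in for YOUR_SITE) that handles the case where the header is missing:
import requests

r = requests.get('https://httpbin.org/get')

# headers.get() returns None instead of raising a KeyError when the header is absent.
server = r.headers.get('Server')
if server:
    print('Server header:', server)
else:
    print('This site does not expose a Server header.')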
I am having some trouble running the Freedom of Information Act (FOIA) API in Python. I am sure it is related to how I am implementing my API key, but I am uncertain as to where I am dropping the ball. Any help is greatly appreciated.
import requests
apikey= ''
api_base_url = f"https://api.foia.gov/api/webform/submit"
endpoint = f"{api_base_url}{apikey}"
r = requests.get(endpoint)
print(r.status_code)
print(r.text)
The error I receive is requests.exceptions.InvalidSchema: No connection adapters were found for this website. Thanks again.
According to the documentation, the API requires the API key to be passed as a request header ("X-API-Key"). Your Python code appears to simply concatenate the API key onto the URL.
The following Q&A explains how to set a request header using requests.
Using headers with the Python requests library's get method
It would be something like this:
import requests
apikey = ...
api_base_url = ...
r = requests.get(api_base_url,
                 headers={"X-API-Key": apikey})
print(r.status_code)
print(r.text)
Note that the documentation for the FOIA site explains what you need to do to submit a FOIA request form. It is significantly different from what your Python code is apparently trying to do. I would advise you to read the documentation. Also read the manual entry for the curl command so that you understand the requests that the examples show.
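Purely as an illustration of the shape of such a request, not of the real form fields: the endpoint is the one from your question, the X-API-Key header is as above, and the payload keys below are made up; the actual field names and body format come from the FOIA documentation.
import requests

apikey = 'YOUR_API_KEY'
endpoint = 'https://api.foia.gov/api/webform/submit'

# Hypothetical payload: the real field names depend on the agency form you are submitting.
payload = {'name_first': 'Jane', 'name_last': 'Doe', 'request_description': '...'}

r = requests.post(endpoint,
                  headers={'X-API-Key': apikey},
                  json=payload)
print(r.status_code)
print(r.text)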
I took a programming class in Python, so I know the basics of the language. A project I'm currently attempting involves submitting a form repeatedly until the request is successful. In order to achieve success faster, I thought cutting the browser out of the process by directly sending and receiving data from the server would be quicker. Also, the website I'm creating the program for has a tendency to crash, but I'm pretty sure I could still send requests to and receive responses from the server. Currently, I'm just researching different resources I could use to complete the task. I understand mechanize makes it easy to fill forms and submit them, but it requires a browser. So my question is: what would be the best resource to use within Python to communicate directly with the server without a browser?
I apologize if any of my knowledge is flawed. I did take the class but I'm still relatively new to the language.
Yes, there are plenty of ways to do this, but the easiest is the third-party library called requests.
With that installed, you can do for example:
requests.post("https://mywebsite/path", {"key: "value"})
You can also try the approach below, which uses only the standard library.
from urllib.parse import urlencode
from urllib.request import Request, urlopen
url = 'https://httpbin.org/post' # Set destination URL here
post_fields = {'foo': 'bar'} # Set POST fields here
request = Request(url, urlencode(post_fields).encode())
json = urlopen(request).read().decode()
print(json)
I see from your tags that you've already decided to use requests.
Here's how to perform a basic POST request using requests:
Typically, you want to send some form-encoded data, much like an HTML form. To do this, simply pass a dictionary to the data argument. Your dictionary of data will automatically be form-encoded when the request is made.
import requests
payload = {'key1': 'value1', 'key2': 'value2'}
response = requests.post("http://httpbin.org/post", data=payload)
print(response.text)
I took this example from the official requests documentation.
I suggest you read it and also try the other examples available there, in order to become more confident and decide which approach suits your task best.
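If the endpoint you are talking to expects JSON instead of form data, requests can encode that for you too; here is a minimal sketch against httpbin:
import requests

payload = {'key1': 'value1', 'key2': 'value2'}

# json= serializes the dict and sets the Content-Type header to application/json.
response = requests.post('https://httpbin.org/post', json=payload)
print(response.status_code)
print(response.json()['json'])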
I recently wanted to extract data from a website that seems to use cookies to grant me access. I do not know very much about those procedures, but apparently this interferes with my method of getting the HTML content of the website via Python and its requests module.
The code I am running to extract the information contains the following lines:
import requests
#...
response = requests.get(url, proxies=proxies)
content = response.text
Where the website I am referring to is http://ieeexplore.ieee.org/xpls/icp.jsp?arnumber=6675630&tag=1 and proxies is a valid dict of my proxy servers (I tested those settings on websites that seemed to work fine). However, instead of the content of the article on this site, I receive the HTML content of the page that you get when you do not accept cookies in your browser.
As I am not really aware of what the website is doing and lack real web development experience, I could not find a solution so far, even if a similar question might have been asked before. Is there any solution to access the content of this website via Python?
# Fetch the login page first so the server sets its cookies on the response...
startr = requests.get('https://viennaairport.com/login/')
# ...then pass those cookies along explicitly with the follow-up request.
secondr = requests.post('http://xxx/', cookies=startr.cookies)
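Alternatively, a requests session (same placeholder URLs as above) keeps the cookie jar for you, so you do not have to pass cookies around by hand:
import requests

s = requests.session()
s.get('https://viennaairport.com/login/')  # cookies set here are stored in s.cookies
secondr = s.post('http://xxx/')            # and sent automatically on this request
print(secondr.status_code)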
I am trying to access and parse a website at work using Python. The site's authorization is done via SiteMinder, so the usual urllib/urllib2 username/password approach does not work.
Does anyone have an idea how to do that?
Thanks
Just did this. I know it's an oldie, but if anyone else is looking to do this, use the requests library. I had done this in C# before using a mammoth amount of code, but this is all it takes to log in to my corporate SiteMinder system. The requests.session() object will persist redirects, headers and cookies, so all you need to worry about is posting the login form. I'm sure the variables will be different in your environment, but the process will be the same.
output.text will be the body of the target page you wanted to parse, which you can then run through XPath or whatever else you like.
import requests

# The session persists redirects, headers and cookies (including the SiteMinder token).
r = requests.session()

# Post the login form; the field names come from the SiteMinder login page in your environment.
postUrl = "https://loginUrl"
params = {'USER': 'user',
          'PASSWORD': 'pass',
          'SMENC': 'ISO-8859-1',
          'SMLOCALE': 'US-EN',
          'target': '/redir.shtml?GOTO=redirecturl',
          'smauthreason': '0'}
r.post(postUrl, data=params)

# The authenticated session can now fetch pages behind the login.
getUrl = "http://urlFromBehindLogInYouWantDataFrom"
output = r.get(getUrl)
print(output.text)
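One way to sanity-check that the login actually worked, assuming a standard SiteMinder setup (SMSESSION is the conventional session cookie name, but deployments can rename it):
# After the login POST above, the cookie jar should hold the SiteMinder session cookie.
print('SMSESSION' in r.cookies)

# A 200 on the protected page (rather than a redirect back to the login form) is another good sign.
print(output.status_code)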
First of all, you should find out what's happening when you authenticate through SiteMinder. Perhaps there's documentation for it, but if not, it's not so hard to find out: the Network tab in Chrome's or Safari's developer tools has all the information you need, namely the HTTP headers and cookies for every network request. Firebug can give you that as well.
Once you have a clear idea of what's happening at each step of the authentication process, it's only a matter of replicating the same behavior in your script. urllib2 has support for cookies and headers. If you need something urllib2 doesn't provide, PycURL will probably do.
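For the cookie handling specifically, a minimal urllib2 sketch (Python 2, placeholder URLs) that keeps cookies between requests would look something like this:
import cookielib
import urllib2

# The CookieJar stores cookies from responses and replays them on later requests.
jar = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(jar))

# The first request picks up whatever cookies the login/entry page sets.
opener.open('https://example.com/login')

# Later requests through the same opener carry those cookies automatically.
response = opener.open('https://example.com/protected')
print(response.read())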
Agree with Martin: you need to just replicate what the browser does. SiteMinder will pass you a token once you've successfully authenticated. I have to do this as well; I will post once I find a good way.