Download a file from https with authentication - python

I have a Python 2.6 script that downloads a file from a web server. I want this script to pass a username and password (for authentication before fetching the file), and I am passing them as part of the URL as follows:
import urllib2
response = urllib2.urlopen("http://'user1':'password'@server_name/file")
However, I am getting a syntax error in this case. Is this the correct way to go about it? I am pretty new to Python and coding in general.
Can anybody help me out?
Thanks!

If you can use the requests library, it's insanely easy. I'd highly recommend using it if possible:
import requests
url = 'http://somewebsite.org'
user, password = 'bob', 'I love cats'
resp = requests.get(url, auth=(user, password))
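Since the question is about downloading a file, here is a minimal follow-up sketch that writes the body to disk, reusing url, user, and password from above (the local filename 'file.out' is just an illustration):
# Stream the download so the whole file is never held in memory at once.
resp = requests.get(url, auth=(user, password), stream=True)
with open('file.out', 'wb') as fh:  # 'file.out' is an arbitrary local name
    for chunk in resp.iter_content(chunk_size=8192):
        fh.write(chunk)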

I suppose you are trying to get through Basic Authentication. In that case, you can handle it this way:
import urllib2

username = 'user1'
password = '123456'

# This should be the base url you want to access.
baseurl = 'http://server_name.com'

# Create a password manager.
manager = urllib2.HTTPPasswordMgrWithDefaultRealm()
manager.add_password(None, baseurl, username, password)

# Create an authentication handler using the password manager.
auth = urllib2.HTTPBasicAuthHandler(manager)

# Create an opener that will replace the default urlopen method on further calls.
opener = urllib2.build_opener(auth)
urllib2.install_opener(opener)

# Here you should access the full url you want to open.
response = urllib2.urlopen(baseurl + "/file")
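If you then want to save the downloaded file locally, the usual follow-up is a read() and a write (the local filename below is just an illustration):
# Save the response body to disk; 'file.out' is an arbitrary name.
data = response.read()
with open('file.out', 'wb') as out:
    out.write(data)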

Use the requests library and just put the credentials inside your .netrc file.
The library will load them from there, and you will be able to commit the code to your SCM of choice without any security worries.
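A minimal sketch, assuming a ~/.netrc entry for the host (the hostname and credentials below are placeholders):
# ~/.netrc (restrict permissions with chmod 600); placeholder values:
#   machine server_name.com
#   login user1
#   password 123456
import requests

# With no auth argument, requests falls back to ~/.netrc for a matching host.
resp = requests.get('https://server_name.com/file')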

Related

Log in to a site and navigate to other pages

I have a Python 2 script that logs in to a webpage and then moves inside it to reach a couple of files pointed to on the same site, but on different pages. Python 2 let me open the site with my credentials and then use opener.open() to keep the connection available while navigating to the other pages.
Here's the code that worked in Python 2:
import cookielib
import urllib
import urllib2

# Your admin login and password.
LOGIN = "*******"
PASSWORD = "********"
ROOT = "https:*********"

# The client has to take care of the cookies.
jar = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(jar))

# POST the login query to '/login_handler' (post data: 'login' and 'password').
req = urllib2.Request(ROOT + "/login_handler",
                      urllib.urlencode({'login': LOGIN,
                                        'password': PASSWORD}))
opener.open(req)

# Set the right accountcode; QUEUES is defined elsewhere in the original script.
for accountcode, queues in QUEUES.items():
    req = urllib2.Request(ROOT + "/switch_to" + accountcode)
    opener.open(req)
I need to do the same thing in Python 3. I have tried with the requests module and urllib, but although I can perform the first login, I don't know how to keep the opener around to navigate the rest of the site. I found OpenerDirector, but I haven't managed to make it work, so I haven't reached my goal.
I have used this Python 3 code to get the desired result, but unfortunately I can't get the csv file in order to print it:
[screenshot of the attempted Python 3 code]
Question: I don't know how to keep the opener to navigate the site.
Python 3.6 » Documentation: urllib.request.build_opener
Use of Basic HTTP Authentication:
import urllib.request

# Create an OpenerDirector with support for Basic HTTP Authentication...
auth_handler = urllib.request.HTTPBasicAuthHandler()
auth_handler.add_password(realm='PDQ Application',
                          uri='https://mahler:8092/site-updates.py',
                          user='klem',
                          passwd='kadidd!ehopper')
opener = urllib.request.build_opener(auth_handler)

# ...and install it globally so it can be used with urlopen.
urllib.request.install_opener(opener)
f = urllib.request.urlopen('http://www.example.com/login.html')
csv_content = f.read()
Use the Python requests library with a Session for Python 3.
http://docs.python-requests.org/en/master/user/advanced/#session-objects
Once you log in, your session will be managed automatically. You don't need to create your own cookie jar. The following is sample code:
import requests

s = requests.Session()
auth = {"login": LOGIN, "pass": PASS}
url = ROOT + "/login_handler"
r = s.post(url, data=auth)
print(r.status_code)

for accountcode, queues in QUEUES.items():
    req = s.get(ROOT + "/switch_to" + accountcode)
    print(req.text)  # response text
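If the goal is to get at the CSV the site serves, and assuming the response body is the CSV itself, the same session can fetch and save it (the path and filename below are hypothetical):
# '/the_csv_page' is a hypothetical path; substitute the real one.
r = s.get(ROOT + "/the_csv_page")
with open('report.csv', 'w') as fh:  # 'report.csv' is an arbitrary local name
    fh.write(r.text)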

REST call authentication in Python

Sorry for this basic question again, still in the learning stages of Python. I am writing a Python script that makes a REST call which will include basic authentication headers. In this example, the user is luke and the password is mypasswd. Since the password is written in clear text, is there a way to encrypt the password within the script, or to move authentication outside the script in a more secure way? What is the recommended way of authentication when using REST with Python?
import urllib2
import base64
import xml.etree.ElementTree as ET
weblink = "https://192.168.1.1/user"
auth = base64.b64encode("luke:mypasswd")
headers = {"Authorization":"Basic " + auth}
You'll have to put the credentials somewhere, so I think you are worried about distributing them with your script. This could be solved by:
1) Using a configuration file where you'd store the credentials (https://docs.python.org/2/library/configparser.html), as sketched after this list
2) Specifying them at the command line
3) Specifying them through environment variables.
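For option 1, a minimal sketch using the standard library parser; the file name and section layout are assumptions:
# credentials.ini (placeholder contents):
#   [auth]
#   username = luke
#   password = mypasswd
import ConfigParser  # the module is named 'configparser' in Python 3

config = ConfigParser.ConfigParser()
config.read('credentials.ini')
username = config.get('auth', 'username')
password = config.get('auth', 'password')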
My recommendation is to use the requests package (pip install requests).
http://docs.python-requests.org/en/latest/
Regarding the security of passwords, you can use environment variables, or a text file with adequate permissions.
In a Linux terminal or your .bashrc file: export mypasswd="*******"
import os
import base64
import requests

weblink = "https://192.168.1.1/user"
mypasswd = os.getenv("mypasswd")
auth = base64.b64encode("luke:" + str(mypasswd))
headers = {"Authorization": "Basic " + auth}
# In headers you can set more properties, such as Content-Type and so on...
# Next, call the HTTP method you need (GET, POST, PUT, DELETE).
resp = requests.get(weblink, headers=headers)
print resp.text
print resp.status_code
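For what it's worth, requests can build the Basic Authorization header itself through the auth parameter, which makes the manual base64 step unnecessary:
# Equivalent call letting requests construct the Basic auth header.
resp = requests.get(weblink, auth=("luke", mypasswd))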

REST GET request in Python sending a named argument

I have been playing on Linux using curl inside bash, but now I need to move my script to Python to be more effective.
What is the best way to do something like the following line in Python?
curl baseuri:port/resource -u user:pswd
I already played with urllib2, but I don't have a clue how to send the "-u user:pswd" part.
You'll need to use an HTTPPasswordMgrWithDefaultRealm instance to handle the authentication, and add an HTTPBasicAuthHandler handler to respond to the authentication challenge:
import urllib2

url = 'baseuri:port/resource'
username = 'user'
password = 'pswd'

pwmgr = urllib2.HTTPPasswordMgrWithDefaultRealm()
pwmgr.add_password(None, url, username, password)
authhandler = urllib2.HTTPBasicAuthHandler(pwmgr)
opener = urllib2.build_opener(authhandler)
response = opener.open(url)
Yes, this is a handful.
If you can install a third-party library, then use the requests library; it'll be so much easier:
import requests
url = 'baseuri:port/resource'
username = 'user'
password = 'pswd'
response = requests.get(url, auth=(username, password))
Actually, we use pycurl instead, which also works pretty well. Here's an example that gets JSON from the request:
import pycurl
import json
from io import BytesIO

data = BytesIO()
pyCurl = pycurl.Curl()
pyCurl.setopt(pycurl.URL, string_http)  # string_http holds the request URL
pyCurl.setopt(pycurl.USERPWD, 'user:password')
pyCurl.setopt(pycurl.WRITEFUNCTION, data.write)
pyCurl.perform()
pyCurl.close()
dictionary = json.loads(data.getvalue())

Python requests lib, is requests.Session equivalent to urllib2's opener?

I need to accomplish a login task in my own project. Luckily, I found that someone has done it already.
Here is the related code.
import re, urllib, urllib2, cookielib

class Login():
    cj = cookielib.LWPCookieJar()
    opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))

    def __init__(self, name='', password='', domain=''):
        self.name = name
        self.password = password
        self.domain = domain
        urllib2.install_opener(self.opener)

    def login(self):
        params = {'domain': self.domain, 'email': self.name, 'password': self.password}
        req = urllib2.Request(
            website_url,
            urllib.urlencode(params)
        )
        self.openrate = self.opener.open(req)
        print self.openrate.geturl()
        info = self.openrate.read()
I've tested the code, and it works great (judging from info).
Now I want to port it to Python 3 as well as use the requests lib instead of urllib2.
My thoughts:
Since the original code uses an opener, I think (though I'm not sure) that its equivalent in requests is requests.Session.
Am I supposed to pass in a jar = cookiejar.CookieJar() when making the request? Not sure either.
I've tried something like
import requests
from http import cookiejar
from urllib.parse import urlencode
jar = cookiejar.CookieJar()
s = requests.Session()
s.post(
    website_url,
    data = urlencode(params),
    allow_redirects = True,
    cookies = jar
)
Also, following the answer in Putting a `Cookie` in a `CookieJar`, I tried making the same request again, but none of these attempts worked.
That's why I'm here for help.
Will someone show me the right way to do this job? Thank you~
An opener and a Session are not entirely analogous, but for your particular use-case they match perfectly.
You do not need to pass a CookieJar when using a Session: Requests will automatically create one, attach it to the Session, and then persist the cookies to the Session for you.
You don't need to urlencode the data: requests will do that for you.
allow_redirects is True by default, so you don't need to pass that parameter.
Putting all of that together, your code should look like this:
import requests
s = requests.Session()
s.post(website_url, data = params)
Any future requests made using the Session you just created will automatically have cookies applied to them if they are appropriate.
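For example, a later request on the same Session is sent with the stored login cookies attached (the path below is just an illustration):
# Subsequent requests on the same Session carry the saved cookies.
profile = s.get(website_url + "/profile")  # '/profile' is a hypothetical page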

using python urlopen for a url query

Using urlopen for URL queries as well seems obvious. What I tried is:
import urllib2
query='http://www.onvista.de/aktien/snapshot.html?ID_OSI=86627'
f = urllib2.urlopen(query)
s = f.read()
f.close()
However, for this specific URL query it fails with HTTP Error 403: Forbidden.
When entering this query in my browser, it works.
Also when using http://www.httpquery.com/ to submit the query, it works.
Do you have suggestions how to use Python right to grab the correct response?
Looks like it requires cookies (which you can handle with urllib2), but an easier way, if you're doing this, is to use requests:
import requests
session = requests.session()
r = session.get('http://www.onvista.de/aktien/snapshot.html?ID_OSI=86627')
This is generally a much easier and less-stressful method of retrieving URLs in Python.
requests will automatically store and re-use cookies for you. Creating a session is slightly overkill here, but it is useful when you need to submit data to login pages or re-use cookies across a site.
Using urllib2, it would be something like:
import urllib2, cookielib

cookies = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cookies))
data = opener.open('url').read()
It appears that the urllib2 default user agent is banned by the host. You can simply supply your own user agent string:
import urllib2
url = 'http://www.onvista.de/aktien/snapshot.html?ID_OSI=86627'
request = urllib2.Request(url, headers={"User-Agent" : "MyUserAgent"})
contents = urllib2.urlopen(request).read()
print contents
