I looked at this answer: Again urllib.error.HTTPError: HTTP Error 400: Bad Request, because it was very similar to my question, but that solution did not work. I'm using Python 3.3.2. Lines like ..something.. are just values I replaced to protect my privacy; they should be strings.
I'm getting an Error 400: Bad request from this code:
import urllib.parse
import urllib.request
url = '..url..'
user_agent = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'
values = {'email': '..my email address..',
          'github': '..my github account..'}
headers = {'User-Agent':user_agent}
data = urllib.parse.urlencode(values)
data = data.encode('utf-8')
req = urllib.request.Request(url, data, headers)
response = urllib.request.urlopen(req)  # this line raises the error
page = response.read()
This is the particular error message:
Traceback (most recent call last):
File "/Users/.../Documents/Code 2040/File.py", line 23, in <module>
response = urllib.request.urlopen(req)
File "/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/urllib/request.py", line 156, in urlopen
return opener.open(url, data, timeout)
File "/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/urllib/request.py", line 475, in open
response = meth(req, response)
File "/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/urllib/request.py", line 587, in http_response
'http', request, response, code, msg, hdrs)
File "/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/urllib/request.py", line 513, in error
return self._call_chain(*args)
File "/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/urllib/request.py", line 447, in _call_chain
result = func(*args)
File "/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/urllib/request.py", line 595, in http_error_default
raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 400: Bad Request
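A useful first step with any 400 like this is to read the error body: urllib.error.HTTPError is itself a file-like response object, and the server often says exactly which field it rejected. A minimal sketch along the lines of the code above (the URL and values are the question's placeholders):

import urllib.error
import urllib.parse
import urllib.request

url = '..url..'
values = {'email': '..my email address..',
          'github': '..my github account..'}
data = urllib.parse.urlencode(values).encode('utf-8')
req = urllib.request.Request(url, data, {'User-Agent': 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'})

try:
    response = urllib.request.urlopen(req)
except urllib.error.HTTPError as e:
    # The HTTPError doubles as the response: its body often explains
    # why the server considered the request malformed.
    print(e.code)
    print(e.read())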
Related
I was wondering if somebody could help me get my code to work.
import urllib.request
import urllib.parse
import re
url = 'https://www.google.com'
values = {'s': 'basics',
          'submit': 'search'}
data = urllib.parse.urlencode(values)
date = data.encode('utf-8')
req = urllib.request.Request(url,data)
resp = urllib.request.urlopen(req)
respData = resp.read()
print(respData)
When I run the code, it gives me this error message:
TypeError: POST data should be bytes, an iterable of bytes, or a file object. It cannot be of type str.
I hope someone can help me with my problem. If not, thanks anyway.
It is giving me this gigantic error message:
Traceback (most recent call last):
File "C:\Users\user\OneDrive\Desktop\Lotto.py", line 11, in <module>
resp = urllib.request.urlopen(req)
File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\urllib \request.py", line 222, in urlopen
return opener.open(url, data, timeout)
File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\urllib\request.py", line 531, in open
response = meth(req, response)
File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\urllib\request.py", line 641, in http_response
'http', request, response, code, msg, hdrs)
File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\urllib\request.py", line 569, in error
return self._call_chain(*args)
File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\urllib\request.py", line 503, in _call_chain
result = func(*args)
File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\urllib\request.py", line 649, in http_error_default
raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 405: Method Not Allowed
It's not working because you have a TYPO:
data = urllib.parse.urlencode(values)
date = data.encode('utf-8') # YOUR TYPO IS HERE
req = urllib.request.Request(url,data)
You need to change that line to:
data = data.encode('utf-8') # TYPO FIXED!
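Note that even with the typo fixed, the 405 traceback above is expected: 405 Method Not Allowed is google.com saying it does not accept POST on its homepage. A search is normally sent as a GET query string instead. A sketch of the GET variant (the parameter names are just the question's placeholders; a real Google search expects q on /search, and Google may still reject obviously scripted clients):

import urllib.parse
import urllib.request

values = {'s': 'basics', 'submit': 'search'}
query = urllib.parse.urlencode(values)

# With no data argument, urllib.request sends a GET instead of a POST
req = urllib.request.Request('https://www.google.com/search?' + query,
                             headers={'User-Agent': 'Mozilla/5.0'})
resp = urllib.request.urlopen(req)
print(resp.read())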
I'm trying to get a web page, but I've run into this problem. I've looked up some references, and this is what I've done so far:
import sys
import urllib2
from bs4 import BeautifulSoup
user = 'myuserID'
password = "mypassword"
ip = sys.argv[1]
url = "http://www.websites.com/" + ip
passman = urllib2.HTTPPasswordMgrWithDefaultRealm()
passman.add_password(None, url, user, password)
handler = urllib2.HTTPBasicAuthHandler(passman)
opener = urllib2.build_opener(handler)
urllib2.install_opener(opener)
header = {
    'Connection': 'keep-alive',
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:38.0) Gecko/20100101 Firefox/38.0',
    'Accept-Language': 'en-US,en;q=0.5',
    'Accept-Encoding': 'gzip, deflate'
}
html = urllib2.urlopen(urllib2.Request(url, None, header))
soup = BeautifulSoup(html, 'html.parser')
# some if else function afterwards #
When I try to run the script, it shows this kind of error:
python checker.py 8.8.8.8
Traceback (most recent call last):
File "checker.py", line 34, in <module>
html = urllib2.urlopen(urllib2.Request(url, None, header))
File "C:\Python27\lib\urllib2.py", line 154, in urlopen
return opener.open(url, data, timeout)
File "C:\Python27\lib\urllib2.py", line 437, in open
response = meth(req, response)
File "C:\Python27\lib\urllib2.py", line 550, in http_response
'http', request, response, code, msg, hdrs)
File "C:\Python27\lib\urllib2.py", line 469, in error
result = self._call_chain(*args)
File "C:\Python27\lib\urllib2.py", line 409, in _call_chain
result = func(*args)
File "C:\Python27\lib\urllib2.py", line 656, in http_error_302
return self.parent.open(new, timeout=req.timeout)
File "C:\Python27\lib\urllib2.py", line 437, in open
response = meth(req, response)
File "C:\Python27\lib\urllib2.py", line 550, in http_response
'http', request, response, code, msg, hdrs)
File "C:\Python27\lib\urllib2.py", line 475, in error
return self._call_chain(*args)
File "C:\Python27\lib\urllib2.py", line 409, in _call_chain
result = func(*args)
File "C:\Python27\lib\urllib2.py", line 558, in http_error_default
raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 401: authenticationrequired
But if I open the page (or any other web page) in a browser and manually enter my credentials, the script works fine after that. Am I missing something?
Just to add: my network sits behind a McAfee web gateway device, so sometimes we need to enter our credentials before we can keep browsing. Our user/pass is integrated with Active Directory. Could that be causing the issue?
This seems to work really well (taken from another thread):
import urllib2
import base64
import sys
user = 'myuserID'
password = "mypassword"
ip = sys.argv[1]
url = "http://www.websites.com/" + ip
request = urllib2.Request(url)
# Build the Basic auth header by hand (base64.encodestring is Python 2)
base64string = base64.encodestring('%s:%s' % (user, password)).replace('\n', '')
request.add_header("Authorization", "Basic %s" % base64string)
result = urllib2.urlopen(request)
Or you may use requests:
import sys
import requests
from requests.auth import HTTPBasicAuth

user = 'myuserID'
password = "mypassword"
ip = sys.argv[1]
url = "http://www.websites.com/" + ip

res = requests.get(url, auth=HTTPBasicAuth(user, password))
print res.text
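Both snippets above authenticate against the site itself. If the 401 (authenticationrequired) is actually coming from the McAfee web gateway mentioned in the question, the credentials have to be presented to the proxy instead. A sketch using urllib2's proxy handlers, assuming a hypothetical gateway address:

import urllib2

# Hypothetical address; substitute your McAfee gateway's host:port
proxy_url = 'http://gateway.example.com:9090'

proxy = urllib2.ProxyHandler({'http': proxy_url, 'https': proxy_url})
passman = urllib2.HTTPPasswordMgrWithDefaultRealm()
passman.add_password(None, proxy_url, 'myuserID', 'mypassword')

opener = urllib2.build_opener(proxy, urllib2.ProxyBasicAuthHandler(passman))
urllib2.install_opener(opener)

One caveat: if the gateway insists on NTLM (common with Active Directory integration) rather than Basic auth, this alone won't be enough.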
Using Python, I want to get the raw HTML of a webpage that requires authentication.
This is similar to this question, but the answers there do not work.
Code I am trying:
import urllib, urllib2, cookielib
username = 'redacted'
password = 'redacted'
cj = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
login_data = urllib.urlencode({'username' : username, 'j_password' : password})
opener.open('https://redacted.net', login_data)#http://www.example.com/login.php
resp = opener.open('https://redacted.net')#http://www.example.com/hiddenpage.php
print resp.read()  # print straight HTML of the page; you can use opener to view any page with your session cookie
Error:
Traceback (most recent call last):
File "C:/Users/Jacob/Desktop/School/Python_Scripts/session refresher/session_refresher.py", line 9, in <module>
opener.open('Redacted', login_data)#http://www.example.com/login.php
File "C:\Python27\lib\urllib2.py", line 437, in open
response = meth(req, response)
File "C:\Python27\lib\urllib2.py", line 550, in http_response
'http', request, response, code, msg, hdrs)
File "C:\Python27\lib\urllib2.py", line 475, in error
return self._call_chain(*args)
File "C:\Python27\lib\urllib2.py", line 409, in _call_chain
result = func(*args)
File "C:\Python27\lib\urllib2.py", line 558, in http_error_default
raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
HTTPError: HTTP Error 401: Unauthorized
Here is the window that pops up to ask for authentication when I go to the webpage in a browser.
I'd use requests for this as it is simpler than what urllib provides for authentication.
import requests
r = requests.get("https://redacted.net", auth=('username', 'password'))
print(r.text)
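One caveat with the snippet above: requests does not raise anything on a 401 by itself, so a wrong password would just print the error page's HTML. Checking the status explicitly avoids that; a small variation:

import requests

r = requests.get("https://redacted.net", auth=('username', 'password'))
r.raise_for_status()  # raises requests.exceptions.HTTPError on 401 etc.
print(r.text)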
Use requests and supply your user/pass pair in the request:
import requests
requests.get('https://redacted.net', auth=('user', 'pass'))
I would like to iterate over a list of API request URLs with different GET variables (x1, ..., xn).
from urllib2 import urlopen, Request
import pandas as pd
var = [x[0], ..., x[n]]
url_list = ['http://BLAH/get?var[0]&fmt=csv', 'http://BLAH/get?var[1]&fmt=csv', ..., 'http://BLAH/get?var[n]&fmt=csv']

j = 0
while j < len(url_list):
    req = Request(url_list[j])
    response = urlopen(req)
    df = pd.read_csv(response)
    print df
    j = j + 1
I have tried del req and response.close(), but my code still produces a conflict error.
File "C:\Users\Anaconda\lib\urllib2.py", line 127, in urlopen
return _opener.open(url, data, timeout)
File "C:\Users\Anaconda\lib\urllib2.py", line 410, in open
response = meth(req, response)
File "C:\Users\Anaconda\lib\urllib2.py", line 523, in http_response
'http', request, response, code, msg, hdrs)
File "C:\Users\nfitzsimons\Anaconda\lib\urllib2.py", line 448, in error
return self._call_chain(*args)
File "C:\Users\Anaconda\lib\urllib2.py", line 382, in _call_chain
result = func(*args)
File "C:\Users\Anaconda\lib\urllib2.py", line 531, in http_error_default
raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 409: Conflict
Has anybody any suggestions?
If you're just trying to iterate over a list of urls and print the contents upon a successful request, why not try this?
#!/usr/bin/env python

try:
    import requests
except ImportError as err:
    print("Woops, you're missing " + str(err))

urls = []  # fill in the URLs you want to fetch

for url in urls:
    response = requests.get(url)
    if response.status_code == 200:  # successful request
        print(response.content)
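If you want to stay with urllib2, the question's instinct about closing responses is sound: urllib2 response objects are not context managers in Python 2, but contextlib.closing gives the same guarantee. A sketch, assuming url_list is built as in the question:

from contextlib import closing
from urllib2 import urlopen, Request
import pandas as pd

for url in url_list:
    # closing() guarantees the response is closed even if read_csv fails
    with closing(urlopen(Request(url))) as response:
        df = pd.read_csv(response)
        print df

Whether this clears the 409 depends on the server, though: 409 Conflict is the server reporting a problem with the request, not a client-side resource leak.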
I'm trying to do this on remote Ubuntu Server:
>>> import urllib2, requests
>>> url = 'http://python.org/'
>>> urllib2.urlopen(url)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python2.7/urllib2.py", line 126, in urlopen
return _opener.open(url, data, timeout)
File "/usr/lib/python2.7/urllib2.py", line 406, in open
response = meth(req, response)
File "/usr/lib/python2.7/urllib2.py", line 519, in http_response
'http', request, response, code, msg, hdrs)
File "/usr/lib/python2.7/urllib2.py", line 444, in error
return self._call_chain(*args)
File "/usr/lib/python2.7/urllib2.py", line 378, in _call_chain
result = func(*args)
File "/usr/lib/python2.7/urllib2.py", line 527, in http_error_default
raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 404: Not Found
>>> requests.get(url)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/django/zyq2/venv/local/lib/python2.7/site-packages/requests/api.py", line 55, in get
return request('get', url, **kwargs)
File "/home/django/zyq2/venv/local/lib/python2.7/site-packages/requests/api.py", line 44, in request
return session.request(method=method, url=url, **kwargs)
File "/home/django/zyq2/venv/local/lib/python2.7/site-packages/requests/sessions.py", line 382, in request
resp = self.send(prep, **send_kwargs)
File "/home/django/zyq2/venv/local/lib/python2.7/site-packages/requests/sessions.py", line 505, in send
history = [resp for resp in gen] if allow_redirects else []
File "/home/django/zyq2/venv/local/lib/python2.7/site-packages/requests/sessions.py", line 99, in resolve_redir ts
raise TooManyRedirects('Exceeded %s redirects.' % self.max_redirects)
requests.exceptions.TooManyRedirects: Exceeded 30 redirects.
But it works fine on local Windows machine:
>>> urllib2.urlopen(url)
<addinfourl at 57470168 whose fp = <socket._fileobject object at 0x036CB630>>
>>> requests.get(url)
<Response [200]>
I have absolutely no idea about what's going on and would appreciate any suggestion.
Update
I tried S.M. Al Mamun's suggestion and got an exception with long traceback:
>>> req = urllib2.Request(url, headers={ 'User-Agent': 'Mozilla/5.0' })
>>> urllib2.urlopen(req).read()
...
long traceback (more than one page)
...
urllib2.HTTPError: HTTP Error 303: The HTTP server returned a redirect error that would lead to an infinite loop.
The last 30x error message was:
See Other
An infinite loop again (I mean, the same problem as the TooManyRedirects exception).
Try using a user-agent:
req = urllib2.Request(url, headers={ 'User-Agent': 'Mozilla/5.0' })
urllib2.urlopen(req).read()
If it still doesn't work, your Ubuntu server might be offline!
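If the user-agent alone doesn't help, it can also be informative to stop requests from following redirects and look at the first hop, which shows where the loop begins (often an intercepting proxy or captive portal on the server's network). A sketch:

import requests

url = 'http://python.org/'
r = requests.get(url, headers={'User-Agent': 'Mozilla/5.0'},
                 allow_redirects=False)
print(r.status_code)              # e.g. 303, as in the update above
print(r.headers.get('Location'))  # where the server is sending you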