Download .gz file using requests in Python - error

I would be very grateful if anyone could help me with this issue I am having.
I’m trying to use the requests library to download a .gz file from the internet. I have successfully used the library before to get XML data that is parsed in the browser, but the .gz version is not working.
When the URL_To_Gzip link is clicked in my browser, the .gz file automatically starts downloading --> so the URL is fine; it just points directly to the file.
I’m coding this in Python 2.7 so I can then process the file and the data it contains, but I get an error message that I am struggling to resolve.
Error Message:
HTTPSConnectionPool(host=HOST_URL_TO_GZip, port=443): Max retries exceeded with url: URL_TO_GZip.gz (Caused by : [Errno 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond)
import requests

data = requests.get(url_to_gzip, proxies={"http": proxy_url})  # Does not work
data = requests.get(url_to_gzip, proxies={"http": proxy_url}, stream=True)  # Does not work
The information on Errno 10060 suggests the error is related to my proxy, since a connection cannot be established. --> But I have successfully used this same proxy to get the XML data in a similar script.
Thanks,
Ravi
EDIT
The URL_TO_GZip.gz file is served over https:// whereas the XML file that works fine is served over http://, which I think is the cause of my problem and why it works for one file but not the other.

For anyone else who comes across this issue: I needed to add an auth=(username, password) keyword argument to access the HTTPS site.
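A minimal sketch of that fix, assuming placeholder values for the URL, proxy, and credentials (none of these names come from the original post). One detail worth noting: requests matches an https:// URL against the "https" key of the proxies dict, so a dict containing only an "http" key is silently ignored for HTTPS requests, which fits the symptom described in the edit above.

```python
import requests

def download_gz(url, proxy_url, auth, dest):
    """Stream a .gz file to disk through a proxy, with basic auth."""
    # An https:// URL is matched against the "https" key of the proxies
    # dict; a dict with only an "http" key is ignored for HTTPS requests.
    proxies = {"http": proxy_url, "https": proxy_url}
    response = requests.get(url, proxies=proxies, auth=auth, stream=True)
    response.raise_for_status()
    # stream=True lets us write the .gz file in chunks instead of
    # holding the whole body in memory.
    with open(dest, "wb") as f:
        for chunk in response.iter_content(chunk_size=8192):
            f.write(chunk)
    return dest

# Usage (all values are placeholders):
# download_gz(url_to_gzip, proxy_url, ("username", "password"), "data.gz")
```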

Related

Problem uploading excel file to SharePoint with LDAP

I am trying to upload an Excel file to SharePoint using the code I found here, but so far I cannot manage to make it work with my LDAP account. I am getting this error:
raise ShareplumRequestError("Shareplum HTTP Post Failed", err)
shareplum.errors.ShareplumRequestError: Shareplum HTTP Post Failed :
HTTPSConnectionPool(host='login.microsoftonline.com', port=443): Max retries exceeded with url:
/extSTS.srf (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at
0x000001B24A126B08>: Failed to establish a new connection: [WinError 10061] No connection could
be made because the target machine actively refused it'))
with this code:
from shareplum import Office365
from shareplum import Site
from shareplum.site import Version

def main():
    sp = r"//foo.sharepoint.com/sites/foobarwiki/Shared Documents/SourceExcelFile.xlsx"
    cp = r"C:\Users\Git\SourceExcelFile.xlsx"
    authcookie = Office365('https://foo.sharepoint.com', username='un', password='pwd').GetCookies()
    site = Site(r'https://foo.sharepoint.com/sites/foobarwiki/', version=Version.v365, authcookie=authcookie)
    folder = site.Folder('Shared Documents/foobarbar/')
    with open(cp, mode='rb') as file:
        fileContent = file.read()
    folder.upload_file(fileContent, "myfile.txt")
    print("Done!")

if __name__ == "__main__":
    main()
As this can be considered an XY question: in the meantime I also tried simply using shutil.copy(cp, sp) and shutil.copy2, since I am quite happy to access the SharePoint in other ways, but still no success there; I guess shutil does not like SharePoint paths much.
Any ideas?

Verify SSL certificate from the custom path using python

I have installed an Apache web server and generated an SSL certificate for the Apache website, so I have a cert file and a key. I wrote a Python snippet to validate the SSL certificate for the website. The certificate file path is stored in cer_auth. My code reads the file at cer_auth, validates the certificate, and prints the result, but it is showing an error. How can I solve it?
Here's the code:
import requests

host = '192.168.1.27'
host1 = 'https://' + host
#cer_auth = '/etc/ssl/certs/ca-certificates.crt'
cer_auth = '/home/paulsteven/Apache_SSL/apache-selfsigned.crt'
print(host1)
try:
    requests.get(host1, verify=cer_auth)
    print("SSL Certificate Verified")
except:
    print("No SSL certificate")
The error I got:
https://192.168.1.27
/home/paulsteven/.local/lib/python3.5/site-packages/urllib3/connection.py:362: SubjectAltNameWarning: Certificate for 192.168.1.27 has no `subjectAltName`, falling back to check for a `commonName` for now. This feature is being removed by major browsers and deprecated by RFC 2818. (See https://github.com/shazow/urllib3/issues/497 for details.)
SubjectAltNameWarning
No SSL certificate
The old way of tying certificates to hostnames was through the CommonName (CN) field. This practice is rapidly changing due to changes in how browsers handle certificates. The current expectation is to have all hostnames and IPs in the x509v3 extension field of the certificate, named subjectAltName. The instructions you followed were probably outdated.
Here's a mediocre guide to doing just that with OpenSSL:
https://support.citrix.com/article/CTX135602
If you want to sign for some IP addresses, the field name is IP.1 instead of DNS.1 like in the link above.
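As a sketch of the whole step in one go: with a reasonably recent OpenSSL (1.1.1 or later, which added -addext), you can generate a self-signed certificate whose SAN covers the server's IP without editing a config file at all. The IP and file names below simply mirror the question and are placeholders:

```shell
# Generate a key and a self-signed cert whose subjectAltName contains the IP
openssl req -x509 -newkey rsa:2048 -nodes \
  -keyout apache-selfsigned.key -out apache-selfsigned.crt \
  -days 365 -subj "/CN=192.168.1.27" \
  -addext "subjectAltName=IP:192.168.1.27"

# Confirm the SAN made it into the certificate
openssl x509 -in apache-selfsigned.crt -noout -ext subjectAltName
```

With the SAN present, requests.get(host1, verify=cer_auth) should pass hostname verification for the IP instead of falling back to the deprecated CN check.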

Python soap client - connection having issue

I am using Python soap API client Zeep and here is the code that I have written:
from zeep import Client

def myapi(request):
    client = Client("https://siteURL.asmx?wsdl")
    key = client.service.LogOnUser('myusername', 'mypassword')
    print(key)
It is giving me this error: [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond
However, when I try the command below, the URL works fine and shows all the services it has:
python -mzeep https://siteURL.asmx?wsdl
Please help me understand why the above code is not working.
PS: I couldn't share site URL which I am trying to connect to.
Additional Info: The site/page is accessible only through intranet and I am testing locally from intranet itself.
Traceback error:
Exception Type: ConnectionError at /music/mypersonalapi/
Exception Value: HTTPSConnectionPool(host='URL I have hidden', port=81):
Max retries exceeded with url: /ABC/XYZ/Logon.asmx
(Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x0546E770>:
Failed to establish a new connection:
[WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond',))
Please note: I have removed URL and Host information from my traceback due to confidentiality
What this command does:
python -mzeep https://site/url.asmx?wsdl
is equivalent to:
c = Client("https://site/url.asmx?wsdl")
c.wsdl.dump()
Both alternatives use port 443, since that is the default HTTPS port.
From your traceback we see
Exception Value: HTTPSConnectionPool(host='URL I have hidden', port=81):
which would have been similar to
python -mzeep https://site:81/url.asmx?wsdl
I.e. the command line and your code are not connecting to the same address (note also that port values less than 1024 require system-level permissions to bind to on the server side -- relevant in case you are writing or controlling the service too).
The last line does say "..failed because the connected party did not properly respond after a period of time..", and the line
Max retries exceeded with url: /ABC/XYZ/Logon.asmx
is urllib3's standard wrapper around that underlying error: the client exhausted its connection retries without ever reaching the server. Taken together with the port mismatch above, the most likely explanation is that nothing is listening (or reachable) on port 81, so every attempt times out.
Point the Client at the same URL (and port) that works on the command line, and the logon call should at least reach the service.
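The mismatch is visible in the URLs alone; a quick check with the standard library (nothing zeep-specific, and "site" is a placeholder host):

```python
from urllib.parse import urlparse

# No explicit port in the URL: the https scheme implies 443.
print(urlparse("https://site/url.asmx?wsdl").port)     # None -> defaults to 443
# An explicit ":81" overrides that default.
print(urlparse("https://site:81/url.asmx?wsdl").port)  # 81
```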
Maybe this can help. I had the same connection problem (Max retries exceeded...). I solved it by increasing the transport timeout:
from zeep.transports import Transport
client = Client(wsdl=wsdl, transport=Transport(session=session, timeout=120))

Python - SSL: CERTIFICATE_VERIFY_FAILED

I have a python script that uses the VirusTotal API. It has been working with no problems, but all of a sudden when I run the script I am getting the following error:
urllib2.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:590)>
I believe it may be our web proxy that is causing the issue. Is there a way to prevent it from verifying the cert? Here is the portion of the code that uses the API:
import urllib
import urllib2

json_out = []
url = "https://www.virustotal.com/vtapi/v2/file/report"
parameters = {"resource": my_list,
              "apikey": "<MY API KEY>"}
data = urllib.urlencode(parameters)
req = urllib2.Request(url, data)
response = urllib2.urlopen(req)
json_out.append(response.read())
I believe it may be our web proxy that is causing the issue. Is there a way to prevent it from verifying the cert?
If you assume that a SSL intercepting proxy is denying the connection then you have to fix the problem at the proxy, i.e. there is no way to instruct the proxy to not check the certificate from your application.
If instead you assume that there is an SSL intercepting proxy, and thus the certificate you receive is not signed by a CA you trust, then you should get the CA of the proxy and trust it in your application (see the cafile parameter in the documentation). Disabling validation is almost never the right way; instead, fix it so that validation works.
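As a sketch of that approach (the CA bundle path is a placeholder for the certificate you would export from your proxy): on Python 2.7.9+, urllib2.urlopen accepts a context argument, so you can keep verification on and just add the proxy's CA to the trust store:

```python
import ssl

# Verification stays enabled: the default context uses CERT_REQUIRED.
ctx = ssl.create_default_context()

# Placeholder path -- export your proxy's CA certificate and point at it:
# ctx.load_verify_locations(cafile="/path/to/proxy-ca.pem")

# Then pass the context instead of disabling verification:
# response = urllib2.urlopen(req, context=ctx)

print(ctx.verify_mode == ssl.CERT_REQUIRED)  # True
```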
There are two possibilities:
You are using a self-signed certificate. Browsers do not trust such certificates, so be sure you are using a CA-signed, trusted certificate.
If you are using a CA-signed trusted certificate, check that the CA chain certificates (root and intermediate) are installed.
You can refer to this article; it may help you: https://access.redhat.com/articles/2039753

Logging into a website which uses Microsoft ForeFront "Thread Management Gateway"

I want to use python to log into a website which uses Microsoft Forefront, and retrieve the content of an internal webpage for processing.
I am not new to python but I have not used any URL libraries.
I checked the following posts:
How can I log into a website using python?
How can I login to a website with Python?
How to use Python to login to a webpage and retrieve cookies for later usage?
Logging in to websites with python
I have also tried a couple of modules such as requests. Still, I am unable to understand how this should be done. Is it enough to enter username/password, or should I somehow use cookies to authenticate? Any sample code would be really appreciated.
This is the code I have so far:
import requests

NAME = 'XXX'
PASSWORD = 'XXX'
URL = 'https://intra.xxx.se/CookieAuth.dll?GetLogon?curl=Z2F&reason=0&formdir=3'

def main():
    # Start a session so we can have persistent cookies
    session = requests.session()

    # This is the form data that the page sends when logging in
    login_data = {
        'username': NAME,
        'password': PASSWORD,
        'SubmitCreds': 'login',
    }

    # Authenticate
    r = session.post(URL, data=login_data)

    # Try accessing a page that requires you to be logged in
    r = session.get('https://intra.xxx.se/?t=1-2')
    print r

main()
but the above code results in the following exception on the session.post line:
raise ConnectionError(e)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='intra.xxx.se', port=443): Max retries exceeded with url: /CookieAuth.dll?GetLogon?curl=Z2F&reason=0&formdir=3 (Caused by <class 'socket.error'>: [Errno 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond)
UPDATE:
I noticed that I was providing the wrong username/password.
Once that was corrected, I get an HTTP 200 response with the above code, but when I try to access any internal site I get an HTTP 401 response. Why is this happening? What is wrong with the above code? Should I be using the cookies somehow?
TMG can be notoriously fussy about what types of connections it blocks. The next step is to find out why TMG is blocking your connection attempts.
If you have access to the TMG server, log in to it, start the TMG management user-interface (I can't remember what it is called) and have a look at the logs for failed requests coming from your IP address. Hopefully it should tell you why the connection was denied.
It seems you are attempting to connect to it over an intranet. One way I've seen it block connections is if it receives them from an address it considers to be on its 'internal' network. (TMG has two network interfaces as it is intended to be used between two networks: an internal network, whose resources it protects from threats, and an external network, where threats may come from.) If it receives on its external network interface a request that appears to have come from the internal network, it assumes the IP address has been spoofed and blocks the connection. However, I can't be sure that this is the case as I don't know what this TMG server's internal network is set up as nor whether your machine's IP address is on this internal network.
