urllib.error.HTTPError: HTTP Error 401: Unauthorized

I'm new to Python and just tried to write data from an external URL to a file. I have no idea where I'm going wrong. Can anyone please help me with this?
Thanks in advance.
from urllib import request

url = r'https://query1.finance.yahoo.com/v7/finance/download/AMD?period1=1497317134&period2=1499909134&interval=1d&events=history&crumb=HwDtuBHqtg0'

def download_csv(csv_url):
    csv = request.urlopen(csv_url)
    csv_data = csv.read()          # read is a method, so it must be called
    csv_str = csv_data.decode()    # decode the bytes instead of wrapping them in str()
    lines = csv_str.split('\n')
    dest_url = r'appl.csv'
    wr = open(dest_url, 'w')
    for data in lines:
        wr.write(data + '\n')
    wr.close()

download_csv(url)

I ran the URL in the browser, and it states clearly that this API requires a cookie.
So you must provide a proper header. You can manage sessions with urllib, but honestly I would opt for a more user-friendly library, such as the requests library (HTTP for Humans).
Example:
import requests

s = requests.Session()
s.get('http://httpbin.org/cookies/set/sessioncookie/123456789')
r = s.get('http://httpbin.org/cookies')
print(r.text)
# '{"cookies": {"sessioncookie": "123456789"}}'
More: http://docs.python-requests.org/en/master/user/advanced/#session-objects
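Applied to the original question, a minimal sketch (untested against Yahoo; the crumb embedded in the URL has to match the session's cookie, which this sketch does not refresh):

```python
import requests

# Reuse one Session so any cookies the server sets are carried into the
# CSV download. The URL (with its crumb) is taken from the question;
# a stale crumb will still return 401.
url = ("https://query1.finance.yahoo.com/v7/finance/download/AMD"
       "?period1=1497317134&period2=1499909134&interval=1d"
       "&events=history&crumb=HwDtuBHqtg0")

def download_csv(csv_url, dest="appl.csv"):
    with requests.Session() as s:
        resp = s.get(csv_url)
        resp.raise_for_status()        # surfaces the 401 instead of writing junk
        with open(dest, "wb") as f:
            f.write(resp.content)      # raw bytes; no str()/split round-trip needed
```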

Related

How to change the encoding of a POST request built with CaseInsensitiveDict()?

First, thank you for your time. I'm trying to do an insert through a REST API POST, working in Python. My messages contain special characters that I want to preserve at the destination, but the request returns an error for them, since by default the messages are sent in UTF-8 and I want them in "ISO-8859-1".
For this I added the line headers["Charset"] = "ISO-8859-1". Python does not give me an error, but the problem persists.
The error is:
400 Client Error: Bad Request for url: https://api.example.com/
Here is my code:
import requests
from requests.structures import CaseInsensitiveDict

url = 'https://api.example.com/'

headers = CaseInsensitiveDict()
headers["Accept"] = "application/json"
headers["Authorization"] = "Bearer "
headers["Content-Type"] = "application/json"
headers["Charset"] = "ISO-8859-1"

collet_x = df_spark.collect()
for row in collet_x:
    # insert
    resp = requests.post(url, headers=headers, data=row['JSON'])
    v_respuesta = resp.text
    print(resp.status_code)
    print(v_respuesta)
How else can I change the encoding?
Thank you very much in advance.
Regards
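One thing worth checking (a sketch, not a verified fix for this particular API): "Charset" is not a standard HTTP request header, so the server most likely ignores it. The charset is declared inside Content-Type, and the body bytes must actually be encoded to match; the url and payload below are placeholders:

```python
import requests

url = "https://api.example.com/"       # placeholder from the question
payload = '{"name": "Muñoz"}'          # hypothetical JSON with a special character

# Encode the body yourself and declare the charset inside Content-Type;
# passing a str body leaves the byte encoding up to the library.
body = payload.encode("iso-8859-1")
headers = {
    "Accept": "application/json",
    "Content-Type": "application/json; charset=ISO-8859-1",
}
# resp = requests.post(url, headers=headers, data=body)
```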

How to pass a file from a URL via Python to an API?

I'm trying to use an API where you give it a CSV file and it validates it.
What I'm trying to do is this:
import requests
import urllib3

api_url = "https://validation.openbridge.io/dryrun"
file_url = 'http://winterolympicsmedals.com/medals.csv'

http = urllib3.PoolManager()
response = http.request('GET', file_url)
data = response.data.decode('utf-8')

headers = {
    # requests won't add a boundary if this header is set when you pass files=
    'Content-Type': 'multipart/form-data',
}
response = requests.post(api_url, headers=headers, files=data)
I'm getting an error I don't quite understand:
'cannot encode objects that are not 2-tuples'
I'm fairly new to Python (mostly experienced in other languages), but I think the issue is that I'm passing the API the data in the file instead of the file itself.
Any suggestions on what I'm doing wrong and how I can pass that file to the API?
For anyone who stumbles across this from Google in the future: I figured out how to do it.
import requests
import urllib3

api_url = "https://validation.openbridge.io/dryrun"
file_url = "http://winterolympicsmedals.com/medals.csv"

remote_file = urllib3.PoolManager().request('GET', file_url)

response = requests.post(url=api_url,
                         json={'data': {'attributes': {'is_async': True}}},
                         files={"file": remote_file},
                         allow_redirects=False)

if response.ok:
    print("Upload completed successfully!")
    print()
    print("Status Code: ")
    print(response.status_code)
    print()
    print(response.headers['Location'])
    print()
else:
    print("Something went wrong!")
    print()
    print("Status Code: ")
    print(response.status_code)
    print()
    print(response.text)
    print()
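For context on the original error: 'cannot encode objects that are not 2-tuples' happens because files= was given a plain string, which requests then tries to iterate as (fieldname, file) pairs. A minimal sketch of the expected shape (the field name, filename, and CSV content below are illustrative):

```python
csv_text = "medal,year\ngold,2014\n"   # hypothetical CSV content

# files= wants a mapping of field name -> file object, or
# field name -> (filename, content, content_type) tuple.
files = {"file": ("medals.csv", csv_text, "text/csv")}
# resp = requests.post(api_url, files=files)   # api_url as in the question
```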

Need to download the PDF, NOT the content of the webpage

So as it stands, I am able to get the content of the webpage at the PDF link, BUT I don't want the content of the webpage; I want the content of the PDF itself, so I can save it into a PDF file in a folder on my computer.
I have been successful in doing this on sites that don't require a login, and without a proxy server.
Relevant code:
import os
import urllib2
import time
import requests
import urllib3
from random import *

s = requests.Session()
data = {"Username": "username", "Password": "password"}
url = "https://login.url.com"
print "doing things"
r2 = s.post(url, data=data, proxies={'https': 'https://PROXYip:PORT'}, verify=False)
# I get a response 200 from printing r2
print r2

download_url = "http://msds.walmartstores.com/client/document?productid=1000527&productguid=54e8aa24-0db4-4973-a81f-87368312069a&DocumentKey=undefined&HazdocumentKey=undefined&MSDS=0&subformat=NAM"
file = open("F:\my_filepath\document" + str(maxCounter) + ".pdf", 'wb')
temp = s.get(download_url, proxies={'https': 'https://PROXYip:PORT'}, verify=False)
# This prints out the response from the proxy server (i.e. 200)
print temp
something = uniform(5, 6)
print something
time.sleep(something)
# This gets me the content of the web page, not the content of the PDF
print temp.content
file.write(temp.content)
file.close()
I need help figuring out how to "download" the content of the PDF
Try this:
import requests

url = 'http://msds.walmartstores.com/client/document?productid=1000527&productguid=54e8aa24-0db4-4973-a81f-87368312069a&DocumentKey=undefined&HazdocumentKey=undefined&MSDS=0&subformat=NAM'
pdf = requests.get(url)
with open('walmart.pdf', 'wb') as file:
    file.write(pdf.content)
Edit
Try again with a requests session to manage cookies (assuming the server sends you those after login), and maybe also a different proxy:
proxy_dict = {'https': 'ip:port'}
with requests.Session() as session:
    # Authentication request, use GET/POST whatever is needed
    # data variable should hold user/password information
    auth = session.get(login_url, data=data, proxies=proxy_dict, verify=False)
    if auth.status_code == 200:
        print(auth.cookies)  # Tell me if you got anything
        # we're continuing the same session (a Response has no .get method)
        pdf = session.get(download_url, proxies=proxy_dict, verify=False)
        with open('walmart.pdf', 'wb') as file:
            file.write(pdf.content)
    else:
        print('No go, got {0} response'.format(auth.status_code))
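A further hedged sketch: a PDF and an HTML login page both come back as bytes, so it helps to check the %PDF magic bytes before saving, rather than discovering the problem when opening the file (the function and argument names here are illustrative):

```python
def save_pdf(session, url, dest, proxies=None):
    # Download within the logged-in session; verify=False mirrors the
    # question's proxy setup and should be avoided when possible.
    resp = session.get(url, proxies=proxies, verify=False)
    resp.raise_for_status()
    if not resp.content.startswith(b"%PDF"):
        # An HTML login/error page saved as .pdf is the symptom in the question
        raise ValueError("Response is not a PDF (Content-Type: %s)"
                         % resp.headers.get("Content-Type"))
    with open(dest, "wb") as f:
        f.write(resp.content)
```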

subprocess.call() formatting issue

I am very new to subprocess, and I have a hard time debugging it without any error code.
I'm trying to automatically call an API which respond to :
http -f POST https://api-adresse.data.gouv.fr/search/csv/ columns=voie columns=ville data#path/to/file.csv > response_file.csv
I've tried various combinations with subprocess.call, but I only manage to get "1" as a return code.
What is the correct way to format this call, knowing that the answer from the API has to go into a CSV file, and that I send a CSV file (the path after data#)?
EDIT: Here are my attempts:
ret = subprocess.call(cmd, shell=True)
ret = subprocess.call(cmd.split(), shell=True)
ret = subprocess.call([cmd], shell=True)
The same with shell=False, and with stdout=myFileHandler (opened inside a with open(file, "w") as myFileHandler: block).
EDIT2: I'm still curious about the answer, but I managed to work around it with requests, as @spectras suggested:
import requests

file_path = "PATH/TO/OUTPUT/FILE.csv"
url = "https://api-adresse.data.gouv.fr/search/csv/"
files = {'data': open('PATH/TO/CSV/FILE.csv', 'rb')}
# a tuple list, because a dict literal would silently drop the duplicate 'columns' key
values = (('columns', 'Adresse'), ('columns', 'Ville'), ('postcode', 'CP'))

r = requests.post(url, files=files, data=values)
with open(file_path, "wb") as myFh:  # r.content is bytes, so write in binary mode
    myFh.write(r.content)
Since you are attempting to send a form, may I suggest you do it straight from python?
import requests

with open('path/to/file', 'rb') as fd:
    payload = fd.read()

r = requests.post(
    'https://api-adresse.data.gouv.fr/search/csv/',
    data=(
        ('columns', 'voie'),
        ('columns', 'ville'),
    ),
    files={
        'data': ('filename.csv', payload, 'text/csv'),
    }
)

if r.status_code != requests.codes.ok:  # codes.ok is a single int, not a container
    r.raise_for_status()

with open('response_file.csv', 'wb') as result:
    result.write(r.content)
This uses the ubiquitous python-requests module, and especially the form file-upload part of its documentation.
It's untested; basically, I opened the httpie documentation and converted your command-line arguments into python-requests API arguments.
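For the original subprocess question, a sketch of how the httpie call would usually be formatted (untested; it assumes httpie is installed as http, and that the intended file-upload separator is @ rather than the # shown in the question):

```python
import subprocess

# Pass the command as a list with shell=False, and send stdout to the
# output file instead of relying on the shell's ">" redirection.
cmd = [
    "http", "-f", "POST", "https://api-adresse.data.gouv.fr/search/csv/",
    "columns=voie", "columns=ville", "data@path/to/file.csv",
]
# with open("response_file.csv", "w") as out:
#     ret = subprocess.call(cmd, stdout=out)
```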

Python can't send a POST request with an image

I'm trying to decode a QR image with Python via this website: https://zxing.org/w/decode.jspx
I don't know why my POST request fails and I don't get any response.
import requests

url = "https://zxing.org/w/decode.jspx"
session = requests.Session()
f = {'f': open("new.png", "rb")}
response = session.post(url, files=f)

f = open("page.html", "w")
f.write(response.text)
f.close()
session.close()
Even when I do it with a GET request, it still fails:
url = "https://zxing.org/w/decode.jspx"
session = requests.Session()
data = {'u': 'https://www.qrstuff.com/images/default_qrcode.png'}
response = session.post(url, data=data)

f = open("page.html", "w")
f.write(response.text)
f.close()
session.close()
Maybe because the website contains two forms?
Thanks for helping.
You can do this:
from urllib import request  # Python 3; the original urllib.urlopen was Python 2

url = "https://zxing.org/w/decode?u=https://www.qrstuff.com/images/default_qrcode.png"
response = request.urlopen(url)
f = open("page.html", "wb")  # response.read() returns bytes
f.write(response.read())
f.close()
If you want to send a URL, the action is GET; if you want to post the data as a file, the action is POST.
You can check this with the HackBar add-on in Firefox.
Well, I just saw my mistake...
The website is https://zxing.org/w/decode.jspx, but once you POST or GET, the endpoint is https://zxing.org/w/decode, without ".jspx". So I just removed it and everything worked!
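Putting the fix together, a sketch of both variants against the corrected /w/decode endpoint (untested; the form field names f and u are taken from the question):

```python
import requests

decode_url = "https://zxing.org/w/decode"   # note: no ".jspx"

def decode_file(path):
    """POST a local QR image for decoding."""
    with open(path, "rb") as img:
        return requests.post(decode_url, files={"f": img})

def decode_remote(image_url):
    """Ask the service to fetch and decode a remote image URL."""
    return requests.get(decode_url, params={"u": image_url})
```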
