How to fetch a file from a URL

I have a URL, for example
url = "www.example.com/file/processing/path/excelfile.xls"
This URL downloads an Excel file directly when I paste it into a browser.
How can I use Python to download this file? That is, when I run the Python code, the above URL should open in a browser and download the Excel file.

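If you don't necessarily need to go through the browser, you can use the urllib module to save the file to a specified location.
import urllib  # Python 2; in Python 3 the function is urllib.request.urlretrieve
url = 'http://www.example.com/file/processing/path/excelfile.xls'
local_fname = '/home/John/excelfile.xls'
# urlretrieve saves the response body to local_fname and returns (filename, headers)
filename, headers = urllib.urlretrieve(url, local_fname)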
http://docs.python.org/library/urllib.html#urllib.urlretrieve
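In Python 3, urlretrieve moved to urllib.request, where the documentation lists it under the legacy interface. Here is a minimal sketch that uses urlopen from the same module instead, assuming Python 3. The helper name download_file is made up for illustration and is not part of urllib.

```python
import shutil
import urllib.request


def download_file(url, local_fname):
    """Copy the body served at `url` into `local_fname` and return that path."""
    # urlopen returns a file-like response; copyfileobj copies it in chunks,
    # so the whole file is never held in memory at once.
    with urllib.request.urlopen(url) as response, open(local_fname, 'wb') as out:
        shutil.copyfileobj(response, out)
    return local_fname
```

With the placeholders from the answer above, the call would be download_file('http://www.example.com/file/processing/path/excelfile.xls', '/home/John/excelfile.xls').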

Use the webbrowser module:
import webbrowser
webbrowser.open(url)

You should definitely look into the awesome requests lib.
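For reference, a download with requests might look roughly like the sketch below. It assumes requests is installed. The function name download, the 30-second timeout, and the chunk size are illustrative choices, not anything prescribed by the library.

```python
import requests


def download(url, path, session=None, chunk_size=8192):
    """Save the body at `url` to `path`; return the number of bytes written."""
    http = session or requests.Session()
    # stream=True defers fetching the body so it can be written chunk by chunk.
    with http.get(url, stream=True, timeout=30) as response:
        response.raise_for_status()  # raise on 4xx/5xx instead of saving an error page
        written = 0
        with open(path, 'wb') as f:
            for chunk in response.iter_content(chunk_size=chunk_size):
                f.write(chunk)
                written += len(chunk)
    return written
```

Returning the byte count makes an empty download easy to spot: a result of 0 means the server sent no body.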

Related

How to download PDF files in python that doesn't end with .pdf

The URL looks like this: https://apps.websitename.com/AccountOnlineWeb/AccountOnlineCommand?command=getBlobImage&image=11/19/2019 I have tried everything I could think of, but nothing worked.
import requests
from requests.auth import HTTPBasicAuth
url = 'https://apps.websitename.com/AccountOnlineWeb/AccountOnlineCommand?command=getBlobImage&image=11/19/2019'
s = requests.Session()
r = requests.get(url, allow_redirects=True, auth=HTTPBasicAuth('username', 'password'))
with open('filepath/file.pdf', 'wb') as f:
    f.write(r.content)
I tested getting a .jpg file from the website to make sure the authentication part is working. I also downloaded a file from an unauthenticated URL ending in .pdf to make sure that downloading PDFs works. But I just cannot download this file.
I used r.is_redirect to test whether the URL redirects to another URL for the PDF, but it returned False.
I should mention that when you open the file manually, it waits for about 2 seconds and then loads like a regular PDF, and you can download it like a regular PDF.
Currently my code downloads a file that is supposed to be the PDF, but it is 0 KB.

Convert CSV file to HTML and display in browser with Pandas

How can I convert a CSV file to HTML and open it in a web browser from Python using pandas?
Below is my program, but the data does not show up in the web page:
import pandas
import webbrowser
data = pandas.read_csv(r'C:\Users\issao\Downloads\data.csv')
data = data.to_html()
webbrowser.open('data.html')

You need to pass a URL to webbrowser.
Save the HTML content into a local file and pass its path to webbrowser:
import os
import webbrowser
import pandas
data = pandas.read_csv(r'C:\Users\issao\Downloads\data.csv')
html = data.to_html()
path = os.path.abspath('data.html')
url = 'file://' + path
with open(path, 'w') as f:
    f.write(html)
webbrowser.open(url)

You're missing a few steps:
Pandas does not build a full HTML page, only a <table> element.
pd.DataFrame({'a': [1,2,3]}).to_html()
Returns: <table border="1" class="dataframe">...</table>
You need to host the HTML somewhere and open a web browser on it. You can write it to a local file and launch the browser from Python (os.system('firefox page.html')), but I doubt that is what you are looking for.
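Since to_html() returns only a table fragment, one way to get a complete page is to wrap the fragment yourself before saving it. This is a rough sketch using only the standard library. The template, the function name write_page, and the default file name are assumptions made for illustration.

```python
from pathlib import Path

# Hypothetical minimal page skeleton; adjust the head/body to taste.
PAGE_TEMPLATE = """<!DOCTYPE html>
<html>
<head><meta charset="utf-8"><title>{title}</title></head>
<body>
{table}
</body>
</html>
"""


def write_page(table_html, filename='data.html', title='data'):
    """Wrap an HTML <table> fragment in a full page, save it, return a file:// URL."""
    path = Path(filename).resolve()
    path.write_text(PAGE_TEMPLATE.format(title=title, table=table_html),
                    encoding='utf-8')
    # as_uri() builds a well-formed file:// URL for both Windows and POSIX paths.
    return path.as_uri()
```

With a DataFrame named data, webbrowser.open(write_page(data.to_html())) would then open the saved page in the default browser.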

Doesn't answer the OP's question directly, but for someone who is looking at an alternative to Pandas, they can check out csvtotable(https://github.com/vividvilla/csvtotable), especially the option with "--serve". Sample usage would be something like this: csvtotable data.csv --serve. This "serves" the CSV file to the browser.

How to download a csv file from internet when there is javascript button?

I am trying to download a CSV file from Morningstar using Python.
Here is the link:
http://financials.morningstar.com/income-statement/is.html?t=NAB&region=aus
There is a button to "export CSV" but I can't access the link.
The button calls javascript:exportKeyStat2CSV(); but I am not sure how to find the URL of the CSV file.
I downloaded the file in the browser to get the URL and then tried it with requests and pandas, but neither of them gets anything.
import pandas as pd
URL = "http://financials.morningstar.com/finan/ajax/exportKR2CSV.html?&callback=?&t=XASX:NAB&region=aus&culture=en-US&cur=&order=asc"
df = pd.read_csv(URL)  # didn't work with pandas
import requests
print('Starting to download!')
r = requests.get(URL)
filename = URL.split('/')[-1]
with open(filename, 'wb') as out_file:
    out_file.write(r.content)
print("Download complete!")
The request comes back with status code 204.
How do I solve the problem?

Have you tried accessing the resource via Selenium with Python?
See this example:
https://stackoverflow.com/a/51725319/5198805

how to download with splinter knowing direction and name of the file

I am working with Python and splinter. I want to download a file from a click event using splinter. I wrote the following code:
from splinter import Browser
browser = Browser('chrome')
url = "download link"
browser.visit(url)
How can I download the file with splinter, given its URL and file name?

Splinter is not involved in the download of a file.
Maybe you need to navigate the page to find the exact URL, but then use the regular requests library for the download:
import requests
url = "http://some.download.link.com"  # requests needs the scheme (http:// or https://)
result = requests.get(url)
with open('file.pdf', 'wb') as f:
    f.write(result.content)

Using urllib2 in Python

I am trying to do the following via python:
From this website:
http://www.bmf.com.br/arquivos1/arquivos_ipn.asp?idioma=pt-BR&status=ativo
I would like to check the 4th checkbox and then click on Download image.
This is what I did:
import urllib2
import urllib
url = "http://www.bmf.com.br/arquivos1/arquivos_ipn.asp?idioma=pt-BR&status=ativo"
payload = {"chkArquivoDownload3_ativo":"1"}
data = urllib.urlencode(payload)
request = urllib2.Request(url, data)
print request
response = urllib2.urlopen(request)
contents = response.read()
print contents
Does anyone have any suggestions?

Selenium is a great project; it lets you control a Firefox browser from Python. Something like this:
from selenium import webdriver
from selenium.webdriver.common.by import By
browser = webdriver.Firefox()
browser.get('http://www.bmf.com.br/arquivos1/arquivos_ipn.asp?idioma=pt-BR&status=ativo')
# Recent Selenium 4 releases removed find_element_by_id; use find_element(By.ID, ...)
browser.find_element(By.ID, 'chkArquivoDownload3').click()
browser.find_element(By.ID, 'imgSubmeter_ativo').click()
browser.quit()
would probably work.

Web browsers are a complex collection of components that interact with each other.
Python does not have a web browser built in (in particular, no DOM or JavaScript engine); it simply downloads an HTML file that would normally interact with that DOM and JavaScript in your browser.
The easiest method I foresee:
Parse the HTML string using the Python module BeautifulSoup.
Manually make the download request with the information you have parsed.
Save the downloaded image to a file.
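The first two steps can be sketched roughly as follows. To keep the example free of third-party dependencies, it uses the standard library's html.parser in place of BeautifulSoup. The class and function names are made up for illustration. It also assumes the form can be submitted with just the ticked checkbox, which has not been verified against the site in question; the real form may need other fields.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


class CheckboxCollector(HTMLParser):
    """Collect (name, value) pairs for every <input type="checkbox"> on a page."""

    def __init__(self):
        super().__init__()
        self.checkboxes = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == 'input' and (attributes.get('type') or '').lower() == 'checkbox':
            # Browsers submit "on" for a checked box that has no value attribute.
            self.checkboxes.append(
                (attributes.get('name'), attributes.get('value') or 'on'))


def build_form_data(html, index):
    """Return URL-encoded POST data that ticks only the checkbox at `index`."""
    collector = CheckboxCollector()
    collector.feed(html)
    name, value = collector.checkboxes[index]
    return urlencode({name: value}).encode('ascii')
```

The returned bytes can be passed as the data argument of a request, in the same way the question passes its encoded payload to urllib2.Request, and the response body can then be written to a file opened in 'wb' mode.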
