How to open a list of addresses in a browser - Python

I have a list of links stored in a file. I would like to open all of them in my browser with a script instead of manually copy-pasting each item.
For example, OS: Mac OS X; Browser: Chrome; Script: Python (preferred)

Take a look at the webbrowser module.
import webbrowser
urls = ['http://www.google.com', 'http://www.reddit.com', 'http://stackoverflow.com']
b = webbrowser.get('firefox')
for url in urls:
    b.open(url)
P.S.: Support for Chrome was added in version 3.3, but Python 3.3 is still a release candidate.

Since you're on a Mac, you can just use the subprocess module to run the open command, as in open http://link1 http://link2 http://link3. For example:
from subprocess import call
call(["open","http://www.google.com", "http://www.stackoverflow.com"])
Note that this will just open your default browser; however, you could simply replace the open command with the specific browser's command to choose your browser.
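Another way to pick the browser on macOS is the -a flag of the open command, which names the application to use. The sketch below only builds the argument list; the helper name build_open_command and the application name "Google Chrome" are illustrative assumptions, not part of the answer above.

```python
def build_open_command(urls, app=None):
    """Build the argument list for the macOS `open` command.

    `app` is an application name such as "Google Chrome" (an assumption
    for illustration); None leaves the choice to the default browser.
    """
    command = ["open"]
    if app is not None:
        # `open -a <application>` asks macOS to use that application.
        command += ["-a", app]
    return command + list(urls)
```

It could then be passed to subprocess, for example call(build_open_command(links, app="Google Chrome")).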
Here's a full example for files of the general format
alink
http://anotherlink
(etc.)
from subprocess import call
import re
import sys
links = []
filename = 'test'
try:
    with open(filename) as linkListFile:
        for line in linkListFile:
            link = line.strip()
            if link != '':
                # Keep links that already carry a scheme; otherwise assume http.
                if re.match('http://.+|https://.+|ftp://.+|file://.+', link.lower()):
                    links.append(link)
                else:
                    links.append('http://' + link)
except IOError:
    print('Failed to open the file "%s".\nExiting.' % filename)
    sys.exit()
print(links)
call(["open"] + links)
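If the script also has to run outside macOS, the same idea can be sketched with the webbrowser module instead of the open command. The function names load_urls and open_all are made up for this example, and the opener parameter is an assumption added so the logic can be tried without launching a browser.

```python
import re
import webbrowser

# Links that already start with one of these schemes are left untouched.
SCHEME = re.compile(r"^(https?|ftp|file)://", re.IGNORECASE)


def load_urls(path):
    """Read one link per line, skip blank lines, default to http://."""
    urls = []
    with open(path) as handle:
        for line in handle:
            link = line.strip()
            if not link:
                continue
            urls.append(link if SCHEME.match(link) else "http://" + link)
    return urls


def open_all(urls, opener=webbrowser.open_new_tab):
    """Call `opener` once per URL; by default each opens in a new tab."""
    for url in urls:
        opener(url)
```

Calling open_all(load_urls('test')) would hand every link in the file to the default browser.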

Related

Python Uploading on webbrowser

I am writing a script that will upload files from my local machine to a webpage. This is the URL: https://tutorshelping.com/bulkask. It has an upload option, but I can't figure out how to upload the files.
My current script:
import webbrowser, os
def fileUploader(dirname):
    mydir = os.getcwd() + dirname
    filelist = os.listdir(mydir)
    for file in filelist:
        filepath = os.path.join(mydir, file)  # This is the absolute file path
        print(filepath)
    url = "https://tutorshelping.com/bulkask"
    webbrowser.open_new(url)  # open in default browser
    webbrowser.get('firefox').open_new_tab(url)
if __name__ == '__main__':
    dirname = '/testdir'
    fileUploader(dirname)
A quick solution would be to use something like AppRobotic personal macro software to either interact with the Windows pop-ups and apps directly, or just use X,Y coordinates to move the mouse, click on buttons, and then to send keyboard keys to type or tab through your files.
Something like this would work when tweaked, so that it runs at the point when you're ready to click the upload button and browse for your file:
import win32com.client
x = win32com.client.Dispatch("AppRobotic.API")
import webbrowser
# specify URL
url = "https://www.google.com"
# open with default browser
webbrowser.open_new(url)
# wait a bit for page to open
x.Wait(3000)
# use UI Item Explorer to find the X,Y coordinates of Search box
x.MoveCursor(438, 435)
# click inside Search box
x.MouseLeftClick
x.Type("AppRobotic")
x.Type("{ENTER}")
I don't think the Python webbrowser package can do anything other than open a browser or tab with a specific URL.
If I understand your question correctly, you want to open the page, set the file to upload, and then simulate a button click. You can try pyppeteer for this.
Disclaimer: I have never used the Python version, only the JS version (puppeteer).
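A rough sketch of that flow with pyppeteer might look like the following. It is untested against the site in question: the two CSS selectors are guesses that would have to be replaced after inspecting the page, and the helper names collect_files and upload_files are invented for this example.

```python
import asyncio
import os


def collect_files(directory):
    """Return the absolute paths of the regular files in `directory`, sorted."""
    directory = os.path.abspath(directory)
    return sorted(
        os.path.join(directory, name)
        for name in os.listdir(directory)
        if os.path.isfile(os.path.join(directory, name))
    )


async def upload_files(paths, url, file_input_selector, submit_selector):
    """Open `url`, attach `paths` to a file input, then click a button."""
    # Imported lazily so collect_files() works without pyppeteer installed.
    from pyppeteer import launch

    browser = await launch()
    try:
        page = await browser.newPage()
        await page.goto(url)
        file_input = await page.querySelector(file_input_selector)
        await file_input.uploadFile(*paths)
        await page.click(submit_selector)
    finally:
        await browser.close()


# Example call (both selectors are assumptions, not taken from the page):
# asyncio.run(upload_files(collect_files("testdir"),
#                          "https://tutorshelping.com/bulkask",
#                          'input[type="file"]', 'button[type="submit"]'))
```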

Web Scraping FileNotFoundError When Saving to PDF

I copied some Python code in order to download data from a website. Here is my specific website:
https://www.codot.gov/business/bidding/bid-tab-archives/bid-tabs-2017-1
Here is the code which I copied:
import requests
from bs4 import BeautifulSoup
def _getUrls_(res):
    hrefs = []
    soup = BeautifulSoup(res.text, 'lxml')
    main_content = soup.find('div',{'id' : 'content-core'})
    table = main_content.find("table")
    for a in table.findAll('a', href=True):
        hrefs.append(a['href'])
    return(hrefs)
bidurl = 'https://www.codot.gov/business/bidding/bid-tab-archives/bid-tabs-2017-1'
r = requests.get(bidurl)
hrefs = _getUrls_(r)
def _getPdfs_(hrefs, basedir):
    for i in range(len(hrefs)):
        print(hrefs[i])
        respdf = requests.get(hrefs[i])
        pdffile = basedir + "/pdf_dot/" + hrefs[i].split("/")[-1] + ".pdf"
        try:
            with open(pdffile, 'wb') as p:
                p.write(respdf.content)
                p.close()
        except FileNotFoundError:
            print("No PDF produced")
basedir= "/Users/ABC/Desktop"
_getPdfs_(hrefs, basedir)
The code runs successfully, but it did not download anything at all, even though there is obviously no FileNotFoundError.
I tried the following two URLs:
https://www.codot.gov/business/bidding/bid-tab-archives/bid-tabs-2017/aqc-088a-035-20360
https://www.codot.gov/business/bidding/bid-tab-archives/bid-tabs-2017/aqc-r100-258-21125
However, both of these URLs print "No PDF produced".
The thing is that the code worked and downloaded successfully for other people, but not for me.
Your code works; I just tested it. You need to make sure that basedir exists, so add this to your code:
import os
if not os.path.exists(basedir):
    os.makedirs(basedir)
I used this exact (indented) code but replaced the basedir with my own dir and it worked only after I made sure that the path actually exists. This code does not create the folder in case it does not exist.
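Note that the script in the question writes into basedir + "/pdf_dot/", so it is that subfolder that has to exist before open() is called. A minimal sketch, assuming Python 3 (for exist_ok) and using a made-up helper name ensure_output_dir:

```python
import os


def ensure_output_dir(basedir, subdir="pdf_dot"):
    """Create basedir/subdir (and any missing parents) and return its path."""
    target = os.path.join(basedir, subdir)
    # exist_ok=True makes the call safe to repeat on every run.
    os.makedirs(target, exist_ok=True)
    return target
```

Calling ensure_output_dir(basedir) once before _getPdfs_(hrefs, basedir) should be enough.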
As others have pointed out, you need to create basedir beforehand. The user running the script may not have the directory created. Make sure you insert this code at the beginning of the script, before the main logic.
Additionally, hardcoding the base directory might not be a good idea when transferring the script to different systems. It would be preferable to use the user's %USERPROFILE% environment variable:
from os import environ
from os.path import join
basedir = join(environ["USERPROFILE"], "Desktop", "pdf_dot")
This would be the same as C:\Users\blah\Desktop\pdf_dot.
However, the above environment variable only works on Windows. If you want it to work on Linux, you will have to use os.environ["HOME"] instead.
If you need to transfer between both systems, then you can use os.name:
from os import name
from os import environ
from os.path import join
# Windows
if name == 'nt':
    basedir = join(environ["USERPROFILE"], "Desktop", "pdf_dot")
# Linux
elif name == 'posix':
    basedir = join(environ["HOME"], "Desktop", "pdf_dot")
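On Python 3 the platform check can usually be avoided, because pathlib.Path.home() resolves the current user's home directory on Windows, Linux, and macOS alike. A small sketch (the function name default_pdf_dir is only illustrative, and it assumes the desktop folder is literally called Desktop):

```python
from pathlib import Path


def default_pdf_dir(folder="pdf_dot"):
    """Return <home>/Desktop/<folder> for the current user on any platform."""
    return Path.home() / "Desktop" / folder
```

The result can be passed wherever a string path is expected by wrapping it in str().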
You don't need to specify the directory or create any folder manually. All you need to do is run the following script. When it finishes, you should have a folder named pdf_dot on your desktop containing the PDF files you want to grab.
import requests
from bs4 import BeautifulSoup
import os
URL = 'https://www.codot.gov/business/bidding/bid-tab-archives/bid-tabs-2017-1'
dirf = os.path.join(os.environ['USERPROFILE'], 'Desktop', 'pdf_dot')  # USERPROFILE exists on Windows only
if not os.path.exists(dirf):
    os.makedirs(dirf)
os.chdir(dirf)
res = requests.get(URL)
soup = BeautifulSoup(res.text, 'lxml')
pdflinks = [itemlink['href'] for itemlink in soup.find_all("a",{"data-linktype":"internal"}) if "reject" not in itemlink['href']]
for pdflink in pdflinks:
    filename = f'{pdflink.split("/")[-1]}.pdf'
    with open(filename, 'wb') as f:
        f.write(requests.get(pdflink).content)

Python script to save webpage and rename it while saving (save as - command)

Hi, I searched a lot but found no relevant results on how to save a webpage using Python 2.6 and rename it while saving.
Better to use the requests library:
import requests
pagelink = "http://www.example.com"
page = requests.get(pagelink)
with open('/path/to/file/example.html', "w") as file:
    file.write(page.text)
You may want to use the urllib(2) package to access the webpage, and then save the file object to the desired location (os.path).
It should look something like this:
import urllib2, os
pagelink = "http://www.example.com"
page = urllib2.urlopen(pagelink)
# The file name chosen here is what "renames" the page while saving.
with open(os.path.join('/(full)path/to/Documents', 'example.html'), "wb") as file:
    file.write(page.read())
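For readers on Python 3, where urllib2 no longer exists, the same approach can be sketched with the standard library only. The function name save_page is an assumption for illustration; the destination path is whatever name the page should be saved under.

```python
import shutil
from urllib.request import urlopen


def save_page(url, destination):
    """Download `url` and store the raw bytes under the name `destination`."""
    with urlopen(url) as response, open(destination, "wb") as out:
        # Stream the body to disk instead of loading it all into memory.
        shutil.copyfileobj(response, out)
    return destination
```

For example, save_page("http://www.example.com", "/path/to/file/example.html") would save the page as example.html.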

webbrowser.open() in python

I have a Python file html_gen.py which writes a new HTML file index.html in the same directory, and I would like to open index.html when the writing is finished.
So I wrote
import webbrowser
webbrowser.open("index.html")
But nothing happens when I execute the .py file. If I instead use
webbrowser.open("http://www.google.com")
Safari opens the Google front page when I execute the code.
How can I open the local index.html file?
Try adding "file://" at the start of the URL. Also, use the absolute path of the file:
import webbrowser, os
filename = 'index.html'
webbrowser.open('file://' + os.path.realpath(filename))
Convert the filename to a URL using urllib.pathname2url:
import os
import webbrowser
try:
    from urllib import pathname2url  # Python 2.x
except ImportError:
    from urllib.request import pathname2url  # Python 3.x
url = 'file:{}'.format(pathname2url(os.path.abspath('1.html')))
webbrowser.open(url)
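On Python 3 only, pathlib offers a shorter route to the same file URL: Path.resolve() makes the path absolute and Path.as_uri() adds the file:// prefix and percent-encodes characters such as spaces. A minimal sketch, with file_url as a made-up helper name:

```python
from pathlib import Path


def file_url(path):
    """Turn a local path (relative or absolute) into a file:// URL."""
    return Path(path).resolve().as_uri()
```

The result can be passed straight to the browser, for example webbrowser.open(file_url("index.html")).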

can't open file with browser

I wrote this web crawler program that accesses a website and then writes the output to an HTML file.
I have a problem with the following, though: I am not able to open the output file with the web browser. However, I can open URLs with the webbrowser module. Is it possible to open files using this method? If yes, how exactly can I do it?
import urllib
import webbrowser
f = open('/Users/kyle/Desktop/html_test.html', 'w')
u=urllib.urlopen('http://www.ebay.com')
f.write(u.read())
f.close()
webbrowser.open_new('/Users/kyle/Desktop/html_test.html')
If you are using Python 3, you should use urllib.request:
from urllib import request
import webbrowser
filename = '/Users/kyle/Desktop/html_test.html'
u = request.urlopen('http://www.ebay.com')
with open(filename, 'wb') as f:  # notice the 'b' here
    f.write(u.read())
# Prefix the absolute path with file:// so the browser treats it as a local file.
webbrowser.open_new('file://' + filename)
