How to make screenshot from web page? [duplicate] - python

How can I take a screenshot of any URL (web page)?
I tried:
from .ghost import Ghost
ghost = Ghost(wait_timeout=4)
ghost.open('http://www.google.com')
ghost.capture_to('screen_shot.png')
Result:
No module named '__main__.ghost'; '__main__' is not a package
I also tried the approaches from these questions:
Python Webkit making web-site screenshots using virtual framebuffer
Take screenshot of multiple URLs using selenium (python)
Fastest way to take a screenshot with python on windows
Take a screenshot of open website in python script
I've also tried other methods that are not listed here.
Nothing worked: every attempt ended in either an error or a missing module.
I'm tired. Is there an easy way to take a screenshot of a web page using Python 3.x?
upd1:
C:\prg\PY\PUMA\tests>py save-web-html.py
Traceback (most recent call last):
File "save-web-html.py", line 2, in <module>
from .ghost import Ghost
ModuleNotFoundError: No module named '__main__.ghost'; '__main__' is not a package
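For context on the traceback above: a leading dot makes the import relative to the package that contains the current module, and a file run directly as a script has no parent package. The sketch below reproduces that failure in-process by executing the import in a namespace named __main__. The helper name try_import is made up for illustration, and the exact error text varies between Python versions.

```python
def try_import(statement, module_name="__main__"):
    """Run an import statement as if it lived in a module called module_name."""
    # No __package__ and no __path__: the same situation as a script run directly.
    namespace = {"__name__": module_name, "__package__": None}
    try:
        exec(statement, namespace)
    except ImportError as exc:  # ModuleNotFoundError is a subclass of ImportError
        return "failed: {}".format(exc)
    return "ok"


# Relative import: there is no parent package to resolve the leading dot against.
relative_result = try_import("from .ghost import Ghost")
# Absolute import of an installed module works from a plain script.
absolute_result = try_import("from json import loads")

print(relative_result)
print(absolute_result)
```

If the ghost package is installed in site-packages, the absolute form (from ghost import Ghost) avoids this particular error.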
upd2:
C:\prg\PY\PUMA\tests>py save-web-html.py
Exception ignored in: <bound method Ghost.__del__ of <ghost.ghost.Ghost object at 0x0000020A169CF860>>
Traceback (most recent call last):
File "C:\Users\Coar\AppData\Local\Programs\Python\Python36\lib\site-packages\ghost\ghost.py", line 325, in __del__
self.exit()
File "C:\Users\Coar\AppData\Local\Programs\Python\Python36\lib\site-packages\ghost\ghost.py", line 315, in exit
self._app.quit()
AttributeError: 'NoneType' object has no attribute 'quit'
Traceback (most recent call last):
File "save-web-html.py", line 4, in <module>
ghost = Ghost(wait_timeout=4)
TypeError: __init__() got an unexpected keyword argument 'wait_timeout'

In the early days of the web this might have been a simple task: just render some HTML to an image instead of the screen.
But these days web pages rely on client-side execution to build parts of their DOM and re-render in response to client-initiated AJAX (or equivalent) requests; it's the whole "web 2.0" thing.
Rendering a site such as http://google.com from a single HTML response should be easy, but rendering something like https://www.facebook.com/ or https://www.kogan.com/ involves a lot of back-and-forth communication before the page shows what you expect to see.
So restricting this to a pure Python solution may not be feasible; I'm not aware of a Python-based browser.
Consider running a separate service to take the screenshots, and have your core application (in Python) fetch the screenshots it needs from that service.
I just tried a few with Docker; many of them struggle with HTTPS and the aforementioned AJAX behaviour.
earlyclaim/docker-manet appears to work (demo page).
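A minimal sketch of the "separate service" idea, assuming a screenshot service (such as a manet container) is already listening on http://localhost:8891 and accepts the page address in a url query parameter. The port, the parameter name, and both function names are assumptions for illustration; check the documentation of whichever service you actually run.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def build_screenshot_url(service_base, page_url, **options):
    """Compose the request URL for a (hypothetical) screenshot service."""
    query = {"url": page_url}
    query.update(options)  # any extra options are passed through as query parameters
    return service_base.rstrip("/") + "/?" + urlencode(query)


def fetch_screenshot(service_base, page_url, destination, timeout=30):
    """Ask the service for a screenshot and save the returned bytes to disk."""
    request_url = build_screenshot_url(service_base, page_url)
    with urlopen(request_url, timeout=timeout) as response:
        data = response.read()
    with open(destination, "wb") as handle:
        handle.write(data)
    return len(data)


# Building the URL needs no network access, so it is safe to try anywhere.
example_url = build_screenshot_url("http://localhost:8891", "https://www.google.com")
print(example_url)
```

With a service running, fetch_screenshot("http://localhost:8891", "https://www.google.com", "screen_shot.png") would write the image next to the script.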
Edit: from your comments, what you need is the data behind a graph that is rendered using a second request.
You just need the JSON returned by https://www.minnowbooster.net/limit/chart:
try:
    from urllib.request import urlopen  # py3
except ImportError:
    from urllib2 import urlopen  # py2
import json

url = 'https://www.minnowbooster.net/limit/chart'
response = urlopen(url)
data_str = response.read().decode()
data = json.loads(data_str)
print(data)
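The same idea can be split into a fetch step and a parse step so the parsing can be tried without touching the network. This is only a sketch for Python 3: the function names are invented here, and the sample payload is made up and does not reflect what the chart endpoint really returns.

```python
import json
from urllib.request import urlopen


def parse_json_bytes(raw, encoding="utf-8"):
    """Decode a raw HTTP body and parse it as JSON."""
    return json.loads(raw.decode(encoding))


def fetch_json(url, timeout=30):
    """Download a URL and return the parsed JSON document."""
    with urlopen(url, timeout=timeout) as response:
        return parse_json_bytes(response.read())


# Made-up payload, used only to exercise the parsing step offline.
sample = b'{"points": [1, 2, 3]}'
parsed = parse_json_bytes(sample)
print(parsed)
```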

Related

Python 3.6.3: Multiple "Exception ignored in: <generator object..." in flask app

I'm running a flask app, upgraded everything from Python 2.7 to 3 about 5 months ago.
Most things have gone smooth enough, other than this one that's consistently bugging me locally. I'm on a MacBook on OSX 10.12.6, and a brew install of Python 3.6.3 under a virtualenv.
When a request comes in from a page that triggers multiple static requests (.css, .js, and image files mainly), I can get this error just about anywhere in my code that uses generators.
Some examples (request is a flask.request object):
A place that checks whether a path starts with '/static' or '/admin/static' (my code):
any(request.path.startswith(k) for k in self._static_paths)
Exception ignored in: <generator object CustomPrincipal._is_static_route.<locals>.<genexpr> at 0x11450f3b8>
Traceback (most recent call last):
File "/Developer/repos/git/betapilibs/lbbsports/flask_monkeypatches.py", line 22, in <genexpr>
any(_checker(request.path, k) for k in self._static_paths)
SystemError: error return without exception set
If a URL path is restricted, a check that the logged-in user has the proper permissions / role:
return role in (role.name for role in self.roles)
Exception ignored in: <generator object UserMixin.has_role.<locals>.<genexpr> at 0x1155a7e08>
Traceback (most recent call last):
File "/Developer/virtualenvs/lbb3/lib/python3.6/site-packages/flask_security/core.py", line 386, in <genexpr>
SystemError: error return without exception set
A custom bit of code to ensure their "sub" account id is valid:
(not any(ident == account_id for ident in account_ids))
Exception ignored in: <generator object CustomSession.get_set_accounts.<locals>.<genexpr> at 0x115ff4fc0>
Traceback (most recent call last):
File "/Developer/customflask/flasklogin.py", line 168, in <genexpr>
SystemError: error return without exception set
Nothing actually seems to break in the system; I just get these error messages, and only intermittently. If I set a breakpoint at any of the places where the errors are reported, the errors stop appearing.
If, in the first example, I break the check into request.path.startswith('/static') or request.path.startswith('/admin/static'), I no longer get the error message. In general, I never have a problem using request anywhere else in the app.
One thing that was wrong in my local development setup was that I was serving everything under /static and /admin/static through the Flask app instead of through the web server (in my case, nginx). So for some of the URLs I was hitting, as many as 10 requests might come in at basically the same time, with Flask in debug mode and a debugger attached as well (via PyCharm).
Once I made sure that everything under '/static' and '/admin/static' was served by nginx instead of Flask, so that Flask received only one request per URL, the problem went away.
I won't mark this as the answer, because there is still an underlying issue, but in case others have the same problem, this was the solution in my situation.
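As a side note on the first example, str.startswith accepts a tuple of prefixes, so that particular check can be written without a generator expression at all. This does not address the underlying SystemError; it is only a sketch of an equivalent check, and the names STATIC_PREFIXES and is_static_route are chosen here for illustration.

```python
STATIC_PREFIXES = ("/static", "/admin/static")


def is_static_route(path, prefixes=STATIC_PREFIXES):
    """Return True if the path starts with any of the given prefixes."""
    # startswith() takes a tuple, so no generator expression is needed.
    return path.startswith(tuple(prefixes))


print(is_static_route("/static/css/site.css"))
print(is_static_route("/admin/users"))
```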

netCDF4 - Python error

Can anyone tell me what I did wrong? I am using Python from conda, and the files come from http://meop40.troja.mff.cuni.cz:11180/gw.projekt/data.stratopauza/netcdf.profily/
Why does it tell me that the file doesn't exist?
>>> import netCDF4
>>> import pandas as pd
>>> import matplotlib.pyplot as plt
>>> url = 'http://meop40.troja.mff.cuni.cz:11180/gw.projekt/data.stratopauza/netcdf.profily/atmPrf_C001.2010.227.00.03.G04_2013.3520_nc'
>>> nc = netCDF4.Dataset(url)
syntax error, unexpected WORD_WORD, expecting SCAN_ATTR or SCAN_DATASET or SCAN_ERROR
context: <!DOCTYPE^ HTML PUBLIC "-//IETF//DTD HTML 2.0//EN"><html><head><title>404 Not Found</title></head><body><h1>Not Found</h1><p>The requested URL /gw.projekt/data.stratopauza/netcdf.profily/atmPrf_C001.2010.227.00.03.G04_2013.3520_nc.dds was not found on this server.</p><hr><address>Apache/2.4.12 (Ubuntu) Server at meop40.troja.mff.cuni.cz Port 11180</address></body></html>
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "netCDF4\_netCDF4.pyx", line 1811, in netCDF4._netCDF4.Dataset.__init__ (netCDF4\_netCDF4.c:12626)
IOError: NetCDF: file not found
netCDF4.Dataset() can only access remote NetCDF files that are served by an OPeNDAP service, which can return metadata about the file. The URL here points at a plain HTTP server, so the metadata request (the .dds URL in the 404 page above) fails, and the resulting "file not found" message is misleading.
There is a brief tutorial that mentions this and gives basic information at: http://unidata.github.io/netcdf4-python/#section1
I downloaded the file and had no problem opening it. You should use the method in the answer to your previous question: https://stackoverflow.com/a/44622713/1211981
Update:
Go to:
http://meop40.troja.mff.cuni.cz:11180/gw.projekt/data.stratopauza/netcdf.profily/
Click one or more of the links and save to a folder where you will run your script. Change your script or python commands to:
>>> url = 'atmPrf_C001.2010.227.00.03.G04_2013.3520_nc'
>>> nc = netCDF4.Dataset(url)
netCDF4.Dataset() accepts either an OPeNDAP URL or a local file name and works the same way with both. In this case it will recognize the downloaded file as a valid NetCDF file.
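The manual download step can also be scripted. The sketch below assumes Python 3 and that the server hands out the file over plain HTTP, as it did in the browser; the helper names are invented for illustration, and netCDF4 is imported only inside the function that needs it.

```python
import os
from urllib.parse import urlparse
from urllib.request import urlretrieve


def local_name_for(url):
    """Derive a local file name from the last path component of the URL."""
    return os.path.basename(urlparse(url).path)


def download_then_open(url, folder="."):
    """Download the file once, then open the local copy with netCDF4."""
    import netCDF4  # third-party; imported here so the helper above works without it

    target = os.path.join(folder, local_name_for(url))
    if not os.path.exists(target):
        urlretrieve(url, target)  # plain HTTP download, no OPeNDAP involved
    return netCDF4.Dataset(target)


url = ('http://meop40.troja.mff.cuni.cz:11180/gw.projekt/data.stratopauza/'
       'netcdf.profily/atmPrf_C001.2010.227.00.03.G04_2013.3520_nc')
print(local_name_for(url))
```

Calling download_then_open(url) would then return the opened dataset, provided the server is reachable.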

uTorrent Automation using pywinauto

I am trying to automate uTorrent using the pywinauto library. I want to add a torrent from a URL; this option is under the File menu. I can get as far as opening uTorrent, and then nothing happens. I used SWAPY to generate this code. The dialog box opens only when I run the code in SWAPY. When I save it to a file and run it from cmd, only uTorrent opens and a traceback appears in the console.
from pywinauto.application import Application
app = Application().Start(cmd_line=u'"C:\\Users\\User\\AppData\\Roaming\\uTorrent\\u Torrent.exe" ')
torrentdfb = app[u'\xb5Torrent4823DF041B09']
torrentdfb.Wait('ready')
menu_item = torrentdfb.MenuItem(u'&File->Add Torrent from &URL...\tCtrl+U')
menu_item.Click()
app.Kill_()
Traceback:
Traceback (most recent call last):
File "AddTorrent.py", line 5, in <module>
torrentdfb.Wait('ready')
File "C:\Python27\lib\site-packages\pywinauto\application.py", line 380, in Wait
WaitUntil(timeout, retry_interval, lambda: self.__check_all_conditions(check_method_names))
File "C:\Python27\lib\site-packages\pywinauto\timings.py", line 308, in WaitUntil
raise err
pywinauto.timings.TimeoutError: timed out
I am new to Python and not an expert. It would be helpful if you could explain how to solve my problem or provide the code. Thanks!
uTorrent spawns another process, and the main window belongs to that new process. This is how I found out:
>>> app.windows_()
[]
>>> app.process
6096
>>> app.connect(title_re=u'^μTorrent.*(build \d+).*')
<pywinauto.application.Application object at 0x000000000405C240>
>>> app.process
4044L
This is the final code that works for me (with 32-bit uTorrent and 32-bit Python 2.7):
import time
import pywinauto
# Launch uTorrent; the visible window belongs to a process it spawns
app = pywinauto.Application().start(r'uTorrent.exe')
time.sleep(5) # because method connect() has no timeout param yet (planned for 0.6.0)
# Re-attach to the process that actually owns the main window
app.connect(title_re=u'^\u03bcTorrent.*(build \d+).*')
main_window = app.window_(title_re=u'^\u03bcTorrent.*(build \d+).*')
main_window.MenuSelect(u'&File->Add Torrent from &URL...\tCtrl+U')
# Fill in the "Add Torrent from URL" dialog and confirm
app.AddTorrentFromURL.Edit.SetText('some URL')
app.AddTorrentFromURL.OK.Click()
Bitness is important. 32-bit uTorrent crashes if I use 64-bit Python.
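Since bitness matters here, it can help to confirm which interpreter is actually running the script. A small sketch using only the standard library (the function name is made up for illustration): the size of a C pointer tells you whether the Python build is 32-bit or 64-bit.

```python
import struct
import sys


def python_bitness():
    """Return 32 or 64 depending on the pointer size of this interpreter."""
    return struct.calcsize("P") * 8


bits = python_bitness()
print("Running {}-bit Python {}".format(bits, sys.version.split()[0]))
```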

Python urllib won't download file due to permissions, but wget will

I'm trying to download an MP3 file, via its URL, using Python's urllib2.
mp3file = urllib2.urlopen(url)
output = open(dst,'wb')
output.write(mp3file.read())
output.close()
I'm getting a urllib2.HTTPError: HTTP Error 403: Forbidden error.
Trying urllib also fails, but silently.
urllib.urlretrieve(url, dst)
However, if I use wget, I can download the file successfully.
I've noted the general differences between the two methods mentioned in "Difference between Python urllib.urlretrieve() and wget", but they don't seem to apply here.
Is wget doing something to negotiate permissions that urllib2 doesn't do? If so, what, and how do I replicate this in urllib2?
It could be something on the server side, for example blocking the Python user agent. Try using the wget user agent: Wget/1.13.4 (linux-gnu).
In Python 2:
import urllib

# Change header for User-Agent
class AppURLopener(urllib.FancyURLopener):
    version = "Wget/1.13.4 (linux-gnu)"

url = "http://www.example.com/test_file"
fname = "test_file"
urllib._urlopener = AppURLopener()
urllib.urlretrieve(url, fname)
The above didn't work for me (I'm using python3.5). wget works fine.
It's not (I assume) a huge problem for me - surely I can still do a system() and use wget to get the data, with some file renaming and munging.
But in case anyone else is suffering from the same problem, these are the errors I get from the above snippet:
Traceback (most recent call last):
File "./mksynt.py", line 10, in <module>
class AppURLopener(urllib.FancyURLopener):
AttributeError: module 'urllib' has no attribute 'FancyURLopener'
I see that the original answer was only promised to work in python2.
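For Python 3, a rough equivalent of the same idea is to pass a User-Agent header through urllib.request.Request. This is a sketch rather than a guaranteed fix: it only helps if the 403 really is caused by user-agent filtering, and the function names and the example URL are placeholders.

```python
import shutil
from urllib.request import Request, urlopen

WGET_USER_AGENT = "Wget/1.13.4 (linux-gnu)"


def build_request(url, user_agent=WGET_USER_AGENT):
    """Create a request that announces itself with the given User-Agent."""
    return Request(url, headers={"User-Agent": user_agent})


def download(url, destination, user_agent=WGET_USER_AGENT, timeout=30):
    """Stream the response body into a local file."""
    request = build_request(url, user_agent)
    with urlopen(request, timeout=timeout) as response, open(destination, "wb") as out:
        shutil.copyfileobj(response, out)


# Building the request does not touch the network.
request = build_request("http://www.example.com/test_file")
print(request.get_header("User-agent"))
```

Calling download("http://www.example.com/test_file", "test_file") would then perform the actual transfer.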

How to interact with pynessus

I am using http://code.google.com/p/pynessus/ so that I can interact with Nessus from Python, but I run into problems trying to connect to the server. I am not sure what I need to set pynessus to.
I try connecting to the server using the following syntax as directed by the documentation on the site but I receive the following error:
n = pynessus.NessusServer(localhost, 8834, root, password123)
Error:
root@bt:~/Desktop# ./nessus.py
Traceback (most recent call last):
File "./nessus.py", line 634, in
n = pynessus.NessusServer(localhost, 8834, root, password123)
NameError: name 'pynessus' is not defined
The problem is that you didn't import the pynessus module. To solve this problem, simply place the downloaded pynessus.py in the same folder as your Python script and add the line
import pynessus
at the top of that script. You can reference the pynessus library in your script only after that line.
