Screenshot of flash element in Python

How can I take a screenshot of a Flash website in Python 3.5.1? I'm trying something like this, but I can't see the video image.
from selenium import webdriver

def webshot(url, filename):
    browser = webdriver.PhantomJS()
    browser.get(url)
    browser.save_screenshot(filename)
    browser.quit()

webshot('https://www.youtube.com/watch?v=YQHsXMglC9A', 'screentest.png')

Short version: With YouTube, no video is served until playback is initiated, for example by pressing the "play" button. Loading the page in a browser is a form of initiating playback too. A webshot, however, doesn't fulfill the requirements of YouTube's servers, so it won't work.
Long version:
How can I take a screenshot of a Flash website... I tried this but I
can't see video image.
webshot('https://www.youtube.com/watch?v=YQHsXMglC9A', 'screentest.png')
You cannot screenshot YouTube's video player content like this. When the video page is ready, another PHP file is accessed to determine the video link (e.g. the correct file for the chosen quality settings). Basically, you have to appear to be a browser making an HTTP request to their servers. Their server gives you a temporary token to access the video link until the token expires. There are other issues to deal with, such as CORS. Your tool does none of these things.
If only YouTube used a normal <video> tag with a simple MP4 link, your code would have worked.
The best you can get is the embedded player (see how there are no controls?), using:
webshot('https://www.youtube.com/embed/YQHsXMglC9A', 'screentest.png')
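If you need this for several videos, the watch URL can be turned into the embed URL programmatically. The helper below is only a sketch: the name `to_embed_url` is made up for illustration, and it assumes the video ID is carried in the `v` query parameter, as in the URL from the question.

```python
from urllib.parse import urlparse, parse_qs


def to_embed_url(watch_url):
    """Turn a YouTube watch URL into the matching /embed/ URL."""
    parsed = urlparse(watch_url)
    # parse_qs returns a dict of lists, e.g. {'v': ['YQHsXMglC9A']}
    video_ids = parse_qs(parsed.query).get("v")
    if not video_ids:
        raise ValueError("no 'v' parameter in URL: %r" % watch_url)
    return "https://www.youtube.com/embed/" + video_ids[0]


print(to_embed_url("https://www.youtube.com/watch?v=YQHsXMglC9A"))
# prints https://www.youtube.com/embed/YQHsXMglC9A
```

The result can be passed straight to `webshot`. It does not change the limitation described above: the screenshot still shows the player, not a frame of the video.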

Related

How to download a video when I get the URL of the MP4 file in selenium python? (WITHOUT URLLIB)

I can get to the point where the video is right in front of me. I need to loop through the urls and download all of these videos. The issue is that the request is stateful and I cannot use urllib because then authorization issues occur. How do I just target the three dots in chrome video viewer and download the file?
All I need now is to be able to download by clicking on the download button. I do not know if it can be done without the specification of coordinates. Like I said, the urls follow a pattern and I can generate them. The only issue is the authorization. Please help me get the videos through selenium.
Note that the video is in JavaScript so I cannot really target the three dots or the download button.
You can get the cookies from the driver and pass them to a Requests session. That way, you can download the file with the Requests library.
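import requests

# Copy the browser's cookies into a Requests session
cookies = driver.get_cookies()
s = requests.Session()
for cookie in cookies:
    s.cookies.set(cookie['name'], cookie['value'])

# urlDownload and fileName are placeholders for your video URL and output path
response = s.get(urlDownload, stream=True)
print(response.status_code)
with open(fileName, 'wb') as f:
    # Write in chunks so a large video is not held in memory all at once
    for chunk in response.iter_content(chunk_size=8192):
        f.write(chunk)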
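The loop above copies every cookie the browser holds, whatever site it belongs to. If the browser session has visited more than one site, you may prefer to send only the cookies that match the download host. The following is a rough sketch of such a filter. The function name `cookies_for_host` is hypothetical, and it assumes each entry from `driver.get_cookies()` is a dict with `name`, `value`, and usually `domain` keys.

```python
def cookies_for_host(selenium_cookies, host):
    """Pick the cookies from driver.get_cookies() that apply to `host`.

    Returns a plain {name: value} dict.
    """
    selected = {}
    for cookie in selenium_cookies:
        # Domains may be stored with a leading dot, e.g. ".example.com"
        domain = cookie.get("domain", "").lstrip(".")
        # Keep cookies with no domain, an exact match, or a parent domain
        if not domain or host == domain or host.endswith("." + domain):
            selected[cookie["name"]] = cookie["value"]
    return selected


# Example with made-up cookie data
sample = [
    {"name": "session", "value": "abc", "domain": ".example.com"},
    {"name": "tracker", "value": "xyz", "domain": "ads.other.net"},
]
print(cookies_for_host(sample, "videos.example.com"))
# prints {'session': 'abc'}
```

The returned dict can be passed as the `cookies=` argument of `s.get(...)` instead of setting each cookie on the session.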
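You can use Selenium in Python 2. Since you only posted pictures, I can't give you real code, but something like this should help. You can find the XPath by inspecting the HTML.
from selenium.webdriver.common.by import By
# 'xpath of 3 dots' is a placeholder; newer Selenium releases use find_element(By.XPATH, ...) instead of find_element_by_xpath
driver.find_element(By.XPATH, 'xpath of 3 dots')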

How to listen for an AUDIO file using Python and Selenium Webdriver

Full disclosure, I'm fairly new to Python and Selenium Webdriver.
I am expanding a small automation project that I've been working on. I have an in-browser chat window that, when a message is received, a file titled 'chime.mp3' will play to notify the user of a new message.
To verify that this audio file has played, I need to check that the file was requested via an HTTP request, i.e. https://usr/bin/webapps/chime.mp3
Is there a module I could import in python, or a webdriver technique to listen for this file being requested?
Or is there a known way to verify audio files play on a certain event?
Any help is much appreciated.
To the best of my knowledge, there is no built-in functionality in Selenium (or Python) that you could use for that.
While you can't easily verify that the sound has actually been played, you can easily verify whether your site has instructed the browser to play the file. Most likely, your site uses the HTML5 audio tag to play the file, and you can check for that:
Example (Java)
WebElement audio = driver.findElement(By.tagName("audio")); // make sure any audio tag is there
// ... or ...
WebElement audio = driver.findElement(By.xpath("//audio/source[contains(@src, 'chime.mp3')]/..")); // make sure an audio tag is there that refers to chime.mp3
assertTrue(Boolean.parseBoolean(audio.getAttribute("ended"))); // make sure the audio tag has played at least once
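Since the question is about Python, here is a rough Python counterpart of the same idea. It is a sketch, not a verified recipe: the function name `audio_was_played` is invented for illustration, and it assumes the page plays the chime through an `<audio>` element that is attached to the DOM. If the site creates the sound with `new Audio()` and never inserts it into the page, this check will not find it. It uses only `driver.execute_script`, so any Selenium WebDriver instance can be passed in.

```python
import time

# JavaScript evaluated in the page: true if any <audio> element whose source
# contains the given file name has started or finished playing.
_AUDIO_PLAYED_JS = """
var name = arguments[0];
return Array.from(document.querySelectorAll('audio')).some(function (a) {
    var src = a.currentSrc || a.src || '';
    return src.indexOf(name) !== -1 && (a.ended || a.currentTime > 0);
});
"""


def audio_was_played(driver, file_name, timeout=5.0, poll=0.25):
    """Poll the page until an <audio> element for file_name has played.

    Returns True as soon as the check succeeds, False after `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        if driver.execute_script(_AUDIO_PLAYED_JS, file_name):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)
```

In a test you would trigger the incoming chat message first and then call `assert audio_was_played(driver, 'chime.mp3')`.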

how to convert IP address into http for urllib

I'm looking to embark on a personal project: an application that saves docs, text, and images from the site my browser is currently on. After a lot of research, I have concluded that either of two approaches is possible for now: using cookies, or using a packet sniffer to identify the IP address (the packet sniffer method being more relevant at the moment).
I would like to automate the application so that I don't have to copy the URL from my browser and paste it into a script that uses urllib.
Are there any suggestions that experienced network programmers can provide with regards to the process or modules or libraries I need?
thanks so much
jonathan
If you want to download all images, docs, and text while you're actively browsing (which is probably a bad idea considering the sheer amount of bandwidth), then you'll want something more than urllib2. I assume you don't want to keep copying and pasting all the URLs into a script to download everything. If that is not the case, a simple urllib2 and BeautifulSoup filter would do wonders for you.
However if what I assume is correct then you are probably going to want to investigate selenium. From there you can launch a selenium window (defaults to Firefox) and then do your browsing normally. The best option from there is to continually poll the current url and if it is different identify all of the elements you want to download and then use urllib2 to download them. Since I don't know what you want to download I can't really help you on that part. However here is what something like that would look like in selenium:
from selenium import webdriver
from time import sleep

# Startup the web-browser
browser = webdriver.Firefox()
current_url = browser.current_url
while True:
    try:
        # If we have a url, identify and download your items
        if browser.current_url != current_url:
            # Download the stuff here
            current_url = browser.current_url
    # Triggered once you close the web-browser
    except:
        break
    # Sleep for half a second to avoid demolishing your machine from constant polling
    sleep(0.5)
Once again I advise against doing this, as constantly downloading images, text, and documents would take up a huge amount of space.
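If you go ahead anyway, the "Download the stuff here" step needs a list of URLs to fetch. As one possible illustration, the sketch below collects the image URLs from a page's HTML. It swaps in the standard library's `html.parser` for BeautifulSoup to stay dependency-free, the names `image_urls` and `_ImgCollector` are made up for this example, and it only looks at `<img src=...>` attributes.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class _ImgCollector(HTMLParser):
    """Collects the src attribute of every <img> tag it sees."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def image_urls(page_source, base_url):
    """Return absolute, de-duplicated image URLs in document order."""
    collector = _ImgCollector()
    collector.feed(page_source)
    seen = set()
    urls = []
    for src in collector.sources:
        # Resolve relative paths against the page's own URL
        absolute = urljoin(base_url, src)
        if absolute not in seen:
            seen.add(absolute)
            urls.append(absolute)
    return urls
```

Inside the polling loop this could be called as `image_urls(browser.page_source, browser.current_url)`, with each returned URL then handed to whatever download function you use.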

PopcornJS Youtube Player and Flask/Python

I'm trying to build a simple web application using PopcornJS in Python. I want to simply mute the volume of a youtube video and autoplay it but it seems to not be working/ there may be a bug in PopcornJS's youtube video player.
Here's my code:
<script>
var popcorn = Popcorn.youtube( "#video" , 'http://www.youtube.com/embed/XMW2lbNVaXY');
popcorn.volume(0);
popcorn.autoplay(true);
</script>
and the html
<div id="video"></div>
Seems like the volume control and the autoplay are not working. If I switch to a non youtube video everything works fine. Also if I look at my console during run time I see the following error
Blocked a frame with origin "http://www.youtube.com" from accessing
a frame with origin "http://127.0.0.1:5000". Protocols, domains,
and ports must match.
Googling the problem revealed that it might be a chrome bug, but I'm not sure. Anyone know what's wrong? Thanks!

How to download a generated captcha using mechanize?

I'm trying to build a sort-of client for a blog platform in my country, but the blog platform has an in-house built captcha generation.
The problem is that the CAPTCHA is built as such so that a new image is generated every time there is a GET request. So suppose the captcha image URL is this: http://example.com/randomcaptcha.aspx?someparams-that-are-always-the-same
Even when I open the above link in Firefox and hit refresh (which shows the JPG image only), I'm presented with a different image every time I refresh.
The problem arises because when mechanize downloads the entire web page, it also downloads the image during that request (or rather, it follows the randomcaptcha.aspx link). So when I try to download the image again, I need to issue another GET request to grab the image and download it - and at this moment the image has changed.
How would I solve this problem?
Thank you.
EDIT the code currently is this:
browser.open("http://www.example.com/registration.aspx") #this contains the randomcaptcha.aspx url in img src
#then we have a regex to find the url of the image, say the variable is url
with open("captcha.jpg", "wb") as file:
    file.write(browser.open_novisit(url).read())
At this time the downloaded captcha.jpg file is already different from the one presented on the registration page. I used the software called Fiddler to see that - there are definitely 2 GET requests being issued for the randomcaptcha.aspx url.
EDIT #2 Solved: My bad. The captcha URL was incorrect.
