How to extract image content in PyQt WebKit - python

I am writing an image scraper with pycurl. It sends forged requests that reproduce what an HTTP analyzer shows the browser sending to the website's server.
This site requires several steps of interaction before it finally responds with the image content. First I have to open the link with pycurl and get the gzip-compressed response, which contains the HTML content. The request for the image is then sent by the site's JavaScript code. The server generates the image with a DLL according to the request.
I can already get the images by identifying the response content. However, I find it very tedious that I have to change my code every time the website changes the querying steps, so I want to interact with this website through PyQt4.WebKit, as a browser would.
How can I extract the specific image content in PyQt4.WebKit?
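The two steps described in the question (decoding a gzip-compressed response and recognising which response carries the image) can be sketched independently of pycurl or PyQt4. This is only a minimal Python 3 sketch: the helper names `decode_body` and `sniff_image_type` and the signature table are assumptions made for illustration, not part of either library.

```python
import gzip

# Leading "magic" bytes of a few common image formats (illustrative subset).
IMAGE_SIGNATURES = {
    b"\x89PNG\r\n\x1a\n": "png",
    b"\xff\xd8\xff": "jpeg",
    b"GIF87a": "gif",
    b"GIF89a": "gif",
    b"BM": "bmp",
}


def decode_body(body, content_encoding=""):
    """Return the raw response bytes, undoing gzip if the header declares it."""
    if content_encoding.strip().lower() == "gzip":
        return gzip.decompress(body)
    return body


def sniff_image_type(data):
    """Return a format name if the bytes start like a known image, else None."""
    for signature, name in IMAGE_SIGNATURES.items():
        if data.startswith(signature):
            return name
    return None
```

Whatever component captures the responses (pycurl today, a browser engine later) could pass each body through these two helpers and keep only the ones for which `sniff_image_type` returns a format name.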

Related

Retrieving images loaded in the browser

The following code works great for fetching static images:
import urllib
urllib.urlretrieve({SOME_URL}, "00000001.jpg")
I'd however like to fetch all media that I can load on WhatsApp Web. For instance, looking at the source of a conversation that's fully loaded on my screen, I can see the image URLs such as blob:https://web.whatsapp.com/a2e9249a-365c-4ce7-a3ce-795be018400e – that link/image can actually be opened in a separate browser tab.
If I replace {SOME_URL} with that link, Python doesn't recognise blob as a valid HTTP request. I have kept my browser with WhatsApp Web loaded in the background as I assume those links may actually change every time I load the conversation. But the first problem is trying to find a urlretrieve equivalent. Any thoughts? Thank you!
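A blob: URL refers to an object held in memory by the page that created it, so an HTTP client outside the browser has nothing to request. One possible workaround, assuming you can run a script inside the loaded page (for example through a browser automation tool) and have it return the blob as a base64 data: URL, is to decode that string in Python. The sketch below (Python 3) covers only the decoding half; `decode_data_url` is a hypothetical helper name.

```python
import base64
from urllib.parse import unquote_to_bytes


def decode_data_url(data_url):
    """Split a data: URL into (mime_type, raw_bytes)."""
    if not data_url.startswith("data:"):
        raise ValueError("not a data: URL")
    # Everything before the first comma is metadata, the rest is the payload.
    header, _, payload = data_url[len("data:"):].partition(",")
    parts = header.split(";")
    mime_type = parts[0] or "text/plain"
    if "base64" in parts[1:]:
        return mime_type, base64.b64decode(payload)
    # Without the base64 flag the payload is percent-encoded.
    return mime_type, unquote_to_bytes(payload)
```

The returned bytes can then be written to a file opened in binary mode, which plays the role `urlretrieve` played for ordinary URLs.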

Is there a way to display an image in the body of an email using an image URL

I'm working on a personal web crawler project where I want it to be able to send emails as notifications with an image inside the body of an email. I see plenty of documentation on how to download and then display an image using smtplib. However, I do not see any examples of displaying an image without downloading it.
I was wondering if it's possible to display an image inside the body of an smtplib email using only the image URL without needing to download the image. (You can use an image URL to put an image in an email body using gmail, so I was assuming it might be possible via automated python script.)
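One way this could look, as a minimal Python 3 sketch: the message carries an HTML part whose `<img>` tag points at the remote URL, so nothing is downloaded by the script. The function name and the HTML layout are assumptions for illustration, and many mail clients block remote images until the reader allows them.

```python
from email.message import EmailMessage
from html import escape


def build_linked_image_email(sender, recipient, subject, image_url):
    """Build a message whose HTML body references the image by URL."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    # Plain-text fallback for clients that do not render HTML.
    msg.set_content("New image: " + image_url)
    body = (
        "<html><body><p>New image:</p>"
        '<img src="{}" alt="image"></body></html>'
    ).format(escape(image_url, quote=True))
    msg.add_alternative(body, subtype="html")
    return msg
```

The resulting object can be handed to `smtplib.SMTP(...).send_message(msg)` with whatever server details apply to your setup.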

Getting html of Facebook page using urllib2

I am writing a Python script that can take a Facebook URL and locally save an HTML file of that Facebook page. Based on the answer to this question: Inherent way to save web page source
I tried using urllib2, but the resulting HTML file is different (missing some parts) compared to the HTML file that I get by manually right-clicking on the Facebook page and saving the entire webpage. Do you know why they would be different and what other Python libraries I could use instead of urllib2?

Sending html email with image: absolute url or attach image?

I am sending HTML email with several images in it. My project is written in Python and Django and invokes a web service to send the email (I can pass the HTML and attachments to the web service). The web service lives in another part of the project, is implemented in Java, and uses Amazon SES.
Which is the best approach for the html images?
To store them on my web server and link them with absolute URLs or to send the images as attachments and embed them in the html?
Do all email clients support absolute URLs for images?
I would suggest attaching the images for clarity, but if you really want to save bandwidth and you send lots of emails, it's better to send the URL, provided you are sure it won't change.
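For the attachment option, a minimal Python 3 sketch with the standard `email` package is shown below: the image travels inside the message and the HTML refers to it through a `cid:` reference. The project in the question sends mail through a separate Java service, so this only illustrates the message structure; the function name and the `example.com` domain passed to `make_msgid` are placeholders.

```python
from email.message import EmailMessage
from email.utils import make_msgid


def build_embedded_image_email(sender, recipient, subject, image_bytes,
                               subtype="png"):
    """Build a message with the image embedded inline via a Content-ID."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content("This message contains an inline image.")
    # make_msgid returns "<id@domain>"; the HTML needs it without brackets.
    cid = make_msgid(domain="example.com")  # placeholder domain
    body = '<html><body><img src="cid:{}" alt="inline image"></body></html>'
    msg.add_alternative(body.format(cid[1:-1]), subtype="html")
    # Attach the image to the HTML part so both form a multipart/related unit.
    html_part = msg.get_payload()[1]
    html_part.add_related(image_bytes, maintype="image", subtype=subtype,
                          cid=cid)
    return msg
```

The trade-off matches the answer above: embedded images make each message larger, while linked images depend on the URL staying valid.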

Link to force download of external image with Python web app

I've got a Python (web2py) web app that generates QR codes using the Google Chart API. The app displays the QR code on the screen, and I want to offer a link to download it. Considering the images are not on my server, what are my options?
Example image url:
https://chart.googleapis.com/chart?chs=150x150&cht=qr&chl=Hello%20world
EDIT:
I've seen mention of using the header Content-disposition: attachment; to force a download dialog. Does anyone know if this header can be applied to external resources?
Download QR code here
Edit: If the browser of your user is able to handle the MIME type Google sends for the image, the browser will handle it. There is not much you can do about this, which is a Good Thing™.
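Since the Content-Disposition header is set by whichever server sends the response, one option is to relay the image through your own app and add the header there. The following is a framework-neutral WSGI sketch in Python 3, not web2py code: `make_download_app`, the injected `fetch` callable, and the default filename are all assumptions for illustration.

```python
def make_download_app(fetch, filename="qrcode.png"):
    """Build a WSGI app that relays an external image as a download.

    `fetch` is any callable returning (content_type, body_bytes), for
    example a function that requests the chart URL from the question.
    """
    def app(environ, start_response):
        content_type, body = fetch()
        headers = [
            ("Content-Type", content_type),
            ("Content-Length", str(len(body))),
            # This header asks the browser to save instead of display.
            ("Content-Disposition",
             'attachment; filename="{}"'.format(filename)),
        ]
        start_response("200 OK", headers)
        return [body]
    return app
```

The download link would then point at the route served by this app rather than at the external image URL.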
