How can I get the URL of images from the browser in a Django view - python

What should I do to allow Django to read an image URL sent from the browser?
I'm learning to build a Django app that extracts the dominant colors from images for Grasshopper, so that the results can appear in Rhino. The app needs to get images from the browser. Images downloaded to my PC work, but images referenced by a browser URL do not. Is there any Python library that could help with this problem?
jsob = {"clusters": 5, "path": 0}
if request.method == "POST":
    try:
        data = request.POST["data"]
        print(data)
        received = json.loads(str(data))
        jsob.update(received)
        path = jsob.get("path")
        clusters = int(jsob.get("clusters"))
        dc = DominantColors(path, clusters)
        colors = dc.dominantColors().tolist()
        print(colors)
        print(type(colors))
        results = {"colors": colors}
        return JsonResponse(results)
    except Exception as e:
        pass

Use the requests library instead of just json; its requests.get method takes the URL sent from the browser:
import requests
requests.get('your url goes here')
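
A minimal sketch of how the view could download an image from a URL posted by the browser and hand it to the color extractor. It assumes DominantColors(path, clusters) works on a local file path, as in the question, so the downloaded bytes are written to a temporary file first; the "path" key is reused here to carry the image URL:
import json
import tempfile

import requests
from django.http import JsonResponse

def dominant_colors_view(request):
    if request.method == "POST":
        received = json.loads(request.POST["data"])
        url = received.get("path")            # "path" now carries an image URL
        clusters = int(received.get("clusters", 5))

        resp = requests.get(url, timeout=10)
        resp.raise_for_status()               # fail loudly instead of a silent pass

        # Save the bytes to a temp file so code expecting a local path still works.
        with tempfile.NamedTemporaryFile(suffix=".jpg", delete=False) as tmp:
            tmp.write(resp.content)
            local_path = tmp.name

        dc = DominantColors(local_path, clusters)
        return JsonResponse({"colors": dc.dominantColors().tolist()})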

Related

In python, how do I get urllib to recognize multiple lines in a string as separate URLs?

I'm very new to code, so forgive any errors I make in explanation! I'm trying to write Python code that uses PRAW to access the /r/pics subreddit, scrape the source URLs and display them with urllib, cv2 and numpy.
Currently my code looks like this:
import praw
import numpy as np
import urllib.request
import cv2

# urllib set-up
def reddit_scrape(url):
    resp = urllib.request.urlopen(url)
    image = np.asarray(bytearray(resp.read()), dtype="uint8")
    image = cv2.imdecode(image, cv2.IMREAD_COLOR)
    return image

# reddit set-up
reddit = praw.Reddit(client_id='id',
                     client_secret='secret',
                     user_agent='agent')
subreddit = reddit.subreddit('pics')
hot_pics = subreddit.hot(limit=10)
for submission in hot_pics:
    if not submission.stickied:
        print(submission.url)

# print images
urls = [submission.url]
for url in urls:
    image = reddit_scrape(url)
    cv2.imshow('image', image)
    cv2.waitKey(0)
My problem when I run this is that although the print(submission.url) line prints a full list of the top 10 posts, only the last url on the list is actually opened and displayed.
My guess is that the error lies somewhere in my definition of
urls = [submission.url]
But I can't define 'urls' to be a static list of urls, because the hot list changes over time.
What am I doing wrong? Is there even a right way to do this? Any help would be greatly appreciated.
After your for loop finishes, submission is whatever the last submission was, so when you say urls = [submission.url] you're only getting the last URL. Instead of constructing urls outside the loop, create a list and append to it inside the loop:
urls = []
for submission in hot_pics:
    if not submission.stickied:
        urls.append(submission.url)
Or even the more Pythonic:
urls = [submission.url for submission in hot_pics if not submission.stickied]
Then the for url in urls will loop through all the appended urls.
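
Putting the fix together with the display code from the question, the whole flow might look like this sketch (same assumptions as the question: valid PRAW credentials and direct image links):
urls = [submission.url for submission in hot_pics if not submission.stickied]

for url in urls:
    image = reddit_scrape(url)     # defined in the question
    if image is not None:          # cv2.imdecode returns None for non-image data
        cv2.imshow('image', image)
        cv2.waitKey(0)
cv2.destroyAllWindows()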

How to retrieve an image from redis in python flask

I am trying to store an image in Redis, retrieve it, and send it to an HTML template. I am able to cache the image, but I don't know how to retrieve it and send it to the template. This is the part of my code which does the caching and retrieving.
from urllib2 import Request, urlopen
import json
import redis
import urlparse
import os
from StringIO import StringIO
import requests
from PIL import Image
from flask import send_file

REDIS_URL = urlparse.urlparse(os.environ.get('REDISCLOUD_URL', 'redis://:#localhost:6379/'))
r = redis.StrictRedis(
    host=REDIS_URL.hostname, port=REDIS_URL.port,
    password=REDIS_URL.password)

class MovieInfo(object):
    def __init__(self, movie):
        self.movie_name = movie.replace(" ", "+")

    def get_movie_info(self):
        url = 'http://www.omdbapi.com/?t=' + self.movie_name + '&y=&plot=short&r=json'
        result = Request(url)
        response = urlopen(result)
        infoFromJson = json.loads(response.read())
        self._cache_image(infoFromJson)
        return infoFromJson

    def _cache_image(self, infoFromJson):
        key = "{}".format(infoFromJson['Title'])
        # Open redis.
        cached = r.get(key)
        if not cached:
            response = requests.get(infoFromJson['Poster'])
            image = Image.open(StringIO(response.content))
            r.setex(key, (60*60*5), image)
        return True

    def get_image(self, key):
        cached = r.get(key)
        if cached:
            image = StringIO(cached)
            image.seek(0)
            return send_file(image, mimetype='image/jpg')

if __name__ == '__main__':
    M = MovieInfo("Furious 7")
    M.get_movie_info()
    M.get_image("Furious 7")
Any help on the retrieving part would be appreciated. Also, what's the best way to send an image file from the cache to an HTML template in Flask?
What you saved in Redis is a string, something like '<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=300x475 at 0x4874090>', i.e. the string representation of the PIL Image object rather than the image data.
response.content is the raw data; store that instead, and use Image.frombytes() if you need an Image object back.
Check here: Doc
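
Note that Image.frombytes() also needs the image's mode and size; when what you cache are the raw encoded bytes (as in the "Try this" snippet below), it is usually simpler to rebuild the image with Image.open on a buffer. A minimal sketch:
from StringIO import StringIO
from PIL import Image

# cached: raw JPEG/PNG bytes previously read back from Redis
image = Image.open(StringIO(cached))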
You can't create nested structures in Redis, meaning you can't (for
example) store a native redis list inside a native redis hash-map.
If you really need nested structures, you might want to just store a
JSON-blob (or something similar) instead. Another option is to store
an "id"/key to a different redis object as the value of the map key,
but that requires multiple calls to the server to get the full object.
Try this:
response = requests.get(infoFromJson['Poster'])
# Create a string buffer, then store its raw data in Redis.
output = StringIO(response.content)
r.setex(key, (60*60*5), output.getvalue())
output.close()
see: how-to-store-a-complex-object-in-redis-using-redis-py
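
For the retrieving side, a minimal Flask route might look like the sketch below. It assumes the raw poster bytes were cached under the movie title, as in the snippet above; the route name and URL pattern are illustrative only:
from flask import Flask, abort, send_file
from StringIO import StringIO

app = Flask(__name__)

@app.route('/poster/<title>')
def poster(title):
    cached = r.get(title)  # r is the StrictRedis client from the question
    if cached is None:
        abort(404)
    # Wrap the raw bytes in a file-like buffer so send_file can stream them.
    return send_file(StringIO(cached), mimetype='image/jpeg')

In a template the image is then referenced as a normal URL, e.g. <img src="{{ url_for('poster', title='Furious 7') }}">, so Flask serves it from the cache instead of embedding bytes in the HTML.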

unable to save data in datastore but no errors

I'm building a web crawler. Some of the data I put into the datastore gets saved, some does not, and I have no idea what the problem is.
Here is my crawler class:
class Crawler(object):
    def get_page(self, url):
        try:
            req = urllib2.Request(url, headers={'User-Agent': "Magic Browser"})  # yessss!!! with the header, I am able to download pages
            #response = urlfetch.fetch(url, method='GET')
            #return response.content
            #except urlfetch.InvalidURLError as iu:
            #    return iu.message
            response = urllib2.urlopen(req)
            return response.read()
        except urllib2.HTTPError as e:
            return e.reason

    def get_all_links(self, page):
        return re.findall('http[s]?://(?:[a-zA-Z]|[0-9]|[$-_#.&+]|[!*\(\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+', page)

    def union(self, lyst1, lyst2):
        try:
            for elmt in lyst2:
                if elmt not in lyst1:
                    lyst1.append(elmt)
            return lyst1
        except Exception as e:
            return e.reason

    #function that crawls the web for links starting from the seed
    #returns a dictionary of index and graph
    def crawl_web(self, seed="http://tonaton.com/"):
        query = Listings.query()  #create a listings object from storage
        if query.get():
            objListing = query.get()
        else:
            objListing = Listings()
            objListing.toCrawl = [seed]
            objListing.Crawled = []
        start_time = datetime.datetime.now()
        while datetime.datetime.now() - start_time < datetime.timedelta(0, 5):  #tocrawl (to crawl can take forever)
            try:
                #while True:
                page = objListing.toCrawl.pop()
                if page not in objListing.Crawled:
                    content = self.get_page(page)
                    add_page_to_index(page, content)
                    outlinks = self.get_all_links(content)
                    graph = Graph()  #create a graph object with the url
                    graph.url = page
                    graph.links = outlinks  #save all outlinks as the value part of the graph url
                    graph.put()
                    self.union(objListing.toCrawl, outlinks)
                    objListing.Crawled.append(page)
            except:
                return False
        objListing.put()  #save to database
        return True  #return true if it works
The classes that define the various ndb models are in this Python module:
import os
import urllib
from google.appengine.ext import ndb
import webapp2

class Listings(ndb.Model):
    toCrawl = ndb.StringProperty(repeated=True)
    Crawled = ndb.StringProperty(repeated=True)

#let's see how this works
class Index(ndb.Model):
    keyword = ndb.StringProperty()          # keyword part of the index
    url = ndb.StringProperty(repeated=True)  # value part of the index

#class Links(ndb.Model):
#    links = ndb.JsonProperty(indexed=True)

class Graph(ndb.Model):
    url = ndb.StringProperty()
    links = ndb.StringProperty(repeated=True)
It used to work fine when I had JsonProperty in place of StringProperty(repeated=True), but JsonProperty is limited to 1500 bytes, so I hit an error once.
Now, when I run the crawl_web member function, it actually crawls, but when I check the datastore only the Index entity is created. No Graph, no Listings. Please help. Thanks.
Putting your code together, adding the missing imports, and logging the exception, eventually shows the first killer problem:
Exception Indexed value links must be at most 500 characters
and indeed, adding a logging of outlinks, one easily eyeballs that several of them are far longer than 500 characters -- therefore they can't be items in an indexed property, such as a StringProperty. Changing each repeated StringProperty to a repeated TextProperty (so it does not get indexed and thus has no 500-characters-per-item limitation), the code runs for a while (making a few instances of Graph) but eventually dies with:
An error occured while connecting to the server: Unable to fetch URL: https://sb':'http://b')+'.scorecardresearch.com/beacon.js';document.getElementsByTagName('head')[0].appendChild(s); Error: [Errno 8] nodename nor servname provided, or not known
and indeed, it's pretty obvious that the alleged "link" is actually a bunch of JavaScript and as such cannot be fetched.
So, essentially, the core bug in your code is not at all related to app engine, but rather, the issue is that your regular expression:
'http[s]?://(?:[a-zA-Z]|[0-9]|[$-_#.&+]|[!*\(\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+'
does not properly extract outgoing links given a web page containing Javascript as well as HTML.
There are many issues with your code, but to this point they're just slowing it down or making it harder to understand, not killing it -- what's killing it is using that regular expression pattern to try and extract links from the page.
Check out retrieve links from web page using python and BeautifulSoup -- most answers suggest, for the purpose of extracting links from a page, using BeautifulSoup, which may perhaps be a problem in app engine, but one shows how to do it with just Python and REs.
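
As a rough sketch of that BeautifulSoup approach (assuming the bs4 package is available in your runtime), get_all_links could be replaced with something like:
from bs4 import BeautifulSoup

def get_all_links(self, page):
    # Parse the HTML instead of regex-scanning the raw text, so inline
    # JavaScript is not mistaken for URLs.
    soup = BeautifulSoup(page, 'html.parser')
    return [a['href'] for a in soup.find_all('a', href=True)
            if a['href'].startswith(('http://', 'https://'))]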

Odoo8. Upload product image programmatically

I get an image from a URL and try to upload it to Odoo (product.template, image column). I have tried many methods, but none of them helped. Could you show me the right way to upload a product image to Odoo without using CSV import?
This worked for me:
import urllib2
import base64

image = urllib2.urlopen('http://ddd.com/somepics.jpg').read()
image_base64 = base64.encodestring(image)
product.image_medium = image_base64  # (new api v9)
# in old api maybe something like
# prod_obj.write(prod_id, {'image_medium': image_base64})
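
Note that urllib2 and base64.encodestring are Python 2 only; on Python 3 the same idea might look like this sketch using requests:
import base64
import requests

response = requests.get('http://ddd.com/somepics.jpg')
response.raise_for_status()
# Depending on the Odoo version, the field may expect text: add .decode('ascii') if so.
product.image_medium = base64.b64encode(response.content)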
You may need something like this, using the psycopg2 library:
try:
    logo = urllib2.urlopen(logo_url).read()
except:
    print 'waiting 60s'
    time.sleep(60)
    logo = urllib2.urlopen(logo_url).read()
res_data = {'image': psycopg2.Binary(logo)}
...
If you have an image URL and need to set it on the product, you can do the following and call this method when installing/upgrading your custom module.
import requests
import base64

@api.multi
def get_image(self):
    for product in self:
        img = False
        # image_url is assumed here to be a field on the product storing the source URL
        if product.image_url:
            response = requests.get(product.image_url)
            if response.ok and response.content:
                img = base64.b64encode(response.content)
            else:
                img = False
        product.image = img

Python 2&3: both urllib & requests POST data mysteriously disappears

I'm using Python to scrape data from a number of web pages that have simple HTML input forms, like the 'Username:' form at the bottom of this page:
http://www.w3schools.com/html/html_forms.asp (this is just a simple example to illustrate the problem)
Firefox Inspect Element indicates this form field has the following HTML structure:
<form name="input0" target="_blank" action="html_form_action.asp" method="get">
  Username:
  <input name="user" size="20" type="text"></input>
  <input value="Submit" type="submit"></input>
</form>
All I want to do is fill out this form and get the resulting page:
http://www.w3schools.com/html/html_form_action.asp?user=ThisIsMyUserName
Which is what is produced in my browser by entering 'ThisIsMyUserName' in the 'Username' field and pressing 'Submit'. However, every method that I have tried (details below) returns the contents of the original page containing the unaltered form without any indication the form data I submitted was recognized, i.e. I get the content from the first link above in response to my request, when I expected to receive the content of the second link.
I suspect the problem has to do with action="html_form_action.asp" in the form above, or perhaps some kind of hidden field I'm missing (I don't know what to look for - I'm new to form submission). Any suggestions?
HERE IS WHAT I'VE TRIED SO FAR:
Using urllib.request in Python 3:
import urllib.request
import urllib.parse
# Create dict of form values
example_data = urllib.parse.urlencode({'user': 'ThisIsMyUserName'})
# Encode dict
example_data = example_data.encode('utf-8')
# Create request
example_url = 'http://www.w3schools.com/html/html_forms.asp'
request = urllib.request.Request(example_url, data=example_data)
# Create opener and install
my_url_opener = urllib.request.build_opener() # no handlers
urllib.request.install_opener(my_url_opener)
# Open the page and read content
web_page = urllib.request.urlopen(request)
content = web_page.read()
# Save content to file
my_html_file = open('my_html_file.html', 'wb')
my_html_file.write(content)
But what is returned to me and saved in 'my_html_file.html' is the original page containing
the unaltered form without any indication that my form data was recognized, i.e. I get this page in response: http://www.w3schools.com/html/html_forms.asp
...which is the same thing I would have expected if I made this request without the
data parameter at all (which would change the request from a POST to a GET).
Naturally the first thing I did was check whether my request was being constructed properly:
# Just double-checking the request is set up correctly
print("GET or POST?", request.get_method())
print("DATA:", request.data)
print("HEADERS:", request.header_items())
Which produces the following output:
GET or POST? POST
DATA: b'user=ThisIsMyUserName'
HEADERS: [('Content-length', '21'), ('Content-type', 'application/x-www-form-urlencoded'), ('User-agent', 'Python-urllib/3.3'), ('Host', 'www.w3schools.com')]
So it appears the POST request has been structured correctly. After re-reading the
documentation and unsuccessfully searching the web for an answer to this problem, I
moved on to a different tool: the requests module. I attempted to perform the same task:
import requests
example_url = 'http://www.w3schools.com/html/html_forms.asp'
data_to_send = {'user': 'ThisIsMyUserName'}
response = requests.post(example_url, params=data_to_send)
contents = response.content
And I get the same exact result. At this point I'm thinking maybe this is a Python 3
issue. So I fire up my trusty Python 2.7 and try the following:
import urllib, urllib2
data = urllib.urlencode({'user' : 'ThisIsMyUserName'})
resp = urllib2.urlopen('http://www.w3schools.com/html/html_forms.asp', data)
content = resp.read()
And I get the same result again! For thoroughness I figured I'd attempt to achieve the
same result by encoding the dictionary values into the url and attempting a GET request:
# Using Python 3
# Construct the url for the GET request
example_url = 'http://www.w3schools.com/html/html_forms.asp'
form_values = {'user': 'ThisIsMyUserName'}
example_data = urllib.parse.urlencode(form_values)
final_url = example_url + '?' + example_data
print(final_url)
This spits out the following value for final_url:
http://www.w3schools.com/html/html_forms.asp?user=ThisIsMyUserName
I plug this into my browser and I see that this page is exactly the same as
the original page, which is exactly what my program is downloading.
I've also tried adding additional headers and cookie support to no avail.
I've tried everything I can think of. Any idea what could be going wrong?
The form states an action and a method; you are ignoring both. The method states the form uses GET, not POST, and the action tells you to send the form data to html_form_action.asp.
The action attribute acts like any other URL specifier in an HTML page; unless it starts with a scheme (so with http://..., https://..., etc.) it is relative to the current base URL of the page.
The GET HTTP method adds the URL-encoded form parameters to the target URL with a question mark:
import urllib.request
import urllib.parse
# Create dict of form values
example_data = urllib.parse.urlencode({'user': 'ThisIsMyUserName'})
# Create request
example_url = 'http://www.w3schools.com/html/html_form_action.asp'
get_url = example_url + '?' + example_data
# Open the page and read content
web_page = urllib.request.urlopen(get_url)
print(web_page.read().decode(web_page.info().get_param('charset', 'utf8')))
or, using requests:
import requests
example_url = 'http://www.w3schools.com/html/html_form_action.asp'
data_to_send = {'user': 'ThisIsMyUserName'}
response = requests.get(example_url, params=data_to_send)
contents = response.text
print(contents)
In both examples I also decoded the response to Unicode text (something requests makes easier for me with the response.text attribute).
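
For completeness: had the form declared method="post", the parameters would belong in the request body instead of the query string, e.g. with requests (a hypothetical variant; the w3schools form actually uses GET):
import requests

response = requests.post('http://www.w3schools.com/html/html_form_action.asp',
                         data={'user': 'ThisIsMyUserName'})
print(response.text)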
