I'm new to Python and currently trying to use Mako templating.
I want to be able to take an HTML file and add a template to it from another HTML file.
Let's say I have this index.html file:
<html>
<head>
<title>Hello</title>
</head>
<body>
<p>Hello, ${name}!</p>
</body>
</html>
and this name.html file:
world
(yes, it just has the word world inside).
I want the ${name} in index.html to be replaced with the content of the name.html file.
I've been able to do this without the name.html file, by stating in the render method what name is, using the following code:
@route(':filename')
def static_file(filename):
    mylookup = TemplateLookup(directories=['html'])
    mytemplate = mylookup.get_template('hello/index.html')
    return mytemplate.render(name='world')
This is obviously not useful for larger pieces of text. Now all I want is to simply load the text from name.html, but I haven't yet found a way to do this. What should I try?
return mytemplate.render(name=open(<path-to-file>).read())
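If you'd rather not leave that file handle open, the same one-liner wrapped in a with block works too (a sketch, keeping the placeholder path as above):

with open(<path-to-file>) as fp:
    return mytemplate.render(name=fp.read())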
Thanks for the replies.
The idea is to use the Mako framework since it does things like caching and checking whether the file has been updated...
This code eventually seems to work:
@route(':filename')
def static_file(filename):
    mylookup = TemplateLookup(directories=['.'])
    mytemplate = mylookup.get_template('index.html')
    temp = mylookup.get_template('name.html').render()
    return mytemplate.render(name=temp)
Thanks again.
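For what it's worth, Mako can also pull name.html in directly from the template through its <%include> tag, which resolves the file via the same TemplateLookup (so it gets the same caching and update checks). A sketch, assuming both files live in the lookup directory:

## index.html (Mako syntax; "##" starts a Mako comment)
<html>
<head>
<title>Hello</title>
</head>
<body>
<p>Hello, <%include file="name.html"/>!</p>
</body>
</html>

With that, the handler only needs mylookup.get_template('index.html').render().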
Did I understand you correctly that all you want is to read the content from a file? If you want to read the complete content, use something like this (Python >= 2.5):
from __future__ import with_statement

with open(my_file_name, 'r') as fp:
    content = fp.read()
Note: the from __future__ line has to be the first line in your .py file (or right after the content encoding declaration, which may occupy the first line).
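For example, a file that uses both would start like this:

# -*- coding: utf-8 -*-                 # optional encoding declaration, first line
from __future__ import with_statement   # otherwise this must be the very first statement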
Or the old approach:
fp = open(my_file_name, 'r')
try:
    content = fp.read()
finally:
    fp.close()
If your file contains non-ASCII characters, you should also take a look at the codecs page :-)
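For instance, a minimal sketch using codecs (assuming the file is UTF-8 encoded and my_file_name as above):

import codecs

with codecs.open(my_file_name, 'r', encoding='utf-8') as fp:
    content = fp.read()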
Then, based on your example, the last section could look like this:
from __future__ import with_statement

@route(':filename')
def static_file(filename):
    mylookup = TemplateLookup(directories=['html'])
    mytemplate = mylookup.get_template('hello/index.html')
    content = ''
    with open('name.html', 'r') as fp:
        content = fp.read()
    return mytemplate.render(name=content)
You can find more details about the file object in the official documentation :-)
There is also a shortcut version:
content = open('name.html').read()
But I personally prefer the long version with the explicit closing :-)
Related
I have some code that I am using to scrape a web page; I save the scraped data to an HTML file and display it as a different page. Below is the code:
from flask import Flask, render_template, request
from bs4 import BeautifulSoup
import urllib.request
import sys, os

app = Flask(__name__)

@app.route('/')
def index():
    return render_template('index.html')

@app.route('/result', methods=['POST'])
def result():
    if request.method == 'POST':
        result = request.form.get("search")
        link = "https://xyz.comindex?search="
        url = (link + result)
        print(url)
        try:
            page = urllib.request.urlopen(url)
            soup = BeautifulSoup(page, 'html.parser')
            test = soup.findAll('div', attrs={"class": "search-inner-wrapper"})
            sys.stdout = open("tests.html", "w")
            print(test)
            sys.stdout.close()
            return render_template("SearchResults.html", test=test)
        except:
            print("An error occured.")
            return render_template("test.html", test=test)

if __name__ == '__main__':
    app.run(use_reloader=True, debug=True)
My problem is that this code works perfectly fine, but only once. When I reload the index page and perform a search query, I get:
ValueError: I/O operation on closed file.
I can't figure out a workaround for this, since I have to use a single file every time and do not want the new results appended to the existing ones.
You are redefining sys.stdout to be the file handle of the file you opened. Use another name, don't overwrite sys.stdout. And don't close sys.stdout. It's ok to close the file handle you create though.
Example of opening a file and reading it, opening a file and writing it:
bjb@blueeyes:~$ cat /tmp/infile.html
<html>
<head>
</head>
<body>
<div class="search-inner-wrapper">fleeble flobble</div>
</body>
</html>
bjb@blueeyes:~$ cat /tmp/file2.py
#!/usr/bin/env python3

with open('/tmp/infile.html', 'r') as infile:
    page = infile.readlines()

with open('/tmp/outfile.html', 'w') as ofd:
    ofd.write(''.join(page))
bjb@blueeyes:~$ /tmp/file2.py
bjb@blueeyes:~$ cat /tmp/outfile.html
<html>
<head>
</head>
<body>
<div class="search-inner-wrapper">fleeble flobble</div>
</body>
</html>
The first line of /tmp/file2.py just says this is a python script.
The next two lines open a file called /tmp/infile.html for reading and declare a variable "infile" as the read file descriptor. Then all the lines in /tmp/infile.html are read into a list of strings.
When we leave that "with" block, the file is closed for us.
Then in the next two lines, we open /tmp/outfile.html for writing and we use the variable ofd ("output file descriptor") to hold the file descriptor. We use ofd to write the series of lines in the list "page" to that file. Once we leave that second "with" block, the output file is closed for us. Then the program exits ... my last command dumps out the contents of /tmp/outfile.html, which you can see is the same as infile.html.
If you want to open and close files without using those with blocks, you can:
infile = open('/tmp/infile.html', 'r')
page = infile.readlines()
infile.close()
ofd = open('/tmp/outfile.html', 'w')
ofd.write(''.join(page))
ofd.close()
Hopefully that will work in a Flask script ...
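Applied to the handler from the question, the write could look roughly like this (a sketch; it assumes the same soup/test variables and leaves the rest of the route unchanged):

test = soup.findAll('div', attrs={"class": "search-inner-wrapper"})

# write the scraped markup through a dedicated file handle instead of sys.stdout
with open("tests.html", "w", encoding="utf-8") as out:
    out.write('\n'.join(str(tag) for tag in test))

return render_template("SearchResults.html", test=test)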
I am using urllib.request.urlopen to read the content of an HTML page. Afterwards, I want to print the content to a local file and then do a certain operation with it (e.g. construct a parser on that page, e.g. with BeautifulSoup).
The problem
After reading the content for the first time (and writing it into a file), I can't read the content a second time in order to do something with it (e.g. construct a parser on it). It is just empty, and I can't move the cursor (seek(0)) back to the beginning.
import urllib.request
response = urllib.request.urlopen("http://finance.yahoo.com")
file = open( "myTestFile.html", "w")
file.write( response.read() ) # Tried response.readlines(), but that did not help me
#Tried: response.seek() but that did not work
print( response.read() ) # Actually, I want something done here... e.g. construct a parser:
# BeautifulSoup(response).
# Anyway this is an empty result
file.close()
How can I fix it?
Thank you very much!
You cannot read the response twice. But you can easily reuse the saved content:
content = response.read()
file.write(content)
print(content)
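Put together with the rest of your snippet, that could look roughly like this (a sketch; note that response.read() returns bytes, so the file is opened in binary mode and the bytes are decoded before printing):

import urllib.request

response = urllib.request.urlopen("http://finance.yahoo.com")
content = response.read()                 # read once, keep the bytes around

with open("myTestFile.html", "wb") as f:  # binary mode, since content is bytes
    f.write(content)

print(content.decode("utf-8", errors="replace"))   # reuse the same bytes
# ...or hand them to a parser, e.g. BeautifulSoup(content, "html.parser")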
I'm using O365 for Python.
I'm sending an email and building the body by using the setBodyHTML() function. However, at present I need to write the actual HTML code inside the function. I don't want to do that. I want Python to just look at an HTML file I saved somewhere and send an email using that file as the body. Is that possible? Or am I confined to copy/pasting my HTML into that function? I'm using Office 365 for business. Thanks.
In other words, instead of this: msg.setBodyHTML("<h3>Hello</h3>") I want to be able to do this: msg.setBodyHTML("C:\somemsg.html")
I guess you can assign the file content to a variable first, i.e.:
file = open('C:/somemsg.html', 'r')
content = file.read()
file.close()
msg.setBodyHTML(content)
You can do this via a simple reading of that file into a string, which you then can pass to the setBodyHTML function.
Here's a quick function example that will do the trick:
def load_html_from_file(path):
    contents = ""
    with open(path, 'r') as f:
        contents = f.read()
    return contents
Later, you can do something along the lines of
msg.setBodyHTML(load_html_from_file("C:\somemsg.html"))
or
html_contents = load_html_from_file("C:\somemsg.html")
msg.setBodyHTML(html_contents)
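If the HTML file contains non-ASCII characters, it may also be worth passing an explicit encoding (a sketch of the same helper, assuming Python 3 and a UTF-8 encoded file):

def load_html_from_file(path, encoding='utf-8'):
    # an explicit encoding avoids surprises with non-ASCII characters in the HTML
    with open(path, 'r', encoding=encoding) as f:
        return f.read()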
I'm writing this script that downloads an HTML document from http://example.com/ and attempts to parse it as an XML by using:
with urllib.request.urlopen("http://example.com/") as f:
tree = xml.etree.ElementTree.parse(f)
However, I keep getting a ParseError: mismatched tag error, supposedly at line 1, column 2781, so I downloaded the file manually (Ctrl+S in my browser) and checked it. That position points to a place in the middle of a string, not even near the EOF, but there were a few lines before the actual 2781st character, so that might have messed up my calculation of the exact position. However, I tried to download and actually write the response to a file to parse it later:
response = urllib.request.urlopen("http://example.com/")
f = open("test.html", "wb")
f.write(response.read())
f.close()
html = open("test.html", "r")
tree = xml.etree.ElementTree.parse(html)
And I'm still getting the same mismatched tag error at the same column, but this time I opened the downloaded html and the only stuff near column 2781 is this:
;</script></head><body class
And the exact 2781st column marks the first "h" in </head>, so what could be wrong here? Am I missing something?
Edit:
I've been looking more into it and tried to parse the XML using another parser, this time minidom, but I'm still getting the exact same error at the exact same position. What could be the problem here? This also happens even though I've downloaded the file in several different ways (urllib, curl, wget, even Ctrl+Save in the browser) and the result is the same.
Edit 2:
This is what I've tried so far:
This is an example xml I just got from the API doc, and saved it to text.html:
<html>
<head>
<title>Example page</title>
</head>
<body>
<p>Moved to example.org
or example.com.</p>
</body>
</html>
And I tried:
with urllib.request.urlopen("text.html") as f:
tree = xml.etree.ElementTree.parse(f)
And it works, then:
with urllib.request.urlopen("text.html") as f:
tree = xml.etree.ElementTree.fromstring(f.read())
And it also works, but:
with urllib.request.urlopen("http://example.com/") as f:
xml.etree.ElementTree.parse(f)
Doesn't, also tried:
with urllib.request.urlopen("http://example.com/") as f:
xml.etree.ElementTree.fromstring(f.read())
And it doesn't work either. What could be the problem? As far as I can tell the document doesn't have mismatched tags, but perhaps it's too large? It's only 95.2 KB.
You can use bs4 to parse this page. Like this:
import bs4
import urllib
url = 'http://boards.4chan.org/wsg/thread/629672/i-just-lost-my-marauder-on-eve-i-need-a-ylyl'
proxies = {'http': 'http://www-proxy.ericsson.se:8080'}
f = urllib.urlopen(url, proxies=proxies)
info = f.read()
soup = bs4.BeautifulSoup(info)
print soup.a
OUTPUT:
a
You can download bs4 from this link.
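Note that the snippet above is Python 2. Since the question uses urllib.request, a roughly equivalent Python 3 version might look like this (a sketch; the proxy lines are only needed if you are behind one):

import urllib.request
import bs4

url = 'http://boards.4chan.org/wsg/thread/629672/i-just-lost-my-marauder-on-eve-i-need-a-ylyl'

# optional: route the request through an HTTP proxy
# opener = urllib.request.build_opener(
#     urllib.request.ProxyHandler({'http': 'http://www-proxy.ericsson.se:8080'}))
# info = opener.open(url).read()

info = urllib.request.urlopen(url).read()
soup = bs4.BeautifulSoup(info, 'html.parser')
print(soup.a)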
Based on the urllib and ElementTree documentation, this code snippet seemed to work without error for your sample URL.
import urllib.request
import xml.etree.ElementTree as ET
with urllib.request.urlopen('http://boards.4chan.org/wsg/thread/629672/i-just-lost-my-marauder-on-eve-i-need-a-ylyl') as response:
    html = response.read()
    tree = ET.fromstring(html)  # fromstring() parses XML held in a string/bytes object
If you don't want to read the response into a variable before parsing it with ElementTree, this also works:
with urllib.request.urlopen('http://boards.4chan.org/wsg/thread/629672/i-just-lost-my-marauder-on-eve-i-need-a-ylyl') as response:
    tree = ET.parse(response)  # parse() accepts a file-like object directly
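That said, most real-world HTML is not well-formed XML, so strict XML parsers will keep reporting mismatched tags. If installing a third-party package is an option, lxml's HTML parser is forgiving and exposes an ElementTree-style API (a sketch, assuming lxml is installed):

import urllib.request
from lxml import etree

with urllib.request.urlopen('http://example.com/') as response:
    html = response.read()

# HTMLParser tolerates unclosed/implicit tags that break strict XML parsing
tree = etree.fromstring(html, parser=etree.HTMLParser())
print(tree.findall('.//head/title'))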
My file is like this, but I can't exec the content correctly. I've spent my whole afternoon on this and am still confused. The main reason is that I don't know what file_obj[0]['body'] looks like.
here is part of my code
# user_file content
"uid = 'h123456789'"
"data = [something]"
# end of user_file

# code piece
file_obj = req.request.files.get('user_file', None)
for i in file_obj[0]['body']:
    i.strip('\n')  # I tried commenting this line out, still doesn't work
    exec(i)
# I failed
Can you tell me what the user_file content looks like in the file_obj body? Then maybe I can figure out the solution. I submitted it with an HTTP form to Tornado.
Really, thanks.
Maybe this will help.
# first file object in the request
file1 = self.request.files['file1'][0]

# where the file content is actually placed
content = file1['body']

# split the content into lines; unix line terminators assumed
lines = content.split(b'\n')

for l in lines:
    # after decoding into strings, you're free to execute them
    try:
        exec(l.decode())
    except:
        pass
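If the uploaded file can contain multi-line constructs (function definitions, indented blocks), executing it line by line will break them apart. In that case it may be simpler to decode the whole body and exec it once, collecting the resulting names in a dictionary (a sketch, assuming the upload field is called user_file as in the question):

file1 = self.request.files['user_file'][0]
source = file1['body'].decode('utf-8')

namespace = {}
exec(source, namespace)           # run the uploaded code in its own namespace

print(namespace.get('uid'))       # e.g. 'h123456789'
print(namespace.get('data'))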