Parse XML file in python and display it in HTML page - python

I am doing a digital signage project using a Raspberry Pi. The R-Pi will be connected to an HDMI display and to the internet. There will be one XML file and one self-designed HTML webpage on the R-Pi. The XML file will be frequently updated from a remote terminal.
My idea is to parse the XML file using Python (lxml) and pass the parsed data to my local HTML webpage so that it can display the data in the R-Pi's web browser. The webpage will reload frequently to reflect the changed data.
I was able to parse the XML file using Python (lxml). But what tools should I use to display the parsed contents (mostly strings) in a local HTML webpage?
This question might sound trivial, but I am very new to this field and could not find a clear answer anywhere. There are also methods that use PHP to parse XML and pass it to an HTML page, but because of my other requirements I am bound to use Python.

I think there are 3 steps you need to make it work:
1) Extract only the data you want from the given XML file.
2) Use a simple template engine to insert the extracted data into an HTML file.
3) Use a web server to serve the file created above.
Step 1) You are already using lxml, which is a good library for this, so I don't think you need help there.
Step 2) There are many Python templating engines out there, but for a simple purpose you just need an HTML file created in advance with some special markers such as {{0}}, {{1}} or whatever works for you. This is your template. Take the data from step 1, do a find and replace in the template, and save the output to a new HTML file.
Step 3) To make that file accessible from a browser on a different device or PC, you need to serve it with a simple HTTP web server. Python provides the http.server module, or you can use a third-party web server; just make sure it can access the file created in step 2.
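The first two steps might be sketched roughly as follows. This is only an illustration under assumptions: the XML layout, the template text and the function names are made up, and the standard library's xml.etree.ElementTree stands in for lxml so the snippet runs without extra packages.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical template using the {{0}}, {{1}} markers suggested above.
TEMPLATE = """<!DOCTYPE html>
<html>
<body>
<h1>{{0}}</h1>
<p>{{1}}</p>
</body>
</html>
"""

# Hypothetical XML layout; the real signage file will look different.
SAMPLE_XML = "<signage><title>Welcome</title><message>Open 9 to 5</message></signage>"


def extract_values(xml_text):
    """Step 1: collect the text of each child of the root element, in order."""
    root = ET.fromstring(xml_text)
    return [(child.text or "") for child in root]


def fill_template(template, values):
    """Step 2: replace {{0}}, {{1}}, ... with the matching (escaped) value."""
    page = template
    for index, value in enumerate(values):
        page = page.replace("{{%d}}" % index, html.escape(value))
    return page


if __name__ == "__main__":
    print(fill_template(TEMPLATE, extract_values(SAMPLE_XML)))
```

For step 3, saving the result as index.html and running `python3 -m http.server` in that directory is usually enough for a quick test from another device.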

Instead of passing the parsed data (parsed from an XML file) to specific components in the HTML page, I've written Python code that periodically rewrites the entire HTML webpage's code.
Suppose we have an XML file, a Python script, and an HTML webpage.
XML file : Contains certain values that are updated periodically and are to be parsed.
Python script : Parses the XML file (whenever there are changes in the XML file) and updates the HTML page with the newly parsed values.
HTML webpage : To be shown on the R-Pi screen and reloaded periodically (to reflect any changes in the browser).
The Python code declares a string (say, page) that contains the code of the HTML page, for example:
<!DOCTYPE html>
<html>
<body>
<h1>My First Heading</h1>
<p>My first paragraph.</p>
</body>
</html>
Then, to replace "My first paragraph" with a value parsed from the XML, we can use Python's string replace method. Strings are immutable, so replace() returns a new string and the result has to be stored. Keep the original string unchanged as the template, because the placeholder text is gone from the result after the first replacement:
new_page = page.replace("My first paragraph", root[0][1].text)
After the replacement is done, write the entire new string (new_page) into the HTML file. The HTML file now has the new code, and once it is reloaded, the updated webpage shows up in the R-Pi's browser.
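One practical detail with the rewrite-the-whole-file approach is that the browser may reload while the file is half written. A minimal sketch of one way around that, assuming a 30-second meta refresh and the hypothetical names render_page and write_page: write to a temporary file first, then swap it into place.

```python
import os
import tempfile

# Same page as above, plus a meta refresh so the browser reloads by itself.
# The 30-second interval is an arbitrary assumption.
PAGE_TEMPLATE = """<!DOCTYPE html>
<html>
<head>
<meta http-equiv="refresh" content="30">
</head>
<body>
<h1>My First Heading</h1>
<p>My first paragraph.</p>
</body>
</html>
"""


def render_page(paragraph_text):
    """Always start from the untouched template, never from the last output."""
    return PAGE_TEMPLATE.replace("My first paragraph.", paragraph_text)


def write_page(path, page):
    """Write to a temp file in the same directory, then rename it over the target."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as fp:
        fp.write(page)
    os.chmod(tmp_path, 0o644)    # mkstemp creates the file readable by the owner only
    os.replace(tmp_path, path)   # replaces the old file in a single step
```

Calling write_page("index.html", render_page(root[0][1].text)) after each XML change would then update the page in one step.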

Related

Zope PostgreSQL variable with HTML and DTML

I have a postgresql db table called blog_post and in that table a column called post_main. That column stores the entire blog post article, including various HTML and DTML tags.
For reference (and yes, I know it's old), this is Zope 2.13 with PostgreSQL 8.1.19
For example:
<p>This is paragraph 1</p>
<dtml-var "blog.sitefiles.post.postimg1(_.None, _)">
<p>This is paragraph 2</p>
The dtml-var tag is telling Zope to insert the contents of the dtml-document postimg1 between the two paragraphs.
OK, no problem. I am storing this data without issue in the postgres db table, exactly as it was entered, and I am running a ZSQL Method via a <dtml-in zsqlmethod> tag that surrounds the entire dtml-document, in order to be able to call to the variables I need in the page.
Normally, and without either HTML code OR especially without DTML tags, it's no issue to insert the data into the web page. You do this via &dtml-varname; if you have no html tags and just want a plain text output, OR you do <dtml-var varname> if you want the data to be rendered and shown as proper html.
Here's the problem
Zope is just posting the <dtml-var "blog.sitefiles.post.postimg1(_.None, _)"> line to the html page instead of processing it like when I type it into the dtml-doc directly.
What I need:
I need the code stored in the post_main column (referenced above as varname) to be processed as if I typed it directly into the dtml-document, so that the <dtml-var> tags work the way they are supposed to work.
So, you have a variable that contains a DTML Document, and you want to execute that document and insert the results?
To be honest, I'm not sure that's possible in DTML alone, as users generally don't want to execute code contained in strings. This carries the same danger as exposing eval() or exec() to user-supplied strings: anyone who can control the string gets arbitrary code execution on the Zope instance. It's the equivalent of storing PHP code in your database and executing it.
Frankly, I'm surprised you're using DTML on Zope 2.13 at all, rather than PageTemplates, but I assume you've got a good reason for it.
If you want to interpret the value of a DTML variable rather than just insert it, you'll need to explicitly do the interpreting, using something like:
from DocumentTemplate.DT_HTML import HTML
return HTML(trusted_dtml_string)
The problem with this is that you can't do it in a Script (Python) through the web, because of the security concerns. If you do this as an external method or filesystem code, it's very likely that you'll allow arbitrary code execution on your server.
I'm afraid my only recommendation is to avoid doing this; it's very difficult to get right and errors can be catastrophic. I'd strongly suggest you do not store DTML tags as part of your blog articles.
As an alternative, if you have a fixed number of delegations to DTML methods, I recommend writing a Python script, such as:
## Script (Python) "parse_variables"
##bind container=container
##bind context=context
##bind namespace=
##bind script=script
##bind subpath=traverse_subpath
##parameters=post, _
##title=
##
post = post.replace("##POST_IMAGE##", context.postimg(None, _))
return post
And then calling that with your variable that contains the user-supplied data, like <dtml-var expr="parse_variables(data, _)">
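If there are several such markers, the same idea can be generalised to a fixed table of allowed placeholders. The following is a plain-Python sketch outside Zope, not a drop-in Script (Python); the renderer functions and marker names are hypothetical stand-ins for calls such as context.postimg(None, _).

```python
import re

# Hypothetical renderers standing in for the DTML methods.
def render_post_image():
    return '<img src="/images/post.png" alt="Post image">'


def render_author_box():
    return '<div class="author">Written by the site owner</div>'


# Only markers listed here are ever expanded.
RENDERERS = {
    "POST_IMAGE": render_post_image,
    "AUTHOR_BOX": render_author_box,
}

MARKER = re.compile(r"##([A-Z_]+)##")


def expand_markers(post):
    """Replace ##NAME## markers with the output of a whitelisted renderer."""
    def substitute(match):
        renderer = RENDERERS.get(match.group(1))
        # Unknown markers are left as they are; nothing from the post is executed.
        return renderer() if renderer else match.group(0)
    return MARKER.sub(substitute, post)
```

The stored article then only contains inert text such as ##POST_IMAGE##, and what each marker expands to is decided by the code, not by the database content.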

Is there any method available to convert the html text to xhtml?

I am trying to store a table created by PrettyTable in Confluence. I converted the data to HTML using the method PrettyTable.get_html_string(). I can store this data in a local HTML file as required without any issues. However, when I try to upload the data to Confluence using confluence.createPage(), I get XHTML parsing errors, because createPage() accepts only XHTML and the content is not formatted properly. So I would like to convert my HTML data to XHTML so that I can push it to Confluence. Is there any method available to convert the PrettyTable data directly to XHTML?
I tried to use PrettyTable._get_formatted_html_string(), but there is no proper information about which arguments I should give to that method at http://www.aplu.ch/classdoc/raspipylib/prettytable-pysrc.html#PrettyTable._get_formatted_html_string
Check the source HTML being created. XHTML needs all tags closed, which is most likely the problem: a paragraph tag needs a closing paragraph tag, and so on.
There is a method available in PrettyTable: get_html_string(xhtml=True), which emits XHTML-style self-closing tags (<br/> instead of <br>). Here is the reference: http://www.aplu.ch/classdoc/raspipylib/prettytable-pysrc.html#PrettyTable.get_html_string
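To find out which tag trips the parser before calling createPage(), one option is to run the generated markup through an XML parser locally. This is a rough sketch using the standard library; the helper name xhtml_error is made up, and the fragment is assumed to have a single root element such as <table>.

```python
import xml.etree.ElementTree as ET


def xhtml_error(fragment):
    """Return None if the fragment parses as XML, otherwise the parser's message."""
    try:
        ET.fromstring(fragment)
    except ET.ParseError as exc:
        return str(exc)   # includes the line and column of the first problem
    return None


if __name__ == "__main__":
    print(xhtml_error("<table><tr><td>a<br/>b</td></tr></table>"))
    print(xhtml_error("<table><tr><td>a<br>b</td></tr></table>"))
```

A fragment with an unclosed <br> is reported as a mismatched tag, which is the kind of error the XHTML upload rejects.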
Thank you :)

Grabbing data from separate links of the same website

Thank you for taking the time to read this.
I wanted to know if there's any way I can extract a specific piece of markup from different links that all belong to the same domain. For example, if I supply many Facebook page links, it should collect all their names in a text file, each one on a separate line.
If I understood correctly, you need the user's name from the link.
facebook.com/zuck
facebook.com/moskov
You can fetch each page and extract the page title; this may not always be accurate.
> <title id="pageTitle">Mark Zuckerberg</title>
> <title id="pageTitle">Dustin Moskovitz</title>
html2text is a Python script that converts a page of HTML into clean, easy-to-read plain ASCII text. Better yet, that ASCII also happens to be valid Markdown (a text-to-HTML format).
https://github.com/Alir3z4/html2text
If you want to read the HTML from the URL, check the explanations below:
How to read html from a url in python 3
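Pulling the title out of each page and writing one name per line could look roughly like this. It is a sketch using only the standard library; the class and function names are hypothetical, and fetching the pages (for example with urllib.request.urlopen) is left out because real sites may require a login or block automated requests.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text found inside the <title> element."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_title(page_html):
    parser = TitleParser()
    parser.feed(page_html)
    return parser.title.strip()


def write_titles(pages, out_path):
    """Write the title of each HTML page on its own line."""
    with open(out_path, "w", encoding="utf-8") as fp:
        for page_html in pages:
            fp.write(extract_title(page_html) + "\n")
```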

build a web page at runtime with python?

I am trying to build static pages to display data on my internal website.
I would like to grab data from various text files, so that every time new data is created I simply need to run the script again and the page is rebuilt with the new data.
I can't use JS or other runtime languages since my server allows only static pages, so I opted for Python to build the static pages.
Now the question is: how do I write a script that allows me to build a web page?
All the data I need is 3-4 lines, so the page is not very complex. I tried to create an empty page and then modify its content via Python, but it was a disaster; then I thought it would probably be simpler to build the whole page from scratch every time.
To be clear, I am making a simple page with a white background and some text on it, laid out so it is nice to read; no graphics, no animations, nothing; just pure old-school HTML.
Is there a template to do what I am trying to achieve? Thanks
You mean something like this?
I'm using the yattag library to define the template.
from yattag import Doc

def homepage_content():
    return {
        'text': open('/home/username/texts/homepage_text.txt').read(),
        'title': open('/home/username/texts/homepage_title.txt').read()
    }

def page_template(content):
    doc, tag, text = Doc().tagtext()
    with tag('html'):
        with tag('head'):
            with tag('title'):
                text(content['title'])
        with tag('body'):
            with tag('div', id = 'main'):
                text(content['text'])
    return doc.getvalue()

def create_homepage():
    with open('/home/username/www/index.html', "w") as fp:
        fp.write(page_template(homepage_content()))
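If installing a third-party library is not an option, a similar result might be had with the standard library alone. This is a sketch, not a replacement for the yattag answer: string.Template is swapped in as the templating mechanism, and the function names and file arguments are assumptions.

```python
import html
from pathlib import Path
from string import Template

# Plain HTML skeleton; $title and $text are filled in on every run.
PAGE = Template("""<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<title>$title</title>
</head>
<body>
<div id="main">$text</div>
</body>
</html>
""")


def build_page(title, text):
    """Escape the values so characters like < or & in the data stay literal."""
    return PAGE.substitute(title=html.escape(title), text=html.escape(text))


def build_site(title_file, text_file, out_file):
    """Read the two text files and regenerate the whole page from scratch."""
    title = Path(title_file).read_text(encoding="utf-8").strip()
    text = Path(text_file).read_text(encoding="utf-8").strip()
    Path(out_file).write_text(build_page(title, text), encoding="utf-8")
```

Rerunning build_site(...) whenever the text files change rebuilds the page, which matches the "build it from scratch every time" idea in the question.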

Converting HTML markup to a RTF document

I have an XML document containing embedded HTML content that I am attempting to convert to an RTF output file. I have the XML elements decorated with <li>, <p>, <b> and other HTML markup, that I would like to have transferred into the generated RTF.
Here is what works as of now:
Fetch XML tag content as string (containing HTML tags for line breaks, paragraph breaks, and lists)
Write the XML tag content to an RTF file.
I am using Python scripts to achieve the conversion, together with ElementTree (to parse the input XML) and PyRTF-NG (to convert from HTML to RTF), a library that handles tables and other special formatting. At the moment, I have managed to get everything I need except the rendering of the HTML markup (i.e. translating HTML format tags into actual RTF formatting). To clarify, I mean that if my RTF converter encounters an <ol><li> tag, it should create an ordered list in the RTF instead of just spitting out <ol><li> tags into the RTF.
Does anyone know if Python has any native calls that will allow me to do this, or any other Python libraries that might have what I need to complete the full conversion into RTF?
Thanks!
The best free converter is LibreOffice, and it can be used directly from the command line in a terminal; see
libreoffice --convert-to
The same converter can be called indirectly from Python using the UNO bridge:
http://api.libreoffice.org/
http://software.opensuse.org/package/libreoffice-pyuno
...
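Driving the command-line converter from a Python script might look like the sketch below. The helper names are hypothetical, and "rtf" as the target format is an assumption; whether LibreOffice's default filter handles a particular HTML input well is something to verify on real files.

```python
import shutil
import subprocess


def build_convert_command(source, outdir, target_format="rtf", binary="libreoffice"):
    """Assemble the headless LibreOffice call as an argument list."""
    return [binary, "--headless", "--convert-to", target_format,
            "--outdir", outdir, source]


def convert(source, outdir):
    """Run the conversion; raises if LibreOffice is missing or exits with an error."""
    if shutil.which("libreoffice") is None:
        raise RuntimeError("libreoffice not found on PATH")
    subprocess.run(build_convert_command(source, outdir), check=True)
```

The HTML extracted from each XML element would first be saved to a file, which is then passed as source.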
