Does anybody have any experience rendering web pages in WeasyPrint that are styled using Twitter Bootstrap? Whenever I try, the HTML renders completely unstyled, as if no CSS were applied to it.
I figured out what the problem was. When I declared the stylesheet, I had set media="screen". I removed that attribute and it seemed to fix it. Further research indicated I could also declare a separate stylesheet and set media="print".
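For anyone hitting the same thing: WeasyPrint renders with the print media type by default, so a stylesheet restricted to media="screen" is skipped. Below is a minimal sketch, using only the standard library, that lists the stylesheet links a print renderer would ignore. The names StylesheetMediaParser and skipped_for_print are made up for this example, and the check is simplified: it only looks at plain comma-separated media types, not full media queries.

```python
from html.parser import HTMLParser


class StylesheetMediaParser(HTMLParser):
    """Collect (href, media) for every <link rel="stylesheet"> in a page."""

    def __init__(self):
        super().__init__()
        self.stylesheets = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attrs = dict(attrs)
        rel = (attrs.get("rel") or "").lower().split()
        if "stylesheet" in rel:
            self.stylesheets.append((attrs.get("href"), attrs.get("media")))


def skipped_for_print(html):
    """Return hrefs whose media attribute names neither 'print' nor 'all'."""
    parser = StylesheetMediaParser()
    parser.feed(html)
    skipped = []
    for href, media in parser.stylesheets:
        if not media:
            continue  # no media attribute: the sheet applies to every medium
        types = [m.strip().lower() for m in media.split(",")]
        if "print" not in types and "all" not in types:
            skipped.append(href)
    return skipped


page = """
<link rel="stylesheet" href="bootstrap.css" media="screen">
<link rel="stylesheet" href="print.css" media="print">
<link rel="stylesheet" href="base.css">
"""
print(skipped_for_print(page))  # ['bootstrap.css']
```

Anything this reports needs either its media attribute removed or a print counterpart added, as described above.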
So, I am using TinyMCE as a text editor in my Django application. After saving and reloading the data, it shows all the HTML tags and inline CSS as plain text. I've also tried 'encoding': 'xml', and then I get something like this.
Your Django framework is likely escaping the content as a safety precaution. It would appear there is an easy way to stop it from doing that:
Disable HTML escaping in Django's TextField
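To illustrate what the escaping does, here is a small standard-library sketch. It is not Django code; it only mimics the effect of autoescaping with html.escape, and the stored string is a made-up example. In an actual Django template, the usual ways to opt out are the safe filter ({{ value|safe }}) or django.utils.safestring.mark_safe, and both should only be used on content you trust or have sanitized.

```python
from html import escape, unescape

# Hypothetical content saved by the rich-text editor.
stored = '<p style="color: red">Hello</p>'

# Escaping turns markup characters into entities, so the browser
# displays the tags as text instead of interpreting them.
escaped = escape(stored)
print(escaped)

# Unescaping recovers the original markup.
print(unescape(escaped) == stored)
```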
I am currently scraping a website for work so that I can sort the data locally. However, when I do this the HTML seems to be incomplete, and I suspect it changes as I scroll the page and more content is added. Can this happen? And if so, how can I make sure I scrape the whole website for processing?
I currently only know some Python and HTML for web scraping, and I am looking into what other elements may be affecting this issue (JavaScript, ReactJS, etc.).
I am expecting to get a list of 50 names when scraping the website, but it only returns 13. I have downloaded the whole HTML file to go through it, and none of the other names exist in the file, which is why I think the page may be changing dynamically.
Yes, the content of the HTML can be dynamic, and content loaded by JavaScript is the most likely cause. For Python, Scrapy with Splash may be a good choice to get started.
Depending on how the data is loaded, there are different methods for handling dynamic HTML content.
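A quick way to confirm that the missing items are loaded by JavaScript is to count them in the raw HTML before any script runs. The sketch below uses only the standard library. The class name "name", the sample markup, and the helper names are assumptions for illustration, so adjust them to the real page.

```python
from html.parser import HTMLParser


class ClassCounter(HTMLParser):
    """Count start tags that carry a given CSS class."""

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.count = 0

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if self.class_name in classes:
            self.count += 1


def count_elements_with_class(html, class_name):
    parser = ClassCounter(class_name)
    parser.feed(html)
    return parser.count


# Stand-in for the HTML returned by a plain HTTP request.
static_html = (
    '<ul><li class="name">Ann</li><li class="name">Bob</li></ul>'
    '<script src="load-more.js"></script>'
)

expected = 50
found = count_elements_with_class(static_html, "name")
if found < expected:
    print(f"Only {found} of {expected} items are in the static HTML; "
          "the rest are probably added by JavaScript.")
```

If the count comes up short, the remaining data usually arrives through a separate request made by the page's scripts, or needs a JavaScript-capable renderer such as Splash.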
I'm using Weasyprint to print an HTML template to PDF, and I keep getting a gap of 10cm on the right side.
I'm using #page:(size:letter;) as only page attribute.
I've tried setting the page size manually, but I still keep getting a huge space to the right of all the pages.
Any thoughts on what could be the problem?
Found the solution. It was a CSS problem. The class used to style the body was not at the beginning of the CSS file, and that caused erratic behavior with other styles declared before it.
I am building a screen clipping app.
So far:
I can get the HTML markup of the part of the web page the user has selected, including images and videos.
I then send it to a server, which processes the HTML with BeautifulSoup to sanitize it and convert any relative paths to absolute paths.
Now I need to render that part of the page, but I have no way to render the styling. Is there a library that can help with this, or any other way to do it in Python?
One way would be to fetch the whole webpage with urllib2 and remove the parts of the body I don't need and then render it.
But there must be a more pythonic way :)
Note: I don't want a screenshot. I am trying to render proper html with styling.
Thanks :)
Download the complete webpage, extract the style elements and the stylesheet link elements, and download the files referenced by the latter. That should give you the CSS used on the page.
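A minimal sketch of the extraction step, using only the standard library (BeautifulSoup would work just as well). The CssCollector name, the sample page, and the example.com URL are made up for illustration. Downloading the collected URLs is left out.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class CssCollector(HTMLParser):
    """Collect inline <style> contents and absolute stylesheet URLs."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.stylesheet_urls = []
        self.inline_css = []
        self._in_style = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "style":
            self._in_style = True
        elif tag == "link":
            rel = (attrs.get("rel") or "").lower().split()
            href = attrs.get("href")
            if "stylesheet" in rel and href:
                # Resolve relative hrefs against the page URL.
                self.stylesheet_urls.append(urljoin(self.base_url, href))

    def handle_endtag(self, tag):
        if tag == "style":
            self._in_style = False

    def handle_data(self, data):
        if self._in_style and data.strip():
            self.inline_css.append(data.strip())


page = """<html><head>
<link rel="stylesheet" href="/static/site.css">
<link rel="icon" href="/favicon.ico">
<style>body { margin: 0; }</style>
</head><body><p>clip</p></body></html>"""

collector = CssCollector("http://example.com/articles/1")
collector.feed(page)
print(collector.stylesheet_urls)  # ['http://example.com/static/site.css']
print(collector.inline_css)       # ['body { margin: 0; }']
```

Each URL in stylesheet_urls can then be fetched and served alongside the clipped markup.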
I have a website that gets a lot of links to YouTube and similar sites, and I wanted to know if there is any way to make a link automatically appear as a video, like what happens when you post a link to a video on Facebook: you can play it right on the page. Is there a way to do this without users actually posting the entire embed HTML code?
By the way, I am using Google App Engine with Python and Jinja2 templating.
Each YouTube video has a unique ID, which is present in the URL.
Examples here:
http://www.youtube.com/watch?v=DU0Q0U08gAc&feature=g-all-esi
http://youtu.be/DU0Q0U08gAc
In this case, DU0Q0U08gAc is the video ID.
This just gets inserted in the embed tag, as you can see here:
<iframe width="560" height="315" src="http://www.youtube.com/embed/DU0Q0U08gAc" frameborder="0" allowfullscreen></iframe>
So you need to parse the URL for the ID and insert it into an embed tag. For youtu.be-style links, I believe the ID is just whatever comes after the '/'; for youtube.com links, it's probably best practice to use the urlparse library to get the 'v' variable from the URL's query string. Hopefully someone will chime in if there's a corner case I'm not aware of.
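Here is a minimal sketch of that parsing, covering only the two URL shapes shown above. It uses urllib.parse, which is where the urlparse functions live in Python 3. The function names and the ID pattern check are my own additions, not anything YouTube documents, and other URL forms are deliberately not handled.

```python
import re
from urllib.parse import urlparse, parse_qs

EMBED_TEMPLATE = (
    '<iframe width="560" height="315" '
    'src="http://www.youtube.com/embed/{video_id}" '
    'frameborder="0" allowfullscreen></iframe>'
)

# Assumption: IDs consist of letters, digits, '-' and '_'. Rejecting
# anything else keeps arbitrary user input out of the generated HTML.
ID_PATTERN = re.compile(r"[A-Za-z0-9_-]+")


def youtube_video_id(url):
    """Return the video ID from a youtube.com or youtu.be URL, else None."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        candidate = parsed.path.lstrip("/").split("/")[0]
    elif host in ("youtube.com", "www.youtube.com"):
        candidate = parse_qs(parsed.query).get("v", [""])[0]
    else:
        return None
    return candidate if ID_PATTERN.fullmatch(candidate) else None


def embed_html(url):
    """Return iframe markup for a recognised video URL, else None."""
    video_id = youtube_video_id(url)
    if video_id is None:
        return None
    return EMBED_TEMPLATE.format(video_id=video_id)


print(youtube_video_id("http://www.youtube.com/watch?v=DU0Q0U08gAc&feature=g-all-esi"))
print(youtube_video_id("http://youtu.be/DU0Q0U08gAc"))
print(embed_html("http://example.com/not-a-video"))
```

Both sample URLs give DU0Q0U08gAc, and the unrecognised link gives None, so the caller can fall back to rendering a plain link.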
Your solution is Micawber, which is available in pure Python as well as in Django and Flask plugins. It works nicely with Jinja and embeds videos and pictures into your app much like Facebook does. You can install it via pip. Good docs; easy-to-follow examples. The author is responsive to questions. Works great, and it's totally free. Check it out:
http://readthedocs.org/projects/micawber/
You can also check out http://oembed.com and http://embed.ly ...although the latter is not free, and starts at $19/mo (as of July 2012).
Use this code when the embed link comes from a list value. In the template, put the following inside the iframe tag:
src="{{results[0].video_link}}"
Here "video_link" is the field name.