Using pisa to generate a table of contents via HTML conversion - Python

Does anyone know how to use the <pdf:toc /> tag so that the table of contents appears on the first page and all the text comes after it? This is what I have so far; it generates the table of contents after my text:
pdf.html
<html>
<body>
<div>
<pdf:toc />
</div>
<pdf:nextpage>
<br/>
<h1> test </h1>
<h2> second </h2>
some text
<h1> test_two </h1>
<h2> second </h2>
some text
</body>
</html>
I can't seem to get everything in the right position; even with <pdf:nextpage> it doesn't seem to work. Is there any help or documentation somewhere? The pisa docs are really thin on details.
One more thing: is it possible to make the table of contents entries jump to the right page? If so, how does that work?

I found I couldn't get that pagebreak to work for me, so I used inline CSS and, specifically, the page-break property to fix it.
In your case, this should do the trick:
<div style="page-break-after:always;">
<pdf:toc />
</div>
<h1> test </h1> ...etc...

As far as the links are concerned, there may be a way to automatically generate them, but I found it easier to manually create a table of contents using links and anchors:
<h1>Table of Contents</h1>
<ul>
<li><a href="#section1">The name of section 1</a></li>
<li><a href="#section2">The name of section 2</a></li>
</ul>
<h2>The name of section 1</h2>
<a name="section1"></a>
<h2>The name of section 2</h2>
<a name="section2"></a>
There's obviously some duplication, but I haven't found it difficult to maintain for my documents. It depends on how long or complicated you expect yours to become.
The bigger downside is that this option won't include page numbers.
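The duplication mentioned above can be reduced by generating both the link list and the anchors from one list of sections. The following is only a sketch using the standard library; the names slugify and build_toc_and_body are hypothetical, and it assumes the converter resolves in-document "#anchor" links the way the hand-written version does.

```python
import html
import re


def slugify(title):
    """Turn a section title into a lowercase anchor-friendly string."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return slug or "section"


def build_toc_and_body(sections):
    """Build a linked table of contents followed by the sections.

    sections is a list of (title, body_html) pairs. The body HTML is
    inserted as-is, so it must already be trusted markup.
    """
    toc_items = []
    body_parts = []
    for index, (title, body) in enumerate(sections, start=1):
        # The index keeps anchors unique even if two titles are identical.
        anchor = f"{slugify(title)}-{index}"
        safe_title = html.escape(title)
        toc_items.append(f'<li><a href="#{anchor}">{safe_title}</a></li>')
        body_parts.append(f'<h2><a name="{anchor}"></a>{safe_title}</h2>\n{body}')
    toc = "<h1>Table of Contents</h1>\n<ul>\n" + "\n".join(toc_items) + "\n</ul>"
    return toc + "\n" + "\n".join(body_parts)


if __name__ == "__main__":
    print(build_toc_and_body([("Intro", "<p>some text</p>"),
                              ("Next Step", "<p>more text</p>")]))
```

Like the manual version, this still produces no page numbers; it only keeps the titles and anchors in sync.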
Steve's comment about the page-break property is correct. I personally used a separate CSS file with
h2 {
page-break-before:always;
}
so that all of my sections would start on a new page.
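For completeness, here is a rough sketch of how such a stylesheet rule might be embedded and the page converted from Python. It assumes the xhtml2pdf package (the current home of pisa) is installed; build_document and write_pdf are hypothetical helper names, and the CSS is inlined in a <style> element rather than loaded from a separate file.

```python
PAGE_CSS = "h2 { page-break-before: always; }"


def build_document(body_html, css=PAGE_CSS):
    """Wrap body markup in a minimal page that carries the page-break rule."""
    return (
        "<html><head><style>" + css + "</style></head>"
        "<body>" + body_html + "</body></html>"
    )


def write_pdf(html_text, path):
    """Convert the HTML to a PDF file; returns True when no errors were reported."""
    # Imported here so the helper above stays usable without xhtml2pdf installed.
    from xhtml2pdf import pisa

    with open(path, "wb") as out:
        status = pisa.CreatePDF(html_text, dest=out)
    return not status.err


if __name__ == "__main__":
    page = build_document("<h2>test</h2>some text<h2>test_two</h2>some text")
    print(write_pdf(page, "out.pdf"))
```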

Related

Create hyperlink and give it an appealing name in Python

I am trying to give my hyperlink a display name before placing it in a PDF file, because I do not want to show the user a very long link.
Let's say my link is like:
https://www.google.com/search?q=images+of+dogs&rlz=1C1OKWM_esES969ES969&sxsrf=AOaemvJFDb3FKdXO1Yqb3A1BdjWNfw0Edg:1632237403618&tbm=isch&source=iu&ictx=1&fir=D5X9VdSPli-xYM%252CHUMB4Zy1hHwFaM%252C_&vet=1&usg=AI4_-kShuarwW69ikZrP2YUHRVOpRHKKfQ&sa=X&ved=2ahUKEwiPs4aVrpDzAhUR1RoKHQiNAZIQ9QF6BAgPEAE&biw=2133&bih=1013&dpr=0.9#imgrc=D5X9VdSPli-xYM
And I want it to appear in the pdf like:
"Link to picture"
My code:
texto_body=f"Hi,<br> <br> This is a test with a link {link} <br> <br> Thanks,"
body=f"""\
<html>
<body>
<p style="color:black;"> {texto_body}</p>
<img src="cid:image1" alt="Logo" style="width:90px;height:90px;"><br>
</body>
</html>
"""
Solved. I found that <a href="URL">text of the link</a> is the way to set up hyperlinks with a given name.
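Applied to the f-string from the question, that might look like the sketch below. The helper name make_link is hypothetical, and the URL is shortened for readability; html.escape is used so that characters such as & in the query string do not break the attribute.

```python
import html


def make_link(url, label):
    """Return an HTML anchor that shows label instead of the raw URL."""
    return f'<a href="{html.escape(url, quote=True)}">{html.escape(label)}</a>'


link = make_link("https://www.google.com/search?q=images+of+dogs&tbm=isch",
                 "Link to picture")
texto_body = f"Hi,<br> <br> This is a test with a link {link} <br> <br> Thanks,"

if __name__ == "__main__":
    print(texto_body)
```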

Find the elements only after a specific text in html using selenium python

Let's say I have the following HTML code:
<div class="12">
<div class="something"></div>
</div>
<div class="12">
<div class="34">
<span>TODAY</span>
</div>
</div>
<div class="12">
<div class="something"></div>
</div>
<div class="12">
<div class="something"></div>
</div>
Now if I use driver.find_elements_by_class_name("something"), I get every matching element in the HTML. But I want only the elements that come after a specific word ("TODAY"). How can I exclude the elements that appear before that word? The following divs and classes could be at any level.
You can search by XPath as below:
driver.find_elements_by_xpath('//*/text()[.="some specific word"]/following-sibling::div[@class="something"]')
Note that you might need some modifications in case your real HTML differs from the simplified HTML provided.
Update
Replace following-sibling with following if the required div nodes are not siblings:
driver.find_elements_by_xpath('//*/text()[.="some specific word"]/following::div[@class="something"]')
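A small wrapper can keep the expression in one place. This is only a sketch: xpath_after_text and find_after_text are hypothetical names, the driver argument is assumed to be a Selenium 4 WebDriver, and the expression matches the element whose own text equals the marker rather than the text node itself. XPath comparisons are case-sensitive, and the marker text is assumed to contain no double quotes.

```python
def xpath_after_text(marker_text, css_class):
    """Build an XPath for divs with css_class that follow the marker text."""
    return (
        f'//*[normalize-space(text())="{marker_text}"]'
        f'/following::div[@class="{css_class}"]'
    )


def find_after_text(driver, marker_text, css_class):
    """Return matching elements located after the marker in document order."""
    # "xpath" is the locator strategy string that By.XPATH stands for.
    return driver.find_elements("xpath", xpath_after_text(marker_text, css_class))


if __name__ == "__main__":
    print(xpath_after_text("TODAY", "something"))
```

With the HTML from the question, find_after_text(driver, "TODAY", "something") is intended to skip the first div and return only the ones after the span.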

How to wrap string by tag in Beautifulsoup?

I want to wrap the content of a lot of div-elements/blocks with p tags:
<div class='value'>
some content
</div>
It should become:
<div class='value'>
<p>
some content
</p>
</div>
My idea was to get the content (using bs4) by filtering strings with find_all and then wrap it with the new tag. I don't know whether that works; I can't filter content from tags with specific attributes/values.
I could do this with regex instead of bs4, but I'd like to do all the transformations (there are some more besides this one) in bs4.
Believe it or not, you can use wrap. :-)
Because you might, or might not, want to wrap inner div elements I decided to alter your HTML code a little bit, so that I could give you code that shows how to alter an inner div without changing the one 'outside' it. You will see how to alter all divs, I'm sure.
Here's how.
>>> from bs4 import BeautifulSoup
>>> soup = BeautifulSoup(open('pjoern.htm').read(), 'lxml')
>>> inner_div = soup.findAll('div')[1]
>>> inner_div
<div>
some content
</div>
>>> inner_div.contents[0].wrap(soup.new_tag('p'))
<p>
some content
</p>
>>> print(soup.prettify())
<html>
<body>
<div class="value">
<div>
<p>
some content
</p>
</div>
</div>
</body>
</html>
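To apply the same idea to every matching div, a loop over find_all with a class filter should be enough. The sketch below is an assumption-laden illustration: wrap_div_text is a hypothetical helper, it uses the built-in "html.parser" instead of lxml, and it wraps only the non-blank text directly inside each div.

```python
def wrap_div_text(html_text, css_class="value"):
    """Wrap the direct text content of each div with css_class in a <p> tag."""
    # Imported inside the function so the module loads even without bs4.
    from bs4 import BeautifulSoup, NavigableString

    soup = BeautifulSoup(html_text, "html.parser")
    for div in soup.find_all("div", class_=css_class):
        # Copy the child list first because wrap() modifies the tree.
        for child in list(div.contents):
            if isinstance(child, NavigableString) and child.strip():
                child.wrap(soup.new_tag("p"))
    return str(soup)


if __name__ == "__main__":
    print(wrap_div_text("<div class='value'>\nsome content\n</div>"))
```

Nested tags inside the div are left untouched, so this does not cover content that mixes text and inline elements in one paragraph.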

Show additional data when checkbox is true Django

I'm working on my first project in Django and learning it along the way. Currently I'm stuck at the part of the project where I have to show additional data if a checkbox is checked.
Long story short, there has to be a checkbox on my page named "Cars". If the user checks it, a select list of car models should be displayed immediately below the checkbox, without affecting the other inputs on the page.
Is there an easy way to accomplish this? Thanks for the help.
/edit
HTML code
<p class="dhcp"> DHCP: {{ form.dhcp }} </p>
<p class="collapse1 collapse"> IPv4 adresy: {{form.ipv4_adress }} </p>
<p class="dhcpv6"> DHCPv6: {{ form.dhcpv6 }} </p>
<p class="collapse2 collapse"> IPv6 adresy: {{form.ipv6_adress }} </p>
JS code
$(document).ready(function() {
$(".dhcp").click(function(event) {
$(".collapse1").fadeToggle().delay(100);
});
$(".dhcpv6").click(function(event) {
$(".collapse2").fadeToggle().delay(100);
});
})
That's just part of it; the template is quite big by now and not very informative because it is generated by a Django form. The main issue right now is that clicking anywhere on the same row as the checkbox triggers the toggle, not just clicking the checkbox itself.
Take a look at this JS Fiddle. It's a simple JS solution to your problem using fadeToggle. The delay of 100 is optional, but it makes the whole thing more elegant, although it conflicts with your "should be displayed immediately" ;)
<input class="collapsed" type="checkbox">Cars
<div class="collapse2 collapse" style="display:none;">
<li>car hello</li>
<li>car lala</li>
<li>car 1</li>
</div>
$(".collapsed").click(function(event){
$(".collapse2").fadeToggle().delay(100);
})
If there are multiple checkboxes like this, you may have to use the this keyword in JS.
That should do the trick; if not, leave a comment.

Django templates displaying the HTML tags as they are

Below is the index.html file inside my workspace/projectname/templates/appname:
<!DOCTYPE html>
<html>
<head>
<title>my news</title>
</head>
<body>
<h1>look below for news</h1>
{%if categories%}
<ul>
{%for category in categories%}
<li>{{category.name}}</li>
{%endfor%}
</ul>
{%endif%}
{%if headings%}
<p>
{%for heading in headings%}
{{heading.title}}
<br>
{{heading.content}}
{%endfor%}
</p>
{%endif%}
</body>
</html>
The problem is that the <ul> and <li> tags work and display the list as they should. The <a> tag also displays a hyperlink, but the <p> and <br> tags are not rendered and are displayed as text. I can't think what the problem might be. I am fairly new to Django.
Try using {{heading.content|safe}} or turn autoescape off (See docs).
Although the other answer accurately solves your problem, that approach is not always safe.
If you know that only trustworthy people are going to write that article/post, then you can simply turn Django's autoescaping off (as pointed out in the other answer).
But if you display HTML from an untrustworthy source, you are exposed to XSS attacks. In that case you should use an application like django-bleach. It will escape specific HTML tags like <script> and any other tags that you want to escape.
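To see why the tags show up as text in the first place, it may help to look at what escaping does to a string. The sketch below uses the standard library's html.escape as a stand-in; treating it as equivalent to Django's autoescaping is an assumption made for illustration, and the sample content is invented.

```python
import html

# Hypothetical value of heading.content as stored in the database.
content = "<p>Hello<br>world</p>"

# Escaping turns markup characters into entities, so the browser shows
# the tags literally instead of rendering them.
escaped = html.escape(content)

if __name__ == "__main__":
    print(escaped)  # &lt;p&gt;Hello&lt;br&gt;world&lt;/p&gt;
```

The safe filter skips this step for a value, which is why it should be reserved for content you trust.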
