Find duplicate id attributes - python

Before uploading to my server, I want to check whether I accidentally defined an id two or more times in one of my HTML files:
<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>The HTML5 Herald</title>
<meta name="description" content="The HTML5 Herald">
<meta name="author" content="SitePoint">
<link rel="stylesheet" href="css/styles.css?v=1.0">
</head>
<body>
<div id="test"></div>
<div id="test"></div>
</body>
</html>
The idea is to print an error message if there are duplicates:
"ERROR: The id="test" is not unique."

You can do this by using find_all to gather every element that has an id attribute, and then collections.Counter to pick out the ids that occur more than once:
import bs4
import collections
# html holds the markup of the file being checked
soup = bs4.BeautifulSoup(html, 'html.parser')
ids = [a.attrs['id'] for a in soup.find_all(attrs={'id': True})]
ids = collections.Counter(ids)
dups = [key for key, value in ids.items() if value > 1]
for d in dups:
    print('ERROR: The id="{}" is not unique.'.format(d))
>>> ERROR: The id="test" is not unique.
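If installing Beautiful Soup is not an option, the same find-then-count idea can be sketched with the standard library's html.parser module. This is only an illustration: the IdCollector class, the find_duplicate_ids helper, and the sample markup are names made up here, not part of the answer above.

```python
from collections import Counter
from html.parser import HTMLParser


class IdCollector(HTMLParser):
    """Hypothetical helper: records the value of every id attribute seen."""

    def __init__(self):
        super().__init__()
        self.ids = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value is None for bare attributes
        for name, value in attrs:
            if name == 'id' and value is not None:
                self.ids.append(value)


def find_duplicate_ids(markup):
    """Return the ids that occur more than once, sorted alphabetically."""
    collector = IdCollector()
    collector.feed(markup)
    collector.close()
    counts = Counter(collector.ids)
    return sorted(key for key, count in counts.items() if count > 1)


# Made-up sample markup with one duplicated id
page = '<body><div id="test"></div><div id="test"></div><p id="ok"></p></body>'
for dup in find_duplicate_ids(page):
    print('ERROR: The id="{}" is not unique.'.format(dup))
```

To check a real file, read its contents into a string first and pass that string to find_duplicate_ids.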

You could use a regex to find all ids in the HTML and then search for duplicates.
For example:
import re
html_page = """
<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>$The HTML5 Herald</title>
<div id="test1"></div>
<meta name="description" content="The HTML5 Herald">
<meta name="author" content="SitePoint">
<link $rel="stylesheet" href="css/styles.css?v=1.0">
</head>
<body>
<div id="test2"></div>
<div id="test2"></div>
</body>
<div id="test3"></div>
</html>
"""
ids_match = re.findall(r'(?<=\s)id=\"\w+\"',html_page)
print(ids_match) #-> ['id="test1"', 'id="test2"', 'id="test2"', 'id="test3"']
print(len(ids_match)) #-> 4
print(len(set(ids_match))) #->3
# the following prints True if there are duplicates in ids_match
print(len(ids_match) != len(set(ids_match))) #->True
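The length comparison only says whether duplicates exist, not which ids are affected. A possible extension, sketched here with a capturing group and collections.Counter so that the offending ids can be reported in the format the question asks for. The duplicate_ids helper and the sample string are assumptions for illustration, and like any regex over HTML this only handles simple, double-quoted id attributes.

```python
import re
from collections import Counter


def duplicate_ids(html_text):
    """Return the id values that appear more than once (hypothetical helper)."""
    # Same pattern as above, but capturing only the value between the quotes
    ids = re.findall(r'(?<=\s)id="(\w+)"', html_text)
    return sorted(key for key, count in Counter(ids).items() if count > 1)


# Made-up sample in the spirit of html_page above
sample = '<div id="test2"></div>\n<div id="test2"></div>\n<p id="test3"></p>'
for dup in duplicate_ids(sample):
    print('ERROR: The id="{}" is not unique.'.format(dup))
```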

Related

How can I redirect to a dynamic URL using HTML?

My code is
{% if user.employee.employee_id == 2 %}
<head>
<title>HTML Redirect</title>
<meta http-equiv="refresh"
content="1; url = employee/2/" />
</head>
{% elif user.employee.employee_id == 4 %}
<head> Wait!
<title>HTML Redirect</title>
<meta http-equiv="refresh"
content="1; url = employee/4/" />
</head>
As you can see above, I am using if/elif to choose the page based on the employee's id, and this approach does not scale. I want to use the employee_id as a variable instead, something like this:
x = user.employee.employee_id
<head> Wait!
<title>HTML Redirect</title>
<meta http-equiv="refresh"
content="1; url = employee/x/" />
</head>
Put the employee_id directly in the URL using a template variable:
url = employee/{{user.employee.employee_id}}/

The current path, FirstApp/Challenge/{url 'FirstApp:Challenge1'}, didn't match any of these error

I have a Challenge page that links to the Challenge1 and Challenge2 pages, but when I go to the Challenge page and try to open Challenge1 or Challenge2 I get the following error:
The current path, FirstApp/Challenge/{url 'FirstApp:Challenge1'}, didn't match any of these.
I have done similar things before, but this one does not seem to work. Can someone point out the error and a solution?
<!DOCTYPE html>
<html lang="en" dir="ltr">
<head>
<meta charset="utf-8">
<title>Challenge</title>
</head>
<body>
<h2>There are two challenges for you.</h2>
Accept challenge1
=========================================================================<br>
Accept challenge2.
</body>
</html>
<!DOCTYPE html>
<html lang="en" dir="ltr">
<head>
<meta charset="utf-8">
<title>Challenge 1</title>
</head>
<body>
This is challenge 1.
</body>
</html>
<!DOCTYPE html>
<html lang="en" dir="ltr">
<head>
<meta charset="utf-8">
<title>Challenge 2</title>
</head>
<body>
this is challenge2.
</body>
</html>
Views.py file
def Challenge2View(request):
    Challenge2Status= Challenge2.objects.filter(user=request.user)
    form=Challenge1Form()
    if request.method=="POST":
        form=Challenge1Form(request.POST)
        if form.is_valid():
            form.save(commit=True)
            return index(request)
        else:
            print("Invalid form")
    return render(request,'FirstApp/Challenge2.html',{'forms':form,'Challenge2Status':Challenge2Status})
def Challenge1View(request):
    Challenge1Status= Challenge1.objects.get(user=request.user)
    time=request.POST.get('Time')
    if time<=10:
        Challenge1Status.Hours_spent_learning+=time
        Challenge1Status.save()
    else:
        print("Time limit exceeded")
    return render(request,'FirstApp/Challenge1.html',{'Challenge1Status':Challenge1Status})
def ChallengeView(request):
    return render(request,'FirstApp/Challenge.html')
url.py application file
url(r'^Challenge/$',views.ChallengeView,name='Challenge'),
url(r'^Challenge1/$',views.Challenge1View,name='Challenge1'),
url(r'^Challenge2/$',views.Challenge2View,name='Challenge2'),

Get data from API and display on template

I am trying to get data from the stackoverflow API and display them on an html table in my template.
So far I have managed to get the data but cannot display them in the template. I end up getting the last one. I do know my loop is wrong, have tried a bunch of stuff but can't seem to figure it out.
My code so far:
def get_questions(request):
    context = {}
    r = requests.get('https://api.stackexchange.com/2.2/questions?fromdate=1525737600&order=desc&sort=activity&tagged=python&site=stackoverflow').json()
    for item in r['items']:
        context['owner'] = item['owner']['display_name']
        context['title'] = item['title']
        #some other attrs here
    template = 'questions/questions_list.html'
    context['greeting'] = 'Hello'
    return render(request,template,context)
My template code:
I haven't done anything fancy yet. Pretty simple.
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<meta http-equiv="X-UA-Compatible" content="ie=edge">
<link href="https://stackpath.bootstrapcdn.com/bootstrap/4.1.1/css/bootstrap.min.css" rel="stylesheet" integrity="sha384-WskhaSGFgHYWDcbwN70/dfYBj47jz9qbsMId/iRN3ewGhXQFZCSftd1LZCfmhktB" crossorigin="anonymous">
<script src="https://stackpath.bootstrapcdn.com/bootstrap/4.1.1/js/bootstrap.min.js" integrity="sha384-smHYKdLADwkXOn1EmN1qk/HfnUcbVRZyYmZ4qpPea6sjB/pTJ0euyQp0Mk8ck+5T" crossorigin="anonymous"></script>
<title>Questions</title>
</head>
<body>
{{ owner }} - {{ title }}
</body>
</html>
Each pass through your loop overwrites the same context keys, so only the last item survives. You need to append each result to a list and then render that list in your template.
Demo:
views.py
def get_questions(request):
    context = {}
    r = requests.get('https://api.stackexchange.com/2.2/questions?fromdate=1525737600&order=desc&sort=activity&tagged=python&site=stackoverflow').json()
    dataList = []
    for item in r['items']:
        dataList.append({'owner': item['owner']['display_name'], 'title': item['title']})
        #some other attrs here
    template = 'questions/questions_list.html'
    context['greeting'] = 'Hello'
    context['data'] = dataList
    return render(request,template,context)
Template
Iterate over your result and get all data
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<meta http-equiv="X-UA-Compatible" content="ie=edge">
<link href="https://stackpath.bootstrapcdn.com/bootstrap/4.1.1/css/bootstrap.min.css" rel="stylesheet" integrity="sha384-WskhaSGFgHYWDcbwN70/dfYBj47jz9qbsMId/iRN3ewGhXQFZCSftd1LZCfmhktB" crossorigin="anonymous">
<script src="https://stackpath.bootstrapcdn.com/bootstrap/4.1.1/js/bootstrap.min.js" integrity="sha384-smHYKdLADwkXOn1EmN1qk/HfnUcbVRZyYmZ4qpPea6sjB/pTJ0euyQp0Mk8ck+5T" crossorigin="anonymous"></script>
<title>Questions</title>
</head>
<body>
{% for i in data %}
{{ i.owner }} - {{ i.title }}
{% endfor %}
</body>
</html>
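The list-building step can be tried on its own, without Django or a network call. The sketch below uses a made-up payload shaped like the fields read in the view above (items, owner.display_name, title). The extract_rows helper and the 'unknown' fallback for a missing owner name are assumptions added for illustration, not part of the API or of the answer.

```python
def extract_rows(payload):
    """Collect one dict per question instead of overwriting a single key."""
    rows = []
    for item in payload.get('items', []):
        rows.append({
            # Fallback values are an assumption for items with missing fields
            'owner': item.get('owner', {}).get('display_name', 'unknown'),
            'title': item.get('title', ''),
        })
    return rows


# Made-up payload for illustration only
sample_payload = {
    'items': [
        {'owner': {'display_name': 'alice'}, 'title': 'First question'},
        {'owner': {'display_name': 'bob'}, 'title': 'Second question'},
        {'owner': {}, 'title': 'Question with no owner name'},
    ]
}

rows = extract_rows(sample_payload)
for row in rows:
    print('{} - {}'.format(row['owner'], row['title']))
```

In the view, the returned list would take the place of dataList and be stored under context['data'].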

How to avoid printing utf-8 characters in BeautifulSoup with replace_with

I am having a problem and I cannot find a way to solve it. I am trying to parse an HTML page and then replace a string using Beautiful Soup. The process looks correct and I do not get any errors, but when I open the new HTML page it contains some utf-8 characters that I do not want.
Sample of working code:
#!/usr/bin/python
import codecs
from bs4 import BeautifulSoup
html_sample = """
<!DOCTYPE html>
<html><head lang="en"><meta charset="UTF-8"><meta name="viewport" content="width=device-width, initial-scale=1"></head>
<body>
<div class="date">LAST UPDATE</div>
</body>
</html>
"""
try:
    my_soup = BeautifulSoup(html_sample.decode('utf-8'), 'html.parser') # html5lib or html.parser
    forecast = my_soup.find("div", {"class": "date"})
    forecast.tag = unicode(forecast).replace('LAST UPDATE', 'TEST')
    forecast.replace_with(forecast.tag)
    # print(my_soup.prettify())
    f = codecs.open('test.html', "w", encoding='utf-8')
    f.write(my_soup.prettify().encode('utf-8'))
    f.close()
except UnicodeDecodeError as e:
    print('Error, encoding/decoding: {}'.format(e))
except IOError as e:
    print('Error Replacing: {}'.format(e))
except RuntimeError as e:
    print('Error Replacing: {}'.format(e))
And the output with utf-8 characters in the new html page:
<!DOCTYPE html>
<html>
<head lang="en">
<meta charset="utf-8">
<meta content="width=device-width, initial-scale=1" name="viewport"/>
</meta>
</head>
<body>
<div class="date">TEST</div>
</body>
</html>
I think that I have mixed up the encoding and decoding process. Could someone with more knowledge in this area elaborate? I am a total beginner at both coding and encoding.
Thank you for your time and effort in advance.
There is no need to get into encoding here. You can replace the text content of a Beautiful Soup element by setting the element.string as follows:
from bs4 import BeautifulSoup
html_sample = """
<!DOCTYPE html>
<html><head lang="en"><meta charset="UTF-8"><meta name="viewport" content="width=device-width, initial-scale=1"></head>
<body>
<div class="date">LAST UPDATE</div>
</body>
</html>
"""
soup = BeautifulSoup(html_sample, 'html.parser')
forecast = soup.find("div", {"class": "date"})
forecast.string = 'TEST'
print(soup.prettify())
Output
<!DOCTYPE html>
<html>
<head lang="en">
<meta charset="utf-8"/>
<meta content="width=device-width, initial-scale=1" name="viewport"/>
</head>
<body>
<div class="date">
TEST
</div>
</body>
</html>
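If the modified document also needs to be saved to disk, as in the question, one option is to write the text once and let the file object do the encoding, rather than encoding by hand before writing. A minimal sketch, assuming Python 3: the markup string stands in for the str returned by soup.prettify(), and the write_html and read_html helpers and the temporary path are made up for illustration.

```python
import os
import tempfile


def write_html(path, markup):
    """Write a str to disk; the file object performs the UTF-8 encoding."""
    with open(path, 'w', encoding='utf-8') as handle:
        handle.write(markup)


def read_html(path):
    """Read the file back as text, decoding it as UTF-8."""
    with open(path, 'r', encoding='utf-8') as handle:
        return handle.read()


# Stand-in for soup.prettify(); includes one non-ASCII character on purpose
markup = '<div class="date">TEST \u00e9</div>'
path = os.path.join(tempfile.mkdtemp(), 'test.html')
write_html(path, markup)
round_trip = read_html(path)
print(round_trip == markup)
```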

This XML file does not appear to have any style information associated with it. The document tree is shown below.

When using the following code in a Django template:
<!DOCTYPE html>
<html lang="en">
<head>
<link href="http://52.11.183.14/static/wiki/bootstrap/css/wiki-bootstrap.css" type="text/css" rel="stylesheet"/>
<link href="http://52.11.183.14/static/wiki/bootstrap/css/simple-sidebar.css" type="text/css" rel="stylesheet"/>
<title> Profile - Technology βιβλιοθήκη </title>
</head>
<body>
<div class="container">
{% for p in profiles %}
{{p}}
{% endfor %}
</div>
</body>
</html>
I receive the following error:
This XML file does not appear to have any style information associated with it. The document tree is shown below.
Why? And what can I do to fix it?
Solved by replacing HttpResponse with render_to_response. The commented-out line sent the page with the content type application/xhtml+xml, so the browser parsed it as XML instead of HTML:
my_context={'profiles': profiles}
c = RequestContext(request,{'profiles': profiles})
return render_to_response('wiki/profile.html',
                          my_context,
                          context_instance=RequestContext(request))
#return HttpResponse(t.render(c), content_type="application/xhtml+xml")
Alternatively, replace the doctype and opening html tag of your template with this:
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html dir="rtl" xmlns="http://www.w3.org/1999/xhtml">
