Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 1 year ago.
Not sure if it's even possible, to be honest. All I'm trying to do is dynamically edit the text.
HTML:
<aside class="banner">
Place <span class="red">open</span>
</aside>
python:
reds = root.find_class("red")
for element in reds:
    # not sure what goes here
I already have code that I can use to edit the text remotely.
I know you are asking about lxml, but a great alternative when it comes to HTML files is bs4.
With bs4/BeautifulSoup it looks like this:
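# soup is the parsed document; NEW_VALUE is the replacement text
for element in soup.find_all("span", {"class": "red"}):
    element.string = NEW_VALUE
# write the modified document back out
with open("out.html", "w") as out_file:
    out_file.write(str(soup))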
https://github.com/poleszcz/stack-misc/blob/main/69193852-bs4-edit-content/edit.py
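Staying with lxml, as in the question, a minimal sketch might look like the following. It assumes the markup from the question (with the stray closing tag removed) is parsed with lxml.html, and "closed" is only a placeholder for whatever new text you want to set.

```python
import lxml.html

# Markup from the question; the replacement text "closed" is an assumed placeholder.
html = '<aside class="banner">Place <span class="red">open</span></aside>'

root = lxml.html.fromstring(html)

# find_class returns every element whose class attribute contains "red"
for element in root.find_class("red"):
    element.text = "closed"

# Serialize the modified tree back to a string
result = lxml.html.tostring(root, encoding="unicode")
print(result)
```

Setting element.text only replaces the text directly inside the element, so any child elements of the span would be left in place.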
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 10 months ago.
In my scraper I use the .select("div.class-name") method, but I have a problem: it returns the values joined together instead of separately.
Structure of my HTML:
<div class="class-name">
<div>Text1</div>
<div>Text2</div>
<div>Text3</div>
</div>
As a result it gives me a list ["Text1Text2Text3"]. Is there any way to keep the values separate, as they are in the HTML?
You mean like this?
from bs4 import BeautifulSoup
sample_html = '''<div class="class-name">
<div>Text1</div>
<div>Text2</div>
<div>Text3</div>
</div>'''
print(BeautifulSoup(sample_html, "lxml").select("div.class-name div"))
Output:
[<div>Text1</div>, <div>Text2</div>, <div>Text3</div>]
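If you want the text of each inner div rather than the tags themselves, a small sketch along these lines may help. It reuses the sample HTML from above and assumes the built-in "html.parser" backend, so lxml is not required.

```python
from bs4 import BeautifulSoup

sample_html = '''<div class="class-name">
<div>Text1</div>
<div>Text2</div>
<div>Text3</div>
</div>'''

soup = BeautifulSoup(sample_html, "html.parser")

# Select only the direct child divs and take the text of each one
texts = [div.get_text(strip=True) for div in soup.select("div.class-name > div")]
print(texts)

# Alternative: iterate over the non-empty strings inside the container
container = soup.select_one("div.class-name")
texts_alt = list(container.stripped_strings)
print(texts_alt)
```

The first form follows the HTML structure, while stripped_strings simply collects every non-blank piece of text under the container, whatever the nesting.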
Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 1 year ago.
This is what my .dat file looks like. I want to know how to extract data from it: for a line like 1::Toy Story (1995), I want each field in a separate column. I also want to do it without pandas or NumPy. Is there any way to do this?
with open('ml-1m/movies.dat', encoding='iso-8859-1') as datFile:
    print([data.split()[0] for data in datFile])
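Here is one way, with the result as a dictionary:
dict_of_film = {}
# the encoding is taken from the question; replace "path" with your file
with open(r"path", encoding="iso-8859-1") as dat_file:
    for line in dat_file:
        # fields are separated by "::"; *_ absorbs any extra trailing fields
        index, name, genre, *_ = line.rstrip("\n").split("::")
        dict_of_film[index] = {"name": name, "genre": genre}
print(dict_of_film)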
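If you would rather have plain columns than a dictionary, here is a rough sketch using only the standard library. The two sample lines are made up for illustration and assume the id::title::genres layout; in practice you would iterate over the open file instead of sample_lines.

```python
# Hypothetical sample lines, assumed to follow the id::title::genres layout
sample_lines = [
    "1::Toy Story (1995)::Animation|Children's|Comedy\n",
    "2::Jumanji (1995)::Adventure|Children's|Fantasy\n",
]

# One list of fields per line
rows = [line.rstrip("\n").split("::") for line in sample_lines]

# Transpose the rows so that each field becomes its own column
columns = list(zip(*rows))
ids, titles, genres = columns

print(ids)
print(titles)
print(genres)
```

zip(*rows) assumes every line has the same number of fields; a short line would silently truncate the columns, so it may be worth checking len(row) first on real data.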
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 2 years ago.
How can I grab the invite code from this string?
{awarded:1,inviteURL:https:\/\/www.example.com\/refer\/invite\/111A111A\/}
The expected output would be "111A111A".
Any help is appreciated
I tried it in a simple way. You could give more details for further improvement.
# raw string, so the backslashes are kept literally
s = r"{awarded:1,inviteURL:https:\/\/www.example.com\/refer\/invite\/111A111A\/}"
print(s[-11: -3])
This will do it with a regex:
import re
def findInvite(s):
    # match everything between "/invite\/" and the last "\/"
    return re.search(r"(?<=/invite\\/).*(?=\\/)", s).group()
assert findInvite(r"{awarded:1,inviteURL:https:\/\/www.example.com\/refer\/invite\/111A111A\/}") == "111A111A"
And if this isn't a string but a dict, then change the function to:
def findInvite(d):
    s = d["inviteURL"]
    return re.search(r"(?<=/invite\\/).*(?=\\/)", s).group()
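Both versions above expect the slashes to be escaped as \/. If the URL might also arrive unescaped (for example after JSON decoding), a slightly more tolerant pattern could be sketched like this. The names find_invite and INVITE_CODE are my own, and the pattern assumes the code never contains a slash or backslash.

```python
import re

# "invite", an optional backslash, a slash, then the code up to the next slash or backslash
INVITE_CODE = re.compile(r"invite\\?/([^\\/]+)")

def find_invite(text):
    """Return the invite code, or None if no code is found."""
    match = INVITE_CODE.search(text)
    return match.group(1) if match else None

escaped = r"{awarded:1,inviteURL:https:\/\/www.example.com\/refer\/invite\/111A111A\/}"
plain = "https://www.example.com/refer/invite/111A111A/"

print(find_invite(escaped))
print(find_invite(plain))
```

Returning None instead of calling .group() directly avoids an AttributeError when the input has no invite code at all.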
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
I'm looking to return all instances of the following in Python, but not sure how. As in, how can I search a String and print every time the following format is found:
<a href="[what I'm trying to return is here]" class="faux-block-link__overlay-link"
You need an HTML parser, like BeautifulSoup. Sample:
>>> from bs4 import BeautifulSoup
>>>
>>> s = '''<a href="[what I'm trying to return is here]" class="faux-block-link__overlay-link">link</a>'''
>>> BeautifulSoup(s, "html.parser").a["href"]
"[what I'm trying to return is here]"
where .a is equivalent to .find("a"). Note that BeautifulSoup provides a convenient dictionary-like access to element attributes.
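Since the question asks for every instance, not just the first, a sketch with find_all might be closer to what is needed. The three links and their href values below are invented for illustration; only the class name comes from the question.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup; only the class name is taken from the question
html = '''
<a href="/news/one" class="faux-block-link__overlay-link">One</a>
<a href="/news/two" class="other-link">Two</a>
<a href="/news/three" class="faux-block-link__overlay-link">Three</a>
'''

soup = BeautifulSoup(html, "html.parser")

# Keep only <a> tags that have the target class and actually carry an href
links = soup.find_all("a", class_="faux-block-link__overlay-link", href=True)
hrefs = [a["href"] for a in links]
print(hrefs)
```

Passing href=True skips anchors without an href attribute, so the dictionary-style access cannot raise a KeyError.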
Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 9 years ago.
I am new to grep, and I'm familiar with Python. My problem is to find every quoted string like "text" and replace it with <em>text</em>.
The source file is in HTML form.
Thanks
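That'll do the trick:
import re
s = '"text" "some"'
# wrap the contents of each pair of double quotes in <em> tags
res = re.sub(r'"([^"]*)"', r'<em>\1</em>', s)
# res == '<em>text</em> <em>some</em>'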
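Because the source file is HTML, the same pattern would also rewrite the quotes around attribute values. One rough way around that, assuming reasonably simple markup, is to split the document on tags and only substitute in the text between them. The helper name emphasize_quotes is my own, and the tag regex is a simplification that does not handle comments, scripts, or a ">" inside an attribute value.

```python
import re

TAG = re.compile(r"(<[^>]+>)")      # simplistic tag matcher, an assumption
QUOTED = re.compile(r'"([^"]*)"')   # same pattern as in the answer above

def emphasize_quotes(html):
    """Wrap quoted text in <em> tags, leaving the tags themselves untouched."""
    # Splitting with a capturing group alternates text (even indices) and tags (odd indices)
    parts = TAG.split(html)
    for i in range(0, len(parts), 2):
        parts[i] = QUOTED.sub(r"<em>\1</em>", parts[i])
    return "".join(parts)

print(emphasize_quotes('<p class="x">say "hi" now</p>'))
```

For anything beyond simple markup, a real HTML parser would be the safer choice.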