Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
I'm looking to return all instances of the following in Python, but I'm not sure how. That is, how can I search a string and print every occurrence of the following format:
<a href="[what I'm trying to return is here]" class="faux-block-link__overlay-link"
You need an HTML parser, like BeautifulSoup. Sample:
>>> from bs4 import BeautifulSoup
>>>
>>> s = '''<a href="[what I'm trying to return is here]" class="faux-block-link__overlay-link">link</a>'''
>>> BeautifulSoup(s, "html.parser").a["href"]
u"[what I'm trying to return is here]"
where .a is equivalent to .find("a"). Note that BeautifulSoup provides a convenient dictionary-like access to element attributes.
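Since the question asks for every occurrence rather than just the first, a minimal sketch using find_all with a class filter might look like the following. The sample markup and its href values are made up for illustration; only the class name comes from the question.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup; only the class name is taken from the question.
html = """
<a href="/news/one" class="faux-block-link__overlay-link">One</a>
<a href="/news/two" class="faux-block-link__overlay-link">Two</a>
<a href="/about" class="other-link">About</a>
"""

soup = BeautifulSoup(html, "html.parser")

# find_all returns every matching tag; href=True skips anchors without an href.
links = [
    a["href"]
    for a in soup.find_all("a", class_="faux-block-link__overlay-link", href=True)
]

for link in links:
    print(link)
```

The class_ keyword (with the trailing underscore) is used because class is a reserved word in Python.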
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 10 months ago.
In my scraper I use the .select("div.class-name") method, but I have a problem: it returns the values joined together rather than separated.
Structure of my html:
<div class="class-name">
<div>Text1</div>
<div>Text2</div>
<div>Text3</div>
</div>
And as a result it gives me a list ["Text1Text2Text3"]. Is there any way to separate it as in html?
You mean like this?
from bs4 import BeautifulSoup
sample_html = '''<div class="class-name">
<div>Text1</div>
<div>Text2</div>
<div>Text3</div>
</div>'''
print(BeautifulSoup(sample_html, "lxml").select("div.class-name div"))
Output:
[<div>Text1</div>, <div>Text2</div>, <div>Text3</div>]
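If the goal is the separated text rather than the tags themselves, one possible follow-up is to call get_text on each selected element. This is a sketch that assumes the same sample markup as above and uses the built-in "html.parser" so that lxml is not required.

```python
from bs4 import BeautifulSoup

sample_html = """<div class="class-name">
<div>Text1</div>
<div>Text2</div>
<div>Text3</div>
</div>"""

soup = BeautifulSoup(sample_html, "html.parser")

# "div.class-name > div" selects only the direct child divs of the container.
texts = [div.get_text(strip=True) for div in soup.select("div.class-name > div")]
print(texts)  # ['Text1', 'Text2', 'Text3']
```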
Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 1 year ago.
Not sure if it's even possible tbh. All I'm trying to do is dynamically edit the text.
HTML:
<aside class="banner">
Place <span class=red>open</span></a>
</aside>
python:
reds = root.find_class("red")
for element in reds:
    *not sure what goes here*
I already have code that I can use to edit the text remotely.
I know you are asking about lxml, but a great alternative for HTML files is bs4.
With bs4/BeautifulSoup it looks like this:
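# soup is a BeautifulSoup object built from the page; NEW_VALUE is the replacement text
for element in soup.find_all("span", {"class": "red"}):
    element.string = NEW_VALUE
# write the modified document back out
with open("out.html", "w") as out_file:
    out_file.write(str(soup))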
https://github.com/poleszcz/stack-misc/blob/main/69193852-bs4-edit-content/edit.py
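For completeness, staying with lxml as in the question, the loop body could probably just assign to each element's text attribute. This is a minimal sketch, assuming the page is parsed with lxml.html; the replacement text "closed" is a made-up value, and the stray closing a tag from the question's snippet is left out.

```python
import lxml.html

html = '<aside class="banner">Place <span class="red">open</span></aside>'
root = lxml.html.fromstring(html)

for element in root.find_class("red"):
    # .text holds the text directly inside the element
    element.text = "closed"  # hypothetical replacement text

# Serialize the modified tree back to a string
result = lxml.html.tostring(root, encoding="unicode")
print(result)
```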
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 3 years ago.
I'm trying to get the values of all keys named "url" from a JSON file, ignoring nesting, and then output them to a text file. How would I go about doing this?
I'm running Python 3.7 and cannot seem to find a solution.
r = requests.get('https://launchermeta.mojang.com/mc/game/version_manifest.json')
j = r.json()
The result expected from this would be a text file filled with links from this json file.
https://launchermeta.mojang.com/v1/packages/31fa028661857f2e3d3732d07a6d36ec21d6dbdc/a1.2.3_02.json
https://launchermeta.mojang.com/v1/packages/2dbccc4579a4481dc8d72a962d396de044648522/a1.2.3_01.json
https://launchermeta.mojang.com/v1/packages/48f077bf27e0a01a0bb2051e0ac17a96693cb730/a1.2.3.json
etc.
Using the requests library:
import requests
response = requests.get('https://launchermeta.mojang.com/mc/game/version_manifest.json').json()
url_list = []
for result in response['versions']:
    url_list.append(result['url'])
print(url_list)
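The loop above reads only the top-level "versions" list and prints the result. To collect "url" values at any nesting depth and write them to a text file, as the question asks, a recursive walk is one option. This is a sketch: collect_urls, the sample dictionary (a stand-in for r.json()), its "id" and "latest" fields, and the urls.txt file name are all assumptions for illustration.

```python
import os
import tempfile


def collect_urls(node, key="url"):
    """Recursively gather every value stored under `key`, at any depth."""
    found = []
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                found.append(v)
            else:
                found.extend(collect_urls(v, key))
    elif isinstance(node, list):
        for item in node:
            found.extend(collect_urls(item, key))
    return found


# Hypothetical stand-in for r.json(); the layout is assumed, the URLs are from the question.
sample = {
    "latest": {"release": "example"},
    "versions": [
        {"id": "a1.2.3_02", "url": "https://launchermeta.mojang.com/v1/packages/31fa028661857f2e3d3732d07a6d36ec21d6dbdc/a1.2.3_02.json"},
        {"id": "a1.2.3_01", "url": "https://launchermeta.mojang.com/v1/packages/2dbccc4579a4481dc8d72a962d396de044648522/a1.2.3_01.json"},
    ],
}

urls = collect_urls(sample)

# Write one URL per line; the output location is an arbitrary choice.
out_path = os.path.join(tempfile.gettempdir(), "urls.txt")
with open(out_path, "w") as f:
    f.write("\n".join(urls) + "\n")
```

With the real response, passing j (the parsed JSON) in place of sample should work the same way.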
Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 6 years ago.
Currently I'm working with the API of an online game I play to build a tool, and in doing so I've encountered a problem. The API returns JSON files to the user. While creating a class that parses these JSON files, I realized I'd like to format a number inside one of them so that instead of "585677088.5" it reads "585,677,088". This would be easy enough if the string contained only this number; however, it contains a bunch of other text as well.
Here's a block of the text
loan:0
unpaidfees:-3510000
total:585677088.5
I'm using python to do this.
The only existing code I have in place is:
import urllib2
data = urllib2.urlopen("URL")
Like this:
>>> my_str = """loan:0
... unpaidfees:-3510000
... total:585677088.5"""
>>> map(lambda x: (x.split(":")[0], int(float(x.split(":")[-1]))), my_str.split("\n"))
[('loan', 0), ('unpaidfees', -3510000), ('total', 585677088)]
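The session above is Python 2 (map returns a list there) and stops at converting to integers. To get the comma-grouped form the question asks for, a Python 3 sketch could use the "," format specifier. The variable names here are illustrative.

```python
my_str = """loan:0
unpaidfees:-3510000
total:585677088.5"""

formatted = []
for line in my_str.splitlines():
    name, _, value = line.partition(":")
    # int() truncates toward zero, so 585677088.5 becomes 585677088;
    # the "," format spec inserts thousands separators.
    formatted.append(f"{name}:{int(float(value)):,}")

print("\n".join(formatted))
# loan:0
# unpaidfees:-3,510,000
# total:585,677,088
```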
Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 9 years ago.
I am a newbie with grep, but I'm familiar with Python. My problem is to find every string inside quotes, like "text", and replace it with <em>text</em>.
The source file is HTML.
Thanks
That'll do the trick
import re
s = '"text" "some"'
res = re.subn('"([^"]*)"', '<em>\\1</em>', s)[0]
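Since only the substituted string is needed, re.sub is arguably a more direct fit than re.subn(...)[0]. A small sketch with raw strings follows; the sample sentence is made up.

```python
import re

s = 'He said "text" and then "some" more.'

# Raw strings keep the backslash in \1 literal; [^"]* matches up to the next quote.
res = re.sub(r'"([^"]*)"', r'<em>\1</em>', s)
print(res)  # He said <em>text</em> and then <em>some</em> more.
```

Note that on real HTML this pattern would also match quoted attribute values such as href="...", so applying it only to text nodes (for example via an HTML parser) may be safer.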