Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 5 years ago.
Improve this question
I'm building relatively complicated xpath expressions in Python, in order to pass them to selenium. However, its pretty easy to make a mistake, so I'm looking for a library that allows me to build the expressions without messing about with strings. For example, instead of writing
locator='//ul[#class="comment-contents"][contains(., "West")]/li[contains(., "reply")]
I could write something like:
import xpathbuilder as xpb
locator = xpb.root("ul")
.filter(attr="class",value="comment-contents")
.filter(xpb.contains(".", "West")
.subclause("li")
.filter(xpb.contains (".", "reply"))
which is maybe not as readable, but is less error-prone. Does anything like this exist?
though this is not exactly what you want.. you can use css selector
...
import lxml.cssselect
csssel = 'div[class="main"]'
selobj = lxml.cssselect.CSSSelector(csssel)
elements = selobj(documenttree)
generated XPath expression is in selobj.path
>>> selobj.path
u"descendant-or-self::div[#class = 'main']"
You can use lxml.etree that allows to write code as the following:
from lxml.builder import ElementMaker # lxml only !
E = ElementMaker(namespace="http://my.de/fault/namespace", nsmap={'p' : "http://my.de/fault/namespace"})
DOC = E.doc
TITLE = E.title
SECTION = E.section
PAR = E.par
my_doc = DOC(
TITLE("The dog and the hog"),
SECTION(
TITLE("The dog"),
PAR("Once upon a time, ..."),
PAR("And then …")
),
SECTION(
TITLE("The hog"),
PAR("Sooner or later …")
)
)
Related
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 1 year ago.
Improve this question
Does anyone knows how to do similar in Python to Bash' seq ?
For example, I would like to search for a line in a document that contains the chain of characters "measurement = abc" and replace "abc" with something of my own (using "{}".format() or something like it).
Do I need to check each string/substring in the document with an iterative loop or is there a built-in command or package in Python that already does that ?
Thank you!
I would do this with the regex implementation in Python (built-in re module).
This can be done with a positive lookbehind assertion.
(?<=...) Matches if the current position in the string is preceded by a match for ... that ends at the current position.
import re
data = "measurement = abc some other text measurement = abc some other text"
data = re.sub(r'(?<=measurement\s=\s)abc', '123', data) # positive lookbehind assertion
print(data) # "measurement = 123 some other text measurement = 123 some other text"
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 2 years ago.
Improve this question
More specifically, I have an collection of string elements in C# that I want to print out like so:
{ "Element 1", "Element 2", "Element 3" }
Is there a library already built to do this? I know I could write a little bit of code for myself but I am curious. In python, if I had a list, I could easily print out the list with nice formatting like so.
import pprint
my_list = ['foo', 'bar', 'cats', 'dogs']
pprint.pprint(my_list)
which would yield me ['foo', 'bar', 'cats', 'dogs'] to the console.
Comma separated values are fine, I do not need the curly brackets or other formatting.
You can do it with string.Join(),
Concatenates the members of a constructed IEnumerable collection of
type String, using the specified separator between each member.
using System;
using System.Linq;
using System.Collections.Generic;
...
var result = string.Join(", ", my_list); //Concatenate my_list elements using comma separator
Console.WriteLine(result); //Print result to console
.Net Fiddle
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 3 years ago.
Improve this question
Below is the code and output.
article = Article(url,keep_article_html=True)
article.download()
article.parse()
print(article.article_html)
<div><p class="tt_adsense_top">
</p>
<p> test test </p>
I want to delete this part from the string
<div><p class="tt_adsense_top">
</p>
<p>
only leave
<p> test test </p>
when i use python re to match it, i can only match this line i don't know how to match blank line and blank space.
<div><p class="tt_adsense_top">
who can give me a example to delete it
The 'replace' function for python strings would be the easiest way. So for example you would do
a = "some string removethis"
a = a.replace("removethis", "")
You could also use 'remove' function, and the important distinction between that and replace is that 'remove' only removes the first string it finds while replace will replace all found substrings with the 2nd argument you pass to the function. I would advise reading the python documentation on string methods it has a lot interesting and important functions you can play around with.
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 6 years ago.
Improve this question
First of all, I am new to python.
I am trying to make a web crawler that keeps keeps checking a particular website until it finds a specific tv series keywords available.
I have gone through the other answers many times by now, but nothing works for me.
I am unable to install pygame as suggested here:- Playing mp3 song on python
Tried importing the the webbrowser, installing vlc module from git, things dont work out.
Can someone just give me a simple detailed guide to do this?
Assuming the song 'your_song.mp3' is stored in your computer in the directory '/Users/', you can open the song from your python program doing this:
import subprocess
subprocess.Popen(['open','/Users/your_song.mp3'])
Note: the previous code will work on OS X. For windows, type instead:
subprocess.Popen(['start','/Users/your_song.mp3'],shell=True)
Assuming you are on a *nix based system, here is a simple example using the python standard library.
>>> import urllib.request
>>> import subprocess
>>> url = 'http://example.com/'
>>> req = urllib.request.Request(url)
>>> data = urllib.request.urlopen(req)
>>> data
<http.client.HTTPResponse object at 0x7f483018be80>
>>> html_data = data.read()
>>> ## replace it with your logic
>>> your_case_here = True
>>> if your_case_here:
... subprocess.run(["gnome-open", "/path/to/mo3_file/hello.mp3", "/dev/null"], stdout=subprocess.PIPE)
... else:
... pass
>>>
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 4 months ago.
Improve this question
I don't know if this is the right place to ask this question, but here goes…
I have an XML file that I want to read. Thus far, I have been using lxml.etree.ElementTree. However, I find that I need functionality that allows me to travel from a child node, to its parent node in the XML.
This seems to be not possible with lxml.etree.ElementTree. Is there an xml parsing library that will allow me to do this?
In case it matters, I'm on Python 2.7.3
Thank you
xml.etree.ElementTree.Elements do not keep track of parent elements, but lxml.etree._Elements do:
parent = elt.getparent()
For example,
import lxml.etree as ET
# import xml.etree.ElementTree as ET
text = '''\
<root>
<foo>
<bar/>
<bar/>
</foo>
</root>'''
root = ET.fromstring(text)
for elt in root.findall('foo/bar'):
parent = elt.getparent()
print(parent)
yields
<Element foo at 0xb7415c34>
<Element foo at 0xb7415c34>