Cannot call xml.dom.minidom.parse inside class - python

I am unable to call xml.dom.minidom.parse() within my class.
As a simple example,
import xml.dom.minidom

class XmlReader:
    def __init__(self, xml):
        self.xml = xml
        DOMTree = xml.dom.minidom.parse("test.xml")

xmlReader = XmlReader("test.xml")
Throws:
  File "handler2.py", line 10, in ?
    xmlReader = XmlReader("test.xml")
  File "handler2.py", line 8, in __init__
    DOMTree = xml.dom.minidom.parse("test.xml")
AttributeError: 'str' object has no attribute 'dom'
However, outside the class I am able to call xml.dom.minidom.parse just fine.
What do I need to change in order to be able to call the function within my XmlReader class?

Inside your constructor, xml refers to the parameter xml instead of the module xml. This is called shadowing. Choose a different name for one of them.
import xml as xml_module
or
from xml.dom import minidom
or
def __init__(self, xml_data):
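To make the shadowing concrete, here is a minimal sketch. The names shadowed, xml_source and dom_tree are hypothetical, and parseString is used instead of parse only so that the example runs without a test.xml file on disk.

```python
import xml.dom.minidom


def shadowed(xml):
    # Inside this function, "xml" is the string argument, not the module,
    # so the attribute lookup "xml.dom" fails with AttributeError.
    return xml.dom.minidom.parseString(xml)


class XmlReader:
    def __init__(self, xml_source):
        # The parameter no longer hides the module name "xml".
        self.xml_source = xml_source
        self.dom_tree = xml.dom.minidom.parseString(xml_source)


try:
    shadowed("<root/>")
except AttributeError as exc:
    shadow_error = exc
else:
    shadow_error = None

reader = XmlReader("<root><item/></root>")
root_tag = reader.dom_tree.documentElement.tagName
```

Renaming the parameter is usually the least invasive of the three options, since the import lines stay as they are.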

Related

Python class attribute 'is not defined' when referenced by another class attribute

Using the following, I am able to successfully create a parser and add my arguments to self._parser through the __init__() method.
class Parser:
    _parser_params = {
        'description': 'Generate a version number from the version configuration file.',
        'allow_abbrev': False
    }
    _parser = argparse.ArgumentParser(**_parser_params)
Now I wish to split the arguments into groups so I have updated my module, adding some classes to represent the argument groups (in reality there are several subclasses of the ArgumentGroup class), and updating the Parser class.
class ArgumentGroup:
    _title = None
    _description = None

    def __init__(self, parser) -> ArgumentParser:
        parser.add_argument_group(*self._get_args())

    def _get_args(self) -> list:
        return [self._title, self._description]

class ArgumentGroup_BranchType(ArgumentGroup):
    _title = 'branch type arguments'

class Parser:
    _parser_params = {
        'description': 'Generate a version number from the version configuration file.',
        'allow_abbrev': False
    }
    _parser = argparse.ArgumentParser(**_parser_params)
    _argument_groups = [cls(_parser) for cls in ArgumentGroup.__subclasses__()]
However, I'm now seeing an error.
Traceback (most recent call last):
...
File "version2/args.py", line 62, in <listcomp>
_argument_groups = [cls(_parser) for cls in ArgumentGroup.__subclasses__()]
NameError: name '_parser' is not defined
What I don't understand is why _parser_params exists when it is referenced by another class attribute, but _parser seemingly does not exist in the same scenario. How can I refactor my code to add the parser groups as required?
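This comes from the confluence of two quirks of Python:
A class statement does not create a scope that nested scopes can see into.
A list comprehension does create a new local scope.
As a result, the name _parser inside the comprehension is looked up in a local scope whose closest enclosing scope is the global scope, so it cannot refer to the about-to-be class attribute. (_parser_params works because it is used directly in the class body, not inside a nested scope.)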
A simple workaround would be to replace the list comprehension with a regular for loop.
_argument_groups = []
for cls in ArgumentGroup.__subclasses__():
    _argument_groups.append(cls(_parser))
(A better solution would probably be to stop using class attributes where instance attributes make more sense.)
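The scoping behaviour can be reproduced without argparse. The following is a small sketch with hypothetical names (WithComprehension, WithLoop, factor, scaled): the comprehension cannot see the class-level name, while the plain loop can.

```python
try:
    class WithComprehension:
        factor = 2
        # "factor" is looked up from the comprehension's own scope, which
        # skips the class body and falls through to the global scope.
        scaled = [factor * n for n in range(3)]
except NameError as exc:
    comprehension_error = exc
else:
    comprehension_error = None


class WithLoop:
    factor = 2
    scaled = []
    # A for loop runs directly in the class body, so "factor" is visible.
    for n in range(3):
        scaled.append(factor * n)
    del n  # avoid leaving the loop variable behind as a class attribute
```

Note that only the first iterable of a comprehension is evaluated in the class body, which is why ArgumentGroup.__subclasses__() itself is not the part that fails in the question.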

Python: Not able to read properties from property file

I'm trying to read configurations from a property file and store those properties in a variable so that it can be accessed from any other class.
I'm able to read the configuration from the config file and print the same but I'm getting an exception when those variables are accessed from some other class.
my config file
Config.cfg.txt
[Ysl_Leader]
YSL_LEADER=192
Generic class where I will store my properties in a variable.
ConfigReader.py
import configparser

class DockerDetails:
    config = configparser.RawConfigParser()
    _SECTION = 'Ysl_Leader'
    config.read('Config.cfg.txt')
    YSL_Leader = config.get('Ysl_Leader', 'YSL_LEADER')
    print(YSL_Leader)

Another class where I'm trying to get the 'YSL_Leader' value
def logger(request):
    print(ConfigReader.DockerDetails.YSL_Leader)
Exception:
File "C:\Users\pvivek\AppData\Local\Programs\Python\Python37-32\lib\configparser.py", line 780, in get
d = self._unify_values(section, vars)
File "C:\Users\pvivek\AppData\Local\Programs\Python\Python37-32\lib\configparser.py", line 1146, in _unify_values
raise NoSectionError(section) from None
configparser.NoSectionError: No section: 'Ysl_Leader'
FYI: I'm not getting any exception when I run ConfigReader.py alone
Analyzing your question, it looks like you are trying to create an environment (settings) file. If you use a class to read the file, do the reading in its constructor (storing the parser on self) and instantiate the class to access its values. You can equally well use a function to do the reading. Either way, the result can be accessed like a dictionary.
Configuration file (config.ini):
[DEFAULT]
ANY = ANY
[Ysl_Leader]
YSL_LEADER = 192
[OTHER]
VALUE = value_config
import configparser

# using classes
class Cenv(object):
    """
    [use the constructor to start reading the file]
    """
    def __init__(self):
        self.config = configparser.ConfigParser()
        self.config.read('config.ini')

# using functions
def Fenv():
    config = configparser.ConfigParser()
    config.read('config.ini')
    return config

def main():
    # to be able to access it is necessary to instantiate the class
    instance = Cenv()
    cfg = instance.config
    # access the elements using the key (like dictionaries)
    print(cfg['Ysl_Leader']['YSL_LEADER'])
    print(cfg['OTHER']['VALUE'])
    # call function and assign returned values
    cfg = Fenv()
    print(cfg['Ysl_Leader']['YSL_LEADER'])
    print(cfg['OTHER']['VALUE'])
    # in the case of django assign values in module settings

if __name__ == '__main__':
    main()
You can interpret the result as follows (as a dictionary):
{
    "Ysl_Leader": {
        "YSL_LEADER": "192"
    },
    "OTHER": {
        "VALUE": "value_config"
    }
}
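One thing worth checking, although the question does not confirm it is the cause: configparser's read() silently skips files it cannot open and returns the list of files it actually parsed, so a relative file name resolved against a different working directory ends in NoSectionError rather than a file error. The sketch below assumes the config file sits next to the module that reads it; load_config and config_path_next_to are hypothetical helper names.

```python
import configparser
from pathlib import Path


def config_path_next_to(module_file, filename="Config.cfg.txt"):
    # Build the path from the module's own location instead of relying
    # on the current working directory of whoever imports the module.
    return Path(module_file).resolve().parent / filename


def load_config(path):
    config = configparser.ConfigParser()
    # read() returns the list of files that were successfully parsed;
    # an empty list means the file was missing or unreadable.
    parsed_files = config.read(path)
    if not parsed_files:
        raise FileNotFoundError(f"could not read config file: {path}")
    return config
```

Inside ConfigReader.py this would be called as load_config(config_path_next_to(__file__)), so the lookup no longer depends on where the program was started.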

Instantiate a class from an abstract syntax tree

I have a .py file that contains the definition of a class. I cannot change that file.
I have parsed the contents of that file and gotten an abstract syntax tree (ast) for it, more specifically I've got an instance of ast.ClassDef().
How can I instantiate that class from the ast.ClassDef()?
Here's a contrived demo showing what I want to do:
import ast

src = """
class MyClass():
    def __init__(self):
        self.a = 1
        self.b = 2
        self.thesum = self.a + self.b
"""
p = ast.parse(src)
class_ = [node for node in ast.walk(p) if isinstance(node, ast.ClassDef)][0]
class_ #I want to instantiate the class that is referenced by class_
Googling suggests that maybe compile() could help but I couldn't get that to work as it seems to expect executable code rather than a class definition. In other words, this didn't do very much:
exec(compile(class_.body, "fakemodule", 'exec'))
it throws an error:
TypeError: expected a readable buffer object
So, how can I instantiate that class?
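One possible approach, sketched here for Python 3.8 or later: compile() accepts an ast.Module, not a bare list of statements such as class_.body, so the ClassDef node can be wrapped in a new module, compiled, and executed into a dictionary from which the class object is then fetched. The names class_node, module and namespace are hypothetical.

```python
import ast

src = '''
class MyClass:
    def __init__(self):
        self.a = 1
        self.b = 2
        self.thesum = self.a + self.b
'''

tree = ast.parse(src)
class_node = next(n for n in ast.walk(tree) if isinstance(n, ast.ClassDef))

# Wrap the single ClassDef in a module so that compile() accepts it.
module = ast.Module(body=[class_node], type_ignores=[])
ast.fix_missing_locations(module)

# Executing the compiled module defines the class inside "namespace".
namespace = {}
exec(compile(module, filename="<ast>", mode="exec"), namespace)

MyClass = namespace[class_node.name]
instance = MyClass()
```

This only defines what the class body itself needs. If the class refers to imports or helpers from the original file, those would have to be placed in namespace as well.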

Using python class with spark DataFrame to parse URL's

I'm trying to process URLs in a pyspark dataframe using a class that I've written and a udf. I'm aware of urllib and other URL parsing libraries, but for this case I need to use my own code.
In order to get the tld of a url I cross check it against the iana public suffix list.
Here's a simplification of my code
class Parser:
    # list of available public suffixes for extracting top level domains
    file = open("public_suffix_list.txt", 'r')
    data = []
    for line in file:
        if line.startswith("//") or line == '\n':
            pass
        else:
            data.append(line.strip('\n'))

    def __init__(self, url):
        self.url = url
        #the code here extracts port,protocol,query etc.
        #I think this bit below is causing the error
        matches = [r for r in self.data if r in self.hostname]
        #extra functionality in my actual class
        i = matches.index(self.string)
        try:
            self.tld = matches[i]
        # logic to find tld if no match
The class works in pure python so for example I can run
import Parser
x = Parser("www.google.com")
x.tld #returns ".com"
However when I try to do
import Parser
from pyspark.sql.functions import udf
parse = udf(lambda x: Parser(x).url)
df = sqlContext.table("tablename").select(parse("column"))
When I call an action I get
File "<stdin>", line 3, in <lambda>
File "<stdin>", line 27, in __init__
TypeError: 'in <string>' requires string as left operand
So my guess is that it's failing to interpret the data as a list of strings?
I've also tried to use
file = sc.textFile("my_file.txt")\
    .filter(lambda x: not x.startswith("//") and x != "")\
    .collect()
data = sc.broadcast(file)
to open my file instead, but that causes
Exception: It appears that you are attempting to reference SparkContext from a broadcast variable, action, or transforamtion. SparkContext can only be used on the driver, not in code that it run on workers. For more information, see SPARK-5063.
Any ideas?
Thanks in advance
EDIT: Apologies, I didn't have my code to hand so my test code didn't explain very well the problems I was having. The error I initially reported was a result of the test data I was using.
I've updated my question to be more reflective of the challenge I'm facing.
Why do you need a class in this case? The code defining your class is incorrect: you never declared self.data before using it in the __init__ method. The only relevant line that affects the output you want is self.string=string, so you are basically passing the identity function as the udf.
The UnicodeDecodeError is due to an encoding issue in your file; it has nothing to do with your definition of the class.
The second error is in the line sc.broadcast(file), details of which can be found here: Spark: Broadcast variables: It appears that you are attempting to reference SparkContext from a broadcast variable, action, or transforamtion
EDIT 1
I would restructure your code as follows. You need to create the instance attribute self.data, by assigning self.data = data, before you can use it. Also, anything you write in the class body before the __init__ method is executed when the class is defined, whether or not you ever instantiate the class, so moving the file parsing out of the class does not change when it runs.
# list of available public suffixes for extracting top level domains
file = open("public_suffix_list.txt", 'r')
data = []
for line in file:
    if line.startswith("//") or line == '\n':
        pass
    else:
        data.append(line.strip('\n'))

class Parser:
    def __init__(self, url):
        self.url = url
        self.data = data
        #the code here extracts port,protocol,query etc.
        #I think this bit below is causing the error
        matches = [r for r in self.data if r in self.hostname]
        #extra functionality in my actual class
        i = matches.index(self.string)
        try:
            self.tld = matches[i]
        # logic to find tld if no match
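The point about class-body code running at definition time can be checked with a few lines. This is only an illustration with hypothetical names (events, Eager), not part of the Parser code above.

```python
events = []


class Eager:
    # Statements in the class body run once, when the class is defined.
    events.append("class body ran")

    def __init__(self):
        # This runs only when an instance is created.
        events.append("__init__ ran")


# Snapshot taken before any instance exists.
events_before_instantiation = list(events)

Eager()
```

Here events_before_instantiation already contains the class-body entry, and the __init__ entry appears only after Eager() is called.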

Attribute error in python while parsing an XML

I am kinda new to Python. I am working on a project that parses an XML in Python and my Python code to do so is :
from xml.dom import minidom
from copy import copy

class Xmlparse:
    def __init__(self, xmlfile):
        self = minidom.parse(xmlfile)

    def findadress(self):
        itemlist =self.getElementsByTagName('addresses')
        return itemlist[0].attributes['firstname'].value

if __name__ == '__main__':
    with open("sample.xml") as f:
        parse = Xmlparse(f)
        print parse.findadress()
But when I run this code I get an output error:
AttributeError: Xmlparse instance has no attribute 'findadress'
The findadress function is spelled correctly in the main block, but for some reason I am still getting this error.
Any help is really appreciated.
I also wanted to know: how can I validate the XML against an XSD schema in Python?
"self = minidom.parse(xmlfile)" only rebinds the local name self to the parsed document, so nothing is stored on the Xmlparse object you just created. You want to assign the XML doc to an attribute instead:
from xml.dom import minidom
from copy import copy

class Xmlparse:
    def __init__(self, xmlfile):
        self.doc = minidom.parse(xmlfile)

    def findadress(self):
        itemlist =self.doc.getElementsByTagName('addresses')
        return itemlist[0].attributes['firstname'].value
the evil is in self = minidom.parse(xmlfile)
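A self-contained version of the fix, as a sketch: the XML is parsed from an in-memory string with minidom.parseString so that no sample.xml is needed, and the sample document and the names SAMPLE_XML, XmlParse and find_address are made up for illustration.

```python
from xml.dom import minidom

# Hypothetical sample document; the real sample.xml is not shown in the question.
SAMPLE_XML = '<root><addresses firstname="Ada"/></root>'


class XmlParse:
    def __init__(self, xml_text):
        # Store the parsed document on the instance instead of rebinding self.
        self.doc = minidom.parseString(xml_text)

    def find_address(self):
        items = self.doc.getElementsByTagName("addresses")
        return items[0].attributes["firstname"].value


parser = XmlParse(SAMPLE_XML)
first_name = parser.find_address()
```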
