How to represent file structure as python objects - python

I'm trying to find a way to represent a file structure as python objects so I can easily get a specific path without having to type out the string every time. This works for my case because I have a static file structure (it does not change).
I thought I could represent directories as classes and files in the directory as class/static variables.
I want to be able to navigate through python objects so that it returns the path I want, i.e.:
print(FileStructure.details.file1) # root\details\file1.txt
print(FileStructure.details) # root\details
What I get instead from the code below is:
print("{0}".format(FileStructure())) # root
print("{0}".format(FileStructure)) # <class '__main__.FileStructure'>
print("{0}".format(FileStructure.details)) # <class '__main__.FileStructure.details'>
print("{0}".format(FileStructure.details.file1)) # details\file1.txt
The code I have so far is...
import os
class FileStructure(object): # Root directory
    root = "root"
    class details(object): # details directory
        root = "details"
        file1 = os.path.join(root, "file1.txt") # File in details directory
        file2 = os.path.join(root, "file2.txt") # File in details directory
        def __str__(self):
            return f"{self.root}"
    def __str__(self):
        return f"{self.root}"
I don't want to have to instantiate the class to have this work. My question is:
How can I call the class object and have it return a string instead of the <class ...> text?
How can I have nested classes use their parent classes?

Let's start with: you probably don't actually want this. Python3's pathlib API seems nicer than this and is already in wide support.
import pathlib
root = pathlib.Path('root')
file1 = root / 'details' / 'file1' # a Path object at that address
if file1.is_file():
    file1.unlink()
else:
    try:
        file1.rmdir()
    except OSError:
        # e.g. the directory isn't empty
        pass
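For the static layout described in the question, a minimal pathlib sketch could look like the following. The constant names ROOT, DETAILS, FILE1 and FILE2 are illustrative assumptions, not something from the original post.

```python
from pathlib import PurePath

# Hypothetical constants mirroring the static layout from the question.
ROOT = PurePath("root")
DETAILS = ROOT / "details"
FILE1 = DETAILS / "file1.txt"
FILE2 = DETAILS / "file2.txt"

# str() uses the platform separator: "/" on POSIX, "\" on Windows.
print(DETAILS)
print(FILE1)
print(FILE1.name)  # file1.txt
```

PurePath only manipulates path strings and never touches the disk; swap in Path if the objects should also support calls such as is_file().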
But if you're dead set on this for some reason, you'll need to override __getattr__ to create a new FileStructure object and keep track of parents and children.
class FileStructure(object):
    def __init__(self, name, parent):
        self.__name = name
        self.__children = []
        self.__parent = parent
    @property
    def parent(self):
        return self.__parent
    @property
    def children(self):
        return self.__children
    @property
    def name(self):
        return self.__name
    def __getattr__(self, attr):
        # retrieve the existing child if it exists
        fs = next((fs for fs in self.__children if fs.name == attr), None)
        if fs is not None:
            return fs
        # otherwise create a new one, append it to children, and return it.
        new_name = attr
        new_parent = self
        fs = self.__class__(new_name, new_parent)
        self.__children.append(fs)
        return fs
Then use it with:
root = FileStructure("root", None)
file1 = root.details.file1
You can add a __str__ and __repr__ to help your representations. You could even include a path property
# inside FileStructure
    @property
    def path(self):
        names = [self.name]
        cur = self
        while cur.parent is not None:
            cur = cur.parent
            names.append(cur.name)
        return '/' + '/'.join(names[::-1])
    def __str__(self):
        return self.path
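Putting the pieces together, a compact variant of the same idea might look like this. It is only a sketch: the class name Node is hypothetical, children are kept in a dict, the path has no leading slash, and underscore-prefixed names are refused so that protocol lookups (for example `__deepcopy__`) do not silently create children.

```python
class Node:
    """Hypothetical compact variant of the FileStructure idea above."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self._children = {}

    def __getattr__(self, attr):
        # Only called when normal attribute lookup fails.
        if attr.startswith("_"):
            raise AttributeError(attr)
        if attr not in self._children:
            self._children[attr] = Node(attr, self)
        return self._children[attr]

    @property
    def path(self):
        # Walk up to the root, then join the names from root to leaf.
        parts = []
        node = self
        while node is not None:
            parts.append(node.name)
            node = node.parent
        return "/".join(reversed(parts))

    def __str__(self):
        return self.path


root = Node("root")
print(root.details)        # root/details
print(root.details.file1)  # root/details/file1
```

Because children are created on first access, a typo such as root.detials yields a new path rather than an error; that is the trade-off of the `__getattr__` approach.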

Up front: This is a bad solution, but it meets your requirements with minimal changes. Basically, you need instances for __str__ to work, so this cheats using the decorator syntax to change your class declaration into a singleton instantiation of the declared class. Since it's impossible to reference outer classes from nested classes implicitly, the reference is performed explicitly. And to reuse __str__, file1 and file2 were made into properties so they can use the str form of the details instance to build themselves.
import os
@object.__new__
class FileStructure(object): # Root directory
    root = "root"
    @object.__new__
    class details(object): # details directory
        root = "details"
        @property
        def file1(self):
            return os.path.join(str(self), 'file1')
        @property
        def file2(self):
            return os.path.join(str(self), 'file2')
        def __str__(self):
            return f"{os.path.join(FileStructure.root, self.root)}"
    def __str__(self):
        return f"{self.root}"
Again: While this does produce your desired behavior, this is still a bad solution. I strongly suspect you've got an XY problem here, but this answers the question as asked.
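To see the decorator trick in isolation, here is a minimal sketch. The class name Config is made up for illustration; the only technique used is applying `object.__new__` as a class decorator, which replaces the name with an instance of the class.

```python
@object.__new__
class Config:
    name = "demo"

    def __str__(self):
        return self.name


# Config is now an instance, so __str__ is used when printing it.
print(Config)  # demo
# The class itself is still reachable through type().
print(isinstance(Config, type(Config)))  # True
```

Note that `object.__new__` does not call `__init__`, so this only suits classes that need no initialisation arguments.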

Such a class exists in HTMLgen:
import HTMLutil
class Directory(UserList)
def __cmp__(self, item)
def __init__(self, name='root', data=None)
def add_object(self, pathlist, object)
def ls(self, pad='')
def tree(self)
# Methods inherited by Directory from UserList
def __add__(self, list)
def __delitem__(self, i)
def __delslice__(self, i, j)
def __getitem__(self, i)
def __getslice__(self, i, j)
def __len__(self)
def __mul__(self, n)
def __mul__(self, n)
def __radd__(self, list)
def __repr__(self)
def __setitem__(self, i, item)
def __setslice__(self, i, j, list)
def append(self, item)
def count(self, item)
def index(self, item)
def insert(self, i, item)
def remove(self, item)
def reverse(self)
def sort(self, *args)
Unfortunately, this package is very old. When I tried to understand the code for Directory.__cmp__, I failed. It is supposed to compare directory structures, but it calls a function cmp for that, which was a built-in in Python 2 and no longer exists in Python 3, so I doubt it still works.
I thought about creating such a package myself to test directory hierarchies (I created a file synchronization program). However, as suggested in the answers above, pathlib might be a better approach, as well as dircmp from filecmp and rmtree from shutil.
The downside of these standard modules is that they are low-level, so if someone creates a library to deal with all these tasks in one place, that would be great.
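As a rough idea of what combining these standard modules looks like, the sketch below builds two temporary directories, compares them with filecmp.dircmp, and cleans up with shutil.rmtree. The file names are invented for the example.

```python
import filecmp
import shutil
import tempfile
from pathlib import Path

# Two throwaway directory trees to compare.
left = Path(tempfile.mkdtemp())
right = Path(tempfile.mkdtemp())
try:
    (left / "a.txt").write_text("same")
    (right / "a.txt").write_text("same")
    (left / "only_left.txt").write_text("x")

    result = filecmp.dircmp(str(left), str(right))
    # Read the attributes while the directories still exist.
    common = result.common
    left_only = result.left_only
    right_only = result.right_only
finally:
    shutil.rmtree(left)
    shutil.rmtree(right)

print(common)      # ['a.txt']
print(left_only)   # ['only_left.txt']
print(right_only)  # []
```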

Related

I want to add an object to a list from a different file

I have 3 files. The first is a Runners file which is abstract. The other two are CharityRunner and ProfessionalRunners. In these I can create runners.
Runners:
class Runner(object):
    def __init__(self, runnerid, name):
        self._runnerid = runnerid
        self._name = name
    @property
    def runnerid(self):
        return self._runnerid
    @property
    def name(self):
        return self._name
    @name.setter
    def name(self, name):
        self._name = name
    def get_fee(self, basicfee, moneyraised):
        raise NotImplementedError("AbstractMethod")
CharityRunners:
from Runner import *
class CharityRunner(Runner):
    def __init__(self, runnerid, name, charityname):
        super().__init__(runnerid, name)
        self._charityname = charityname
    @property
    def charityname(self):
        return self._charityname
    @charityname.setter
    def charityname(self, charityname):
        self._charityname = charityname
    def get_fee(self, basicfee, moneyraised):
        if moneyraised >= 100:
            basicfee = basicfee * 0.25
        elif moneyraised >= 50 and moneyraised < 100:
            basicfee = basicfee * 0.5
        else:
            basicfee = basicfee
        return basicfee
ProfessionalRunners:
from Runner import *
class ProfessionalRunner(Runner):
    def __init__(self, runnerid, name, sponsor):
        super().__init__(runnerid, name)
        self._sponsor = sponsor
    @property
    def sponsor(self):
        return self._sponsor
    @sponsor.setter
    def sponsor(self, sponsor):
        self._sponsor = sponsor
    def get_fee(self, basicfee):
        basicfee = basicfee * 2
        return basicfee
Now I have also created a club object that has a club id and club name. There is also a list called self._runners = []. I'm trying to write an add function that will add the created runners to the list. But it must make sure that the runner is not already in the list.
The object printing method should be in the format of:
Club: <club id> <club name>
Runner: <runner id 1> <runner name 1>
Runner: <runner id 2> <runner name 2>
At the moment I only have this for the club object:
from Runner import *
class Club(object):
    def __init__(self, clubid, name):
        self._clubid = clubid
        self._name = name
        self._runners = []
    @property
    def clubid(self):
        return self._clubid
    @property
    def name(self):
        return self._name
    @name.setter
    def name(self, name):
        self._name = name
    def add_runner(self):
        self._runner.append(Runner)
I'm guessing the part you're missing is:
im trying to get a add function that will add the runners created in the list.
Your existing code does this:
def add_runner(self):
    self._runner.append(Runner)
This has multiple problems.
First, you're trying to modify self._runner, which doesn't exist, instead of self._runners.
Next, you're appending the Runner class, when you almost certainly want an instance of it, not the class itself.
In fact, you almost certainly want an instance of one of its subclasses.
And I'm willing to bet you want a specific instance, that someone will pass to the add_runner function, not just some random instance.
So, what you want is probably:
def add_runner(self, runner):
    self._runners.append(runner)
And now that you posted the UML diagram, it says that explicitly: add_runner(Runner: runner). In Python, you write that as:
def add_runner(self, runner):
Or, if you really want:
def add_runner(self, runner: Runner):
… but that will probably mislead you into thinking that this is a Java-style definition that requires an instance of Runner or some subclass thereof and checks it statically, and that it can be overloaded with different parameter types, etc., none of which is true.
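A short sketch of that last point: the annotation is not enforced at runtime, so passing something that is not a Runner raises no error. The stripped-down Runner and Club classes here are stand-ins for the ones in the question.

```python
class Runner:
    """Stand-in for the Runner class from the question."""


class Club:
    def __init__(self):
        self._runners = []

    def add_runner(self, runner: Runner):
        # The annotation is documentation (and input for static checkers);
        # Python itself does not check it when the method is called.
        self._runners.append(runner)


club = Club()
club.add_runner("not a runner")  # accepted without any error
print(club._runners)  # ['not a runner']
```

A static checker such as mypy would flag the call, but only if you run it; an explicit isinstance check is needed for a runtime guarantee.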
To use it, just do this:
doe_club = Club(42, "Doe Family Club")
john_doe = CharityRunner(23, "John Doe", "Toys for John Doe")
doe_club.add_runner(john_doe)
Next:
But it must make sure that the runner is not already in the list.
You can translate that almost directly from English to Python:
def add_runner(self, runner):
    if runner not in self._runners:
        self._runners.append(runner)
However, this does a linear search through the list for each new runner. If you used an appropriate data structure, like a set, this wouldn't be a problem. You could use the same code (but with add instead of append)… but you don't even need to do the checking with a set, because it already takes care of duplicates for you. So, if you set self._runners = set() (not {}, which creates an empty dict), you just need:
def add_runner(self, runner):
    self._runners.add(runner)
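The question also asks for a specific printed format, which the snippets above do not cover. A minimal sketch, assuming a simplified Runner with plain attributes and the list-based duplicate check (which keeps the runners in insertion order), might be:

```python
class Runner:
    """Simplified stand-in for the Runner classes in the question."""

    def __init__(self, runnerid, name):
        self.runnerid = runnerid
        self.name = name


class Club:
    def __init__(self, clubid, name):
        self._clubid = clubid
        self._name = name
        self._runners = []

    def add_runner(self, runner):
        # "in" falls back to identity comparison here, because Runner
        # does not define __eq__.
        if runner not in self._runners:
            self._runners.append(runner)

    def __str__(self):
        lines = [f"Club: {self._clubid} {self._name}"]
        lines += [f"Runner: {r.runnerid} {r.name}" for r in self._runners]
        return "\n".join(lines)


club = Club(42, "Doe Family Club")
john = Runner(23, "John Doe")
club.add_runner(john)
club.add_runner(john)  # second call is ignored
print(club)
# Club: 42 Doe Family Club
# Runner: 23 John Doe
```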

Python "callable" attribute (pseudo-property)

In python, I can alter the state of an instance by directly assigning to attributes, or by making method calls which alter the state of the attributes:
foo.thing = 'baz'
or:
foo.thing('baz')
Is there a nice way to create a class which would accept both of the above forms which scales to large numbers of attributes that behave this way? (Shortly, I'll show an example of an implementation that I don't particularly like.) If you're thinking that this is a stupid API, let me know, but perhaps a more concrete example is in order. Say I have a Document class. Document could have an attribute title. However, title may want to have some state as well (font,fontsize,justification,...), but the average user might be happy enough just setting the title to a string and being done with it ...
One way to accomplish this would be to:
class Title(object):
    def __init__(self,text,font='times',size=12):
        self.text = text
        self.font = font
        self.size = size
    def __call__(self,*text,**kwargs):
        if(text):
            self.text = text[0]
        for k,v in kwargs.items():
            setattr(self,k,v)
    def __str__(self):
        return '<title font={font}, size={size}>{text}</title>'.format(text=self.text,size=self.size,font=self.font)
class Document(object):
    _special_attr = set(['title'])
    def __setattr__(self,k,v):
        if k in self._special_attr and hasattr(self,k):
            getattr(self,k)(v)
        else:
            object.__setattr__(self,k,v)
    def __init__(self,text="",title=""):
        self.title = Title(title)
        self.text = text
    def __str__(self):
        return str(self.title)+'<body>'+self.text+'</body>'
Now I can use this as follows:
doc = Document()
doc.title = "Hello World"
print (str(doc))
doc.title("Goodbye World",font="Helvetica")
print (str(doc))
This implementation seems a little messy though (with _special_attr). Maybe that's because this is a messed up API. I'm not sure. Is there a better way to do this? Or did I leave the beaten path a little too far on this one?
I realize I could use @property for this as well, but that wouldn't scale well at all if I had more than just one attribute which is to behave this way -- I'd need to write a getter and setter for each, yuck.
It is a bit harder than the previous answers assume.
Any value stored in the descriptor will be shared between all instances, so it is not the right place to store per-instance data.
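A small sketch of that sharing problem, using a hypothetical SharedValue descriptor that keeps the value on itself rather than on the instance:

```python
class SharedValue:
    """Hypothetical descriptor that (wrongly) stores the value on itself."""

    def __get__(self, instance, owner):
        return self._value

    def __set__(self, instance, value):
        # Stored on the descriptor, which lives on the class,
        # so every instance sees the same value.
        self._value = value


class Doc:
    title = SharedValue()


a, b = Doc(), Doc()
a.title = "first"
b.title = "second"
print(a.title)  # second
```

There is one SharedValue object for the whole Doc class, so the second assignment overwrites the first.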
Also, obj.attrib(...) is performed in two steps:
tmp = obj.attrib
tmp(...)
Python doesn't know in advance that the second step will follow, so you always have to return something that is callable and has a reference to its parent object.
In the following example that reference is implied in the set argument:
class CallableString(str):
    def __new__(class_, set, value):
        inst = str.__new__(class_, value)
        inst._set = set
        return inst
    def __call__(self, value):
        self._set(value)
class A(object):
    def __init__(self):
        self._attrib = "foo"
    def get_attrib(self):
        return CallableString(self.set_attrib, self._attrib)
    def set_attrib(self, value):
        try:
            value = value._value
        except AttributeError:
            pass
        self._attrib = value
    attrib = property(get_attrib, set_attrib)
a = A()
print(a.attrib)
a.attrib = "bar"
print(a.attrib)
a.attrib("baz")
print(a.attrib)
In short: what you want cannot be done transparently. You'll write better Python code if you don't insist on hacking around this limitation.
You can avoid having to use @property on potentially hundreds of attributes by simply creating a descriptor class that follows the appropriate rules:
# Warning: Untested code ahead
class DocAttribute(object):
    tag_str = "<{tag}{attrs}>{text}</{tag}>"
    def __init__(self, tag_name, default_attrs=None):
        self._tag_name = tag_name
        self._attrs = default_attrs if default_attrs is not None else {}
    def __call__(self, *text, **attrs):
        self._text = "".join(text)
        self._attrs.update(attrs)
        return self
    def __get__(self, instance, cls):
        return self
    def __set__(self, instance, value):
        self._text = value
    def __str__(self):
        # Attrs left as an exercise for the reader; the empty string
        # keeps format() from raising KeyError on {attrs}
        return self.tag_str.format(tag=self._tag_name, attrs="", text=self._text)
Then you can use Document's __setattr__ method to add a descriptor based on this class if it is in a white list of approved names (or not in a black list of forbidden ones, depending on your domain):
class Document(object):
    # prelude
    def __setattr__(self, name, value):
        if self.is_allowed(name): # Again, left as an exercise for the reader
            object.__setattr__(self, name, DocAttribute(name)(value))

Implementing recursion with a deep copy

How can I implement recursion in a deep copy function object? This is the relevant code (if you want more then please ask):
PS: I would like the recursion to iterate through a filtered list of references. The goal is to download and insert any missing objects.
copy.py
from put import putter
class copier:
    def __init__(self, base):
        self.base = base
    def copyto(self, obj):
        put = putter(obj)
        for x in self.base.__dict__:
            put(x)
put.py
class putter:
    def __init__(self, parent):
        self.parent = parent
    def put(self, name, obj):
        self.parent.__dict__[name] = obj
Check out the documentation for copy.deepcopy. If you can implement what you want with __getinitargs__(), __getstate__() and __setstate__(), that will save you a lot of grief. Otherwise, you will need to reimplement it yourself; it should look something like:
import copy
def deepcopyif(obj, shouldcopyprop):
    copied = {} # Remember what has already been copied, keyed by id() so unhashable objects work too
    def impl(obj):
        if id(obj) in copied:
            return copied[id(obj)]
        if not hasattr(obj, '__dict__'):
            return obj # plain value without attributes (int, str, ...): nothing to recurse into
        newobj = copy.copy(obj) # one possible way to create the copy
        copied[id(obj)] = newobj # IMPORTANT: remember the new object before recursing
        for name, value in obj.__dict__.items(): # or whatever...
            if shouldcopyprop(obj.__class__, name): # or whatever
                value = impl(value) # RECURSION: this will copy the property value
            newobj.__dict__[name] = value
        return newobj
    return impl(obj)
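If the `__getstate__()` / `__setstate__()` route mentioned above is enough, copy.deepcopy does the recursion and the bookkeeping for you. The sketch below assumes a hypothetical Node class with a cache attribute that should not be carried over into the copy.

```python
import copy


class Node:
    """Hypothetical object graph node; `cache` should not be copied."""

    def __init__(self, name, cache=None):
        self.name = name
        self.children = []
        self.cache = cache

    def __getstate__(self):
        # deepcopy (like pickle) asks for the state through this hook.
        state = self.__dict__.copy()
        state["cache"] = None  # filter out what must not be copied
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)


root = Node("root", cache={"big": "data"})
root.children.append(Node("child"))

clone = copy.deepcopy(root)
print(clone.cache)                            # None
print(clone.children[0].name)                 # child
print(clone.children[0] is root.children[0])  # False
```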

Does Python support something like literal objects?

In Scala I could define an abstract class and implement it with an object:
abstract class Base {
  def doSomething(x: Int): Int
}
object MySingletonAndLiteralObject extends Base {
  override def doSomething(x: Int) = x*x
}
My concrete example in Python:
class Book(Resource):
    path = "/book/{id}"
    def get(request):
        return aBook
Inheritance wouldn't make sense here, since no two classes could have the same path. And only one instance is needed, so the class doesn't act as a blueprint for objects. In other words: no class is needed here for a Resource (Book in my example), but a base class is needed to provide common functionality.
I'd like to have:
object Book(Resource):
    path = "/book/{id}"
    def get(request):
        return aBook
What would be the Python 3 way to do it?
Use a decorator to convert the inherited class to an object at creation time
I believe that the concept of such an object is not a typical way of coding in Python, but if you must then the decorator class_to_object below for immediate initialisation will do the trick. Note that any parameters for object initialisation must be passed through the decorator:
def class_to_object(*args):
    def c2obj(cls):
        return cls(*args)
    return c2obj
using this decorator we get
>>> @class_to_object(42)
... class K(object):
...     def __init__(self, value):
...         self.value = value
...
>>> K
<__main__.K object at 0x38f510>
>>> K.value
42
The end result is that you have an object K similar to your scala object, and there is no class in the namespace to initialise other objects from.
Note: To be pedantic, the class of the object K can be retrieved as K.__class__ and hence other objects may be initialised if somebody really wants to. In Python there is almost always a way around things if you really want.
Use an abc (Abstract Base Class):
import abc
class Resource(metaclass=abc.ABCMeta):
    @property
    @abc.abstractmethod
    def path(self):
        ...
        return p
Then anything inheriting from Resource is required to implement path. Notice that path is actually implemented in the ABC; you can access this implementation with super.
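A minimal sketch of how that requirement is enforced, assuming hypothetical subclasses Book and Broken: a subclass that does not provide path cannot be instantiated, while a plain class attribute is enough to satisfy the abstract property.

```python
import abc


class Resource(abc.ABC):
    @property
    @abc.abstractmethod
    def path(self):
        """Subclasses must provide a URL path."""


class Book(Resource):
    # A plain class attribute overrides the abstract property.
    path = "/book/{id}"


class Broken(Resource):
    pass  # forgets to define path


print(Book().path)  # /book/{id}
try:
    Broken()
except TypeError as exc:
    print("cannot instantiate:", exc)
```

The check happens at instantiation time, not when the subclass is defined.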
If you can instantiate Resource directly you just do that and stick the path and get method on directly.
from types import MethodType
book = Resource()
def get(self):
    return aBook
book.get = MethodType(get, book)
book.path = path
This assumes though that path and get are not used in the __init__ method of Resource and that path is not used by any class methods which it shouldn't be given your concerns.
If your primary concern is making sure that nothing inherits from the Book non-class, then you could just use this metaclass
class Terminal(type):
    classes = []
    def __new__(meta, classname, bases, classdict):
        if [cls for cls in meta.classes if cls in bases]:
            raise TypeError("Can't Touch This")
        cls = super(Terminal, meta).__new__(meta, classname, bases, classdict)
        meta.classes.append(cls)
        return cls
class Book(object, metaclass=Terminal): # Python 3 syntax; "__metaclass__ = Terminal" in the body only works in Python 2
    pass
class PaperBackBook(Book): # raises TypeError: Can't Touch This
    pass
You might want to replace the exception thrown with something more appropriate. This would really only make sense if you find yourself instantiating a lot of one offs.
And if that's not good enough for you and you're using CPython, you could always try some of this hackery:
class Resource(object):
    def __init__(self, value, location=1):
        self.value = value
        self.location = location
with Object('book', Resource, 1, location=2):
    path = '/books/{id}'
    def get(self):
        aBook = 'abook'
        return aBook
print(book.path)
print(book.get())
made possible by my very first context manager.
import copy
import sys
class Object(object):
    def __init__(self, name, cls, *args, **kwargs):
        self.cls = cls
        self.name = name
        self.args = args
        self.kwargs = kwargs
    def __enter__(self):
        self.f_locals = copy.copy(sys._getframe(1).f_locals)
    def __exit__(self, exc_type, exc_val, exc_tb):
        class cls(self.cls):
            pass
        f_locals = sys._getframe(1).f_locals
        new_items = [item for item in f_locals if item not in self.f_locals]
        for item in new_items:
            setattr(cls, item, f_locals[item])
            del f_locals[item] # Keyser Soze the new names from the enclosing namespace
        obj = cls(*self.args, **self.kwargs)
        f_locals[self.name] = obj # and insert the new object
Of course I encourage you to use one of my above two solutions or Katrielalex's suggestion of ABC's.

Namespaces inside class in Python3

I am new to Python and I wonder if there is any way to aggregate methods into 'subspaces'. I mean something similar to this syntax:
smth = Something()
smth.subspace.do_smth()
smth.another_subspace.do_smth_else()
I am writing an API wrapper and I'm going to have a lot of very similar methods (only the URI differs), so I thought it would be good to place them in a few subspaces that refer to the API request categories. In other words, I want to create namespaces inside a class. I don't know if this is even possible in Python and have no idea what to look for on Google.
I will appreciate any help.
One way to do this is by defining subspace and another_subspace as properties that return objects that provide do_smth and do_smth_else respectively:
class Something:
    @property
    def subspace(self):
        class SubSpaceClass:
            def do_smth(other_self):
                print('do_smth')
        return SubSpaceClass()
    @property
    def another_subspace(self):
        class AnotherSubSpaceClass:
            def do_smth_else(other_self):
                print('do_smth_else')
        return AnotherSubSpaceClass()
Which does what you want:
>>> smth = Something()
>>> smth.subspace.do_smth()
do_smth
>>> smth.another_subspace.do_smth_else()
do_smth_else
Depending on what you intend to use the methods for, you may want to make SubSpaceClass a singleton, but I doubt the performance gain is worth it.
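For an API wrapper the namespace usually needs access to the owning object, for example to reach a shared base URL. One way to get that, sketched here under assumed names (Client, _Users, build_url and the example URL are all hypothetical), is to create the namespace objects once in `__init__` and hand them a reference to the parent:

```python
class _Users:
    """Hypothetical namespace grouping the user-related requests."""

    def __init__(self, client):
        self._client = client

    def get(self, user_id):
        return self._client.build_url(f"users/{user_id}")


class Client:
    def __init__(self, base_url):
        self.base_url = base_url
        # The namespace is created once and keeps a reference to its owner.
        self.users = _Users(self)

    def build_url(self, path):
        return f"{self.base_url}/{path}"


client = Client("https://api.example.invalid")
print(client.users.get(7))  # https://api.example.invalid/users/7
```

This avoids defining a new class on every property access, at the cost of one extra object per namespace per instance.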
I had this need a couple years ago and came up with this:
from functools import partial
class Registry:
    """Namespace within a class."""
    def __get__(self, obj, cls=None):
        if obj is None:
            return self
        else:
            return InstanceRegistry(self, obj)
    def __call__(self, name=None):
        def decorator(f):
            use_name = name or f.__name__
            if hasattr(self, use_name):
                raise ValueError("%s is already registered" % use_name)
            setattr(self, name or f.__name__, f)
            return f
        return decorator
class InstanceRegistry:
    """
    Helper for accessing a namespace from an instance of the class.
    Used internally by :class:`Registry`. Returns a partial that will pass
    the instance as the first parameter.
    """
    def __init__(self, registry, obj):
        self.__registry = registry
        self.__obj = obj
    def __getattr__(self, attr):
        return partial(getattr(self.__registry, attr), self.__obj)
# Usage:
class Something:
    subspace = Registry()
    another_subspace = Registry()
@Something.subspace()
def do_smth(self):
    # `self` will be an instance of Something
    pass
@Something.another_subspace('do_smth_else')
def this_can_be_called_anything_and_take_any_parameter_name(obj, other):
    # Call it `obj` or whatever else if `self` outside a class is unsettling
    pass
At runtime:
>>> smth = Something()
>>> smth.subspace.do_smth()
>>> smth.another_subspace.do_smth_else('other')
This is compatible with Py2 and Py3. Some performance optimizations are possible in Py3 because __set_name__ tells us what the namespace is called and allows caching the instance registry.
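The `__set_name__` optimization mentioned above could look roughly like this. It is a sketch, not the author's code: CachedRegistry and _Bound are hypothetical names, and the caching relies on the registry being a non-data descriptor, so an entry in the instance `__dict__` takes precedence on later lookups.

```python
from functools import partial


class CachedRegistry:
    def __set_name__(self, owner, name):
        # Called at class creation with the attribute name, e.g. "subspace".
        self._name = name

    def __get__(self, obj, cls=None):
        if obj is None:
            return self
        bound = _Bound(self, obj)
        # Cache on the instance; later lookups skip __get__ entirely.
        obj.__dict__[self._name] = bound
        return bound

    def __call__(self, name=None):
        def decorator(f):
            setattr(self, name or f.__name__, f)
            return f
        return decorator


class _Bound:
    def __init__(self, registry, obj):
        self._registry = registry
        self._obj = obj

    def __getattr__(self, attr):
        if attr.startswith("_"):
            raise AttributeError(attr)
        return partial(getattr(self._registry, attr), self._obj)


class Something:
    subspace = CachedRegistry()


@Something.subspace()
def do_smth(self):
    return f"do_smth on {type(self).__name__}"


smth = Something()
print(smth.subspace.do_smth())         # do_smth on Something
print(smth.subspace is smth.subspace)  # True
```

The cached object holds a reference back to the instance, which creates a reference cycle; the garbage collector handles it, but it is worth knowing about.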
