Is It "Python-esque" to wrap all functions in a class? [closed] - python

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 6 years ago.
So, I'm developing a simple web-scraper in Python right now, but I had a question on how to structure my code. In other programming languages (especially compiled languages like C++ and C#), I've been in the habit of wrapping all of my functions in classes. I.e. in my web-scraping example I would have a class called something like "WebScraper" maybe and then hold all of the functions within that class. I might even go so far as to create a second helper class like "WebScraperManager" if I needed to instantiate multiple instances of the original "WebScraper" class.
This leads me to my current question, though. Would similar logic hold in the current example? Or would I simply define a WebScraper.py file, without a wrapper class inside that file, and then just import the functions as I needed them into some main.py file?

The difference between a class and a function should be that a class has state. Some classes don't have state, but this is rarely a good idea (I'm sure there are exceptions, abstract base classes (ABCs) for instance, but I'm not sure if they count), and some functions do have state, but this is rarely a good idea (caching or instrumentation might be exceptions).
If you want a URL as input, and say a dict as output, and then you are done with that website, there's no reason to have a class. Just have a function that takes a URL and returns a dict. Stateless functions are simpler abstractions than classes, so all other things being equal, prefer them.
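A minimal sketch of that stateless shape. The names scrape_title and fetch_html are hypothetical (they are not from the answer), and the injectable fetch argument is an assumption made so the function can be exercised without a network connection:

```python
import re
from urllib.request import urlopen


def fetch_html(url):
    """Default fetcher: download a page and return its body as text."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def scrape_title(url, fetch=fetch_html):
    """Take a URL, return a dict. Nothing is remembered between calls."""
    html = fetch(url)
    match = re.search(r"<title>(.*?)</title>", html, re.IGNORECASE | re.DOTALL)
    title = match.group(1).strip() if match else None
    return {"url": url, "title": title}
```

Calling scrape_title("https://example.com") would download and parse in one step; passing a fake fetch callable keeps the function easy to test.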
However, very often there may be intermediate state involved. For instance, maybe you are scraping a family of pages rooted in a base URL, and it's too expensive to do all this eagerly. Maybe then what you want is a class that takes the root URL as its constructor. It then has some methods for querying which child URLs it can follow down, and methods for ordering subsequent scraping of children, which might be stored in nested data structures.
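A rough sketch of the stateful case, under the assumption that some callable (here the hypothetical fetch_links) can return the child URLs of a page. The class name SiteScraper and its methods are illustrative only:

```python
class SiteScraper:
    """Scrapes pages under one root URL lazily and remembers what it has seen."""

    def __init__(self, root_url, fetch_links):
        # fetch_links is a hypothetical callable: url -> iterable of child URLs.
        self.root_url = root_url
        self._fetch_links = fetch_links
        self._children = {}  # cache: url -> list of child URLs

    def children(self, url=None):
        """Return the child URLs of a page, fetching them at most once."""
        url = url or self.root_url
        if url not in self._children:
            self._children[url] = list(self._fetch_links(url))
        return self._children[url]

    def visited(self):
        """URLs whose children have already been fetched."""
        return sorted(self._children)
```

The cache in self._children is the intermediate state that makes a class worthwhile here; a free function would have to receive and return it on every call.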
And of course, if your task is reasonably complicated, you may well have layers with functions using classes, or classes calling functions. But persisting state is a good indicator of whether the immediate task should be written as a class or set of functions.
Edit: just to close the loop and come back to the original question: no, I would say it's not pythonesque to wrap all functions in classes. Free functions are just fine in Python; it all depends on what's appropriate. Also, the term pythonesque is not very pythonic ;-)

You mean "pythonic".
That depends on how object-oriented and scalable you want your implementation to be. I would use classes over plain functions. Let's say tomorrow you want a CraigslistScraper and a FacebookScraper: I would create an abstract class "Scraper" and have the two inherit from it and reimplement what they need (polymorphism). Object-oriented principles and patterns are language-independent. That said, I wouldn't "hold all the functions" in one class (single responsibility principle); every time you code, remember the word "SOLID".
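One way that hierarchy could look, as a sketch using the standard abc module. The scrape and parse methods and the placeholder parsing logic are assumptions made for illustration:

```python
from abc import ABC, abstractmethod


class Scraper(ABC):
    """Shared workflow; subclasses supply the site-specific part."""

    site_name = "unknown"

    def scrape(self, html):
        # The common steps live here once.
        return {"site": self.site_name, "items": self.parse(html)}

    @abstractmethod
    def parse(self, html):
        """Turn raw page text into a list of items."""


class CraigslistScraper(Scraper):
    site_name = "craigslist"

    def parse(self, html):
        return html.split(",")  # placeholder parsing logic


class FacebookScraper(Scraper):
    site_name = "facebook"

    def parse(self, html):
        return html.split(";")  # placeholder parsing logic
```

Because parse is abstract, instantiating Scraper directly raises TypeError, while either subclass can be passed wherever a Scraper is expected.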

Related

How to avoid class inheritance [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 4 years ago.
I've recently been dealing a lot with class inheritance on a project I'm working on, and I've started to become disenchanted with it as a programming concept. I do understand its appeal: it provides a clean way to extend an existing base class with new methods, thereby avoiding having to rewrite the same code multiple times and adding a nice logical structure to how classes are related to one another.
However, now that I've been using it more extensively, its drawbacks have become much more apparent. Not only does it add a layer of opacity to where a method or attribute comes from, forcing me to go down a rabbit hole of inherited classes every time I want to figure out where a given method is being defined, but it also breaks encapsulation by allowing you to unwittingly redefine public and private functions and variables in an inherited class.
Here's a very simple example of how easy it is to break things with inheritance.
class Parent:
    def __init__(self):
        self._private_var = 10
    def add_ten(self, n):
        return n + self._private_var
class Child(Parent):
    def __init__(self):
        self._private_var = 100
    def add_hundred(self, n):
        return n + self._private_var
Now, let's say I want to use Child's inherited .add_ten method:
c = Child()
c.add_ten(4)
>> 104
Since I unknowingly redefined Parent's ._private_var, the .add_ten method now adds 100 instead of 10.
Granted, inheritance might be dealt with slightly differently in other languages (I know Python doesn't have any truly "private" methods or variables, so perhaps this is not as much of an issue in Java or C++). Still, the downsides of inheritance seem to me to outweigh its advantages and make me want to avoid using it altogether if I can.
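For reference, Python's double-underscore name mangling targets this kind of accidental clash. The sketch below is a variation on the example above, not the poster's code: the attribute is renamed to the hypothetical __offset, and Child calls super().__init__() so that Parent still initialises its own state.

```python
class Parent:
    def __init__(self):
        self.__offset = 10  # stored on the instance as _Parent__offset

    def add_ten(self, n):
        return n + self.__offset


class Child(Parent):
    def __init__(self):
        super().__init__()   # let Parent set up its own attribute
        self.__offset = 100  # stored as _Child__offset, so no clash

    def add_hundred(self, n):
        return n + self.__offset


c = Child()
print(c.add_ten(4))      # 14
print(c.add_hundred(4))  # 104
```

Mangling only renames the attribute per class; it does not make it inaccessible, so this guards against accidents rather than enforcing privacy.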
The issue is that the alternative seems to add a lot of redundancy.
For example, I could have defined ChildTwo as:
class ChildTwo:
    def __init__(self):
        self._parent = Parent()
        self._private_var = 100
    def add_ten(self, n):
        return self._parent.add_ten(n)
    def add_hundred(self, n):
        return n + self._private_var
This would allow both .add_ten and .add_hundred to behave as expected, but it would also require me to manually add every method I would like to inherit from the Parent class, which seems wasteful in terms of keeping my code lean. This is especially true when there are multiple methods I'd like to inherit from Parent.
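One technique that can reduce this forwarding boilerplate, not something proposed in the original post, is to delegate unknown attribute lookups with __getattr__. The sketch below reuses the Parent and ChildTwo names from above; the underscore guard is an added assumption to keep private names from leaking through and to avoid recursion before _parent exists.

```python
class Parent:
    def __init__(self):
        self._private_var = 10

    def add_ten(self, n):
        return n + self._private_var


class ChildTwo:
    def __init__(self):
        self._parent = Parent()
        self._private_var = 100

    def add_hundred(self, n):
        return n + self._private_var

    def __getattr__(self, name):
        # Called only when normal lookup fails, so ChildTwo's own
        # attributes win; public names are forwarded to the Parent.
        if name.startswith("_"):
            raise AttributeError(name)
        return getattr(self._parent, name)
```

With this, ChildTwo().add_ten(4) is answered by the wrapped Parent and uses the Parent's own _private_var, while add_hundred uses ChildTwo's. The trade-off is that forwarded methods are no longer visible in the class body.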
I'm also not sure (?) if instantiating the Parent class for every ChildTwo class might have some impact on performance.
What's the best way to avoid using inheritance while still avoiding code repetition as much as possible and having a minimal impact on performance?
Edit: Someone pointed out that this is a bad example, since .add_ten should probably be defined as n + 10 instead of n + self._private_var. That's a fair point, but it requires that I know how Parent is implemented, which may not always be the case. If Parent is in some external module then there's nothing I can do about it. Furthermore, if its implementation of .add_ten changes in the future, it has an impact on the Child class as well.
There are obviously no hard rules on when and when not to use inheritance. However, there are a few key things I do to help avoid issues.
I treat child classes as just extensions of the parent's logic. I therefore try to avoid overwriting objects, instead only extending them.
For example, I commonly have a parent class which receives the configs for a project. Then, any child classes can use these configs and do whatever necessary logic with them. All the configs are the same, they're not being changed, so inheritance will not cause any issues.
class Parent:
    def __init__(self, name, configs):
        self.name = name
        self.theory = configs['theory']
        self.log_file = configs['log_file']
        ...
class Child(Parent):
    def __init__(self, name, configs):
        super().__init__(name, configs)
I would not, however, have a method in the parent class that performs some action with the configs and then alter that method in the child classes. Despite that being perfectly acceptable Python code, I find it easy to make mistakes and it adds unnecessary complexity. Why bother writing a method if you're going to constantly override it?
With multiple inheritance, if it's not something you've encountered before, it can be surprisingly easy to run into issues with "Method Resolution Order". The Diamond of Death or whatever other dramatic names it has. This occurs when multiple inheritance leads to ambiguity in how a child class should inherit from above it in the inheritance tree. For this reason I completely avoid ever making classes "siblings".
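A small illustration of the diamond shape and how Python resolves it. The class names are hypothetical; the __mro__ attribute is standard on every class:

```python
class Base:
    def who(self):
        return "Base"


class Left(Base):
    def who(self):
        return "Left"


class Right(Base):
    def who(self):
        return "Right"


class Bottom(Left, Right):  # Left and Right are "siblings"
    pass


mro_names = [cls.__name__ for cls in Bottom.__mro__]
print(mro_names)       # ['Bottom', 'Left', 'Right', 'Base', 'object']
print(Bottom().who())  # Left
```

Swapping the base order to Bottom(Right, Left) would silently change which who() runs, which is the kind of surprise the paragraph above warns about.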
Inheritance can often scale badly. By which I mean, adding lots of logic to a pre-existing inheritance structure can cause issues. Maybe your child classes all used the parent class method in the same way but now you've a new child class which is slightly different. Ok so you can overwrite that method. But what if you begin adding more and more child classes which also need to overwrite that method? Now it makes sense to rewrite the base class method which means you need to rewrite all of the overwritten methods.
Sometimes inheritance will be instrumental in reducing repetition; other times it will be a headache for maintenance, testing and extension. As always in programming, if you find yourself writing the same thing over and over, you're doing something wrong. For me, knowing exactly what a class structure will be used for in the future has been the best way of making sure inheritance won't cause issues.
I would just say that your example seems a bit of a straw-man. You set up a demonstrably bad structure then dismiss inheritance as the reason for failure. If you're going to add ten, add ten, don't add some changeable variable.
Finally, while I have banged on about personal preference, be aware in the working environment, people's preferences will be drastically different to yours. You should understand how to use, extend and debug all different class structures.

How to structure a multi-file Python app [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 5 years ago.
Using Python 2.7, I have a large class, in a single file, that I want to break into multiple files by groups of related functions. For example, I have "class API(object)", and I would like to have separate Python files for all user methods (add user, update user, delete user) and a separate Python file for all order methods (create order, update order, delete order). But this separation should not be known to the end-user, such as:
z = test.api()               # __init__.py
z.adduser(jeff)              # user.py
z.createOrder(65, "jeff")    # orders.py
z.showOpenOrders("jeff")     # orders.py
z.completeOrder()            # orders.py
z.emailUser("Jeff")          # email.py
I have been searching for "extending python class", but I don't believe I am searching using the right term. Please help.
I would instead create specialized classes (Users, Orders) where instances are created in API.__init__ (if necessary they could hold a reference to the API instance). The specialized instances can then be retrieved through member attributes or properties of the API instance.
Calls to them would then look like:
z = test.api()
z.users.add(jeff)
z.orders.create(65, "jeff")
z.orders.showOpen("jeff")
and so on.
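A sketch of how the classes behind those calls might be laid out. It is shown in one block for brevity, although each class could live in its own file; the method bodies and list-based storage are placeholders, and the class is named API here rather than api. The syntax shown is Python 3; under Python 2.7 the classes would inherit from object.

```python
class Users:
    def __init__(self, api):
        self._api = api  # back-reference, in case shared state is needed
        self._names = []

    def add(self, name):
        self._names.append(name)
        return name


class Orders:
    def __init__(self, api):
        self._api = api
        self._open = []

    def create(self, order_id, user):
        self._open.append((order_id, user))

    def show_open(self, user):
        return [oid for oid, owner in self._open if owner == user]


class API:
    def __init__(self):
        # The specialised instances are created once, here.
        self.users = Users(self)
        self.orders = Orders(self)
```

The end user only ever touches the API instance, so the split into Users and Orders stays an implementation detail.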
First, I'd recommend against this approach. It's typical in Python to keep the definition of a class in one file. Another answer discusses options for splitting your API across multiple classes, and if that works for your application that may be a great option. However, what you propose is possible in Python and so I'll describe how to do it.
In a.py:
class A:
    # some methods go here
    pass
In b.py:
import a
def extra_a_method(self, arg1):
    # method body
    pass
a.A.extra_a_method = extra_a_method
del extra_a_method
So, we create a function and add it to the A class. We had to create the function in some scope, so to keep that scope clean, we delete the function from it.
You cannot do def A.new_a_method for syntactic reasons.
I know this will mostly work in Python3, but haven't analyzed it for Python2.
There are some catches:
b must be imported before the method appears in A; this is a big deal
If A has a nontrivial metaclass, the handling of the extra methods will be different from the handling of the methods in the original class. As an example, I think SQLAlchemy would get this sort of addition right, but other frameworks might not.
There's an alternative that can work sometimes, if you know from the beginning everything that A should contain.
In a.py:
import a_extra_1, a_extra_2
class A(a_extra_1.AExtra1, a_extra_2.AExtra2):
    # methods can also go here
    pass
Here, we're treating the additional methods as mixins to A. Generally, you'd want to have the mixins added after any classes for inheritance.
This approach also has some drawbacks:
If you are not careful it can lead to circular imports
You have to know all the methods you need for A ahead of time
All the code gets imported at once (this is a feature too)
There can also be issues with nontrivial metaclasses in this approach, although if the framework supports inheritance, this is likely to work.
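The mixin layout can be sketched in a single runnable block. The mixin names and methods are hypothetical; in the multi-file arrangement described above, each mixin would sit in its own module and be imported into a.py.

```python
class UserMixin:
    """Would live in its own module, e.g. the a_extra_1.py of the answer."""

    def add_user(self, name):
        self.users.append(name)


class OrderMixin:
    """Would live in a second module, e.g. a_extra_2.py."""

    def create_order(self, order_id, user):
        self.orders.append((order_id, user))


class A(UserMixin, OrderMixin):
    def __init__(self):
        # The mixins rely on these attributes being set up here.
        self.users = []
        self.orders = []
```

Callers see one class, A, with both groups of methods, while each group can be read and tested in isolation.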

Python: one single module (file .py) for each class? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 7 years ago.
I started programming in Python two weeks ago. I'm making a separate file (module) for each class, as I've done before in languages like Java or C#.
But now, seeing tutorials and code from other people, I've realized that many people define more than one class, plus the main function, in the same file. I don't know whether they do it because those are just examples or because it's a Python convention to define and group many classes in the same file.
So, in Python, should it be one file per class, or many classes in the same file if they can be grouped by some particular feature (like motor vehicles on one side and plain vehicles on the other)?
Obviously everyone has their own style, but I'm hoping for general answers or the conventions. If someone wants to share their own style and why, feel free to do it! ;)
one file for each class
Do not do this. In Java, you usually will not have more than one class in a file (you can, of course, nest).
In Python, if you group related classes in a single file, you are on the safe side. Take a look at the Python standard library: many modules contain multiple classes in a single file.
As for the why? In short: Readability. I, personally, enjoy not having to switch between files to read related or similar code. It also makes imports more concise.
Imagine if socketserver.py spread UDPServer, TCPServer, ForkingUDPServer, ForkingTCPServer, ThreadingUDPServer, ThreadingTCPServer, BaseRequestHandler, StreamRequestHandler and DatagramRequestHandler across nine files. How would you import these? Like this?
from socketserver.tcp.server import TCPServer
from socketserver.tcp.server.forking import ForkingTCPServer
...
That's plain overhead. It's overhead when you write it. It's overhead when you read it. Isn't this easier?
from socketserver import TCPServer, ForkingTCPServer
That said, no one will stop you if you put each class into its own file. It just might not be Pythonic.
Python has the concept of packages, modules and classes. If you put one class per module, the advantage of having modules is gone. If you have a huge class, it might be ok to put this class in a separate file, but then again, is it good to have big classes? NO, it's hard to test and maintain. Better have more small classes with specific tasks and put them logically grouped in as few files as possible.
It's not wrong to have one class per file at all. Python isn't directly aimed at object oriented design so that's why you can get away with multiple classes per file.
I recommend having a read over some style guides if you're confused about what the 'proper' way to do it is.
I suggest either Google's style guide or the official Python style guide, PEP 8.
You can also find more material relating to Python's idioms and meta-analysis in the PEP index.

Is a language without interfaces a bad choice for teaching OOP (Python)? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 9 years ago.
Perhaps I am dated on my information, but I was going to help someone learn an OOP language, concepts, etc. I wanted to use something dynamic. My thought was Python but then I read it has no Interfaces. Isn't an Interface something someone needs to know to learn OOP concepts?
My thought was Python but then I read it has no Interfaces.
Well, you can use interfaces in Python. The standard library has the abc module, and there are third-party modules like PyProtocols (and frameworks like Zope and Twisted have their own similar ideas).
The point is that they aren't required, and frequently aren't even necessary. You "wanted to use something dynamic"? That's what it means to be dynamic: Functions take any object with the right interface, without needing that interface to be statically defined anywhere, much less needing the object to statically declare that it supports the interface.
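A short illustration of that point. The total_length function and Playlist class are made-up names; nothing here declares or checks an interface:

```python
def total_length(items):
    """Works with anything iterable whose elements support len()."""
    return sum(len(item) for item in items)


class Playlist:
    """Declares no interface; it simply provides __iter__."""

    def __init__(self, *titles):
        self._titles = titles

    def __iter__(self):
        return iter(self._titles)


print(total_length(["ab", "cde"]))        # 5
print(total_length(Playlist("x", "yz")))  # 3
```

total_length never asks what type it was given; a list and a Playlist both work because each can be iterated.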
So, when you ask:
Isn't an Interface something someone needs to know to learn OOP concepts?
The answer is "No". An Interface is something someone needs to know to learn Java-style OOP, but it's not something someone needs to know to learn OOP in general.
Consider this from a different angle: It's possible, but not easy, to do JavaScript-style OOP in Python; as in Java, classes are too fundamental to Python to live without. Does that make Python and Java bad OO languages and JavaScript a good one? No; it just makes them different OO languages.
monkut wanted to know "wtf is an interface???"
An interface (aka protocol, abstract base class, abstract type, …) is just a type that can be used for static and/or dynamic type checking and/or switching in exactly the same way a class can.
So, what's the good of that? Well, in theory, you should never need to inspect types—that's not just part of the Zen of Python, it's part of core OO dogma as well—but in practice sometimes you do. So, being able to name your abstract types, and declare that a class supports a variety of different abstract types, can be useful.
Of course you could do that with plain old classes, but being able to explicitly declare that a type is abstract can also help with readability and debugging. PEP 3119 explains the rationale in a Python-centric way.
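As a concrete case of named abstract types, the standard library's collections.abc module provides Iterable and Sized, which can be used in isinstance checks. The Countdown class below is a hypothetical example that never mentions either of them:

```python
from collections.abc import Iterable, Sized


class Countdown:
    """Defines __iter__ but does not inherit from any abstract type."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        return iter(range(self.start, 0, -1))


print(isinstance([1, 2], Iterable))        # True
print(isinstance(Countdown(3), Iterable))  # True: it defines __iter__
print(isinstance(Countdown(3), Sized))     # False: it has no __len__
print(isinstance(42, Iterable))            # False
```

Iterable recognises Countdown structurally, so the abstract type can be checked on the rare occasions that is needed without forcing every class to declare it.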
But in languages like Java, there are two additional benefits.
First, if you have static type checking, you can't write a single function that can take, e.g., a list, a tuple, a set, a frozenset, or an iterator. But you can write a function that takes an Iterable, and then declare that list, tuple, etc. all provide the Iterable interface, and then everything is fine. (In a language with dynamic type checking, duck typing already takes care of this for you—your code works with any object that has an __iter__ method that returns something that behaves the way you expect it to.)
Second, if you've got static data member layout and/or vtable-style method override mechanism, multiple inheritance is very tricky. But multiple inheritance is also very useful. So in Java, an interface is something just like a class, but with no data members or method implementations, and you can inherit from as many interfaces as you want, but only one class. This gives Java some of the benefits of multiple inheritance, without any of the problems. (In a language with dynamic data members and dynamic method lookup, as long as you have a sensible MRO algorithm, as Python does, you can get all of the benefits of multiple inheritance without any of the problems.)

Factory pattern in Python [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 5 years ago.
I'm currently implementing the Factory design pattern in Python and I have a few questions.
Is there any way to prevent the direct instantiation of the actual concrete classes? For example, if I have a VehicleFactory that spawns Vehicles, I want users to just use that factory, and prevent anyone from accidentally instantiating Car() or Truck() directly. I can throw an exception in __init__() perhaps, but that would also mean that the factory can't create an instance of it...
It seems to me now that factories are getting addictive. Seems like everything should become a factory so that when I change the internal implementation, the client code will not change. I'm interested to know when there is an actual need to use factories, and when it is not appropriate to use them. For example, I might have a Window class and there's only one of this type now (no PlasticWindow, ReinforcedWindow or anything like that). In that case, should I use a factory for the client to generate the Window, just in case I might add more types of Windows in the future?
I'm just wondering if there is a usual way of naming factories. For example, now I'm calling my Vehicle factory Vehicles, so the code looks something like Vehicles.create(...). I see a lot of tutorials naming it like VehicleFactory, but I find that too long and it sort of exposes the implementation as well.
EDIT: What I meant by "exposes the implementation" is that it lets people know that it's a factory. What I felt was that the client need not know that it's a factory, but rather see it as some class that can return objects for you (which is a factory of course, but maybe there's no need to explicitly tell clients that?). I know that the source code is easily exposed, so I didn't mean "exposing the way the functionality is implemented in the source code".
Thanks!
Be Pythonic. Don't overcomplicate your code with "enterprise" language (like Java) solutions that add unnecessary levels of abstraction.
Your code should be simple, and intuitive. You shouldn't need to delegate to another class to instantiate another.
Don't expose the class (for example make it private __MyClass, or obvious that you don't want it used directly _MyClass). This way it can only be instantiated via the factory function.
Perhaps you should review the use of keyword arguments, and inheritance. It sounds like you may be overlooking these, which will generally reduce your dependence on complex factories (To be honest, I've rarely needed factories).
In Python you cannot easily protect against exposing implementation, it goes against the Zen of Python. (It's the same in any language, a determined individual can get what they want eventually). At most you should try to ensure that a user of your code does not accidentally do the wrong thing, but never presume to know what the end-user may decide to achieve with your code. Don't make it obfuscated and difficult to work with.
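A minimal sketch of the "factory function plus underscore-prefixed classes" idea from this answer. The names _Car, _Truck, create_vehicle and the wheels attribute are illustrative assumptions:

```python
class _Car:
    wheels = 4


class _Truck:
    wheels = 6


# Registry mapping a public "kind" string to a non-public class.
_VEHICLES = {"car": _Car, "truck": _Truck}


def create_vehicle(kind, **kwargs):
    """Factory function: the only name users are meant to call."""
    try:
        cls = _VEHICLES[kind]
    except KeyError:
        raise ValueError(f"unknown vehicle kind: {kind!r}") from None
    return cls(**kwargs)


# Only the factory is exported by "from module import *".
__all__ = ["create_vehicle"]
```

The leading underscore is a convention, not enforcement: a determined user can still reach _Car, but nobody will instantiate it by accident.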
Is there any way to prevent the direct instantiation of the actual concrete classes?
Why? Are your programmers evil sociopaths who refuse to follow the rules? If you provide a factory -- and the factory does what people need -- then they'll use the factory.
You can't "prevent" anything. Remember. This is Python -- they have the source.
should I use a factory for the client to generate the Window, just in case I might add more types of Windows in the future?
Meh. Neither good nor bad. It can get cumbersome to manage all the class-hierarchy-and-factory details.
Adding a factory isn't hard. This is Python -- you have all the source at all times -- you can use grep to find a class constructor and replace it with a factory when you need to.
Since you can use grep to find and fix your mistakes, you don't need to pre-plan this kind of thing as much as you might in Java or C++.
I see a lot of tutorials doing it like VehicleFactory, but I find it too long and it sort of exposes the implementation as well.
"Too Long"? It's used so rarely that it barely matters. Use long names -- it helps other folks understand what you're doing. This is not Code Golf where fewest keystrokes wins.
"exposes the implementation"? First, It exposes nothing. Second, this is Python -- you have all the source at all times -- everything is already exposed.
Stop thinking so much about prevention and privacy. It isn't helpful.
