I'm struggling a little understanding how to use classes effectively. I have written a program which I hope to count the number of occurrences of a phrase or word in a .txt file.
I'm not quite sure how to call the function properly, any help would be much appreciated.
Thanks.
class WordCounter:
    def The_Count(self):
        print "Counting words..."
        txt_doc = open("file")
        for line in txt_doc:
            if "word" in txt_doc:
                word_freq = word_freq + 1
            return word_freq
        print "Frequency of word: %s" % word_freq

WordCounter.The_Count
Using classes is a little different than what you have tried to do here. Think of it more in terms of preserving variables and state of objects in code. To accomplish your task, something more like the following would work:
class CountObject(object):
    """Counts occurrences of a word in a file"""
    def __init__(self, filename):
        self.filename = filename
    def getcount(self, word):
        count = 0
        # 'with' closes the file even if an error occurs
        with open(self.filename, 'r') as infile:
            for line in infile:
                count = count + line.count(word)
        return count

mycounter = CountObject(r'C:\list.txt')
print "The occurrence count of 'in' is %s" % mycounter.getcount('in')
First, just to agree on the names, a function inside a class is called a method of that class.
In your example, your method performs the action of counting occurrences of words, so to make it clearer, you could simply call your method count. Note also that in Python, it is a convention to have method names start with a lower case.
Also, it is good practice to use so-called new-style classes, which are simply classes that inherit from object.
Finally, in Python, a method needs to have at least one parameter, which is by convention called self and which should be an instance of the class.
So if we apply these changes, we get something like:
class WordCounter(object):
    def count(self):
        print "Counting words..."
        # Rest of your code
        # ...
Now that your class has a method, you first need to create an instance of your class before you can call that method on it. To create an instance of a class Foo in Python, you simply call Foo(). Once you have your instance, you can then call your method. Using your example:
# Create an instance of your class and save it in a variable
my_word_counter = WordCounter()
# Call your method on the instance you have just created
my_word_counter.count()
Note that you don't need to pass in an argument for self because the Python interpreter will replace self with the value of my_word_counter for you, i.e. it calls WordCounter.count(my_word_counter).
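A minimal sketch of that equivalence, written for Python 3. The method body here is a stand-in for illustration, not the asker's counting code:

```python
class WordCounter(object):
    def count(self):
        # `self` is whichever instance the method was called on
        return "counting with %r" % type(self).__name__


my_word_counter = WordCounter()

# The usual call: Python fills in `self` automatically
via_instance = my_word_counter.count()

# The explicit form: the instance is passed by hand
via_class = WordCounter.count(my_word_counter)

print(via_instance == via_class)  # True
```

Both calls run the same function with the same object as self; the first form is simply the conventional spelling.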
A note on OO
As noted by others, your example is not a great use of classes in Python. OO classes aim at putting together behaviours (instance methods) along with the data they interact with (instance attributes). Your example being a simple one, there is no real internal data associated with your class. A good warning sign is the fact that you never use self inside your method.
For behaviour that is not tied to some particular data, Python gives you the flexibility to write module-level functions - Java, in opposition, forces you to put absolutely everything inside classes.
As suggested by others too, to make your example more OO, you could pass the filename as a param to __init__ and save it as self.filename. Probably even better would be to have your WordCounter expect a file-like object, so that it is not responsible for opening/closing the file itself. Something like:
class WordCounter(object):
    def __init__(self, txt_doc):
        self.txt_doc = txt_doc
    def count(self):
        print "Counting words..."
        for line in self.txt_doc:
            # Rest of your code
            # ...

with open(filename) as f:
    word_counter = WordCounter(f)
    word_counter.count()
Finally, if you want more details on classes in Python, a good source of information is always the official documentation.
You have several problems here; the code you posted isn't correct Python. Methods of a class should take a reference to self as their first argument:
def The_Count(self):
you need to initialize word_freq for the case where there are no words to count:
word_freq = 0
as others have mentioned, you can call your function this way:
counter = WordCounter()
print(counter.The_Count())
It's not really idiomatic python to wrap these kinds of stateless functions in classes, as you might do in Java or something. I would separate this function into a module, and let the calling class handle the file I/O, etc.
To call a method in a class, first you have to create an instance of that class:
c = WordCounter()
Then you call the method on that instance:
c.The_Count()
However, in this case you don't really need classes; this can just be a top-level function. Classes are most useful when you want each object to have its own internal state.
For such a small program, using classes may not be necessary. You could simply define the function and then call it.
However, if you wanted to implement a class design you could use (after class definition):
if __name__ == "__main__":
    wc = WordCounter()  # create instance
    wc.The_Count()  # call method
A class design can improve the readability and flexibility of your code if you later want to expand the capabilities of the class.
In this case, you'd have to change the code to this:
class WordCounter:
    def CountWords(self):
        # For functions inside classes, the first parameter must always be `self`
        # I'm sure there are exceptions to that rule, but none you should worry about
        # right now.
        print "Counting words..."
        txt_doc = open("file")
        word_freq = 0
        for line in txt_doc:
            if "word" in line:  # I'm assuming you mean to use 'line' instead of 'txt_doc'
                word_freq += 1
        # count all the words first before returning it
        txt_doc.close()  # Always close files after you open them.
                         # (also, consider learning about the 'with' keyword)
        # Either print the frequency
        print "Frequency of word: %s" % word_freq
        # ...or return it.
        return word_freq
...then to call it, you would do....
>>> foo = WordCounter() # create an instance of the class
>>> foo.CountWords() # run the function
As other posters have noted, this is not the most effective use of classes. It would be better if you made this into a top-level function, and changed it to this:
def CountWords(filename, word):
    with open(filename) as txt_doc:
        word_freq = 0
        for line in txt_doc:
            if word in line:
                word_freq += 1
    return word_freq
...and called it like this:
>>> output = CountWords("file.txt", "cat")
>>> print "Frequency of word: %s" % output
Frequency of word: 39
It would make a bit more sense to use a class if you had something like the below, where you have a bunch of variables and functions all related to one conceptual 'object':
class FileStatistics:
    def __init__(self, filename):
        self.filename = filename
    def CountWords(self, word):
        pass  # Add code here...
    def CountNumberOfLetters(self):
        pass
    def AverageLineLength(self):
        pass
    # ...etc.
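One possible way to fill in that skeleton is sketched below, written for Python 3. The method names are switched to lowercase, the _lines helper is hypothetical, and the demo file contents are made up. Treat it as an illustration of several methods sharing one piece of state (self.filename), not as the answerer's intended implementation:

```python
import os
import tempfile


class FileStatistics(object):
    def __init__(self, filename):
        self.filename = filename

    def _lines(self):
        # Hypothetical helper: re-reads the file on each call
        with open(self.filename) as f:
            return f.read().splitlines()

    def count_words(self, word):
        # Substring count, so "cat" also matches inside "catalog"
        return sum(line.count(word) for line in self._lines())

    def count_letters(self):
        return sum(1 for line in self._lines() for ch in line if ch.isalpha())

    def average_line_length(self):
        lines = self._lines()
        if not lines:
            return 0.0
        return sum(len(line) for line in lines) / float(len(lines))


# Demo on a throwaway file with made-up contents
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("the cat sat\nthe cat ran\n")

stats = FileStatistics(tmp.name)
word_total = stats.count_words("cat")
letter_total = stats.count_letters()
average = stats.average_line_length()
os.remove(tmp.name)

print(word_total, letter_total, average)
```

Every method reads self.filename, which is the kind of shared state that makes a class worthwhile here.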
Related
Trying to do some optimization here on a class. We're trying not to change too much the class definitions. In essence we are instantiating a ClassA N times but one of the methods has a nasty file read.
for x in range(0, N):
    cl = ClassA()
    cl.dostuff(x)
The class looks like this:
class ClassA:
    def dostuff(self, x):
        # open nasty file here
        nastyfile = open()
        # do something else
We could move the file read out of the class and put it before the loop, since the file will not change. But is there a way to ensure that we only ever open the nasty file once across all instances of the class? For example, it would be read when the class is first instantiated and then be available to all future instances without being read again. Is there a way to do this in the current form without changing the structure of the existing code base too much?
One question relates to the interpreter: is Python smart enough to cache variables such as nastyfile, so that we can leave things as they are, or is the quick and dirty solution the following:
nastyfile = open()
for x in range(0, N):
    cl = ClassA()
    cl.dostuff(x)
Looking for a pythonic way to do this.
You could encapsulate opening the file in a classmethod.
class ClassA():
    @classmethod
    def open_nasty_file(cls):
        cls.nasty_file = open('file_path', 'file_mode')
    def do_stuff(self):
        if not hasattr(self, 'nasty_file'):
            self.open_nasty_file()
This approach relies on the fact that attribute look-ups will try finding the attribute on the class if not found on the instance.
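That lookup rule can be seen in a small sketch, written for Python 3. The load_shared method and the shared attribute are hypothetical names, and a plain string stands in for the real file handle:

```python
class ClassA(object):
    @classmethod
    def load_shared(cls):
        # Stored on the class, so every instance can see it
        cls.shared = "expensive result"


first = ClassA()
second = ClassA()

before = hasattr(first, "shared")      # nothing has been set yet
first.load_shared()                    # sets ClassA.shared, not first.shared
on_instance = "shared" in vars(first)  # the instance's own __dict__ stays empty
seen_by_other = second.shared          # found through the class

print(before, on_instance, seen_by_other)
```

Because the value lives on the class, a second instance created before or after the load sees the same object without triggering another load.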
You could put this check/instantiation in the __init__ function if you want it opened when the first instance is instantiated.
Note that this method will leave the file open, so it will need to be closed at some point.
You could have a class method that opens the file when the first instance asks for it. I've wrapped it in a lock so that it is thread safe.
import threading

class ClassA:
    _nasty_file = None
    _nasty_file_lock = threading.Lock()
    def dostuff(self, x):
        # open nasty file here
        nastyfile = self.get_nasty_file()
        # do something else
    @classmethod
    def get_nasty_file(cls):
        with cls._nasty_file_lock:
            if cls._nasty_file is None:
                with open('nastyfile') as fp:
                    cls._nasty_file = fp.read()
            return cls._nasty_file
Instances can access and modify class attributes by themselves. So you can just set up an attribute on the class and provide it with a default (None) value, and then check for that value before doing anything in dostuff. Example:
class A():
    nastyfileinfo=None
    def dostuff(self,x):
        if A.nastyfileinfo: print('nastyfileinfo already exists:',A.nastyfileinfo)
        if not A.nastyfileinfo:
            print('Adding nastyfileinfo')
            A.nastyfileinfo='This is really nasty' ## open()
            print('>>>nastyfileinfo:',A.nastyfileinfo)
        ## Continue doing your other stuff involving x

for j in range(0,10):
    A().dostuff(j)
nastyfileinfo is also considered an attribute of the instance, so you can reference it with instance.nastyfileinfo, however if you modify it there it will only update for that one specific instance, whereas if you modify it on the class, all other instances will be able to see it (provided they didn't change their personal/self reference to nastyfileinfo).
instants=[]
for j in range(0,10):
    instants.append(A())
for instance in instants:
    print(instance.nastyfileinfo)
instants[5].dostuff(5)
for instance in instants:
    print(instance.nastyfileinfo)
I'm trying to write a class that works kind of like the builtins and some of the other "grown-up" Python stuff I've seen. My Pythonic education is a little spotty, classes-wise, and I'm worried I've got it all mixed up.
I'd like to create a class that serves as a kind of repository, containing a dictionary of unprocessed files (and their names), and a dictionary of processed files (and their names). I'd like to implement some other (sub?)classes that handle things like opening and processing the files. The file handling classes should be able to update the dictionaries in the main class. I'd also like to be able to directly call the various submodules without having to separately instantiate everything, e.g.:
import Pythia
p = Pythia()
p.FileManager.addFile("/path/to/some/file")
or even
Pythia.FileManager.addFile("/path/to/some/file")
I've been looking around at stuff about @classmethod and super and such, but I can't say I entirely understand it. I'm also beginning to suspect that I might have the whole chain of inheritance backwards--that what I think of as my main class should actually be the child class of the handling and processing classes. I'm also wondering whether this would all work better as a package, but that's a separate, very intimidating issue.
Here's my code so far:
#!/usr/bin/python
import re
import os

class Pythia(object):
    def __init__(self):
        self.raw_files = {}
        self.parsed_files = {}
        self.FileManger = FileManager()
    def listf(self,fname,f):
        if fname in self.raw_files.keys():
            _isRaw = "raw"
        elif fname in self.parsed_files.keys():
            _isRaw = "parsed"
        else:
            return "Error: invalid file"
        print "{} ({}):{}...".format(fname,_isRaw,f[:100])
    def listRaw(self,n=None):
        max = n or len(self.raw_files.items())
        for item in self.raw_files.items()[:max]:
            listf(item[0],item[1])
    def listParsed(self,n=None):
        max = n or len(self.parsed_files.items())
        for item in self.parsed_files.items()[:max]:
            listf(item[0],item[1])

class FileManager(Pythia):
    def __init__(self):
        pass
    def addFile(self,f,name=None,recurse=True,*args):
        if name:
            fname = name
        else:
            fname = ".".join(os.path.basename(f).split(".")[:-1])
        if os.path.exists(f):
            if not os.path.isdir(f):
                with open(f) as fil:
                    Pythia.raw_files[fname] = fil.read()
            else:
                print "{} seems to be a directory.".format(f)
                if recurse == False:
                    return "Stopping..."
                elif recurse == True:
                    print "Recursively navingating directory {}".format(f)
                    addFiles(dir,*args)
                else:
                    recurse = raw_input("Recursively navigate through directory {}? (Y/n)".format(f))
                    if recurse[0].lower() == "n":
                        return "Stopping..."
                    else:
                        addFiles(dir,*args)
        else:
            print "Error: file or directory not found at {}".format(f)
    def addFiles(self,directory=None,*args):
        if directory:
            self._recursivelyOpen(directory)
        def argHandler(arg):
            if isinstance(arg,str):
                self._recursivelyOpen(arg)
            elif isinstance(arg,tuple):
                self.addFile(arg[0],arg[1])
            else:
                print "Warning: {} is not a valid argument...skipping..."
                pass
        for arg in args:
            if not isinstance(arg,(str,dict)):
                if len(arg) > 2:
                    for subArg in arg:
                        argHandler(subArg)
                else:
                    argHandler(arg)
            elif isinstance(arg,dict):
                for item in arg.items():
                    argHandler(item)
            else:
                argHandler(arg)
    def _recursivelyOpen(self,f):
        if os.path.isdir(f):
            l = [os.path.join(f,x) for x in os.listdir(f) if x[0] != "."]
            for x in l:
                _recursivelyOpen(x)
        else:
            addFile(f)
First off: follow PEP8's guidelines. Module names, variable names, and function names should be lowercase_with_underscores; only class names should be CamelCase. Following your code is a little difficult otherwise. :)
You're muddying up OO concepts here: you have a parent class that contains an instance of a subclass.
Does a FileManager do mostly what a Pythia does, with some modifications or extensions? Given that the two only work together, I'd guess not.
I'm not quite sure what you ultimately want this to look like, but I don't think you need inheritance at all. FileManager can be its own class, self.file_manager on a Pythia instance can be an instance of FileManager, and then Pythia can delegate to it if necessary. That's not far from how you're using this code already.
Build small, independent pieces, then worry about how to plug them into each other.
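As a rough sketch of that composition-and-delegation idea, written for Python 3: the lowercase method names and the simplified add_file signature are assumptions made for illustration, and no real file I/O is done.

```python
class FileManager(object):
    """Small, independent piece: it knows nothing about Pythia."""
    def __init__(self):
        self.raw_files = {}

    def add_file(self, name, contents):
        self.raw_files[name] = contents


class Pythia(object):
    def __init__(self):
        # Pythia *has a* FileManager; neither class inherits from the other
        self.file_manager = FileManager()

    def add_file(self, name, contents):
        # Delegate to the component
        self.file_manager.add_file(name, contents)

    def list_raw(self):
        return sorted(self.file_manager.raw_files)


p = Pythia()
p.add_file("notes", "some text")              # through the delegating method
p.file_manager.add_file("todo", "more text")  # or directly on the component
names = p.list_raw()
print(names)  # ['notes', 'todo']
```

Here FileManager can be tested on its own, and Pythia only decides how to plug it in.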
Also, some bugs and style nits:
You call _recursivelyOpen(x) but forgot the self..
Single space after commas.
Watch out for max as a variable name: it's also the name of a builtin function.
Avoid type-checking (isinstance) if you can help it. It's extra-hard to follow your code when it does a dozen different things depending on argument types. Have very clear argument types, and create helper functions that accept different arguments if necessary.
You have Pythia.raw_files[fname] inside FileManager, but Pythia is a class, and it doesn't have a raw_files attribute anyway.
You check if recurse is True, then False, then... something else. When is it something else? Also, you should use is instead of == for testing against the builtin singletons like this.
There is a lot here and you are probably best to educate yourself some more.
For your intended usage:
import Pythia
p = Pythia()
p.file_manager.addFile("/path/to/some/file")
A class structure like this would work:
class FileManager(object):
    def __init__(self, parent):
        self.parent = parent
    def addFile(self, file):
        # Your code
        self.parent.raw_files[file] = file
    def addFiles(self, files):
        # Your code
        for file in files:
            self.parent.raw_files[file] = file

class Pythia(object):
    def __init__(self):
        self.file_manager = FileManager(self)
However, there are a lot of options. You should write some client code first to work out what you want, then implement your classes/objects to match that. I rarely use inheritance in Python; it is often not required thanks to Python's duck typing.
Also if you want a method to be called without instantiating the class use staticmethod, not classmethod. For example:
class FileManager(object):
    @staticmethod
    def addFiles(files):
        pass
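For comparison, here is a small sketch of the two decorators side by side, written for Python 3. The normalize and register methods and the registry attribute are hypothetical. Both kinds of method can be called on the class without creating an instance; the difference is that a classmethod receives the class as its first argument and a staticmethod receives nothing extra.

```python
class FileManager(object):
    registry = []

    @staticmethod
    def normalize(name):
        # No access to the class or an instance; just a namespaced function
        return name.strip().lower()

    @classmethod
    def register(cls, name):
        # Receives the class itself as `cls`
        cls.registry.append(cls.normalize(name))
        return cls.registry


normalized = FileManager.normalize("  Report.TXT ")
registered = FileManager.register(" Notes.txt")
print(normalized)  # report.txt
print(registered)  # ['notes.txt']
```

A staticmethod is the simpler choice when the method touches no class state; a classmethod fits when it needs to read or update something stored on the class.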
Firstly, I don't know what the most appropriate title for this question would be. Contender: "how to implement list.append in custom class".
I have a class called Individual. Here's the relevant part of the class:
from itertools import count
class Individual:
    ID = count()
    def __init__(self, chromosomes):
        self.chromosomes = list(chromosomes)
        self.id = self.ID.next()
Here's what I want to do with this class:
Suppose I instantiate a new individual with no chromosomes: indiv = Individual([]) and I want to add a chromosome to this individual later on. Currently, I'd have to do:
indiv.chromosomes.append(makeChromosome(params))
Instead, what I would ideally like to do is:
indiv.append(makeChromosome(params))
with the same effect.
So my question is this: when I call append on a list, what really happens under the hood? Is there an __append__ (or __foo__) that gets called? Would implementing that function in my Individual class get me the desired behavior?
I know, for instance, that I can implement __contains__ in Individual to enable if foo in indiv functionality. How would I go about enabling indiv.append(…) functionality?
.append() is simply a method that takes one argument, and you can easily define one yourself:
def append(self, newitem):
    self.chromosomes.append(newitem)
No magic methods required.
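Put together with the class from the question, a sketch might look like this (written for Python 3, so next(self.ID) replaces self.ID.next(); the string is a placeholder for whatever makeChromosome(params) would return):

```python
from itertools import count


class Individual(object):
    ID = count()

    def __init__(self, chromosomes):
        self.chromosomes = list(chromosomes)
        self.id = next(self.ID)  # the next() builtin works in Python 2.6+ and 3

    def append(self, chromosome):
        # Forward the call to the underlying list
        self.chromosomes.append(chromosome)


indiv = Individual([])
indiv.append("chromosome-A")  # placeholder value, not a real chromosome object
print(indiv.chromosomes)      # ['chromosome-A']
```

The method is an ordinary one that happens to share its name with list.append; nothing in the language treats it specially.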
Do Python methods have an attribute of their own that indicates the class they belong to?
For example ...:
# a simple global function dummy WITHOUT any class membership
def global_function():
    print('global_function')

# a simple method dummy WITH a membership in a class
class Clazz:
    def method():
        print('Clazz.method')

global_function() # prints "global_function"
Clazz.method() # prints "Clazz.method"
# until here, everything should be clear

# define a simple replacement
def xxx():
    print('xxx')

# replaces a certain function OR method with the xxx-function above
def replace_with_xxx(func, clazz = None):
    if clazz:
        setattr(clazz, func.__name__, xxx)
    else:
        func.__globals__[func.__name__] = xxx

# make all methods/functions print "xxx"
replace_with_xxx(global_function)
replace_with_xxx(Clazz.method, Clazz)
# works great:
global_function() # prints "xxx"
Clazz.method() # prints "xxx"
# OK, everything fine!
# But I would like to write something like:
replace_with_xxx(Clazz.method)
# instead of
replace_with_xxx(Clazz.method, Clazz)
# note: no second parameter Clazz!
Now my question is: How is it possible, to get all method/function calls print "xxx", WITHOUT the "clazz = None" argument in the replace_with_xxx function???
Is there something possible like:
def replace_with_xxx(func): # before it was: (func, clazz = None)
    if func.has_class(): # something possible like this???
        setattr(func.get_class(), func.__name__, xxx) # and this ???
    else:
        func.__globals__[func.__name__] = xxx
Thank you very much for reading. I hope, i could make it a little bit clear, what i want. Have a nice day! :)
I do not think this is possible. A simple explanation why: you can define a function and attach it to a class without any additional declaration, and it will be stored as an attribute of the class. You can also assign the same function as a method to two or more different classes.
So methods shouldn't contain any information about the class.
In Python 2, Clazz.method will have an attribute im_class, which will tell you what the class is.
However, if you find yourself wanting to do this, it probably means you are doing something the hard way. I don't know what you are trying to accomplish but this is a really bad way to do just about anything unless you have no other option.
For methods wrapped in @classmethod, the method will be bound and contain the reference im_self pointing to the class.
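The im_class and im_self names belong to Python 2. As a hedged sketch for Python 3, where im_class no longer exists, a bound method still exposes the object it is bound to through __self__ (also available in Python 2.6+). The class below is a stand-in, not the asker's code:

```python
class Clazz(object):
    def method(self):
        pass

    @classmethod
    def class_method(cls):
        pass


obj = Clazz()

# A bound instance method remembers the instance it came from
bound_to_instance = obj.method.__self__ is obj
owner_of_instance = type(obj.method.__self__)

# A bound classmethod remembers the class
bound_to_class = Clazz.class_method.__self__ is Clazz

print(bound_to_instance, owner_of_instance is Clazz, bound_to_class)
```

This only helps when you hold a bound method. A plain function looked up on the class in Python 3 carries no reference back to the class object, which matches the point made above about the same function being attachable to several classes.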
I'm new to using classes and I'm trying to pass a variable to one of the methods inside of my class. How do I do it?
Here's an example of what I'm trying to accomplish:
class a_class():
    def a_method(txt):
        print txt

instance = a_class()
instance.a_method('hello world!')
P.S. I don't understand the whole self and __blah__ concepts yet, and I will avoid them at this point if I don't have to use them.
When writing an instance method for a class in Python- which looks exactly like what you've just coded up- you can't avoid using self. The first parameter to an instance method in Python is always the object the method is being called on. self is not a reserved word in Python- just the traditional name for that first parameter.
To quote from the official Python tutorial, chapter 9:
[...] the special thing about methods is that the object is passed as the first argument of the function. In our example, the call x.f() is exactly equivalent to MyClass.f(x). In general, calling a method with a list of n arguments is equivalent to calling the corresponding function with an argument list that is created by inserting the method’s object before the first argument.
Therefore, you need to define two parameters for your method. The first is always self- at least that is the conventional name- and the second is your actual parameter. So your code snippet should be:
class a_class(object):
    def a_method(self, txt):
        print txt

instance = a_class()
instance.a_method('hello world!')
Note that the class explicitly inherits from object (I'm not sure empty parentheses there are legal). You can also provide no inheritance, which is identical for most purposes, but different in some details of the behavior of the type system; the inheritance from object defines a_class as a new-style class rather than an old-style class, which is irrelevant for most purposes but probably worth being aware of.
You need to have
class a_class():
    def a_method(self,txt):
        print txt
The first variable of a class method always contains a reference to the object no matter what variable name you use. (Unless you are using it as a static method).
Instance Methods in Python must be provided the instance (given as self) as the first parameter in the method signature.
class a_class():
    def a_method(self,txt):
        print txt
That should be what you're looking for. Additionally, if you were to interact with a member variable you'd want to do something like this:
class a_class():
    name = "example"
    def a_method(self,txt):
        print txt
        print self.name
The self concept and the use of __init__ really aren't that confusing, and they are essential to writing good Python code. __init__ is called when a class is instantiated. Simply include a self parameter in every instance method; you can then use self to reference the instance of the class.
class a_class():
    def __init__(self):
        self.count = 0
    def a_method(self, txt):
        self.count += 1
        print str(self.count), txt

instance = a_class()
instance.a_method('hello world!')
# prints "1 hello world!"
instance.a_method('hello again!')
# prints "2 hello again!"