How to properly call one method from another in Python.
I get some data from an AWS S3 bucket; afterwards I want to sort this data and write it into a .txt file.
import boto3
import string
import json
import collections

def handler(event, context):
    print(f'Event: {event}')
    s3 = boto3.resource('s3')
    bucket = s3.Bucket(event["bucket"])
    for obj in bucket.objects.all():
        key = obj.key
        body = obj.get()['Body'].read()
        b = json.loads(body)
        c = WordCorrection.create_duplicated_words_file(b)
        # WordCorrection.create_duplicated_words_file(WordCorrection.word_frequency(
        # WordCorrection.correct_words(b)))
        # WordCorrection.spell_words(WordCorrection.dict_spell_words(WordCorrection.unrecognized_words_from_textrtact(b)))
    return c
CONFIDENT_LEVEL = 98

class WordCorrection:
    def correct_words(data):
        spell = SpellChecker()
        correct_words_from_amazon = []
        for items in data['Blocks']:
            if items['BlockType'] == "WORD" and items['Confidence'] > CONFIDENT_LEVEL and {items["Text"]} != spell.known([items['Text']]):
                correct_words_from_amazon.append(items['Text'])
        correct_words_from_amazon = [''.join(c for c in s if c not in string.punctuation) for s in
                                     correct_words_from_amazon]
        return correct_words_from_amazon

    def word_frequency(self, correct_words_from_amazon):
        word_counts = collections.Counter(correct_words_from_amazon)
        word_frequency = {}
        for word, count in sorted(word_counts.items()):
            word_frequency.update({word: count})
        return dict(sorted(word_frequency.items(), key=lambda item: item[1], reverse=True))

    def create_duplicated_words_file(word_frequency):
        with open("word_frequency.txt", "w") as filehandle:
            filehandle.write(str(' '.join(word_frequency)))
I tried using self but could not get a good result, and for that reason I use
WordCorrection.create_duplicated_words_file(WordCorrection.word_frequency(WordCorrection.correct_words(b)))
but I'm 100% sure that this is not correct. Is there another way to call one method from another?
I think your trouble is the result of a misunderstanding about keywords/namespaces for modules vs classes.
Modules
In Python, files are modules, so when you are inside a file, all functions defined up to that point in the file are "in scope." So if I have two functions like this:
def func_foo():
    return "foo"

def func_bar():
    return func_foo() + "bar"
Then func_bar() will return "foobar".
Classes
When you define a class using the class keyword, that defines a new scope/namespace. It is considered proper (although technically not required) to use the word self as the first parameter to an instance method, and this refers to the instance the method is called upon.
For example:
class my_clazz:
    def method_foo(self):
        return "foo"

    def method_bar(self):
        return self.method_foo() + "bar"
Then if I have later in the file:
example = my_clazz()
ret_val = example.method_bar()
ret_val will be "foobar"
That said, because I did not really utilize object-oriented programming features in this example, the class definition was largely unnecessary.
Your Issue
So for your issue, it seems like your trouble is caused by what appears to be unnecessarily wrapping your functions inside a class definition. If you got rid of the class definition header and just made all of your functions in the module you would be able to use the calling techniques I used above. For more information on classes in Python I'd recommend reading here.
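As a rough sketch of that suggestion, the question's helpers could be written as plain module-level functions and called directly, each result being passed to the next. The names strip_punctuation and format_frequencies are hypothetical stand-ins, and the S3 and spell-checking parts are left out:

```python
import collections
import string


def strip_punctuation(words):
    # Remove punctuation characters from every word.
    return ["".join(c for c in w if c not in string.punctuation) for w in words]


def word_frequency(words):
    # Count the words and order them from most to least frequent.
    counts = collections.Counter(words)
    return dict(sorted(counts.items(), key=lambda item: item[1], reverse=True))


def format_frequencies(frequency):
    # Build the text that would be written to the output file.
    return " ".join(f"{word}:{count}" for word, count in frequency.items())


# One function simply calls the next; no class or self is involved.
report = format_frequencies(word_frequency(strip_punctuation(["cat,", "dog", "cat"])))
```

If the functions do stay inside a class, they would each need self as the first parameter and be called as self.word_frequency(...) from another method of the same instance.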
Related
I have a class which I want to use to extract data from a text file (already parsed), and I want to do so using dynamically created class methods, because otherwise there would be a lot of repetitive code. Each created class method shall be associated with a specific line of the text file, e.g. '.get_name()' --> read a part of the 0th line of the text file.
My idea was to use a dictionary for the 'to-be-created' method names and corresponding line.
import sys
import inspect

test_file = [['Name=Jon Hancock'],
             ['Date=16.08.2020'],
             ['Author=Donald Duck']]

# intended method names
fn_names = {'get_name': 0, 'get_date': 1, 'get_author': 2}

class Filer():
    def __init__(self, file):
        self.file = file

def __get_line(cls):
    name = sys._getframe().f_code.co_name
    line = fn_names[name]  # <-- causes error because __get_line is not in fn_names
    print(sys._getframe().f_code.co_name)  # <-- '__get_line'
    print(inspect.currentframe().f_code.co_name)  # <-- '__get_line'
    return print(cls.file[line][0].split('=')[1])

for key, val in fn_names.items():
    setattr(Filer, key, __get_line)

f = Filer(test_file)
f.get_author()
f.get_date()
When I try to access the method name to link the method to the designated line in the text file, I do get an error because the method name is always '__get_line' instead of e.g. 'get_author' (what I had hoped for).
Another way I thought of solving this was to make '__get_line' accept an additional argument (line) and to set it by passing val in the 'setattr()' call, as shown below:
def __get_line(cls, line):
    return print(cls.file[line][0].split('=')[1])
and
for key, val in fn_names.items():
    setattr(Filer, key, __get_line(val))
however, then Python complains that 1 argument (line) is missing.
Any ideas how to solve that?
I would propose a much simpler solution, based on some assumptions. Your file appears to consist of key-value pairs. You are choosing to map the line number to a function that processes the right hand side of the line past the = symbol. Python does not conventionally use getters. Attributes are much nicer and easier to use. You can have getter-like functionality by using property objects, but you really don't need that here.
class Filer():
    def __init__(self, file):
        self.file = file
        for line in file:
            name, value = line[0].split('=', 1)
            setattr(self, name.lower(), value)
That's all you need. Now you can use the result:
>>> f = Filer(test_file)
>>> f.author
'Donald Duck'
If you want to have callable methods exactly like the one you propose for each attribute, I would one-up your proposal and not even have a method to begin with. You can actually generate the methods on the fly in __getattr__:
class Filer():
    def __init__(self, file):
        self.file = file

    def __getattr__(self, name):
        if name in fn_names:
            index = fn_names[name]
            def func(self):
                print(self.file[index][0].split('=', 1)[1])
            func.__name__ = func.__qualname__ = name
            return func.__get__(self, type(self))
        # object defines no __getattr__ to delegate to, so raise directly
        raise AttributeError(name)
Calling __get__ is an extra step that makes the function behave as if it were a method of the class all along. It binds the function object to the instance, even though the function is not part of the class.
For example:
>>> f = Filer(test_file)
>>> f.get_author
<bound method get_author of <__main__.Filer object at 0x0000023E7A247748>>
>>> f.get_author()
Donald Duck
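The binding step can also be seen in isolation. The following is a minimal sketch with a hypothetical Greeter class and greet function, neither of which comes from the question; it only illustrates that a plain function's __get__ returns a method bound to a given instance:

```python
class Greeter:
    def __init__(self, name):
        self.name = name


def greet(self):
    # A plain function defined outside the class.
    return "hello " + self.name


g = Greeter("world")

# Functions are descriptors: __get__ returns a method bound to the instance.
bound = greet.__get__(g, Greeter)

# The bound method supplies g as self automatically.
result = bound()
```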
Consider closing over your keys and values -- note that you can see the below code running at https://ideone.com/qmoZCJ:
import sys
import inspect

test_file = [['Name=Jon Hancock'],
             ['Date=16.08.2020'],
             ['Author=Donald Duck']]

# intended method names
fn_names = {'get_name': 0, 'get_date': 1, 'get_author': 2}

class Filer():
    def __init__(self, file):
        self.file = file

def getter(key, val):
    # val is captured by the closure, so each method remembers its own line
    def _get_line(self):
        return self.file[val][0].split('=')[1]
    return _get_line

for key, val in fn_names.items():
    setattr(Filer, key, getter(key, val))

f = Filer(test_file)
print("Author: ", f.get_author())
print("Date: ", f.get_date())
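The second attempt in the question failed because __get_line(val) calls the function immediately instead of producing a new method. One possible alternative to a hand-written closure, sketched here under the assumption that Python 3 is in use, is functools.partialmethod from the standard library, which pre-fills the line argument. The helper name _get_line is hypothetical:

```python
from functools import partialmethod

test_file = [['Name=Jon Hancock'],
             ['Date=16.08.2020'],
             ['Author=Donald Duck']]
fn_names = {'get_name': 0, 'get_date': 1, 'get_author': 2}


class Filer:
    def __init__(self, file):
        self.file = file

    def _get_line(self, line):
        # Return the part of the given line after the first '='.
        return self.file[line][0].split('=', 1)[1]


# Attach one method per name, each with its line index already filled in.
for name, index in fn_names.items():
    setattr(Filer, name, partialmethod(Filer._get_line, index))

f = Filer(test_file)
author = f.get_author()
```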
This may be a newb question, but I'm a bit of a newb with Python, so here's what I'm trying to do...
I am using Python 2.7.
I would like to assign a file path as a string into a dict in functionA, and then call this dict in functionB.
I looked at C-like structures in Python to try and use structs with no luck, possibly from a lack of understanding... The below sample is an excerpt from the link.
I also took a look at What are metaclasses in Python?, but I'm not sure if I understand metaclasses either.
So, how would I use the values assigned in functionA within functionB, such as:
class cstruct:
    path1 = ""
    path2 = ""
    path3 = ""

def functionA():
    path_to_a_file1 = os.path.join("/some/path/", "filename1.txt")
    path_to_a_file2 = os.path.join("/some/path/", "filename2.txt")
    path_to_a_file3 = os.path.join("/some/path/", "filename3.txt")
    obj = cstruct()
    obj.path1 = path_to_a_file1
    obj.path2 = path_to_a_file2
    obj.path3 = path_to_a_file3
    print("testing string here: ", obj.path1)
    # returns the path correctly here

# this is where things fall apart and the print doesn't return the string that I've tested with print(type(obj.path))
def functionB():
    obj = cstructs()
    print(obj.path1)
    print(obj.path2)
    print(obj.path3)
    print(type(obj.path))
    # returns <type 'str'>, which is what i want, but no path
Am I passing the parameters properly for the paths? If not, could someone please let me know what would be the right way to pass the string to be consumed?
Thanks!
You need to do something like this:
import os

class Paths:
    def __init__(self, path1, path2, path3):
        self.path1 = path1
        self.path2 = path2
        self.path3 = path3

def functionA():
    path_to_a_file1 = os.path.join("/some/path/", "filename1.txt")
    path_to_a_file2 = os.path.join("/some/path/", "filename2.txt")
    path_to_a_file3 = os.path.join("/some/path/", "filename3.txt")
    obj = Paths(path_to_a_file1, path_to_a_file2, path_to_a_file3)
    return obj

def functionB(paths):  # should take a parameter
    # obj = cstructs() don't do this! This would create a *new empty object*
    print(paths.path1)
    print(paths.path2)
    print(paths.path3)
    print(type(paths.path1))  # Paths has no plain 'path' attribute

paths = functionA()
functionB(paths)  # pass the argument
In any case, you really should take the time to read the official tutorial on classes. And you really should be using Python 3; Python 2 is past its end of life.
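For the struct-like container the question was looking for, one option on Python 3.7 or later is a dataclass, which generates __init__ from the field declarations. This is only a sketch under that version assumption; make_paths and describe are hypothetical names standing in for functionA and functionB:

```python
import os
from dataclasses import dataclass


@dataclass
class Paths:
    path1: str
    path2: str
    path3: str


def make_paths(base="/some/path/"):
    # Build the three paths and return them as one object.
    names = ("filename1.txt", "filename2.txt", "filename3.txt")
    return Paths(*(os.path.join(base, name) for name in names))


def describe(paths):
    # Receives the object as a parameter instead of creating a new, empty one.
    return [paths.path1, paths.path2, paths.path3]


paths = make_paths()
listing = describe(paths)
```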
I'm trying to learn OOP but I'm getting very confused about how I'm supposed to run the methods or return values. In the following code I want to run read_chapters() first, then sendData() with some string content that comes from read_chapters(). Some of the solutions I found did not use __init__, but I want to use it (just to see/learn how I can use it).
How do I run them? Without using __init__, why do you only return 'self'?
import datetime

class PrinceMail:
    def __init__(self):
        self.date2 = datetime.date(2020, 2, 6)
        self.date1 = datetime.date.today()
        self.days = (self.date1 - self.date2).days
        self.file = 'The_Name.txt'
        self.chapter = ''  # Not sure if it would be better if i initialize chapter here-
        # or if i can just use a normal variable later

    def read_chapters(self):
        with open(self.file, 'r') as book:
            content = book.readlines()
        indexes = [x for x in range(len(content)) if 'CHAPTER' in content[x]]
        indexes = indexes[self.days:]
        heading = content[indexes[0]]
        try:
            for i in (content[indexes[0]:indexes[1]]):
                self.chapter += i  # can i use normal var and return that instead?
            print(self.chapter)
        except IndexError:
            for i in (content[indexes[0]:]):
                self.chapter += i
            print(self.chapter)
        return self?????  # what am i supposed to return? i want to return chapter
        # The print works here but returns nothing.

    # sendData has to run after readChapters automatically
    def sendData(self):
        pass
        #i want to get the chapter into this and do something with it

    def run(self):
        self.read_chapters().sendData()
        # I tried this method but it doesn't work for sendData
        # Is there anyother way to run the two methods?

obj = PrinceMail()
print(obj.run())
#This is kinda confusing as well
Chaining methods is just a way to shorten this code:
temp = self.read_chapters()
temp.sendData()
So, whatever is returned by read_chapters has to have the method sendData. You should put whatever you want to return in read_chapters in a field of the object itself (aka self) in order to use it after chaining.
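A minimal sketch of that idea follows. The Pipeline class and its hard-coded chapter text are hypothetical; the point is only that each method stores its result on self and returns self, so the next call in the chain has an object to work on:

```python
class Pipeline:
    def __init__(self):
        self.chapter = ""
        self.sent = []

    def read_chapters(self):
        # Store the result on the instance rather than returning it...
        self.chapter = "CHAPTER 1\nOnce upon a time"
        # ...and return the instance so another method can be chained.
        return self

    def send_data(self):
        # Uses the value that read_chapters left on the instance.
        self.sent.append(self.chapter)
        return self


p = Pipeline().read_chapters().send_data()
```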
First of all, __init__ has nothing to do with what you want to achieve here. You can think of it as the counterpart of a constructor in other languages: it runs automatically when you create an object of the class.
Now to answer your question: if I understand correctly, you just want to use the output of read_chapters in sendData. One way to do that is to make read_chapters a private method (that is, if you don't want it to be called through the object) by starting its name with __, as in __read_chapters, and then call it from inside sendData.
Another point to consider: when you store values on self and don't intend to call the function through the object, you don't need to return anything. Assigning to self.chapter sets the attribute on the current instance, so you can leave read_chapters at self.chapter = i and access the same attribute in sendData.
Ex -
def sendData(self):
    print(self.chapter)
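Put together, the private-helper approach might look like the sketch below. The Mailer class, its lines argument and the returned string are hypothetical; the file reading from the question is replaced by an in-memory list to keep the example self-contained:

```python
class Mailer:
    def __init__(self, lines):
        self.lines = lines
        self.chapter = ""

    def __read_chapters(self):
        # Nothing is returned; the result lives on the instance.
        self.chapter = "".join(self.lines)

    def send_data(self):
        # The helper is called from inside the class, then its result is used.
        self.__read_chapters()
        return "sending: " + self.chapter


m = Mailer(["CHAPTER 1\n", "text\n"])
message = m.send_data()
```

Note that the leading double underscore only triggers name mangling (the method is stored as _Mailer__read_chapters); it is a convention against accidental use rather than real access control.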
I'm not an expert but, the reason to return self is because it is the instance of the class you're working with and that's what allows you to chain methods.
For what you're trying to do, method chaining doesn't seem to be the best approach. You want to sendData() for each iteration of the loop in read_chapters()? (you have self.chapter = i which is always overwritten)
Instead, you can store the chapters in a list and send it after all the processing.
Also, and I don't know if this is good practice, you can have a getter that returns the data if you want to do something different with it (return self.chapter instead of self).
I'd change your code to:
import datetime

class PrinceMail:
    def __init__(self):
        self.date2 = datetime.date(2020, 2, 6)
        self.date1 = datetime.date.today()
        self.days = (self.date1 - self.date2).days
        self.file = 'The_Name.txt'
        self.chapter = []

    def read_chapters(self):
        with open(self.file, 'r') as book:
            content = book.readlines()
        indexes = [x for x in range(len(content)) if 'CHAPTER' in content[x]]
        indexes = indexes[self.days:]
        heading = content[indexes[0]]
        try:
            for i in (content[indexes[0]:indexes[1]]):
                self.chapter.append(i)
        except IndexError:
            # not sure what you want to do here
            for i in (content[indexes[0]:]):
                self.chapter.append(i)
        return self

    # sendData has to run after readChapters automatically
    def sendData(self):
        pass
        # do whatever with self.chapter

    def get_raw_chapters(self):
        return self.chapter
Also, check PEP 8 Style Guide for naming conventions (https://www.python.org/dev/peps/pep-0008/#function-and-variable-names)
More reading in
Method chaining - why is it a good practice, or not?
What __init__ and self do on Python?
Given a class with class methods that contain only self input:
class ABC():
    def __init__(self, input_dict):
        self.variable_0 = input_dict['variable_0']
        self.variable_1 = input_dict['variable_1']
        self.variable_2 = input_dict['variable_2']
        self.variable_3 = input_dict['variable_3']

    def some_operation_0(self):
        return self.variable_0 + self.variable_1

    def some_operation_1(self):
        return self.variable_2 + self.variable_3
First question: Is this very bad practice? Should I just refactor some_operation_0(self) to explicitly take the necessary inputs, some_operation_0(self, variable_0, variable_1)? If so, the testing is very straightforward.
Second question: What is the correct way to setup my unit test on the method some_operation_0(self)?
Should I setup a fixture in which I initialize input_dict, and then instantiate the class with a mock object?
#pytest.fixture
def generator_inputs():
    f = open('inputs.txt', 'r')
    input_dict = eval(f.read())
    f.close()
    mock_obj = ABC(input_dict)

def test_some_operation_0():
    assert mock_obj.some_operation_0() == some_value
(I am new to both python and general unit testing...)
Those methods do take an argument: self. There is no need to mock anything. Instead, you can simply create an instance, and verify that the methods return the expected value when invoked.
For your example:
def test_abc():
    a = ABC({'variable_0':0, 'variable_1':1, 'variable_2':2, 'variable_3':3})
    assert a.some_operation_0() == 1
    assert a.some_operation_1() == 5
If constructing an instance is very difficult, you might want to change your code so that the class can be instantiated from standard in-memory data structures (e.g. a dictionary). In that case, you could create a separate function that reads/parses data from a file and uses the "data-structure-based" __init__ method, e.g. make_abc() or a class method.
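One way this separation could look is sketched below, with a shortened two-variable ABC. The from_file class method is a hypothetical name, and ast.literal_eval is swapped in for the question's eval on the assumption that the file holds a plain dict literal:

```python
import ast
import os
import tempfile


class ABC:
    def __init__(self, input_dict):
        # The constructor only needs an in-memory dictionary.
        self.variable_0 = input_dict['variable_0']
        self.variable_1 = input_dict['variable_1']

    @classmethod
    def from_file(cls, path):
        # File handling lives here, outside __init__.
        with open(path) as f:
            return cls(ast.literal_eval(f.read()))

    def some_operation_0(self):
        return self.variable_0 + self.variable_1


# Tests can build an instance directly from a dictionary, no file needed.
a = ABC({'variable_0': 0, 'variable_1': 1})

# Production code can still load the same data from disk.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'inputs.txt')
    with open(path, 'w') as f:
        f.write("{'variable_0': 2, 'variable_1': 3}")
    b = ABC.from_file(path)
```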
If this approach does not generalize to your real problem, you could imagine providing programmatic access to the key names or other metadata that ABC recognizes or cares about. Then, you could programmatically construct a "defaulted" instance, e.g. an instance where every value in the input dict is a default-constructed value (such as 0 for int):
class ABC():
    PROPERTY_NAMES = ['variable_0', 'variable_1', 'variable_2', 'variable_3']

    def __init__(self, input_dict):
        # implementation omitted for brevity
        pass

    def some_operation_0(self):
        return self.variable_0 + self.variable_1

    def some_operation_1(self):
        return self.variable_2 + self.variable_3

def test_abc():
    a = ABC({name: 0 for name in ABC.PROPERTY_NAMES})
    assert a.some_operation_0() == 0
    assert a.some_operation_1() == 0
I have a yaml script that we use to specify functions. The yaml file parses into a dictionary (actually, nested dictionaries) that I want to use to construct the functions described in this yaml file. Here's an example yaml entry:
Resistance:
  arguments:
    voltage: "V"
    current: "A"
  parameters:
    a: -1.23
    b: 0.772
  format: "{a}*voltage+{b}*current+f(voltage)"
  subfunctions:
    f:
      arguments:
        voltage: "V"
      parameters:
        a: -6.32
      format: "exp({a}*voltage)"
Now, what I need to do is parse this file and then build up the namespaces so that at the end, I can bind a variable called "Resistance" to a closure or lambda that reflects the above function (with the nested "f" subfunction).
My strategy was to go "bottom up" using a recursive algorithm. Here is my code:
def evaluateSimpleFunction(entry):
    functionString = entry['format']
    functionArgs = []
    Params = []
    if "arguments" in entry and entry["arguments"] != None:
        functionArgs = entry['arguments'].keys()
    if "parameters" in entry and entry["parameters"] != None:
        Params = entry['parameters']
    formatString = ""
    for param in Params:
        formatString += str(param)+"="+str(Params[param])+","
    functionString = eval("functionString.format("+formatString+")")
    lambdaString = ""
    for arg in functionArgs:
        lambdaString += str(arg)+","
    return eval("lambda " + lambdaString + ":" + functionString)

def recursiveLoader(entry):
    if "subfunctions" in entry:
        subfunctions = entry['subfunctions']
        bindingString = ""
        for subFunc in subfunctions:
            bindingString +=str(subFunc)+"=[];"
        exec(bindingString)
        for subFunc in subfunctions:
            exec(str(subFunc)+"= recursiveLoader(subfunctions[subFunc])")
        return lambda : evaluateSimpleFunction(entry)
    else:
        return lambda : evaluateSimpleFunction(entry)

import yaml,os, math
os.chdir(r"C:\Users\212544808\Desktop\PySim\xferdb")
keyFields = ["Resistance","OCV"]
containerKeys = ["_internalResistance","_OCV"]
functionContainer = {}
with open("LGJP1.yml",'r') as modelFile:
    parsedModelFile = yaml.load(modelFile)
#for funcKey,containerKey in zip(keyFields,containerKeys):
entry = parsedModelFile["capacityDegrade"]
g = recursiveLoader(entry)
Now, as it stands, I get an error because I am using unqualified exec with a nested function.
However, I don't want to resort to globals, because I will use this process for multiple functions and will therefore overwrite any globals I use.
I'm hoping for suggestions on how to construct nested functions algorithmically from an external config file like the yaml file - exec doesn't seem to be the way to go.
BTW: I'm using Python 2.7
UPDATE
Another, more robust option may be to use a global class instance to create a namespace for each function. For example:
class Namespace(): pass
namespace_1 = Namespace()
#assume that the function "exponent" has arguments X, Y and body "Q(X*Y)",
#where "Q" has body "x**2+3*y"
exec("namespace_1.exponent = lambda X,Y: Q(X*Y)")
exec("namespace_1.Q = lambda x,y: x**2+3*y")
The benefit of this approach is that I can then loop through the members of the class for a particular function to create a single source code string that I can pass to "eval" to get the final function.
I'm doing all of this because I have not found a reliable way to create nested closures using eval and exec.
Here's a simplified example of what I mean using your input. I have hardcoded it, but you could easily build up a similar module file using your parser:
def makeModule(**kwargs):
    print repr(kwargs)
    module_filename = 'generated_module.py'
    with open(module_filename, 'w') as module_file:
        module_file.write('''\
from math import *
def func(voltage, current):
    def f(voltage):
        return exp({a1} * voltage)
    return {a0}*voltage+{b}*current+f(voltage)
'''.format(**kwargs))
    module_name = module_filename.replace('.py', '')
    module = __import__(module_name)
    return module.func

def main():
    func = makeModule(a0=-1.23, b=0.772, a1=-6.32)
    print 'Result:', func(2, 3)

if __name__ == '__main__':
    main()
It works by generating a file called generated_module.py and then using the builtin function __import__ to import it as a module that is stored into the variable module. Like any other module, then you can access the names defined in it, namely func.
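A different route, which avoids both the generated file and shared globals, is to give every function its own namespace dictionary and pass it to eval explicitly. The following is only a sketch of that idea, written for Python 3 rather than the 2.7 used in the question; build_function is a hypothetical helper, the entry dict mirrors the YAML above, and eval should only ever see trusted config files:

```python
import math


def build_function(entry):
    # Each function gets its own namespace, so nothing leaks into globals.
    namespace = {"exp": math.exp}
    # Build subfunctions first (bottom up) and expose them by name.
    for name, sub_entry in (entry.get("subfunctions") or {}).items():
        namespace[name] = build_function(sub_entry)
    # Substitute the numeric parameters into the format string.
    body = entry["format"].format(**(entry.get("parameters") or {}))
    args = ", ".join(entry.get("arguments") or {})
    # The lambda resolves free names such as f and exp in this namespace.
    return eval("lambda " + args + ": " + body, namespace)


entry = {
    "arguments": {"voltage": "V", "current": "A"},
    "parameters": {"a": -1.23, "b": 0.772},
    "format": "{a}*voltage+{b}*current+f(voltage)",
    "subfunctions": {
        "f": {
            "arguments": {"voltage": "V"},
            "parameters": {"a": -6.32},
            "format": "exp({a}*voltage)",
        }
    },
}

resistance = build_function(entry)
value = resistance(2, 3)
```

Because the namespace is created per call, two top-level functions can each have a subfunction called f without overwriting one another.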