python structs for dict of files


This may be a newbie question, but I'm fairly new to Python, so here's what I'm trying to do.
I am using Python 2.7.
I would like to assign a file path as a string into a dict in functionA, and then use this dict in functionB.
I looked at C-like structures in Python to try and use structs with no luck, possibly from a lack of understanding. The sample below is an excerpt from the link.
I also took a look at What are metaclasses in Python?, but I'm not sure I understand metaclasses either.
So, how would I access the values assigned in functionA from within functionB, such as:
class cstruct:
    path1 = ""
    path2 = ""
    path3 = ""
def functionA():
    path_to_a_file1 = os.path.join("/some/path/", "filename1.txt")
    path_to_a_file2 = os.path.join("/some/path/", "filename2.txt")
    path_to_a_file3 = os.path.join("/some/path/", "filename3.txt")
    obj = cstruct()
    obj.path1 = path_to_a_file1
    obj.path2 = path_to_a_file2
    obj.path3 = path_to_a_file3
    print("testing string here: ", obj.path1)
    # returns the path correctly here
# this is where things fall apart and the print doesn't return the string that I've tested with print(type(obj.path))
def functionB():
    obj = cstructs()
    print(obj.path1)
    print(obj.path2)
    print(obj.path3)
    print(type(obj.path))
    # returns <type 'str'>, which is what i want, but no path
Am I passing the parameters properly for the paths? If not, could someone please let me know what would be the right way to pass the string to be consumed?
Thanks!

You need to do something like this:
import os
class Paths:
    def __init__(self, path1, path2, path3):
        self.path1 = path1
        self.path2 = path2
        self.path3 = path3
def functionA():
    path_to_a_file1 = os.path.join("/some/path/", "filename1.txt")
    path_to_a_file2 = os.path.join("/some/path/", "filename2.txt")
    path_to_a_file3 = os.path.join("/some/path/", "filename3.txt")
    obj = Paths(path_to_a_file1, path_to_a_file2, path_to_a_file3)
    return obj
def functionB(paths): # should take a parameter
    # obj = cstructs() don't do this! This would create a *new empty object*
    print(paths.path1)
    print(paths.path2)
    print(paths.path3)
    print(type(paths.path1))
paths = functionA()
functionB(paths) # pass the argument
In any case, you really should take the time to read the official tutorial on classes. And you really should be using Python 3; Python 2 is past its end of life.
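Since the question title asks about a dict of files, the same idea can also be sketched with a plain dict instead of a class. This is only an illustration: the names build_paths and show_paths and the dict keys are made up for the example and are not from the question.

```python
import os


def build_paths(base_dir="/some/path/"):
    """Return a dict mapping a short label to a full file path."""
    names = {
        "path1": "filename1.txt",
        "path2": "filename2.txt",
        "path3": "filename3.txt",
    }
    return {label: os.path.join(base_dir, name) for label, name in names.items()}


def show_paths(paths):
    """Consume the dict that build_paths() returned."""
    for label in sorted(paths):
        print(label, paths[label])


paths = build_paths()   # create the dict in one function ...
show_paths(paths)       # ... and pass it explicitly to the other
```

The point is the same as in the class version: the second function receives the object as an argument rather than creating a fresh, empty one.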

Related

How to properly call one method from another method

How to properly call one method from another in python.
I get some data from an AWS S3 bucket; after that I want to sort this data and write it into a .txt file.
import boto3
import string
import json
import collections
def handler(event, context):
    print(f'Event: {event}')
    s3 = boto3.resource('s3')
    bucket = s3.Bucket(event["bucket"])
    for obj in bucket.objects.all():
        key = obj.key
        body = obj.get()['Body'].read()
        b = json.loads(body)
        c = WordCorrection.create_duplicated_words_file(b)
        # WordCorrection.create_duplicated_words_file(WordCorrection.word_frequency(
        # WordCorrection.correct_words(b)))
        # WordCorrection.spell_words(WordCorrection.dict_spell_words(WordCorrection.unrecognized_words_from_textrtact(b)))
        return c
CONFIDENT_LEVEL = 98
class WordCorrection:
    def correct_words(data):
        spell = SpellChecker()
        correct_words_from_amazon = []
        for items in data['Blocks']:
            if items['BlockType'] == "WORD" and items['Confidence'] > CONFIDENT_LEVEL and {items["Text"]} != spell.known([items['Text']]):
                correct_words_from_amazon.append(items['Text'])
        correct_words_from_amazon = [''.join(c for c in s if c not in string.punctuation) for s in
                                     correct_words_from_amazon]
        return correct_words_from_amazon
    def word_frequency(self, correct_words_from_amazon):
        word_counts = collections.Counter(correct_words_from_amazon)
        word_frequency = {}
        for word, count in sorted(word_counts.items()):
            word_frequency.update({word: count})
        return dict(sorted(word_frequency.items(), key=lambda item: item[1], reverse=True))
    def create_duplicated_words_file(word_frequency):
        with open("word_frequency.txt", "w") as filehandle:
            filehandle.write(str(' '.join(word_frequency)))
I tried using self but could not get a good result, which is why I use
WordCorrection.create_duplicated_words_file(WordCorrection.word_frequency(WordCorrection.correct_words(b)))
but I'm 100% sure that this is not correct. Is there another way to call one method from another?

I think your trouble is the result of a misunderstanding about keywords/namespaces for modules vs classes.
Modules:
In python, files are modules so when you are inside of a file, all functions defined up to that point in the file are "in scope." So if I have two functions like this:
def func_foo():
    return "foo"
def func_bar():
    return func_foo() + "bar"
Then func_bar() will return "foobar".
Classes
When you define a class using the class keyword, that defines a new scope/namespace. It is considered proper (although technically not required) to use the word self as the first parameter to an instance method, and this refers to the instance the method is called upon.
For example:
class my_clazz:
    def method_foo(self):
        return "foo"
    def method_bar(self):
        return self.method_foo() + "bar"
Then if I have later in the file:
example = my_clazz()
ret_val = example.method_bar()
ret_val will be "foobar"
That said, because I did not really utilize object-oriented programming features in this example, the class definition was largely unnecessary.
Your Issue
So for your issue, it seems like your trouble is caused by what appears to be unnecessarily wrapping your functions inside a class definition. If you got rid of the class definition header and just made all of your functions in the module you would be able to use the calling techniques I used above. For more information on classes in Python I'd recommend reading here.
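As a rough sketch of that suggestion, the word-processing steps could live as plain module-level functions that call each other directly. The function names below are illustrative, the spell-checking and S3 parts are left out, and the input is assumed to be a simple list of words.

```python
import collections
import string


def strip_punctuation(words):
    """Remove punctuation characters from every word."""
    return ["".join(c for c in w if c not in string.punctuation) for w in words]


def word_frequency(words):
    """Count words and return a dict ordered from most to least frequent."""
    counts = collections.Counter(words)
    return dict(counts.most_common())


def format_words(frequency):
    """Join the words (the dict keys) into one space-separated string."""
    return " ".join(frequency)


def write_frequency_file(frequency, filename="word_frequency.txt"):
    """Write the formatted words to a text file (filename is an assumption)."""
    with open(filename, "w") as filehandle:
        filehandle.write(format_words(frequency))


def process(words):
    # Plain functions in the same module can simply call one another.
    return format_words(word_frequency(strip_punctuation(words)))
```

With this layout there is no self and no class prefix to get wrong: process(["cat,", "dog", "cat"]) chains the three helpers in order.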

How do I run two or more methods in a class like a chain?

I'm trying to learn OOP but I'm getting very confused with how I'm supposed to run the methods or return values. In the following code I want to run read_chapters() first, then sendData() with some string content that comes from read_chapters(). Some of the solutions I found did not use __init__ but I want to use it (just to see/learn how i can use them).
How do I run them? Without using __init__, why do you only return 'self'?
import datetime
class PrinceMail:
    def __init__(self):
        self.date2 = datetime.date(2020, 2, 6)
        self.date1 = datetime.date.today()
        self.days = (self.date1 - self.date2).days
        self.file = 'The_Name.txt'
        self.chapter = '' # Not sure if it would be better if i initialize chapter here-
        # or if i can just use a normal variable later
    def read_chapters(self):
        with open(self.file, 'r') as book:
            content = book.readlines()
            indexes = [x for x in range(len(content)) if 'CHAPTER' in content[x]]
            indexes = indexes[self.days:]
            heading = content[indexes[0]]
            try:
                for i in (content[indexes[0]:indexes[1]]):
                    self.chapter += i # can i use normal var and return that instead?
                print(self.chapter)
            except IndexError:
                for i in (content[indexes[0]:]):
                    self.chapter += i
                print(self.chapter)
            return self????? # what am i supposed to return? i want to return chapter
            # The print works here but returns nothing.
    # sendData has to run after readChapters automatically
    def sendData(self):
        pass
        #i want to get the chapter into this and do something with it
    def run(self):
        self.read_chapters().sendData()
        # I tried this method but it doesn't work for sendData
        # Is there anyother way to run the two methods?
obj = PrinceMail()
print(obj.run())
#This is kinda confusing as well

Chaining methods is just a way to shorten this code:
temp = self.read_chapters()
temp.sendData()
So, whatever is returned by read_chapters has to have the method sendData. You should put whatever you want to return in read_chapters in a field of the object itself (aka self) in order to use it after chaining.
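A minimal sketch of that rule, using a made-up Report class rather than the asker's PrinceMail: load() stores its result on the instance and returns self, so send() can be chained onto it.

```python
class Report:
    def __init__(self):
        self.lines = []

    def load(self, text):
        # Keep the result on the instance so later methods can use it.
        self.lines = text.splitlines()
        # Returning self is what makes `.send()` callable on the result.
        return self

    def send(self):
        return "sending %d line(s)" % len(self.lines)


# Chained form ...
result = Report().load("a\nb").send()

# ... which is shorthand for the two-step form.
temp = Report().load("a\nb")
same_result = temp.send()
```

If load() returned the text instead of self, the chained call would fail, because a string has no send method.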

First of all, __init__ has nothing to do with what you want to achieve here. You can think of it as the counterpart of a constructor in other languages: it is the method that runs when you create an object of the class.
Now to answer your question: if I am correct, you just want to use the output of read_chapters in sendData. One way to do that is to make read_chapters a private method (that is, if you don't want it to be used through the object) by starting its name with __, like __read_chapters, and then call it inside sendData.
Another point to consider: when you are using self and don't intend to use the function through the object, you don't need to return anything. Assigning to an attribute on self stores the value on the current instance. So you can leave read_chapters at self.chapter = i and access the same attribute in sendData.
Ex -
def sendData(self):
    print(self.chapter)
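A small sketch of the private-method variant described above, assuming a simplified Mailer class that gets its text passed in instead of reading a file (both the class and its behavior are made up for the illustration):

```python
class Mailer:
    def __init__(self, text):
        self.text = text
        self.chapter = ""

    def __read_chapter(self):
        # The leading double underscore triggers name mangling, so callers
        # outside the class cannot reach this as obj.__read_chapter().
        self.chapter = self.text.strip()

    def send_data(self):
        # The "private" step runs first; nothing needs to be returned,
        # because the result is stored on the instance.
        self.__read_chapter()
        return "sent: " + self.chapter


message = Mailer("  CHAPTER 1 \n").send_data()
```

Callers only ever invoke send_data(), so the ordering of the two steps is enforced inside the class.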

I'm not an expert, but the reason to return self is that it is the instance of the class you're working with, and that's what allows you to chain methods.
For what you're trying to do, method chaining doesn't seem to be the best approach. Do you want to call sendData() for each iteration of the loop in read_chapters()? (You have self.chapter = i, which is always overwritten.)
Instead, you can store the chapters in a list and send it after all the processing.
Also, and I don't know if this is good practice, you can add a getter to return the data if you want to do something different with it (return self.chapter instead of self).
I'd change your code to:
import datetime
class PrinceMail:
    def __init__(self):
        self.date2 = datetime.date(2020, 2, 6)
        self.date1 = datetime.date.today()
        self.days = (self.date1 - self.date2).days
        self.file = 'The_Name.txt'
        self.chapter = []
    def read_chapters(self):
        with open(self.file, 'r') as book:
            content = book.readlines()
            indexes = [x for x in range(len(content)) if 'CHAPTER' in content[x]]
            indexes = indexes[self.days:]
            heading = content[indexes[0]]
            try:
                for i in (content[indexes[0]:indexes[1]]):
                    self.chapter.append(i)
            except IndexError:
                # not sure what you want to do here
                for i in (content[indexes[0]:]):
                    self.chapter.append(i)
        return self
    # sendData has to run after readChapters automatically
    def sendData(self):
        pass
        # do whatever with self.chapter
    def get_raw_chapters(self):
        return self.chapter
Also, check PEP 8 Style Guide for naming conventions (https://www.python.org/dev/peps/pep-0008/#function-and-variable-names)
More reading in
Method chaining - why is it a good practice, or not?
What __init__ and self do on Python?

Accessible variables at the root of a python script

I've declared a number of variables at the start of my script, as I'm using them in a number of different methods ("Functions" in python?). When I try to access them, I can't seem to get their value, or set them to another value for that matter. For example:
baseFile = open('C:/Users/<redacted>/Documents/python dev/ATM/Data.ICSF', 'a+')
secFile = open('C:/Users/<redacted>/Documents/python dev/ATM/security.ICSF', 'a+')
def usrInput(raw_input):
    if raw_input == "99999":
        self.close(True)
    else:
        identity = raw_input
def splitValues(source, string):
    if source == "ident":
        usrTitle = string.split('>')[1]
        usrFN = string.split('>')[2]
        usrLN = string.split('>')[3]
        x = string.split('>')[4]
        usrBal = Decimal(x)
        usrBalDisplay = str(locale.currency(usrBal))
    elif source == "sec":
        usrPIN = string.split('>')[1]
        pinAttempts = string.split('>')[2]
def openAccount(identity):
    #read all the file first. it's f***ing heavy but it'll do here.
    plString = baseFile.read()
    xList = plString.split('|')
    parm = str(identity)
    for i in xList:
        substr = i[0:4]
        if parm == substr:
            print "success"
            usrString = str(i)
        else:
            lNumFunds = lNumFunds + 1
    splitValues("ident", usrString)
When I place baseFile and secFile in the openAccount method, I can access the respective files as normal. However, when I place them at the root of the script, as in the example above, I can no longer read from the files, although I can still "see" the variables.
Is there a reason for this? For reference, I am using Python 2.7.

methods ("Functions" in python?)
"function" when they "stand free"; "methods" when they are members of a class. So, functions in your case.
What you describe does definitely work in python. Hence, my diagnosis is that you already read something from the file elsewhere before you call openAccount, so that the read pointer is not at the beginning of the file.
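The read-pointer effect is easy to reproduce with a throwaway file. The sketch below is only an illustration: the temporary file, its contents and the helper name read_all are all made up, and it uses a plain read mode rather than the asker's 'a+'.

```python
import os
import tempfile

# Create a small throwaway file to stand in for Data.ICSF.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("1001>Ms>Ada|1002>Mr>Bob")
    path = tmp.name

shared = open(path, "r")   # module-level handle, like baseFile


def read_all():
    # Functions can use the module-level handle without any extra declaration.
    return shared.read()


first = read_all()     # whole file; the pointer is now at the end
second = read_all()    # nothing left to read from the current position
shared.seek(0)         # rewind to the beginning
third = read_all()     # whole file again

shared.close()
os.remove(path)
```

So a module-level file object is visible inside functions, but any earlier read moves its position, and a seek(0) is needed before reading it again. In Python 3, a handle opened in 'a+' mode also starts positioned at the end of the file; whether Python 2.7 on Windows behaves the same way is not something this sketch checks.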

How can I refer to a function not by name in its definition in python?

I am maintaining a little library of useful functions for interacting with my company's APIs and I have come across (what I think is) a neat question that I can't find the answer to.
I frequently have to request large amounts of data from an API, so I do something like:
class Client(object):
    def __init__(self):
        self.data = []
    def get_data(self, offset = 0):
        done = False
        while not done:
            data = get_more_starting_at(offset)
            self.data.extend(data)
            offset += 1
            if not data:
                done = True
This works fine and allows me to restart the retrieval where I left off if something goes horribly wrong. However, since python functions are just regular objects, we can do stuff like:
def yo():
    yo.hi = "yo!"
    return None
and then we can interrogate yo about its properties later, like:
yo.hi => "yo!"
My question is: can I rewrite my class-based example to pin the data to the function itself, without referring to the function by name? I know I can do this by:
def get_data(offset=0):
    done = False
    get_data.data = []
    while not done:
        data = get_more_starting_from(offset)
        get_data.data.extend(data)
        offset += 1
        if not data:
            done = True
    return get_data.data
but I would like to do something like:
def get_data(offset=0):
    done = False
    self.data = [] # <===== this is the bit I can't figure out
    while not done:
        data = get_more_starting_from(offset)
        self.data.extend(data) # <====== also this!
        offset += 1
        if not data:
            done = True
    return self.data # <======== want to refer to the "current" object
Is it possible to refer to the "current" object by anything other than its name?
Something like "this", "self", or "memememe!" is what I'm looking for.

I don't understand why you want to do this, but it's what a fixed point combinator allows you to do:
import functools
def Y(f):
    @functools.wraps(f)
    def Yf(*args):
        return inner(*args)
    inner = f(Yf)
    return Yf
@Y
def get_data(f):
    def inner_get_data(*args):
        # This is your real get data function
        # define it as normal
        # but just refer to it as 'f' inside itself
        print 'setting get_data.foo to', args
        f.foo = args
    return inner_get_data
get_data(1, 2, 3)
print get_data.foo
So you call get_data as normal, and it "magically" knows that f means itself.

You could do this, but (a) the data is not per-function-invocation, but per function (b) it's much easier to achieve this sort of thing with a class.
If you had to do it, you might do something like this:
def ybother(a,b,c,yrselflambda = lambda: ybother):
    yrself = yrselflambda()
    #other stuff
The lambda is necessary, because you need to delay evaluation of the term ybother until something has been bound to it.
Alternatively, and increasingly pointlessly:
from functools import partial
def ybother(a,b,c,yrself=None):
    #whatever
    yrself.data = [] # this will blow up if the default argument is used
    #more stuff
bothered = partial(ybother, yrself=ybother)
Or:
def unbothered(a,b,c):
    def inbothered(yrself):
        #whatever
        yrself.data = []
    return inbothered, inbothered(inbothered)
This last version gives you a different function object each time, which you might like.
There are almost certainly introspective tricks to do this, but they are even less worthwhile.

Not sure what doing it like this gains you, but what about using a decorator?
import functools
def add_self(f):
    @functools.wraps(f)
    def wrapper(*args,**kwargs):
        if not getattr(f, 'content', None):
            f.content = []
        return f(f, *args, **kwargs)
    return wrapper
@add_self
def example(self, arg1):
    self.content.append(arg1)
    print self.content
example(1)
example(2)
example(3)
OUTPUT
[1]
[1, 2]
[1, 2, 3]
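For the paging function from the question, a Python 3 variation on this decorator might look like the sketch below. It passes the wrapper itself (rather than the undecorated function) as self, so attributes end up on the public name get_data. The PAGES list and the body of get_more_starting_from are stand-ins invented for the example, since the real API call is not shown in the question.

```python
import functools


def add_self(func):
    """Pass the decorated function object in as the first argument."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Hand over `wrapper`, i.e. the object bound to the public name.
        return func(wrapper, *args, **kwargs)
    return wrapper


# Stand-in for the real API: three "pages", the last one empty.
PAGES = [["a", "b"], ["c"], []]


def get_more_starting_from(offset):
    return PAGES[offset] if offset < len(PAGES) else []


@add_self
def get_data(self, offset=0):
    self.data = []
    while True:
        data = get_more_starting_from(offset)
        if not data:
            break
        self.data.extend(data)
        offset += 1
    return self.data


everything = get_data()
```

After the call, get_data.data holds the same list that was returned. As the other answers point out, this state is per function rather than per call, so a class is usually the clearer tool.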

url builder for python

I know about urllib and urlparse, but I want to make sure I wouldn't be reinventing the wheel.
My problem is that I am going to be fetching a bunch of urls from the same domain via the urllib library. I basically want to be able to generate urls to use (as strings) with different paths and query params. I was hoping that something might have a syntax like:
url_builder = UrlBuilder("some.domain.com")
# should give me "http://some.domain.com/blah?foo=bar"
url_i_need_to_hit = url_builder.withPath("blah").withParams("foo=bar") # maybe a ".build()" after this
Basically I want to be able to store defaults that get passed to urlparse.urlunsplit instead of constantly clouding up the code by passing in the whole tuple every time.
Does something like this exist? Do people agree it's worth throwing together?

Are you proposing an extension to http://docs.python.org/library/urlparse.html#urlparse.urlunparse that would substitute into the 6-item tuple?
Are you talking about something like this?
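from urlparse import urlunparse
def myUnparse( someTuple, scheme=None, netloc=None, path=None, params=None, query=None, fragment=None ):
    parts = list( someTuple )
    if scheme is not None: parts[0] = scheme
    if netloc is not None: parts[1] = netloc
    if path is not None: parts[2] = path
    if params is not None: parts[3] = params
    if query is not None: parts[4] = query
    if fragment is not None: parts[5] = fragment
    return urlunparse( parts )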
Is that what you're proposing?
This?
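from urlparse import urlparse, urlunparse
class URLBuilder( object ):
    def __init__( self, base ):
        self.parts = list( urlparse(base) )
    def __call__( self, scheme=None, netloc=None, path=None, params=None, query=None, fragment=None ):
        if scheme is not None: self.parts[0] = scheme
        if netloc is not None: self.parts[1] = netloc
        if path is not None: self.parts[2] = path
        if params is not None: self.parts[3] = params
        if query is not None: self.parts[4] = query
        if fragment is not None: self.parts[5] = fragment
        return urlunparse( self.parts )
bldr = URLBuilder( someURL )
print bldr( scheme="ftp" )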
Something like that?

You might want to consider having a look at furl, because it might be an answer to your needs.

Still not quite sure what you're looking for... But I'll give it a shot. If you're just looking to make a class that will keep your default values and such, it's simple enough to make your own class and use Python magic methods like __str__. Here's a scratched-out example (suboptimal):
class UrlBuilder:
    def __init__(self,domain,path="blah",params="foo=bar"):
        self.domain = domain
        self.path = path
        self.params = params
    def withPath(self,path):
        self.path = path
        return self
    def withParams(self,params):
        self.params = params
        return self
    def __str__(self):
        return 'http://' + self.domain + '/' + self.path + '?' + self.params
        # or return urlparse.urlunparse( ( "http", self.domain, self.path, "", self.params, "" ) )
    def build(self):
        return self.__str__()
if __name__ == '__main__':
    u = UrlBuilder('www.example.com')
    print u.withPath('bobloblaw')
    print u.withParams('lawyer=yes')
    print u.withPath('elvis').withParams('theking=true')
If you're looking for more of the Builder Design Pattern, the Wikipedia article has a reasonable Python example (as well as Java).
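For Python 3, where urlparse has become urllib.parse, a stdlib-only sketch of the same builder idea could look like this. The snake_case method names and the keyword-argument form of with_params are choices made for the illustration, not an existing library API.

```python
from urllib.parse import urlencode, urlunsplit


class UrlBuilder:
    """Keeps scheme and domain as defaults; path and query are set per URL."""

    def __init__(self, domain, scheme="http"):
        self.scheme = scheme
        self.domain = domain
        self.path = ""
        self.params = {}

    def with_path(self, path):
        self.path = "/" + path.lstrip("/")
        return self  # return self so calls can be chained

    def with_params(self, **params):
        self.params = params
        return self

    def build(self):
        # urlunsplit takes (scheme, netloc, path, query, fragment).
        query = urlencode(self.params)
        return urlunsplit((self.scheme, self.domain, self.path, query, ""))


url = UrlBuilder("some.domain.com").with_path("blah").with_params(foo="bar").build()
```

Using urlencode for the query string means values get percent-encoded, which string concatenation would not do.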

I think you want http://pythonhosted.org/uritools/.
Example from the docs:
from uritools import urisplit, uriunsplit
parts = urisplit('foo://user@example.com:8042/over/there?name=ferret#nose')
orig_uri = uriunsplit(parts)
The split value is a named tuple, not a regular list. It is accessible by name or index:
assert(parts[0] == parts.scheme)
assert(parts[1] == parts.authority)
assert(parts[2] == parts.path)
assert(parts[3] == parts.query)
assert(parts[4] == parts.fragment)
Make a copy to make changes:
new_parts = [part for part in parts]
new_parts[2] = "/some/other/path"
new_uri = uriunsplit(new_parts)
