I have a simple script that needs to poll some data (for example, weather data) every 24 hours using cron and dump the output to a DB.
What would be the most Pythonic way of implementing this?
I do, however, plan to use a WeatherRecord class inside the parsing step; it is responsible for cleaning the raw input and converting it to the required format.
The second method below just feels like wrapping "main" in a class without adding any benefit.
Thanks for any comments
Function version:
file: bin/import-weather
from weather.weather import import_weather

def main():
    import_weather()

if __name__ == "__main__":
    main()
file: weather/weather.py
def import_weather():
    raw_weather = get_weather('http://weather.is.here/')
    parsed_weather = parse_weather(raw_weather)
    db_connection = DbConnection.getClient()
    write_weather_to_db(parsed_weather, db_connection)
Class version:
file: bin/import-weather
from weather.weather import WeatherImporter

def main():
    weather = WeatherImporter()
    weather.parse_weather()
    weather.write_to_db()

if __name__ == "__main__":
    main()
file: weather/weather.py
class WeatherImporter(object):
    def __init__(self):
        self.db_connection = DbConnection.getClient()
        self.url = 'http://weather.is.here/'
        self.raw_weather = self._get_weather()  # list of dict
        self.parsed_weather = None  # list of WeatherRecord

    def _get_weather(self):
        print(self.url)
        return ["{JSON Data}"]

    def parse_weather(self):
        self.parsed_weather = self.raw_weather

    def write_to_db(self):
        print(self.parsed_weather)
        pass
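As one possible reading of the function version, here is a minimal sketch that keeps import_weather() a plain function and lets a WeatherRecord class do the cleaning. The field names (city, temp_c), the raw keys, the from_raw helper, and the fetch/write parameters are all assumptions made for illustration; they stand in for get_weather and write_weather_to_db so the example runs without a network or a database.

```python
from dataclasses import dataclass


@dataclass
class WeatherRecord:
    """Hypothetical record type; the real fields are not given in the question."""
    city: str
    temp_c: float

    @classmethod
    def from_raw(cls, raw):
        # Clean one raw dict into the format the DB layer expects.
        return cls(city=raw["city"].strip().title(), temp_c=float(raw["temp"]))


def parse_weather(raw_weather):
    # Turn a list of raw dicts into a list of WeatherRecord objects.
    return [WeatherRecord.from_raw(item) for item in raw_weather]


def import_weather(fetch, write):
    # fetch and write are injected callables (assumed stand-ins for
    # get_weather and write_weather_to_db), which keeps this testable.
    records = parse_weather(fetch())
    write(records)
    return records


stored = []
import_weather(lambda: [{"city": " oslo ", "temp": "3.5"}], stored.extend)
print(stored)
```

The state that is worth a class here is the record itself; the import step has no state of its own, so a function is enough.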
I am trying to call the sum_method function of my evaluation class from my main class, but I run into many errors. I want to use new_data as the data that sum_method operates on.
evaluation class:
class evaluation():
    def __init__(self, data):
        self.data = data
    def sum_method(self):
        montant_init = self.data.loc[self.data['Initiateur'] == 'Glovoapp', 'Montant (centimes)'].sum()
        print(montant_init)
main class:
class main(evaluation):
    new_data.to_csv("transactions.csv", index=False)
    self.data = new_data
    def call_sum(self, new_data):
        init_eval = evaluation.sum_method(self=new_data)
        print(init_eval)
init_evalobj = main()
init_evalobj.call_sum()
If you call the method from inside your inheriting class, just call it on self:
init_eval = self.sum_method()
Python passes self automatically as the first parameter.
Update:
You should also return a value, otherwise init_eval will be None:
def sum_method(self):
    montant_init = self.data.loc[self.data['Initiateur'] == 'Glovoapp', 'Montant (centimes)'].sum()
    print(montant_init)
    return montant_init
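To illustrate how self is passed automatically, here is a small sketch that uses a plain list instead of a DataFrame. The class names Evaluation and Main and the sample numbers are made up for this example.

```python
class Evaluation:
    def __init__(self, data):
        self.data = data

    def sum_method(self):
        # Return the result instead of only printing it.
        return sum(self.data)


class Main(Evaluation):
    def call_sum(self):
        # The inherited method is reached through self; no argument needed.
        return self.sum_method()


obj = Main([100, 250, 50])
via_instance = obj.call_sum()
# Equivalent explicit form: the instance is what gets bound to self.
via_class = Evaluation.sum_method(obj)
print(via_instance, via_class)
```

Both calls give the same result, because obj.sum_method() is shorthand for Evaluation.sum_method(obj).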
I'd suggest making some changes to both classes to encapsulate the .data member variable in the base class. My preference would also be to separate the calculation from the display, so keep all the print statements in the call_sum() function.
import pandas as pd

class evaluation:
    def __init__(self, data):
        self.data = data
    def sum_method(self):
        # Sum the amounts of all rows whose initiator is 'Glovoapp'
        montant_init = self.data.loc[self.data['Initiateur'] == 'Glovoapp', 'Montant (centimes)'].sum()
        return montant_init

class main(evaluation):
    def __init__(self):
        # Reduce csv content to what's needed for analysis
        data_csv = pd.read_csv('transactions.csv')
        # --> removing unnecessary data
        new_data = data_csv[['Opération', 'Initiateur', 'Montant (centimes)', 'Monnaie',
                             'Date', 'Résultat', 'Compte marchand', 'Adresse IP Acheteur', 'Marque de carte']]
        # --> saving changes...
        new_data.to_csv("transactions.csv", index=False)
        super().__init__(new_data)  # Initialize the base class
    def call_sum(self):
        print('Glovoapp "montant" generated')
        init_eval = self.sum_method()  # Call the method from the base class
        print(init_eval)
So I am not even sure if what I want to do is possible but I thought I would ask and find out.
I want to build a chef "databag" via python. This is pretty much just a python dictionary. There are other things that need to happen with this databag that are encapsulated in the Databag class.
Now for the meat of the question...
I want to add key/values to this dictionary but need to build it in a way that is easily extensible. NOTE: the autodict is a class that makes it so you can build a dictionary using dot notation.
Here is what I am trying to do:
databag = Databag(
    LogGroup=Sub("xva-${environment}-${uniqueid}-mygroup"),
    RunList=[
        "mysetup::default",
        "consul::client"
    ]
)
databag.Consul()  # trying to add consul key/values to databag
print(databag.to_dict())
print(databag.to_string_list())
So you can see how I add the "consul" key values to the already existing databag object.
Here are the class definitions. I know this is wrong which is why I am here to see if this is even possible.
Databag Class
class Databag(object):
    def __init__(self, uniqueid=Ref("uniqueid"), environment=Ref("environment"), LogGroup=None, RunList=[]):
        self.databag = autodict()
        self.databag.uniqueid = uniqueid
        self.databag.environment = environment
        self.databag.log.group = LogGroup
        self.runlist = RunList
    def to_string_list(self):
        return self.convert_databag_to_string(self.databag)
    def to_dict(self):
        return self.databag
    def get_runlist(self):
        return self.convert_to_runlist_string(self.runlist)
Consul Class
class Consul(Databag):
    def __init__(self, LogGroup=None):
        if LogGroup == None:
            Databag.consul.log.group = Databag.log.group
        else:
            Databag.consul.log.group = LogGroup
As you can see, the Consul class is supposed to access the databag dictionary of the Databag class and add the "consul" variables, almost like an attribute. However, I don't want to add a new function to the Databag class every time; otherwise that class will end up being very, very large.
I was able to get something like this to work with the following method, although I am open to any suggestions. I just read the help posted at this link:
http://www.qtrac.eu/pyclassmulti.html
EDIT: This method is a lot easier:
Note: This uses the exact same implementation of the old method.
consul.py
from classes.databag.utils import *
class Consul:
    def Consul(self, LogGroup=None):
        # Default to the log group already set on the databag
        if LogGroup is None:
            self.databag.consul.log.group = self.databag.log.group
        else:
            self.databag.consul.log.group = LogGroup
databag.py
from classes.databag.utils import autodict
from classes.databag import consul
class Databag(consul.Consul):
    def __init__(self, uniqueid=Ref("uniqueid"), environment=Ref("environment"), LogGroup=None, RunList=[]):
        self.databag = autodict()
        self.databag.uniqueid = uniqueid
        ...
        ...
Folder Structure
/classes/
    databag/
        utils.py
        databag.py
        consul.py
testing.py
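A self-contained sketch of the mixin idea, collapsed into one file so it can be run directly. AutoDict is a hypothetical stand-in for the autodict helper (whose implementation is not shown here), ConsulMixin is a name chosen for this example, and the Ref/Sub arguments are replaced by plain strings.

```python
class AutoDict(dict):
    """Assumed stand-in for autodict: attribute access creates nested dicts."""

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        if name.startswith("__"):
            raise AttributeError(name)
        return self.setdefault(name, AutoDict())

    def __setattr__(self, name, value):
        self[name] = value


class ConsulMixin:
    """Adds consul-related keys to whatever class owns self.databag."""

    def Consul(self, LogGroup=None):
        if LogGroup is None:
            LogGroup = self.databag.log.group
        self.databag.consul.log.group = LogGroup


class Databag(ConsulMixin):
    def __init__(self, LogGroup=None, RunList=None):
        self.databag = AutoDict()
        self.databag.log.group = LogGroup
        # Copy the list so instances never share a default.
        self.runlist = list(RunList or [])

    def to_dict(self):
        return self.databag


databag = Databag(LogGroup="my-group", RunList=["mysetup::default"])
databag.Consul()
print(databag.to_dict())
```

Each new feature can live in its own mixin module, and Databag only has to list it among its base classes.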
---- OLD METHOD -----
How I implemented it
from classes.databag.databag import *
databag = Databag(
    LogGroup=Sub("xva-${environment}-${uniqueid}-traefik"),
    RunList=[
        "mysetup::default",
        "consul::client"
    ]
)
databag.Consul()
print(databag.to_dict())
print(databag.to_string_list())
lib.py
def add_methods_from(*modules):
    def decorator(Class):
        for module in modules:
            for method in getattr(module, "__methods__"):
                setattr(Class, method.__name__, method)
        return Class
    return decorator
def register_method(methods):
    def register_method(method):
        methods.append(method)
        return method
    return register_method
databag.py
from classes.databag import lib, consul
@lib.add_methods_from(consul)
class Databag(object):
    def __init__(self, uniqueid=Ref("uniqueid"), environment=Ref("environment"), LogGroup=None, RunList=[]):
        self.databag = autodict()
        self.databag.uniqueid = uniqueid
        ....
        ....
consul.py
from classes.databag import lib
__methods__ = []
register_method = lib.register_method(__methods__)
@register_method
def Consul(self, LogGroup=None):
    if LogGroup is None:
        self.databag.consul.log.group = self.databag.log.group
    else:
        self.databag.consul.log.group = LogGroup
Folder Structure
/classes/
    databag/
        lib.py
        databag.py
        consul.py
        utils.py
testing.py
I'm new to python and the main() method and class def's are confusing me. I'm trying to create a bloom filter and my program keeps terminating because I don't think I'm calling things correctly.
class BloomFilter(object):
    def __init__(self, numBits, numHashFunctions):
        self.numBits = numBits
        self.bitArray = [0] * numBits
        self.hash = bloomFilterHash(numBits, numHashFunctions)
    def insert(self, key):
        ...
    def lookup(self, key):
        ...
    def rand_inserts(self, num):
        ...
    def main():  # not sure if I should put this inside or outside the class
        bloomfilter = BloomFilter(100, 5)
        bloomfilter.rand_inserts(15)
if __name__ == '__main__':
    BloomFilter().main()
So if I wanted to create a Bloom filter with 100 bits (numBits) and 5 hash functions, should I do that under if __name__ == '__main__' or under def main()? I'm not sure I'm calling these correctly, as I'm much more familiar with Java. Thanks!
def main():
    bloomfilter = BloomFilter(100, 5)
    bloomfilter.rand_inserts(15)
The if __name__ == '__main__' clause makes sure your code only runs when the module is executed directly and not, for instance, when you import something from it in another module. main() is not a special method for a Python class, so I believe your objective here, in a simplified way, is the following:
class BloomFilter(object):
    def __init__(self, numBits, numHashFunctions):
        self.numBits = numBits
        self.bitArray = [0] * numBits
        self.hash = bloomFilterHash(numBits, numHashFunctions)
if __name__ == '__main__':
    # creates an instance of the class
    bloomfilter = BloomFilter(100, 5)
    # apply some method to instance...
    bloomfilter.rand_inserts(15)
You would want to put main() outside the class:
class BloomFilter(object):
    def __init__(self, numBits, numHashFunctions):
        self.numBits = numBits
        self.bitArray = [0] * numBits
        self.hash = bloomFilterHash(numBits, numHashFunctions)
    def insert(self, key):
        ...
    def lookup(self, key):
        ...
    def rand_inserts(self, num):
        ...
def main():
    some_value = BloomFilter(100, 5)
    some_value.rand_inserts(15)
main()
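For a version that actually runs, here is a minimal sketch with main() at module level behind the __name__ guard. The question's bloomFilterHash helper is not shown, so this sketch substitutes salted SHA-256 digests from hashlib; that choice, the snake_case names, and the sample keys are assumptions made for the example.

```python
import hashlib


class BloomFilter:
    def __init__(self, num_bits, num_hash_functions):
        self.num_bits = num_bits
        self.num_hash_functions = num_hash_functions
        self.bit_array = [0] * num_bits

    def _positions(self, key):
        # Derive one bit position per hash function by salting the key.
        for i in range(self.num_hash_functions):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).hexdigest()
            yield int(digest, 16) % self.num_bits

    def insert(self, key):
        for pos in self._positions(key):
            self.bit_array[pos] = 1

    def lookup(self, key):
        # True means "possibly present"; False means "definitely absent".
        return all(self.bit_array[pos] for pos in self._positions(key))


def main():
    # Module-level function, not a method: it creates and uses an instance.
    bloom = BloomFilter(100, 5)
    for word in ("apple", "banana"):
        bloom.insert(word)
    print(bloom.lookup("apple"))
    print(bloom.lookup("cherry"))


if __name__ == "__main__":
    main()
```

Unlike Java, nothing has to live inside a class; the guard simply decides whether main() runs when the file is executed rather than imported.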
I'd like to write a class that reads a *.csv file and parses it using the pandas library. I'm wondering where I should initialize df.
#!/usr/bin/env python
import pandas as pd
import os
class ParseDataBase(object):
    def __init__(self, name_file):
        self.name_file = name_file
    def read_file(self):
        """Read the file content"""
        try:
            self.df = pd.read_csv(self.name_file)
        except IndexError:
            print ("Error: Wrong file name")
            sys.exit(2)
        return self.df
    def dispaly_file(self):
        print self.df
def main():
    x = ParseDataBase('something.csv')
    x.dispaly_file()
if __name__ == '__main__':
    main()
The above code returns the following error: 'ParseDataBase' object has no attribute 'df'.
I don't want to pass too many variables while creating the object.
I'm new to object oriented programming, so any comments and hints are highly appreciated!
You aren't assigning self.df unless you run read_file(), which you aren't.
def main():
    x = ParseDataBase('something.csv')
    x.read_file()
    x.dispaly_file()
The attribute df gets assigned in the read_file method. You are trying to access that attribute before it exists.
I'd do this:
#!/usr/bin/env python
import sys
import pandas as pd
class ParseDataBase(object):
    def __init__(self, name_file):
        self.name_file = name_file
        # Change I made: read the file in __init__ so self.df always exists.
        self.df = self.read_file()
    def read_file(self):
        """Read the file content"""
        try:
            self.df = pd.read_csv(self.name_file)
        except IOError:  # raised when the file cannot be opened
            print("Error: Wrong file name")
            sys.exit(2)
        return self.df
    def dispaly_file(self):
        print(self.df)
def main():
    x = ParseDataBase('something.csv')
    x.dispaly_file()
if __name__ == '__main__':
    main()
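The underlying point, that an attribute does not exist until the code assigning it has run, can be seen without pandas. The Loader class and its rows attribute below are made up for this sketch.

```python
class Loader:
    def __init__(self, name):
        self.name = name  # set at construction time, always available

    def load(self):
        # self.rows only comes into existence when load() is called.
        self.rows = ["row 1", "row 2"]
        return self.rows


loader = Loader("something.csv")
print(hasattr(loader, "rows"))  # checked before load() has run

try:
    loader.rows
except AttributeError as exc:
    print("AttributeError:", exc)

loader.load()
print(loader.rows)
```

Assigning every attribute in __init__, even if only to None, avoids this kind of surprise.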
I am a total python beginner and I have a variable created in a class of a file commandline_reader.py that I want to access from another script. I tried to do it by making the variable global, which doesn't work.
myscript.py:
from commandline_reader import Commandline_Reader
reader = Commandline_Reader('--get_serial_number')
reader.run()
print output
commandline_reader.py:
class Commandline_Reader:
    def __init__(self, argString=''):
        global output
        output = []
    def run(self):
        # do stuff
        a = 'somevariable'
        output.append(a)
When I run myscript.py I always get a NameError: name 'output' is not defined. I've read that this is because global variables are only defined within a module. How do I correctly access the output variable in my script?
Ouch. One of the main reasons for using object-oriented programming is to avoid global variables. Make them instance variables so that they can be accessed anywhere in the class and through any reference to the instance.
class Commandline_Reader:
    def __init__(self, argString=''):
        self.output = []
    def run(self):
        # do stuff
        a = 'somevariable'
        # output is now part of the Commandline_Reader instance and can be accessed anywhere inside the class
        self.output.append(a)
clr = Commandline_Reader(argString='--get_serial_number')
clr.run()
print(clr.output)
# prints ['somevariable']
Make output an instance attribute:
class Commandline_Reader:
    def __init__(self, argString=''):
        self.output = []  # note use of self here
    def run(self):
        # do stuff
        a = 'somevariable'
        self.output.append(a)  # and here
Then access it via the instance:
print(reader.output)
Maybe a class attribute is more appropriate for you?
class Commandline_Reader:
    output = []
    def run(self):
        # do stuff
        a = 'somevariable'
        self.output.append(a)
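One thing to keep in mind with a class attribute is that a mutable value such as a list is shared by every instance. The sketch below contrasts the two variants; the class names SharedReader and PerInstanceReader are made up for the comparison.

```python
class SharedReader:
    output = []  # class attribute: a single list shared by all instances

    def run(self):
        self.output.append("somevariable")


class PerInstanceReader:
    def __init__(self):
        self.output = []  # instance attribute: a fresh list per object

    def run(self):
        self.output.append("somevariable")


a, b = SharedReader(), SharedReader()
a.run()
print(b.output)  # b sees what a appended

c, d = PerInstanceReader(), PerInstanceReader()
c.run()
print(d.output)  # d is unaffected by c
```

If each reader should collect its own results, the instance attribute is the safer choice.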
Just return the value from the run() method:
myscript.py:
from commandline_reader import Commandline_Reader
reader = Commandline_Reader('--get_serial_number')
output = reader.run()
print(output)
commandline_reader.py:
class Commandline_Reader:
    def __init__(self, argString=''):
        self.output = []
    def run(self):
        # do stuff
        a = 'somevariable'
        self.output.append(a)
        return self.output