2 responses from 2 functions into new function - python

I'm currently writing a short bit of code that compares the ETag of a web page, saved in a local document, with the ETag the server currently reports. If they differ, the code should say so. My code is below:
import httplib
def currentTag():
    f = open('C:/Users/ME/Desktop/document.txt')
    e = f.readline()
    newTag(e)
def newTag(old_etag):
    c = httplib.HTTPConnection('standards.ieee.org')
    c.request('HEAD', '/develop/regauth/oui/oui.txt')
    r = c.getresponse()
    current_etag = r.getheader('etag').replace('"', '')
    compareTag(old_etag, current_etag)
def compareTag(old_etag, current_etag):
    if old_etag == current_etag:
        print "the same"
    else:
        print "different"
if __name__ == '__main__':
    currentTag()
Now, reviewing my code, there is actually no reason to pass the saved etag from currentTag() to newTag(), given that it is not processed in newTag(). But if I don't do this, how can I pass two different values to compareTag()? For example, when defining compareTag(), how can I pass the saved etag from currentTag() and current_etag from newTag()?

You shouldn't chain your function calls like that. Have a main block of code that calls the functions serially, like so:
if __name__ == '__main__':
    currtag = currentTag()
    newtag = newTag()
    compareTag(currtag, newtag)
Adjust your functions to return the relevant data.
The basic idea of a function is that it returns data: you usually use functions to do some processing and return a value, not for control flow.
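To make the "return the relevant data" advice concrete, here is a minimal sketch of the three functions rewritten so that each one returns a value. It is written for Python 3, where http.client replaces httplib. The function names, the parameters, and the .strip() on the saved line (to drop a trailing newline) are my own assumptions, not part of the original code.

```python
import http.client


def read_saved_etag(path):
    """Return the etag stored on the first line of the saved document."""
    with open(path) as f:
        # strip() is an assumption: it drops the trailing newline so the
        # comparison is not thrown off by it.
        return f.readline().strip()


def fetch_current_etag(host, resource):
    """Ask the server for the resource's headers and return its etag."""
    conn = http.client.HTTPConnection(host)
    try:
        conn.request('HEAD', resource)
        etag = conn.getresponse().getheader('etag') or ''
    finally:
        conn.close()
    return etag.replace('"', '')


def compare_tags(old_etag, current_etag):
    """Return a message instead of printing, so the caller decides what to do."""
    return "the same" if old_etag == current_etag else "different"


# Typical use from the main block (performs a real network request):
#     old = read_saved_etag('C:/Users/ME/Desktop/document.txt')
#     new = fetch_current_etag('standards.ieee.org', '/develop/regauth/oui/oui.txt')
#     print(compare_tags(old, new))
```

Because no function calls the next one, each can be tested or reused on its own.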

Change your main to:
if __name__ == '__main__':
    compareTag(currentTag(), newTag())
and then have currentTag() return e and newTag() return current_etag.

def checkTags():
    c = httplib.HTTPConnection('standards.ieee.org')
    c.request('HEAD', '/develop/regauth/oui/oui.txt')
    r = c.getresponse()
    with open('C:/Users/ME/Desktop/document.txt', 'r') as f:
        if f.readline() == r.getheader('etag').replace('"', ''):
            print "the same"
        else:
            print "different"

You could make the variables (i.e. etag) global.

Related

Why doesn't the value of variable from another module get propagated to other functions even after it's set?

The variable SCRIPT_ENV gets set correctly in main block but that value does not propagate to other functions.
Here's my full working code:
import argparse
import settings
from multiprocessing import Pool
def set_brokers_and_cert_path():
    brokers = None
    cert_full_path = None
    print("I am here man\n\n {0}".format(settings.SCRIPT_ENV))
    if settings.SCRIPT_ENV == "some value":
        brokers = # use these brokers
        cert_full_path = settings.BASE_CERT_PATH + "test_env/"
    if settings.SCRIPT_ENV == "some other value":
        brokers = # use those brokers
        cert_full_path = settings.BASE_CERT_PATH + "new_env/"
    return brokers, cert_full_path
def func_b(partition):
    kafka_brokers, cert_full_path = set_brokers_and_cert_path()
    producer = KafkaProducer(bootstrap_servers=kafka_brokers,
                             security_protocol='SSL',
                             ssl_check_hostname=True,
                             ssl_cafile=cert_full_path +'cacert.pem',
                             ssl_certfile=cert_full_path + 'certificate.pem',
                             ssl_keyfile=cert_full_path + 'key.pem',
                             max_block_ms=1200000,
                             value_serializer=lambda v: json.dumps(v).encode('utf-8'),
                             key_serializer=str.encode
                             )
    try:
        producer.send(settings.KAFKA_TOPIC,
                      value="some val",
                      key="some key",
                      timestamp_ms=int(time.time()),
                      headers=[some headers],
                      partition=partition)
        producer.flush()
    except AssertionError as e:
        print("Error in partition: {0}, {1}".format(partition, e))
def main():
    with Pool(settings.NUM_PROCESSES) as p:
        p.map(func_b, [i for i in range(0, 24)])
if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--env", help="Environment against which the script needs to run")
    args = parser.parse_args()
    if args.env:
        settings.SCRIPT_ENV = args.env
        main()
    else:
        raise Exception("Please pass env argument. Ex: --env test/accept")
In the line "I am here man", it prints None as the value of SCRIPT_ENV.
Here, SCRIPT_ENV gets set perfectly in the if __name__ == "__main__" block, but in func_a, it comes as None.
contents of settings.py:
KAFKA_TOPIC = "some topic"
NUM_PROCESSES = 8
NUM_MESSAGES = 1000
SCRIPT_ENV = None
NUM_PARTITIONS = 24
TEST_BROKERS = [some brokers]
ACCEPT_BROKERS = [some brokers]
BASE_CERT_PATH = "base path"
I run it like this:
python <script.py> --env <value>
A few references are missing from your code example, so it is not executable as posted. First, we should find the program parts that lead to the behavior you describe.
I do not think argparse is crucial here. Basically, you change an attribute on the settings module and call main(), so the last part of the program can be reduced to a few lines:
settings.SCRIPT_ENV = "test"
main()
Function main() distributes func_b() over multiple processes. Function func_b() itself performs the following steps:
1. Call set_brokers_and_cert_path() (which you earlier named func_a()).
2. Create an instance of KafkaProducer (producer).
3. Send some data via producer.
I cannot see that the created instance has any effect on the content of settings.SCRIPT_ENV, so I reduce func_b() as well and leave only the first function call (for simplicity I use the original name func_a()).
Function func_a() (aka set_brokers_and_cert_path()) likewise has no apparent influence on settings.SCRIPT_ENV, since it only generates a few strings.
For debugging purposes I left the output of the text variables in the function.
Altogether, I come up with the following executable minimal example, which does not reproduce the problem you describe for me (see output).
You can take this example, transfer it piece by piece back to your code, and then see which part of the program leads to the problem.
import settings
from multiprocessing import Pool
def func_a():
    print("I am here man\n{0}\n".format(settings.SCRIPT_ENV))
def func_b(partition):
    func_a()
def main():
    with Pool(settings.NUM_PROCESSES) as p:
        p.map(func_b, [i for i in range(0, 24)])
settings.SCRIPT_ENV = "Test"
main()
Output
I am here man
Test
I am here man
Test
I am here man
Test
I am here man
Test
...
@RajatBhardwaj, at this point, it is up to you to decide whether you want to participate in finding a solution.
All others who are not interested in a solution would do well not to criticize solutions without having understood their intention.
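One possible explanation, which the thread does not confirm: where multiprocessing starts workers with the "spawn" method (the default on Windows and macOS), each worker imports the modules afresh and does not run the if __name__ == "__main__" block. A value assigned there would then never reach the workers, and settings.SCRIPT_ENV would still be None inside them. A minimal sketch of one way around this is to hand the value to every worker through the Pool initializer. The module-level SCRIPT_ENV below stands in for settings.SCRIPT_ENV, and init_worker and the simplified func_b are illustrative names of my own.

```python
from multiprocessing import Pool

# Stand-in for settings.SCRIPT_ENV; every process has its own copy.
SCRIPT_ENV = None


def init_worker(env):
    """Runs once in each worker process and stores the value there."""
    global SCRIPT_ENV
    SCRIPT_ENV = env


def func_b(partition):
    """Simplified worker: report which environment this process sees."""
    return "{0}:{1}".format(SCRIPT_ENV, partition)


def main(env, num_processes=4, num_partitions=24):
    # initializer/initargs make sure the value is set inside every worker,
    # regardless of how the worker processes are started.
    with Pool(num_processes, initializer=init_worker, initargs=(env,)) as p:
        return p.map(func_b, range(num_partitions))


# Typical use, guarded so that worker processes do not re-run it:
#     if __name__ == "__main__":
#         print(main("test"))
```

Passing the environment as an explicit argument to func_b would work as well; the initializer just avoids repeating it for every task.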

How to mock a function which makes a mutation on an argument that is necessary for the caller function logic

I want to be able to mock a function that mutates an argument, where that mutation is needed for the calling code to continue executing correctly.
Consider the following code:
import os
from unittest.mock import patch
def mutate_my_dict(mutable_dict):
    if os.path.exists("a.txt"):
        mutable_dict["new_key"] = "new_value"
        return True
def function_under_test():
    my_dict = {"key": "value"}
    if mutate_my_dict(my_dict):
        return my_dict["new_key"]
    return "No Key"
def test_function_under_test():
    with patch("stack_over_flow.mutate_my_dict") as mutate_my_dict_mock:
        mutate_my_dict_mock.return_value = True
        result = function_under_test()
    assert result == "new_value"
**Please understand I know I can just mock os.path.exists in this case, but this is just an example. I intentionally want to mock the function and not the external module.**
I also read the docs here:
https://docs.python.org/3/library/unittest.mock-examples.html#coping-with-mutable-arguments
But it doesn't seem to fit in my case.
This is the test I've written so far, but it obviously doesn't work, since the mock never adds new_key to the dict:
def test_function_under_test():
    with patch("stack_over_flow.mutate_my_dict") as mutate_my_dict_mock:
        mutate_my_dict_mock.return_value = True
        result = function_under_test()
    assert result == "new_value"
Thanks in advance for all of your time :)
With the help of Peter I managed to come up with this final test:
def mock_mutate_my_dict(my_dict):
    my_dict["new_key"] = "new_value"
    return True
def test_function_under_test():
    with patch("stack_over_flow.mutate_my_dict") as mutate_my_dict_mock:
        mutate_my_dict_mock.side_effect = mock_mutate_my_dict
        result = function_under_test()
    assert result == "new_value"
This works because side_effect lets you run a function of your own in place of the patched function.
That replacement function needs to both apply the mutations the caller relies on and return the value the caller expects.
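The same mechanism can be seen in isolation, without patching a module. The sketch below builds a Mock directly, with illustrative names of my own (fake_mutate, mutate_mock), and shows that the side_effect function both mutates the argument and supplies the return value, while the mock still records the call.

```python
from unittest.mock import Mock


def fake_mutate(mutable_dict):
    """Stand-in for the real function: apply the mutation, return the flag."""
    mutable_dict["new_key"] = "new_value"
    return True


# The mock delegates every call to fake_mutate and returns its result.
mutate_mock = Mock(side_effect=fake_mutate)

my_dict = {"key": "value"}
result = mutate_mock(my_dict)

# The argument was mutated in place and the call was recorded by the mock.
mutate_mock.assert_called_once_with(my_dict)
print(result, my_dict)
```

Keeping the mock around the fake function, rather than swapping in the fake alone, preserves assertions such as assert_called_once_with.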

multiprocessing a function with parameters that are iterated through

I'm trying to improve the speed of my program and I decided to use multiprocessing!
The problem is I can't seem to find any way to use Pool (I think this is what I need) with my function.
Here is the code that I am dealing with:
def dataLoading(output):
    name = ""
    link = ""
    upCheck = ""
    isSuccess = ""
    for i in os.listdir():
        with open(i) as currentFile:
            data = json.loads(currentFile.read())
            try:
                name = data["name"]
                link = data["link"]
                upCheck = data["upCheck"]
                isSuccess = data["isSuccess"]
            except:
                print("error in loading data from config: improper naming or formating used")
            output[name] = [link, upCheck, isSuccess]
#working
def userCheck(link, user, isSuccess):
    link = link.replace("<USERNAME>", user)
    isSuccess = isSuccess.replace("<USERNAME>", user)
    html = requests.get(link, headers=headers)
    page_source = html.text
    count = page_source.count(isSuccess)
    if count > 0:
        return True
    else:
        return False
I have a parent function to run these two together, but I don't think I need to show the whole thing, just the part that gets the data iteratively:
for i in configData:
    data = configData[i]
    link = data[0]
    print(link)
    upCheck = data[1] #just for future use
    isSuccess = data[2]
    if userCheck(link, username, isSuccess) == True:
        good.append(i)
You can see how I enter all of the data in there. How would I be able to use multiprocessing to do this when I am iterating through the dictionary to collect multiple parameters?
I like to use mp.Pool().map. I think it is the easiest and most straightforward option, and it handles most multiprocessing cases. So how does map work? For starters, we have to keep in mind that mp creates workers, and each worker receives a copy of the namespace (yes, the whole thing); then each worker works on what it is assigned and returns. Hence, doing something like "updating a global variable" while they work doesn't work, since each worker receives its own copy of the global variable and the workers do not communicate with each other. (If you want communicating workers you need to use mp.Queue and the like, and it gets complicated.) Anyway, here is how to use map:
from multiprocessing import Pool
t = 'abcd'
def func(s):
    return t[int(s)]
results = Pool().map(func,range(4))
Each worker received a copy of t, func, and the portion of range(4) they were assigned. They are then automatically tracked and everything is cleaned up in the end by Pool.
Something like your dataLoading won't work very well, we need to modify it. I also cleaned the code a little.
def loadfromfile(file):
    data = json.loads(open(file).read())
    items = [data.get(k,"") for k in ['name','link','upCheck','isSuccess']]
    return items[0],items[1:]
output = dict(Pool().map(loadfromfile,os.listdir()))
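For the second half of the question, calling userCheck with several parameters per item, one option is Pool.starmap, which unpacks each tuple of arguments into a separate call. The sketch below is only an illustration under assumptions: build_tasks and find_good are hypothetical helper names, and check is expected to be a module-level function such as userCheck so that it can be sent to the workers.

```python
from multiprocessing import Pool


def build_tasks(config_data, username):
    """Turn {name: [link, upCheck, isSuccess]} into argument tuples.

    Returns the names and, in the same order, one (link, user, isSuccess)
    tuple per name, matching the userCheck(link, user, isSuccess) signature.
    """
    names = list(config_data)
    tasks = [(config_data[n][0], username, config_data[n][2]) for n in names]
    return names, tasks


def find_good(config_data, username, check, processes=4):
    """Run check(link, user, isSuccess) in parallel and keep the passing names."""
    names, tasks = build_tasks(config_data, username)
    with Pool(processes) as pool:
        results = pool.starmap(check, tasks)  # one call per tuple
    return [name for name, ok in zip(names, results) if ok]


# Typical use, guarded so that worker processes do not re-run it:
#     if __name__ == "__main__":
#         good = find_good(configData, username, userCheck)
```

Since the work in userCheck is mostly waiting on HTTP requests, a thread pool might serve just as well, but that is a separate design choice.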

Passing arguments through several functions in Python

This is a question related to best practice.
Let's say I have the following code:
def clean_text(text, idx):
    title = sqlalchemy.query("SELECT title FROM page_titles WHERE id = %s" % idx)
    return {title: text}
def scrape_webpage(idx):
    """ do some stuff here"""
    response = requests.get(link)
    return clean_text(response.text, idx)
def main():
    for idx in range(10):
        scrape_webpage(idx)
if __name__ == '__main__':
    main()
The main problem I have with this code is there is some statefulness required (namely idx) which is just being passed through scrape_webpage. You can imagine if you have many functions before clean_text() is invoked, idx has to be passed through all of them without being used in any of them.
Is there a better way that clean_text can know the state of the loop without having it passed as an argument (and incidentally, through all of the functions which use it)? Perhaps generators or callbacks? Would appreciate an example.
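One option, offered as a sketch rather than as the accepted practice, is to bind idx once in the loop with functools.partial and pass the resulting callables down, so the intermediate function never sees idx. The fake_fetch function and the "title-N" keys below are placeholders of my own for the real HTTP request and database lookup.

```python
from functools import partial


def clean_text(text, idx):
    """Needs idx; the title lookup is replaced by a placeholder string."""
    return {"title-%d" % idx: text.strip()}


def fake_fetch(idx):
    """Placeholder for the real requests.get(...) call."""
    return "  page %d  " % idx


def scrape_webpage(fetch, clean):
    """Intermediate step: knows nothing about idx, just wires callables."""
    return clean(fetch())


def main():
    results = []
    for idx in range(3):
        # idx is bound here, once, instead of being threaded through
        # every function between the loop and clean_text.
        fetch = partial(fake_fetch, idx)
        clean = partial(clean_text, idx=idx)
        results.append(scrape_webpage(fetch, clean))
    return results
```

The trade-off is that the loop now builds the callables, which keeps the middle layers free of state they do not use.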

Accessible variables at the root of a python script

I've declared a number of variables at the start of my script, as I'm using them in a number of different methods ("Functions" in python?). When I try to access them, I can't seem to get their value, or set them to another value for that matter. For example:
baseFile = open('C:/Users/<redacted>/Documents/python dev/ATM/Data.ICSF', 'a+')
secFile = open('C:/Users/<redacted>/Documents/python dev/ATM/security.ICSF', 'a+')
def usrInput(raw_input):
    if raw_input == "99999":
        self.close(True)
    else:
        identity = raw_input
def splitValues(source, string):
    if source == "ident":
        usrTitle = string.split('>')[1]
        usrFN = string.split('>')[2]
        usrLN = string.split('>')[3]
        x = string.split('>')[4]
        usrBal = Decimal(x)
        usrBalDisplay = str(locale.currency(usrBal))
    elif source == "sec":
        usrPIN = string.split('>')[1]
        pinAttempts = string.split('>')[2]
def openAccount(identity):
    #read all the file first. it's f***ing heavy but it'll do here.
    plString = baseFile.read()
    xList = plString.split('|')
    parm = str(identity)
    for i in xList:
        substr = i[0:4]
        if parm == substr:
            print "success"
            usrString = str(i)
        else:
            lNumFunds = lNumFunds + 1
    splitValues("ident", usrString)
When I place baseFile and secFile in the openAccount method, I can access the respective files as normal. However, when I place them at the root of the script, as in the example above, I can no longer access the file - although I can still "see" the variable.
Is there a reason for this? For reference, I am using Python 2.7.
methods ("Functions" in python?)
They are "functions" when they stand free and "methods" when they are members of a class. So, functions in your case.
What you describe does definitely work in python. Hence, my diagnosis is that you already read something from the file elsewhere before you call openAccount, so that the read pointer is not at the beginning of the file.
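If the read position is indeed the culprit, rewinding the shared handle before each read is a small fix. Note also that a file opened in 'a+' mode may start with its position at the end of the file, so the first read() can already come back empty. The sketch below uses a temporary file and made-up record contents purely for illustration; read_from_start and demo are names of my own.

```python
import os
import tempfile


def read_from_start(handle):
    """Rewind a shared file handle, then read the whole file."""
    handle.seek(0)
    return handle.read()


def demo():
    # Hypothetical stand-in for Data.ICSF with made-up records.
    path = os.path.join(tempfile.mkdtemp(), "Data.ICSF")
    with open(path, "w") as f:
        f.write("0001>Mr>A>B>10.00|0002>Ms>C>D>20.00")

    base_file = open(path, "a+")  # opened once, as in the question
    try:
        first = read_from_start(base_file)
        second = read_from_start(base_file)  # a second read still sees the data
    finally:
        base_file.close()
    return first, second
```

Without the seek(0), the second read() would return an empty string because the first one leaves the position at the end of the file.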
