Dynamically call functions in Python

I am trying to test a Python library called Tulip dynamically. To do that I need to call the proper ti.<indicator_name> and pass it the right arguments.
The problem is that each method takes a fixed number of parameters and I don't know how to pass them properly.
Let's say I want to test the ti.sma method, which requires two arguments, real and period:
def sma(real, period):
    """
    Simple Moving Average
    """
    return lib.sma([real], [period])
So, I would need to call it as:
sma_test = ti.sma(real=[25,20,22,30,22,28,25,30,21,23,24,22], period=5)
So my question is: how do I call the method above dynamically, using the JSON below as the payload?
{
    "short_name": "sma",
    "data": [74.333511, 73.61084, 74.197395, 73.848442, 75.036385, 76.630219, 76.803459, 77.385063],
    "period": 5
}
I have written this validator, where I can get the function object, but what about the parameters?
import pandas as pd
import numpy as np
import tulipy as ti
import datetime
import uuid
import inspect

def validate_indicator(payload):
    data = np.array(payload.get('data'))
    try:
        indicator = getattr(ti, payload.get('short_name'))
        validation_test = indicator(data)
    except Exception as e:
        return str(e)  # surface the error message, as in the result shown below
If I run the code above I get a TypeError, obviously, because I didn't pass the required argument period in validation_test = indicator(data). I believe the way to get there is to make a new function with optional *args:
validate_indicator(
    {
        "short_name": "sma",
        "data": [74.333511, 73.61084, 74.197395, 73.848442, 75.036385, 76.630219, 76.803459, 77.385063],
        "period": 5
    }
)
Result:
"sma() missing 1 required positional argument: 'period'"
Another example: if I want to test ti.bbands, it requires real, period and stddev as arguments.
def bbands(real, period, stddev):
    """
    Bollinger Bands
    """
    return lib.bbands([real], [period, stddev])

You actually can use **kwargs:
File test.py:
import test2

data = {
    "short_name": "sma",
    "data": [74.333511, 73.61084, 74.197395, 73.848442, 75.036385, 76.630219, 76.803459, 77.385063],
    "period": 5
}

f = getattr(test2, data.pop('short_name'))
f(**data)
File test2.py:
def sma(data, period):
    print(data)
    print(period)
> python3 test.py
[74.333511, 73.61084, 74.197395, 73.848442, 75.036385, 76.630219, 76.803459, 77.385063]
5
Note:
If you want to use *args instead, you could call the function as:
f(*[value for value in data.values()])
(or simply f(*data.values()); dicts preserve insertion order since Python 3.7, so this only works if the payload keys appear in the same order as the function's parameters).
Edit
This would be a function that accepts the data dict as a parameter and calls the corresponding function for you:
def validate_function(data):
    f = getattr(ti, data.pop('short_name'))
    f(**data)
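Putting the pieces together for the tulipy case, here is a minimal sketch of validate_indicator (assuming the first positional argument of every indicator is the price series, and that the remaining payload keys match the indicator's parameter names):
import numpy as np
import tulipy as ti

def validate_indicator(payload):
    payload = dict(payload)  # copy so the caller's dict isn't mutated
    indicator = getattr(ti, payload.pop('short_name'))
    real = np.array(payload.pop('data'))  # tulipy expects numpy arrays
    try:
        # remaining keys (e.g. period, stddev) are passed as keyword arguments
        return indicator(real, **payload)
    except Exception as e:
        return str(e)
This handles sma and bbands alike, since each indicator only receives the keyword arguments its payload actually carries.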

Related

How can I fix this error? TypeError: compute_minimal_traces() missing 1 required positional argument: 'event_log'

Here is the code; I need help solving these errors.
I tried to create an instance and I get the errors below. The main goal is to find a pattern in that event_log. Please let me know what other details I need to fix.
import pandas as pd
from sklearn.cluster import KMeans

# Define the phoenixscript class
class phoenixscript:
    def __init__(self):
        pass

    def parse_event_log(event_log):
        # Extract the unique events from the event log
        events = set()
        for trace in event_log:
            # Make sure that trace is a collection of events
            if not isinstance(trace, (list, tuple)):
                continue
            # Iterate over the events in the trace
            for event in trace:
                # Add each element in the event to the set of unique events
                events.update(event)
        # Return the unique events
        return list(events)

    # Define the find_pattern method
    def find_pattern(self, event_log):
        # Parse the event log to extract the relevant information
        events = self.parse_event_log(event_log)
        # Convert the list of strings into a 2-dimensional array
        events_array = [[event] for event in events]
        # Use the K-means clustering algorithm to identify patterns in the data
        kmeans = KMeans(n_clusters=3)
        kmeans.fit(events_array)
        patterns = kmeans.cluster_centers_
        # Return the identified patterns
        return patterns

    # Define the compute_minimal_traces method
    def compute_minimal_traces(self, event_log):
        # Read the event log from the given traces
        events_list = pd.DataFrame(event_log)
        # Use the find_pattern function to identify the minimal number of traces needed
        patterns = self.find_pattern(event_log)
        num_traces = len(patterns)
        # Print the number of minimal traces
        print(num_traces)
        # Return the number of minimal traces
        return num_traces

# Define the event log
event_log = ([
    ["status-Open", "Workflow", "summary", "summary", "assignee", "Attachment", "Attachment",
     "status-Patch Available",
     "Attachment", "Attachment", "Attachment", "Attachment", "summary", "Fix Version", "resolution",
     "status-Resolved",
     "status-Closed"],
    ["status-Open", "Workflow", "summary", "issuetype", "summary"],
    ["status-Open", "Workflow", "summary", "description", "Fix Version", "labels", "Fix Version", "resolution",
     "status-Resolved"],
    ["status-Open", "Workflow"]])

# Create an instance of the phoenixscript class
phoenix = phoenixscript()
num_traces = phoenix.compute_minimal_traces(event_log)

# Print the results
print("The minimal number of traces needed is:", num_traces)
When I create an instance of the phoenixscript class:
phoenix = phoenixscript()
I get the following error:
Traceback (most recent call last):
  File "/Users/randeepsingh/Desktop/pythonProjectRDA/main.py", line 8, in <module>
    class phoenixscript:
  File "/Users/randeepsingh/Desktop/pythonProjectRDA/main.py", line 75, in phoenixscript
    num_traces = compute_minimal_traces(event_log)
TypeError: compute_minimal_traces() missing 1 required positional argument: 'event_log'
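For what it's worth, the traceback shows compute_minimal_traces being called from inside the class body (main.py line 75), where no instance exists yet, and parse_event_log is also missing its self parameter. A stripped-down sketch of both fixes (KMeans omitted for brevity):
class phoenixscript:
    def parse_event_log(self, event_log):  # self was missing in the original
        events = set()
        for trace in event_log:
            if isinstance(trace, (list, tuple)):
                events.update(trace)  # add whole event strings, not their characters
        return list(events)

    def compute_minimal_traces(self, event_log):
        return len(self.parse_event_log(event_log))

# instantiate and call at module level, after the class is fully defined
event_log = [["status-Open", "Workflow"], ["status-Open", "summary"]]
phoenix = phoenixscript()
print(phoenix.compute_minimal_traces(event_log))  # prints 3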

assert_called_with always picking up the arguments from the last call

I am very new to Python and this is probably something trivial.
I have the following test:
import pytest
from pytest_mock import MockerFixture, call

# Create environment before importing anything from app/.
import makeenv

from data_f import balance_ledger_functions
import orm
from mock_orm import mock_nodes_db

def test_balance_ledger_process_settled(mock_nodes_db: None, mocker: MockerFixture) -> None:
    settled_tranaction = created_transaction
    settled_tranaction["recent_status"]["status_id"] = "4"
    spy = mocker.spy(orm.Nodes, "balance_update")
    assert balance_ledger_functions.balance_ledger(created_transaction) == settled_tranaction
    to_node_id = settled_tranaction["to"]["id"]
    amount = settled_tranaction["amount"]["amount"]
    update_transaction_payload = {"balance": "{0}".format(-int(float(amount))), "is_cma": False, "currency": "cUSD"}
    spy.assert_called_with(to_node_id, update_transaction_payload)
    # fees
    spy.assert_called_with(
        settled_tranaction["fees"][0]["to"]["id"],
        {"balance": "{0}".format(-int(float(settled_tranaction["fees"][0]["fee"])))}
    )
    spy.assert_called_with(
        settled_tranaction["fees"][1]["to"]["id"],
        {"balance": "{0}".format(-int(float(settled_tranaction["fees"][1]["fee"])))}
    )
In the function we are testing, the calls happen in exactly the order defined in the test (with different arguments). However, the test fails with the following error:
>       spy.assert_called_with(to_node_id, update_transaction_payload)
E       AssertionError: Expected call: balance_update('6156661f7c1c6b71adefbb40', {'balance': '-10000', 'is_cma': False, 'currency': 'cUSD'})
E       Actual call: balance_update('559339aa86c273605ccd35df', {'balance': '5'})
Basically, it is asserting against the last set of arguments only.
What is the correct way to test something like this?
Tried this - didn't work either...
I created a pytest plugin to help in those situations: pip install pytest-mock-generator
Once you install it, you'll have the mg fixture. You can put this line of code in your test and it will print and return the asserts for you: mg.generate_asserts(spy).
Here is a complete code example:
Say that you have a Python file named example.py:
def hello(name: str) -> str:
    return f"Hello {name}!"
Then you have this test:
import example

def test_spy_list_of_calls(mocker, mg):
    example.hello("before spy")
    my_spy = mocker.spy(example, "hello")
    example.hello("after spy")
    example.hello("another after spy")
    example.hello("one more time")
    mg.generate_asserts(my_spy)
The final line would print this:
from mock import call

assert 3 == my_spy.call_count
my_spy.assert_has_calls(
    calls=[call('after spy'), call('another after spy'), call('one more time'), ])
Add those lines to your test and it should work:
import example
from mock import call

def test_spy_list_of_calls(mocker):
    example.hello("before spy")
    my_spy = mocker.spy(example, "hello")
    example.hello("after spy")
    example.hello("another after spy")
    example.hello("one more time")
    assert 3 == my_spy.call_count
    my_spy.assert_has_calls(
        calls=[call('after spy'), call('another after spy'), call('one more time'), ])
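For reference, the root cause is that Mock.assert_called_with only compares against the most recent call. The standard unittest.mock alternatives are assert_any_call (matches any call in the history) and assert_has_calls (matches an ordered subsequence), e.g.:
from unittest.mock import Mock, call

m = Mock()
m("first", 1)
m("second", 2)

# assert_called_with checks only the last call, so this would raise:
# m.assert_called_with("first", 1)

m.assert_any_call("first", 1)  # passes: matches any call in the history
m.assert_has_calls([call("first", 1), call("second", 2)])  # ordered subsequence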

What can be alternate source of input for args getResolvedOptions() method in AWS GlueJob?

I have a Glue job in which I want to pass parameters to getResolvedOptions. One way I know of is to create a JobRun from a Lambda function and pass them there. What are the other ways to pass param1 and param2 in the code below?
import sys
from awsglue.utils import getResolvedOptions
args = getResolvedOptions(sys.argv, ['param1', 'param2'])
Note: I don't want to hardcode the parameters in the code.
Thanks in advance.
You can easily achieve this through CloudFormation (CFN) YAML templates, or alternatively you could add the variables directly to the job via the CLI/SDK/console etc. If you want to go down the CFN route, you could define your resource as follows:
JobNAME:
  Type: "AWS::Glue::Job"
  Properties:
    Name: String
    Description: String
    Role: String
    GlueVersion: 1.0
    Command:
      Name: "glueetl"
      ScriptLocation: String
      PythonVersion: 3
    DefaultArguments: {
      "--job-language": "python",
      "--param1": VALUE,
      "--param2": VALUE,
      "--TempDir": String,
      "--job-bookmark-option": "job-bookmark-enable",
      "--enable-continuous-cloudwatch-log": "false",
      "--enable-continuous-log-filter": "false",
      "--enable-metrics": "false"
    }
    ExecutionProperty:
      MaxConcurrentRuns: 1
    MaxCapacity: 5
    MaxRetries: 1
    Timeout: 60
Once defined, you can pull the parameters out through getResolvedOptions, noting there are reserved values for Glue defaults, e.g.:
import sys
from awsglue.utils import getResolvedOptions

## #params: [JOB_NAME <--default assigned, param1 <---your value, param2 <---your value]
args = getResolvedOptions(sys.argv, ['JOB_NAME', 'param1', 'param2'])
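If you would rather set the values per run from the SDK instead of baking them into the job definition, here is a minimal boto3 sketch (the job name is a placeholder; note the "--" prefix on argument names):
import boto3

glue = boto3.client("glue")

# start a run of a hypothetical job, overriding param1/param2 for this run only
glue.start_job_run(
    JobName="my-glue-job",
    Arguments={"--param1": "value1", "--param2": "value2"},
)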

Upload csv file in Anvil with uplink

I get an error like this:
TypeError: __init__() takes from 1 to 2 positional arguments but 3 were given at <ipython-input 4-66c1c8f89515>, line 8 called from Form1, line 18
This is my code in Anvil:
class Form1(Form1Template):
    def __init__(self, **properties):
        # Set Form properties and Data Bindings.
        self.init_components(**properties)

    def file_loader_1_change(self, file, **event_args):
        """This method is called when a new file is loaded into this FileLoader"""
        anvil.server.call('import_csv_data', file)
and this is the code in a Jupyter notebook that uploads the data to an Anvil data table:
import pandas as pd
import anvil.tables as tables
from anvil.tables import app_tables
import anvil.media

@anvil.server.callable
def import_csv_data(file):
    with anvil.media.TempFile(file, "r") as f:
        df = pd.read_csv(f)
        for d in df.to_dict(orient="records"):
            # d is now a dict of {columnname -> value} for this row
            # We use Python's **kwargs syntax to pass the whole dict as
            # keyword arguments
            app_tables.NilaiTukar.add_row(**d)
I think the error you saw is because you are giving two arguments to anvil.media.TempFile when it is designed to take only one. I replicated your error with a simpler example:
import anvil.media

@anvil.server.callable
def import_csv_data(file):
    with anvil.media.TempFile(file, "r") as f:
        pass

if __name__ == "__main__":
    import_csv_data("fname.txt")
According to the docs, you don't need the "r" argument. You should just call:
@anvil.server.callable
def import_csv_data(file):
    with anvil.media.TempFile(file) as f:
        ...
Then it should work for you.
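Applied to the original function, that would look like the sketch below (the NilaiTukar table comes from the question):
import pandas as pd
import anvil.media
from anvil.tables import app_tables

@anvil.server.callable
def import_csv_data(file):
    # TempFile materialises the Media object on disk and yields its path
    with anvil.media.TempFile(file) as f:
        df = pd.read_csv(f)
    for d in df.to_dict(orient="records"):
        app_tables.NilaiTukar.add_row(**d)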

Reset index name in elasticsearch dsl

I'm trying to create an ETL that extracts from Mongo, processes the data, and loads it into Elastic. I will do a daily load, so I thought of naming my index with the current date. This will help me with some later processing I need to do on this first index.
I used elasticsearch dsl guide: https://elasticsearch-dsl.readthedocs.io/en/latest/persistence.html
My problem comes from my limited experience working with classes: I don't know how to reset the index name from the class.
Here is my code for the class (custom_indices.py):
from elasticsearch_dsl import Document, Date, Integer, Keyword, Text
from elasticsearch_dsl.connections import connections
from elasticsearch_dsl import Search
import datetime

class News(Document):
    title = Text(analyzer='standard', fields={'raw': Keyword()})
    manual_tagging = Keyword()

    class Index:
        name = 'processed_news_' + datetime.datetime.now().strftime("%Y%m%d")

    def save(self, **kwargs):
        return super(News, self).save(**kwargs)

    def is_published(self):
        return datetime.datetime.now() >= self.processed
And this is the part of the code where I create the instance to that class:
from custom_indices import News
import elasticsearch
import elasticsearch_dsl
from elasticsearch_dsl.connections import connections
import pandas as pd
import datetime
connections.create_connection(hosts=['localhost'])
News.init()
for index, doc in df.iterrows():
new_insert = News(meta={'id': doc.url_hashed},
title = doc.title,
manual_tagging = doc.customTags,
)
new_insert.save()
Every time I call the News class I would expect it to get a new name. However, the name doesn't change even if I import the class again (from custom_indices import News). I know this is only a problem while testing, but I'd like to know how to force that "reset". Actually, I originally wanted to change the name outside the class with this line right before the loop:
News.Index.name = "NEW_NAME"
However, that didn't work; I still saw the name defined on the class.
Could anyone please assist?
Many thanks!
PS: this must be just an object-oriented programming issue. Apologies for my ignorance on the subject.
Maybe you could take advantage of the fact that Document.init() accepts an index keyword argument. If you want the index name to get set automatically, you could implement init() in the News class and call super().init(...) in your implementation.
A simplified example (Python 3.x):
from elasticsearch_dsl import Document
from elasticsearch_dsl.connections import connections
import datetime

class News(Document):
    @classmethod
    def init(cls, index=None, using=None):
        index_name = index or 'processed_news_' + datetime.datetime.now().strftime("%Y%m%d")
        return super().init(index=index_name, using=using)
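With that override in place, init() with no arguments picks up the dated name, while an explicit name still wins (the 'news_test' name below is just for illustration):
connections.create_connection(hosts=['localhost'])

News.init()                   # creates/updates the mapping under processed_news_<YYYYMMDD>
News.init(index='news_test')  # or override the index name explicitly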
You can override the index when you call save() (note it must be passed as the index keyword argument, since the first positional parameter of save() is using):
new_insert.save(index='processed_news_' + datetime.datetime.now().strftime("%Y%m%d"))
A full example follows.
# coding: utf-8
import datetime

from elasticsearch_dsl import Keyword, Text, \
    Index, Document, Date
from elasticsearch_dsl.connections import connections

HOST = "localhost:9200"

index_names = [
    "foo-log-",
    "bar-log-",
]

default_settings = {"number_of_shards": 4, "number_of_replicas": 1}
index_settings = {
    "foo-log-": {
        "number_of_shards": 40,
        "number_of_replicas": 1
    }
}

class LogDoc(Document):
    level = Keyword(ignore_above=256)
    date = Date(format="yyyy-MM-dd'T'HH:mm:ss.SSS")
    hostname = Text(fields={'fields': Keyword(ignore_above=256)})
    message = Text()
    createTime = Date(format="yyyy-MM-dd'T'HH:mm:ss.SSS")

def auto_create_index():
    '''Automatically create the ES indices'''
    connections.create_connection(hosts=[HOST])
    for day in range(3):
        dt = datetime.datetime.now() + datetime.timedelta(days=day)
        for index in index_names:
            name = index + dt.strftime("%Y-%m-%d")
            settings = index_settings.get(index, default_settings)
            idx = Index(name=name)
            idx.document(LogDoc)
            idx.settings(**settings)
            try:
                idx.create()
            except Exception as e:
                print(e)
                continue
            print("create index %s" % name)

if __name__ == '__main__':
    auto_create_index()
