Repeating an extra when using a Dragonfly CompoundRule - python

Using dragonfly2, the voice command framework, you can make a grammar like so:
chrome_rules = MappingRule(
    name='chrome',
    mapping={
        'down [<n>]': actions.Key('space:%(n)d'),
    },
    extras=[
        IntegerRef("n", 1, 100)
    ],
    defaults={
        "n": 1
    }
)
This lets me press space n times, where n is some integer. But what do I do if I want to use the same variable (n), multiple times in the same grammar? If I repeat it in the grammar, e.g. 'down <n> <n>' and then say something like "down three four", Dragonfly will parse it correctly, but it will only execute the actions.Key('space:%(n)d') with n=3, using the first value of n. How can I get it to execute it 3 times, and then 4 times using the same variable?
Ideally I don't want to have to duplicate the variable n, in the extras and defaults, because that seems like redundant code.

TL;DR: Your MappingRule passes data to your Action (e.g. Key, Text) in the form of a dictionary, so it can only pass one value per extra. Your best bet right now is probably to create multiple extras.
This is a side-effect of the way dragonfly parses recognitions. I'll explain it first with Action objects, then we can break down why this happens at the Rule level.
When Dragonfly receives a recognition, it has to deconstruct it and extract any extras that occurred. The speech recognition engine itself has no trouble with multiple occurrences of the same extra, and it does pass that data to dragonfly, but dragonfly loses that information.
All Action objects are derived from ActionBase, and this is the method dragonfly calls when it wants to execute an Action:
def execute(self, data=None):
    self._log_exec.debug("Executing action: %s (%s)" % (self, data))
    try:
        if self._execute(data) == False:
            raise ActionError(str(self))
    except ActionError as e:
        self._log_exec.error("Execution failed: %s" % e)
        return False
    return True
This is how Text works, same with Key. It's not documented here, but data is a dictionary of extras mapped to values. For example:
{
    "n": 3,
    "text": "some recognized dictation",
}
See the issue? That means we can only communicate a single value per extra. Even if we combine multiple actions, we have the same problem. For example:
{
    "down <n> <n>": Key("%(n)d") + Text("%(n)d"),
}
Under the hood, these two actions are combined into an ActionSeries object - a single action. It exposes the same execute interface. One series of actions, one data dict.
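To make the single-value limitation concrete, here is a minimal plain-Python sketch. It assumes the action fills in its spec string with ordinary %-formatting against the data dict; the variable names are illustrative and not part of dragonfly.

```python
# One data dict per recognition: each extra name maps to exactly one value.
data = {"n": 3, "text": "some recognized dictation"}

# Both placeholders read the same key, so both receive the same value,
# no matter how many times <n> was spoken.
first_spec = "space:%(n)d" % data
second_spec = "space:%(n)d" % data

print(first_spec)   # space:3
print(second_spec)  # space:3
```

A dictionary key can hold only one value, so a second spoken number has nowhere to go unless it is given its own extra name.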
Note that this doesn't happen with compound rules, even if each underlying rule shares an extra with the same name. That's because data is decoded & passed per-rule. Each rule passes a different data dict to the Action it wishes to execute.
If you're curious where we lose the second extra, we can navigate up the call chain.
Each rule has a process_recognition method. This is the method that's called when a recognition occurs. It takes the current rule's node and processes it. This node might be a tree of rules, or it could be something lower-level, like an Action. Let's look at the implementation in MappingRule:
def process_recognition(self, node):
    """
    Process a recognition of this rule.
    This method is called by the containing Grammar when this
    rule is recognized. This method collects information about
    the recognition and then calls *self._process_recognition*.
    - *node* -- The root node of the recognition parse tree.
    """
    # Prepare *extras* dict for passing to _process_recognition().
    extras = {
        "_grammar": self.grammar,
        "_rule": self,
        "_node": node,
    }
    extras.update(self._defaults)
    for name, element in self._extras.items():
        extra_node = node.get_child_by_name(name, shallow=True)
        if extra_node:
            extras[name] = extra_node.value()
        elif element.has_default():
            extras[name] = element.default
    # Call the method to do the actual processing.
    self._process_recognition(node, extras)
I'm going to skip some complexity - the extras variable you see here is an early form of the data dictionary. See where we lose the value?
extra_node = node.get_child_by_name(name, shallow=True)
Which looks like:
def get_child_by_name(self, name, shallow=False):
    """Get one node below this node with the given name."""
    for child in self.children:
        if child.name:
            if child.name == name:
                return child
            if shallow:
                # If shallow, don't look past named children.
                continue
        match = child.get_child_by_name(name, shallow)
        if match:
            return match
    return None
So, you see the issue. Dragonfly tries to extract one value for each extra, and it gets the first one. Then, it stuffs that value into a dictionary and passes it down to Action. Additional occurrences are lost.
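The first-match behaviour can be reproduced without dragonfly. The sketch below uses a small stand-in Node class (hypothetical, not dragonfly's own) that copies the search order of the method quoted above, plus a made-up find_all_by_name helper to show that the second value is still present in the parse tree.

```python
class Node:
    """Tiny stand-in for a parse-tree node (hypothetical, not dragonfly's class)."""

    def __init__(self, name=None, value=None, children=()):
        self.name = name
        self._value = value
        self.children = list(children)

    def value(self):
        return self._value

    def get_child_by_name(self, name, shallow=False):
        # Same search order as the method quoted above: the first match wins.
        for child in self.children:
            if child.name:
                if child.name == name:
                    return child
                if shallow:
                    continue
            match = child.get_child_by_name(name, shallow)
            if match:
                return match
        return None

    def find_all_by_name(self, name):
        # Hypothetical helper: collect every match instead of stopping early.
        found = []
        for child in self.children:
            if child.name == name:
                found.append(child)
            else:
                found.extend(child.find_all_by_name(name))
        return found


# Simplified parse tree for "down three four" matched against 'down <n> <n>'.
root = Node(children=[Node(value="down"), Node("n", 3), Node("n", 4)])

first = root.get_child_by_name("n", shallow=True).value()
every = [child.value() for child in root.find_all_by_name("n")]
print(first)  # 3
print(every)  # [3, 4]
```

The value 4 is in the tree the whole time; it is the dictionary-building step that drops it.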

Related

Dynamically update all instances of multiple input function

I'm creating a program with a class that has 3 input attributes. The program calls a function that creates many of these objects with their inputs being given based on some other criteria not important to this question.
As I further develop my program, I may want to add more and more attributes to the class. This means that I have to go and find all instances of the function I am using to create these objects, and change the input arguments.
For example, my program may have many of these:
create_character(blue, pizza, running)
where inputs correspond to character's favorite color, food, and activity. Later, I may want to add a fourth input, such as favorite movie, or possibly a fifth or sixth or ninety-ninth input.
Do professional programmers have any advice for structuring their code so that they don't have to go through and individually change each line that the create_character function is called so that it now has the new, correct number of inputs?
Find and replace seems fine, but this makes error possible, and also seems tedious. I'm anticipating calling this function at least 50 times.
I can think of a few options for how you could design your class to make it easier to extend later with new kinds of "favorite" things.
The first approach is to make most (or all) of the arguments optional. That is, you should specify a default value for each one (which might be None if there's not a real value that could apply as a default). This way, when you add an extra argument, the existing places that call the function without the new argument will still work, they'll just get the default value.
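As a rough illustration of the optional-argument approach, here is a minimal sketch. The parameter names and the returned dictionary are assumptions made for the example, not the asker's real code.

```python
def create_character(color, food, activity, movie=None):
    """Return a character record; `movie` is optional so older calls still work."""
    return {"color": color, "food": food, "activity": activity, "movie": movie}


# An existing call site written before `movie` was added keeps working.
old_style = create_character("blue", "pizza", "running")
# A newer call site can supply the extra value by keyword.
new_style = create_character("blue", "pizza", "running", movie="Dr. Strangelove")

print(old_style["movie"])  # None
print(new_style["movie"])  # Dr. Strangelove
```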
Another option would be to use a container (like a dictionary) to hold the values, rather than using a separate variable or argument for each one. For instance, in your example you could represent the character's favorites using a dictionary like favorites = {'color': blue, 'food': pizza, 'activity': running} (assuming those values are defined somewhere), and then you could pass the dictionary around instead of the separate items. If you use the get method of the dictionary, you can also make this type of design use default values (favorites.get('movie') will return None if you haven't updated the code that creates the dictionary to add a 'movie' key yet).
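A small sketch of the dictionary approach follows. The describe function and the "unspecified" fallback are hypothetical choices for illustration.

```python
def describe(favorites):
    """Build a summary line; missing keys fall back to 'unspecified'."""
    keys = ("color", "food", "activity", "movie")
    return ", ".join(favorites.get(key, "unspecified") for key in keys)


favorites = {"color": "blue", "food": "pizza", "activity": "running"}
print(describe(favorites))  # blue, pizza, running, unspecified

# Adding a new kind of favorite needs no change to any function signature.
favorites["movie"] = "Dr. Strangelove"
print(describe(favorites))  # blue, pizza, running, Dr. Strangelove
```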
You can take advantage of argument/keyword argument unpacking to support dynamically-changing function parameters. And also factory function/classes that generate the function you need:
def create_character(required1, required2, *opt_args, **kwargs):
    """create_character must always be called with required1 and required2,
    but can receive an *opt_args sequence that stores an arbitrary number of
    positional args. kwargs holds a dict of optional keyword args."""
    for i, pos_arg in enumerate(opt_args):
        # pos_arg walks the opt_args sequence
        print("position: {}, value: {}".format(i + 3, pos_arg))
    # Iterate over items(); iterating over the dict itself yields only the keys.
    for keyword, value in kwargs.items():
        print("Keyword was: {}, Value was: {}".format(keyword, value))
pos_args = (1, 2, 3)
create_character('this is required', 'this is also required', *pos_args)
# position: 3, value: 1
# position: 4, value: 2
# position: 5, value: 3
a_dict = {
    'custom_arg1': 'custom_value1',
    'custom_arg2': 'custom_value2',
    'custom_arg3': 'custom_value3'
}
create_character('this is required', 'this is also required', **a_dict)
# Keyword was: custom_arg1, Value was: custom_value1
# Keyword was: custom_arg2, Value was: custom_value2
# Keyword was: custom_arg3, Value was: custom_value3
# (keyword order follows insertion order on Python 3.7+; older versions may differ)
I really like the list or dictionary input method, but it was still messy and allowed for the possibility of error. What I ended up doing was this:
I changed the class object to have no inputs. Favorites were first assigned with random, default, or unspecified options.
After the class object was created, I then edited the attributes of the object, as so:
self.favorite_movie = "unspecified"
self.favorite_activity = "unspecified"
new_character = (character())
new_character.favorite_movie = "Dr. Strangelove"
I think that the downside to this approach is that it should be slower than inputting the variables directly. The upside is that this is easy to change in the future. Perhaps when the program is finished, it will make more sense to then convert to @Blckknight's method, and give the input as a list or dictionary.

Data structure for syncing task lists in different formats

I am developing a Python program to sync tasks between lists in different to-do formats--initially Emacs org-mode and todo.txt. I am unsure what data structure I should use to track the tasks in a centralized form (or whether this is even the best approach).
My first attempt was to create dictionaries of each task property in which the key is the original line from the task list, and the value is a string of the relevant property. For example, I had dictionaries for the following:
#org-mode format
task_name["TODO [#C] Take out the trash"] = "Take out the trash"
priority["TODO [#C] Take out the trash"] = "C"
#todo.txt format
effort["(A) Refill Starbucks card #10min"] = 20 # meaning 20 minutes of estimated time
I then check which of the two text files was updated most recently, pull changed tasks from the most recent file, and overwrite these tasks on the old file. New tasks from either list are added to the other list. The tasks are also all stored in a centralized file: a CSV / tab-separated value file in which the headers are properties of the task (task_name, effort, priority, deadline, scheduled_date, todo_state, tags, etc.), and each row is a task.
It then occurred to me that maybe I should instead create an object class called "Task" in which each property is an attribute, and each task is an instance of the Task object instead of a series of dictionaries.
class Task(object):
    def __init__(self, name, effort, priority):
        # Store the constructor arguments on the instance.
        self.name = name
        self.effort = effort
        self.priority = priority
Finally, it occurred to me that I might want to use nested dictionaries or JSON formats--something like this:
{"TODO [#C] Take out the trash": {
    "task_name": "Take out the trash.",
    "priority": "C",
    "todo_state": "TODO"
}}
Or I could put the tasks in an SQLite database.
Which approach is best, or is there another approach that is better than all of these? I am an intermediate Python developer with little experience with advanced data structures and classes, so I appreciate any help you can offer.
The priority queue as a data structure should fit well for this case.
There are at least two ways to implement it in Python.
The first one is based on the Heap data structure and could be described as
import itertools
from heapq import heappush, heappop
pq = []                         # list of entries arranged in a heap
entry_finder = {}               # mapping of tasks to entries
REMOVED = '<removed-task>'      # placeholder for a removed task
counter = itertools.count()     # unique sequence count
def add_task(task, priority=0):
    'Add a new task or update the priority of an existing task'
    if task in entry_finder:
        remove_task(task)
    count = next(counter)
    entry = [priority, count, task]
    entry_finder[task] = entry
    heappush(pq, entry)
def remove_task(task):
    'Mark an existing task as REMOVED. Raise KeyError if not found.'
    entry = entry_finder.pop(task)
    entry[-1] = REMOVED
def pop_task():
    'Remove and return the lowest priority task. Raise KeyError if empty.'
    while pq:
        priority, count, task = heappop(pq)
        if task is not REMOVED:
            del entry_finder[task]
            return task
    raise KeyError('pop from an empty priority queue')
This code is taken from the Python heapq module documentation.
The second way is to use a Queue module in Python 2, which is queue in Python 3. This module contains a class PriorityQueue that can satisfy your requirements.
The first one may be considered as a bit more straightforward and flexible for modifications, but the second one may be especially useful in threaded programming due to thread support in Python.
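For the second option, a minimal Python 3 sketch with queue.PriorityQueue might look like the following. The numeric priorities and task names are made up for illustration; lower numbers come out first.

```python
import queue

tasks = queue.PriorityQueue()
# Each entry is a (priority, task_name) tuple; tuples compare by priority first.
tasks.put((3, "Take out the trash"))
tasks.put((1, "Refill Starbucks card"))
tasks.put((2, "Write report"))

order = []
while not tasks.empty():
    priority, name = tasks.get()
    order.append(name)

print(order)  # ['Refill Starbucks card', 'Write report', 'Take out the trash']
```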

Too many if statements

I have some topic to discuss. I have a fragment of code with 24 ifs/elifs. Operation is my own class that represents functionality similar to Enum. Here is a fragment of code:
if operation == Operation.START:
strategy = strategy_objects.StartObject()
elif operation == Operation.STOP:
strategy = strategy_objects.StopObject()
elif operation == Operation.STATUS:
strategy = strategy_objects.StatusObject()
(...)
I have concerns from a readability point of view. Is it better to change it into 24 classes and use polymorphism? I am not convinced that it will make my code maintainable... On one hand those ifs are pretty clear and shouldn't be hard to follow; on the other hand there are too many ifs.
My question is rather general, however I'm writing code in Python so I cannot use constructions like switch.
What do you think?
UPDATE:
One important thing is that StartObject(), StopObject() and StatusObject() are constructors and I wanted to assign an object to strategy reference.
You could possibly use a dictionary. Dictionaries store references, which means functions are perfectly viable to use, like so:
operationFuncs = {
    Operation.START: strategy_objects.StartObject,
    Operation.STOP: strategy_objects.StopObject,
    Operation.STATUS: strategy_objects.StatusObject,
    # (...)
}
It's good to have a default operation just in case, so when you run it use a try except and handle the exception (ie. the equivalent of your else clause)
try:
    strategy = operationFuncs[operation]()
except KeyError:
    strategy = strategy_objects.DefaultObject()
Alternatively use a dictionary's get method, which allows you to specify a default if the key you provide isn't found.
strategy = operationFuncs.get(operation, strategy_objects.DefaultObject)()
Note that you don't include the parentheses when storing them in the dictionary, you just use them when calling your dictionary. Also this requires that Operation.START be hashable, but that should be the case since you described it as a class similar to an ENUM.
Python's equivalent to a switch statement is to use a dictionary. Essentially you can store the keys like you would the cases and the values are what would be called for that particular case. Because functions are objects in Python you can store those as the dictionary values:
operation_dispatcher = {
Operation.START: strategy_objects.StartObject,
Operation.STOP: strategy_objects.StopObject,
}
Which can then be used as follows:
try:
    strategy = operation_dispatcher[operation] #fetch the strategy
except KeyError:
    strategy = default #this deals with the else-case (if you have one)
strategy() #call if needed
Or more concisely:
strategy = operation_dispatcher.get(operation, default)
strategy() #call if needed
This can potentially scale a lot better than having a mess of if-else statements. Note that if you don't have an else case to deal with you can just use the dictionary directly with operation_dispatcher[operation].
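Here is a self-contained sketch of the dictionary dispatch. The Operation enum and the three strategy classes are stand-ins invented for the example, since the asker's real Operation and strategy_objects are not shown.

```python
from enum import Enum


class Operation(Enum):
    START = "start"
    STOP = "stop"
    STATUS = "status"


# Stand-in strategy classes (hypothetical).
class StartObject:
    pass


class StopObject:
    pass


class DefaultObject:
    pass


# Store the classes themselves, without parentheses; call them after lookup.
DISPATCH = {
    Operation.START: StartObject,
    Operation.STOP: StopObject,
}


def make_strategy(operation):
    """Look up the class for `operation` and instantiate it."""
    strategy_class = DISPATCH.get(operation, DefaultObject)
    return strategy_class()


print(type(make_strategy(Operation.START)).__name__)   # StartObject
print(type(make_strategy(Operation.STATUS)).__name__)  # DefaultObject
```

Only the selected class is instantiated, and adding a 25th operation means adding one dictionary entry.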
You could try something like this.
For instance:
def chooseStrategy(op):
    return {
        Operation.START: strategy_objects.StartObject,
        Operation.STOP: strategy_objects.StopObject,
    }.get(op, strategy_objects.DefaultValue)
Call it like this
strategy = chooseStrategy(operation)()
This method has the benefit of providing a default value (like a final else statement). Of course, if you only need to use this decision logic in one place in your code, you can always use strategy = dictionary.get(op, default) without the function.
Starting from Python 3.10, you can use the match statement:
match i:
    case 1:
        print("First case")
    case 2:
        print("Second case")
    case _:
        print("Didn't match a case")
https://pakstech.com/blog/python-switch-case/
You can use some introspection with getattr:
strategy = getattr(strategy_objects, "%sObject" % operation.capitalize())()
Let's say the operation is "STATUS", it will be capitalized as "Status", then prepended to "Object", giving "StatusObject". The StatusObject method will then be called on the strategy_objects, failing catastrophically if this attribute doesn't exist, or if it's not callable. :) (I.e. add error handling.)
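The error handling mentioned above could be sketched as follows. The strategy_objects namespace and StatusObject class are stand-ins for the asker's module, and raising ValueError is one possible choice rather than the only reasonable one.

```python
from types import SimpleNamespace


class StatusObject:
    pass


# Stand-in for the real strategy_objects module (hypothetical).
strategy_objects = SimpleNamespace(StatusObject=StatusObject)


def strategy_for(operation_name):
    """Build e.g. 'StatusObject' from 'STATUS' and instantiate it."""
    class_name = "%sObject" % operation_name.capitalize()
    factory = getattr(strategy_objects, class_name, None)
    if not callable(factory):
        raise ValueError("unknown operation: %r" % operation_name)
    return factory()


print(type(strategy_for("STATUS")).__name__)  # StatusObject
try:
    strategy_for("bogus")
except ValueError as error:
    print(error)  # unknown operation: 'bogus'
```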
The dictionary solution is probably more flexible though.
If Operation.START, etc. are hashable, you can use a dictionary with the conditions as keys and the functions to call as values. Example:
d = {Operation.START: strategy_objects.StartObject,
     Operation.STOP: strategy_objects.StopObject,
     Operation.STATUS: strategy_objects.StatusObject}
You can then do the dictionary lookup and call the function. Example:
d[operation]()
Here is a bastardized switch/case done using dictionaries:
For example:
# define the function blocks
def start():
    return strategy_objects.StartObject()
def stop():
    return strategy_objects.StopObject()
def status():
    return strategy_objects.StatusObject()
# map the inputs to the function blocks
options = {"start": start,
           "stop": stop,
           "status": status,
           }
Then the equivalent switch block is invoked, and its return value is the chosen strategy:
strategy = options["start"]()

Python loop appending same object twice [duplicate]

This question already has answers here:
How to avoid having class data shared among instances?
(7 answers)
Closed 14 days ago.
I'm following the YouTube video Learn Python through Data Hacking and am expanding on the sort of "legacy" inline way by adding classes to learn all the ins and outs of Python OOP. Basically the guy in the video has you grab the Chicago bus line's route, parse it, and find which bus his friend left his suitcase on by comparing lat, long, time, distance, etc.
The XML file I'm working with has a bunch of values like so:
<bus>
<id>1867</id>
<rt>22</rt>
<d>North Bound</d>
<dd>Northbound</dd>
<dn>N</dn>
<lat>41.89167051315307</lat>
<lon>-87.6297836303711</lon>
<pid>5421</pid>
<pd>Northbound</pd>
<run>P258</run>
<fs>Howard</fs>
<op>30090</op>
<dip>8858</dip>
<bid>7323012</bid>
<wid1>0P</wid1>
<wid2>258</wid2>
</bus>
From there, I need to find which buses' latitudes are north of his position (defined in the class, moot for this example). From those nodes, I'm creating a Bus object like so:
class Bus:
    # The original XML node
    __xml = None
    # Our dictionary for properties to shadow a get() function on this object
    __tree = {}
    def __init__(self, busxml):
        self.__xml = busxml
        for e in busxml:
            self.__tree[e.tag] = busxml.findtext(e.tag)
    def gettree(self):
        return self.__tree
    # Tries to return prop, or returns -1 if it is missing
    def get(self, prop):
        try:
            return self.__tree[prop]
        except KeyError:
            return -1
    def getall(self):
        return self.__tree
From the "main" file, I'm looping through the values and appending matches based on the lat node's text value:
# __getRouteData() is the url open and write function, works fine.
# parser is the xml.etree.ElementTree parse class
if self.__getRouteData() == True:
    xmlParser = parser(self.__xmlFile)
    busses = xmlParser.getnodes()
    matches = []
    # loop over all
    for busTree in busses:
        bus = Bus(busTree)
        if float(bus.get('lat')) > self.__lat:
            matches.append(bus)
            print 'appending', bus.get('id')
    for bus in matches:
        print bus.get('id')
The snag I'm hitting is in the second for loop above. In the first loop, the output is telling me things are working well. The second one outputs the same value twice. It reminds me of the behavior with Javascript for() loops where, without a closure, only the last value is acted upon. My output from the console is as follows:
appending 1784
appending 4057
4057
4057
See... it's telling me it's appending unique busses to my matches list, but when I iterate over the matches list, it's only giving me one bus.
Another snippet that tells me something's funky with the second loop:
print 'Matches', matches
for bus in matches:
    print bus.get('id')
# Matches [<busutils.Bus instance at 0xb6f92b8c>, <busutils.Bus instance at 0xb6f92d0c>]
# 4057
# 4057
The output of the list is showing me different hashes (...right?) of the objects in the list, hence saying they're two different objects, and thus have different data, but the loop isn't playing nicely.
Obviously I'm just getting into python, but have experience in Java, Javascript, PHP, etc, so I'm not sure what I'm missing in these simple loops.
Thanks!
It's because of your use of class variables on the Bus class. Make them instance variables (created in __init__).
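A minimal Python 3 sketch of that fix is below. The dictionary is created inside __init__, so each Bus gets its own copy. The two sample XML snippets are invented test data, and get() returns None for a missing key here instead of the original -1.

```python
import xml.etree.ElementTree as ET


class Bus:
    def __init__(self, busxml):
        # Instance attributes: a fresh dict is created for every Bus.
        self._xml = busxml
        self._tree = {}
        for element in busxml:
            self._tree[element.tag] = element.text

    def get(self, prop):
        # Returns None when the property is missing.
        return self._tree.get(prop)


first = Bus(ET.fromstring("<bus><id>1784</id><lat>41.9</lat></bus>"))
second = Bus(ET.fromstring("<bus><id>4057</id><lat>42.0</lat></bus>"))

print(first.get("id"), second.get("id"))  # 1784 4057
```

With the class-level dict, both instances wrote into the same shared object, which is why the last bus parsed overwrote every earlier one.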

Reading a series of input / output in Python

For my app, I need to print out a series of outputs and then accepts inputs from the user. What would be the best way of doing this?
Like:
print '1'
x = raw_input()
print '2'
y = raw_input()
Something like this, but it would go on for at least 10 times. My only concern with doing the above is that it would make up for poor code readability.
How should I do it? Should I create a function like this:
def printOut(string):
    print string
Or is there a better way?
First one note: raw_input() takes an optional argument ... a prompt string.
Regarding the broader question, a simplistic approach would be to create a class which defines the elements of your form and provides the functions for their input, validation, and later manipulations or output.
With such a class instances can be created (instantiated), and collected, stored, etc.
Such an approach need not be any more complicated than something like:
#!/usr/bin/python
# I use /usr/bin/env python; but making SO's syntax highlighter happy.
class generic_form:
    def __init__(self, element_list):
        self.form_elements = element_list
        self.contents = dict()
    def fill_it_in(self):
        for prompt in self.form_elements:
            self.contents[prompt] = raw_input(prompt)
    def get(self, item):
        return self.contents[item]
    def print_it(self):
        for each in self.form_elements:
            print each, self.contents[each]
if __name__ == '__main__':
    sample_fields = ("Given Name: ",
                     "Surname: ",
                     "Date of Birth: ",
                     "Notes: ")
    example = generic_form(sample_fields)
    print "Fill in my form:"
    example.fill_it_in()
    print
    print "Please review your input:"
    example.print_it()
    # store("%s, %s: %s" % (example.get('Surname: '), \
    #     example.get('Given Name: '), example.get('Notes: ')))
The main code is only a dozen lines long to define a generic form class with input and output functionality (and a simple get() method for further illustrative purposes). The rest of this example simply creates an instance and shows how it could be used.
Because my generic_form class is generic, we have to supply a list of field names which are to be filled in. The names are used both as the prompts and as the names of the fields for later access (see the get() method for an example). Personally I wouldn't do it this way; I'd provide a list of short field names and prompts similar to Marcelo's example. However, I wanted this particular example to be as short as possible to get the main point across.
(The comment at the end would be a call to a hypothetical "store()" function to store this for posterity, by the way).
This is the most minimal approach. However, you'd rapidly find that it's far more useful to have a richer class with validation for each field, and separate classes which format and output instances of that in different ways, and different classes for input. "teletype" input (as provided by the Python raw_input() built-in function) is the crudest form (primarily useful for simplicity and for the ability to process files using shell redirection). One could also support input with the GNU readline support (already included as a standard library in Python), curses support (also included), and one could imagine writing some HTML wrapper and CGI code for handling web-based input.
Coupling "raw_input()" and "print" into our class would mean more work if we ever needed or wanted to support any forms of input or output other than "dumb terminal."
If we create a class which only concerns itself with the data to be collected, then it could provide an interface for any other input class to get the list of the prompts with references to "setter" functions (and perhaps a "required" or "optional" flag). Then any instance of any input class could request the list of desired/required inputs for any form ... present the prompts, call the "setter" methods (which return a boolean to indicate if the data supplied was valid), loop over bad inputs on "required" fields, offer to skip "optional" fields, and so on.
Notice that the logic for displaying prompts, accepting responses, relaying those back to the data object via their setter methods, and handling invalid inputs can be the same for many types of forms. All we need is a way for the form to provide the list of prompts and their corresponding validation functions (and we need to ensure that all these validation functions have the same semantics --- taking the same parameters and so on).
Here's an example of separating the input behavior from the storage and validation of the data fields:
#!/usr/bin/env python
class generic_form:
    def __init__(self, element_list):
        self.hints = list()
        for each in element_list:
            self.hints.append((each, each, self.store))
        self.contents = dict()
    def store(self, key, data):
        '''Called by client instances
        '''
        self.contents[key] = data
        return True
    def get_hints(self):
        return self.hints
    def get(self, item):
        return self.contents[item]
def form_input(form):
    for each, key, fn in form.get_hints():
        while True:
            if fn(key, raw_input(each)):
                break
            else:
                keep_trying = raw_input("Try again:")
                if keep_trying.lower() in ['n', 'no', 'naw']:
                    break
if __name__ == '__main__':
    sample_fields = ("Given Name: ",
                     "Surname: ",
                     "Date of Birth: ",
                     "etc: ")
    example = generic_form(sample_fields)
    print "Fill in my form:"
    form_input(example)
    print
    print "Please review your input:"
    for i, x, x in example.get_hints():
        print example.get(i),
In this case the extra complication is not doing anything useful. Our generic_form performs no validation. However, this same input function could be used with any data/form class that provided the same interface. That interface, in this example, only requires a get_hints() method providing tuples of "prompt string", storage key, and storage function references, and a store() method which must return "True" or "False" and take arguments for the key and data to be stored.
The fact that our storage key is passed to our input "client" as an opaque item that must be passed back through its calls to our store() method is a bit subtle; but it allows us to use any single validation function for multiple form elements ... all names can be any string, all dates must pass some call to time.strptime() or some third party parser ... and so on.
The main point is that I can create better forms classes which implement data validation methods as appropriate to the data being gathered and stored. The input example will work for our original dumb forms, but it will work better with forms that return meaningful results from our calls to store() (A better interface between forms and input handling might supply "error" and "help" prompts as well as the simple short "input" prompt we show here. A more complex system might pass "datum" objects through the get_hints() methods. That would require that the forms class instantiate such objects and store a list of them instead of the tuples I'm showing here).
Another benefit is that I can also write other input functions (or classes which implement such functions) that can also use this same interface to any form. Thus I could write some HTML rendering and CGI processing which could use all of the forms that I had developed with no changes to my data validation semantics.
(In this example I'm using the get_hints() method as hints for my crude output function as well as my inputs. I'm only doing this to keep the example simple. In practice I'd want to separate input hinting from output handling).
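To show what a form with real validation might look like behind the same get_hints()/store interface, here is a rough Python 3 sketch. The PersonForm class, its field keys, the date format, and the max_tries limit are all assumptions for illustration. The input function is passed in as a parameter so the example can run with scripted answers instead of a terminal.

```python
from datetime import datetime


class PersonForm:
    """Hypothetical form whose store methods validate before saving."""

    def __init__(self):
        self.contents = {}

    def store_text(self, key, data):
        # Reject empty or whitespace-only answers.
        if not data.strip():
            return False
        self.contents[key] = data.strip()
        return True

    def store_date(self, key, data):
        # Accept only dates written as YYYY-MM-DD.
        try:
            self.contents[key] = datetime.strptime(data, "%Y-%m-%d").date()
        except ValueError:
            return False
        return True

    def get_hints(self):
        # (prompt, storage key, storage function), as in the example above.
        return [
            ("Surname: ", "surname", self.store_text),
            ("Date of Birth (YYYY-MM-DD): ", "dob", self.store_date),
        ]


def form_input(form, ask=input, max_tries=3):
    """Prompt for each field, retrying up to max_tries times on bad input."""
    for prompt, key, store in form.get_hints():
        for _ in range(max_tries):
            if store(key, ask(prompt)):
                break


# Scripted answers stand in for a user at the terminal.
answers = iter(["", "Lovelace", "not a date", "1815-12-10"])
form = PersonForm()
form_input(form, ask=lambda prompt: next(answers))
print(form.contents["surname"], form.contents["dob"].isoformat())
```

The input loop never needs to know what a valid date looks like; it only reacts to the True or False that each store function returns.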
If you are reading in several fields, you might want to do something like this:
field_defs = [
    ('name', 'Name'),
    ('dob' , 'Date of Birth'),
    ('sex' , 'Gender'),
    #...
]
# Figure out the widest description.
maxlen = max(len(descr) for (name, descr) in field_defs)
fields = {}
for (name, descr) in field_defs:
    # Pad to the widest description.
    print '%-*s:' % (maxlen, descr),
    fields[name] = raw_input()
# You should access the fields directly from the fields variable.
# But if you really want to access the fields as local variables...
locals().update(fields)
print name, dob, sex
"10 times... poor code readability"
Not really. You'll have to provide something more complex than that.
20 lines of code is hardly a problem. You can easily write more than 20 lines of code trying to save yourself from simply writing 20 lines of code.
You should, also, read the description of raw_input. http://docs.python.org/library/functions.html#raw_input
It writes a prompt. Your four lines of code is really
x = raw_input( '1' )
y = raw_input( '2' )
You can't simplify this much more.
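If the prompts really do follow a pattern, one hedged alternative is to keep them in a list and loop. The sketch below is Python 3, where input() replaces raw_input(). The ask_all helper is hypothetical, and the input function is a parameter so scripted answers can be used for demonstration.

```python
def ask_all(prompts, ask=input):
    """Show each prompt in turn and return the answers in the same order."""
    return [ask(prompt) for prompt in prompts]


prompts = ["%d: " % number for number in range(1, 11)]

# Scripted answers stand in for a user typing at the terminal.
scripted = iter("abcdefghij")
answers = ask_all(prompts, ask=lambda prompt: next(scripted))
print(answers[0], answers[-1])  # a j
```

In a real program you would call ask_all(prompts) with the default ask=input and index into the returned list.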
