Storing and evaluating nested string elements

Given the exampleString = "[9+[7*3+[1+2]]-5]"
How does one extract and store elements enclosed by [] brackets, and then evaluate them in order?
1+2 --+
      |
   7*3+3 --+
           |
        9+24-5
Does one have to create some kind of nested list? Sorry for this somewhat broad question and bad English.
I see, this question is really too broad... Is there a way to create a nested list from that string? Or maybe I should simply do a regex search for every element and evaluate each? The nested list option (if it exists) would IMO be a "cleaner" approach than looping over the same string and evaluating until there are no [] brackets left.

Have a look at the pyparsing module and some of the examples it comes with (the four-function calculator does what you want and more).
PS. In case the size of that code worries you, look again: most of it can be stripped. The lower half is just tests. The upper part can be stripped of things like support for the e/pi/... constants, trigonometric functions, etc. I'm sure you can cut it down to 10 lines for what you need.

A good starting point is the shunting-yard algorithm.
There are multiple Python implementations available online; here is one.
The algorithm can be used to translate infix notation into a variety of representations. If you are not constrained with regards to which representation you can use, I'd recommend considering Reverse Polish notation as it's easy to work with.
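As a rough illustration of that suggestion, here is a minimal shunting-yard sketch (written for Python 3) that converts the bracketed string to Reverse Polish notation and then evaluates it. The names tokenize, to_rpn, eval_rpn and OPS are made up for this example, and it assumes balanced [] brackets, non-negative integer literals and only the four left-associative operators + - * /.

```python
import operator

# Binary operators: symbol -> (precedence, function). All are left-associative.
OPS = {
    "+": (1, operator.add),
    "-": (1, operator.sub),
    "*": (2, operator.mul),
    "/": (2, operator.truediv),
}

def tokenize(text):
    """Split the string into integers and single-character tokens."""
    tokens, i = [], 0
    while i < len(text):
        ch = text[i]
        if ch.isdigit():
            j = i
            while j < len(text) and text[j].isdigit():
                j += 1
            tokens.append(int(text[i:j]))
            i = j
        elif ch in OPS or ch in "[]":
            tokens.append(ch)
            i += 1
        elif ch.isspace():
            i += 1
        else:
            raise ValueError("unexpected character: %r" % ch)
    return tokens

def to_rpn(tokens):
    """Shunting-yard: reorder infix tokens into Reverse Polish notation."""
    output, stack = [], []
    for tok in tokens:
        if isinstance(tok, int):
            output.append(tok)
        elif tok == "[":
            stack.append(tok)
        elif tok == "]":
            while stack[-1] != "[":
                output.append(stack.pop())
            stack.pop()  # discard the matching "["
        else:
            # Operators of higher or equal precedence are emitted first.
            while stack and stack[-1] != "[" and OPS[stack[-1]][0] >= OPS[tok][0]:
                output.append(stack.pop())
            stack.append(tok)
    while stack:
        output.append(stack.pop())
    return output

def eval_rpn(rpn):
    """Evaluate an RPN token list with a value stack."""
    stack = []
    for tok in rpn:
        if isinstance(tok, int):
            stack.append(tok)
        else:
            right = stack.pop()
            left = stack.pop()
            stack.append(OPS[tok][1](left, right))
    return stack[0]

rpn = to_rpn(tokenize("[9+[7*3+[1+2]]-5]"))
print(rpn)            # [9, 7, 3, '*', 1, 2, '+', '+', '+', 5, '-']
print(eval_rpn(rpn))  # 28
```

Once the expression is in RPN, evaluation needs no knowledge of brackets or precedence, which is why that representation is easy to work with.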

Here is a regex solution:
import re
def evaluatesimple(s):
    return eval(s)
def evaluate(s):
    while 1:
        simplesums=re.findall(r"\[([^\]\[]*)\]",s)
        if (len(simplesums) == 0):
            break
        replacements=[('[%s]' % item,str(evaluatesimple(item))) for item in simplesums]
        for r in replacements:
            s = s.replace(*r)
    return s
print evaluate("[9+[7*3+[1+2]]-5]")
But if you want to go the whole hog and build a tree to evaluate later, you can use the same technique but store the expressions and sub expressions in a dict:
def tokengen():
    for c in 'abcdefghijklmnopqrstuvwyxz':
        yield c
def makeexpressiontree(s):
    d=dict()
    tokens = tokengen()
    while 1:
        simplesums=re.findall(r"\[([^\]\[]*)\]",s)
        if (len(simplesums) == 0):
            break
        for item in simplesums:
            t = tokens.next()
            d[t] = item
            s = s.replace("[%s]"% item,t)
    return d
def evaltree(d):
    """A simple dumb way to show in principle how to evaluate the tree"""
    result=0
    ev={}
    for i,t in zip(range(len(d)),tokengen()):
        ev[t] = eval(d[t],ev)
        result = ev[t]
    return result
s="[9+[7*3+[1+2]]-5]"
print evaluate(s)
tree=makeexpressiontree(s)
print tree
print evaltree(tree)
(Edited to extend my answer)
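The question also asks whether the string can be turned into a nested list. A minimal sketch of that idea follows (Python 3); parse_nested and evaluate_nested are hypothetical names, the brackets are assumed to be balanced, and eval is used for the flat arithmetic just as in the code above, so this is only suitable for trusted input.

```python
def parse_nested(text):
    """Build a nested list: plain text becomes strings, each [...] becomes a sub-list."""
    root = []
    stack = [root]  # stack[-1] is the list currently being filled
    for ch in text:
        if ch == "[":
            child = []
            stack[-1].append(child)
            stack.append(child)
        elif ch == "]":
            stack.pop()
        elif stack[-1] and isinstance(stack[-1][-1], str):
            stack[-1][-1] += ch  # extend the current run of plain text
        else:
            stack[-1].append(ch)
    return root

def evaluate_nested(node):
    """Evaluate sub-lists first and substitute each result into its parent."""
    parts = []
    for item in node:
        if isinstance(item, list):
            parts.append(str(evaluate_nested(item)))
        else:
            parts.append(item)
    return eval("".join(parts))

tree = parse_nested("[9+[7*3+[1+2]]-5]")
print(tree)                   # [['9+', ['7*3+', ['1+2']], '-5']]
print(evaluate_nested(tree))  # 28
```

The nested list keeps the structure of the original string, so the evaluation order (innermost brackets first) falls out of the recursion.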

Related

is there a way to order a set to it compare to the original string to test if the string has a letter twice or all of the letters are unique [duplicate]

Context: I'm a CS n00b working my way through "Cracking the Coding Interview." The first problem asks to "implement an algorithm to determine if a string has all unique characters." My (likely naive) implementation is as follows:
def isUniqueChars2(string):
    uchars = []
    for c in string:
        if c in uchars:
            return False
        else:
            uchars.append(c)
    return True
The author suggests the following implementation:
def isUniqueChars(string):
    checker = 0
    for c in string:
        val = ord(c) - ord('a')
        if (checker & (1 << val) > 0):
            return False
        else:
            checker |= (1 << val)
    return True
What makes the author's implementation better than mine (FWIW, the author's solution was in Java and I converted it to Python -- is my solution one that is not possible to implement in Java)? Or, more generally, what is desirable in a solution to this problem? What is wrong with the approach I've taken? I'm assuming there are some fundamental CS concepts (that I'm not familiar with) that are important and help inform the choice of which approach to take to this problem.

Here is how I would write this:
def unique(s):
    return len(set(s)) == len(s)
Strings are iterable so you can pass your argument directly to set() to get a set of the characters from the string (which by definition will not contain any duplicates). If the length of that set is the same as the length of the original string then you have entirely unique characters.
Your current approach is fine and in my opinion it is much more Pythonic and readable than the version proposed by the author, but you should change uchars to be a set instead of a list. Sets have O(1) membership test so c in uchars will be considerably faster on average if uchars is a set rather than a list. So your code could be written as follows:
def unique(s):
    uchars = set()
    for c in s:
        if c in uchars:
            return False
        uchars.add(c)
    return True
This will actually be more efficient than my version if the string is large and there are duplicates early, because it will short-circuit (exit as soon as the first duplicate is found).

Beautiful is better than ugly.
Your approach is perfectly fine. This is Python, where there are a bajillion ways to do something. (Yours is more beautiful too :)). But if you really want it to be more Pythonic and/or make it go faster, you could use a set, as F.J's answer has described.
The second solution just looks really hard to follow and understand.
(PS, dict is a built-in type. Don't override it :p. And string is a module from the standard library.)

str = raw_input("Enter string: ")
def isUnique():
    test_dict = dict()
    is_unique = True
    for c in str:
        if(not test_dict.get(c, False)):
            test_dict[c] = c
        else:
            is_unique = False
            break
    if is_unique:
        return "String is unique"
    else:
        return "String is not unique"
print(isUnique())

Your solution is not incorrect, but your variable dict is not actually a dictionary, which means it has to do a linear search to check for the character. The solution from the book does the check in constant time. I will say that the other solution is obnoxiously unreadable because it sets bits in a number to check whether each character is unique.

The solution you have translated from Java into Python is what's called a 'bit-twiddling' algorithm. The idea is that an integer can be treated in multiple ways: one, as a number; two, as a collection of bits (32 off/ons, or 64, or what-have-you). The algorithm bit-twiddles by saying each bit represents the presence or absence of a specific character: if the nth bit is 0, it sets it. If it's 1, the character that bit corresponds to has already been seen, so we know the characters are not all unique.
However, unless you need the efficiency, avoid bit-twiddling algorithms, as they're not as self-evident in how they work as non-bit-twiddles.
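To make the bit-per-character idea concrete, here is a small sketch (Python 3). The function names are invented for this example, and it assumes the input contains only the lowercase letters 'a' to 'z'; any character before 'a' would produce a negative shift and raise an error.

```python
def is_unique_lowercase(text):
    """Bitmask check; assumes text contains only the letters 'a' to 'z'."""
    seen = 0  # bit n is 1 once chr(ord('a') + n) has been seen
    for ch in text:
        bit = 1 << (ord(ch) - ord("a"))
        if seen & bit:   # bit already set: ch is a repeat
            return False
        seen |= bit      # mark ch as seen
    return True

def seen_bits(text):
    """Return the mask after scanning text, as a 26-character binary string."""
    seen = 0
    for ch in text:
        seen |= 1 << (ord(ch) - ord("a"))
    return format(seen, "026b")

print(seen_bits("abd"))             # 00000000000000000000001011
print(is_unique_lowercase("abcd"))  # True
print(is_unique_lowercase("abca"))  # False
```

Reading the mask from the right, bit 0 stands for 'a', bit 1 for 'b' and bit 3 for 'd', which is the same bookkeeping a set of characters would do.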

Your implementation takes O(n^2) time; the author's takes O(n).
In your implementation, the test "if c in uchars:" has to scan the list to check whether c is in it, and that scan takes time proportional to the length of the list.
So yours is not better than the author's...

Solution 1:
def is_unique(string):
    if len(string) > 128:
        return False
    unique_tracker = [False] * 128
    for char in string:
        if unique_tracker[ord(char)] == False:
            unique_tracker[ord(char)] = True
        else:
            return False
    return True
Solution 2:
def is_unique_bit(string):
    if len(string) > 128:
        return False
    unique_tracker = 0
    for char in string:
        ascii_val = ord(char)
        if (unique_tracker & (1 << ascii_val)) > 0:
            return False
        unique_tracker |= (1 << ascii_val)
    return True

The original question is along the lines of:
Implement an algorithm to determine if a string has all unique characters. What if you cannot use additional data structures?
Focus on the second sentence. It says we cannot use additional data structures, i.e. you need to consider the space complexity of your solution. Your solution uses an array and therefore does not meet the question's criteria.
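One way to read the "no additional data structures" constraint is to compare every pair of characters directly. The sketch below (Python 3, with a made-up function name) shows that trade-off; it is an illustration of the constraint, not the book's solution.

```python
def is_unique_no_extra(text):
    """Compare every pair of characters: O(n^2) time, O(1) extra space."""
    for i in range(len(text)):
        for j in range(i + 1, len(text)):
            if text[i] == text[j]:
                return False
    return True

print(is_unique_no_extra("abcd"))  # True
print(is_unique_no_extra("abca"))  # False
```

This uses no list, set or dict at all, at the cost of quadratic running time.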

Python: List item is empty, code to detect if it is and then put in a place holder value?

Hey I'm writing a program that receives a broadcast from Scratch and then determines based on the broadcast, where to proceed. The code turns the broadcast(list item) into a string and then breaks that string into a list using .split(). The only problem is the broadcast may only be 1 word instead of 2. Is there a way to check if one of the list items from .split() is empty and then change it to a place holder value?
Where I am having trouble
scratchbroadcast = str(msg[1])
BroadcastList = scratchbroadcast.split()
#starts the switch statement that interprets the message and proceeds
#to the appropriate action
v = BroadcastList[0]
w = BroadcastList[1]
if BroadcastList[1] == '':
    w = "na"

If BroadcastList contains only one word then BroadcastList will be a single-element list, e.g.
>>> "foo".split()
['foo']
Obviously we can't check whether the second item in the list is an empty string ''; there isn't a second element. Instead, check the length of the list:
w = "na" if len(BroadcastList) == 1 else BroadcastList[1]
Alternatively, use try to catch the IndexError (it's easier to ask for forgiveness than permission):
try:
    w = BroadcastList[1]
except IndexError:
    w = "na"
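Either check can be wrapped in a small helper. The sketch below (Python 3) uses a hypothetical function split_broadcast and pads the word list before slicing, so both variables always get a value.

```python
def split_broadcast(message, placeholder="na"):
    """Return (first_word, second_word), padding missing words with placeholder."""
    words = message.split()
    # Append two placeholders, then keep only the first two entries.
    first, second = (words + [placeholder, placeholder])[:2]
    return first, second

print(split_broadcast("motor on"))  # ('motor', 'on')
print(split_broadcast("pilight"))   # ('pilight', 'na')
```

Padding and slicing avoids both the length check and the try/except, at the cost of being slightly less explicit.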

Okay, first consider this: how about the third item? Or the fourth? Or the forty-second?
If the string doesn't contain a splitter character (e.g. a space), you wouldn't end up with a list of two items, one of which is blank; you would end up with a list of only one item.
In Python, the length of something is generally obtained through the built-in len() function:
len([]) # == 0
len(["foo"]) # == 1
len(["foo", "bar"]) # == 2
Therefore, you would do:
if len(broadcast_list) == 1:
    broadcast_list += [""]
Other ways of doing the same thing include broadcast_list.append("") and broadcast_list.extend([""]). Which one to use is completely up to you; += and .extend are more or less equivalent while .append can only add a single element.
Looking at the rest of your code, your case calls won't work like you expect them to: in Python, strings are truthy, so 'string' or 'otherString' is basically the same as True or True. or is strictly a boolean operator and you can't use it for 'either this or that'.
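A short sketch of that pitfall (Python 3); the command names are only examples, and the variable names are made up for the illustration.

```python
v = "motor"

# Parsed as (v == "pilight") or "camera". The non-empty string "camera" is
# truthy, so the whole expression is truthy whatever v is.
always_truthy = v == "pilight" or "camera"

# Membership test: true only when v equals one of the listed values.
matches = v in ("pilight", "camera")

print(repr(always_truthy))  # 'camera'
print(matches)              # False
```

Using in with a tuple (or writing out both comparisons) is the usual way to express 'either this or that'.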
Python is notorious for not having a switch statement. Your attempt at implementing one would actually be kind of cute had you gone through with it -- something like that can be a pretty good exercise in Python OOP and passing functions as first-class objects. (In my day-to-day use of Python I hardly ever need to do something like that, but it's great to have it in your conceptual toolkit.)
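For reference, a common stand-in for a switch statement is a dict that maps each command to a function. The sketch below (Python 3) is only an illustration; the handler functions and the returned strings are invented for the example.

```python
def handle_pilight(arg):
    return "pilight -> " + arg

def handle_motor(arg):
    return "motor -> " + arg

# Functions are first-class objects, so they can be stored as dict values.
HANDLERS = {"pilight": handle_pilight, "motor": handle_motor}

def dispatch(broadcast):
    """Look up the first word of the broadcast and call the matching handler."""
    words = broadcast.lower().split()
    if not words:
        return "empty broadcast"
    command = words[0]
    arg = words[1] if len(words) > 1 else "na"
    handler = HANDLERS.get(command)
    if handler is None:
        return "unknown command: " + command
    return handler(arg)

print(dispatch("Motor on"))  # motor -> on
print(dispatch("pilight"))   # pilight -> na
```

Adding a new command then means adding one function and one dict entry, with no change to the dispatch logic.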
You might be happy to learn that Python strings have a lower method; with it, your code would end up looking something like this:
v = broadcast_list[0].lower()
if v == 'pilight':
    pass  # ...
elif v == 'motor':
    pass  # ...
elif v == 'camera':
    pass  # ....
On a side note, you might want to have a look at PEP 8, which is the de facto standard for formatting Python code. If you want other people to be able to quickly figure out your code, you should conform at least to its most basic propositions, such as classes being CamelCased and variables in lowercase, rather than the other way around.

Using Python, How can I evaluate an expression in the form of prefix notation compacted?

I am currently working on the python problemsets on a website called singpath. The question is:
Prefix Evaluation
Create a function that evaluates the arithmetic expression in the form of prefix notation without spaces or syntax errors. The expression is given as a string, all the numbers in the expression are integer 0~9, and the operators are +(addition), -(subtraction), *(multiplication), /(division), %(modulo), which operate just the same as those in Python.
Prefix notation, also known as Polish notation, is a form of notation for logic, arithmetic, and algebra. It places operators to the left of their operands. If the arity of the operators is fixed, the result is a syntax lacking parentheses or other brackets that can still be parsed without ambiguity.
This seems simple enough but the string is condensed with no spaces in the input to splice out the data. How could I separate the data from the string without importing modules? Furthermore how could I use the results of the data to solve the given equation? Also please keep in mind that Singpath solutions must be in ONE function that cannot use methods that couldn't be found in the standard Python library. This also includes functions declared within the solution :S
Examples:
>>> eval_prefix("+34")
7
>>> eval_prefix("*-567")
-7
>>> eval_prefix("-*33+2+11")
5
>>> eval_prefix("-+5*+1243")
14
>>> eval_prefix("*+35-72")
40
>>> eval_prefix("%3/52")
1
See my point no spaces D:

OK, not as snazzy as alex jordan's lambda/reduce solution, but it doesn't choke on garbage input. It's sort of a recursive descent parser meets bubble sort abomination (I'm thinking it could be a little more efficient when it finds a solvable portion than just jumping back to the start. ;)
import operator
def eval_prefix(expr):
    d = {'+': operator.add,
         '-': operator.sub,
         '*': operator.mul,
         '/': operator.div, # Python 2; on 3.x use operator.floordiv to keep integer results
         '%': operator.mod}
    for n in range(10):
        d[str(n)] = n
    e = list(d.get(e, None) for e in expr)
    i = 0
    while i + 3 <= len(e):
        o, l, r = e[i:i+3]
        if type(o) == type(operator.add) and type(l) == type(r) == type(0):
            e[i:i+3] = [o(l, r)]
            i = 0
        else:
            i += 1
    if len(e) != 1:
        print 'Error in expression:', expr
        return 0
    else:
        return e[0]
def test(s, v):
    r = eval_prefix(s)
    print s, '==', v, r, r == v
test("+34", 7)
test("*-567", -7)
test("-*33+2+11", 5)
test("-+5*+1243", 14)
test("*+35-72", 40)
test("%3/52", 1)
test("****", 0)
test("-5bob", 10)

I think the crucial bit here is "all the numbers in the expression are integer 0~9". All numbers are single digit. You don't need spaces to find out where one number ends and the next one starts. You can access the numbers directly by their string index, as lckknght said.
To convert the characters in the string into integers for calculation, use ord(ch) - 48 (because "0" has the ASCII code 48). So, to get the number stored in position 5 of input, use ord(input[5]) - 48.
To evaluate nested expressions, you can call your function recursively. The crucial assumption here is that there are always exactly two operands to an operator.
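A minimal sketch of that recursive idea (Python 3), keeping everything inside one function by nesting a helper. The helper name parse is invented for this example. Division is written as // here, which is an assumption made so that the "%3/52" example from the question yields 1; input is assumed to be a valid expression.

```python
def eval_prefix(expr):
    def parse(pos):
        """Evaluate the sub-expression starting at pos; return (value, next_pos)."""
        ch = expr[pos]
        if ch.isdigit():
            return ord(ch) - 48, pos + 1  # "0" has ASCII code 48
        # ch is an operator: it is followed by exactly two operands.
        left, pos = parse(pos + 1)
        right, pos = parse(pos)
        if ch == "+":
            return left + right, pos
        if ch == "-":
            return left - right, pos
        if ch == "*":
            return left * right, pos
        if ch == "/":
            return left // right, pos  # assumption: integer division
        return left % right, pos       # remaining operator: "%"

    value, _ = parse(0)
    return value

print(eval_prefix("-*33+2+11"))  # 5
print(eval_prefix("%3/52"))      # 1
```

Each call consumes exactly one complete sub-expression and reports where it ended, so the caller knows where the next operand begins.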

Your "one function" limitation isn't as bad as you think. Python allows defining functions inside functions. In the end, a function definition is nothing more than assigning the function to a (usually new) variable. In this case, I think you will want to use recursion. While that can also be done without an extra function, you may find it easier to define an extra recursion function for it. This is no problem for your limits:
def eval_prefix (data):
    def handle_operator (operator, rest):
        # You fill this in.
    # and this, too.
That should be enough of a hint (if you want to use a recursive approach).

Well, does a one-liner fit in? (In Python 3, reduce is hidden in functools.)
Somewhat lispy :)
from functools import reduce  # needed on Python 3
eval_prefix = lambda inp:\
    reduce(lambda stack, symbol:\
        (
            (stack+[symbol]) if symbol.isdigit() \
            else \
            (
                stack[:-2]+\
                [str(
                    eval(
                        stack[-1]+symbol+stack[-2]
                    )
                )
                ]
            )
        ), inp[::-1], [])[0]

The hint that you are most likely looking for is "strings are iterable":
def eval_prefix(data):
    # setup state machine
    for symbol_ in data:
        # update state machine

Separating the elements of the string is easy. All elements are a single character long, so you can directly iterate over (or index) the string to get at each one. Or if you want to be able to manipulate the values, you could pass the string to the list constructor.
Here are some examples of how this can work:
string = "*-567"
# iterating over each character, one at a time:
for character in string:
    print(character) # prints one character from the string per line
# accessing a specific character by index:
third_char = string[2] # note indexing is zero-based, so 3rd char is at index 2
# transform string to list
list_of_characters = list(string) # will be ["*", "-", "5", "6", "7"]
As for how to solve the equation, I think there are two approaches.
One is to make your function recursive, so that each call evaluates a single operation or literal value. This is a little tricky, since you're only supposed to use one function (it would be much easier if you could have a recursive helper function that gets called with a different API than the main non-recursive function).
The other approach is to build up a stack of values and operations that you're waiting to evaluate while taking just a single iteration over the input string. This is probably easier given the one-function limit.
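The stack approach can be sketched as follows (Python 3). The function name is invented for this example; the sketch assumes valid input and scans the string from right to left so that both operands are already on the stack when an operator is reached. As an assumption, / is treated as integer division (//) so that the "%3/52" example gives 1.

```python
def eval_prefix_stack(expr):
    """Evaluate a compact prefix expression with one right-to-left pass."""
    stack = []
    for ch in reversed(expr):
        if ch.isdigit():
            stack.append(int(ch))
            continue
        # The most recently pushed value is the operator's left operand.
        left = stack.pop()
        right = stack.pop()
        if ch == "+":
            stack.append(left + right)
        elif ch == "-":
            stack.append(left - right)
        elif ch == "*":
            stack.append(left * right)
        elif ch == "/":
            stack.append(left // right)  # assumption: integer division
        else:
            stack.append(left % right)   # remaining operator: "%"
    return stack[0]

print(eval_prefix_stack("*-567"))      # -7
print(eval_prefix_stack("-+5*+1243"))  # 14
```

Because it uses a single loop and no helper function, this form fits the one-function limit without any nesting.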

What possible improvements can be made to a palindrome program?

I am just learning programming in Python for fun. I was writing a palindrome program and I thought of how I can make further improvements to it.
The first thing that came to my mind was to prevent the program from having to go through the entire word both ways, since we are just checking for a palindrome. Then I realized that the loop can be broken as soon as the first and the last characters don't match.
I then implemented them in a class so I can just call a word and get back true or false.
This is how the program stands as of now:
class my_str(str):
    def is_palindrome(self):
        a_string = self.lower()
        length = len(self)
        for i in range(length // 2):
            if a_string[i] != a_string[-(i+1)]:
                return False
        return True
this = my_str(raw_input("Enter a string: "))
print this.is_palindrome()
Are there any other improvements that I can make to make it more efficient?

I think the best way to write a palindrome checking function in Python is as follows:
def is_palindrome(s):
    return s == s[::-1]
(Add the lower() call as required.)

What about something way simpler? Why are you creating a class instead of a simple method?
>>> is_palindrome = lambda x: x.lower() == x.lower()[::-1]
>>> is_palindrome("ciao")
False
>>> is_palindrome("otto")
True

While other answers have talked about how best to approach the palindrome problem in Python, let's look at what you are doing.
You are looping through the string by using indices. While this works, it's not very Pythonic. Python for loops are designed to loop over the objects you want, not simply over numbers, as in other languages. This allows you to cut out a layer of indirection and make your code clearer and simpler.
So how can we do this in your case? Well, what you want to do is loop over the characters in one direction, and in the other direction - at the same time. We can do this nicely using the zip() and reversed() builtins. reversed() allows us to get the characters in the reversed direction, while zip() allows us to iterate over two iterators at once.
>>> a_string = "something"
>>> for first, second in zip(a_string, reversed(a_string)):
...     print(first, second)
...
s g
o n
m i
e h
t t
h e
i m
n o
g s
This is the better way to loop over the characters in both directions at once. Naturally this isn't the most effective way of solving this problem, but it's a good example of how you should approach things differently in Python.

Building on Lattyware's answer - by using the Python builtins appropriately, you can avoid doing things like a_string[-(i+1)], which takes a second to understand - and, when writing more complex things than palindromes, is prone to off-by-one errors.
The trick is to tell Python what you mean to do, rather than how to achieve it - so, the most obvious way, per another answer, is to do one of the following:
s == s[::-1]
list(s) == list(reversed(s))
s == ''.join(reversed(s))
Or various other similar things. All of them say: "is the string equal to the string backwards?".
If for some reason you really do need the optimisation that you know you've got a palindrome once you're halfway (you usually shouldn't, but maybe you're dealing with extremely long strings), you can still do better than index arithmetic. You can start with:
halfway = len(s) // 2
(where // forces the result to an integer, even in Py3 or if you've done from __future__ import division). This leads to:
s[:halfway] == ''.join(reversed(s[halfway:]))
This will work for all even-length s, but fail for odd length s because the RHS will be one element longer. But, you don't care about the last character in it, since it is the middle character of the string - which doesn't affect its palindromeness. If you zip the two together, it will stop after the end of the short one - and you can compare a character at a time like in your original loop:
for f,b in zip(s[:halfway], reversed(s[halfway:])):
    if f != b:
        return False
return True
And you don't even need ''.join or list or such. But you can still do better - this kind of loop is so idiomatic that Python has a builtin function called all just to do it for you:
all(f == b for f,b in zip(s[:halfway], reversed(s[halfway:])))
This says 'all the characters in the front half of the string are the same as the ones in the back half of the string written backwards'.
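Put together as a complete function, the pieces above might look like this (Python 3). The lower() call is carried over from the question's code; whether case should be ignored is an assumption about the requirements.

```python
def is_palindrome(s):
    """Compare the front half with the reversed back half, one pair at a time."""
    s = s.lower()
    halfway = len(s) // 2
    # zip stops at the shorter input, so the middle character of an
    # odd-length string is never compared with anything.
    return all(f == b for f, b in zip(s[:halfway], reversed(s[halfway:])))

print(is_palindrome("Otto"))   # True
print(is_palindrome("level"))  # True
print(is_palindrome("ciao"))   # False
```

Since all stops at the first False it sees, this keeps the early-exit behaviour of the original index-based loop.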

One improvement I can see would be to use xrange instead of range.

It probably isn't a faster implementation, but you could use a recursive test. Since you're learning, this construction is very useful in many situations:
def is_palindrome(word):
    if len(word) < 2:
        return True
    if word[0] != word[-1]:
        return False
    return is_palindrome(word[1:-1])
Since this is a rather simple (light) function, this construction might not be the fastest because of the overhead of calling the function multiple times, but in other cases where the computations are more intensive it can be a very effective construction.
Just my two cents.
