Python regex issue with strings

Hi, I wonder if someone can help. I've moved to Python from Perl and for the most part I love it. However, I struggle with regex in Python; for me it is not as strong or as easy as in Perl. How do I use a list of exemption values (exemption_list) to search another list that is being iterated in a for loop? The problem is that the values in the for loop are slightly different from the exemptions.
For example, one of the exemptions is the string "default", but the variable coming in to be searched is default_10 or default_20. Likewise, none is the search pattern but the share is called none_20, etc. I don't really want to iterate over the search patterns, as I am already iterating over the shares, which come from another subprocess output. So it never finds the string, because it is looking for default_20 rather than default. How can I break down the variable coming in from shares_list so that Python uses default from the variable to search against the strings in exemption_list? The share variable is, as stated, generated differently from the subprocess output of different systems.
Many thanks.
In Perl it would be easy:
if ( $share =~ /^.*_[\d\d]/ && $share !~ /$cust_id|$exemptions/ ) {
Python:
import re
exemption_list = "none temp swap container"
# shares_list is dynamic and comes in with values such as these
shares_list = ['none_20', 'temp_20', 'testtmp']
def process_share_information(shares_list, customer_id):
    for share in shares_list:
        share_match = re.search(share, exemption_list)
        if not share_match:
            print 'we have found a potentially bad share not in exemptions'

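This strips a trailing _\d\d from a string:
re.sub(r'_\d\d$', '', string)
So, to check for an exemption, do the following. (Because exemption_list is a string here, in performs a substring test; split it into a list first if you need exact matches.)
>>> re.sub(r'_\d\d$', '', "none_20") in exemption_list
True
If the searched words come in a more general format than name_\d\d, iterate over the exemptions instead.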
>>> exemptions = "none temp swap container".split()
>>> shares_list = "this is dynamic and comes in with values such as none_20 asdfnone anonea temp_20, testtmp etc"
>>> for e in exemptions:
...     print(e)
...     print(e in shares_list)
...     print(re.findall(r'\b\S*?{}\S*?\b'.format(e), shares_list))
...     print()
...
none
True
['none_20', 'asdfnone', 'anonea']
temp
True
['temp_20']
swap
False
[]
container
False
[]
Or, if you only need one result for the whole string:
>>> any(e in shares_list for e in exemptions)
True
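As a minimal sketch that combines the ideas above (not the original poster's code): assuming every share name is either a bare name or a name followed by an underscore and two digits, the suffix can be stripped once per share and the result looked up in a set of exemptions, so the exemptions never need to be looped over. The function name find_suspect_shares and the sample share names are hypothetical.

```python
import re

# Exemptions as a set, so "in" is an exact-match test rather than a substring test.
EXEMPTIONS = {"none", "temp", "swap", "container"}

# Matches a trailing underscore followed by exactly two digits, e.g. "_20".
SUFFIX = re.compile(r"_\d\d$")


def find_suspect_shares(shares, exemptions=EXEMPTIONS):
    """Return the shares whose base name is not an exemption (hypothetical helper)."""
    suspects = []
    for share in shares:
        base = SUFFIX.sub("", share)  # "none_20" -> "none"; "testtmp" is unchanged
        if base not in exemptions:
            suspects.append(share)
    return suspects


print(find_suspect_shares(["none_20", "temp_20", "testtmp", "default_10"]))
```

With the exemptions above, this reports testtmp and default_10; adding "default" to EXEMPTIONS would exempt default_10 as well.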

Related

I have this very long piece of code that defines a list using x[0]=2 and so on in python 3.7. Why doesn't it work?

This is what my code is at the moment, and I was wondering if I can assign list values like this. (I'm making a quote generator.)
import random as random
quotes = []
authors = []
quotes[0] = "I have a new philosophy. I'm only going to dread one day at a time."
authors[0] = "Charles Schulz"
quotes[1] = "Reality is the leading cause of stress for those in touch with it."
...
authors[5577] = "Henry David Thoreau"
quotes[5578] = "Sometimes the cards we are dealt are not always fair. However you must keep smiling and moving on."
authors[5578] = "Tom Jackson"
x=random.randint(2,5577)
for y in range(0,5579,1):
    print(quotes[y]+" "+authors[y])
You are getting an "index out of range" error because you are assigning to elements of an empty list. You can either initialize the authors and quotes lists with the required length:
authors = [None] * 5579
quotes = [None] * 5579
Or, better, add elements to the lists with the append method:
authors = []
quotes = []
quotes.append("I have a new philosophy. I'm only going to dread one day at a time.")
authors.append("Charles Schulz")
...
quotes.append("Sometimes the cards we are dealt are not always fair. However you must keep smiling and moving on.")
authors.append("Tom Jackson")
for author, quote in zip(authors,quotes):
    print("{} {}".format(author, quote))
As stated in the comments, if you create an empty list, you can't assign values with the [] operator, that's for referencing elements that already exist inside the list (so you can update or read them). For adding new values to an empty list we use the append method like this:
quotes = []
quotes.append("I have a new philosophy. I'm only going to dread one day at a time.")
print(quotes[0])
>>> "I have a new philosophy. I'm only going to dread one day at a time."
You can now modify it because it exists:
quotes[0] = "Reality is the leading cause of stress for those in touch with it."
print(quotes[0])
>>> "Reality is the leading cause of stress for those in touch with it."
If you try to access index 1 it will give you an IndexError because it only has index 0 (in my example).
In the "More on Lists" section of the Python tutorial, the description of append() suggests a roundabout alternative to what you're trying to do:
>>> quotes = []
>>> quotes[0:] = ["first quote"]
>>> quotes[1:] = ["second quote"]
>>> quotes
['first quote', 'second quote']
This requires a slice on the left-hand side and a list (or other iterable) on the right-hand side, and it makes sure that you can't access list elements that haven't had anything assigned to them yet.
As I mentioned in the comment above, the IndexError comes from the list not having that element, which prevents one from mistakenly doing something like quotes[55] and expecting it to work too early on.
You are assigning to quotes[0] and so on, but quotes is empty and does not contain an index 0, 1, etc. You should use the append() method instead to add elements to your list.
Or, if you really want to use quotes[0] and so on, then do
quotes = [None] * 5579
or
quotes = [''] * 5579
at the start of the program.
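For the quote generator itself, a minimal sketch (an assumption about the goal, not code from any of the answers) is to keep each quote together with its author in one list of tuples and let random.choice pick an entry, which avoids index bookkeeping entirely. The names QUOTES and random_quote are hypothetical, and only two of the quotes from the question are shown.

```python
import random

# Each entry pairs a quote with its author, so the two can never get out of step.
QUOTES = [
    ("I have a new philosophy. I'm only going to dread one day at a time.",
     "Charles Schulz"),
    ("Sometimes the cards we are dealt are not always fair. "
     "However you must keep smiling and moving on.",
     "Tom Jackson"),
]


def random_quote(pairs=QUOTES):
    """Return one randomly chosen quote followed by its author."""
    quote, author = random.choice(pairs)
    return "{} {}".format(quote, author)


print(random_quote())
```

New quotes are then added by extending the list literal or by calling QUOTES.append((quote, author)).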

converting tuple to set for use in if-in statement in python?

I have a tuple b which looks like this: (u'3.7', 9023). I want to use it in the following statement:
if list(self.ballot_number) == msg.ballot_number and b in waitfor:
    print "hello"
I have checked, and the ballot_number part of the if condition is working fine. It's the second part that's not returning True. The waitfor set looks like this: set([((u'3.0', 9002), (u'3.1', 9005), (u'3.2', 9008), (u'3.3', 9011), (u'3.4', 9014), (u'3.5', 9017), (u'3.6', 9020), (u'3.7', 9023))]).
The values of the tuple are there in the set, but it is not able to match them, probably because of different data types. I don't want to split the tuple into individual elements, as I have to use it as a whole later in the code. How can I make my if statement work?
This is how the set is built:
waitfor = set()
print "in scout"
for a in self.acceptors:
    print "acceptor",a
    a = tuple(tuple(p) for p in self.acceptors)
    waitfor.add(a)
print "waitfor",waitfor
The problem is that you’re not building the set that it seems you think you’re building, and as a result it can’t be used the way you want to use it.
Your code does this:
waitfor = set()
print "in scout"
for a in self.acceptors:
    print "acceptor",a
    a = tuple(tuple(p) for p in self.acceptors)
    waitfor.add(a)
print "waitfor",waitfor
So, for each acceptor, you’re not adding that acceptor to the set, you’re adding the tuple of all acceptors to the set. You do this over and over, but because it’s a set, and you’re adding the same tuple over and over, you end up with just one element, that big tuple of all of the acceptors. Which is exactly what you see—notice the extra parentheses in your output, and the fact that if you print out len(waitfor) it’s just 1.
And this means that a pair such as b, which you later check with b in waitfor, is never going to be in waitfor, because the only thing that’s actually in it is the giant tuple that contains all those pairs, not any of the pairs themselves.
It’s like adding “The State of California” to a phonebook millions of times, instead of adding the millions of Californians, and then asking “Is Jerry Brown in the phonebook?” No, he’s not. There’s no bug in how you’re searching the phonebook; the bug was in creating the phonebook. So that’s the part you need to fix.
So, what you want is:
waitfor = set()
print "in scout"
for a in self.acceptors:
    print "acceptor",a
    waitfor.add(tuple(a))
print "waitfor",waitfor
Or, more simply, this one-liner:
print "in scout"
waitfor = set(tuple(p) for p in self.acceptors)
print "waitfor", waitfor
Or, if your version of Python is new enough for set comprehensions (2.7 or later), it’s slightly more readable:
print "in scout"
waitfor = {tuple(p) for p in self.acceptors}
print "waitfor", waitfor
You've got too many brackets in your set, so it's only looking for a single element.
len(waitfor)
# 1
If you try:
waitfor = set([(u'3.0', 9002), (u'3.1', 9005), (u'3.2', 9008), (u'3.3', 9011), (u'3.4', 9014), (u'3.5', 9017), (u'3.6', 9020), (u'3.7', 9023)])
Then your test:
(u'3.7', 9023) in waitfor
# True
Will work!
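The difference between the two sets can be checked directly. This is a small sketch, assuming the acceptors arrive as a list of two-element lists (the question does not show the exact type of self.acceptors); the variable names acceptors and wrong are hypothetical.

```python
# Assumed shape of the acceptors: a list of [id, port] lists.
acceptors = [[u"3.6", 9020], [u"3.7", 9023]]
b = (u"3.7", 9023)

# One element per acceptor: membership tests on individual pairs succeed.
waitfor = {tuple(p) for p in acceptors}

# What the question's loop builds: a set holding ONE tuple of all pairs.
wrong = {tuple(tuple(p) for p in acceptors)}

print(len(waitfor), b in waitfor)
print(len(wrong), b in wrong)
```

The pair is found in waitfor but not in wrong, because the only element of wrong is the outer tuple that contains every pair.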

How to implement the concept of cellular automata in python

I am fairly new to Python (and programming in general, having just started 2 months ago). I have been tasked with creating a program that takes a user's starting string (e.g. "11001100") and prints each generation based on a set of rules. It then stops when it repeats the user's starting string. However, I am clueless as to where to even begin. I only vaguely understand the concept of cellular automata and therefore am at a loss as to how to implement it in a script.
Ideally, it would take the user's input string "11001100" (gen0), look at the rule set I created and convert it, so "11001100" would become "00110011" (gen1), and then convert it again (gen2) and again (gen3) until it is back to the original input the user provided (gen0). My attempt and my rule set are below:
print("What is your starting string?")
SS = input()
gen = [SS]
while 1:
    for i in range(len(SS)):
        if gen[-1] in gen[:-2]:
            break
for g in gen:
    print(g)
newstate = {
    # This is used to convert the string. We break up the user's string into
    # threes, i.e. if the user enters 11001100, we start with the leftmost digit
    # "1" and look at its neighbors (x-1 and x+1), in this case "0" and "1".
    # Using these three numbers we compare it to the chart below:
    '000': 1,
    '001': 1,
    '010': 0,
    '011': 0,
    '100': 1,
    '101': 1,
    '110': 0,
    '111': 0,
}
I would greatly appreciate any help or further explanation/dummy proof explanation of how to get this working.
Assuming that newstate is a valid dict where the key/value pairs correspond to your state replacement (if you want 100 to convert to 011, newstate would have newstate['100'] == '011'), you can use a generator expression over the pieces of the string:
changed = ''.join(newstate[c] for c in prev)
where prev is your previous state string. For example:
>>> newstate = {'1':'0','0':'1'}
>>> ''.join(newstate[c] for c in '0100101')
'1011010'
You can then update a string by feeding it back into the same expression:
>>> changed = '1010101'
>>> changed = ''.join(newstate[c] for c in changed)
>>> changed
'0101010'
You have the basic flow down in your original code; you just need to refine it. The pseudocode would look something like:
newstate = dict with key/value mapping pairs
original = input
changed = original->after changing
while changed != original:
    changed = changed->after changing
    print changed
The easiest way to do this would be with the re.sub() method in the python regex module, re.
import re
def replace_rule(string, new, pattern):
    return re.sub(pattern, new, string)
def replace_example(string):
    pattern = r"100"
    replace_with = "1"
    return re.sub(pattern, replace_with, string)
replace_example("1009")
=> '19'
replace_example("1009100")
=> '191'
Regex is a way to match strings to certain regular patterns, and do certain operations on them, like sub, which finds and replaces patterns in strings. Here is a link: https://docs.python.org/3/library/re.html
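Neither answer applies the question's three-cell rule table directly, so here is a hedged sketch of one way to do it. It assumes the string wraps around at the ends (the question's comment pairs the leftmost "1" with the neighbors "0" and "1", which suggests this), stores the table values as strings rather than integers so they can be joined, and adds a hypothetical max_steps guard in case a rule set never returns to the start. The names next_generation and run_until_repeat are illustrative.

```python
# The question's rule table, with string values so results can be joined.
NEWSTATE = {
    "000": "1", "001": "1", "010": "0", "011": "0",
    "100": "1", "101": "1", "110": "0", "111": "0",
}


def next_generation(state, rules=NEWSTATE):
    """Apply the rule table to every cell, wrapping around at both ends."""
    n = len(state)
    return "".join(
        rules[state[(i - 1) % n] + state[i] + state[(i + 1) % n]]
        for i in range(n)
    )


def run_until_repeat(start, rules=NEWSTATE, max_steps=1000):
    """Collect generations until the starting string comes back."""
    generations = [start]
    current = next_generation(start, rules)
    while current != start and len(generations) < max_steps:
        generations.append(current)
        current = next_generation(current, rules)
    return generations


for g in run_until_repeat("11001100"):
    print(g)
```

With this particular table each cell simply flips, so "11001100" becomes "00110011" and the following generation is the starting string again.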

IDLE settings change, comment key change

This is not a programming question but a question about IDLE. Is it possible to change the comment key from '#' to something else?
Here is the part that is not going to work:
array = []
y = array.append(str(words2)) <-- another part of the program
Hash = y.count(#) <-- part that won't work
print("There are", Hash, "#'s")
No, that isn't specific to IDLE; it is part of the language.
EDIT: I'm pretty sure you want to use
y.count('#') # note the quotes
Remember, one of the strengths of Python is portability. Writing a program that would only work with your custom version of the interpreter would throw away that strength.
As a rule of thumb, any time you find yourself thinking that the solution is to rewrite part of the language, you might be heading in the wrong direction.
You need to call count on the string, not the list. Also note that append returns None, so don't assign its result to y:
array = []
array.append(str(words2))  # another part of the program
Hash = array[0].count('#')  # note the quotes, and that count is called on an element of the list, not the whole list
print("There are", Hash, "#'s")
For example:
>>> l = []
>>> l.append('#$%^&###%$^^')
>>> l
['#$%^&###%$^^']
>>> l.count('#')
0
>>> l[0].count('#')
4
count is looking for an exact match and '#$%^&###%$^^' != '#'. You can use it on a list like so:
>>> l =[]
>>> l.append('#')
>>> l.append('?')
>>> l.append('#')
>>> l.append('<')
>>> l.count('#')
2
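If the list holds several strings and the goal is the total number of '#' characters across all of them, one possible sketch is to sum str.count over the elements. The helper name count_char and the sample data are hypothetical.

```python
def count_char(strings, char="#"):
    """Total occurrences of char across every string in the list."""
    return sum(s.count(char) for s in strings)


words = ["#$%^&###%$^^", "no hashes here", "#"]
print("There are", count_char(words), "#'s")
```

This differs from list.count, which only counts elements that are exactly equal to its argument.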

Trying to generate all sentences of a simple formal grammar

I am new to Python and am trying to generate all possible sentences of a grammar.
Here is the grammar:
#set of non terminals
N = ('<subject>', '<predicate>', '<noun phrase>', '<noun>', '<article>', '<verb>', '<direct object>')
#set of terminals
T = ('the', 'boy', 'dog', 'bit')
#productions
P = [ ('Sigma', ['<subject>', '<predicate>']), \
('<subject>', ['<noun phrase>']), \
('<predicate>', ['<verb>']), \
('<predicate>', ['<verb>','<direct object>']), \
('<noun phrase>', ['<article>','<noun>']), \
('<direct object>', ['<noun phrase>']), \
('<noun>', ['boy']), \
('<noun>', ['dog']), \
('<article>', ['the']), \
('<verb>', ['bit']) ]
Here is my attempt. I am using a queue class to implement it methodically:
# language defined by the previous grammar.
Q = Queue()
Q.enqueue(['Sigma'])
found = 0
while 0 < len(Q):
    print "One while loop done"
    # Get the next sentential form
    sf = Q.dequeue()
    sf1 = [y for y in sf]
    for production in P:
        for i in range(len(sf1)):
            if production[0] == sf1[i]:
                sf[i:i+1] = [x for x in production[1]]
                Q.enqueue(sf)
                Q.printQ()
I am getting an infinite loop, and I am also facing an issue with shallow versus deep copies: if I change one copy of sf, everything in the queue changes too. Any help, directions, or tips would be appreciated.
Here is the expected output:
The dog bit the boy
The boy bit the dog
The boy bit the boy
The dog bit the dog
The dog bit
The boy bit
"I am facing some issue with shallow-deep copy, if I change one copy of sf, everything in queue changes too"
Yes. In Python, a list is an object with its own identity. So:
Q.enqueue(['Sigma'])
creates a (one-element) list and enqueues a reference to it.
sf = Q.dequeue()
pops that reference from Q and assigns it to variable 'sf'.
sf[i:i+1] = ...
makes a change to that list (the one that 'sf' refers to).
Q.enqueue(sf)
enqueues a reference to that same list.
So there's only one list object involved, and Q just contains multiple references to it.
Instead, you presumably want each entry in Q to be a reference to a separate list (sentential form), so you have to create a new list for each call to Q.enqueue.
Depending on how you fix that, there might or might not be other problems in the code. Consider:
(1) Each sentence has multiple derivations, and you only need to 'find' one (e.g., the leftmost derivation).
(2) In general, though not in your example grammar, a production's RHS might have more than one occurrence of a non-terminal (e.g. if COND then STMT else STMT), and those occurrences need not derive the same sub-forms.
(3) In general, a grammar can generate an infinite set of sentences.
By the way, to copy a list in Python, instead of saying
copy = [x for x in original]
it's simpler to say:
copy = original[:]
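Putting these points together, here is a sketch of one possible fix (not the original poster's Queue class; it substitutes collections.deque). It builds a new list for every enqueued sentential form and expands only the leftmost non-terminal, so each sentence is derived once. The function name generate_sentences and the limit parameter, a guard for grammars that generate infinitely many sentences, are hypothetical.

```python
from collections import deque

# The productions from the question.
P = [
    ("Sigma", ["<subject>", "<predicate>"]),
    ("<subject>", ["<noun phrase>"]),
    ("<predicate>", ["<verb>"]),
    ("<predicate>", ["<verb>", "<direct object>"]),
    ("<noun phrase>", ["<article>", "<noun>"]),
    ("<direct object>", ["<noun phrase>"]),
    ("<noun>", ["boy"]),
    ("<noun>", ["dog"]),
    ("<article>", ["the"]),
    ("<verb>", ["bit"]),
]


def generate_sentences(productions, start="Sigma", limit=1000):
    """Breadth-first leftmost derivation; returns at most `limit` sentences."""
    nonterminals = {lhs for lhs, _ in productions}
    queue = deque([[start]])
    sentences = []
    while queue and len(sentences) < limit:
        form = queue.popleft()
        # Position of the leftmost non-terminal, or None if the form is a sentence.
        index = next(
            (i for i, sym in enumerate(form) if sym in nonterminals), None
        )
        if index is None:
            sentences.append(" ".join(form))
            continue
        for lhs, rhs in productions:
            if lhs == form[index]:
                # Slicing and + build a NEW list, so queued forms never alias.
                queue.append(form[:index] + rhs + form[index + 1:])
    return sentences


for sentence in generate_sentences(P):
    print(sentence)
```

For the grammar above this yields the six expected sentences, in lower case because the terminals are lower case.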
I created a simple grammar that allows you to specify different sentences in terms of alternatives and options. Sentences that are described with that grammar can be parsed. The attributed grammar is described using Coco/R, for which there is a Python version (http://www.ssw.uni-linz.ac.at/Coco/#Others). I am more familiar with C#, so I created a C# project here that can work as an example for you: https://github.com/abeham/Sentence-Generator.
For instance, parsing "(This | That) is a [nice] sentence" with the parser of that simple grammar creates four sentences:
* This is a sentence
* This is a nice sentence
* That is a sentence
* That is a nice sentence
Only finite sentences can be created with that grammar since there is no symbol for repetition.
I know that there already exists an accepted answer, but I hope that this answer will also be of value to those, like me, that arrived here looking for a generic solution. At least I didn't find anything like that on the web, which is why I created the github project.
