C++ Transpose a Python regular expression into PCRE - python

I need to port a regular expression I wrote in Python to C++ using the PCRE C++ wrapper (pcrecpp).
My original Python code does the following:
self.reg = re.compile('(?<![/,\-\s])\s+(?![/,\-\s])')
myfields = self.reg.split(line_of_text)
...
I tried to create a pcrecpp regular expression as follows:
pcrecpp::RE reg("(?<![/,\\-\\s])\\s+(?![/,\\-\\s])");
But it doesn't work: neither PartialMatch() nor FullMatch() matches.
Moreover, I haven't yet found a method that does something similar to Python's re.split().
I'm not very experienced with PCRE. Is there a specific syntax?
Any feedback?
Thanks.
z.

I believe the pcrecpp::RE class uses / as a delimiter, and its syntax is otherwise very similar to Perl's.
If so, you most likely need to escape the forward slash to fix your problem:
pcrecpp::RE("(?<![\\/\\s,-])\\s+(?![\\/\\s,-])").PartialMatch("foo bar");

Escape the forward slashes, like this:
(?<![\/,\-\s])\s+(?![\/,\-\s])
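Whatever the final pcrecpp pattern looks like, it helps to pin down what the Python original returns, so that a C++ port can be checked against it. A minimal sketch using the pattern from the question; the sample line and the split_fields helper name are made up for illustration:

```python
import re

# Pattern from the question: a run of whitespace that is neither preceded
# nor followed by '/', ',', '-' or another whitespace character.
FIELD_SEPARATOR = re.compile(r'(?<![/,\-\s])\s+(?![/,\-\s])')


def split_fields(line_of_text):
    """Split a line on the separator, as self.reg.split() does in the question."""
    return FIELD_SEPARATOR.split(line_of_text)


if __name__ == "__main__":
    # Hypothetical input: whitespace next to ',' or '-' does not split.
    print(split_fields("foo bar, baz  qux - quux"))
```

The fields this produces for a handful of representative lines can serve as reference output when testing the C++ version.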

Related

Avoid syntax error using hyphens in Python

I am writing a Python script to automate some simulations with Aspen Plus via its COM functions. But when I want to access molecular weight values, I have to write something like this:
import os
import win32com.client as win32
aspen = win32.Dispatch('Apwn.Document')
aspen.InitFromArchive2(os.path.abspath('Aspen\\Flash.bkp'))
MW = aspen.Tree.Data.Properties.Parameters.Pure Components.REVIEW-1.Input.VALUE.MW ACID.Value
But it raises a syntax error at REVIEW-1, because hyphens cannot be used in identifiers. How can I access a node with a name like that?
EDIT:
I replaced the dot syntax with the FindNode function of the Aspen COM interface, like this:
MW = aspen.Tree.FindNode("\\Data\\Properties\\Parameters\\Pure Components\\REVIEW-1")
But I still get a None object. However, this:
MW = aspen.Tree.FindNode("\\Data\\Properties\\Parameters\\Pure Components")
works, returning the "COMObject FindNode", so I think the problem here is the hyphen as well.
Thanks in advance!
Thanks for the tip
For names containing hyphens, the following raw string should work instead of escaping each "\" character:
MW = aspen.Tree.FindNode(r'\Data\Properties\Parameters\Pure Components\REVIEW-1\Input\VALUE')
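Note that a raw string and a string with doubled backslashes denote the same characters, so switching notation alone does not change what FindNode receives. A quick check in plain Python, with no Aspen installation required:

```python
# The same node path written with escaped backslashes and as a raw string.
escaped_path = "\\Data\\Properties\\Parameters\\Pure Components\\REVIEW-1"
raw_path = r"\Data\Properties\Parameters\Pure Components\REVIEW-1"

# Both literals produce identical strings; only the source notation differs.
same_string = escaped_path == raw_path
print(same_string)
```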
OK, I was racking my brain trying to solve it in Python, but it was easier to solve in Aspen by renaming the node. I've also noticed that spaces sometimes cause problems, so those names should be changed as well. In some cases that can't be done, or I don't know how, for example:
MW = aspen.Tree.FindNode("\\Data\\Properties\\Parameters\\Pure Components\\REVIEW1\\Input\\VALUE\\MW ACID")
It returns a None object and I don't know how to rename "MW ACID", but there is a roundabout way to get the value:
MW = aspen.Tree.FindNode("\\Data\\Properties\\Parameters\\Pure Components\\REVIEW1\\Input\\VALUE")
for o in MW.Elements:
    if o.Name == "MW ACID":
        MW_acid = o.Value
For now it works for me, although it will be slower because of the iteration. So if someone knows how to solve the problem in Python without renaming nodes, that would still be helpful. I tried using unicode and bytes notation for a non-breaking hyphen, but that didn't work either.
Regards!
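The loop above can be wrapped in a small helper. This is only a sketch: it assumes, as in the snippet above, that a node exposes an iterable Elements collection whose items have Name and Value attributes. The find_child name and the stand-in objects in the usage example are hypothetical and not part of the Aspen COM interface.

```python
from types import SimpleNamespace


def find_child(node, name):
    """Return the first child of `node` whose Name equals `name`, else None."""
    for child in node.Elements:
        if child.Name == name:
            return child
    return None


if __name__ == "__main__":
    # Stand-in objects with placeholder values, so the sketch runs
    # without Aspen Plus installed.
    value_node = SimpleNamespace(Elements=[
        SimpleNamespace(Name="MW", Value=1.0),
        SimpleNamespace(Name="MW ACID", Value=2.0),
    ])
    child = find_child(value_node, "MW ACID")
    print(child.Value if child is not None else None)
```

Returning None for a missing name mirrors what FindNode does in the question, so calling code can handle both lookups the same way.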

Replace method in Python and \

I am using the string replace method in Python and I am seeing something I cannot understand.
While converting a folder path as written in Python to Windows notation, I find that replace changes each double / into a double \ instead of a single \ as intended.
folder_im_wdows = folder_im_wdows.replace("//","\\")
Even more surprising, when I try the following workaround:
folder_im_wdows = folder_im_wdows.replace("//",chr(92))
Python does the same...
The original variable is: //xxxxx//xxxx//xxxx//xxxx//xxx//xxxxx
And I want to get -> \xxx\x\x\x
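What's happening with the replace method?
This is because Python's interactive interpreter echoes the repr of a string, in which backslashes are shown escaped. The string itself contains single backslashes.
Example from the interactive interpreter: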
>>> str = "abc//def//fgh"
>>> str.replace("//", "\\")
'abc\\def\\fgh'
>>> print(str.replace("//", "\\"))
abc\def\fgh
>>>
Also, you do need to write \\ rather than a single \ in the string literal, because the backslash character itself has to be escaped.
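The difference between what the prompt echoes and what the string contains can be checked directly. A small sketch, using a shortened version of the path from the question:

```python
path = "//xxxxx//xxxx//xxx"
converted = path.replace("//", "\\")

# repr() is what the interactive prompt echoes: backslashes appear doubled.
print(repr(converted))
# print() shows the actual characters: one backslash per separator.
print(converted)
# The length and backslash count confirm there is one backslash per "//".
print(len(converted), converted.count("\\"))
```

Using chr(92) as the replacement gives exactly the same string, which is why the workaround in the question appeared to behave the same way.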
Use os.path for working with path names:
import os
os.path.normpath('C:/Users/Bob/My Documents')
os.path.abspath would do the job too (it uses os.path.normpath)
Note: this requires the host to be Windows; if that's not the case, you can use ntpath.normpath directly.
https://docs.python.org/library/os.path.html#os.path.normpath
Avoid regexes, replaces and all that. You're going to get it wrong in some subtle way.
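As a minimal sketch of the ntpath suggestion: ntpath is the standard-library module behind os.path on Windows, and it can be imported on any platform. The to_windows_path name is just an illustrative wrapper.

```python
import ntpath


def to_windows_path(path):
    """Normalise a path using Windows rules, regardless of the host OS."""
    return ntpath.normpath(path)


if __name__ == "__main__":
    print(to_windows_path("C:/Users/Bob/My Documents"))
```

normpath also collapses repeated separators and "." components, which a plain replace would not do.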

Sublime Text syntax: Python 3.6 f-strings

I am trying to modify the default Python.sublime_syntax file to handle Python’s f-string literals properly. My goal is to have expressions in interpolated strings recognised as such:
f"hello {person.name if person else 'there'}"
         -----------source.python----------
------string.quoted.double.block.python------
Within f-strings, ranges of text between a single { and another } (but terminating before format specifiers such as !r}, :<5}, etc—see PEP 498) should be recognised as expressions. As far as I know, that might look a little like this:
...
  string:
    - match: "(?<=[^\{]\{)[^\{].*)(?=(!(s|r|a))?(:.*)?\})" # I'll need a better regex
      push: expressions
However, upon inspecting the built-in Python.sublime_syntax file, the string contexts in particular are too unwieldy to even approach (~480 lines?) and I have no idea how to begin. Thanks heaps for any info.
There was an update to syntax highlighting in build 3127, which included significant improvements to Python syntax highlighting.
However, a couple of users have stated that in build 3176 syntax highlighting still does not correctly highlight Python expressions located within f-strings. According to @Jollywatt, the scopes are source.python f"string.quoted.double.block {constant.other.placeholder}" rather than f"string.quoted.double.block {source.python}"
It looks like Sublime uses this tool, PackageDev, "to ease the creation of snippets, syntax definitions, etc. for Sublime Text."
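To see which part of a replacement field is the expression and which parts are the conversion and format specifier, one option is to ask Python's own parser. This is only an illustration of the structure a syntax definition has to recognise, not a Sublime Text syntax; it assumes Python 3.9 or later (for ast.unparse), and the fstring_fields name is made up.

```python
import ast


def fstring_fields(source):
    """Return (expression, conversion, format_spec) for each {...} field."""
    node = ast.parse(source, mode="eval").body
    if not isinstance(node, ast.JoinedStr):
        raise ValueError("source is not an f-string with replacement fields")
    fields = []
    for part in node.values:
        if not isinstance(part, ast.FormattedValue):
            continue  # literal text between the fields
        # conversion is the character code of s, r or a, or -1 when absent.
        conversion = chr(part.conversion) if part.conversion != -1 else None
        spec = None
        if part.format_spec is not None:
            # Only literal pieces of the specifier are collected here.
            spec = "".join(
                piece.value
                for piece in part.format_spec.values
                if isinstance(piece, ast.Constant)
            )
        fields.append((ast.unparse(part.value), conversion, spec))
    return fields


if __name__ == "__main__":
    print(fstring_fields("f\"hello {person.name if person else 'there'}\""))
    print(fstring_fields('f"{value!r:<5}"'))
```

The expression is everything before the optional !conversion and :format_spec, which is the span that should be scoped as source.python.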

Python strip unexpected behavior

I was stripping a file name in python for routing purposes and I was getting some unexpected behavior with the python strip function. I've read the docs and searched online but have not been able to find an explanation for the following behavior:
"Getting-Started.md".strip('.md')
Out[29]: 'Getting-Starte'
But if any character other than 'd' is to the left of the period, it works properly:
"Getting-StarteX.md".strip('.md')
Out[30]: 'Getting-StarteX'
It seems like some kind of mirroring is going on around 'd.md'. I'm doing a double strip to get around this for now, but I was curious why this occurs.
Thank you.
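strip() removes every leading and trailing character that appears in its argument, treating the argument as a set of characters rather than as a suffix; in your case ., m and d. That is why the d just before the period is removed as well.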
Instead, you can use os.path.splitext():
import os
os.path.splitext("Getting-StarteX.md")[0]
If ".md" appears only once, at the end of the string being tested, you can also use
"Getting-Started.md".split('.md')[0]
Thanks to @Carpetsmoker for pointing out this assumption.
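The three behaviours can be compared side by side. A minimal sketch: remove_suffix is a hypothetical helper written here for illustration, which removes the suffix only when the name actually ends with it.

```python
import os


def remove_suffix(name, suffix):
    """Drop `suffix` from the end of `name` if present, else return `name`."""
    if suffix and name.endswith(suffix):
        return name[:-len(suffix)]
    return name


filename = "Getting-Started.md"

# strip() treats ".md" as the character set {'.', 'm', 'd'}.
stripped = filename.strip(".md")
# splitext() splits off the extension as a unit.
stem = os.path.splitext(filename)[0]
# The helper removes exactly one trailing ".md".
trimmed = remove_suffix(filename, ".md")

print(stripped, stem, trimmed)
```

Unlike split('.md')[0], the helper still works when ".md" also occurs earlier in the name.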

How do you do list slicing equivalent to python in Ruby?

I am trying to port some Python code to Ruby, and I am doing pretty well, using equivalent Ruby functions and even removing or altering some to make better use of Ruby features.
However, at a core point I need to get slices from an array.
In Python the following works fine:
output=["Apple","Orange","Pear"]
team_slices=[(0,1),(1,2),(2,3)]
for start,end in team_slices:
    print(output[start:end])
Will output as expected:
['Apple']
['Orange']
['Pear']
Whereas the ruby code:
output=["Apple","Orange","Pear"]
team_slices=[[0,1],[1,2],[2,3]]
team_slices.each do |start,ending|
print output[start..ending]
end
Will output:
["Apple","Orange"]
["Orange","Pear"]
["Pear"]
Is there any way to do the slicing in a way more equivalent to Python? I know I am likely missing something simple here.
Python's slices exclude the end index, so just use the exclusive ... range variant in Ruby:
output=["Apple","Orange","Pear"]
team_slices=[[0,1],[1,2],[2,3]]
team_slices.each do |start, last|
  print output[start...last]
end
PS: In ruby you should use 2 spaces for indentation, if you want to stick to conventions ;)
EDIT: I had to rename end to last because end is a keyword in Ruby.
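Seen from the Python side, the difference is between a half-open and an inclusive end index. A small sketch using the data from the question; adding 1 to the end index reproduces what Ruby's two-dot range returned:

```python
output = ["Apple", "Orange", "Pear"]
team_slices = [(0, 1), (1, 2), (2, 3)]

# Python slices are half-open: the end index is excluded (Ruby's start...last).
half_open = [output[start:end] for start, end in team_slices]

# Including the end index mimics Ruby's start..ending.
inclusive = [output[start:end + 1] for start, end in team_slices]

print(half_open)
print(inclusive)
```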
