Add own realtime custom parser to Python to generate and compile AST


My task is to add a switch statement to Python and to remove the mandatory colons from functions, classes, and loops.
I may also add some other nice features from CoffeeScript.
The .py files with custom syntax must be importable by the Python interpreter and then parsed with a custom parser (just as the CoffeeScript compiler does).
(I already have a little experience with this: I added Python-like "for" syntax to an existing custom parser and fixed several bugs. But it takes a long time to read and understand all the code, so I decided to ask for advice first.)
I searched the internet for a long time and found several helpful answers, but I still don't know the best way to implement this.
Some of what I found:
- Parse a .py file, read the AST, modify it, then write back the modified source code
- Python's tokenize module
- Python's ast module
- Python's C-like preprocessor with import hook
What I plan to do:
- Rewrite the CoffeeScript parser or the Python parser in pure Python.
- Make an import hook that parses files to an AST with my own parser.
- Continue the import (compile the AST and import it as a module), as CoffeeScript does.
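The plan above (an import hook that hands the file to a custom parser, then compiles the result) might be sketched roughly as follows. This is only an illustration under assumptions: the `.cpy` suffix, the `translate` function, and the class names are hypothetical, and `translate` is a toy line-based rewrite standing in for a real parser. It only restores the colon after `def`, `class`, `for`, and `while` headers.

```python
import ast
import importlib.abc
import importlib.machinery
import importlib.util
import os
import sys

CUSTOM_SUFFIX = ".cpy"  # hypothetical extension for custom-syntax files
BLOCK_WORDS = {"def", "class", "for", "while"}


def translate(source):
    """Toy stand-in for a real parser: append the colon the custom syntax omits."""
    lines = []
    for line in source.splitlines():
        words = line.split()
        if words and words[0] in BLOCK_WORDS and not line.rstrip().endswith(":"):
            line = line.rstrip() + ":"
        lines.append(line)
    return "\n".join(lines) + "\n"


class CustomSyntaxLoader(importlib.machinery.SourceFileLoader):
    """Loads a file, translates it to plain Python, and compiles the AST."""

    def source_to_code(self, data, path, *, _optimize=-1):
        source = importlib.util.decode_source(data)
        tree = ast.parse(translate(source), filename=path)
        return compile(tree, path, "exec", dont_inherit=True, optimize=_optimize)


class CustomSyntaxFinder(importlib.abc.MetaPathFinder):
    """Looks for <module>.cpy on sys.path (or on the parent package's path)."""

    def find_spec(self, fullname, path=None, target=None):
        filename = fullname.rpartition(".")[2] + CUSTOM_SUFFIX
        for entry in (path if path is not None else sys.path):
            candidate = os.path.join(entry or os.getcwd(), filename)
            if os.path.isfile(candidate):
                loader = CustomSyntaxLoader(fullname, candidate)
                return importlib.util.spec_from_file_location(
                    fullname, candidate, loader=loader
                )
        return None


def install():
    """Register the finder after the standard ones, so normal imports are untouched."""
    sys.meta_path.append(CustomSyntaxFinder())
```

After calling `install()`, an ordinary `import mymodule` would pick up `mymodule.cpy` from any directory on `sys.path`. A real implementation would replace `translate` with a tokenizer-based or grammar-based parser, since a line-based rewrite breaks on multi-line headers and strings.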
So I have these questions:
- Is there a Python parser written in Python (so that I don't have to rewrite the whole CoffeeScript parser)?
- Is there a way to build ast.AST objects from my own parser without rewriting the ast library from C into Python?
- How can I do this better and more easily? (Modifying Python's sources is not an option; everything must be done at runtime and be fully compatible with all other Python interpreters.)
- Are there already libraries that help with modifying Python's syntax?
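On the second question, here is a minimal sketch of building `ast` nodes by hand, as a custom parser could, and passing them to the built-in `compile()`. It assumes CPython 3.8 or later; the variable name `answer` and the filename `<custom-parser>` are arbitrary choices for illustration.

```python
import ast

# Hand-built tree for the statement:  answer = 6 * 7
assign = ast.Assign(
    targets=[ast.Name(id="answer", ctx=ast.Store())],
    value=ast.BinOp(
        left=ast.Constant(value=6),
        op=ast.Mult(),
        right=ast.Constant(value=7),
    ),
)
module = ast.Module(body=[assign], type_ignores=[])

# compile() requires line/column info on every node; this fills in defaults.
ast.fix_missing_locations(module)

namespace = {}
exec(compile(module, "<custom-parser>", "exec"), namespace)
print(namespace["answer"])  # 42
```

The node classes are exposed to Python by the `ast` module itself, so a parser written in pure Python can emit them directly without reimplementing anything from C.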
Thank you very much.
Best regards, Serj.

Related

Is using Python f-strings effective for generating Python code?

I have not found any related questions or examples of Python code using f-strings to generate Python code. Is there an underlying problem that I should know about? f-strings are really convenient and seem to be rather efficient for my needs.
I am generating some Python scripts that can be used on the command line to automatically process folders of remote sensing images. I was going to write the files by hand, but realized it was relatively easy to automate the process by externalizing the metadata for the expressions.
Program logic:
- Extract expressions containing information on the different types of calculations that could be performed on the images (any sort of remote sensing index)
- Iterate through the expressions
- Create a file with the expression name (and other standards)
- Insert the expression information into f-strings
- Write the f-string to the created file
I will also generate some tests automatically once settled on the method to be used. Is there a limit from which f-strings will not handle the code efficiently?
Some people have discussed using Python templates like Jinja2. However, if f-strings are sufficient I do not wish to integrate another external dependency.
from expressions_meta import expressions
for key in expressions.keys():
    file_name = '_'.join([key, 'dir', 'cl.py'])
    with open(file_name, 'w') as f:
        f.write(f"""
import sys
import getopt
from gdal_dir_calc import GDALDirCalc
expression = {expressions[key]}
band_meta = {{}}
[...]
gdal_dir_obj.main()
""")
I might just be overly cautious but I think the topic could address other applications as well.
Any other tips regarding the use of f-strings for Python code generation or another tool?
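One possible precaution when generating code this way is to embed values with `!r`, so that strings come out as quoted literals, and to pass the result through `compile()` to catch syntax errors before anything is written to disk. The sketch below is an illustration, not the asker's code: `render_script`, `NAME`, and the sample expression are hypothetical.

```python
def render_script(name, expression):
    """Return generated source text, raising SyntaxError if it is malformed."""
    source = f"""\
import sys

NAME = {name!r}
expression = {expression!r}
band_meta = {{}}
"""
    # Compile only to validate; the code object is discarded.
    compile(source, f"{name}_dir_cl.py", "exec")
    return source


generated = render_script("ndvi", "(b4 - b3) / (b4 + b3)")
```

A plain `{expression}` inserts `str(expression)`, so a string value would land in the generated file unquoted, while `{expression!r}` inserts its `repr()`.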

If you are coming across this question, be sure to review the design of your program.
Most likely you are trying to implement a solution in which you violate DRY principles.
For command line applications:
Instead of generating many specific commands, look into passing a name argument, which in turn can be used with a selection of arguments.
From the standard library, some tools that may be helpful are configparser, shlex, cmd, getopt and argparse. See the standard library documentation on these tools.
Click is an interesting third-party package.
Thanks @SergeBallesta for your helpful comments.
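The name-argument idea could look roughly like this with argparse. The `EXPRESSIONS` table, its formulas, and the function names are hypothetical placeholders; a real program would load the expressions from its metadata module and call the processing code instead of returning a tuple.

```python
import argparse

# Hypothetical expression table; in practice this would come from the metadata file.
EXPRESSIONS = {
    "ndvi": "(nir - red) / (nir + red)",
    "ndwi": "(green - nir) / (green + nir)",
}


def build_parser():
    parser = argparse.ArgumentParser(
        description="Apply a named expression to every image in a directory."
    )
    parser.add_argument("name", choices=sorted(EXPRESSIONS), help="expression to apply")
    parser.add_argument("directory", help="folder containing the images")
    return parser


def main(argv=None):
    """Parse the command line and return the selected expression and directory."""
    args = build_parser().parse_args(argv)
    return EXPRESSIONS[args.name], args.directory
```

Calling `main(["ndvi", "/data/images"])` selects the `ndvi` expression, so one script serves every expression and no per-expression files need to be generated.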

Converting Python AST/code into a CFG in Python

From my research it seems the main CFG generator for Python code in Python is the PyPy Flow Model (http://doc.pypy.org/en/latest/objspace.html#the-flow-model) but it seems to have the limitations which come from using RPython. Most of the limitations do not hinder CFG generation but there are a few which do, such as restricting for loops to built-in types and the generator limitations.
CPython appears to generate a 'complete' CFG in compile.c which it then emits its bytecode from. However this is all done in C, while my work so far is all done in Python. There also seems to be a bit of hacking required to extract the CFG from CPython.
Is there no 'complete' CFG generator for Python code implemented in Python?
I am not completely against the idea of building one myself which ties in with the ast module, but I would like to make sure I know all of my options before I dive into that.
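To give a sense of what building one on top of the ast module involves, here is a deliberately incomplete sketch. It handles only straight-line code, `if`/`else`, and `while`; it treats `break`, `continue`, `return`, `for`, `try`, and loop `else` clauses as ordinary statements or ignores them, which a 'complete' generator could not do. The class and function names are made up for this example.

```python
import ast


class SimpleCFG:
    """Basic blocks (lists of ast nodes) plus a set of (source, target) edges."""

    def __init__(self):
        self.blocks = []
        self.edges = set()

    def new_block(self):
        self.blocks.append([])
        return len(self.blocks) - 1

    def add_statements(self, stmts, current):
        """Add stmts starting in block `current`; return the block where control continues."""
        for stmt in stmts:
            if isinstance(stmt, ast.If):
                self.blocks[current].append(stmt.test)
                then_block = self.new_block()
                else_block = self.new_block()
                join = self.new_block()
                self.edges |= {(current, then_block), (current, else_block)}
                then_end = self.add_statements(stmt.body, then_block)
                else_end = self.add_statements(stmt.orelse, else_block)
                self.edges |= {(then_end, join), (else_end, join)}
                current = join
            elif isinstance(stmt, ast.While):
                header = self.new_block()
                body = self.new_block()
                after = self.new_block()
                self.edges.add((current, header))
                self.blocks[header].append(stmt.test)
                self.edges |= {(header, body), (header, after)}
                body_end = self.add_statements(stmt.body, body)
                self.edges.add((body_end, header))  # back edge to the loop test
                current = after
            else:
                self.blocks[current].append(stmt)
        return current


def build_cfg(source):
    cfg = SimpleCFG()
    entry = cfg.new_block()
    cfg.add_statements(ast.parse(source).body, entry)
    return cfg
```

For the source `x = 1`, `if x: y = 2`, `else: y = 3`, `z = y`, this yields four blocks (entry, then, else, join) and the diamond of edges between them.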

What is the System() import in Python

Long story short, a piece of code that I'm working with at work has the line:
from System import System
with a later bit of code of:
desc_ = System()
xmlParser = Parser(desc_.getDocument())
# xmlParser.setEntityBase(self.dtdBase)
for featureXMLfile in featureXmlList.split(","):
    print featureXMLfile
    xmlParser.parse(featureXMLfile)
feat = desc_.get(featureName)
return feat
Parser is an XML parser in Java (it's included in a different import), but I don't get what the desc_ bit is doing. I mean obviously, it somehow holds the feature that we're trying to pull out, but I don't entirely see where. Is System a standard library in Python or Java, or am I looking at something custom?
Unfortunately, everyone else in my group is out for Christmas Eve vacation, so I can't ask them directly. Thank you for your help. I'm still not horribly familiar with Python.

This isn't from the standard library, so you'll need to check your system (Python has plenty of introspection to help you with that).
You can tell because Python modules in the standard library use lowercase names, as per PEP 8, or by searching the library reference.
Note as well that Python has its own XML parsing tools, which will be much nicer to work with in Python than Java's.
Edit: As you have noted in the comments you are using Jython, it seems likely this is Java's System package.

millimoose indicated the correct answer in his comment, but neglected to submit it as an answer, so I'm posting to indicate the correct answer. It was indeed a custom module built by my company. I was able to determine this by typing import System; print(System) into the interpreter.
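The same check can be done without importing the module first. The sketch below assumes CPython 3, where `importlib.util.find_spec` is available; on Jython or Python 2, the `import System; print(System)` approach above is the one to use. The helper name `module_origin` is made up for this example.

```python
import importlib.util


def module_origin(name):
    """Return where a top-level module would be loaded from, or None if it is not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


print(module_origin("json"))  # a path inside the standard library
print(module_origin("sys"))   # "built-in" on CPython
```

A path inside the project tree rather than the interpreter's library directory indicates a custom module.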

ISO human-readable parser for Python in Python

I'm looking for a parser for Python (preferably v. 2.7) written in human-readable Python. Performance or flexibility are not important. The accuracy/correctness of the parsing and the clarity of the parser's code are far more important considerations here.
Searching online I've found a few parser generators that generate human-readable Python code, but I have not found the corresponding Python grammar to go with any of them (from what I could see, they all follow different grammar specification conventions). At any rate, even if I could find a suitable parser-generator/Python grammar combo, a readily available Python parser that fits my requirements (human-readable Python code) is naturally far more preferable.
Any suggestions?
Thanks!

PyPy is a Python implementation written entirely in Python. I am not an expert, but here's the link to their parser which - obviously - has been written in Python itself:
https://bitbucket.org/pypy/pypy/src/819faa2129a8/pypy/interpreter/pyparser

I think you should invest your effort in ast. An excerpt from the Python docs:
The ast module helps Python applications to process trees of the
Python abstract syntax grammar. The abstract syntax itself might
change with each Python release; this module helps to find out
programmatically what the current grammar looks like.
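As a small illustration of what the ast module gives you, the following parses a snippet and lists the functions it defines and the names it calls. It is written for Python 3 (the question asks about 2.7, where the module exists but some node types differ), and the sample source is invented for the example.

```python
import ast

source = "def area(w, h):\n    return w * h\n\nprint(area(2, 3))\n"
tree = ast.parse(source)

# Names of all functions defined anywhere in the snippet.
function_names = [
    node.name for node in ast.walk(tree) if isinstance(node, ast.FunctionDef)
]

# Names of all plainly named callables that are invoked.
call_names = sorted(
    node.func.id
    for node in ast.walk(tree)
    if isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
)

print(function_names)  # ['area']
print(call_names)      # ['area', 'print']
```

Note that `ast.parse` uses the interpreter's built-in parser, so this shows the resulting tree rather than a readable parser implementation.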

Python parser Module tutorial

I am writing an application which reads an input file that currently has its own grammar, which is processed by lex/yacc.
I'm looking to modify this so as to make this input file a Python script instead, and was wondering if someone can point me to a beginner's guide to using the parser module in Python. I'm fairly new to Python itself, but have worked through a fair chunk of the online tutorial.
From what I have researched, I know there are options (such as pyparsing) which can allow me to keep the existing grammar and use Pyparsing as a replacement for lex/yacc. However, I am curious to learn the Python parser module in more detail and explore its feasibility.
Thanks.

You mean the parser module? It's a parser for Python source code only, not a general purpose parser. You can't use it to parse anything else.

As Jochen said, the parser module is for parsing Python code. I think you're best off checking out Ned Batchelder's list of parsers. PyParsing does things pretty differently from Lex and Yacc, so I'm not sure why you think you could keep your existing grammar and lexer. A better bet might be David Beazley's PLY toolkit. It's solid and has excellent documentation.

I recommend that you check out https://github.com/erezsh/lark
It's great for newcomers to parsing: It can parse ALL context-free grammars, it automatically builds an AST (with line & column numbers), and it accepts the grammar in EBNF format, which is considered the standard and is very easy to write.
