Parsing python files using ANTLR - python

Using these steps, I'm trying to generate a parse tree from the ANTLR4 Python3.g4 grammar file so that I can parse Python 3 code. I've generated my Python parser using ANTLR, but I'm unsure how to pass in a Python file, as InputStream doesn't accept one.
I've currently managed to pass it in as a text file:
def main():
    with open('testingText.txt') as file:
        data = file.read()
    input_stream = InputStream(data)
    lexer = Python3Lexer(input_stream)
    stream = CommonTokenStream(lexer)
    parser = Python3Parser(stream)
    tree = parser.single_input()
    print(tree.toStringTree(recog=parser))
But I get errors such as 'mismatched input <EOF>' and sometimes 'no viable alternative at input <EOF>'.
I would like to pass in a .py file, and I'm not sure what to do about the <EOF> issues.

To be sure, I'd need to know what testingText.txt contains, but I'm pretty sure that the parser expects a line break at the end of the file and that testingText.txt does not contain a trailing line break.
You could try this:
with open('testingText.txt') as file:
    data = f'{file.read()}\n'
EDIT
And if testingText.txt contains:
class myClass:
    x=5
    print("hello world")
parse it like this:
from antlr4 import *
from Python3Lexer import Python3Lexer
from Python3Parser import Python3Parser
def main():
    with open('testingText.txt') as file:
        data = f'{file.read()}\n'
    input_stream = InputStream(data)
    lexer = Python3Lexer(input_stream)
    stream = CommonTokenStream(lexer)
    parser = Python3Parser(stream)
    tree = parser.file_input()
    print(tree.toStringTree(recog=parser))
if __name__ == '__main__':
    main()
That is, use tree = parser.file_input() and not tree = parser.single_input().
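On the original question of passing in a .py file: InputStream takes a string, so the file extension does not matter. Below is a minimal sketch of reading any source file and making sure it ends with a line break. The helper name read_source is hypothetical and not part of the ANTLR runtime.

```python
import os
import tempfile


def read_source(path):
    """Return the file's text, guaranteed to end with a line break.

    Hypothetical helper, not part of the ANTLR runtime.
    """
    with open(path, encoding="utf-8") as source_file:
        data = source_file.read()
    # Append a line break only when the file does not already end with one.
    if not data.endswith("\n"):
        data += "\n"
    return data


# Demonstration with a temporary .py file that lacks a trailing newline.
with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as tmp:
    tmp.write('print("hello world")')
demo_text = read_source(tmp.name)
os.remove(tmp.name)
```

The returned string can be handed to InputStream(...) in place of data.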

Related

Applying python script to file

This is a simple ask but I can't find any information on how to do it: I have a python script that is designed to take in a text file of a specific format and perform functions on it--how do I pipe a test file into the python script such that it is recognized as input()? More specifically, the Python is derived from skeleton code I was given that looks like this:
def main():
    N = int(input())
    lst = [[int(i) for i in input().split()] for _ in range(N)]
    intervals = solve(N, lst)
    print_solution(intervals)
if __name__ == '__main__':
    main()
I just need to understand how to, from the terminal, input one of my test files to this script (and see the print_solution output)
Use the fileinput module
input.txt
...input.txt contents
script.py
#!/usr/bin/python3
import fileinput
def main():
    for line in fileinput.input():
        # Each line keeps its trailing newline, so suppress print's own
        print(line, end='')
if __name__ == '__main__':
    main()
pipe / input examples:
$ cat input.txt | ./script.py
...input.txt contents
$ ./script.py < input.txt
...input.txt contents
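Note that input() itself reads from standard input, so the skeleton from the question can also be fed with python script.py < input.txt as it stands. Below is a minimal sketch of that behaviour. The read_rows helper is hypothetical (it stands in for the question's undefined solve and print_solution), and the shell redirect is simulated by swapping sys.stdin for an in-memory stream.

```python
import io
import sys


def read_rows():
    """Read a count N, then N lines of whitespace-separated integers."""
    n = int(input())
    return [[int(token) for token in input().split()] for _ in range(n)]


# Simulate `python script.py < input.txt` with an in-memory stream.
original_stdin = sys.stdin
sys.stdin = io.StringIO("2\n1 3\n2 5\n")
try:
    rows = read_rows()
finally:
    sys.stdin = original_stdin  # always restore the real stdin
```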
You can read an absolute or relative path with input() and then open that path with open():
filename = input('Please input absolute filename: ')
with open(filename, 'r') as file:
    contents = file.read()  # Do your stuff with the contents
Please let me know if I misunderstood your question.
You can either:
A) Use sys.stdin (import sys at the top of course)
or
B) Use the ArgumentParser (from argparse import ArgumentParser) and pass the file as an argument.
Assuming A it would look something like this:
python script.py < file.extension
Then in the script it would look like:
import sys
fData = []
for line in sys.stdin.readlines():
    fData.append(line)
# manipulate fData
There are a number of ways to achieve what you want. This is what I came up with off the top of my head. It may not be the best / efficient way, but it should work. I do a lot of file I/O with python at work and this is one of the ways I've achieved it in the past.
Note: If you want to write the manipulated lines back to the file use the argparse library.
Edit:
from argparse import ArgumentParser
def parseInput():
    parser = ArgumentParser(description="Takes input file to read")
    parser.add_argument('-f', type=str, default=None, required=True,
                        help="File to perform I/O on")
    args = parser.parse_args()
    return args
def main():
    args = parseInput()
    fData = []
    # read the file's lines
    with open(args.f, 'r') as f:
        for line in f.readlines():
            fData.append(line)
    # Perform data manipulations
    # write the lines back to the same file
    with open(args.f, 'w') as f:
        for line in fData:
            f.write(line)
if __name__ == "__main__":
    main()
Then on command line it would look like:
python yourScript.py -f fileToInput.extension

Reading config file that does not begin with section

I have a config file that looks something like:
#some text
host=abc
context=/abc
user=abc
pw=abc
#some text
[dev]
host=abc
context=/abc
user=abc
pw=abc
[acc]
host=abc
context=/abc
user=abc
pw=abc
I would like to parse the cfg file with ConfigParser in Python 2.7. The problem is that the cfg file does not start with the section. I cannot delete the text lines before the sections. Is there any workaround for that?
Inject a section header of your choice.
import ConfigParser
import StringIO
def add_header(cfg, hdr="DEFAULT"):
    s = StringIO.StringIO()
    s.write("[{}]\n".format(hdr))
    for line in cfg:
        s.write(line)
    s.seek(0)
    return s
parser = ConfigParser.ConfigParser()
with open('file.cfg') as cfg:
    parser.readfp(add_header(cfg, "foo"))
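The code above targets Python 2.7, as the question asks. For readers on Python 3, a rough equivalent might look like the sketch below. It assumes the standard configparser module, uses a hypothetical section name, top, for the header-less lines, and parses a made-up sample string instead of a real file.

```python
import configparser

# Sample text standing in for the contents of the .cfg file.
CFG_TEXT = """\
#some text
host=abc
context=/abc
[dev]
host=def
"""


def parse_headerless(text, header="top"):
    """Parse INI text whose first options precede any section header."""
    parser = configparser.ConfigParser()
    # Prepend an artificial section so the leading options have a home.
    parser.read_string("[{}]\n{}".format(header, text))
    return parser


config = parse_headerless(CFG_TEXT)
```

Choosing DEFAULT as the injected header would instead make the leading options fall-back values for every other section, which may or may not be what you want.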

pickle is not working in a proper way

import nltk
import pickle
input_file=open('file.txt', 'r')
input_datafile=open('newskills1.txt', 'r')
string=input_file.read()
fp=(input_datafile.read().splitlines())
def extract_skills(string):
    skills=pickle.load(fp)
    skill_set=[]
    for skill in skills:
        skill= ''+skill+''
        if skill.lower() in string:
            skill_set.append(skill)
    return skill_set
if __name__ == '__main__':
    skills= extract_skills(string)
    print(skills)
I want to print the skills from the file, but pickle is not working here.
It shows the error:
_pickle.UnpicklingError: the STRING opcode argument must be quoted
The file containing the pickled data must be written and read as a binary file. See the documentation for examples.
Your extraction function should look like:
def extract_skills(path):
    with open(path, 'rb') as inputFile:
        skills = pickle.load(inputFile)
    return skills
Of course, you will need to dump your data into a file opened in binary mode as well:
def save_skills(path, skills):
    with open(path, 'wb') as outputFile:
        pickle.dump(skills, outputFile)
Additionally, the logic of your main seems a bit flawed.
While the code that follows if __name__ == '__main__' is only executed when the script is run as the main module, the code outside that block should be static only, i.e., definitions.
Basically, your script should not do anything, unless run as main.
Here is a cleaner version.
import pickle
def extract_skills(path):
    ...
def save_skills(path, skills):
    ...
if __name__ == '__main__':
    inputPath = "skills_input.pickle"
    outputPath = "skills_output.pickle"
    skills = extract_skills(inputPath)
    # Modify skills
    save_skills(outputPath, skills)
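As a quick check of the binary read/write pairing, here is a minimal round-trip sketch. The sample skills list and the temporary file name are made up for illustration. Note that pickle.dump takes the object first and the file second.

```python
import os
import pickle
import tempfile


def save_skills(path, skills):
    """Write the skills list to path as binary pickle data."""
    with open(path, "wb") as output_file:
        pickle.dump(skills, output_file)  # object first, then file


def extract_skills(path):
    """Load and return the pickled skills list from path."""
    with open(path, "rb") as input_file:
        return pickle.load(input_file)


# Round trip through a temporary file (sample data is made up).
with tempfile.TemporaryDirectory() as tmp_dir:
    pickle_path = os.path.join(tmp_dir, "skills.pickle")
    save_skills(pickle_path, ["python", "sql"])
    loaded = extract_skills(pickle_path)
```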

Running python script from console

I'm totally new to Python, and I want to do the following:
I have this code:
def assem(myFile):
    print "Hello ,World!"
    import myParser
    from myParser import Parser
    import code
    import symboleTable
    from symboleTable import SymboleTable
    newFile = "Prog.hack"
    output = open(newFile, 'w')
    input = open(myFile, 'r')
    prsr=Parser(input)
    while prsr.hasMoreCommands():
        str = "BLANK"
        if(parser.commandType() == Parser.C_COMMAND):
            str="111"+code.comp(prsr.comp())+code.dest(prsr.dest())+code.jump(prsr.jump())+"\n"
        output.write(str)
        prsr.advance()
I checked the indentation and it's OK; it's just a bit messy as pasted here.
This program needs to run from the console and receive a file named Add.asm.
What is the console command to make it run?
I tried:
python assembler.py Add.asm
It did not work.
Any ideas?
optparse is indeed what you will need for more advanced command-line options. However, you can run python assembler.py <filename> with a simple if __name__ == "__main__" block. In lieu of argparse or optparse, you can use sys.argv[1] for a single simple argument to the script.
def assem(myFile):
    print("Hello ,World!")
    import myParser
    from myParser import Parser
    import code
    import symboleTable
    from symboleTable import SymboleTable
    newFile = "Prog.hack"
    output = open(newFile, 'w')
    input = open(myFile, 'r')
    prsr = Parser(input)
    while prsr.hasMoreCommands():
        str = "BLANK"
        # use the prsr instance; the name parser is not defined here
        if prsr.commandType() == Parser.C_COMMAND:
            str = ("111" + code.comp(prsr.comp())
                   + code.dest(prsr.dest())
                   + code.jump(prsr.jump()) + "\n")
        output.write(str)
        prsr.advance()
if __name__ == "__main__":
    import sys
    assem(sys.argv[1])
You'll also want to google python string formatting and find links like http://docs.python.org/library/stdtypes.html#string-formatting
This is what you are looking for: http://docs.python.org/library/optparse.html
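optparse has since been deprecated in favor of argparse. Below is a minimal sketch of the same single-argument interface using argparse. The argument name asm_file and the description string are assumptions made for illustration.

```python
import argparse


def parse_args(argv=None):
    """Parse the command line; argv=None means use sys.argv[1:]."""
    parser = argparse.ArgumentParser(description="Assemble an .asm file")
    # One required positional argument: the input file name.
    parser.add_argument("asm_file", help="path to the input .asm file")
    return parser.parse_args(argv)


# Equivalent to running: python assembler.py Add.asm
args = parse_args(["Add.asm"])
```

In the script itself you would call parse_args() with no arguments and pass args.asm_file to assem.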

Pythonic equivalent of ./foo.py < bar.png

I've got a Python program that reads from sys.stdin, so I can call it with ./foo.py < bar.png. How do I test this code from within another Python module? That is, how do I set stdin to point to the contents of a file while running the test script? I don't want to do something like ./test.py < test.png. I don't think I can use fileinput, because the input is binary, and I only want to handle a single file. The file is opened using Image.open(sys.stdin) from PIL.
You should generalise your script so that it can be invoked from the test script, in addition to being used as a standalone program. Here's an example script that does this:
#! /usr/bin/python
import sys
def read_input_from(file):
    print file.read(),
if __name__ == "__main__":
    if len(sys.argv) > 1:
        # filename supplied, so read input from that
        filename = sys.argv[1]
        file = open(filename)
    else:
        # no filename supplied, so read from stdin
        file = sys.stdin
    read_input_from(file)
If this is called with a filename, the contents of that file will be displayed. Otherwise, input read from stdin will be displayed. (Being able to pass a filename on the command line might be a useful improvement for your foo.py script.)
In the test script you can now invoke the function in foo.py with a file, for example:
#! /usr/bin/python
import foo
file = open("testfile", "rb")
foo.read_input_from(file)
Your function or class should accept a stream instead of choosing which stream to use.
Your main function will choose sys.stdin.
Your test method will probably choose a StringIO instance or a test file.
The program:
# foo.py
import sys
from PIL import Image
def foo(stream):
    im = Image.open(stream)
    # ...
def main():
    foo(sys.stdin)
if __name__ == "__main__":
    main()
The test:
# test.py
import StringIO, unittest
import foo
class FooTest(unittest.TestCase):
    def test_foo(self):
        input_data = "...."
        input_stream = StringIO.StringIO(input_data)
        # can use a test file instead:
        # input_stream = open("test_file", "rb")
        result = foo.foo(input_stream)
        # asserts on result
if __name__ == "__main__":
    unittest.main()
A comp.lang.python post showed the way: Substitute a StringIO() object for sys.stdout, and then get the output with getvalue():
def setUp(self):
    """Set stdin and stdout."""
    self.stdin_backup = sys.stdin
    self.stdout_backup = sys.stdout
    self.output_stream = StringIO()
    sys.stdout = self.output_stream
    self.output_file = None
def test_standard_file(self):
    sys.stdin = open(EXAMPLE_PATH)
    foo.main()
    self.assertNotEqual(
        self.output_stream.getvalue(),
        '')
def tearDown(self):
    """Restore stdin and stdout."""
    sys.stdin = self.stdin_backup
    sys.stdout = self.stdout_backup
You can always monkey-patch your stdin, but that is quite an ugly approach. It is better to generalize your script as Richard suggested.
import sys
import StringIO
mockin = StringIO.StringIO()
mockin.write("foo")
mockin.flush()
mockin.seek(0)
setattr(sys, 'stdin', mockin)
def read_stdin():
    f = sys.stdin
    result = f.read()
    f.close()
    return result
print read_stdin()
Also, do not forget to restore stdin when tearing down your test.
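The snippets above are Python 2. In Python 3 the StringIO module is gone; io.BytesIO plays the same role for binary input, and the binary view of stdin is sys.stdin.buffer. Below is a minimal sketch of the accept-a-stream approach under those assumptions. The function is_png is a hypothetical stand-in for the PIL call and only checks the PNG signature.

```python
import io
import sys
import unittest

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def is_png(stream):
    """Return True if the binary stream starts with the PNG signature."""
    return stream.read(len(PNG_SIGNATURE)) == PNG_SIGNATURE


def main():
    # The real program reads raw bytes from stdin: ./foo.py < bar.png
    print(is_png(sys.stdin.buffer))


class IsPngTest(unittest.TestCase):
    """The tests pass in-memory streams, so stdin is never touched."""

    def test_accepts_png_header(self):
        self.assertTrue(is_png(io.BytesIO(PNG_SIGNATURE + b"rest of file")))

    def test_rejects_other_data(self):
        self.assertFalse(is_png(io.BytesIO(b"not an image")))
```

Because the function takes the stream as a parameter, nothing needs to be restored after the test.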
