Python static code analysis tools - code analysis (preliminary research question) [closed]

Disclaimer:
I've just started researching this area/domain of knowledge, so I have no idea exactly what it's called; but from a Google search, I believe it has to do with static code analysis, or at least something related to it.
My question is:
Given Python code (a file, script, module, or package), is there a tool that can produce a report detailing how many classes, functions, built-in functions, decorators, if/for/while statements, etc. are used?
To give you an analogy most of us can relate to:
Given a text file, find all the verbs / nouns / adjectives / adverbs / proper nouns.
NLP tools like spaCy or NLTK have the ability to do that for natural languages.
But what about programming languages? Is there a tool for that?
Can a tool like pylint do that?
UPDATE
As I expected, such tools exist; one of them, as @BoarGules suggested in his comment, is the ast module. It's the hint I needed to go further in my research; any further suggestions are welcome. BTW, ast stands for abstract syntax tree.
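For example, a minimal sketch with ast (the file name below is just a placeholder) already gives the kind of counts I was asking about:

import ast
from collections import Counter

source = open("example.py").read()   # placeholder path
tree = ast.parse(source)

# Count every node type in the tree, then pick out the ones of interest.
counts = Counter(type(node).__name__ for node in ast.walk(tree))
report = {
    "classes": counts["ClassDef"],
    "functions": counts["FunctionDef"] + counts["AsyncFunctionDef"],
    "if statements": counts["If"],
    "for loops": counts["For"] + counts["AsyncFor"],
    "while loops": counts["While"],
    # decorators are attached to function/class definitions
    "decorators": sum(
        len(node.decorator_list)
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
    ),
}
print(report)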

Given Python code (a file, script, module, or package), is there a tool that can produce a report detailing how many classes are used...
There cannot be an exact tool for that, since Python has an eval primitive.
When that primitive is executed, the set of classes or functions of your Python program can increase.
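A tiny made-up illustration: a static scan of the file below counts one class definition, but after the code runs there are two classes.

class Known:
    pass

# eval/exec can mint new classes while the program runs,
# so any static count is only an approximation.
Hidden = eval('type("Hidden", (), {})')

print(Known, Hidden)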
Be aware of Rice's theorem.
Consider using abstract interpretation and type inference techniques in your Python static analyzer.
Consider also using (painfully) Frama-C on the source code of the Python interpreter itself, which is written in C. With a lot of work, Frama-C could be extended to analyze Python source code, but someone needs to do that work, or to pay for it.
Read also recent proceedings of ACM SIGPLAN conferences.

Related

Why can't Python be built and maintained by many people or over a long period of time? [closed]

The following statement is taken from Introduction to Computation and Programming Using Python by John Guttag:
Python is a general-purpose programming language that can be used effectively
to build almost any kind of program that does not need direct access to the
computer’s hardware. Python is not optimal for programs that have high
reliability constraints (because of its weak static semantic checking) or that are
built and maintained by many people or over a long period of time (again
because of the weak static semantic checking).
The sentence in bold seems very vague. Can anyone provide a good explanation or example?
Python uses dynamic typing. That is, you can only know the type of an object with certainty at runtime.
A consequence of this is that the only way to know a piece of code uses the right data types is to run it. Thus, testing becomes very important. But testing all the code paths in a program can take a long time.
The problem is exacerbated when many people work on a program for a long time: it's hard to get developers to write documentation and build consistent interfaces, so you end up with a limited number of people who understand which types should be used, and everyone spends a lot of time waiting for tests to run.
Still, the author's view is overly pessimistic. Many companies have large Python codebases and are able to do so by having extensive, and expensive, test suites and oncall rapid response teams.
For instance, Facebook has hundreds of millions of lines of code of which 21% are Python (as of 2016). With this level of preexisting investment, in the short term it is much cheaper to develop ways of making Python safer than to migrate code to a new language (like Julia or Rust).
And we can see this in the ecosystem.
Python has been modified to address the typing problem through the introduction of type annotations. While these are not enforced at runtime they provide a fast (realtime) check of type safety that significantly reduces the need to rely on tests and can be used to enforce interfaces using tools like Pyre and mypy. This makes having a large Python codebase of the sort the author discusses much more manageable.
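A small, made-up example: nothing below has to run for the mistake to be caught; mypy or Pyre reports the bad call from the annotations alone.

def total_price(quantity: int, unit_price: float) -> float:
    # The annotations tell a static checker what types are expected here.
    return quantity * unit_price

print(total_price(3, 2.5))    # fine: 7.5

# total_price("3", 2.5)       # mypy/Pyre flags the str argument without running the code;
#                             # at runtime this line would only fail with a TypeError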

How to build a Full C Parser using pyparsing? [closed]

I am trying to build a full C Parser using pyparsing.
Actually, what I want for my project is to identify certain lines of code in a C program that are of interest to me, e.g. complex assignment instructions with typecasting, pointer dereferences, etc.
I thought, since I am investing the effort, I will implement the Full C Grammar in pyparsing, and use just what I need.
I referred to this C Grammar for YACC and wrote it according to pyparsing (to the best of my limited understanding of pyparsing).
http://www.lysator.liu.se/c/ANSI-C-grammar-y.html#translation-unit
What I get however is that pyparsing gets stuck in an infinite loop. I have uploaded the python code here.
https://gist.github.com/gkernel/18cd1d38376d07db989a
I need help in this. Please also tell me an alternative approach to solve my problem if you know any.
EDIT:
To be clear, there could be a bug in the code, but I have already invested effort in checking that I have written the correct grammar. I basically want to ask if pyparsing can be used for something as complicated as this.
One of the things I have done is Forward() declare all the non-terminals in the grammar, and I want to know if this is the right approach. I did this because Python would complain of some names being undefined.
As far as I know, pyparsing creates recursive-descent parsers. Recursive-descent parsers will go into an infinite loop if presented with a left-recursive grammar, and it is most likely that the rather ancient C grammar you unearthed (and any more modern C grammar) will be left-recursive, since such grammars are easier to write and are acceptable input to LALR(1) and GLR parser generators, like bison.
C is not an easy language to parse, and more so if you don't understand the basics of parsing theory. If your goal is to learn parsing theory, I'd suggest that you try a simpler language. If your only goal is to parse C, as indicated in your question, then I'd suggest you use one of the available tools; both gcc and clang come with (unfortunately underdocumented) mechanisms to access the parse tree for a C program, and there are commercial products as well if you have a budget.
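To make the left-recursion point concrete, here is an illustrative sketch (not taken from your gist) of the usual rewrite in pyparsing:

from pyparsing import Literal, Word, ZeroOrMore, nums

term = Word(nums)

# A direct translation of the YACC-style rule  expr : expr '+' term | term
# is left-recursive: the recursive-descent parser calls expr to parse expr
# without consuming any input, and recurses until Python's limit is hit.
#
#   expr = Forward()
#   expr <<= (expr + Literal("+") + term) | term
#
# The usual rewrite for recursive descent replaces left recursion with repetition:
expr = term + ZeroOrMore(Literal("+") + term)

print(expr.parse_string("1+2+3"))   # ['1', '+', '2', '+', '3']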

Is it possible to translate Clojure syntax into Python syntax? [closed]

I have been looking around but haven't found an example of this. I'd like to write a few long/tedious Python scripts using Clojure, just because I happen to enjoy Clojure a bit more and they are not full-on programs.
This site makes me think it is possible:
http://jkkramer.com/sudoku.html
For example if I have script.clj, I'd like to be able to convert it to script.py - not by hand of course.
Is it possible to do this? If so, what tool/library/script should I use? If it's not possible, why not?
[Edit] I edited this because the wording mistakenly gave the impression I was looking for a detailed lesson on writing my own solution. I was just curious if the tools were out there to answer my question and if not then why not.
Yes. Write a compiler that takes Clojure syntax and outputs valid Python syntax.
How to do that is well outside of the ability/scope of a StackOverflow answer.
Also note that if you do this for the general case of compiling any piece of Clojure code to Python you will have implemented quite a bit of Clojure in Python (especially when you implement defmacro and generic methods).
You actually don't have to do a source to source translation in order to write Clojure that will interact with python libraries. Just see clojure-py which allows you to write regular Clojure syntax and run it under the Python interpreter and call Python libraries.

Python IDLE syntax highlighting [closed]

I'm using IDLE on Windows 7, having just come from a Mac, and the text editor I was using there highlighted different keywords than IDLE does. I know that I can change the colour of the current syntax, like print and def, but can I add other keywords to highlight as well? Thanks
Unfortunately, I don't think Idle is extensible in that way, without hacking on its source code. I believe that it currently highlights only a specific set of names (plus other easily identifiable things like string literals).
It highlights in orange (by default) all of the keywords of the Python language.
It highlights in magenta all of the built-in functions, types and other objects that are available from the standard library without doing any import statements. (You can see a list of them by running dir(__builtins__) in a Python interpreter, or by browsing sections 2-6 of the Library Reference.)
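Both of those lists are available programmatically, for example:

import builtins
import keyword

print(keyword.kwlist)    # the language keywords (orange by default)
print(dir(builtins))     # the built-in names (the ones highlighted in magenta)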
Idle does not do much code analysis. This means that it can't tell what most other names represent. It can't give specific highlighting colors to, for example, class names, because there's no requirement for them to be named in any particular way. Does foo in your code refer to a class, a module, a function or something else? Idle can't tell.
If you want more serious highlighting, you may need to find a more sophisticated IDE. I've recently been pretty happy with Spyder (though I'm not sure if its syntax highlighting is any more capable than Idle's), and there are lots of others. The official Python wiki has a list of IDEs which might help you find the one that is best for you.

Is there a standard lexer/parser tool for Python? [closed]

A volunteer job requires us to convert a large number of LaTeX documents into the ePub file format. It's a series of open-source fiction books which has so far been produced only on paper via a print-on-demand service. We'd like to be able to offer the books to users of book-reader devices (such as the Kindle) which require the ePub format for best results.
Fortunately, ePub is a very simple format, however there's no trivial way for LaTeX to produce the XHTML output required.
We experimented with alternative LaTeX compilers (e.g. plastex) but in the end we figured that it would probably be a lot easier to simply write our own compiler which understands a tiny subset of the LaTeX language and compiles directly to XHTML / ePub.
Previously I used a tool on Windows called GOLD. This allowed me to go directly from BNF grammars to a stub parser. It also allowed me to implement the parser in any language I liked (I'd choose Python).
This product has to work on Linux, so I'm wondering if there's an equivalent toolchain that works as well under Ubuntu / Eclipse / Python. The idea is that we will take the grammar of TeX and implement just a teeny subset of it, but we do not want to spend a huge amount of time worrying about grammar and parsing. A parser generator would obviously save us a great deal of time.
Sal
UPDATE 1: Bonus marks for a solution with excellent documentation or tutorials.
UPDATE 2: Extra bonus if there is a grammar file for TeX already available, since all I'd have to do is implement the functions we care about.
Try pyparsing.
See http://pyparsing.wikispaces.com/WhosUsingPyparsing and search for TeX. There's a project mentioned on that page where pyparsing is used to parse a subset of TeX syntax.
For documentation, I recommend the "Getting started with pyparsing" e-book, by pyparsing's author.
EDIT: According to PaulMcG, pyparsing is no longer hosted on wikispaces.com. Go to the new GitHub site instead.
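As a rough sketch of the approach (the command-to-tag mapping and the names below are invented for illustration, not taken from that project):

from pyparsing import CharsNotIn, Suppress, Word, alphas

# Translate a single \command{argument} into an XHTML tag.
command_name = Suppress("\\") + Word(alphas)
argument = Suppress("{") + CharsNotIn("{}") + Suppress("}")
command = command_name + argument

TAG_FOR = {"emph": "em", "textbf": "strong"}   # extend with the commands you care about

def to_xhtml(tokens):
    tag = TAG_FOR.get(tokens[0], "span")
    return f"<{tag}>{tokens[1]}</{tag}>"

command.set_parse_action(to_xhtml)

print(command.parse_string(r"\emph{hello}")[0])   # <em>hello</em>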
Try PLY.
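For example, a minimal PLY lexer for a tiny TeX-like subset might look like this (the token names are invented):

import ply.lex as lex

tokens = ("COMMAND", "LBRACE", "RBRACE", "TEXT")

# Simple token rules: a backslash command, braces, and runs of plain text.
t_COMMAND = r"\\[A-Za-z]+"
t_LBRACE = r"\{"
t_RBRACE = r"\}"
t_TEXT = r"[^\\{}]+"

def t_error(t):
    print(f"Illegal character {t.value[0]!r}")
    t.lexer.skip(1)

lexer = lex.lex()
lexer.input(r"\emph{hello}")
for tok in lexer:
    print(tok.type, tok.value)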
I once used tex4ht to convert LaTeX to XHTML+MathML. It worked quite nicely. From there, you could use the output HTML as the base for the ePub.
Of course, this breaks the Python toolchain, so it might not become your favorite method...
