Syntax Highlighting in Cocoa TextView? Experiences? Suggestions? Ideas? [duplicate]

This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
Syntax coloring for Cocoa app
I'm interested in syntax highlighting in a Cocoa TextView.
I found several resources:
An approach with flex, via a flex pattern matched against textStorageDidProcessEditing
in a TextView delegate. In this approach the whole string gets parsed on each input event, hence performance degrades.
CocoaDev has its own page on the topic of syntax highlighting:
Use NSTextStorageDidProcessEditingNotification, then get the edited range, and just apply the coloring there. The range might be extended to word boundaries or anything similar; this definitely improves performance.
Mentioned there: Xcode, for example, only colorizes text that's currently on-screen, and defers colorizing the rest of the document until you scroll through it. How would one implement this?
Use NSLayoutManager – via Temporary attributes [which] are used only for on-screen drawing and are not persistent in any way... as the docs say, but that doesn't color the last edited range, until a whitespace character is entered.
Custom Helper like UKSyntaxColoredDocument – nice, but language definition is done via property list; how to use additional/existing language definitions?
None of the approaches seems really extensible or robust to me (except maybe the 4th).
I am aware of robust existing libraries for SH like pygments; and of PyObjC.
Question: How would it be possible to use some existing library e.g. like pygments to have an extensible and performant syntax highlighting in a Cocoa TextView?
Note: I know this question is very broad (and much too long). Experiences and suggestions as well as solutions are welcome. Thanks.
Found another similar thread on that matter: Syntax coloring for Cocoa app

I would suggest taking a look at the source code to Smultron. It has very nice syntax highlighting. It uses a subclass of NSTextView to do most of the heavy lifting. The code uses the layout manager to add attributes to the text and uses some other clever tricks to only highlight as much of the document as necessary.
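The "only recolor the edited range" idea from the question can be sketched independently of Cocoa. The following is a minimal illustration in plain Python, not Smultron's actual code: the token pattern, `expand_to_lines` and `spans_for_edit` are hypothetical names, and the resulting (offset, length, kind) spans are what one would hand to the layout manager as temporary attributes.

```python
import re

# Hypothetical, deliberately tiny token pattern; a real highlighter
# would use a proper lexer for the language being edited.
TOKEN_RE = re.compile(
    r'(?P<comment>#.*)'
    r'|(?P<string>"[^"\n]*")'
    r'|(?P<keyword>\b(?:def|return|if|else)\b)'
)

def expand_to_lines(text, start, length):
    """Grow the edited range [start, start + length) to whole lines."""
    line_start = text.rfind("\n", 0, start) + 1
    line_end = text.find("\n", start + length)
    if line_end == -1:
        line_end = len(text)
    return line_start, line_end

def spans_for_edit(text, start, length):
    """Return (offset, length, kind) spans for the edited lines only."""
    lo, hi = expand_to_lines(text, start, length)
    return [
        (lo + m.start(), m.end() - m.start(), m.lastgroup)
        for m in TOKEN_RE.finditer(text[lo:hi])
    ]
```

Only the lines touched by the edit are scanned, so the cost per keystroke does not grow with the size of the document. Constructs that span several lines (block comments, for example) would need extra handling that this sketch leaves out.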

Related

Simple tokenizer for C++ in Python

I am struggling to find a Python library or script to tokenize C++ source (find specific tokens like function definition names, variable names, keywords etc.).
I have managed to find keywords, whitespace etc. using something like this, but I found it quite a challenge for function/class definition names etc. I was hoping to use a pre-existing script; I explored Pygments with no success. Its lexer seems amazing for what I want, but I have no idea how to use it from Python and also get the position of each token found.
For example I am looking at doing something like that:
int fac(int n)
{
    return (n>1) ? n*fac(n-1) : 1;
}
from the source code above I would like to get:
function_name: 'fac' at position (x, y)
variable_name: 'n' at position (x, y+8)
EDITED:
Any suggestions will be appreciated, since I am in the dark here regarding tokenization and parsing of C++.

Eli Bendersky is a smart guy, and sometimes active here on SO. He's got a blog post on this issue which I'll refer you directly to: Parsing C++ in Python with Clang.
Because things disappear, here's the takeaway:
Eli Bendersky wrote a C language (not C++) parser in Python, called pycparser. People keep asking him if he's going to add support for C++. He is not. He recommends instead that people use the Python bindings for libclang to get access to "a C API that the Clang team vows to keep relatively stable, allowing the user to examine parsed code at the level of an abstract syntax tree (AST)".
You can find the bindings separately on PyPI here. Note though that you'll have to have clang installed, so you may just want to point your PYTHONPATH directly at the install location.

You're struggling to find a python library to do what you want because what you want is impossible to do, fundamentally.
I have managed to find keywords, whitespaces etc. using something like this but I found it quite a challenge for function/class definition names etc
You mean like this:
foo = 3
def foo():pass
What is foo? All a tokenizer should/can tell you is that foo is an identifier. Its context tells you whether it's a variable or a function declaration. You need a parser to handle context-free grammars. Mathematically, the space of context-free grammars is too large for a standard lexer to tackle.
Try a parser: here's one in python
Normally I'd try and provide you links here to distinguish between the topics, but this is too broad to provide a single good link to. If you're interested, start with any standard compiler text. Elsewhere on SE, we see this question pop up as a theoretical question and, in some form, as a famous question about html.
Once you realize that tokenizers are (usually) built (largely) on regular expressions, it becomes more obvious why your task is not going to end happily.
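To make the lexer/parser distinction concrete, here is a small sketch using Python's own standard-library `tokenize` and `ast` modules on the `foo` snippet above. It works on Python source rather than C++, purely as an illustration; `name_tokens` and `definitions` are hypothetical helper names.

```python
import ast
import io
import tokenize

SOURCE = "foo = 3\ndef foo():\n    pass\n"

def name_tokens(source):
    """Lexer view: every NAME token with its (row, column) position."""
    readline = io.StringIO(source).readline
    return [
        (tok.string, tok.start)
        for tok in tokenize.generate_tokens(readline)
        if tok.type == tokenize.NAME
    ]

def definitions(source):
    """Parser view: what role each defined name plays."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef):
            # col_offset points at the 'def' keyword, not at the name.
            found.append(("function_name", node.name,
                          (node.lineno, node.col_offset)))
        elif isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
            found.append(("variable_name", node.id,
                          (node.lineno, node.col_offset)))
    return sorted(found, key=lambda item: item[2])
```

The tokenizer reports both occurrences of `foo` (and even the keywords) as plain NAME tokens with positions; only the parsed tree can say that one is an assignment target and the other a function definition.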
Now that you know the terminology, I think you'll find this SO article useful, which recommends gcc-ml. I don't know how up-to-date it is, but it's the type of program you're looking for.

Alternatives to Qt.escape in PySide?

I am porting some code from PyQt to PySide, which includes a home-grown XML exporter. The code is peppered with lines like:
Qt.escape(textNote)
This is new to me. My PyQt book (Summerfield, 2008) writes:
The Qt.escape() function takes a QString and returns it with any XML
metacharacters properly escaped. And we...convert any
paragraph and line breaks in the notes to their Unicode equivalents.
But unfortunately for my goal of creating XML from text, escape seems to no longer be in use.
This issue is discussed at two sources I found:
http://qt-project.org/wiki/Transition_from_Qt_4.x_to_Qt5#f166611e9788f9dbdff69088d622663e
http://www.kdab.com/automated-porting-from-qt-4-to-qt-5/
Unfortunately, they both suggest to use QString.toHtmlEscaped() but this method seems to not exist in PySide (indeed, QString is not part of PySide's lexicon).
Finally, as of four years ago, it seemed escape is not something that they intended to support in PySide, as discussed at a bug report:
After a discussion with other PySide developers, we decided not export
this function, the reasons are:
This function is part of QtGui, if we create a QtGui.Qt, this will cause some headaches with QtCore.Qt.
PyQt4 also didn't export this function.
There are functions in python std lib that you can use to achieve the same goals, like xml.sax.saxutils.escape().
So, I'll mark this bug as WONTFIX.
This seems to answer my question, but it is four years old, and I am curious if it still holds. That is, is there no PySide escape functionality, so is the best option to go to saxutils? Or perhaps is there some workaround akin to toHtmlEscaped in PySide that I've overlooked?

Every time you would have used
Qt.escape(yourText)
you can get the exact same functionality with
from xml.sax.saxutils import escape
escape(yourText)
It's a little less elegant, but it works. The PySide developers have remained consistent with their initial reaction to a question about this four years ago.
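As a small sketch of the replacement: by default `xml.sax.saxutils.escape` handles `&`, `<` and `>`, and it accepts an optional dictionary of extra replacements. The wrapper name `xml_escape` and its `escape_quotes` flag are hypothetical, added only for illustration.

```python
from xml.sax.saxutils import escape

def xml_escape(text, escape_quotes=False):
    """Escape XML metacharacters; optionally escape double quotes too."""
    # The second argument to escape() maps extra characters to entities.
    extra = {'"': "&quot;"} if escape_quotes else {}
    return escape(text, extra)
```

Escaping quotes only matters when the text ends up inside an attribute value; for element content the default is enough.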

Aspect oriented programming (AOP) in Python

Possible Duplicate:
Any AOP support library for Python?
I am familiar with the AspectJ extension for the Java language.
I want to know if there is such a thing for Python.
Don't get me wrong, I do not mean a library but a language extension like AspectJ is to Java.

Python does not need something like a "language extension" for being able to work in an Aspect Oriented way.
That is simply due to the dynamic mechanisms in Python itself. A Google search will yield a couple projects - but despite looking merely like libraries, it is all that is needed in Python.
I am not making this up - it is the fact that you can introspect classes and methods, and change them at run-time. When I first learned about Aspect Orientation, I could implement some proof of concepts in Python in a couple of hours - certainly some of the existing projects can offer production-quality entries.
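A minimal sketch of what "introspect classes and methods, and change them at run-time" can look like. This is not the code from any of the existing projects: `weave`, `with_advice` and the `Account` class are hypothetical names, and the "pointcut" is simply a method-name prefix.

```python
import functools
import inspect

def with_advice(func, before):
    """Return func wrapped so that 'before' runs ahead of every call."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        before(func.__name__, args[1:], kwargs)  # args[0] is self
        return func(*args, **kwargs)
    return wrapper

def weave(cls, prefix, before):
    """Apply 'before' advice to every method whose name starts with prefix."""
    for name, func in inspect.getmembers(cls, inspect.isfunction):
        if name.startswith(prefix):
            setattr(cls, name, with_advice(func, before))

class Account:
    def __init__(self):
        self.balance = 0

    def do_deposit(self, amount):
        self.balance += amount

    def report(self):
        return self.balance

calls = []
weave(Account, "do_", lambda name, args, kwargs: calls.append((name, args)))
```

After `weave` runs, every `do_*` method on `Account` records its calls in `calls`, while `report` is left untouched. The class itself was written without any knowledge of the logging concern.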
But since you asked, there is a Python "language extension" of sorts that could be used for Aspect Orientation: when I made the proof of concept I mentioned above, I used to check the input parameters to methods at run-time to determine whether certain methods would be affected by a rule or not.
In Python 3 there is a little known feature of the language that allows one to annotate the input parameters and return value of a function or method. An aspect orientation library could make use of this to apply its magic at "load time", and not at the time of each function call.
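One way annotations could drive this at "load time" is sketched below, under the assumption that annotations are plain classes. The decorator name `enforce_annotations` is hypothetical. The decision whether a function is affected at all is taken once, when the decorator runs, rather than on every call.

```python
import functools
import inspect

def enforce_annotations(func):
    """Wrap func with type checks only if it has class annotations."""
    signature = inspect.signature(func)
    checks = {
        name: param.annotation
        for name, param in signature.parameters.items()
        if param.annotation is not inspect.Parameter.empty
        and isinstance(param.annotation, type)
    }
    if not checks:
        return func  # nothing annotated: leave the function untouched

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = signature.bind(*args, **kwargs)
        for name, expected in checks.items():
            if name in bound.arguments and not isinstance(
                    bound.arguments[name], expected):
                raise TypeError(
                    f"{func.__name__}: {name} must be {expected.__name__}")
        return func(*args, **kwargs)
    return wrapper

@enforce_annotations
def scale(value: int, factor: int = 2):
    return value * factor

def plain(a, b):
    return a + b
```

Here `scale("3")` raises `TypeError`, while an unannotated function such as `plain` is returned unchanged and pays no per-call cost.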
BTW, here is my quick hack to get a working example of using Aspect Orientation with Pure Python. Sorry - the code comments are in pt_BR -
https://github.com/jsbueno/metapython/blob/main/aspect.py

You can use Spring Python
Link : http://docs.spring.io/spring-python/1.2.x/sphinx/html/aop.html#aspect-oriented-programming

Are there technical reasons a Ruby DSL like RSpec couldn't be rewritten in Python?

The section below goes into more detail, but basically someone stated that the Ruby-written DSL RSpec couldn't be rewritten in Python. Is that true? If so, why?
I'm wanting to better understand the technical differences between Ruby and Python.
Update: Why am I asking this question?
The Running away from RSpec discussion has some statements about it being "impossible" to recreate RSpec in Python. I was trying to make the question a little broader in hopes of learning more of the technical differences between Ruby and Python. In hindsight, maybe I should have tightened the question's scope to just asking if it truly is impossible to recreate RSpec in Python, and if so why.
Below are just a few quotes from the Running away from RSpec discussion.
Initial Question
For the past few weeks I have been thinking a lot about RSpec and why there is no clear, definite answer when someone asks:
"I'm looking for a Python equivalent of RSpec. Where can I find such a
thing?"
Probably the most common (and understandable) answer is that Python syntax
wouldn't allow such a thing whereas in Ruby it is possible.
First Response to Initial Question
Not syntax exactly. Rspec monkeypatches every object inside of its
scope, inserting the methods "should" and "should_not". You can do
something in python, but you can't monkeypatch the built-in types.
Another Response
As you suggest, it's impossible. Mote and PySpec are just fancy ways
to name your tests: weak implementations of one tiny corner of RSpec.
Mote uses horrible settrace magic; PySpec adds a bunch of
domain-irrelevant noise. Neither even supports arbitrary context
strings. RSpec is more terse, more expressive, removes the noise, and
is an entirely reasonable thing to build in Ruby.
That last point is important: it's not just that RSpec is possible in
Ruby; it's actually idiomatic.

If I had to point out one great difficulty for creating a Python RSpec, it would be the lack of a good syntax in Python for creating anonymous functions (as in JavaScript) or blocks (as in Ruby). The only option for a Python programmer is to use lambdas, which is not an option at all because lambdas just accept one expression. The do ... end blocks used in RSpec would have to be written as a function before calling describe and it, as in the example below:
def should_do_stuff():
    # ...
    pass

it("should do stuff", should_do_stuff)
Not so sexy, right?
There are some difficulties in creating the should methods, but I bet it would be a smaller problem. Actually, one does not even need to use such an unusual syntax—you could get similar results (maybe even better, depending on your taste) using the Jasmine syntax, which can be trivially implemented.
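One common way to soften the "define the function first" problem is a decorator that registers the function under its description. The sketch below is only an illustration of that idea, not the API of any existing library; `it`, `SPECS` and `run_specs` are hypothetical names.

```python
SPECS = []  # (description, function) pairs in registration order

def it(description):
    """Register the decorated function as a spec with this description."""
    def register(func):
        SPECS.append((description, func))
        return func
    return register

def run_specs():
    """Run every registered spec and report pass/fail per description."""
    results = {}
    for description, func in SPECS:
        try:
            func()
            results[description] = "passed"
        except AssertionError as exc:
            results[description] = f"failed: {exc}"
    return results

@it("should add small numbers")
def _():
    assert 1 + 1 == 2

@it("should notice a wrong sum")
def _():
    assert 1 + 1 == 3, "1 + 1 is not 3"
```

The description now sits directly above the body, which is closer to the RSpec layout, although the throwaway function name `_` is still visible noise.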
That said, I feel that Python syntax is more focused on efficiently representing the usual program components such as classes, functions, variables, etc. It is not well suited to being extended. I, for one, think that a good Python program is one where I can see objects, and functions, and variables, and I understand what each one of these elements does. Ruby programmers, OTOH, seem to seek a more prose-like style, where a new language is defined for a new problem. It is a good way of doing things, too, but not a Pythonic way. Python is good at representing algorithms, not prose.
Sometimes it is a draconian limit. How could one use BDD for example? Well, the usual way of pushing these limits in Python is to effectively write your own DSL, but it should REALLY be another language. That is what Pyccuracy is, for example: another language for BDD. A more mainstream example is doctest. (Actually, if I would write some BDD Python library, I would write it based on doctest.) Another example of Python DSL is Twill. And yet another example is reStructuredText, used in Sphinx.
Summarizing: IMHO the hardest barrier to DSLs in Python is the lack of a flexible syntax for creating anonymous functions. And it is not a fault*: Python is not fond of having its syntax heavily explored anyway—it is considered to make code less clear in the Python universe. If you want a new syntax in Python you are well advised to write your own language, or at least it is the way I feel.
* Or maybe it is - I have to confess that I miss anonymous functions. However, I recognize that they would be hard to implement elegantly given the Python semantic indentation.

I set out on an attempt to implement something like rspec in Python.
I got this:
with It('should pass') as test:
    test.should_be_equal(1, 1)
source: https://gist.github.com/2029866
(thoughts?)
EDIT: My answer to your question is that the lack of anonymous blocks prevents a Ruby DSL like RSpec from being rewritten in Python but you can get a close approximation using with statements.
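For readers who want to see how a `with`-based spec could work, here is a minimal sketch of such a context manager. It is a guess at the general shape, not the code from the gist; only the names `It` and `should_be_equal` come from the snippet above, and the `results` list is a hypothetical addition.

```python
class It:
    results = []  # (description, passed, messages) for every finished block

    def __init__(self, description):
        self.description = description
        self.failures = []

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, traceback):
        passed = exc_type is None and not self.failures
        It.results.append((self.description, passed, list(self.failures)))
        return False  # let unexpected exceptions propagate

    def should_be_equal(self, actual, expected):
        # Record the failure instead of raising, so the block keeps running.
        if actual != expected:
            self.failures.append(f"expected {expected!r}, got {actual!r}")

with It("should pass") as test:
    test.should_be_equal(1, 1)

with It("should fail") as test:
    test.should_be_equal(1, 2)
```

The `with` block plays the role of Ruby's `do ... end`: the body is written inline, and `__exit__` is the hook where the outcome gets recorded.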

One of Ruby's strengths is in the creation of DSLs. However the reasons given for it being difficult in python can be sidestepped. For example you can easily subclass the builtin types, e.g:
>>> class myint(int): pass
>>> i = myint(5)
>>> i
5
If I were going to create a DSL in python I'd use pyparsing or Parsley and something like the above behind the scenes, optimizing the syntax for the problem, not the implementation language.
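As a sketch of how subclassing could stand in for monkeypatching `should` onto built-in types, the hypothetical `SpecInt` below adds an expectation method to an `int` subclass. The class and method names are assumptions made for illustration.

```python
class SpecInt(int):
    """An int that carries an RSpec-like expectation method."""

    def should_equal(self, expected):
        assert self == expected, f"expected {expected!r}, got {int(self)!r}"
        return self

five = SpecInt(5)
five.should_equal(5)

# Arithmetic on the subclass returns a plain int, so the extra method
# is lost unless the result is wrapped again.
total = SpecInt(2) + 3
```

That last point is the practical limit of this approach: values have to be wrapped explicitly, whereas RSpec's `should` is available on every object.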

By mixing Mamba and Expects, I think you can get very close to what RSpec is for Rails...
https://github.com/nestorsalceda/mamba
https://github.com/jaimegildesagredo/expects
Also, I think Specter should match your expectations with testing:
https://github.com/jmvrbanac/Specter
http://specter.readthedocs.io/en/latest/writing_tests/index.html

I think this is what you are looking for. Yes, we made the "impossible" in python
"sure" is a utility belt for expressive python tests, created by Gabriel Falcão

programming language implemented in pure python

I am creating (or rather, researching the possibility of) a highly customizable Python client, and I would like to allow users to edit code in another language to customize how the program runs (analogous to a browser, which is itself coded in C/C++ and runs another language, HTML/JS). So my question is: is there any programming language implemented in pure Python that I can use as a reference (or use directly)? I need a simple language (simple statements and ifs will do).
Edit: Sorry if I did not make myself clear, but what I want is "a language to customize the running of the program". Even though pypi seems a great option, what I am looking for is something simpler that I can study and extend myself if the need arises. My Google searches point towards XML-based languages (BMEL, XForms, etc.).

The question isn't completely clear on scope, but I have a hunch that PyPy, embedding other full languages, and similar solutions might be overkill. It sounds like iamgopal may really be interested in something more like Interpreter Pattern or Little Language.
If the language you want to support is really small (see the Interpreter Pattern link), then hand-coding this yourself in Python won't be too hard. You can write a simple parser (Google around; here's one example), then walk the AST and evaluate user expressions.
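To give a feel for the size of such a hand-coded solution, here is a minimal sketch of a tokenizer, a recursive-descent parser and a tree-walking evaluator for arithmetic expressions with variables. The grammar and all names (`tokenize`, `Parser`, `parse`, `evaluate`) are assumptions chosen for illustration.

```python
import operator
import re

TOKEN = re.compile(r"\s*(?:(\d+)|([A-Za-z_]\w*)|(.))")
OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}

def tokenize(text):
    tokens = []
    for number, name, op in TOKEN.findall(text):
        if number:
            tokens.append(("num", int(number)))
        elif name:
            tokens.append(("name", name))
        elif op.strip():  # skip stray whitespace
            tokens.append(("op", op))
    return tokens

class Parser:
    def __init__(self, tokens):
        self.tokens, self.pos = tokens, 0

    def peek(self):
        if self.pos < len(self.tokens):
            return self.tokens[self.pos]
        return ("end", None)

    def take(self):
        token = self.peek()
        self.pos += 1
        return token

    def expr(self):  # expr := term (('+' | '-') term)*
        node = self.term()
        while self.peek() in (("op", "+"), ("op", "-")):
            node = (self.take()[1], node, self.term())
        return node

    def term(self):  # term := atom (('*' | '/') atom)*
        node = self.atom()
        while self.peek() in (("op", "*"), ("op", "/")):
            node = (self.take()[1], node, self.atom())
        return node

    def atom(self):  # atom := number | name | '(' expr ')'
        kind, value = self.take()
        if kind in ("num", "name"):
            return (kind, value)
        if (kind, value) == ("op", "("):
            node = self.expr()
            if self.take() != ("op", ")"):
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {value!r}")

def parse(text):
    parser = Parser(tokenize(text))
    node = parser.expr()
    if parser.peek()[0] != "end":
        raise SyntaxError("trailing input")
    return node

def evaluate(node, env):
    """Walk the tuple-based AST and compute its value."""
    kind = node[0]
    if kind == "num":
        return node[1]
    if kind == "name":
        return env[node[1]]
    return OPS[kind](evaluate(node[1], env), evaluate(node[2], env))
```

For example, `evaluate(parse("2 * (x + 3) - 1"), {"x": 4})` walks the tree and yields 13. User expressions can only reach what is placed in `env`, which keeps the language small on purpose.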
However, if you expect this to be used for a long time or by many people, it may be worth throwing a real language at the problem. (I'd recommend Python itself if your users are already familiar with basic Python syntax).

Ren'Py is a modification to Python syntax built on top of Python itself, using the language tools in the stdlib.

For your user's sake, don't use an XML based language - XML is an awful basis for a programming language and your users will hate you for it.
Here is a suggestion. Use a strict subset of Python for your language. Use the compiler module to convert their code into an abstract syntax tree and walk the tree to validate that the code conforms to your subset before converting the AST into Python bytecode.
N.B. I just checked the docs and see that the compiler package is deprecated in 2.6 and removed in Python 3.x. Does anyone know why that is?
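In Python 3 the standard-library `ast` module covers this use case. The following is a minimal sketch of the "strict subset" idea with it; the whitelist `ALLOWED` and the function name `compile_subset` are hypothetical, and the chosen subset (assignments, `if`, simple arithmetic and comparisons) is only an example.

```python
import ast

# Node types the hypothetical subset permits; everything else is rejected.
ALLOWED = (
    ast.Module, ast.Expr, ast.Assign, ast.If,
    ast.Name, ast.Load, ast.Store, ast.Constant,
    ast.BinOp, ast.Add, ast.Sub, ast.Mult,
    ast.Compare, ast.Eq, ast.Lt, ast.Gt,
    ast.BoolOp, ast.And, ast.Or,
)

def compile_subset(source):
    """Parse source, reject anything outside the subset, return bytecode."""
    tree = ast.parse(source, mode="exec")
    for node in ast.walk(tree):
        if not isinstance(node, ALLOWED):
            raise ValueError(f"{type(node).__name__} is not allowed")
    return compile(tree, "<user script>", "exec")

def run_subset(source, variables):
    """Execute validated code with no builtins and return the variables."""
    env = dict(variables)
    env["__builtins__"] = {}
    exec(compile_subset(source), env)
    del env["__builtins__"]
    return env
```

Because calls, imports and attribute access are not on the whitelist, a script such as `import os` is rejected before anything runs. This is a sketch of the validation step, not a hardened sandbox; a whitelist that is widened carelessly can reopen holes.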

Numerous template languages such as Cheetah, Django templates, Genshi, Mako, Mighty might serve as an example.

Why not Python itself? With some care you can use eval to run user code.
One of the good thing about interpreted scripting languages is that you don't need another extra scripting language!

PLY (Python Lex-Yacc)
is something of your interest.

Possibly Common Lisp (or any other Lisp) will be the best choice for that task, because Lisp makes it possible to easily extend the host language with powerful macros and construct a DSL (domain-specific language).

If all you need is simple if statements and expressions, I'm sure it wouldn't be an awful task to parse each line. Something like
if some flag
    activate some feature
    deactivate some feature
elif some other flag
    activate some feature
    activate some feature
else
    logout
Just write a class which, while parsing takes the first word, checks if it's "if, elif, else," etc, and if so, check a flag and set a flag saying you either are or are not executing until the next conditional. If it's not a conditional, call a function based on the first keyword that would modify the program state in some way.
The class could store some local execution state (are we in an if statement? If so are we executing this branch?) and have another class containing some global application state (flags that are checkable by if statements, etc).
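A minimal sketch of the interpreter described above, assuming a flat script without nested conditionals. The class name `MiniInterpreter`, the `flags` dictionary and the `commands` table are hypothetical; the application state is reduced to a dictionary of flags.

```python
class MiniInterpreter:
    def __init__(self, flags, commands):
        self.flags = flags        # application state checked by conditionals
        self.commands = commands  # first keyword -> callable(rest of line)

    def run(self, script):
        executing = True      # are we in a branch that should run?
        branch_taken = False  # has any branch of the current if-chain run?
        for raw_line in script.splitlines():
            line = raw_line.strip()
            if not line:
                continue
            keyword, _, rest = line.partition(" ")
            if keyword == "if":
                executing = bool(self.flags.get(rest, False))
                branch_taken = executing
            elif keyword == "elif":
                executing = not branch_taken and bool(self.flags.get(rest, False))
                branch_taken = branch_taken or executing
            elif keyword == "else":
                executing = not branch_taken
                branch_taken = True
            elif executing:
                # Unknown keywords raise KeyError: scripts can only reach
                # the functions that were registered explicitly.
                self.commands[keyword](rest)

log = []
interpreter = MiniInterpreter(
    flags={"some flag": False, "some other flag": True},
    commands={
        "activate": lambda arg: log.append(("activate", arg)),
        "deactivate": lambda arg: log.append(("deactivate", arg)),
        "logout": lambda arg: log.append(("logout", arg)),
    },
)
```

Running the example script from above through `interpreter.run(...)` with these flags executes only the `elif` branch. Since there is no closing keyword, a conditional stays in effect until the next one, exactly as the description says, which is one of the reasons this design gets fragile quickly.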
This is probably the wrong thing to do in your situation (it's very prone to bugs, it's dangerous if you don't treat the data in the scripts correctly), but it's at least a start if you do decide to interpret your own mini-language.
Seriously though, if you try this, be very, very, srs careful. Don't give the scripts any functionality that they don't definitely need, because you are almost certainly opening security holes by doing something like this.
Don't say I didn't warn you.
