Code completion for dynamic languages

2007-02-02 • Python, Ruby, Emacs, Dynamic Languages

Here’s an interesting article in which Huw Collingbourne describes his frustration with trying to program a smart code completion system for Visual Studio. The problem is that the code in question is Ruby. One particularly juicy quotation reads:

Ruby is a so-called “dynamic” language, which is a polite way of saying that it’s hugely unpredictable.

He goes on to explain:

A Ruby program is so dynamic that you can never be sure what it is up to from one moment to the next. To take a simple example, when you write some stand-alone functions into the editor, those functions get bound into the base class of the entire Ruby class hierarchy. That means that every single Ruby class automatically “inherits” them - and the IntelliSense system is expected to know about it!

Despite overtly grumbling about dynamic languages, the author does a good job of promoting them. Software is supposed to be soft and dynamic languages help keep it that way. Everything is open. Everything can be queried. Everything can be adapted.
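
As a rough illustration of "open, queryable, adaptable", here is a minimal sketch in Python rather than Ruby. The class names `Base` and `Widget` and the function `greet` are made up for the example. It attaches a function to a base class at runtime, after an instance of a subclass already exists, which is loosely analogous to the Ruby behaviour quoted above.

```python
class Base:
    """Stand-in for the root of a class hierarchy (hypothetical)."""


class Widget(Base):
    pass


w = Widget()  # created before greet exists


def greet(self):
    return f"hello from {type(self).__name__}"


# Adapt: attach the stand-alone function to the base class at runtime.
Base.greet = greet

# Query: an instance of a subclass, created earlier, now sees the new method.
print("greet" in dir(w))
print(w.greet())
```

A static analyser looking only at the definition of `Widget` would have no reason to offer `greet` as a completion; the running program knows about it immediately.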

Emacs Python mode

My preferred IDE is emacs — whatever language I’m using. It has a particularly nice Python integration. Here’s how it works:

1. Pull up a full-screen window
2. Split the window vertically
3. Use one side for the code you’re working on
4. Use the other side to run an interpreted Python session
5. Switch sides as desired

I continually select regions of code to execute. I continually step into the interpreter and use the Python help command to get help on modules and functions — including the ones I’m creating right now. I rework a function in one window then exercise it in the other. I sketch experimental code, run it, rub it out.
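
To give a flavour of that loop, here is a small sketch of the kind of thing that might be typed into the interpreter side. The function `area` is invented for the example. In a live session `help(area)` would be the usual call; the sketch uses the standard `inspect` module so that the results are plain values.

```python
import inspect


def area(width, height=1.0):
    """Return the area of a width-by-height rectangle."""
    return width * height


# Ask the running interpreter about a function defined a moment ago.
print(inspect.signature(area))  # the call signature
print(inspect.getdoc(area))     # the docstring, which help(area) also shows
print(area(3, 2))               # exercise it straight away
```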

Collingbourne says:

Creating real IntelliSense is much harder. The only way to do it properly is to analyse the code much as the Ruby interpreter itself does. The big difference is that the interpreter only goes into operation when a program is complete …

He spots the answer in one sentence — to do it properly, you do have to hook into the interpreter and its powers of reflection — then misses the point in the next. You just have to run the interpreter alongside the code you’re developing, like emacs Python mode does.
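
A minimal sketch of completion by reflection, assuming the editor can reach the namespace of a running Python session. The function `candidates` and the dictionary `session` are hypothetical names for the example, and this is not how any particular editor mode implements it.

```python
def candidates(namespace, dotted_name, prefix):
    """List attributes of a live object whose names start with prefix.

    namespace   -- dict of names from the running session (an assumption)
    dotted_name -- text before the final dot, e.g. "text" or "os.path"
    prefix      -- text typed after the dot, possibly empty
    """
    head, *rest = dotted_name.split(".")
    obj = namespace[head]
    for part in rest:
        obj = getattr(obj, part)  # note: getattr can itself run property code
    return sorted(name for name in dir(obj) if name.startswith(prefix))


# Stand-in for the interpreter's namespace after some code has been run.
session = {"text": "space sensitive"}

# The user has typed "text.sp" and asks for completions.
print(candidates(session, "text", "sp"))
```

The completer never has to work out what type `text` is. It asks the object, which already knows.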

A cheat’s guide to code-completion

Collingbourne also exposes some code-completion systems as “cheats”.

Some code-completion systems solve this problem in a cunning way - they cheat. Instead of working out what type of object x is at any given moment, taking into account all the difficult stuff such as its scope, inheritance and context, they work alphabetically. If someone enters a dot followed by the letters “my”, they drop down a list of names such as “my_method”, “my_othermethod” and “my_random_guess” whether or not those methods have anything to do with the object in question.

Well, I’m happy to cheat using emacs in this way. The ALT-/ combination uses alphabetic completion and I use it more than any other key sequence, whatever document I’m working on. More often than not, it does the right thing. As usual, simple solutions are better.
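
The "cheat" is simple enough to sketch in a few lines of Python. This is an illustration of the idea only, not how Emacs implements ALT-/, and the function name `word_completions` is made up. It ignores types entirely and offers any word already in the buffer that starts with what has been typed.

```python
import re


def word_completions(buffer_text, prefix):
    """Return distinct words in buffer_text that start with prefix."""
    found = []
    for word in re.findall(r"\w+", buffer_text):
        if word.startswith(prefix) and word != prefix and word not in found:
            found.append(word)
    return found


buffer_text = (
    "obj.my_method()\n"
    "obj.my_othermethod()\n"
    "my_random_guess = 1\n"
    "obj.my"
)

# Completing the trailing "my" offers every matching word, related or not.
print(word_completions(buffer_text, "my"))
```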

Feedback

Keith Ray 2007-02-02
re -
1. Pull up a full-screen window
2. Split the window vertically
3. Use one side for the code you’re working on
4. Use the other side to run an interpreted Python session
I was just thinking this morning that doing that on a projector would be a great way to show off test-driven development in Python (or even Java).
Thomas Guest 2007-02-04

Hi Keith. Doing any programming in front of an audience is brave, and makes for a very convincing demonstration.

I've been lucky enough to attend the ACCU conference a few times and admire the speakers who have the ability to program live. Herb Sutter excels at it.
rubikitch 2007-04-11

rct-complete in `rcodetools' package does editor-independent accurate completion. Instead of analyzing Ruby code, rct-complete actually executes the code to get candidates.
Thomas Guest 2007-04-12

rubikitch, thanks for the note -- that's exactly the kind of intelligent code completion technique I was talking about.
vais 2007-04-17

rubikitch, am I missing something, or is this an extremely dangerous way to do completion? I have been struggling with this thought for a while, could you please explain your reasoning as to how it is ok to run arbitrary code to provide completion candidates. Side effects, anybody?

For example, copy and paste any sample code from www.webrick.org, and then invoke completion in your editor of choice. What happens? Well, at a minimum you get an http server running on port 2000, and, I suppose depending on the editor in question, it hangs as the ruby code blocks. I have not tried this with things like File.delete, but even if you blacklist certain things, you can never be 100% sure that there are no negative side effects from executing the code you happen to be editing.

So, the question is whether 100% accurate completion is worth the price of uncertainty about the consequences of code execution. I have never looked forward to being proved wrong more than on this one.
Thomas Guest 2007-04-17

vais, I haven't actually used rct-complete myself, and I haven't found any definitive documentation online, but I think the programmer is fully in charge of what gets executed and when. Determining code-complete options can be done using an object's reflection interfaces; then you can select the method and decide if you want to run the code.

Developing any code is "dangerous", but narrowing the gap between writing and executing code reduces the risk.
vais 2007-04-19
Thomas, I could not agree with you more on the points you make in this post regarding your Emacs setup for working with Python -- my workflow in Ruby is almost identical.

"...to do it properly, you do have to hook into the interpreter and its powers of reflection..." could not be more dead on.

However, you should at least try rct-complete before jumping to the conclusion that it does the right thing, somehow, magically. That is just plain ole wishful thinking. And, yes, I could not find any documentation either (that would explain the principles behind its operation). I guess the author did not want to kill the magic...

Of course Ruby code must be executed in order to obtain code-complete options, and that is exactly what the tool does. Here, try this:
1. create a file named test.txt
2. create a file named test.rb, and type the following: File.rename('test.txt', 'oops.txt')
3. Hit enter, and type this: Dir.
4. because my editor is set up to call rct-complete when a period is pressed, completion happens, and I get a list of everything the Dir class can do, great!
Now, look into the directory where your test.txt and test.rb are. What do you see? oops.txt and test.rb.
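
[Editorial sketch: the same experiment can be reproduced in Python under stated assumptions. The "completer" below is a deliberately naive stand-in that runs every finished line of the file before inspecting the result. It is not rct-complete and makes no claim about how rct-complete works internally. Everything happens inside a temporary directory.]

```python
import os
import tempfile

# The "file being edited": one line with a side effect, then a half-typed line.
source = "import os\nos.rename('test.txt', 'oops.txt')\nos."

with tempfile.TemporaryDirectory() as workdir:
    open(os.path.join(workdir, "test.txt"), "w").close()
    previous = os.getcwd()
    os.chdir(workdir)
    try:
        # Naive completer: execute all complete lines, then reflect on "os".
        complete_lines = "\n".join(source.splitlines()[:-1])
        namespace = {}
        exec(complete_lines, namespace)
        offered = [n for n in dir(namespace["os"]) if not n.startswith("_")]
        remaining = sorted(os.listdir("."))
    finally:
        os.chdir(previous)

print("rename" in offered)  # the completion list itself is accurate
print(remaining)            # but the file on disk has been renamed
```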

I am all for narrowing the gap between writing and executing code, but if this example does not scare you, nothing will. That is just not the way to do it.

Once again, I wish it had some magic and just did the right thing, but a simple test points to the contrary. If only rubikitch would make some of his reasoning behind rct-complete publicly available, but alas...
Thomas Guest 2007-04-19

vais, you're quite right, I can't really comment on rct-complete because I haven't used it; and although I'm interested in Ruby, I hardly ever program in it, so I won't be able to provide any more informed comments. Thanks too for the use case. I take your point. I am confused though: are you saying you use rct-complete, even though it's so dangerous, or when you say "my editor" do you mean "the editor"?
vais 2007-04-19

Thomas, I would love to be able to use rct-complete, so when I found it, I got really excited. Of course the next step is to try it, and, it surely did all the things it was supposed to do - what a joy!

The trouble started when I discovered that if there are any syntax errors in the code, or even a mis-typed method name, autocompletion stopped working for the entire file until those things were removed. So, I started digging... and the rest is documented above.

So, to answer your question, I am not speaking in the abstract, and by "my editor" I really do mean my editor. BTW my editor for many years has been SciTE (http://www.scintilla.org/SciTE.html) - a Lua-scriptable tiny wonder (which by the way was originally created as just a demo of the Scintilla component which in turn was created because the author wanted a better editing component for working with Python code).

Mitchell made it even more perfect(tm) with his scite-tools project you can see here: http://caladbolg.net/scite.php, and scite-tools comes with rct-complete baked right in. Fortunately disabling it is just commenting out a line of Lua, which is what I unfortunately had to do as a result of my findings. I must add that scite-tools is MUCH more than that, and anybody who uses SciTE owes it to themselves to check it out. This is the best example I have seen of tapping into the power of Lua under the hood of this editor to do amazing things.
rubikitch 2007-04-28

I just implemented Test-Driven Completion (TDC) in rcodetools. TDC means that completion inside a method, which the current implementation cannot do, uses a unit test. Test scripts are self-contained and expected to be executed, so side effects are not a problem. If you want to use the development version, try:

darcs get http://eigenclass.org/repos/rcodetools/head rcodetools

After `darcs get' has fetched the repository, darcs pull updates your local repository.

Word Aligned

space sensitive programming