Generate versus annotate: code generation is still an important tool

In 2003, when I was only just starting to write JEE applications, each enterprise bean (using EJB2) required you to write a set of files. This was a pain, though fortunately there were tools like xdoclet to help. You just had to add some javadoc attributes, and these were used to generate a lot of the files, ensuring that programmers did not need to cut and paste too much (and avoiding all the mistakes you make because the work is just so plain boring).

This concept was later extended into annotations: instead of generating java source files at compile time, the information is read at runtime and everything is generated on the fly. This is a big boost for development, as compile cycles are a lot shorter (xdoclet was not particularly fast), and the technology has been put to many good uses (see EJB3 and JPA, tapestry,…).
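To make "read at runtime" concrete, here is a minimal sketch in plain Java. The @Column annotation and the Customer class are invented for this example (they are not the JPA or EJB3 ones); only the reflection calls are standard. It shows metadata attached to fields and picked up through reflection, with no generation step before compilation.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class AnnotationDemo {

    // Hypothetical annotation, made up for this sketch.
    // RUNTIME retention keeps it visible to reflection.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface Column {
        String name();
    }

    // Hypothetical entity class carrying the metadata.
    static class Customer {
        @Column(name = "CUSTOMER_NAME")
        String name;

        @Column(name = "CITY")
        String city;

        String scratch; // not annotated, so it is ignored below
    }

    // Collects the column names declared on a class, sorted so the
    // result does not depend on the order reflection returns fields in.
    static List<String> columnNames(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Field field : type.getDeclaredFields()) {
            Column column = field.getAnnotation(Column.class);
            if (column != null) {
                names.add(column.name());
            }
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        System.out.println(columnNames(Customer.class)); // [CITY, CUSTOMER_NAME]
    }
}
```

A real framework would of course do more with the metadata than list it, but the mechanism is the same: nothing is written to disk, and nothing new exists for the compiler or the IDE to see.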

To jump-start my way in the JEE world, I followed a “Jboss advanced” course. While talking to the other participants at the course, it was clear to me that most of us did not believe xdoclet to be enough. Many (including me) were developing some kind of framework to add even more metadata, improve consistency of the program and reduce work.

Even with the recent evolution towards annotations, I still believe there is a need for such frameworks. Annotations are a great tool to add metadata, but they are limited to adding information about one class. While it would be possible to use them to generate entirely new classes when needed, this would be a pain, as these classes would not be available while you are writing code (more exactly, the IDE would not be able to help you reference these classes, and your compiler would also be unhappy). This limitation hurts productivity again. There are also cases where it would be useful to customize some of the annotated behaviour, and in some cases this is more easily done using other methods.

I am still a big proponent of compile time code generation. To ensure that the programmer and user interfaces to a system are consistent, there is not a lot that can beat the effectiveness of code generation. What is often forgotten is just how much boilerplate code is being written all the time.

If you are going to write an enterprise application, you are going to need a data access layer, a CRUD user interface, probably some web services to access the data layer, some basic (preferably user customizable) reporting, possibly integration with a full text search engine, etc. Just for the CRUD user interface, you need access control and rights management (who is allowed to see or edit which fields, partly determined by the administrator and partly by the users themselves), consistency (so that, for example, linking records is always done in the same way), easy navigation inside the forms, aids for fast keyboard entry of data, etc.

That is a lot of boilerplate code to produce, and a lot of code to maintain and modify each time you add a table or a field in your data layer.

Unfortunately, these modifications, or more generally the evolution of the software, are often also one of the weak points of code generation. It happens a lot that code generation is static. By that I mean that the generated code has a certain behaviour, and it can only be changed by editing the generated code. The end result is one-time generated code, where the generated stuff is committed to the source repository and maintained like any other code. This way you can easily generate a (possibly very powerful) prototype, but the problem of maintaining the software has not become any easier (probably the reverse, as you now need to maintain foreign code). In a demonstration that was given at JavaOne afterglow this year, it seemed that Ruby on Rails, for example, suffers from this problem.

It would be better to ensure that the generated code has enough hooks to allow customization, and that the generator is smart enough not to remove, delete or regenerate these hooks when the generation is done again (and that they do not need modifications). This is for example how it is handled in torque, one of the first tools I have seen to implement this idea.
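As a rough sketch of that idea (not the actual torque implementation; the class and file names here are made up for illustration), a generator can split each entity into two files: a Base class which is overwritten on every run, and an empty subclass which is written only once and is then yours to edit.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class GenerationGapSketch {

    // Hypothetical generator: writes Base<Entity>.java on every run,
    // but creates <Entity>.java only when it does not exist yet.
    static void generate(Path dir, String entity) throws IOException {
        Files.createDirectories(dir);

        // Generated part: always overwritten, never edited by hand.
        String baseSource = "// GENERATED FILE, do not edit\n"
                + "public abstract class Base" + entity + " {\n"
                + "    // generated fields and accessors would go here\n"
                + "}\n";
        Files.write(dir.resolve("Base" + entity + ".java"),
                baseSource.getBytes(StandardCharsets.UTF_8));

        // Hook part: written once, then left alone on regeneration.
        Path hook = dir.resolve(entity + ".java");
        if (Files.notExists(hook)) {
            String hookSource = "public class " + entity
                    + " extends Base" + entity + " {\n"
                    + "    // hand-written customizations go here\n"
                    + "}\n";
            Files.write(hook, hookSource.getBytes(StandardCharsets.UTF_8));
        }
    }

    // Generates, simulates a manual edit of the hook class, regenerates,
    // and returns the hook file's content after the second run.
    static String demo() {
        try {
            Path dir = Files.createTempDirectory("generation-gap");
            generate(dir, "Customer");

            Path hook = dir.resolve("Customer.java");
            String edited = new String(Files.readAllBytes(hook), StandardCharsets.UTF_8)
                    .replace("// hand-written customizations go here",
                             "// my custom method");
            Files.write(hook, edited.getBytes(StandardCharsets.UTF_8));

            generate(dir, "Customer"); // second run must keep the edit
            return new String(Files.readAllBytes(hook), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Because the rest of the application only refers to the subclass, adding a field means regenerating the Base class, while the customizations in the subclass survive untouched. Only the hand-written subclass really needs to live in the source repository.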

While I believe there are probably quite a few frameworks based on advanced code generation in use to aid in software development and maintenance, I assume most of them have been developed in-house as part of certain projects and are not available for general use. On the open source front, I know only of tools which help with some of the aspects mentioned above, usually the persistence problem. Only equanda seems to have the intention of covering the whole spectrum of boilerplate code. A lot of the examples above are already working (though not the web services or the full text search engine integration), with some other stuff thrown in for good measure.
