Writing a meta-program to write your program

15 comments, last by Ramsess 16 years, 5 months ago
I'm currently working on a large number of PHP websites, each with its own MySQL-persisting object hierarchy, and so I regularly find myself writing the same kind of SQL queries, the same kind of functions, the same kind of classes. This calls for refactoring. My idea was to write a meta-program that would parse data about what the object hierarchies look like, and generate the code to create the database, access it, and persist objects accordingly.

The trouble is that while PHP does indeed support runtime code generation, 1° the code generation doesn't actually have to be done at runtime, 2° coding in PHP is already unsafe, so meta-programming in it is downright insane, 3° the readability of the resulting code is greatly reduced, as it mixes factored code generation with specific behaviour insertion, and 4° PHP's level of introspection is lacking.

So, I'm looking at O'Caml right now. The point of the program would be to generate human-readable, commented and documented PHP code by manipulating an abstract syntax tree. This would be divided into two parts: hand-written code (transformed into that AST through camlp4 and representing the specific code for the various persisting objects) and manipulation functions (these would decorate classes with the ad hoc functions, database accesses, comments).

The meta-program would also perform type-checking and contract-checking (mark user input as "tainted" and only remove the "tainted" tag when it goes through the appropriate sanitization functions) because that makes coding in PHP easier. Plus, unit tests could be auto-generated with instrumentation to measure coverage, and caching policies could be auto-detected and inserted wherever needed. And I could keep the manipulation functions from one project to the next (with some general ones such as those implementing a design pattern, the type-checking and contract-checking bits, or those implementing more specific algorithmic patterns such as maximum-finding).
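For illustration, the "tainted" check described above could look roughly like the sketch below. It is written in Python rather than O'Caml purely for brevity, and everything in it is an assumption: the tuple-based mini-AST, the function names, and the whitelist of sanitizers are hypothetical and not part of any existing tool.

```python
# Hypothetical mini-AST (an assumption for this sketch). Expressions are tuples:
#   ("lit", text)        a literal string written by the programmer
#   ("input", name)      a value read from user input
#   ("call", fn, arg)    a call to a one-argument function
#   ("concat", a, b)     string concatenation

# Assumed whitelist of functions that remove the "tainted" tag.
SANITIZERS = {"mysql_real_escape_string", "intval"}


def is_tainted(expr):
    """Return True if the expression may carry unsanitized user input."""
    kind = expr[0]
    if kind == "lit":
        return False
    if kind == "input":
        return True
    if kind == "call":
        _, fn, arg = expr
        if fn in SANITIZERS:
            return False  # the sanitizer clears the tag
        return is_tainted(arg)  # any other function passes the tag through
    if kind == "concat":
        _, left, right = expr
        return is_tainted(left) or is_tainted(right)
    raise ValueError("unknown expression kind: %r" % (kind,))


# A query built from raw input is flagged; the sanitized variant is not.
unsafe = ("concat", ("lit", "SELECT * FROM students WHERE name = '"),
          ("input", "name"))
safe = ("concat", ("lit", "SELECT * FROM students WHERE name = '"),
        ("call", "mysql_real_escape_string", ("input", "name")))
print(is_tainted(unsafe), is_tainted(safe))
```

A real checker would work on the full PHP AST and would refuse to emit code whenever a tainted expression reaches a database call; this only shows how the tag propagates.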
Among other advantages, I would say that this would allow writing code in several languages if the AST is generic enough, as well as improve compile-upload-test times during fast iterations by automatically selecting only those areas of code which need to be generated, reducing the overhead by a linear factor. One could also include makefiles in there. Of course, since the generated code would be clean human-readable PHP, I could leave it there without fear of reprisal from management should I decide to drop the project (or even sell it). Your opinions?
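One possible way to select only the code that needs regenerating is to fingerprint each class description and compare it with the fingerprint from the previous run. The following is a minimal sketch in Python under that assumption; the dictionary-based class descriptions and both function names are hypothetical.

```python
import hashlib
import json


def spec_digest(spec):
    """Fingerprint a class description (a plain dict) in a stable way."""
    canonical = json.dumps(spec, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def stale_classes(specs, previous_digests):
    """Return (names whose description changed, digests for this run)."""
    current = {name: spec_digest(spec) for name, spec in specs.items()}
    stale = sorted(name for name, digest in current.items()
                   if previous_digests.get(name) != digest)
    return stale, current


specs = {
    "Student": {"fields": ["name", "year"]},
    "Course": {"fields": ["title"]},
}
first_run, digests = stale_classes(specs, {})
specs["Student"]["fields"].append("email")
second_run, _ = stale_classes(specs, digests)
print(first_run, second_run)
```

On the first run everything is generated; afterwards only the classes whose description changed are regenerated and uploaded. This ignores changes to the generator itself, which would have to invalidate every digest.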
I would definitely advise being careful, but I would definitely say do it. Within the last six weeks, I have found myself doing a lot of the same type of work at my job, only instead of PHP, it's ASP pages: SQL table creation, insertion and so forth. It did take some time and a good bit of testing, which my boss was very helpful with, but now that it's complete, I have taken a task that we do routinely and turned it into a two-hour setup process, testing included, as opposed to a day's work of coding and testing.

Code makes the man
With my very limited idea of what it is that you're trying to solve with this code generator, I would have to say that code generation is the wrong solution to the problem. Even if your code generator will solve your problem of writing the repetitious code for you initially, it doesn't solve the problem of maintaining all of the code it generates. When you need to change the code, you'll either have to regenerate all code, wiping out specific modifications to individual PHP sites, or write it to be so intelligent that it will somehow avoid clobbering the modifications.

Why can't you write modules that can be shared by all of your php pages to eliminate repetition instead?
Quote:Original post by smr
Why can't you write modules that can be shared by all of your php pages to eliminate repetition instead?


Because writing those modules is repetitive.
I'm not 100% sure what it is you are doing, but it sounds a bit like you want Ruby on Rails & Active Record. A friend of mine tried to duplicate a lot of the metaprogramming magic of rails in PHP, but although it worked pretty well, it was far from elegant.
Would you need to write a different module for each site?
Ruby was an idea, but I don't have access to anything but PHP4 (and I don't think the guys would let me write code they couldn't find someone cheap to understand).

Quote:Original post by smr
Would you need to write a different module for each site?


The point is that I create one class per problem-space object (students, courses, grades in one application, consultants and timetables in another, MtG cards and decks in yet another, and so on). This class has fields (methods and the corresponding member variables) that match the structure of a database (although not in a field = column fashion, but close), with the corresponding loading code, table creation code, input validation code, and so on. Plus, of course, all the high-level logic, but that's another story as it's easily factored out. While all of these classes share a lot of similarities, factoring those similarities out into external modules still leaves a lot of unfactored repetition around, simply because PHP lacks expressiveness and thus imposes a minimum of "define method, fetch parameter array, forward call to module" for each field, regardless of how much factoring I manage to do with the field code.

Unless, of course, I generate the classes themselves by factoring out the actually relevant data (the field names, types and constraints) and passing it to a class generator. However, this solution bears the disadvantages of being leaky (that is, there's always some unfactorable property here or there in a constructor or field) and obscenely obscure (you don't have an actual class definition lying around to document or examine). So, I'd rather run a generator externally to have my classes ready and clean for the day when I leave all of this to the person after me.
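As a rough illustration of such an external generator, here is a minimal sketch in Python that turns a list of field descriptions into a PHP4-style class and a matching CREATE TABLE statement. The (name, SQL type) field format, the function names, and the one-field-per-column mapping are all simplifying assumptions; the scheme described above is explicitly not a strict field = column mapping.

```python
def generate_php_class(class_name, fields):
    """Emit a commented PHP4-style class with one getter per field."""
    lines = [
        "<?php",
        "/**",
        " * %s: generated file, edit the generator input instead." % class_name,
        " */",
        "class %s {" % class_name,
    ]
    for name, _ in fields:
        lines.append("    var $%s;" % name)
    for name, _ in fields:
        getter = "get" + name[0].upper() + name[1:]
        lines.append("")
        lines.append("    /** Returns the %s field. */" % name)
        lines.append("    function %s() {" % getter)
        lines.append("        return $this->%s;" % name)
        lines.append("    }")
    lines.append("}")
    lines.append("?>")
    return "\n".join(lines) + "\n"


def generate_create_table(table_name, fields):
    """Emit the CREATE TABLE statement, one column per field (a simplification)."""
    columns = ",\n".join("    `%s` %s" % (name, sql_type)
                         for name, sql_type in fields)
    return "CREATE TABLE `%s` (\n%s\n);\n" % (table_name, columns)


student_fields = [("name", "VARCHAR(64)"), ("year", "INT")]
print(generate_php_class("Student", student_fields))
print(generate_create_table("students", student_fields))
```

Because the output is an ordinary PHP file, the class definition is still there to read and document, which is the point of running the generator externally rather than at runtime.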
I don't know. It seems like a great way to get a lot of code written initially (although you're going to write a lot of code by hand for the generator), but all that generated code will need to be maintained. Maybe maintenance won't be a problem if you can simply regenerate the code, but I've never in my experience had software used by customers where I haven't needed to make targeted modifications for specific cases. It seems like you've already got your mind made up though to do this.
Quote:Original post by smr
Maybe maintenance won't be a problem if you can simply regenerate the code, but I've never in my experience had software used by customers where I haven't needed to make targeted modifications for specific cases.


The point would be that if any modifications are required (targeted or general), I would add them to the generator control code instead of the generated code.

Of course, once I deliver the code, I will stipulate that I will only maintain the original code I delivered, and that if they make any modifications to that code I will not be bound to respect those modifications (a very sane thing to add to any maintenance contract, because I don't want to be liable for whatever mudballs they come up with once I leave).

I'm not sure if I'm on the right track with what you're trying to do, but with J2EE I have used ORM libraries such as Hibernate, which simply require an XML mapping file that mirrors a Java class (and obviously there are tools which can automatically generate these XML mappings).
By researching these tools, maybe you could write a PHP-based solution with a similar goal.
"I am a donut! Ask not how many tris/batch, but rather how many batches/frame!" -- Matthias Wloka & Richard Huddy, (GDC, DirectX 9 Performance)

http://www.silvermace.com/ -- My personal website

