Wednesday, April 04, 2007

Long Object

I am behind on everything. Anyway, I wanted to write a program to calculate the checksum for an ISBN-13 code. As book lovers know, in January of this year ISBNs went from 10 digits to 13 digits. It would make sense to write the program as an object. The object would store the ISBN and let you extract either the 10-digit or the 13-digit version of the number.
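For reference, here is a minimal sketch of the short procedural version of that calculation, written in Python rather than PHP. The function names are my own illustrative choices and are not taken from the lastflood.com object. It assumes the standard ISBN-13 rule (weights alternating 1 and 3 over the first twelve digits) and the usual "978" prefix when converting from a 10-digit ISBN.

```python
def isbn13_check_digit(first12: str) -> int:
    """Return the ISBN-13 check digit for the first 12 digits."""
    if len(first12) != 12 or any(ch not in "0123456789" for ch in first12):
        raise ValueError("expected exactly 12 decimal digits")
    # Weights alternate 1, 3, 1, 3, ... starting from the leftmost digit.
    total = sum(int(ch) * (3 if i % 2 else 1) for i, ch in enumerate(first12))
    return (10 - total % 10) % 10


def isbn10_to_isbn13(isbn10: str) -> str:
    """Convert a 10-digit ISBN to its 13-digit form (assumes the 978 prefix)."""
    chars = isbn10.replace("-", "").replace(" ", "")
    if len(chars) != 10:
        raise ValueError("expected 10 characters after removing separators")
    # Drop the old check character (which may be 'X') and recompute it.
    body = "978" + chars[:9]
    return body + str(isbn13_check_digit(body))


print(isbn10_to_isbn13("0-306-40615-2"))
```

This sketch does not validate the old ISBN-10 check character; it simply discards it and computes the new one.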

The site lastflood.com has a nicely written PHP object for ISBNs. It is an example of very well written code; however, it strikes me that this well-written code is over 1500 lines long!

The size of the source code matters little at run time for compiled languages like Java and C++. The problem with a scripting language like PHP is that, unless the server caches the compiled script, the code has to be compiled on every request. In a typical year, this object would be called and compiled several hundred million times. I am left scratching my head, wondering whether I should write "well written" code or go for speed. The object would be running on a crowded shared server.

I am really caught in a quandary: should I write good code or should I write fast code? How big a hit do I take for using a 1500-line object as opposed to short procedures that take maybe 80 lines?

The fact that there seems to be a dichotomy between good and fast strikes me as wrong.

Back to my rant on Unicode. An ISBN is simply a number. You can represent any 13-digit number in 44 bits. If I am using a 64-bit Unicode character set to store the ISBNs, I would need 832 bits (13 characters at 64 bits each) to store each ISBN. When I store the ISBN as a string in a database, the key is almost twenty times the size I really need. This does not matter for the majority of applications. When you get into analyzing data, however, the inefficiencies stack up. For example, if you were working on a database that recorded cross-references between books, you could very quickly end up with lists of millions of ISBNs.
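The arithmetic can be checked with a few lines of Python. This is only a sketch: the 64 bits per character is the hypothetical figure used in this post, and packing the ISBN into an 8-byte unsigned integer is one possible storage choice that I am assuming for illustration, not something any particular database requires.

```python
import struct

# Largest value a 13-digit number can take.
LARGEST_ISBN13 = 10**13 - 1

# Bits needed to hold any 13-digit number as a plain integer.
bits_needed = LARGEST_ISBN13.bit_length()

# Bits used by 13 characters at the hypothetical 64 bits per character.
bits_as_wide_string = 13 * 64

# How many times larger the string form is than the minimum.
ratio = bits_as_wide_string / bits_needed


def pack_isbn(isbn13: str) -> bytes:
    """Store a 13-digit ISBN as an 8-byte big-endian unsigned integer."""
    return struct.pack(">Q", int(isbn13))


def unpack_isbn(raw: bytes) -> str:
    """Recover the 13-digit string, restoring any leading zeros."""
    (value,) = struct.unpack(">Q", raw)
    return f"{value:013d}"


print(bits_needed, bits_as_wide_string, round(ratio, 1))
print(len(pack_isbn("9780306406157")), unpack_isbn(pack_isbn("9780306406157")))
```

Even rounding the 44 bits up to a full 64-bit integer, the packed form is 8 bytes per ISBN, which adds up quickly across lists of millions of keys.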

Yes, I know that the 1500 line program is well written. It is better than anything I ever write, but my brain keeps screaming at me the word "efficiency"! The ISBN is just one tiny element that I have on pages. If I wrote everything in good style, my programs would be 80,000+ lines of code in size.

I neither want to write nor maintain code that is that long. What I want is code that lets me express the actual logic needed for a task in the cleanest, most efficient manner possible. In other words: I am a dinosaur headed for extinction.
