Orange is my favorite color

There are many haters of Hungarian Notation. Prepare to pile on.

I’ve had this post in my drafts queue for a long time but this thread about escaping SQL reserved keywords on the Transfer group prompted me to finish my thoughts:

Most applications have a User table. If those users are organized, there may be a Group table, and if there’s any kind of e-commerce going on, there might be an Order table. How would you name these so as not to use reserved words?

I learned Hungarian Notation from a Sybase database admin named Rick when I first started webdeving back in 1994. He worked at Prudential and I had never touched a database so when we started doing some more serious work, he told me about the Hungarian Notation they used at Prudential for naming columns:

vchCompanyName = company name, varchar
intEmployeeCount = employee count, integer

This seems innocuous enough. Since I didn’t control the database but needed to code against it, I accepted this as Gospel. It was my first interaction with a real database, one that ran on scary Sun Ultra 5 servers at some colo facility on the East Coast, so who was I to disagree? Honestly I was more concerned with getting things done than arguing over a field name. But now I have more time to argue. ;)

Arguments against Hungarian Notation

Convention over configuration is all the rage today. Ruby on Rails programmers claim this to be part of the secret sauce that empowers them to quickly build new services like Basecamp, Blinksale and Twitter. Many opponents of HN argue that the compiler is your sanity check, which makes HN especially relevant for dynamically typed languages like PHP and ColdFusion that have no such fallback. HN subtly communicates information to the developer, but it can also be leveraged programmatically, as I will demonstrate later.

Another common argument against HN is that the editing environment can give you on-demand details of data types or flag incompatible operations based on introspection. Most of the developers I know have migrated to Eclipse + CFEclipse for ColdFusion. While the IDE can give you a wealth of information about Java or C code, that support is not available for dynamically typed languages like ColdFusion, which don’t declare variable types.

Perhaps my favorite argument is that HN is not portable since two companies may not implement it in exactly the same way; therefore having no standard is preferable. But this makes no sense! There are changes anytime we switch companies or projects! We also need to learn where the bathrooms are located and the names of our new co-workers; are these critics really suggesting that developers are unable to adapt?

Lastly I have read that using HN locks the code into a particular data type making data model changes more difficult down the road. This might have been true before the advent of search and replace but with today’s IDEs, you can trigger a very accurate system-wide replacement with Regular Expressions in seconds. With the exception of some very large or distributed projects, I simply can’t believe this to be the case any longer. In fact, I would argue that Hungarian Notation makes these kinds of system-wide changes easier because the field and variable names have more specific and unique names making a RegExp more precise (this is the same reason why I never name an iterator variable “i” or “j” – I always use “ii” or “jj” so you can search and/or replace it).
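
As a rough illustration of the search-and-replace point, here is a minimal sketch in Python (used only so the example is runnable; it is not ColdFusion). The `rename_identifier` helper, the sample SQL and the sample loop are all hypothetical. It renames a prefixed identifier with a word-boundary regular expression, then counts how often "i" versus "ii" turns up in a plain substring search.

```python
import re

def rename_identifier(source: str, old: str, new: str) -> str:
    """Replace whole-word occurrences of `old` with `new`."""
    pattern = re.compile(r"\b" + re.escape(old) + r"\b")
    return pattern.sub(new, source)

# Hypothetical case: the employee count column changes from integer to numeric.
code = "SELECT intEmployeeCount FROM tblLookupCompany WHERE intEmployeeCount > 10"
renamed = rename_identifier(code, "intEmployeeCount", "numEmployeeCount")

# Hypothetical loop: a substring search for "i" also hits unrelated words
# such as "writeOutput", while "ii" only hits the iterator itself.
loop = "for (ii = 1; ii <= arrayLen(arrItems); ii++) { writeOutput(arrItems[ii]); }"
hits_i = loop.count("i")
hits_ii = loop.count("ii")
```

The longer and more distinctive the name, the less likely a system-wide replacement is to touch something it shouldn't.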

The old arguments against Hungarian Notation just simply don’t make sense in today’s web development environment. Thus, I present to you my top reasons to love Hungarian Notation!

Four Reasons to love Hungarian Notation

  1. With no variable typing, IDE introspection or language compiler to fall back upon for PHP, ColdFusion and other web languages, HN subtly communicates valuable information to the developer that helps prevent and identify mistakes.
  2. The greatest cost of software is actually the maintenance, assuming you’re not some pump-n-dump consultant who rides from one gig to the next. Hungarian Notation significantly improves maintainability because a developer can return to code six months or six years later and have an immediate handle on what is going on in the code without knowing the intricacies of the data model. These prefixes act like hints, continually reminding us of where we are and what we’re dealing with. It’s built-in documentation that you never have to sit down to write!
  3. Prefixes can allow you to catch special situations. I use Transfer ORM against PostgreSQL. One of the great things about PostgreSQL is the extensive set of data types it supports, including GIS “points”, UUIDs and timestamp-differential “intervals”. The problem is that Transfer would throw errors when I accessed an interval column, but Hungarian Notation came to the rescue. Since I prefix all interval columns with “inv”, I modified TransferSelecter.cfc starting with:
    default:
    	null = null & "text";

    to handle this special case:

    default:
        if (left(arguments.column, 3) EQ "inv")
        {
            null = null & "interval";
        }
        else
        {
            null = null & "text";
        }

    Not the ideal fix but the prefix saved my bacon with a minimal amount of work when I was in the middle of writing my application.

  4. No collision with reserved words in SQL or any other language. Consider an e-commerce site: it would surely have an “Order” table and perhaps a “Group” table, both of which are reserved keywords in SQL. You can work around that by making them plural or giving them a more descriptive name like “SiteOrders” but consistently prefixed names like tblLookupOrder will never conflict with reserved words.
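
The reserved-word point in reason 4 can be sketched in a few lines of Python (again only a runnable stand-in, not ColdFusion or SQL). `RESERVED_SAMPLE` is a deliberately tiny, incomplete sample of reserved words that I am assuming for illustration; the real list depends on your database. The `is_reserved` and `prefixed` helpers are hypothetical.

```python
# Assumption: a small, incomplete sample of SQL reserved words.
# The authoritative list varies by database and SQL standard version.
RESERVED_SAMPLE = {"USER", "GROUP", "ORDER", "SELECT", "TABLE", "WHERE"}

def is_reserved(name: str) -> bool:
    """Case-insensitive check against the sample list."""
    return name.upper() in RESERVED_SAMPLE

def prefixed(entity: str, kind: str = "Lookup") -> str:
    """Build a table name in the tbl<Kind><Entity> style."""
    return f"tbl{kind}{entity}"

entities = ["User", "Group", "Order"]
collisions = [e for e in entities if is_reserved(e)]         # bare names
safe_names = [prefixed(e) for e in entities]                 # prefixed names
still_colliding = [n for n in safe_names if is_reserved(n)]  # should be empty
```

Because every prefixed name starts with "tbl", it can only collide with a reserved word that also starts with "tbl", which is not something I have run into.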

Ghidinelli (Hungarian) Notation

There are many interpretations of Hungarian Notation out there but I will share the prefixes and naming conventions I use to paint a picture of a system I have used alone and in teams for over a decade. I don’t claim this to be the way, but rather a way.

My first personal rule is that all prefixes should be three characters. There are two reasons for this. First, three characters give you enough combinations to adequately abbreviate the object you are trying to describe. Second, a fixed length lets you check the prefix programmatically, since it will always be the first three characters (see the Transfer example above). Beyond that, I use a combination of what I learned back in 1994 and what sounded good to me as time went by:

Table Naming

  • tblLookup* – a normalized lookup table to hold an idea. ex: tblLookupUser, tblLookupOrder.
  • tblMap* – a mapping table between typically two lookup tables to describe a relationship. ex: tblMapUserOrders or I have seen others use tblMapUsersToOrders.
  • vue* – a view prefix, usually using the same Lookup and Map pieces. ex: vueLookupAttendee.

Column Naming

  • boo – boolean
  • int – integer
  • snt – small integer (int[2])
  • tnt – tiny integer (int[1])
  • chr – character
  • uid – character(35) used for storage of UUIDs which I use as primary keys in many instances.
  • vch – varying character
  • txt – text or unlimited varying character (depending on platform)
  • inv – interval
  • num – numeric
  • dec – decimal
  • mny – money, which may actually be decimal but better describes it as currency
  • xml – XML storage for databases that support it like PostgreSQL, Oracle…
  • dte – date
  • dtm – datetime/timestamp
  • rad – radians (may be represented as decimal)
  • net – IP address

Application Variable Naming

  • arr – array
  • str – struct (in ColdFusion, object in JavaScript, etc.)
  • obj – an instance of a ColdFusion component
  • int – integer/numeric
  • qry – query

I’ve become less strict about the variable naming rules in the past years mostly because with CFCs, I keep my methods short enough to be clear. I may be more strict in using them in the view layer where non-programmers or other users might need to do some work since we can share a key of “what is what”.

Conclusion

At the end of the day, Hungarian Notation does not cost the developer anything but may add value in a variety of circumstances, specifically consistency, maintainability and flexibility. There may be components of these conventions that people don’t like, but they clearly do solve the items laid out above.

These are guidelines I generally use but freely break as needed. I tend to be very strict in the data layer and less strict in the display layer. I’ve been writing my current application now for five years and I can’t imagine wading into some of my old code without these hints – it would take at least twice as long to make simple changes. This is the key reason why I continue to use Hungarian Notation!

13 Comments

  1. Geoff said:

    on November 24, 2008 at 5:44 am

    Joel has a nice writeup about Hungarian Notation:

    http://www.joelonsoftware.com/articles/Wrong.html

  2. Gerald Guido said:

    on November 24, 2008 at 7:39 am

    Consistent use of Hungarian Notation and Convention over Configuration is invaluable for code generation, validation, automation and self documenting code. It is especially useful when you cannot tease out info using Metadata like SSN’s, email addresses, phone numbers etc. I haven’t created a form from scratch in years.

    Bob Silverberg puts it to great use on his Populate Method for Transfer:

    http://www.silverwareconsulting.com/index.cfm/2008/7/22/How-I-Use-Transfer–Part-IX–My-Abstract-Transfer-Decorator-Object–The-Populate-Method

  3. brian said:

    on November 24, 2008 at 10:01 am

    @Geoff – I came across Joel’s post after I wrote my first draft; I knew I liked that guy. ;) He is obviously a much better writer than I am!

    @Gerald – great link to Bob’s stuff, thanks!

  4. dickbob said:

    on November 26, 2008 at 7:23 am

    Like you, Brian, I take comfort in the same features of Hungarian notation, although according to Wikipedia we should be calling it Systems Hungarian notation…

    http://en.wikipedia.org/wiki/Hungarian_notation

    If you read the whole of the Joel article, I’m not sure that he is agreeing with us. I think he likes Apps Hungarian but not necessarily Systems Hungarian. Plus you’ll attract the scorn of Sean…

    http://www.corfield.org/blog/index.cfm?do=blog.entry&entry=D1CB9656-0284-4F53-209C8F9F6159FB8D

    …and get put on his DNH list!

  5. Sean Corfield said:

    on December 11, 2008 at 7:22 pm

    @dickbob, don’t get me started! :)

    I still stand by that blog post (from three and a half years ago – OMG!) which comments on Joel’s “Wrong” article…

  6. dickbob said:

    on December 12, 2008 at 1:03 am

    @Sean, yeah I had to go into therapy for three years after your comments but I’ve now come to terms with the fact that Sean won’t hire me ;-)

    To quote the “it depends” justification for a lot of differing opinions, I reckon it depends on whether the convention helps you structure your code in a predictable and logical manner that allows others to follow with ease.

    Hey, if we all called it a Design Pattern it would be acceptable for people to hold differing opinions on the concept and implementation and for them all to be valid and “cool” ;-)

  7. Ben Nadel said:

    on December 12, 2008 at 7:13 am

    One of the things I like about this way of doing things is that you can easily tell between two different forms of the same data. For example:

    strIDs = "1,2,3"
    arrIDs = ListToArray( strIDs )

    Really, there’s no difference between the INTENT of these two forms of data, so I don’t think they really require different names. They are just different formats of the same thing. But with Hungarian Notation, you can easily tell the difference.

    Sure, you could do something like:

    IDList = "1,2,3"
    IDs = ListToArray( IDList )

    But to me, this feels much more awkward.

  8. brian said:

    on December 12, 2008 at 8:41 am

    @dickbob – insightful comment about calling it a “design pattern” to legitimize HN!

  9. Sean Corfield said:

    on December 12, 2008 at 1:53 pm

    @Ben, which is more readable?

    strIDs or idList?

    arrIDs or idArray?

    I’m all for readable, English, semantically correct names. I’m against cryptic prefixes that make code hard to read.

  10. Ben Nadel said:

    on December 12, 2008 at 2:01 pm

    @Sean,

    While yours is slightly more readable, your notation feels just like Hungarian Notation as well, just with the data type in a different place.

    Not to say that I don’t prefer yours; just saying that implementation is very similar.

  11. Sean Corfield said:

    on December 12, 2008 at 2:06 pm

    “feels like hungarian” but “is more readable” :)

    I would only have the name describe the implementation semantics if it was particularly important (such as your example of having the same data in two different formats in a single function).

    If the implementation is either “obvious” or irrelevant, I would omit it from the name (and that is usually the case). If your functions are small, you can easily see how a name is used and the actual implementation is “irrelevant”.

    For me, readable code always takes precedence.

  12. Ben Nadel said:

    on December 12, 2008 at 2:12 pm

    @Sean,

    I’ll go along with you on that one :)

  13. brian said:

    on December 12, 2008 at 2:30 pm

    There’s something about Sean’s response that I disagree with but I haven’t yet been able to write it succinctly. :)

    I think critiquing HN on readability is a red herring though. In any environment – be it solo work, a job or an open source project, there are standards for coding. They may be ones you create on the fly or ones you adhere to because your manager says so, but I don’t think a set of prefixes which can be understood in sixty seconds is a good reason to not use HN.

    To the contrary, the context that HN prefixes offer in my opinion makes code more readable (and especially more maintainable when you return to code six months later…).
