Tuesday, August 02, 2005

Software Complexity

I spend a lot of time developing large systems. I find myself continuously wrestling with the fact that in order to make a system easy to use, it needs to be exceedingly complex. You could almost draw a direct parallel: as the system becomes easier to use, its complexity increases by the same factor. The side effect is that maintaining a large system becomes that much more complex. Each system has certain drawbacks that were probably rationalized as functionality, or perhaps functionality that later became a drawback. For example, let's say a system maintains some internal data cache. The data needs to be refreshed every day.

Solution 1 is to require that the system be restarted every day, causing the internal cache to be refreshed.

Solution 2 is to build a timer that fires at midnight every night, locks the system for that period, and refreshes the internal cache.
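
A minimal sketch of Solution 2, assuming the cache is a dictionary and `load_from_source` is a stand-in loader callable (neither comes from a real system):

```python
import threading

class TimedCache:
    """Cache fully rebuilt by a nightly timer (Solution 2 sketch)."""

    def __init__(self, load_from_source):
        self._load = load_from_source   # hypothetical loader callable
        self._lock = threading.Lock()
        self._data = self._load()

    def get(self, key):
        with self._lock:                # readers block during a refresh
            return self._data.get(key)

    def refresh(self):
        with self._lock:                # locks the whole system while rebuilding
            self._data = self._load()

    def schedule_refresh(self, seconds_until_midnight):
        # One-shot timer; a real system would reschedule after each firing.
        timer = threading.Timer(seconds_until_midnight, self.refresh)
        timer.daemon = True
        timer.start()
```

The single lock is what makes this simple and also what makes it a system-wide stall.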

Solution 3 is to build something dynamic that is able to identify a slow period, then lock the system and perform the refresh.
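
One rough way Solution 3 could identify a slow period, assuming request volume is the signal (the window and threshold here are made-up numbers, not tuned values):

```python
import time
from collections import deque

class SlowPeriodDetector:
    """Tracks request timestamps and flags a lull (Solution 3 sketch)."""

    def __init__(self, window_seconds=60, quiet_threshold=5):
        self.window = window_seconds
        self.threshold = quiet_threshold   # assumed cutoff for "slow"
        self._hits = deque()

    def record_request(self):
        self._hits.append(time.time())

    def is_slow(self):
        cutoff = time.time() - self.window
        while self._hits and self._hits[0] < cutoff:
            self._hits.popleft()           # drop requests outside the window
        return len(self._hits) < self.threshold
```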

Solution 4 is to build a partial cache refresh, where only changed items are refreshed. The system automatically scans the data source every few minutes, identifies the changes, and inserts them into the cache, locking only the data source.
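
A sketch of the Solution 4 scan, assuming the data source can report items changed since a timestamp (`changed_since` is a hypothetical method, not a real API):

```python
import time

def scan_and_refresh(cache, source, last_scan_time):
    """Pull only rows changed since the last scan (Solution 4 sketch)."""
    changes = source.changed_since(last_scan_time)  # locks only the source
    for key, value in changes:
        cache[key] = value                          # insert changed items
    return time.time()                              # new last-scan marker
```

Run on a few-minute timer, this keeps the cache itself unlocked at the cost of a polling delay.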

Solution 5: the external data source notifies the system when data changes occur, and the system performs a targeted cache refresh, locking only the items that changed.
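
Solution 5 might look roughly like this, with the data source calling `on_change` and each item guarded by its own lock (all names here are illustrative):

```python
import threading

class NotifiedCache:
    """Cache updated by change notifications, locking per item (Solution 5 sketch)."""

    def __init__(self):
        self._data = {}
        self._locks = {}                 # one lock per key
        self._registry_lock = threading.Lock()

    def _lock_for(self, key):
        with self._registry_lock:
            return self._locks.setdefault(key, threading.Lock())

    def on_change(self, key, new_value):
        """Called by the data source when an item changes."""
        with self._lock_for(key):        # only the changed item is locked
            self._data[key] = new_value

    def get(self, key):
        with self._lock_for(key):
            return self._data.get(key)
```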

Solution 6 encompasses Solution 5, except instead of locking, a separate cache is built and swapped in during a slow interval, and so on.
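
And a rough sketch of the Solution 6 swap, where a shadow copy is built off to the side and installed by a single reference assignment (`load_from_source` and `wait_for_slow_period` are assumed hooks, e.g. the detector above):

```python
import threading

class SwappableCache:
    """Build a fresh cache offline, then swap it in (Solution 6 sketch)."""

    def __init__(self, initial):
        self._active = initial
        self._swap_lock = threading.Lock()

    def get(self, key):
        return self._active.get(key)     # reads never block on a rebuild

    def rebuild_and_swap(self, load_from_source, wait_for_slow_period):
        shadow = load_from_source()      # full load, done off to the side
        wait_for_slow_period()           # defer until traffic is quiet
        with self._swap_lock:
            self._active = shadow        # swap is one reference assignment
```

Readers never wait, but now there are two copies of the data, a notification path, and a slow-period detector to keep healthy.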

There are a lot of solutions to this problem. In fact, there are a lot of solutions without even considering AI, where you can start having neural nets try to predict slow intervals, optimal times for refresh, or which data is likely to change. Each solution makes the system more flexible and much smarter, but at an ever-increasing complexity cost. Solution 6 will probably yield the most flexible system with the least amount of downtime, perhaps with some AI thrown in. But Solution 6 will also be very complex, with quite a bit of code involved in making it work: a separate system that needs to be aware of data changes, a method of notification, the ability to modify and lock parts of the cache, the ability to identify slow periods or periods when the specific data is not used, and so on. Solution 6, if built correctly, will probably require little day-to-day maintenance, but once it fails it will be very complex to troubleshoot and fix.
So, what's the point of all this? I don't know. I like to make complex systems simple, but that's a losing battle. Perhaps the curve is not really a straight line but more of a bell curve, where after a certain point the system complexity drops. The system becomes so complex and so smart that it is actually easy to maintain.
