Tuesday, July 02, 2013

Greenfield Enterprise Architecture for an IT BU

Let's say you had the power to design a completely greenfield Enterprise Architecture for an IT BU - what would it look like?

Well, which IT BU, you might ask? Does it really matter?

You need a big database. What kind of enterprise architecture can you have without a database?

Next, we need a bunch of ETL to load data from source systems, because there is always a source system to source from. We can either code it or buy it - doesn't matter which.

And there shall be database performance problems with the data-load.

Next, we need some engine code. Let's call it "The System". And it shall be in Java. And it shall have memory problems regardless of whether it's 32-bit or 64-bit or how much memory you allocate to it.

And next, there shall be a web UI, for which countless hours will be spent on things that will never be used. And there shall be an Excel download link on every page - because that's the only feature the users seem to care about.

And there shall be data quality problems, and performance problems, and scalability problems, and extensibility problems, and bulk upload requirements, and usability issues, and technical debt, and The Business will cry out for Salvation.

Surely, there must be another way.

I have spent the last 5 years as an Enterprise Architect at a Tier 1 Investment Bank designing systems that solve Big and Expensive problems. There are a few observations I would like to make about what worked and what didn't.

1. Users are smart, love Excel, and are better at coding than the H1B coder you got in the 2-for-1 sale from a body shop.
2. Build functions not systems - and expose your functions to your users. Allow the users to create a managed ecosystem around the functionality.
3. Make sure your functions work natively in Excel - think COM C# library.
4. Use elastic infrastructure like HDFS, compute clouds, data grids, etc. Don't build dedicated systems; build services that have clean inputs and outputs and can run on scalable hardware like compute clouds.
5. The database and ETL have been my Achilles heel. The database schema is too rigid for the fast pace of change. On the other hand, the rigidity is required given how central the data is to everything. I have yet to really embrace the NoSQL movement, given the lack of ACID qualities. There are some promising developments in the form of Impala, which is closer to a pure MPP database running on commodity elastic hardware. Perhaps a middle ground can be found between the strict data model of a traditional database and the loose schema of a NoSQL database.
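
To make points 2 and 4 a little more concrete, here is a minimal sketch in Python of what a "function, not a system" might look like: explicit inputs, one output, no database and no hidden state. The present-value calculation, the CashFlow type, and every name in it are hypothetical, picked only for illustration; this is not code from any real system.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class CashFlow:
    """Hypothetical input type: an amount paid at a time measured in years."""
    years: float
    amount: float


def present_value(cash_flows: Sequence[CashFlow], annual_rate: float) -> float:
    """Discount each cash flow back to today at a flat annual rate.

    Pure function: it reads only its arguments and returns a number,
    so it can run on a desktop, a grid node, or behind a service.
    """
    if annual_rate <= -1.0:
        raise ValueError("annual_rate must be greater than -100%")
    return sum(cf.amount / (1.0 + annual_rate) ** cf.years for cf in cash_flows)


if __name__ == "__main__":
    # Example call with made-up numbers.
    flows = [CashFlow(years=1.0, amount=100.0), CashFlow(years=2.0, amount=100.0)]
    print(round(present_value(flows, 0.05), 2))
```

Because the function depends on nothing but its arguments, the same code could in principle sit behind a spreadsheet function, a batch job, or a service endpoint, which is the kind of reuse points 2 through 4 are after.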

To be continued...




1 comment:

Anonymous said...

Why do we always blame users by saying that they love Excel? Why can't we create a flashy, jazzy, user-friendly UI? Why does the user need to perform lookup or search operations if he already knows the values? For example, even though he knows the currency is USD, he still needs to perform a search operation.

Instead of doing 10 lookup operations, he could simply use bulk upload, where he can paste the values and modify whatever is necessary.