August 22, 2010

Data Architecture Defined

Over the last few weeks I’ve been working with the PASS organization to start up a SQL Server Data Architecture Virtual Chapter (or “DArcVC” as it is starting to be known).  Data Architecture has basically been my job for the last 12 years.  I may have been paid as a DBA, or SQL developer, or BI consultant, or Microsoft Application Solutions consultant, but at the end of the day, what I’ve DONE in all of those roles has really revolved around data architecture.

For those of you not familiar with what data architecture entails, it’s probably worth starting with a definition of IT architecture.  IT architecture is the process of envisioning and designing information technology solutions that align to clients’ expected outcomes.  The architect who designs your house, hotel or office tower does pretty much the same thing.  The only difference is that they work with a different medium.  In a business context, the role of an IT architect is to envision and design systems that deliver tangible business outcomes.  In an entertainment context (e.g. computer games), the role of the architect is to envision and design systems and content that deliver entertainment outcomes. 

You get the picture… Architecture defines the end-state goals for the system builders to follow.  With that said, architects also have a hand in defining the processes, governance structures and organizations that deliver the system at hand.  The architect also provides a financial context for the work, from both cost and benefit perspectives.  Much like a bricks-and-mortar architect recommends certain building materials and construction methods to deliver measurable outcomes (e.g. energy efficiency, resistance to certain types of environmental conditions, etc.), an IT architect makes recommendations on how a system must be structured to deliver value to the people who will use it.

The core concerns of IT architecture can be summarized using the acronym “BATOG”, defined as follows:

  • Business – Who is paying for the solution?  What is the total budget?  What is the expected ROI period?  What business efficiencies is the solution expected to deliver?
  • Applications – What software packages are required to deliver the required outcomes?  Do they need to be customized?  Or are they being built from scratch?  How do you make them work together?
  • Technology – What do you need to build and run your solution? 
  • Organization – Who will build and support the solution?  Who will make sure it gets delivered on time?  Who will manage the budget?
  • Governance – What are the processes that need to be enforced to ensure that the system meets the clients’ requirements?  How are these processes enforced?

Armed with that information, we can then apply those concerns to the data component of IT architecture.  With that filter applied, data architecture can be described as the process of envisioning and designing data systems that deliver required client outcomes.  In a more task-focused sense, this means:

  • Selecting data persistence systems (files, DBMS tools, etc.)
  • Designing data models and creating a roadmap for their realization in end-state systems
  • Designing data acquisition, integration and presentation systems
  • Designing data lifecycle management systems (e.g. archiving/purging tools)
  • Ensuring that data is secure, but also accessible to the right people
  • Ensuring that data persistence systems are reliable or, at the very least, have a recovery path from system-level failures
  • Ensuring that data designs are closely aligned to business processes
  • Defining the organizational requirements that support the outcomes listed above (e.g. skills, training, roles)
  • Guiding the expectations and estimation processes of program/project managers who oversee data-oriented projects
  • Defining data platform configuration and sizing policies
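
To make one of these tasks a little more concrete, here is a minimal sketch of the archiving/purging idea from the lifecycle bullet above.  It uses Python's built-in sqlite3 module rather than SQL Server purely so that it runs anywhere.  The table names (orders, orders_archive), the columns, the sample rows and the cutoff date are all invented for illustration, and a real lifecycle design would also need to cover retention policy, batching and indexing.

```python
import sqlite3


def archive_and_purge(conn, cutoff_date):
    """Copy orders older than cutoff_date into orders_archive, then delete them.

    Both statements run inside one transaction, so a failure part-way
    through rolls back and leaves the live table untouched.
    """
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "INSERT INTO orders_archive "
            "SELECT * FROM orders WHERE order_date < ?",
            (cutoff_date,),
        )
        cur = conn.execute(
            "DELETE FROM orders WHERE order_date < ?", (cutoff_date,)
        )
    return cur.rowcount  # number of rows purged from the live table


# Hypothetical schema and sample data, assumed for this example only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, order_date TEXT, amount REAL)"
)
conn.execute(
    "CREATE TABLE orders_archive (id INTEGER PRIMARY KEY, order_date TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "2008-03-01", 10.0), (2, "2009-11-15", 25.0), (3, "2010-08-01", 40.0)],
)
conn.commit()

# ISO-formatted date strings compare correctly as plain text.
moved = archive_and_purge(conn, "2010-01-01")
print(f"Archived and purged {moved} rows")
```

The design choice worth noticing is that the copy and the delete succeed or fail together, which is the kind of guarantee a data architect would want written into the lifecycle policy rather than left to whoever builds the tool.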

There’s plenty more, but that gives you an idea of some of the high level tasks that data architects get involved with.  Data architects are typically concerned not only with data, but also with how raw data is turned into information and knowledge.  In this layered view:

  • Data is the “What”.  For example:
    • What is the customer’s name? 
    • What did they buy?
    • What did we charge them for it?
  • Information is what you get when data is used to inform a person about the “what”, especially when it leads to decisions and actions that deliver value of some sort.  For example:
    • Why did that customer want to buy that product at that price?
    • Could we have charged the customer more for that product?
    • What are other stores charging for the same product?
  • Knowledge is the understanding that underlies the decision making enabled by information and data.  Knowledge is the secret sauce to how we capture, design and consume information.  For example:
    • The customer was in the 18-25 age bracket.  We know that people in that age bracket have a preference for the kind of product that customer purchased.
    • The decision to place that product in the window of stores in areas frequented by people in the 18-25 age bracket will lead to lots of sales.
    • If we make those people walk to the back of the store to actually pick that item off a shelf, then they have to walk past all sorts of other tempting items.  This may lead to impulse-driven cross-sell/up-sell.
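
The three layers can be sketched in a few lines of code.  This is only a toy illustration under stated assumptions: the sales records, the age bracket boundaries, the function names and the “at least two units” threshold are all made up for the example, and none of it comes from a real system.  Raw records stand in for data, a summary per product and age bracket stands in for information, and a simple rule that acts on the summary stands in for knowledge.

```python
from collections import defaultdict
from statistics import mean

# Data: the "what".  Hypothetical sales records, invented for illustration.
sales = [
    {"customer": "Ann", "age": 22, "product": "headphones", "price": 80.0},
    {"customer": "Bob", "age": 41, "product": "headphones", "price": 80.0},
    {"customer": "Cy", "age": 19, "product": "headphones", "price": 75.0},
    {"customer": "Dee", "age": 24, "product": "kettle", "price": 30.0},
]


def age_bracket(age):
    """Bucket an age into a bracket (the boundaries are an assumption)."""
    return "18-25" if 18 <= age <= 25 else "other"


def summarize(records):
    """Information: units sold and average price per (product, bracket)."""
    groups = defaultdict(list)
    for record in records:
        key = (record["product"], age_bracket(record["age"]))
        groups[key].append(record["price"])
    return {
        key: {"units": len(prices), "avg_price": mean(prices)}
        for key, prices in groups.items()
    }


def window_display_candidates(summary, bracket="18-25", min_units=2):
    """Knowledge: a rule encoding what we believe about a bracket's tastes."""
    return sorted(
        product
        for (product, b), stats in summary.items()
        if b == bracket and stats["units"] >= min_units
    )


summary = summarize(sales)
candidates = window_display_candidates(summary)
print(summary)
print(candidates)
```

The point is not the code itself but the separation: the records can be stored and secured without anyone knowing why, the summary only exists because somebody wanted to be informed, and the rule only makes sense to someone who understands the customers.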

So a data architect isn’t just concerned with the collection, storage and processing of data.  They are also concerned with designing data to provide information outcomes, and with putting information and knowledge to work together: either extracting new knowledge from information, or applying existing knowledge to identify new ways of deriving value from information.

In the business intelligence community a commonly used description for why we build business intelligence systems is “Actionable Insight”.  Actionable insight is the goal of all data architectures, and is the key value proposition for capturing data in the first place.  Of course, if we do not design our data systems properly, we run the risk of being unable to derive the required business insights, which is no help to anyone.  If you can’t get actionable insights out of an information system, then there’s not much point spending money on it in the first place.

With that, I’m done.  I’ve given you a quick summary of IT architecture, and how data architecture fits into the broader context of its parent discipline.  I’ve given you an idea of the kinds of concerns that a data architect addresses on a day to day basis, and I’ve provided some perspectives on why data architecture is important.

In future blog entries, I’ll start drilling down into more detail on some of the specific practices and principles involved in data architecture.  I’ll use SQL Server as a means to explain and demonstrate those practices and principles, and to provide specific guidance on how SQL Server helps to address a variety of data architecture concerns.
