A predictive analytics castle on data quicksand – the D3 ills and the 3C pills.

House on Quicksand

In the era of data-driven decisions, it’s only too easy to build a house on quicksand. Recognize that data has three ills:

  1. Dirty – Records are incorrect or missing.
  2. Disorganized – Nomenclature and yardsticks are not standardized.
  3. Disconnected – Data is spread over discrete data stores.

Where there are three ills, there are also cures. These are:

  1. Clean – Rectify records and remove outliers.
  2. Consistent – Standardize nomenclature and establish common yardsticks.
  3. Connected – Curate metadata “glue” to snap data together like LEGO(R) bricks.
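
To make the three cures concrete, here is a minimal Python sketch that uses only the standard library. Everything in it is made up for illustration: the sample records, the `COUNTRY_ALIASES` lookup, the `max_ratio` outlier threshold and the function names `clean`, `make_consistent` and `connect` are all assumptions, not a prescribed method. For simplicity it drops bad records rather than rectifying them.

```python
from statistics import median

# Hypothetical sample data: two separate "stores" that share a customer id.
orders = [
    {"cust": "C1", "country": "USA", "amount_usd": 120.0},
    {"cust": "C2", "country": "U.S.", "amount_usd": None},        # missing value
    {"cust": "C3", "country": "united states", "amount_usd": 95.0},
    {"cust": "C4", "country": "DE", "amount_usd": 110.0},
    {"cust": "C5", "country": "Germany", "amount_usd": 99999.0},  # likely entry error
]
customers = {"C1": "retail", "C2": "retail", "C3": "wholesale", "C4": "retail"}

# Assumed lookup table that maps spelling variants to one standard code.
COUNTRY_ALIASES = {
    "usa": "US", "u.s.": "US", "united states": "US",
    "de": "DE", "germany": "DE",
}


def clean(rows, max_ratio=10.0):
    """Clean: drop rows with a missing amount or an amount far above the median."""
    present = [r for r in rows if r["amount_usd"] is not None]
    mid = median(r["amount_usd"] for r in present)
    return [r for r in present if r["amount_usd"] <= max_ratio * mid]


def make_consistent(rows):
    """Consistent: rewrite country names using one standard nomenclature."""
    result = []
    for r in rows:
        key = r["country"].strip().lower()
        # Unknown spellings are left unchanged rather than guessed.
        result.append({**r, "country": COUNTRY_ALIASES.get(key, r["country"])})
    return result


def connect(rows, segments):
    """Connected: use the shared customer id as the "glue" between stores."""
    return [{**r, "segment": segments.get(r["cust"], "unknown")} for r in rows]


tidy = connect(make_consistent(clean(orders)), customers)
for row in tidy:
    print(row)
```

The order of the steps matters in this sketch: cleaning first keeps the outlier from distorting anything downstream, and standardizing names before joining avoids mismatches caused by spelling variants alone.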

This is the data wrangling challenge. Remember, if your data has D3 ills, take the 3C pill!
