Data Ethics LLC

Key Performance Indicator Assessment (KPIA) Process

KPIA #5: Data Review

1/21/2018

I find it like opening a present. You get a feeling of excitement and anticipation as you open a data set for the first time. Today we are going to spend time reviewing data that has been extracted from one of the applications that you inventoried as outlined in yesterday’s process, Identifying Data Sources. If this experience of opening a data set for the first time is new to you, don’t worry because we are going to work through this process together using the following steps:
  1. Determine the file type and make sure you have an application that can open the data file. For example, files with the extensions .xls, .xlsx, .txt, and .csv can all be opened in Microsoft Excel. These are the most common data file formats. If your data file is in another format, leave a note in the comment section for additional help.
  2. Once the file type is determined, open the file; it should open in worksheet format. (If the file is a text file and it opened as a text document, simply close the document and open the file from within the spreadsheet application.)
  3. Once the file has been opened successfully, take a pause. Even the most experienced analysts feel pressure to dive straight into the data file, and important information can be overlooked at first. Make sure to take a pause and admire the data set like it is a piece of art.

    Picture 1: The Art of the Data Table
Take in the view. Look at the columns and rows and get a feel for the construction of the file. Look at the very bottom of the screen. There should be a tab with the name of the data sheet that is currently open. Is there more than one tab? If so, a colleague must have programmed an extract with several tabs or organized the data into tabs after extraction. Take a moment to click through those tabs and see what information is contained on each.
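
If you would like to check file types before opening anything, steps 1 and 2 can be sketched in a few lines of Python. This is only a minimal sketch: the function name `describe_file_type` and the example file names are made up for illustration, and the list of extensions is simply the one mentioned in step 1.

```python
from pathlib import Path

# Extensions named in step 1 as viewable in a spreadsheet application.
SPREADSHEET_EXTENSIONS = {".xls", ".xlsx", ".txt", ".csv"}


def describe_file_type(filename: str) -> str:
    """Return a short note on how to approach a file, based on its extension."""
    extension = Path(filename).suffix.lower()
    if extension in SPREADSHEET_EXTENSIONS:
        return f"{extension}: open in a spreadsheet application"
    if not extension:
        return "no extension: ask whoever produced the extract"
    return f"{extension}: not in the common list, ask for help"


# Hypothetical file names, used only to show the three possible outcomes.
for name in ["sales_extract.csv", "Shoes.XLSX", "events.json", "README"]:
    print(name, "->", describe_file_type(name))
```

The check looks only at the file name, so it cannot tell you whether the contents really match the extension. Opening the file is still the real test.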

  4. It is best practice to provide a data dictionary with a data file. Sometimes the data dictionary is included on another worksheet within the data file or as a separate text document. The data dictionary is like a welcome message to the file: it defines the data attributes that the file contains. Take a look at the data table in Figure 1. You might guess what each of these data attributes refers to, but are you 100% sure? If not, you need to find out. Analysis based on assumptions leads to too many problems and headaches when what is assumed turns out to be false. What does the date field in Figure 1 represent? Is it about the shoes alluded to in the third column? Perhaps date of purchase, perhaps date of first wear? Perhaps this is too much guessing? The answer should be in the data dictionary. If one was not provided, find out for sure what the column means. Once you find out, create your own data dictionary. Figure 2 provides an example of a data dictionary for the table in Figure 1.


Figure 1: Example data set
Figure 2: Data dictionary
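
If no data dictionary came with the file, one way to start your own is to generate a blank entry for every column and then fill in the definitions as you confirm them. The sketch below assumes a CSV extract. The sample data, the column names, and the entry fields (`description`, `confirmed_by`, `notes`) are hypothetical and are not taken from the figures.

```python
import csv
import io

# Hypothetical extract; these column names and values are illustrative only.
SAMPLE_EXTRACT = """id,date,shoe_model
1,2018-01-03,Trail Runner
2,2018-01-09,Court Classic
"""


def dictionary_skeleton(csv_text: str) -> list[dict[str, str]]:
    """Build one blank data dictionary entry per column in the header row."""
    header = next(csv.reader(io.StringIO(csv_text)))
    return [
        {"column": name, "description": "", "confirmed_by": "", "notes": ""}
        for name in header
    ]


skeleton = dictionary_skeleton(SAMPLE_EXTRACT)
for entry in skeleton:
    print(entry)

# Any column without a description is still an assumption, not a definition.
undefined = [entry["column"] for entry in skeleton if not entry["description"]]
print("Columns still needing a definition:", undefined)
```

Leaving the descriptions blank is deliberate: the point is to record what you have confirmed, not what you guessed.
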
  5. Time for some more inventorying. For each file that you are examining, answer the following questions:
    • Which KPI is this data set associated with? (List all).
    • How many columns of data are there?
    • How many rows of data are there?
    • Is there a unique ID for every row of data?
    • Is there a data dictionary?
      • Does the dictionary contain a description entry for every column?
      • Are there any special notes regarding the level of completeness of the data file (we will focus more on this in the next blog)?
    • If there is no data dictionary
      • Have you taken the time to make a data dictionary?
      • As you talk to the people who have the answers about the data you are looking at, have you made special notes about anything that may be important later in the data process?
    • Based on a scan of the data, does the data look consistent (date columns contain only dates, text columns contain only text, and so on)? If not, make a note in your data dictionary.
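
Several of the questions above (number of columns, number of rows, unique IDs, consistent date columns) can also be answered with a short script. Here is a minimal sketch, assuming a CSV extract and dates written as YYYY-MM-DD. The sample data, the column names, and the helper names `is_date` and `inventory` are hypothetical.

```python
import csv
import io
from datetime import datetime

# Hypothetical extract: one ID is repeated and one "date" is not a date.
SAMPLE_EXTRACT = """id,date,shoe_model
1,2018-01-03,Trail Runner
2,unknown,Court Classic
2,2018-01-15,Road Racer
"""


def is_date(value: str, fmt: str = "%Y-%m-%d") -> bool:
    """Return True if value parses as a date in the assumed format."""
    try:
        datetime.strptime(value, fmt)
        return True
    except ValueError:
        return False


def inventory(csv_text: str, id_column: str, date_column: str) -> dict:
    """Answer the column, row, unique ID, and date consistency questions."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    ids = [row[id_column] for row in rows]
    return {
        "columns": len(reader.fieldnames or []),
        "rows": len(rows),
        "ids_unique": len(ids) == len(set(ids)),
        "non_date_values": [
            row[date_column] for row in rows if not is_date(row[date_column])
        ],
    }


report = inventory(SAMPLE_EXTRACT, id_column="id", date_column="date")
print(report)
```

Anything the script flags, such as the repeated ID or the non-date value in this sample, belongs in the notes of your data dictionary. A script like this supplements the visual scan; it does not replace it.
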
After you have reviewed each data file that may be of use and completed the checklist outlined in step 5, you are almost done with the data review process. The final step is to set aside any data set that is not relevant to the KPIs you are assessing. Cross-check each file one last time and record the rationale for why each set-aside data set is not needed.

Blog #5’s question: Have you set aside time to review each data file and take in its “artistic” qualities?
​

Blog #6 Sneak Peek: Data completeness.

    Author

    This blog is written by our founder/principal consultant Dr. Brandan Keaveny. Learn more about Dr. Keaveny here. 

​© 2019 Data Ethics LLC