Data Collection Plan

The Data Measurement Plan:

– Data Collection Plans are used in the Measure Phase of Lean Six Sigma projects.

For each performance measure (Y), update a data collection plan to:

– Include the MSA (Measurement System Analysis) plan (a Gantt chart or MS Project plan is optional)

– Add Financial measure plan if separate from performance Y

– Add any Time Study or other data collection plans for Value Stream Map

– Sample Size Calculation

– Input, Process and Output Metrics (confirm that the right process metrics have been chosen and logical trade-offs have been made in determining what to measure.)

– Is this project driven by Customer specifications? If so, how do you know that the specifications satisfy customer critical requirements?

– What was the process for determining the metrics in the Data Measurement Plan?

– What trade-offs were made in determining the final set of metrics for which to gather data?

The extent to which these have been addressed or executed forms the basis for evaluating whether or not to allow the project to proceed with the actual data collection.
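
The Sample Size Calculation item above can be illustrated with a minimal sketch. It assumes the textbook normal-approximation formulas for estimating a mean and a proportion; the function names, the 95% confidence default, and the example inputs are illustrative assumptions, not values from this plan.

```python
import math
from statistics import NormalDist


def sample_size_continuous(sigma, margin, confidence=0.95):
    """Samples needed to estimate a mean within +/- margin.

    Uses n = (z * sigma / margin)^2, rounded up.
    sigma is an assumed estimate of the process standard deviation.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sigma / margin) ** 2)


def sample_size_proportion(p, margin, confidence=0.95):
    """Samples needed to estimate a proportion within +/- margin.

    Uses n = z^2 * p * (1 - p) / margin^2, rounded up.
    p is an assumed estimate of the proportion; 0.5 is the most conservative.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


# Hypothetical inputs, for illustration only.
print(sample_size_continuous(sigma=10, margin=2))    # 97
print(sample_size_proportion(p=0.5, margin=0.05))    # 385
```

Both formulas assume simple random sampling from a large population, so treat the results as a starting point for discussion rather than a fixed requirement.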

Key Question: Does the data currently exist?

Existing Data

– Taking advantage of archived data or current measures to learn about the Output, Process or Input

– This is preferred when the data is in a form we can use and the Measurement System is valid (a big assumption and concern)

New Data

– Capturing and recording observations we have not captured or don’t normally capture

– May involve looking at the same “stuff,” but with new Operational Definitions

– This is preferred when the data is readily and quickly collectable (it raises fewer concerns about measurement problems)

Existing vs. New Considerations

– Is existing or “historical” data adequate?

– Meet the Operational Definition?

– Truly representative of the process, group?

– Contain enough data to be analyzed?

– Gathered with a capable Measurement System?

– Cost of gathering new data

– Time required to gather new data

The trade-offs made here (should the time and effort be taken to gather new data, or should we work only with what we have?) are significant and can have a dramatic impact on project success.

Check Sheets:

– The workhorse of data collection

– Enhance ease of collection

– Faster capture

– Consistent data from different people

– Quicker to compile data

– Capture essential descriptors of the data (“stratification factors”)

– Need to be designed for each job

How Will Data Be Collected:

1. Select specific data & factors to be included

2. Determine time period to be covered by the form (Day, Week, Shift, Quarter, etc.)

3. Construct form

– Be sure to include: (clear labels; enough room; space for notes; test the form)


– Include name of collector(s) (first & last)

– Reason/comment columns should be clear and concise

– Use full dates (month, day, year)

– Use explanatory title

– Consider lowest common denominator on metric (Minutes vs. Hours; Inches vs. Feet)

– Test and validate your design (try it out)

– Don’t change form once you’ve started, or you’ll be “starting over”!
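
If the form is kept electronically, the guidelines above could be sketched as a fixed record layout. This is one possible layout, not a prescribed one: the class name, the field names, and the "shift" stratification factor are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)  # frozen: the layout should not change mid-collection
class CheckSheetEntry:
    collected_on: date        # full date (month, day, year)
    collector: str            # first and last name of the collector
    shift: str                # hypothetical stratification factor
    category: str             # what was observed, per the Operational Definition
    duration_minutes: float   # recorded in the smaller unit (minutes, not hours)
    comment: str = ""         # short, clear reason or note

    def __post_init__(self):
        # Enforce the "first & last name" guideline.
        if len(self.collector.split()) < 2:
            raise ValueError("collector must include first and last name")
        if self.duration_minutes < 0:
            raise ValueError("duration_minutes must be non-negative")


# Hypothetical entry, for illustration only.
entry = CheckSheetEntry(date(2024, 3, 5), "Ana Ruiz", "Day", "Late shipment", 12.5)
print(entry)
```

Testing the layout on a few real observations before collection starts is the electronic equivalent of trying out the paper form.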

As you set up Check Sheets:

– Prepare a spreadsheet to compile the data.

– Think about how you’ll do the compiling (and who’ll do it).

– Consider what sorting, graphing or other reports you’ll want to create.

– Continuous or Discrete data?

– Adequate level of Discrimination and Accuracy?

– Adjust check sheet as needed to ensure usable data later, but don’t make data harder to collect.
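
As a rough illustration of the compiling step, the sketch below tallies check sheet rows by one stratification factor and category. The row layout, the "shift" factor, and the sample rows are made up for the example; a spreadsheet pivot table would do the same job.

```python
from collections import Counter


def compile_tally(rows, factor):
    """Count check sheet rows by (stratification factor value, category)."""
    return Counter((row[factor], row["category"]) for row in rows)


# Hypothetical check sheet rows, for illustration only.
rows = [
    {"shift": "Day", "category": "Scratch"},
    {"shift": "Day", "category": "Scratch"},
    {"shift": "Night", "category": "Dent"},
    {"shift": "Night", "category": "Scratch"},
]

tally = compile_tally(rows, "shift")
for (shift, category), count in sorted(tally.items()):
    print(f"{shift:<8}{category:<10}{count}")
```

Deciding on the stratification factors before collection begins matters here: a factor that was never recorded on the check sheet cannot be tallied later.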

Who Will Collect the Data?


– Familiarity with the process

– Availability/impact on job

Rule of Thumb: If it takes someone more than 15 minutes per day, it isn’t likely to be done.

– Potential Bias

– Will finding “defects” be considered risky or a “negative”?

– Benefits of Data Collection

– Will data collection benefit the collector?

Be Sure They…

– Give input on the check sheet design

– Understand operational definitions (!)

– Understand how data will be tabulated (this helps them see the consequences of changing)

– Have been trained and allowed to practice

– Have knowledge and are unbiased

Narrow Potential Key Process Input Variables (KPIVs):

– Have the potential root causes been narrowed?

– Was a Cause and Effect (C&E) Matrix used? If so, what were the results?

– How were the KPOVs (Key Process Output Variables) rated?

– Did people who operate the process, technical experts, and supervisors collaborate to produce the C&E Analysis?

– Have you characterized the variables (controllable, uncontrolled [noise], etc.)?

– Was a Pareto Chart used to select potential Key Process Input Variables (KPIVs) from the C&E Matrix?

– How many KPIVs do you have at the beginning and end of the C&E Matrix?

– Are there any potential KPIVs which need immediate Baseline capability and MSA?

– Are these potential KPIVs monitored in the workplace?

– Which process steps stand out as especially significant in the C&E Matrix?

– Is there any process step that the team feels can be eliminated or combined?
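
The C&E Matrix scoring and Pareto ranking referred to above can be sketched as follows. This assumes the common convention that each input's score is the sum of (KPOV importance rating × correlation score); the 0/1/3/9 correlation scale, the variable names, and all numbers are hypothetical.

```python
def ce_matrix_scores(output_ratings, correlations):
    """Score each input: sum of (KPOV rating * correlation score)."""
    return {
        kpiv: sum(output_ratings[kpov] * score for kpov, score in row.items())
        for kpiv, row in correlations.items()
    }


def pareto(scores):
    """Return (input, score, cumulative %) rows, highest score first."""
    total = sum(scores.values())
    if total == 0:
        return []
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    rows, running = [], 0
    for name, score in ranked:
        running += score
        rows.append((name, score, round(100 * running / total, 1)))
    return rows


# Hypothetical KPOV importance ratings (1-10) and correlation scores (0/1/3/9).
output_ratings = {"Cycle time": 10, "Defect rate": 7}
correlations = {
    "Oven temperature": {"Cycle time": 3, "Defect rate": 9},
    "Operator training": {"Cycle time": 1, "Defect rate": 3},
    "Batch size": {"Cycle time": 9, "Defect rate": 0},
}

scores = ce_matrix_scores(output_ratings, correlations)
for name, score, cumulative in pareto(scores):
    print(f"{name:<20}{score:>5}{cumulative:>8}%")
```

In this made-up example the first two inputs account for about 85% of the total score, which is the kind of cut-off a team might use to narrow the list of potential KPIVs.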

Key Steps:

– Fill in the Output measure Y.

– Fill in the key stratification questions you have about the process in relationship to the Y.

– List out all the levels and ways you can look at the data in order to determine specific areas of concern.

– Create specific measurements for each subgroup or stratification factor.

– Review each of the measurements (include the Y measure) and determine whether or not current data exists.

– Discuss with the team whether or not these measurements will help to predict the output Y. If not, think of where to apply the measures so that they will help you to predict Y.