Analyzing Data Methods in Six Sigma: Designed Experiments

Benefits of Designed Experiments

Traditional experiment models, as outlined in the previous chapter, attempt to keep all variables in the experiment constant except for one. This one variable is then manipulated to determine how changing it affects the results. Creating experiments this way is a common practice and is the one taught in most schools. However, it sometimes isn't possible to keep all the other variables constant, and that can adversely affect results. Additionally, experimental error isn't factored into these models.

Statistically designed experiments, on the other hand, address these challenges. Multiple variables can be manipulated at once, and the results can be statistically analyzed. This conserves resources for the organization. Furthermore, experimental errors are quantified and can be statistically accounted for. Finally, designed experiments can also detect interactions among variables, which is something one-variable-at-a-time experiments can't do.

In short, statistically designed experiments allow for the manipulation of several variables at the same time, and the data mined from these experiments is statistically analyzed for the effect of one variable or a combination of variables on the outcome.
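
As a rough illustration of analyzing several variables at once, the sketch below estimates two main effects and their interaction from a two-factor, two-level experiment. The factor names A and B, the coded levels -1 and +1, and the yield values are all made up for the example; they are not taken from any real study.

```python
# Hypothetical 2x2 factorial experiment: two factors (A and B), each run at a
# low (-1) and a high (+1) level. The yields are invented illustration values.
runs = [
    # (A, B, yield)
    (-1, -1, 20.0),
    (+1, -1, 30.0),
    (-1, +1, 25.0),
    (+1, +1, 45.0),
]

def effect(runs, contrast):
    """Average yield where the contrast is positive minus average where it is negative."""
    high = [y for a, b, y in runs if contrast(a, b) > 0]
    low = [y for a, b, y in runs if contrast(a, b) < 0]
    return sum(high) / len(high) - sum(low) / len(low)

effect_a = effect(runs, lambda a, b: a)       # main effect of A
effect_b = effect(runs, lambda a, b: b)       # main effect of B
effect_ab = effect(runs, lambda a, b: a * b)  # A-by-B interaction

print(effect_a, effect_b, effect_ab)
```

A nonzero interaction estimate means the effect of A differs depending on the level of B, which a one-variable-at-a-time experiment would not reveal.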

Important Terminology

In Six Sigma projects that use statistically designed experiments, several terms are important. A block is a natural grouping of experimental units that are expected to behave alike. A plot is a smaller unit within a block. A treatment is the specific thing being tested, and it is applied to the plots. The yield is the result generated from the experiment.

When working with statistically designed experiments, there are also several terms that deal with different kinds of variables and how they work. A response variable is the variable being studied; it responds to changes in other variables. A primary variable is any variable that can be controlled and is believed to affect the results. Background variables may have some effect on the results; however, they cannot or should not be controlled in the experiment. Interaction refers to the effect of one variable depending on the level of another variable.

No experiment can be controlled perfectly, so there are situations in which outcomes are affected by unknown variables. This is referred to as experimental error. To minimize the effects of experimental error, randomization is used, which requires that test units be assigned to test conditions in a completely random way. Each test unit has the same chance of being assigned to any given test condition.
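
A minimal sketch of randomization, assuming twelve numbered test units and three test conditions labeled A, B, and C (both are hypothetical), might look like this. The helper name `randomize` is invented for the example.

```python
import random

def randomize(units, conditions, seed=None):
    """Randomly assign units to conditions in equal-sized groups."""
    units = list(units)
    if len(units) % len(conditions) != 0:
        raise ValueError("units must divide evenly among conditions")
    rng = random.Random(seed)  # a seed is used only to make the example repeatable
    rng.shuffle(units)         # every ordering of the units is equally likely
    size = len(units) // len(conditions)
    return {c: units[i * size:(i + 1) * size] for i, c in enumerate(conditions)}

# Hypothetical example: 12 test units, 3 test conditions.
assignment = randomize(range(1, 13), ["A", "B", "C"], seed=42)
print(assignment)
```

Because the units are shuffled before being split, each unit has the same chance of landing in any condition.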

Reliable Experiments

If the results of an experiment are to be useful and reliable, the experiment must be well designed. It takes time and careful planning to ensure a good experiment. For example, a plan for a good experiment must be written down and agreed to by all people on the Six Sigma project.

The experiment must have clearly stated objectives and a purpose. This will guide the work. There must be an explanation of what treatments will be given and how they will be given. How the objectives of the experiment will be met must be explained as well. The size of the experiment (and of the samples) and the expected time frame should also be considered.

A good experiment can be replicated, and in many instances replication is required. Replication is simply collecting data multiple times under the same conditions as the original experiment. This helps account for errors and other factors that may have influenced the variation in the results.
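
One way replication helps is by letting the team estimate experimental error from the scatter among repeated runs. The sketch below pools the within-condition variation into a single standard deviation; the condition names and measurements are invented for illustration.

```python
from math import sqrt
from statistics import variance

# Hypothetical replicated measurements: three runs under each condition.
replicates = {
    "condition 1": [20.1, 19.8, 20.4],
    "condition 2": [30.2, 29.5, 30.6],
}

def pooled_std(groups):
    """Pooled standard deviation of the replicates within each condition."""
    groups = list(groups)
    sum_sq = sum((len(g) - 1) * variance(g) for g in groups)
    dof = sum(len(g) - 1 for g in groups)
    return sqrt(sum_sq / dof)

error_sd = pooled_std(replicates.values())
print(round(error_sd, 3))
```

A difference between conditions can then be judged against this error estimate rather than against a single, possibly unrepresentative, run.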

There are many different types of statistically designed experiments; however, these can be researched elsewhere.

Testing Assumptions

It is important to note that the results of experiments are only valid if certain assumptions are true. Someone looking at the resulting data would assume that the experiment was carried out with as few disruptions as possible. However, there are other assumptions about the data that must be tested before it can be presented.

Statistical independence is the assumption that two values are not related. In other words, one result in no way affects another. Different tests can be used to check that values are indeed independent. If, after running the tests, it appears that the values are not independent, there are methods that can try to correct the situation or explain it statistically.
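
As one example of such a check (among many, and not necessarily the one a given project would use), the Durbin-Watson statistic looks for correlation between consecutive residuals taken in run order. Values near 2 suggest no lag-1 correlation, values well below 2 suggest positive correlation, and values well above 2 suggest negative correlation. The residuals below are hypothetical.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for residuals listed in run order."""
    num = sum((residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals)))
    den = sum(e * e for e in residuals)
    return num / den

# Hypothetical residuals that drift steadily upward over the run order.
drifting = [-2, -1, 1, 2]
# Hypothetical residuals that flip sign on every run.
alternating = [1, -1, 1, -1]

print(durbin_watson(drifting), durbin_watson(alternating))
```

Judging whether a value is far enough from 2 to matter requires published critical values, which this sketch does not include.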

The normal assumption is, very simply, that the results follow a normal distribution. Even if there are no previous results to compare with, there are statistical tests that can determine whether the results are normal. If the results of an experiment are not normal, they can be left alone, transformed so that they are closer to normal, or modeled with another distribution.
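
The sketch below illustrates the transformation option with a log transform, using sample skewness as a rough indicator of asymmetry. Skewness is not a formal normality test (formal tests such as Shapiro-Wilk exist), and the data values are invented.

```python
from math import log

def skewness(values):
    """Sample skewness: third central moment divided by variance to the power 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical right-skewed results, e.g. cycle times with a long upper tail.
data = [1, 2, 2, 3, 3, 4, 5, 8, 13, 40]
log_data = [log(v) for v in data]

skew_raw = skewness(data)
skew_log = skewness(log_data)
print(round(skew_raw, 2), round(skew_log, 2))
```

For this data the log-transformed values are noticeably less skewed than the raw values, though a log transform is only one possible choice and only applies to positive data.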

The equal variance assumption is that the variance is the same for each of the treatments applied. Also, the linear model assumption expects that the results of an experiment are linear in fashion. When this is not the case, there are several different linear models that can apply.
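
A quick, informal look at the equal variance assumption is to compare the largest and smallest sample variances across treatments. This ratio is the statistic used in Hartley's F-max test; deciding whether it is too large requires a table of critical values, which this sketch omits. The treatment names and data are hypothetical.

```python
from statistics import variance

def variance_ratio(groups):
    """Ratio of the largest to the smallest sample variance across groups."""
    variances = [variance(g) for g in groups]
    return max(variances) / min(variances)

# Hypothetical results for three treatments.
treatments = {
    "treatment 1": [10.1, 9.8, 10.3, 10.0],
    "treatment 2": [12.4, 11.9, 12.2, 12.8],
    "treatment 3": [9.0, 11.5, 8.2, 12.1],
}

ratio = variance_ratio(treatments.values())
print(round(ratio, 1))
```

A ratio close to 1 is consistent with equal variances; a very large ratio, as treatment 3 produces here, is a signal to investigate further.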

Categorical Data

When dealing with items and distributions of results in Six Sigma projects, sometimes it is beneficial to compare the percentage of items distributed among several categories. This is referred to as analyzing categorical data. This is usually accomplished by taking a sample from each group and assigning the individual items to quality categories, such as excellent, good, fair, or poor.
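
One common way to compare how items are distributed among categories is a chi-square test on a table of counts. The sketch below computes only the test statistic and degrees of freedom; the two groups and their counts are made up, and a p-value would still need a chi-square table or a statistics library.

```python
def chi_square(table):
    """Chi-square statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected if category shares were the same in every group.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof

# Hypothetical samples from two groups, sorted into
# excellent, good, fair, and poor.
counts = [
    [30, 40, 20, 10],  # group 1
    [20, 40, 25, 15],  # group 2
]

stat, dof = chi_square(counts)
print(round(stat, 2), dof)
```

A larger statistic means the groups' category percentages differ more than would be expected if the groups shared one distribution.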

Logistic regression looks for relationships between a categorical response variable and one or more predictors. In other words, the categorical data is analyzed to find which predictors are associated with particular outcomes.
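
To make the idea concrete, here is a deliberately simplified sketch: a single predictor, a two-category (pass or fail) response, and a plain gradient-ascent fit. The data, learning rate, and step count are all assumptions chosen for the example; real projects would normally use a statistics package and may need more than two categories.

```python
from math import exp

def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))

def fit_logistic(xs, ys, rate=0.1, steps=5000):
    """Fit P(y=1) = sigmoid(b0 + b1*x) by gradient ascent on the log-likelihood."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [y - sigmoid(b0 + b1 * x) for x, y in zip(xs, ys)]
        b0 += rate * sum(errors) / n
        b1 += rate * sum(e * x for e, x in zip(errors, xs)) / n
    return b0, b1

# Hypothetical data: predictor setting (x) and whether the item passed (1) or failed (0).
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [0, 0, 0, 1, 0, 1, 1, 1]

b0, b1 = fit_logistic(xs, ys)

def predict(x):
    """Estimated probability of a pass at predictor setting x."""
    return sigmoid(b0 + b1 * x)

print(round(predict(1), 2), round(predict(8), 2))
```

A positive slope `b1` indicates that higher predictor settings are associated with a higher chance of a pass in this invented data set.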

Non-Parametric Methods

There may be times when the methods described above are not appropriate. If the assumptions are tested and found not to hold, non-parametric methods may be used. These methods are generally less powerful at detecting a real effect than methods that rely on those assumptions, but they remain valid when the assumptions do not hold.
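
The sign test is one simple non-parametric method: it uses only whether each paired difference is positive or negative, so it does not assume normality. The sketch below computes an exact two-sided p-value; the before-and-after differences are hypothetical.

```python
from math import comb

def sign_test(differences):
    """Exact two-sided sign test p-value for paired differences (zeros are dropped)."""
    nonzero = [d for d in differences if d != 0]
    n = len(nonzero)
    k = sum(1 for d in nonzero if d > 0)
    # Probability of a split at least this lopsided if + and - are equally likely.
    tail = sum(comb(n, i) for i in range(min(k, n - k) + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Hypothetical differences (after minus before) for ten paired measurements.
differences = [0.4, 1.2, -0.3, 0.8, 0.5, 0.9, 0.0, 1.1, 0.7, 0.6]

p_value = sign_test(differences)
print(round(p_value, 4))
```

Because the test ignores the size of each difference, it usually needs more data than a parametric test to detect the same effect.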

Once data is sufficiently analyzed, it's time to move on to the Improve Phase of DMAIC.

The Improve Phase

After the necessary data has been analyzed, and causes and effects are determined, the next phase in the DMAIC model is Improve. This phase involves implementing a new system for co

Customer Demands

Logically, there must be a customer demand for a product or service to be successful. Making changes or improvements to a product or service based on customer demand sounds like an easy proposition that would result in additional sales. However, it's not always that simple.

Some challenges posed by making changes based on customer demands include different customers wanting different things, tradeoffs necessary to make a new feature possible, the cost versus quality debate, and internal departmental needs and demands as well.

To help navigate these issues, a helpful approach is to provide a model for customers with which to prioritize their demands. Out of a list of possible improvements, a sample of customers can determine which ones are most important to them and which ones are least important. It is essential to find out what customers value and what they are willing (or not) to pay for.

Lean Techniques

In the Analyze Phase, Lean techniques were introduced as a model in which as much waste as possible is reduced or eliminated. This puts the emphasis on value for the customer rather than on mass production. In other words, mass production strives to meet the demands of a majority of customers by streamlining certain processes, but in the end, there are waiting times and excess inventory within this model.

However, if emphasizing value for the customer by meeting customer needs is a primary goal for making improvements, then there are several tools that are useful. Constraint management focuses on the constraints, or bottlenecks, in a process and tries to manage them effectively to reduce waiting times and other waste. Level loading is a scheduling method that tries to be responsive to customer demands and still attempts to produce the same quantity every day. Pull systems simply produce items when customers need them rather than mass producing, or pushing, items ahead of time. Using a flexible process allows for a manufacturing environment to adapt quickly to customer orders so that any particular item can be made with a simple reconfiguration of the production area. Lot size reduction means that only the items needed immediately for production are ordered and are on hand. In short, these techniques mean that when a customer places an order, the organization has everything at its disposal to fill the order, the order is produced by a simple reconfiguration, and the order is delivered when the customer needs it.
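
As a small illustration of constraint management, the sketch below finds the bottleneck in a serial process as the step with the lowest hourly capacity, since that step limits what the whole line can produce. The step names and capacities are hypothetical.

```python
# Hypothetical process steps and their capacities in units per hour.
capacities = {"cutting": 60, "welding": 35, "painting": 50, "packing": 80}

def find_bottleneck(capacities):
    """Return (step, capacity) for the step with the lowest capacity."""
    step = min(capacities, key=capacities.get)
    return step, capacities[step]

bottleneck, line_rate = find_bottleneck(capacities)
print(bottleneck, line_rate)
```

Improvement effort spent anywhere other than the bottleneck step would not raise the output of this simplified line.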

Empirical Models

An empirical model approach uses statistics to choose the best improvements. A Six Sigma empirical model uses six different stages or phases.

Phase 0 uses analysis of data to detect problems, performance, cost, time, and results. This is gathering and analyzing all known data to identify the changes to be made. Phase I calls for using designed experiments to measure effects based on all sorts of variables. Phase II builds on Phase I by taking the effects that are determined to be most important in implementing change and then using further designed experiments to figure out the optimal causes that achieve the desired effects. Phase III uses factorial experiments to develop new tests for new situations. It takes the results of Phase II and applies them to improve situations. Phase IV takes the Phase III situations and perfects them by using a composite design. Finally, Phase V implements robust changes in the value stream process to ensure optimal performance in the new system.

Data Mining and Neural Networks

Because experimenting on a fully functioning workplace can have a negative effect on productivity, real experiments are sometimes a luxury that the bottom line can't afford. In these instances, Six Sigma uses a virtual experiment approach. In this case, the Six Sigma team can use mined data that the company already has on record. Examining this existing data can also show the cause and effect relationships that experiments are trying to reveal.

Taking this one step further, this data can be used in neural network software, which can emulate a live testing environment. These neural networks can't replace an actual live experiment, but their results can be close enough to guide changes in the real workplace effectively.


Simulation

In using simulation, Six Sigma teams can once again avoid experimenting directly on the operating workplace. Simulation requires detailed models that accurately represent the workplace. While a model is only a representation and not as good as the real thing, it provides some benefits. A change made in one part of a workflow commonly has adverse effects elsewhere. A major benefit of simulation is that it can reveal these possible disruptions before a proposed improvement ever reaches the actual value stream.

When working with simulations, there are many software choices and statistical models that can be used to aid the process and to ensure a higher level of accuracy. All simulation results must be compared to actual numbers to determine if the simulation is valid.
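
The sketch below shows the general idea on a very small scale: a serial line is simulated with randomly varying step times, and the simulated result is compared with an observed figure. The two-step line, the uniform time ranges, the observed value of 330 minutes, and the helper names are all assumptions made for the example; real simulation software is far more detailed.

```python
import random

def simulate_line(n_items, step_ranges, seed=None):
    """Time to finish n_items on a serial line; each step time is uniform in (low, high)."""
    rng = random.Random(seed)
    finish = [0.0] * len(step_ranges)  # when each step finished its latest item
    for _ in range(n_items):
        ready = 0.0  # raw material is assumed to be always available
        for k, (low, high) in enumerate(step_ranges):
            start = max(ready, finish[k])  # wait for the item and for the step
            finish[k] = start + rng.uniform(low, high)
            ready = finish[k]
    return finish[-1]

def relative_error(simulated, actual):
    """Size of the gap between simulated and actual values, relative to the actual."""
    return abs(simulated - actual) / abs(actual)

# Hypothetical two-step line (times in minutes), averaged over 200 simulated shifts.
step_ranges = [(1.5, 2.5), (2.0, 4.0)]
results = [simulate_line(100, step_ranges, seed=s) for s in range(200)]
simulated_mean = sum(results) / len(results)

observed_mean = 330.0  # hypothetical figure from actual workplace records
print(round(simulated_mean, 1), round(relative_error(simulated_mean, observed_mean), 3))
```

How small the relative error must be before the simulation is treated as valid is a judgment the team has to make for its own process.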


Risk Assessment and Performance Standards

After new improvements have been chosen, proper risk assessment and performance standards need to be developed. Performance standards are important because tolerance levels for what is acceptable and what's not acceptable ensure that the standard for quality production is met. Statistical formulas help determine these tolerances.

When a new design is considered, it must be evaluated for how it works, its plausibility, and its maintenance requirements. If it is chosen for implementation, it must be noted that failures occur in the real world and cannot be completely avoided. To minimize the effects of failures, a Six Sigma team will predict all the possible failures in a process and prioritize them by their effects on the workflow. The team plans for failures and, as a result, develops contingency plans so that work can continue with as little disruption as possible. Likewise, it's also imperative to plan for safety issues.

When planning for possible failures, each kind of failure must be evaluated for its effects, for the probability that it will occur, and for the probability that it will go undetected. Planning for the worst helps keep everyone prepared in the event that it does happen.
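
One widely used way to combine these three considerations is the risk priority number from failure mode and effects analysis (FMEA), which the text above does not name, so treating it as the intended method is an assumption. Each failure is scored from 1 to 10 for severity, occurrence, and difficulty of detection, and the three scores are multiplied. The failure modes and scores below are invented.

```python
# Hypothetical failure modes: (description, severity, occurrence, detection).
# Each score runs from 1 to 10; a higher detection score means harder to detect.
failure_modes = [
    ("label misprint", 3, 6, 2),
    ("seal leak", 8, 3, 7),
    ("wrong part picked", 6, 4, 4),
]

def risk_priority(mode):
    """Risk priority number: severity x occurrence x detection."""
    _, severity, occurrence, detection = mode
    return severity * occurrence * detection

# Highest-risk failures first, so contingency planning starts with them.
ranked = sorted(failure_modes, key=risk_priority, reverse=True)
for mode in ranked:
    print(mode[0], risk_priority(mode))
```

The ranking is only as good as the scores, which are team judgments rather than measurements.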

The Improve Phase of the DMAIC model involves experiments that are conducted to help improve the value stream and the workflow to minimize waste. Once these improvements are decided upon, proper planning must occur to minimize and plan for risk.