Summary
This blog is a guide on how to control metrics in Power BI using Nelson rules for Statistical Process Control (SPC).
I begin with an overview of SPC and provide practical examples of how it can add value across various sectors. I then explain why Nelson rules are important in SPC, and the logic used by each of the eight rules to identify unexpected values that signal when a metric is uncontrolled.
Finally, I detail how to build your own solution in Power BI, from configuring the data model to creating key DAX measures and formatting the control chart.
Solution (.pbix) and source files: https://github.com/RitchieBI/Control-Metrics-in-Power-BI-8-Nelson-Rules/tree/main
Statistical Process Control (SPC)
Statistical Process Control (SPC) was first introduced in the 1920s to boost production quality by monitoring and managing variation [1]. Its main tool, the control chart, became a vital part of the Six Sigma methodology introduced by Motorola in 1986 [2] and remains widely used in manufacturing today.
A control chart monitors how a value varies from the average value. This variation is compared to a statistically expected level of variation, represented by the standard deviation (SD), as shown below.
Nelson
Rules
Control charts define control limits, beyond which data
points are considered uncontrolled. Walter A. Shewhart, the pioneer of SPC, set
control limits at three standard deviations above and below the process average. This
remains the most important criterion in SPC and is often used in isolation.
For normally distributed data, just 0.3% of data points
satisfy Shewhart’s criterion for being out of control. As this criterion has
such a wide tolerance, there is a risk of missing critical instability signals,
such as trends, patterns, or clusters. To identify these instability signals,
Lloyd S. Nelson developed the eight ‘Nelson rules’ in 1984 [4]. Using
all eight Nelson rules in combination helps to detect early warning signs and a
broader range of out-of-control conditions. For this reason, I include all
eight Nelson rules in the provided solution.
The eight Nelson rules are detailed below.
Rule 1: At least one point is beyond three standard deviations from the mean* (at least one value is significantly out of control)
*From now on, I will use term "mean" instead of "average" to distinguish it from other types of averages, such as the median.
Rule 2: Nine points in a row are on the same side of the mean (prolonged bias)
I kept the data model simple, to focus our attention on the required measures and formatting. The tables ‘Process’, ‘Legend’ and ‘Icons’ were imported from GitHub using the Power Query Web Connector. Apart from the automatically applied transformations, ‘Promoted Headers’ and ‘Changed Type’ , no additional transformations were applied in Power Query.
A date table, ‘Date’, was created in DAX.
Date =
ADDCOLUMNS(
CALENDARAUTO(12),
"Year", YEAR([Date]),
"Month", FORMAT([Date],"MMM"),
"MonthNumber", MONTH([Date]),
"Day", DAY([Date])
)
A blank table, ‘_Measures’, was created in DAX for storing measures.
_Measures = {BLANK()}
A many-to-one relationship was created between the ‘Process’ table and the ‘Date’ table.
2. Total Value, Mean and Standard Deviation
Measures were defined for total value, mean, and standard
deviation. These measures are heavily referenced by other measures in the solution.
Total Value = SUM('Process'[Value])
Mean = AVERAGEX ( ALLSELECTED ( 'Date'[Date] ), [Total Value] )
SD = STDEVX.P ( ALLSELECTED ( 'Date'[Date] ), [Total Value] )
The measure [Alternating] was created
to detect alternating values. [Alternating] is used in the [Control Rules] measure
to test a value against Rule 3 and Rule 4. Rule 3 tests for six consecutive
points that do not alternate in direction, while Rule 4 tests for fourteen
consecutive points that alternate in direction. The DAX formula for
[Alternating] is detailed in Appendix A.
To establish that a value alternates
direction, it is necessary to know the value of the two prior data points. The
current value and the previous two values were defined as variables, as shown
below.
//-------Last 3 Values ---------------------------------------
VAR Value_Current = [Total Value]
VAR Value_BeforeCurrent =
CALCULATE(
[Total Value],
OFFSET(-1, ALLSELECTED('Date'[Date]))
)
VAR Value_TwoBeforeCurrent =
CALCULATE(
[Total Value],
OFFSET(-2, ALLSELECTED('Date'[Date]))
)
I used the SIGN function to identify the direction of movement between the first and second data points, and between the second and third data points. SIGN returns 1 for an upward direction, -1 for a downward direction, and 0 if there is no directional change.
//-------Previous and Current Direction of [Total Value] -----
VAR Direction_Previous =
IF(
ISBLANK(Value_TwoBeforeCurrent),
0,
SIGN(Value_BeforeCurrent -
Value_TwoBeforeCurrent)
)
VAR Direction_Current =
IF(
ISBLANK(Value_BeforeCurrent),
0,
SIGN(Value_Current -
Value_BeforeCurrent)
)
Finally, a Boolean expression is used to return the result. This returns TRUE if there are two consecutive directional changes, and the changes are opposite in direction. Otherwise, it returns FALSE.
//-------Condition for [Total Value] to alternate ------------
VAR Result =
Direction_Previous
<> 0 &&
Direction_Current
<> 0 &&
Direction_Previous
<> Direction_Current
RETURN
Result
Note that [Alternating] will always return FALSE for the first two data points, as there aren't three data points available for comparison. [Alternating] was verified by adding it as a data label on a line chart, as shown below.
4. Control Rules
The measure [Control Rules] was developed to evaluate each
data point against all eight Nelson rules. While sections of the measure are
shown below, the complete measure can be found in Appendix B. As the measure is
long, I only highlight some key concepts below. If you have any questions about
the specifics of the logic, please feel free to send me a message!
4.1 Rule Variables
4.2 SWITCH Function
SWITCH(
TRUE,
Rule1, 1,
Rule2, 2,
Rule3, 3,
Rule4, 4,
Rule5, 5,
Rule6, 6,
Rule7, 7,
Rule8, 8,
BLANK()
)
IF ( Rule1, 1,
IF ( Rule2, 2,
IF ( Rule3, 3,
IF ( Rule4, 4,
IF ( Rule5, 5,
IF ( Rule6, 6,
IF ( Rule7, 7,
IF ( Rule8, 8,
BLANK () ))))))))
As Rule 1 through 4 provide the strongest indication of whether a metric is controlled [5], the rules were listed in ascending order within the SWITCH function. This ensures that the most important rule is flagged if a data point meets the criteria of multiple rules.
4.3 WINDOW Function
VAR Last9 =
WINDOW ( -8, REL, 0, REL, ALLSELECTED ( 'Date'[Date] ) )
4.4 Rule 3 and Rule 4
For six out of the eight rules, the criterion is defined in terms of the mean and standard deviation. In contrast, the criterion of Rule 3 and Rule 4 are defined in terms of the number of consecutive alternating values.
In the expressions used to test Rule 3 and Rule 4, the [Alternating] measure was added as new column in the table variables ‘Last7’ and ‘Last14’, as shown below.
//Rule 4: Fourteen points in a
row alternating up and down
VAR Last14_Alternating =
ADDCOLUMNS ( Last14, "Alt", [Alternating] )
//[Alternating] returns FALSE
for the first 2 values of ‘Last14’, as there are not 3 points for comparison.
Therefore, if last 14 values are alternating, [Alternating] will return 12 as
TRUE
VAR Rule4 =
COUNTROWS ( FILTER ( Last14_Alternating, [Alt] = TRUE () ) ) = 12
&& COUNTROWS ( Last14 ) = 14
5. Line Chart and Conditional Formatting
To make it easy to spot why certain data points are out of
control, the colour of data point markers was conditionally formatted based on
the result of [Control Rules].
Marker Colour =
//Used to conditionally format
the colour of the markers
LOOKUPVALUE ( 'Legend'[Colour], 'Legend'[ID], [Control Rules] )
The ‘Legend’ table is shown below.
6. Custom Legend
The legend is a table visual with the columns ‘Legend’[Rule], [Legend], and ‘Legend’[Problem Indicated]. The measure [Legend] is equal to UNICHAR(11044), a large round circle that looks like a data point marker.
The table was formatted so that only the values were visible.
The measure [Legend Colour] was used to conditionally format the font colour of [Legend] so that it matched the marker colours on the graph.
Legend Colour =
//Used to conditionally format the colour of legend
MAX ( 'Legend'[Colour] )
As the legend is large, I wanted
to make it easier to read by filtering it to the relevant information, displaying
only the control rules that meet the specified criteria. To do this, I added
the following measure, [Legend Filter]
, as a filter on the table visual and filtered it to 1 in the
filter pane.
7. Other Formatting Options
Formatting depends on personal/client preference, so I won’t
dive into the specifics of each step I took. However, if you’re curious about
any of the details, feel free to drop a comment on this blog or message me on
LinkedIn—I’d be happy to chat!
Key formatting changes/additions:
- Line formatting
- Dynamic axis limits to give data labels enough space
- Data labels
- Dynamic text and images
- [Control Rules] in ‘Tooltips’ field
- Y-axis constant lines: the mean, and 3 lines above and below the mean in increments of 1 standard deviation
Further Developments
This proof of concept can be adapted to:
- Include specification limits
- Calculate the mean and standard deviation based on the last month or the last calendar month etc.
- Work with discrete data, such as the number of defective items in manufacturing process
- Include moving-range charts, which track the differences between consecutive measurements
- Work with multiple process categories, e.g., Machine 1, Machine 2, etc.
Thank you
Thanks for reading! Feel free to leave a comment if you have any questions or feedback. If you use this in a project or develop the solution further, please let me know—I’d love to hear about it!
References
[1] Shewhart, Walter A. 1931. Economic Control of Quality
of Manufactured Product. New York: D. Van Nostrand Company.
[2] Kumar, Sanjoy, and Yash P. Gupta. “Statistical Process
Control at Motorola’s Austin Assembly Plant.” Interfaces 23, no. 2
(1993): 84–92. http://www.jstor.org/stable/25061730.
[3] MetrologyGate. "The 8 Nelson Rules in the
Statistical Process Control (SPC)." Last modified January 10, 2024. https://metrologygate.com/en/2024/01/10/the-8-nelson-rules-in-the-statistical-process-control-spc/.
[4] Nelson, Lloyd S. 1984. “The Shewhart Control Chart—Tests
for Special Causes.” Journal of Quality Technology 16 (4):
237–39. doi:10.1080/00224065.1984.11978921.
[5] Rane, Shaligram, Rustom Mody, and Hitesh Patel.
"Continuous Process Performance Monitoring Using Nelson Rules." Pharma
Focus Asia. Accessed August 19, 2024. https://www.pharmafocusasia.com/manufacturing/continuous-process-performance-monitoring-using-nelson-rules.
Appendix
A; [Alternating]
Alternating =
//Returns TRUE if current value
of [Total Value] moves in an alternative direction to the direction moved by
the previous 2 values
//-------Last 3 Values ---------------------------------------
VAR Value_Current = [Total Value]
VAR Value_BeforeCurrent =
CALCULATE(
[Total Value],
OFFSET(-1, ALLSELECTED('Date'[Date]))
)
VAR Value_TwoBeforeCurrent =
CALCULATE(
[Total Value],
OFFSET(-2, ALLSELECTED('Date'[Date]))
)
//-------Previous and Current Direction of [Total Value] -----
VAR Direction_Previous =
IF(
ISBLANK(Value_TwoBeforeCurrent),
0,
SIGN(Value_BeforeCurrent - Value_TwoBeforeCurrent)
)
VAR Direction_Current =
IF(
ISBLANK(Value_BeforeCurrent),
0,
SIGN(Value_Current - Value_BeforeCurrent)
)
//-------Condition for [Total Value] to alternate ------------
VAR Result =
Direction_Previous <> 0 &&
Direction_Current <> 0 &&
Direction_Previous <> Direction_Current
RETURN
Result
Appendix
B; [Control Rules]
Control Rules =
//--------Mean and Standard Deviation-------------------------
VAR Mean =
[Mean]
VAR SD =
[SD]
//--------Window Functions------------------------------------
VAR Last3 =
WINDOW ( -2, REL, 0, REL, ALLSELECTED ( 'Date'[Date] ) )
VAR Last5 =
WINDOW ( -4, REL, 0, REL, ALLSELECTED ( 'Date'[Date] ) )
VAR Last7 =
WINDOW ( -6, REL, 0, REL, ALLSELECTED ( 'Date'[Date] ) )
VAR Last8 =
WINDOW ( -7, REL, 0, REL, ALLSELECTED ( 'Date'[Date] ) )
VAR Last9 =
WINDOW ( -8, REL, 0, REL, ALLSELECTED ( 'Date'[Date] ) )
VAR Last14 =
WINDOW ( -13, REL, 0, REL, ALLSELECTED ( 'Date'[Date] ) )
VAR Last15 =
WINDOW ( -14, REL, 0, REL, ALLSELECTED ( 'Date'[Date] ) )
//--------Rules-----------------------------------------------
//Rule 1: One point is beyond
three standard deviation from the mean
VAR Rule1 =
ABS ( [Total Value] - Mean ) > 3 * SD
//Rule2: Nine points in a row
are on the same side of the mean
VAR Last9_AboveMean =
COUNTROWS ( FILTER ( Last9, [Total Value] >
Mean ) )
VAR Last9_BelowMean =
COUNTROWS ( FILTER ( Last9, [Total Value] <
Mean ) )
VAR Rule2 =
( Last9_BelowMean
= 9
||
Last9_AboveMean = 9 )
&& COUNTROWS ( Last9 ) = 9
//Rule 3: Six points in a row steadily increasing or decreasing
VAR Last7_Alternating =
ADDCOLUMNS ( Last7, "Alt", [Alternating] )
VAR Rule3 =
COUNTROWS ( FILTER ( Last7_Alternating, [Alt] = FALSE () ) ) = 7
&& COUNTROWS(DISTINCT(Last7))
= 7
//Rule 4: Fourteen points in a
row alternating up and down
VAR Last14_Alternating =
ADDCOLUMNS ( Last14, "Alt", [Alternating] )
//[Alternating] returns FALSE
for the first 2 values of ‘Last14’, as there are not 3 points for comparison.
Therefore, if last 14 values are alternating, [Alternating] will return 12 as
TRUE
VAR Rule4 =
COUNTROWS ( FILTER ( Last14_Alternating, [Alt] = TRUE () ) ) = 12
&& COUNTROWS ( Last14 ) = 14
//Rule 5: At least two out of
three points in a row are more than 2 standard deviations from the mean in the
same direction
VAR Last3_MoreThan2SD =
COUNTROWS ( FILTER ( Last3, ( [Total Value] > (
Mean + 2 * SD ) ) ) )
VAR Last3_LessThanMinus2SD =
COUNTROWS ( FILTER ( Last3, ( [Total Value] < (
Mean - 2 * SD ) ) ) )
VAR Rule5 =
(
Last3_MoreThan2SD >= 2
||
Last3_LessThanMinus2SD >= 2 )
&& COUNTROWS ( Last3 ) = 3
//Rule 6: At least four out of
five points in a row are more than 1 standard deviation from the mean in the
same direction
VAR Last5_MoreThan1SD =
COUNTROWS ( FILTER ( Last5, [Total Value] > (
Mean + SD ) ) )
VAR Last5_LessThanMinus1SD =
COUNTROWS ( FILTER ( Last5, [Total Value] < (
Mean - SD ) ) )
VAR Rule6 =
(
Last5_MoreThan1SD >= 4
||
Last5_LessThanMinus1SD >= 4 )
&& COUNTROWS ( Last5 ) = 5
//Rule 7: Fifteen points in a
row are within 1 standard deviation of the mean, and are on both sides of the
mean
VAR Last15_Within1SD =
COUNTROWS ( FILTER ( Last15, ABS ( [Total Value] - Mean ) <=
SD ) )
VAR Rule7 =
Last15_Within1SD =
15
&& COUNTROWS ( Last15 ) = 15
// Rule 8: Eight points in a row
are not within 1 standard deviation of the mean, and are on both sides of the
mean
VAR Last8_Within1SD =
COUNTROWS ( FILTER ( Last8, ABS ( [Total Value] - Mean ) <=
SD ) )
VAR Last8_Min =
MINX ( Last8, [Total Value] )
VAR Last8_Max =
MAXX ( Last8, [Total Value] )
VAR Rule8 =
Last8_Within1SD = 0
&& Last8_Max > Mean
&& Last8_Min < Mean
&& COUNTROWS ( Last8 ) = 8
//--------Choose Rule-----------------------------------------
// Returns the rule that returns
true if the system is out of control
VAR NelsonRules =
SWITCH(
TRUE,
Rule1, 1,
Rule2, 2,
Rule3, 3,
Rule4, 4,
Rule5, 5,
Rule6, 6,
Rule7, 7,
Rule8, 8,
BLANK()
)
VAR Result =
IF ( [Total Value] <> BLANK (), NelsonRules, BLANK () )
RETURN
Result
Comments
Post a Comment