Exchange Ideas

Causal Capital

RMB - Risk, Markets & Banking

 


October 01, 2007

Simpson's Paradox

Recently I was at one of those typical conferences where pensive bankers float around the tea and cakes while a horde of speakers promotes one novel yet convoluted topographic framework after another. Some of the speakers delivered quite convincing, highly vocal demonstrations, crammed to the brim with hard-hitting cliches and a bouquet of consultant-friendly diagrams, but they were, alas, empirically vacuous pieces of work. My escape from the frustration finally came during the break, when I was lucky enough to fall into an intriguing conversation with a lonely soul who was propping up the symmetrically stacked chocolate eclairs. In his opinion, control self assessment is the longest-running institutional fraud of the operational risk camp: even with audit departments directly keeping the gaming of scores in check, the results were staggeringly wrong. The business units showing the greatest improvement when benchmarked seemed to be the ones with most of the problems. How could this be?

Perhaps the problem lies in Simpson's Paradox.

Simpson's Paradox is a fascinating puzzle that appears when results from several groups are aggregated: a relationship that holds within every group can weaken, or even reverse, once the groups are combined. It is particularly common when percentages are used to present holistic data that has many entry points, which is the typical structure of control self assessment and makes it a breeding ground for the disorder.

The best-known case comes from a study of admissions at the University of California at Berkeley in 1973. The effect itself is older (Edward Simpson described it in 1951, and I doubt that something this obvious could have fooled statisticians for long). In short, the study was carried out to assess whether the university's admission process was fair: was the university discriminating based on gender and, if so, which gender was losing out? The top-level report showed favoritism towards male students, yet analysis of the underlying data painted the opposite picture, and the sample was so large that the difference could not be blamed on random variation. So how could such a reversal, and such a misleading error, occur?

The best way to explain this is with a simple example. Imagine two departments (Department A and Department B) are running a control self assessment program that reports, as a percentage, how many of the controls assigned to each department for correction were actually corrected:

          Month 1   Month 2   Overall
Dep A     62.11%    20.00%    58.10%
Dep B     80.00%    26.32%    31.43%

Department B appears to outperform department A in both month 1 and month 2. However, when the overall tally is shown, department B is well below department A.

What is going on here?

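The proof with this one is in the detail. In the first month department A corrected 59 out of 95 controls (62% ish) and department B corrected 8 out of 10 (80%). In month 2 department A corrected 2 out of 10 controls (20%) while department B cleaned up 25 out of its core 95 controls (26% ish). Both departments had 105 controls to correct in total; department A corrected 61 of them (58% ish) and department B only 33 (31% ish). The overall figure is a weighted average of the monthly rates, and the weights differ: department A handled most of its controls in its good month, while department B handled most of its controls in its bad month. So while department B posts the better percentage each month, the monthly figures hide the real truth, which is that department A corrected nearly twice as many controls.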
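For anyone who wants to check the arithmetic, here is a minimal Python sketch that recomputes the monthly and overall rates from the raw counts quoted above. The variable and function names are my own illustrative choices, not part of any control self assessment tool.

```python
from fractions import Fraction

# (corrected, assigned) per month, using the counts quoted in the post
counts = {
    "Dep A": [(59, 95), (2, 10)],
    "Dep B": [(8, 10), (25, 95)],
}


def overall(months):
    """Overall rate: total corrected divided by total assigned."""
    corrected = sum(c for c, _ in months)
    assigned = sum(n for _, n in months)
    return Fraction(corrected, assigned)


# Exact monthly rates, kept as fractions so comparisons are not
# affected by floating point rounding
monthly = {
    dep: [Fraction(c, n) for c, n in months] for dep, months in counts.items()
}
totals = {dep: overall(months) for dep, months in counts.items()}

for dep in counts:
    cells = "  ".join(f"{float(r):7.2%}" for r in monthly[dep])
    print(f"{dep}  {cells}  overall {float(totals[dep]):7.2%}")

# The paradox: B beats A in every month, yet A beats B overall
b_wins_each_month = all(
    b > a for a, b in zip(monthly["Dep A"], monthly["Dep B"])
)
a_wins_overall = totals["Dep A"] > totals["Dep B"]
print("B ahead every month:", b_wins_each_month)
print("A ahead overall:   ", a_wins_overall)
```

Both flags come out true. The reversal depends entirely on the unequal volumes: if each department had handled the same number of controls in each month, the overall figure would simply be the average of the two monthly rates and department B would stay ahead.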

All this is quite obvious. However, couple this paradox with the difference in risk profiles between departments and the randomly varying volume that each department processes daily, and you could have a real governance issue on your hands.

Posted by CausalEvents at October 1, 2007 02:56 AM
