Metric: Difference between revisions

820 bytes added ,  1 December 2023
no edit summary
 
(7 intermediate revisions by the same user not shown)
Line 1: Line 1:
{{a|mgmt|{{image|Dophin Thumbs Up|png|So long, and thanks for all the fish}}}}{{quote|When everything about a people is for the time growing weak and ineffective, it begins to talk about efficiency. ... Vigorous organisms talk not about their [[process]]es, but their aims.
{{a|mgmt|{{image|Dophin Thumbs Up|jpg|So long, and thanks for all the fish}}}}{{quote|When everything about a people is for the time growing weak and ineffective, it begins to talk about efficiency. ... Vigorous organisms talk not about their [[process]]es, but their aims.
:— {{Author|G. K. Chesterton}}, ''Heretics''}}
:— {{Author|G. K. Chesterton}}, ''Heretics''}}
{{Quote|
{{Quote|
Line 10: Line 10:
To be contrasted with the ineffable, inarticulable skills that are provided by a [[subject matter expert]].
To be contrasted with the ineffable, inarticulable skills that are provided by a [[subject matter expert]].
===[[Goodhart’s law]]===
===[[Goodhart’s law]]===
Not a law of economics or sociology so much as a wry remark — professor Goodhart made it at a symposium in 1975 — that, happens to pierce modern management orthodoxy through the heart.  Thus it can both spur its own industry of academic work in sociology and [[systems theory]], and at the same time go ignored in the upper tiers of corporate management:
Not a law of economics or sociology so much as a wry remark (Professor Goodhart made it at a symposium in 1975) that happens to pierce modern management orthodoxy through its heart.  Thus it can both spur its own industry of academic work in sociology and [[systems theory]], and at the same time go ignored in the upper tiers of corporate management:


{{Quote|When a measure becomes a target, it ceases to be a good measure.}}
{{Quote|When a measure becomes a target, it ceases to be a good measure.}}People are smart and selfish. They will work any target you set to suit themselves. If you tax by number of windows, people will board up their windows.<ref>See James C. Scott’s epic {{Br|Seeing Like a State}}.</ref>


It is so universal enough to apply even to dolphins. An aquarium in Miami is reported to have dealt with the problem of litter and dead seagulls in the main tank by rewarding dolphins for cleaning them up: a fish for each bird, or piece of litter.
===On dolphins, seagulls and opposable thumbs===
Goodhart’s law is universal enough to apply even to dolphins.  


Before long the dolphins were observed breaking up pieces of litter and claiming multiple fish for one each, and then stockpiling surplus fish, luring seagulls with them, and killing the seagulls!<ref>[https://open.spotify.com/episode/3y799K1qGOhqxPUGcqXwx0 Rationally Speaking podcast episode 240.</ref>
An aquarium in Miami is reported to have dealt with the problem of litter and dead seagulls in the main tank by rewarding the resident dolphins for cleaning them up: a fish for each bird or piece of litter.


“we should ” as xxx remarked, “bevgrateful dolphins don't have opposable thumbs.
Before long, the dolphins were observed shredding bits of litter into smaller pieces and claiming multiple fish, and then stockpiling surplus fish, luring seagulls with them, and ''killing the seagulls''.<ref>[https://open.spotify.com/episode/3y799K1qGOhqxPUGcqXwx0 Rationally Speaking podcast episode 240].</ref>


One could, and here I am indebted to  [https://modelthinkers.com/mental-model/goodharts-law this] excellent resource on [[Goodhart’s law]], break the phenomenon down into four components.
“We should,” as someone on the podcast remarked, “be grateful dolphins don’t have opposable thumbs.”


*'''Regressive''': using a single metric as a proxy to measure “[[multivariate]]” phenomena that are driven by several factors. Here [[Simpson’s paradox]] is not your friend. Much “[[social justice]]” — which we define as the wishful, if not wilful, tendency to boil complex socioeconomic phenomena down to simplistic moral propositions that even a dull fifth-former could understand, and only a dull fifth-former would fall for — stumbles into this trap.  
===Four problems with metrics===
One could, and here I am indebted to [https://modelthinkers.com/mental-model/goodharts-law this] excellent resource on [[Goodhart’s law]], break the phenomenon down into four components.


*'''Extremal''': Where a given metric is useful ''within a range'' — such a range generally corresponding to “normalcy”: “peacetime”, “normal operating conditions”, “[[business as usual]]” and similar platitudes — but which breaks down, fails, or even reverses itself in extremes or unusual cases beyond that range. Paging Messrs [[Black Scholes option pricing model|Black and Scholes]]. You know, using [[normal distribution]]s of independent events to model dependent events, like human behaviour of the market. These are especially fraught because the 80% of the time these metrics work, and they work fabulously, is exactly the range over which ''it doesn’t matter whether they work or not''. When things ''aren’t'' blowing up. The use case for the metric in the first place, remember, was to warn you about risk events. A “heightened risk” metric that you can only rely on when there isn’t a heightened risk is a ''waste of cheese''.
====Regressive====
Using a single metric as a proxy to measure “[[multivariate]]” phenomena that are driven by several factors. Here [[Simpson’s paradox]] is not your friend. Much “social justice” — which we define as the wishful, if not wilful, tendency to boil complex socioeconomic phenomena down to simplistic moral propositions that even a dull fifth-former could understand, and only a dull fifth-former would fall for — stumbles into this trap.  
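
A minimal sketch of how a single headline number can mislead, in the manner of [[Simpson’s paradox]]. Every figure and name here (<code>desk_a</code>, <code>desk_b</code>, the “easy” and “hard” split) is hypothetical and invented purely for illustration: one desk does better on every kind of case, yet looks worse on the aggregate metric because it takes on more of the hard ones.

```python
from fractions import Fraction

# Hypothetical figures: (successes, attempts) per desk, split by case difficulty.
results = {
    "desk_a": {"easy": (9, 10), "hard": (30, 90)},
    "desk_b": {"easy": (80, 90), "hard": (2, 10)},
}


def segment_rate(desk, segment):
    """Success rate for one desk on one kind of case."""
    successes, attempts = results[desk][segment]
    return Fraction(successes, attempts)


def headline_rate(desk):
    """The single aggregate metric: all successes over all attempts."""
    total_successes = sum(s for s, _ in results[desk].values())
    total_attempts = sum(n for _, n in results[desk].values())
    return Fraction(total_successes, total_attempts)


# Desk A beats desk B on easy cases and on hard cases...
a_wins_every_segment = all(
    segment_rate("desk_a", seg) > segment_rate("desk_b", seg)
    for seg in ("easy", "hard")
)

# ...but desk B wins on the headline number, because it mostly handles easy cases.
b_wins_headline = headline_rate("desk_b") > headline_rate("desk_a")

print("A better in every segment:", a_wins_every_segment)
print("B better on the headline metric:", b_wins_headline)
```

Reward the headline rate and the rational response is to stop taking hard cases, which is presumably not what anyone wanted.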


*'''Causal''': the old [[correlation]] is not the same as [[causation]] chestnut. It may be true that people often buy ice-cream when they are wearing sunglasses, but handing out complimentary sunglasses will not improve ice-cream sales.
====Extremal====
Many metrics are useful ''within a range'' corresponding to “normalcy” (call it “peacetime”, “normal operating conditions”, “[[business as usual]]” or some similar platitude) but break down, fail, or even reverse themselves in extremes or unusual cases beyond that range. Using [[normal distribution]]s of independent events to model dependent events with non-linear distributions (like, well, anything that involves human behaviour, such as a market) is especially fraught, because even where your metrics work fabulously 95% of the time, that 95% is exactly the range over which ''it doesn’t matter whether they work or not''. This is the time when things are operating as normal, behaving themselves, and ''not'' blowing up.


*'''Adversarial''': Substitute targeting the desired outcome — a senior tranche in a portfolio of mortgages that will not default in any circumstances — with one that is rated AAA. Hello, [[global financial crisis]].
The “use-case” for any metric in the first place, remember, is to warn about risk events. No-one needs a light on the dashboard saying “everything is fine”.  A “heightened risk” metric that you can only rely on when there isn’t a heightened risk is a ''waste of trees''.
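
To put rough numbers on this, here is a sketch comparing tail probabilities under a normal model with those under a fatter-tailed one of the same variance. The choice of a Laplace distribution as the fat-tailed stand-in is our assumption, made only because its tail has a simple closed form; it is not a claim about how any real market behaves. <code>NormalDist</code> comes from Python’s standard <code>statistics</code> module; the function names are ours.

```python
import math
from statistics import NormalDist


def normal_tail(k):
    """P(|X| > k) for a standard normal variable."""
    return 2 * (1 - NormalDist().cdf(k))


def laplace_tail(k):
    """P(|X| > k) for a zero-mean Laplace variable scaled to unit variance."""
    scale = 1 / math.sqrt(2)  # Laplace variance is 2 * scale**2, so this gives 1
    return math.exp(-k / scale)


# Compare the two models inside the "normal" range and far outside it.
for k in (1.96, 5):
    print(
        f"beyond {k} standard deviations: "
        f"normal {normal_tail(k):.2e}, fat-tailed {laplace_tail(k):.2e}"
    )

# How much more often the fat-tailed model expects a five-sigma event.
tail_ratio = laplace_tail(5) / normal_tail(5)
print(f"five-sigma events are about {tail_ratio:.0f} times likelier with fat tails")
```

Inside two standard deviations the two models differ by a percentage point or so. At five they differ by three orders of magnitude, which is the part of the range the metric was supposed to be for.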
 
====Causal====
Metrics fall for the old “[[correlation]] is not the same as [[causation]]” chestnut. In recent years, prompted we think by the [[difference engine]]’s emergence as the machine of choice for measuring things, we have given up on the idea of proving out causal chains. We are happy enough to rely on correlations. But correlations may be meaningful or spurious, and even where meaningful they give no idea which way the causal arrow flows. It may be true that people often buy ice-cream when they are wearing sunglasses, but handing out complimentary sunglasses will not improve ice-cream sales.
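
The sunglasses example can be sketched as a toy model. Everything in it is hypothetical: we assume sunshine alone drives both eyewear and ice-cream sales, and the sales figures are invented. The observed data show a handsome gap between sunglasses days and bare-eyed days; intervening on the sunglasses changes nothing.

```python
# Hypothetical world: sunshine drives both sunglasses and ice-cream sales.


def wears_sunglasses(sunny):
    """People put on sunglasses exactly when it is sunny."""
    return sunny


def ice_cream_sales(sunny, sunglasses):
    """Daily sales depend on the weather only; eyewear is deliberately ignored."""
    return 120 if sunny else 40


def mean(values):
    return sum(values) / len(values)


# Whether each of eight invented days was sunny.
days = [True, True, False, True, False, False, True, False]

# What an observer measuring sunglasses and sales would record.
observed = []
for sunny in days:
    glasses = wears_sunglasses(sunny)
    observed.append((glasses, ice_cream_sales(sunny, glasses)))

sales_with = mean([sales for glasses, sales in observed if glasses])
sales_without = mean([sales for glasses, sales in observed if not glasses])
observed_gap = sales_with - sales_without

# Intervention: hand out complimentary sunglasses every day, whatever the weather.
baseline_total = sum(sales for _, sales in observed)
intervention_total = sum(ice_cream_sales(sunny, True) for sunny in days)

print("observed gap in daily sales:", observed_gap)
print("extra sales from free sunglasses:", intervention_total - baseline_total)
```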
 
====Adversarial====
Targeting a proxy in place of the desired outcome: aiming not for a senior tranche in a portfolio of mortgages that will not default in any circumstances, but for one that is rated AAA. Hello, [[global financial crisis]].