Data modernism: Difference between revisions

Line 24:
But this is a difference of emphasis, not of upshot. It is a different path to the same place. Data science is just a new conveyance to the same reductionist theory of the world.

=== Unstructured data as [[hubbub]] ===
Now data, as it comes, is an incoherent, imperfect, meaningless thing. It is the pre-cinema audience chat before the lights go down: a “[[hubbub]]” made up of millions of individual interactions. Each of these ''may'' have its own meaning, or may be incoherent, wrong-headed or irrelevant, but the aggregate, taken as an unmoderated whole, has no particular meaning at all beyond “people are talking”.

Now imagine being asked to take that audience [[hubbub]] and, with magical tools, to condense it into a single answer to the question “what was this audience thinking?” But the interactions are unstructured and, as between themselves, random and disconnected. Obviously, there ''is'' no thread.

But this is what the algorithm is supposedly doing when it extracts signal from noise. Selectively, it filters, limits, compresses and amplifies on the presumption that there ''is'' a signal to find among the noise; that all the conversations in that [[hubbub]] do boil down to some common sentiment, and that those which don’t are no more than noise; that the [[hubbub]] is something like a de-tuned radio, or the white noise in the SETI<ref>Search for Extra-Terrestrial Intelligence. You know, Jodie Foster in ''Contact''.</ref> data, buried within which are signals from pulsars, quasars and intelligent life.

But the [[hubbub]] is not like that. It can’t be reduced to prime factors. There is no common signal. SETI is a bad [[metaphor]]: it tries to detect a single bilateral signal amid a spectrum of other radiation that is not a signal but is broadcast on the same frequency. With the human [[hubbub]], ''all there is is signal''. It is just that all the signals conflict, or miss each other, or bear no relation to each other at all. There is a spectrum of unconnected communications and ''no'' real “signal”. We are not trying to isolate a single conversation from all the others (that would be the direct analogy) but trying to extract an aggregated message that is not actually there, and to treat it as an [[Emergence|emergent]] property of all those conversations. This is a different thing entirely.

''There is no 2 from millions of unrelated conversations''. The result is brown, warm and even: maximum ''entropy''.
Line 42:
It is human nature to read that against your personal situation and come to a view, as if reading a horoscope. Suddenly everyone in the cinema ''does'' have a view. They will invest in that conversation.

But the [[hubbub]] was just noise all along. None of the individual conversations had anything to do with each other. All had their own, independent meanings. They are ''immune'' to aggregation.

We say “we have unconscious biases and they inform our reactions”. Well, no ''shit''.