Desktops, metadata and filing: Difference between revisions

From The Jolly Contrarian
{{a|technology|{{wmc|Xerox Alto with mouse and chorded keyset - Computer History Museum.jpg|What you see is what you got, yesterday: A Xerox Alto with a portrait monitor. Nice.}}{{wmc|Desktop icons for Xerox Star 8010.jpg|The Alto’s “desktop”}}{{wmc|Visicalc.png|VisiCalc on the Apple II}}}}====A bad metaphor: the desktop====
{{afreeessay|technology|metadata|{{wmc|Xerox Alto with mouse and chorded keyset - Computer History Museum.jpg|What you see is what you got, yesterday: A Xerox Alto with a portrait monitor. Nice.}}{{wmc|Desktop icons for Xerox Star 8010.jpg|The Alto’s “desktop”}}{{wmc|Visicalc.png|VisiCalc on the Apple II}}}}
 
{{drop|I|n 1973, Xerox’s}} Palo Alto Research Center released the “Alto” personal computer. This was the first machine equipped with a graphical user interface (GUI) instead of the traditional character user interface.<ref>It was well ahead of its time: the GUI would not become mainstream until Apple released its Macintosh a decade later, in 1984.</ref>
 
To lessen the cognitive burden on users — at the time, bowler-hatted bureaucrats, sleeve-gartered clerks and others whose mental framework comprised a typing pool and boys running memoranda between office in-trays in reusable manila envelopes, and whose idea of “information technology” was a {{pl|https://pneumatic.tube/the-lamson-pneumatic-tube-system-at-jacksons-of-reading-uk|pneumatic tube system}} that launched invoices around the clanking pipes of the organisation like mortar bombs — Xerox came up with a visual  metaphor.
 
If users were going to be asked to give up their card catalogue system and stare at a computer screen all day then best make it as familiar as possible. Thus, the Alto interface was modelled on a  “[[desktop]]”: not an impenetrable wall of green code and a flashing cursor, but a cartoonish depiction of a ''literal'' desktop with its familiar iconography: folders, a blotter, filing cabinets, in-trays, out-trays and even a dinky little waste-paper basket. 
 
All designed to reassure the [[meatware]] —perennially fearful of incipient obsolescence as it was, and still is — that the [[change journey]] from the analogue to the atomic age would not be so bad after all.
 
====A better metaphor: the spreadsheet====
{{VisiCalc capsule}}
 
A spreadsheet is a ''much'' better way of thinking about how to organise digital information than a desktop because it is apparently infinite: digital information has no physical dimension. It is not constrained by a physical “[[substrate]]” — usually paper — that analogue information is embedded in. An empty spreadsheet stretches endlessly away in two directions.
 
{{Quote|
'''Downwards''': You can ''add'' items to your filing system without limit, unconstrained by the area of your desk or the volume of your filing cabinet, where each item occupies one of an infinite number of rows. <Br>
'''Across''': You can ''categorise'' each item however you like by creating new ''columns''. There is no limit to columns, and no set hierarchy between columns. They need not even bear any relation to each other as long as they relate to the original item. Whereas a subfolder is necessarily a sub-division of the folder it sits in, this is not true of a new column.}}
 
By contrast the desktop is designed around a physical problem that digital information does not have: how to manage bits of paper. In printed information, paper is “form”, text is “substance”. A desktop is obliged to prioritise form over substance because the substance ''cannot exist independently of the form''. Paper must be ''put'' somewhere. Unless you physically copy it, you can only put it in one place at a time.
 
Storing paper is expensive.<ref>All physical information is eventually destined for the [[Iron Mountain]].</ref> Copying and transporting information on paper is expensive, slow and “''lossy''”. Each copy is worse than the last, and each one increases storage costs.
 
Digital information has (almost) no form.<ref>Okay: ''almost'' no form. Compared with physical information. In this section take the word “almost” as read. </ref> It does not occupy physical space. It costs nothing to store. We can copy and move it costlessly, instantly, and with no loss of fidelity. At least when compared with physical information, digital information can be everywhere at once.  We are not constrained by space or time when we store or move digital information. Yet to file it, we use a metaphor that assumes we are. 
====Division versus multiplication====
{{drop|I|n a “desktop”}} structure, subfolders are sub-''divisions'', each further level down more fine-grained and ''subordinate'' than the last, and less important relative to the formal hierarchy. A folder path is a rabbit hole. ''Hierarchy'' takes priority over the ''artefact''. The hierarchy explains and contextualises everything. The more extensive the hierarchy, the less significant an individual artefact.
 
Whereas folders are ''divisive'' in nature, spreadsheet columns are ''multiplicative''. All columns in a spreadsheet have equal standing — they are, well, ''[[pari passu]]'' — so they can be multiplied without limit: if an existing column, or an artful combination of them, doesn’t yield the information you need, you can always add ''more'' columns. In a spreadsheet, ''the artefact takes priority over the hierarchy''. The hierarchy is incidental. ''Formal''.
====A front in the battle between substance and form====
{{quote|
The desktop prioritises ''form''. <br>A spreadsheet prioritises ''substance''.}}
{{Drop|T|he last thing}} to notice is our old friend the [[Der Sieg der Form über Substanz|struggle between form and substance]]: if we take it that, whatever your metaphor of choice, the “item” — the thing being filed — is the ''substance'' and the organising system it goes into is the ''form'', we can see that the desktop and the spreadsheet have fundamentally opposed philosophies.
 
What should matter to your organisation more? ''Substance'' or ''form''?
 
The desktop prioritises ''form'': the “item” is buried at the ''bottom'' of a rigid formal structure of folders and subfolders which, once created, cannot easily be altered. This is why it is so hard to find things you have misfiled. You cannot put anything into the database until you have fully specified its folder path.
 
By contrast, a spreadsheet prioritises ''substance''. The “item” is the first thing to go in the database, without any formal structure. It sits at the ''top'' of the structure. Only once it is ''in situ'' can you assign it any formal properties. The item therefore wears the properties we assign it ''lightly''. Its position and identity do not change if we later alter, remove or augment the values we assign to it.
 
==== Metadata ====
{{drop|E|ach desktop folder}} or spreadsheet column is ''[[metadata]]'' about the item being filed: literally, “information ''about'' our information”.<ref>Grammar pedants’ corner: Even though “data” is plural, “metadata” is generally treated as a singular mass noun. Please direct your letters to the Royal Statistical Society, not because it is their fault, but because they might keep metadata about this sort of thing.</ref>
 
A folder structure generates a very narrow, wan sort of metadata in the form of folder names: they are limited to a finite number of text characters. It is so limited that it is hardly worth thinking of it as metadata at all. Indeed, the Windows operating system could, but doesn’t, treat folder names as metadata, which is mad.
 
A spreadsheet, by contrast, uses metadata powerfully, putting few limits on what form it can take: text, calculable numbers and dates, checkboxes, people,<ref>As in, a lookup to an object in a people directory, and not just a text name.</ref> colours, flags, choices, lookups, comments, or calculations. It can be validated, managed, controlled, compulsory, optional, pre-populated or free-form. You can then filter, group, sort, chart, pivot and triangulate. Your imagination is the limit.
 
The more metadata you have, the more ways you can look at the data. Each separate value represents a new and distinct way of organising information. Even if the metadata is ''wrong'', the inconsistencies between it and other correct fields on the same item allow us to triangulate and identify problematic data. We can, in this way, generate metadata ''about'' metadata. This is ''meta''-[[metadata]].
 
==== The desktop clings on ====
{{drop|Y|et even in}} our modern, hyper-networked, cloud-based work environment, and even though we have had Microsoft Excel for nearly 40 years, the desktop [[metaphor]] hangs on.
 
We still call them “desktops”, though now for the prosaic reason that they generally are the only thing that sits on top of our desk. Look: the desktop was a nice, quaint idea. It got old geezers in green visors to sit down at keyboards (for that, the [[change manager]]s of the world can be truly grateful) but the metaphor has well outlived its purpose now.
 
It assumes each item to be filed has a unique physical location as if it were shelved in a physical library. Older readers may remember the [[Dewey decimal system]], which numbered the entire corpus of non-fiction wisdom from zero to 1,000.<ref>My favourite was [[001.9]].</ref>
 
When information is digital and has no physical dimension this is an unnecessary constraint. Duplicating items to suit multiple hierarchies creates ''[[basis]] risk''. Which was the canonical version of the document? How can we be sure they are the same? What happens if one, but not the other, gets updated?
 
Where the document is a “living thing,” plotting its own miserable trajectory through the cosmos — say, a contract under negotiation, or a maintained legal template — then running multiple copies multiplies the job of maintaining all copies as the document changes, and that introduces the risk of human error. There may be miskeys. A document may be forgotten. Version control is a pain.
 
Also, a preferred hierarchy can ''change''. Personnel, managers, business priorities, and circumstances change. They change the priorities of formal organisation. Changing your preferred hierarchy means ''completely re-engineering your folder structure''.
 
==== Substrate neutrality ====
{{drop|T|hese are all}} problems of the ''physical'' realm; the spreadsheet metaphor shows us we need not be troubled by them in the digital realm. Here, the ''physical'' “[[substrate]]” — the hard copy — is ''irrelevant''. What matters is the ASCII code embedded in it. In the digital realm, it has been abstracted and floats free of the substrate.
 
Across a [[diverse]] network of collaborators, the freedom to create multiple organising hierarchies on the fly, without upsetting other users, is immensely empowering.
 
Our reverence for the sacred substrate fell away, but not ''entirely''. We still revere [[Wet signature|wet ink]], [[counterparts]] clauses (for some reason), and the dear old desktop.
 
For still, as we file, ''we cannot resist the siren call of folders''. Folders in folders in folders in folders in folders.
 
Why do we persist with folders?
 
More than twenty years ago Tom Zingale taught a young JC a valuable lesson. Battling with some byzantine folder structure, and losing, JC cried out in anguish, and Tom said this:{{dialogue|
{{dia|JC|How on earth am I meant to organise all this?}}
{{dia|Tom|With [[metadata]].}}
{{dia|JC|Er, with ''what''?}}
{{dia|Tom|[[Metadata]]. The answer to your question is [[metadata]]. Metadata, [[metadata]], [[metadata]]. Whatever your question is, the answer is [[metadata]].}}
{{dia|JC|Well, my question is, “How do I use [[metadata]] to fix this filing problem?”}}
{{dia|Tom|Oh, right. Simple: [[SharePoint]].}}
}}
 
Wait: ''SharePoint''?
 
==== About SharePoint ====
{{Drop|N|ow a lot}} of good people viscerally ''hate'' [[SharePoint]]. To be sure, Microsoft seems to have gone out of its way to foment this hatred. It seems to have conducted a programme, over 20 years, to make [[Desktops, metadata and filing|SharePoint]] as hard to love as it can.
 
But at the same time, Microsoft has rebuilt its entire Office 365 suite around the SharePoint platform. It is, to be sure, ''monumentally confusing'', and the Teams integration is baffling. The utterly dismal online versions of its Office suite drive people righteously up the wall.
 
But, still, a good part of the enmity for [[SharePoint]] arises from the users’ basic misunderstanding of SharePoint’s fundamental architecture.
 
SharePoint is the first, philosophically, digitally native operating system. It has abandoned the [[Desktop|desktop]] metaphor. SharePoint uses the ''[[spreadsheet]]'' metaphor.
 
In SharePoint you organise by ''[[metadata]]'', not by ''folders''.
 
{{quote|DO NOT USE FOLDERS IN SHAREPOINT. DO NOT COMPLAIN THAT SHAREPOINT SUCKS IF YOU USE FOLDERS. IF YOU USE FOLDERS IN SHAREPOINT, THAT MEANS ''YOU'' SUCK, NOT SHAREPOINT.}}
 
Folders are top-down. Metadata is bottom-up. Folders prefer form over substance. Metadata prefers substance over form.
 
''SharePoint allows you to do exactly the same thing with a document library as Excel allows you to do with a spreadsheet''.
 
So it is odd — isn’t it? We intuitively understand the power of metadata when we are presented with a spreadsheet. But the same power does not occur to us when we are presented with a file management system. The desktop metaphor is burned on our retina.
 
Even though it is, in essence, a supercharged online spreadsheet, SharePoint continues to be resented by almost everyone.
 
{{sa}}
{{gb|[[Metadata]]<li>[[Excel]]<li>[[Dewey decimal system]]}}
{{ref}}
{{nld}}
{{nld}}

Latest revision as of 12:35, 29 September 2024

JC pontificates about technology
An occasional series.

The Jolly Contrarian holds forth™

What you see is what you got, yesterday: A Xerox Alto with a portrait monitor. Nice.
The Alto’s “desktop”
VisiCalc on the Apple II


The desktop

In 1973, Xerox’s Palo Alto Research Center released the “Alto”. This was the first personal computer equipped with a “graphical user interface” (GUI) — computing with pictures — instead of the traditional “character user interface”.

If potential users — bowler-hatted bureaucrats who didn’t use computers at all — were to be persuaded to give up their card catalogue systems, typing pools and reusable manila envelopes and instead stare at a screen all day, the system would need to look as familiar as possible.

To lessen the cognitive burden, Xerox came up with a visual metaphor. The Alto’s interface was modelled on a “desktop”. Instead of an impenetrable wall of green code and a flashing cursor, users were presented with a cartoonish depiction of a literal desktop and all its familiar iconography: documents, folders, a blotter, filing cabinets, in-trays, out-trays and even a dinky little waste-paper basket. [1]

All designed to reassure the meatware — as fearful of incipient obsolescence then as it is now — that the change journey from the comfy old analogue world to the coming atomic age would not be so bad after all.

The spreadsheet

In 1979, Dan Bricklin and Bob Frankston created a new application for the Apple II computer. They called it “VisiCalc”. It was a grid of cells into which you could enter numbers and text, and on which you could then run calculations by reference to cell coordinates. VisiCalc was the first spreadsheet program: a primitive ancestor to that beast we all now know and love as Microsoft Excel. VisiCalc’s brilliant innovation was to separate the data you wanted to run operations on (the numbers and text in the cells) from the operations themselves, which referenced just the cell coordinates, not the data inside the cell. You could therefore change the data without changing the calculation operations. VisiCalc was, in effect, a rudimentary form of programming language.

It might not have seemed much in 1979, but it would revolutionise business computing. While not nearly as intuitive as the Alto’s “desktop” (there was no graphical user interface or anything like that), VisiCalc was a much purer expression of what a personal computer could do. It promised even modest undertakings a powerful means of storing, augmenting, filtering, analysing and manipulating unprecedented amounts of information as structured data.
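
The separation of data from operations can be sketched in a few lines of Python. This is a toy illustration only, not how VisiCalc was actually built: the `cells` dictionary, the `get` helper and the cell names are all assumptions made up for the example. The point is that the formula in `A3` refers to coordinates, so the data can change without anyone touching the formula.

```python
# Toy grid: a cell holds either a plain value or a formula.
# A formula here is a function that looks up other cells by coordinate.
cells = {
    "A1": 120,
    "A2": 80,
    "A3": lambda get: get("A1") + get("A2"),  # refers to coordinates, not values
}


def get(ref):
    """Return a cell's value, evaluating it first if it holds a formula."""
    content = cells[ref]
    return content(get) if callable(content) else content


total_before = get("A3")

# Change the data only; the formula in A3 is left exactly as it was.
cells["A1"] = 200
total_after = get("A3")
```

The same operation gives a new answer for new data, which is roughly what is meant by calling a spreadsheet a rudimentary programming language.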

Good and bad metaphors

A spreadsheet is a much better way of thinking about how to organise digital information than a desktop because it is not constrained by physical space. Whereas the information on a traditional desktop is embedded in a physical “substrate” — usually paper — digital information has no such limitations. An empty spreadsheet stretches endlessly away in two directions:

Downwards: You can add items — “artefacts” — to your filing system without limit, unconstrained by the area of your desk or the volume of your filing cabinet. Each new artefact occupies a new row. There is an infinite number of rows.
Across: You can categorise each artefact however you like by creating new columns. There is no limit to the number of columns and no necessary hierarchy between them.

The desktop is designed to manage physical properties that digital information does not have. In printed information, paper is “form”, text is “substance”. A desktop must prioritise form because in a physical system substance cannot exist independently of form. Data must be printed on paper. Paper must be put somewhere. Unless you physically copy it, you can only put a piece of paper in one place at a time. Older readers may remember the Dewey decimal system, which numbered the entire corpus of non-fiction wisdom from zero to 1,000.[2] This addressed an exclusively physical problem. We don’t need the Dewey system any more.[3] We don’t need the desktop, either.

Furthermore, storing paper is expensive.[4] Copying and transporting paper is expensive, slow and “lossy”. The embedded data degrades. Each copy increases storage costs at the Iron Mountain.

Whereas a desktop subfolder is necessarily a sub-division of the folder it sits in, spreadsheet columns have no hierarchy and need not bear any relation to each other, as long as they all relate to the original artefact.

But digital information has (almost) no form.[5] It does not occupy physical space. It costs nothing to store. We can copy and move it costlessly, instantly, and with no loss of fidelity. At least when compared with physical information, digital information can be everywhere at once. We are not constrained by space or time when we store or move digital information. Yet to file it, we use a metaphor that assumes we are.

Division versus multiplication

In a “desktop” structure, subfolders are sub-divisions, each further level down more fine-grained and subordinate than the last, and less important relative to the formal hierarchy. A folder path is a rabbit hole. Hierarchy takes priority over the artefact. The hierarchy explains and contextualises everything. The more extensive the hierarchy, the less significant an individual artefact.

Whereas folders are divisive in nature, spreadsheet columns are multiplicative. All columns in a spreadsheet have equal standing — they are, well, pari passu — so they can be multiplied without limit: if an existing column, or an artful combination of them, doesn’t yield the information you need, you can always add more columns. In a spreadsheet, the artefact takes priority over the hierarchy. The hierarchy is incidental. Formal.
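
To make the contrast concrete, here is a minimal sketch, assuming a handful of invented documents and column names (none of them come from a real filing system). Each row is an artefact; each key is a column of equal rank. Any column can serve as the organising principle on demand, and adding a column disturbs nothing that is already there.

```python
from collections import defaultdict

# Folder model: each artefact lives at exactly one path, so one hierarchy wins.
folder_paths = {
    "isda_acme.docx": "Contracts/2024/Acme",
    "nda_acme.docx": "Contracts/2023/Acme",
    "isda_globex.docx": "Contracts/2024/Globex",
}

# Spreadsheet model: one row per artefact, any number of equal-ranking columns.
rows = [
    {"name": "isda_acme.docx", "type": "ISDA", "year": 2024, "client": "Acme"},
    {"name": "nda_acme.docx", "type": "NDA", "year": 2023, "client": "Acme"},
    {"name": "isda_globex.docx", "type": "ISDA", "year": 2024, "client": "Globex"},
]


def group_by(rows, column):
    """Group artefact names by the value found in the chosen column."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[column]].append(row["name"])
    return dict(groups)


by_client = group_by(rows, "client")
by_type = group_by(rows, "type")

# Multiplying columns: add a new one without moving or renaming anything.
for row in rows:
    row["status"] = "signed" if row["year"] < 2024 else "draft"
by_status = group_by(rows, "status")
```

In the folder model, seeing everything for Acme means walking two year folders, and regrouping by document type means rebuilding the tree. In the row model it is one call to `group_by` with a different column name.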

But naturally we like hierarchy. Hierarchies place objects in a permanent relationship to each other. This is comforting in a changing world. It feels tangible in a virtual realm that natively is not tangible. We naturally organise ourselves into hierarchies — for better or worse — so there is no surprise that we like our information served up with similar structure.

Flexibility is good for experts, virtuosi, and improvisers: it intimidates those who are not. Structure works better where we want dependability and reliability.

A front in the battle between substance and form

The desktop prioritises form.
A spreadsheet prioritises substance.

Here we find our old friend the struggle between form and substance: if we take it that, whatever your metaphor of choice, the “artefact” — the thing being filed — is the substance and the organising system it goes into is the form, we can see that the desktop and the spreadsheet have fundamentally opposed philosophies.

The desktop prioritises form: the artefact is buried at the bottom of a rigid formal structure of folders and subfolders which, once created, cannot easily be altered. This is why it is so hard to find things you have misfiled. You cannot put anything into the database until you have fully specified its folder path.

By contrast, a spreadsheet prioritises substance. The artefact is the first thing to go in the database, before any formal structure is applied to it.[6] It sits at the top of the structure. Only once it is in situ can you assign it any formal properties. The artefact therefore wears the properties we assign it lightly. Its position and identity do not change if we later alter, remove or augment the values we assign to it.
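
A small sketch of “artefact first, properties later”, under stated assumptions: the `library` dictionary and the `file_item`, `tag` and `untag` helpers are hypothetical names invented for this example, and the only mandatory metadata modelled is a file name. The artefact gets its identity the moment it is filed; everything assigned afterwards can be altered or removed without that identity changing.

```python
import itertools

_ids = itertools.count(1)  # hands out 1, 2, 3, ... as artefacts are filed
library = {}


def file_item(name):
    """File an artefact straight away, with no classification at all."""
    item_id = next(_ids)
    library[item_id] = {"name": name}
    return item_id


def tag(item_id, **properties):
    """Assign or overwrite formal properties on an artefact already filed."""
    library[item_id].update(properties)


def untag(item_id, column):
    """Remove a property; the artefact itself stays exactly where it is."""
    library[item_id].pop(column, None)


doc = file_item("master_agreement.docx")  # goes in first, unclassified
tag(doc, client="Acme", status="draft")   # properties arrive afterwards
tag(doc, status="signed")                 # and can be altered
untag(doc, "client")                      # or removed
```

After all that fiddling, `doc` still refers to the same entry. Compare a folder path, where reclassifying a document means moving it.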

Metadata

Each desktop folder or spreadsheet column is metadata about its artefact: literally, “information about information”.[7] A folder structure generates a limited, anaemic sort of metadata in the shape of 260-character folder labels: so limited that the Windows operating system does not treat folder names as metadata at all.[8] A spreadsheet, by contrast, imposes few limits on what form metadata can take: text, calculable numbers and dates, checkboxes, people,[9] colours, flags, choices, lookups, comments, or calculations. It can be validated, managed, controlled, compulsory, optional, pre-populated or free-form. You can filter, group, sort, chart, or pivot it.

The more metadata you have, the more ways you can look at your data. The more ways you can access your data. Each separate value represents a new and distinct way of organising information. Even if your metadata is wrong, you can triangulate the inconsistencies to identify problematic data. You can, in this way, generate metadata about metadata. This is meta-metadata.
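
Triangulation of this kind might look something like the sketch below. The rows, the column names and the consistency rule (a document should have a signing date if, and only if, its status is “signed”) are all assumptions invented for illustration. The derived `needs_review` column is metadata about the other metadata.

```python
rows = [
    {"name": "nda_acme.docx", "status": "signed", "signed_on": "2023-05-02"},
    {"name": "isda_acme.docx", "status": "draft", "signed_on": "2024-01-15"},
    {"name": "isda_globex.docx", "status": "signed", "signed_on": None},
    {"name": "csa_globex.docx", "status": "draft", "signed_on": None},
]


def is_consistent(row):
    """True when the row has a signing date exactly if its status is 'signed'."""
    return (row["status"] == "signed") == (row["signed_on"] is not None)


# Meta-metadata: a new column derived purely from the existing columns.
for row in rows:
    row["needs_review"] = not is_consistent(row)

flagged = [row["name"] for row in rows if row["needs_review"]]
```

Neither column says which value is wrong, only that the two disagree. That is usually enough to know where to look.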

The desktop clings on

Yet even in our modern, hyper-networked, cloud-based work environment, and even though we have had Microsoft Excel for nearly 40 years, the desktop metaphor hangs on. We still call them “desktops”, for the prosaic reason that they are the only thing still allowed on the desktop in our clear-desk, humans-as-fungible-cogs-in-the-machine modern office environment. (Is it any wonder firms are struggling to get staff back to the office, by the way?)

The desktop was a nice, quaint idea. It got old geezers in green visors to sit down at keyboards. For that, the change managers of the world can be grateful, but the metaphor has long since outstayed its welcome now. Enough already of the dinky desktop.

When information is digital and has no physical dimension, the desktop metaphor is an unnecessary constraint. Duplicating artefacts to suit multiple hierarchies creates basis risk. Which was the canonical version of the document? How can we be sure they are the same? What happens if one, but not the other, gets updated?
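
The risk is easy to demonstrate. In the sketch below, the paths and contents are invented, and hashing is simply one convenient way of comparing copies, offered as an assumption rather than a recommendation. Three copies of “the same” document sit in three hierarchies, and one has quietly been edited.

```python
import hashlib

# The "same" document, duplicated to suit three different hierarchies.
copies = {
    "Clients/Acme/master_agreement.txt": b"Governing law: England",
    "Templates/2024/master_agreement.txt": b"Governing law: England",
    "Archive/master_agreement.txt": b"Governing law: New York",
}


def digest(content: bytes) -> str:
    """Fingerprint the content so that copies can be compared cheaply."""
    return hashlib.sha256(content).hexdigest()


# Group paths by fingerprint: more than one group means the copies disagree.
versions = {}
for path, content in copies.items():
    versions.setdefault(digest(content), []).append(path)

copies_have_diverged = len(versions) > 1
```

The check can say the copies differ. It cannot say which one is canonical. A single artefact carrying three columns of metadata never poses the question.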

Where the document is a “living thing,” plotting its own miserable trajectory through the cosmos — say, a contract under negotiation, or a maintained legal template — then running multiple copies multiplies the job of maintaining all copies as the document changes, and that introduces the risk of human error. There may be miskeys. A document may be forgotten. Version control is a pain.

Also, a preferred hierarchy can change. Personnel, managers, business priorities, and circumstances change. They change the priorities of formal organisation. Changing your preferred hierarchy means completely re-engineering your folder structure.

Substrate neutrality

These are all problems of the physical realm; the spreadsheet metaphor shows us we need not be troubled by them in the digital realm. Here, the physical “substrate” (the hard copy) is irrelevant. What matters is the ASCII code embedded in it. In the digital realm, it has been abstracted and floats free of the substrate.

Across a diverse network of collaborators, the freedom to create multiple organising hierarchies on the fly, without upsetting other users, is immensely empowering.

Our reverence for the sacred substrate fell away, but not entirely. We still revere wet ink, counterparts clauses (for some reason), and the dear old desktop.

For still, as we file, we cannot resist the siren call of folders. Folders in folders in folders in folders in folders.

Why do we persist with folders?

More than twenty years ago Tom Zingale taught a young JC a valuable lesson. Battling with some byzantine folder structure, and losing, JC cried out in anguish, and Tom said this:

JC: How on earth am I meant to organise all this?

Tom: With metadata.

JC: Er, with what?

Tom: Metadata. The answer to your question is metadata. Metadata, metadata, metadata. Whatever your question is, the answer is metadata.

JC: Well, my question is, “How do I use metadata to fix this filing problem?”

Tom: Oh, right. Simple: SharePoint.

Wait: SharePoint?

About SharePoint

Now a lot of good people viscerally hate SharePoint. To be sure, Microsoft seems to have gone out of its way to foment this hatred. It seems to have conducted a programme, over 20 years, to make SharePoint as hard to love as it can.

But at the same time, Microsoft has rebuilt its entire Office 365 suite around the SharePoint platform. It is, to be sure, monumentally confusing, and the Teams integration is baffling. The utterly dismal online versions of its Office suite drive people righteously up the wall.

But, still, a good part of the enmity for SharePoint arises from the users’ basic misunderstanding of SharePoint’s fundamental architecture.

SharePoint is the first, philosophically, digitally native operating system. It has abandoned the desktop metaphor. SharePoint uses the spreadsheet metaphor.

In SharePoint you organise by metadata, not by folders.

DO NOT USE FOLDERS IN SHAREPOINT. DO NOT COMPLAIN THAT SHAREPOINT SUCKS IF YOU USE FOLDERS. IF YOU USE FOLDERS IN SHAREPOINT, THAT MEANS YOU SUCK, NOT SHAREPOINT.

Folders are top-down. Metadata is bottom-up. Folders prefer form over substance. Metadata prefers substance over form.

SharePoint allows you to do exactly the same thing with a document library as Excel allows you to do with a spreadsheet.

So it is odd — isn’t it? We intuitively understand the power of metadata when we are presented with a spreadsheet. But the same power does not occur to us when we are presented with a file management system. The desktop metaphor is burned on our retina.

Even though it is, in essence, a supercharged online spreadsheet, SharePoint continues to be resented by almost everyone.

See also

References

  1. The metaphor even extended to categories of data that did not exist in an analogue office: Emails were depicted as little envelopes with a stamp and a wax seal.
  2. My favourite was 001.9: “mysteries and the unexplained”.
  3. Entertainingly, courtesy of some well-meaning rube, the Dewey decimal system limped on in the information age, in the shape of webdewey. This may be the librarian’s equivalent of the Lehmans online Amish supplies store.
  4. All physical information is eventually destined for the Iron Mountain.
  5. Okay: almost no form. Compared with physical information. In this section take the word “almost” as read.
  6. Other than mandatory metadata about the act of filing: a date stamp, a file name, the person who filed it, etc.
  7. Grammar pedants’ corner: Even though “data” is plural, “metadata” is generally treated as a singular mass noun. Please direct your letters to the Royal Statistical Society — not because it is their fault: rather, they might keep metadata about this sort of thing.
  8. This, by the way, is mad.
  9. As in, a lookup to an object in a people directory, and not just a text name.