Desktops, metadata and filing: Difference between revisions

From The Jolly Contrarian
 
(21 intermediate revisions by the same user not shown)
Line 1: Line 1:
{{a|technology|}}SharePoint gets a lot of hate from people who don’t use it properly. To be sure, Microsoft has not made the job of learning how to use it easy — Microsoft’s design decisions across its platform are pretty weird, so we should not be surprised — but here is a basic rule of thumb:
{{a|technology|{{wmc|Xerox Alto with mouse and chorded keyset - Computer History Museum.jpg|What you see is what you get, yesterday}}}}====A bad metaphor====


{{quote|In SharePoint you organise by ''[[metadata]]'', not by ''folders''.}}
{{drop|I|n 1973, Xerox’s}} Palo Alto Research Center released the “Alto” personal computer. This was the first machine to boast a graphical user interface (GUI) instead of the traditional character user interface.<ref>It was well ahead of its time: the GUI would not become mainstream until Apple released its Macintosh a decade later, in 1984.</ref>


Folders are top-down. Metadata is bottom-up. Folders prefer form over substance. Metadata prefers substance over form.
To lessen the cognitive burden on users — at the time, bowler-hatted civil servants and sleeve-gartered clerks, whose mental framework was populated by mailboys running memoranda around the office in reusable envelopes, and whose idea of “information technology” was a {{pl|https://pneumatic.tube/the-lamson-pneumatic-tube-system-at-jacksons-of-reading-uk|pneumatic tube system}} that launched invoices around the organisation like mortar bombs — Xerox PARC’s designers created the metaphor of the “[[desktop]]”.
====Folders====
 
Folders are very old economy. The folder metaphor is, literally, based on physical artefacts that can only be in one place at any time. If I put this item in the “Litigation” folder, I can’t ''also'' put it in the “Knowledge Management” folder.  
{{wmcflex|Desktop icons for Xerox Star 8010.jpg|250px|right|The Xerox PARC Alto desktop}}Yes, you were looking at a computer screen. But ''on'' that screen was not an impenetrable wall of green code following a flashing cursor, but a cartoonish depiction of a ''literal'' desktop, with manila folders, a blotter, filing cabinets, in-trays and out-trays and even a dinky little wastebasket. All very familiar.
====A better metaphor====
{{drop|I|n 1979, Dan}} Bricklin and Bob Frankston created VisiCalc for the Apple II computer. VisiCalc was the first spreadsheet program, revolutionising computing by allowing even modest businesses easily to create and manipulate structured data.
 
{{wmcflex|Visicalc.png|250px|right|VisiCalc on the Apple II}}VisiCalc wasn’t nearly as dinky or intuitive as the desktop. It didn’t need a graphical user interface. It was a much purer expression of what a personal computer could do, though: it promised a powerful means of storing, augmenting, filtering, analysing and manipulating structured data. It is, of course, the ancestor of that beast we all now know and love as [[Excel|Microsoft Excel]].
====Why it’s a good metaphor====
{{drop|A|spreadsheet is}} a ''much'' better way of thinking about how to organise information on a computer than is a desktop. 
 
Being a ''conceptually'' infinite number of rows and columns — limited in practice, but these days not by much — a spreadsheet extends in two infinite directions: ''downwards,'' in that you can have any number of items in your filing system — each occupies a single ''row'' of a potentially infinite number of rows — and ''across'', in that you can categorise your list of items in as many different ways as your imagination affords, creating new ''columns'' for different categories each of which may, but need not, be related to the existing categories. If an existing column, or an artful combination, doesn’t yield the information you need, you can always add more.
 
==== Metadata ====
Each of these column categorisations is an item of [[Metadata|''metadata'']] (literally, “information ''about'' information”) about the item in the row.
 
Grammar pedant’s corner: Even though data is plural, metadata is generally treated as a singular mass noun. Please direct your letters to the Royal Statistical Society. Not because it is their fault: it is just that they might keep metadata about this sort of thing.
 
Metadata can take the form of dates, checkboxes, people, colours, flags, choices, lookups, comments, or calculations. It can be validated, managed, controlled, compulsory, optional, pre-populated or free-form.
 
Each extra piece of metadata enriches the existing data in the row without detracting from it.<ref>Indeed, even if the metadata is wrong, the inconsistencies between the fields allow a user to triangulate and identify likely wrong — or problematic — material</ref>
 
Metadata is, in this way, “non-destructive”. It only augments. Each metadata field creates its own way of ordering information. Each is its own hierarchy.
 
Suddenly, you can organise the same information in multiple different ways, simultaneously, without upsetting anyone else’s existing categorisation.
 
You can then filter and group your items by one or more columns. You can sort, chart and triangulate. The more metadata you have, the more ways you can look at the data.
 
You can even sort your data using data about how much metadata there is.
 
This is ''meta''[[metadata]].
 
The spreadsheet approach to file management is thus “multi-hierarchical” and non-destructive. What about unused metadata? Ignore it. Unused hierarchies are almost costless. And you just never know…
 
==== The desktop clings on ====
Yet in our modern, hyper-networked, cloud-based work environment the desktop [[metaphor]] hangs on. We still call them “desktops”, though now for the prosaic reason that they generally are the only thing that sits on top of our desk. The desktop was a nice, quaint idea, and it got old men in green visors to sit down at a keyboard, and for that the ranks of middle management can be truly grateful, but it has well outlived its purpose now.
 
Because ''physical'' information that sits on a real desktop, and ''digital'' information that sits on a computer are very different [[Ontology|ontological]] propositions.
 
The desktop metaphor asks us to put our files in folders, as we would do on a real desk. If a folder gets too big, we create subfolders. And, just as with a real desk, once we have put a file in one folder, we can’t very well put it anywhere ''else''. Just as with a real filing cabinet, if we misfile our subfolder, we might never find it again.
 
In the real world of physical information, that does no more than reflect grim corporeal reality: a thing can only be in one place at one time so that’s that. If the boss wants to file by customer, and you want to file by industry, then tough.
 
Physical filing systems reflect this: there is a unique physical location for any single document. So do physical filing ''methodologies'': older readers may remember the [[Dewey decimal system]], which numbered the entire corpus of non-fiction wisdom from 000 to 999.<ref>My favourite was [[001.9]].</ref>
 
If the same document did need to be categorised in different ways (say, the legal department needed to file by customer and the credit department by industry), this could only be achieved by ''duplicating'' the document and holding one version in each location. Legal would have a filing system, and credit would have another.
 
Plainly, this was an imperfect state of affairs. It created a [[Basis|''basis'']] ''risk''. Which was the canonical version of the document? What happened if one of them, but not the other, was updated? Where the document is a “living thing” plotting its own miserable trajectory through the cosmos (a contract under negotiation, or a periodically updated legal template, for example), ''duplicating it'' is a ''bummer''. It duplicates the manual task of updating all copies of the document as it changes, and that introduces the opportunity for human error. There may be miskeys. A document may be forgotten. Version control is a pain.


Where the same unitary item deserves to be in both folders, I must therefore ''duplicate'' it. Where it is a “living thing” plotting its own miserable trajectory through the cosmos — a contract under negotiation, or a periodically updated legal template for example — then ''duplicating it'' is a ''bummer''. It duplicates the manual task of updating all copies of the document as it changes, and that introduces the opportunity for human error. There may be miskeys. A document may be forgotten. Version control is a pain.  
In the physical realm, duplication was slow, imperfect and expensive, and so limited. At the time this seemed to be a drawback; with hindsight, it appears a valuable discipline.


Also your preferred hierarchy can ''change'', as personnel, business priorities, or circumstances change. Changing your hierarchy means ''completely re-engineering your folder structure''.
Also a preferred hierarchy can ''change'', as personnel, managers, business priorities, or circumstances change. Changing your preferred hierarchy means ''completely re-engineering your folder structure''. It would be lovely to not have to do that every time there is a corporate reorganisation.


So: a folder structure assumes a ''single'' hierarchy and multiple copies of each item.
So: a folder structure assumes a ''single'' hierarchy and multiple copies of each item.
====Metadata====
[[Metadata]] looks at the world the other way up. It says, “let there be a single canonical item, and multiple hierarchies.” Metadata allows you to non-destructively add hierarchies as you please. The more metadata fields you have, the more possible hierarchies there are. Unused hierarchies are almost costless.


Excel is a, well, ''excellent'' tool for managing metadata: Each row is an ''item'' and each column is a ''metadata point''. You can add additional columns as you see fit without impacting what is already there: newly added columns are ''non-destructive'' as they augment without affecting existing ones.
==== Substrate neutrality ====
These are all problems of the physical realm; the spreadsheet metaphor shows us we need not be so troubled in a digital realm. In the digital world, the ''physical'' “[[substrate]]” of a document — the paper it is made out of — is, to all intents and purposes, ''irrelevant''. What matters is the ASCII code embedded in that document. In a digital world it has been abstracted from the substrate and floats free. Within a [[diverse]] network of collaborators, this is immensely empowering.


In Excel you can filter sort and pivot by reference to any column in a table, in any order, and in doing so you impose a dynamic hierarchy on the items in the list. This is the magic of metadata.
It did not take people long to realise that ''email was amazing''.
 
From, more or less, a standing start in about 1993 (by lucky coincidence, the year JC entered the workforce) the corporate world fell head over heels in love with electronic communication. Whatever reverence it had for the sacred substrate fell quickly away.<ref> I have a lengthy essay about the gradual extraction of data from the substrate but can't for the life of me find it at the moment.</ref> The expression, “this document is not worth the paper it is written on” has lost its meaning because the paper it is written on ''no longer has much value at all''.
 
Now we recognise the digital content embedded in the substrate is the valuable bit; the paper bit is just annoying. It is an inconvenient reminder of our erstwhile physical analogue reality. A better metaphor than the “desktop” here is the ''spreadsheet''. A spreadsheet is, of course, a rudimentary form of database.
 
In a spreadsheet that inconvenient imposition of substrate has gone: a “document” is nothing more than an information string, more or less costless to generate, transport, replicate and store. By simply appending metadata, we can enrich it and put the same thing in several places at once. We transcend the Euclidean geometry of physical space.
 
Now, I said the corporate world’s reverence for the sacred substrate fell quickly away. It did not ''entirely'' fall away. We still revere [[Wet signature|wet ink]], [[counterparts]] clauses (for some reason) and the dear old desktop. For still, as we file, we cannot resist the siren call of folders. Folders in folders in folders in folders in folders. Why do we persist with folders?
 
More than twenty years ago Tom Zingale taught a young JC a valuable lesson. Battling with some byzantine folder structure, and losing, JC cried out in anguish, and Tom said this:{{dialogue|
{{dia|JC|How on earth am I meant to organise all this?}}
{{dia|Tom|With [[metadata]].}}
{{dia|JC|Er, with ''what''?}}
{{dia|Tom|[[Metadata]]. The answer to your question is [[metadata]]. Metadata, [[metadata]], [[metadata]]. Whatever your question is, the answer is [[metadata]].}}
{{dia|JC|Well, my question is, “How do I use [[metadata]] to fix this filing problem?”}}
{{dia|Tom|Oh, right. Simple: [[SharePoint]].}}
}}
==== About SharePoint ====
 
{{Drop|N|ow a lot}} of good people viscerally ''hate'' [[SharePoint]]. And, to be sure, Microsoft seems to have gone out of its way, over 20 years, to make SharePoint as hard to love as it can. But at the same time, it has based its entire Office 365 suite on the SharePoint platform. It is, to be sure, ''monumentally confusing''; the Teams integration is baffling. The utterly dismal online versions of its Office suite drive people righteously up the wall.
 
But, still, a good part of the enmity for [[SharePoint]] arises from this fundamental misunderstanding. SharePoint is the first, philosophically, digitally native operating system.
 
{{quote|SharePoint has abandoned the [[Desktop|desktop]] metaphor.
 
SharePoint uses the [[spreadsheet]] metaphor.
 
In SharePoint you organise by ''[[metadata]]'', not by ''folders''.
 
DO NOT USE FOLDERS IN SHAREPOINT.}}
 
Folders are top-down. Metadata is bottom-up. Folders prefer form over substance. Metadata prefers substance over form.


''SharePoint allows you to do exactly the same thing with a document library''.  
''SharePoint allows you to do exactly the same thing with a document library as Excel allows you to do with a spreadsheet''.  


We intuitively understand the power of metadata when we are presented with a spreadsheet. But the same power does not occur to us when we are presented with SharePoint, even though it is, in essence, a supercharged online spreadsheet.  
So it is odd — isn’t it? We intuitively understand the power of metadata when we are presented with a spreadsheet. But the same power does not occur to us when we are presented with a file management system. The desktop metaphor is burned on our retina.


It is as if we take a preconceived notion of a physical library with us, and ignore our understanding of spreadsheets.
Even though it is, in essence, a supercharged online spreadsheet, SharePoint continues to be resented by almost everyone.  


{{sa}}
{{sa}}
*[[Metadata]]
{{gb|[[Metadata]]<li>[[Excel]]<li>[[Dewey decimal system]]}}
{{ref}}
{{nld}}

Latest revision as of 20:49, 23 September 2024

JC pontificates about technology
An occasional series.
What you see is what you get, yesterday

A bad metaphor

In 1973, Xerox’s Palo Alto Research Center released the “Alto” personal computer. This was the first machine to boast a graphical user interface (GUI) instead of the traditional character user interface.[1]

To lessen the cognitive burden on users — at the time, bowler-hatted civil servants and sleeve-gartered clerks, whose mental framework was populated by mailboys running memoranda around the office in reusable envelopes, and whose idea of “information technology” was a pneumatic tube system that launched invoices around the organisation like mortar bombs — Xerox PARC’s designers created the metaphor of the “desktop”.

The Xerox PARC Alto desktop

Yes, you were looking at a computer screen. But on that screen was not an impenetrable wall of green code following a flashing cursor, but a cartoonish depiction of a literal desktop, with manila folders, a blotter, filing cabinets, in-trays and out-trays and even a dinky little wastebasket. All very familiar.

A better metaphor

In 1979, Dan Bricklin and Bob Frankston created VisiCalc for the Apple II computer. VisiCalc was the first spreadsheet program, revolutionising computing by allowing even modest businesses easily to create and manipulate structured data.

VisiCalc on the Apple II

VisiCalc wasn’t nearly as dinky or intuitive as the desktop. It didn’t need a graphical user interface. It was a much purer expression of what a personal computer could do, though: it promised a powerful means of storing, augmenting, filtering, analysing and manipulating structured data. It is, of course, the ancestor of that beast we all now know and love as Microsoft Excel.

Why it’s a good metaphor

A spreadsheet is a much better way of thinking about how to organise information on a computer than is a desktop.

Being a conceptually infinite number of rows and columns — limited in practice, but these days not by much — a spreadsheet extends in two infinite directions: downwards, in that you can have any number of items in your filing system — each occupies a single row of a potentially infinite number of rows — and across, in that you can categorise your list of items in as many different ways as your imagination affords, creating new columns for different categories each of which may, but need not, be related to the existing categories. If an existing column, or an artful combination, doesn’t yield the information you need, you can always add more.

Metadata

Each of these column categorisations is an item of metadata (literally, “information about information”) about the item in the row.

Grammar pedant’s corner: Even though data is plural, metadata is generally treated as a singular mass noun. Please direct your letters to the Royal Statistical Society. Not because it is their fault: it is just that they might keep metadata about this sort of thing.

Metadata can take the form of dates, checkboxes, people, colours, flags, choices, lookups, comments, or calculations. It can be validated, managed, controlled, compulsory, optional, pre-populated or free-form.

Each extra piece of metadata enriches the existing data in the row without detracting from it.[2]

Metadata is, in this way, “non-destructive”. It only augments. Each metadata field creates its own way of ordering information. Each is its own hierarchy.
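
To make “non-destructive” concrete, here is a minimal Python sketch. The file names and the `governing_law` column are invented for illustration, and plain dictionaries stand in for rows; this is not how Excel or SharePoint stores anything, just the shape of the idea.

```python
# Hypothetical rows: each dict is one item, each key one metadata column.
rows = [
    {"name": "ISDA - Acme.docx", "customer": "Acme"},
    {"name": "ISDA - Bolt.docx", "customer": "Bolt"},
]
snapshot = [dict(row) for row in rows]  # copy of the state before the change

# Add a new, initially empty column; setdefault never overwrites an existing value.
for row in rows:
    row.setdefault("governing_law", None)

# Every pre-existing field is exactly as it was before the column arrived.
unchanged = all(
    row[key] == old[key] for row, old in zip(rows, snapshot) for key in old
)
```

The new column augments each row and leaves the old categorisation alone, which is all “non-destructive” means here.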

Suddenly, you can organise the same information in multiple different ways, simultaneously, without upsetting anyone else’s existing categorisation.

You can then filter and group your items by one or more columns. You can sort, chart and triangulate. The more metadata you have, the more ways you can look at the data.
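
A rough sketch of that filtering and grouping, in Python. The documents, the field names (`customer`, `industry`, `status`) and the helper `group_by` are all assumptions made up for the example, not any real product’s API.

```python
from collections import defaultdict

# Hypothetical library: one row per document, one key per metadata column.
documents = [
    {"name": "ISDA - Acme.docx", "customer": "Acme", "industry": "Mining", "status": "Draft"},
    {"name": "NDA - Acme.docx", "customer": "Acme", "industry": "Mining", "status": "Signed"},
    {"name": "ISDA - Bolt.docx", "customer": "Bolt", "industry": "Retail", "status": "Signed"},
]

def group_by(items, field):
    """Impose one hierarchy on the same items by grouping on a metadata field."""
    groups = defaultdict(list)
    for item in items:
        groups[item.get(field, "(blank)")].append(item["name"])
    return dict(groups)

by_customer = group_by(documents, "customer")  # one view of the library
by_industry = group_by(documents, "industry")  # another view, same items
signed = [d["name"] for d in documents if d["status"] == "Signed"]  # a filter
```

Both groupings are computed on demand from the same three rows; neither disturbs the other, and nothing is copied or moved.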

You can even sort your data using data about how much metadata there is.

This is metametadata.
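
For the sceptical, a small Python sketch of metametadata in action. The rows are hypothetical, `None` is assumed to mean “nobody has filled this in”, and `metadata_count` is a name invented for the example.

```python
# Hypothetical rows; None marks a metadata field nobody has filled in.
documents = [
    {"name": "Template.docx", "customer": None, "industry": None, "owner": "Legal"},
    {"name": "ISDA - Acme.docx", "customer": "Acme", "industry": "Mining", "owner": "Legal"},
    {"name": "Scan0001.pdf", "customer": None, "industry": None, "owner": None},
]

def metadata_count(item):
    """Count populated metadata fields, ignoring the item's own name."""
    return sum(1 for key, value in item.items() if key != "name" and value is not None)

# Sort by how much metadata each item carries, richest first.
ranked = sorted(documents, key=metadata_count, reverse=True)
ranked_names = [d["name"] for d in ranked]
```

The poorly described items sink to the bottom, which is a handy way of finding the ones that need attention.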

The spreadsheet approach to file management is thus “multi-hierarchical” and non-destructive. What about unused metadata? Ignore it. Unused hierarchies are almost costless. And you just never know…

The desktop clings on

Yet in our modern, hyper-networked, cloud-based work environment the desktop metaphor hangs on. We still call them “desktops”, though now for the prosaic reason that they generally are the only thing that sits on top of our desk. The desktop was a nice, quaint idea, and it got old men in green visors to sit down at a keyboard, and for that the ranks of middle management can be truly grateful, but it has well outlived its purpose now.

Because physical information that sits on a real desktop, and digital information that sits on a computer are very different ontological propositions.

The desktop metaphor asks us to put our files in folders, as we would do on a real desk. If a folder gets too big, we create subfolders. And, just as with a real desk, once we have put a file in one folder, we can’t very well put it anywhere else. Just as with a real filing cabinet, if we misfile our subfolder, we might never find it again.

In the real world of physical information, that does no more than reflect grim corporeal reality: a thing can only be in one place at one time so that’s that. If the boss wants to file by customer, and you want to file by industry, then tough.

Physical filing systems reflect this: there is a unique physical location for any single document. So do physical filing methodologies: older readers may remember the Dewey decimal system, which numbered the entire corpus of non-fiction wisdom from 000 to 999.[3]

If the same document did need to be categorised in different ways (say, the legal department needed to file by customer and the credit department by industry), this could only be achieved by duplicating the document and holding one version in each location. Legal would have a filing system, and credit would have another.

Plainly, this was an imperfect state of affairs. It created a basis risk. Which was the canonical version of the document? What happened if one of them, but not the other, was updated? Where the document is a “living thing” plotting its own miserable trajectory through the cosmos (a contract under negotiation, or a periodically updated legal template, for example), duplicating it is a bummer. It duplicates the manual task of updating all copies of the document as it changes, and that introduces the opportunity for human error. There may be miskeys. A document may be forgotten. Version control is a pain.

In the physical realm, duplication was slow, imperfect and expensive, and so limited. At the time this seemed to be a drawback; with hindsight, it appears a valuable discipline.

Also a preferred hierarchy can change, as personnel, managers, business priorities, or circumstances change. Changing your preferred hierarchy means completely re-engineering your folder structure. It would be lovely to not have to do that every time there is a corporate reorganisation.

So: a folder structure assumes a single hierarchy and multiple copies of each item.
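
The contrast can be sketched in a few lines of Python. Everything here is hypothetical: the folder paths, the contract and its fields are invented, and dictionaries stand in for both the filing cabinets and the metadata-driven library.

```python
import copy

# Folder model (hypothetical): each department keeps its own copy of the contract.
contract = {"title": "Acme master agreement", "version": 1}
folders = {
    "Legal/By customer/Acme": copy.deepcopy(contract),
    "Credit/By industry/Mining": copy.deepcopy(contract),
}
# Legal updates its copy; credit's copy silently goes stale.
folders["Legal/By customer/Acme"]["version"] = 2
versions_in_folders = {path: doc["version"] for path, doc in folders.items()}

# Metadata model (hypothetical): one canonical item, tagged both ways.
library = [
    {"title": "Acme master agreement", "version": 1,
     "customer": "Acme", "industry": "Mining"},
]
library[0]["version"] = 2  # one update, seen by every view
legal_view = [d for d in library if d["customer"] == "Acme"]
credit_view = [d for d in library if d["industry"] == "Mining"]
```

In the folder model the two copies now disagree about the version. In the metadata model both views point at the same single item, so there is nothing to reconcile.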

Substrate neutrality

These are all problems of the physical realm; the spreadsheet metaphor shows us we need not be so troubled in a digital realm. In the digital world, the physical “substrate” of a document (the paper it is made out of) is, to all intents and purposes, irrelevant. What matters is the ASCII code embedded in that document. In a digital world it has been abstracted from the substrate and floats free. Within a diverse network of collaborators, this is immensely empowering.

It did not take people long to realise that email was amazing.

From, more or less, a standing start in about 1993 (by lucky coincidence, the year JC entered the workforce) the corporate world fell head over heels in love with electronic communication. Whatever reverence it had for the sacred substrate fell quickly away.[4] The expression, “this document is not worth the paper it is written on” has lost its meaning because the paper it is written on no longer has much value at all.

Now we recognise the digital content embedded in the substrate is the valuable bit; the paper bit is just annoying. It is an inconvenient reminder of our erstwhile physical analogue reality. A better metaphor than the “desktop” here is the spreadsheet. A spreadsheet is, of course, a rudimentary form of database.

In a spreadsheet that inconvenient imposition of substrate has gone: a “document” is nothing more than an information string, more or less costless to generate, transport, replicate and store. By simply appending metadata, we can enrich it and put the same thing in several places at once. We transcend the Euclidean geometry of physical space.

Now, I said the corporate world’s reverence for the sacred substrate fell quickly away. It did not entirely fall away. We still revere wet ink, counterparts clauses (for some reason) and the dear old desktop. For still, as we file, we cannot resist the siren call of folders. Folders in folders in folders in folders in folders. Why do we persist with folders?

More than twenty years ago Tom Zingale taught a young JC a valuable lesson. Battling with some byzantine folder structure, and losing, JC cried out in anguish, and Tom said this:

JC: How on earth am I meant to organise all this?

Tom: With metadata.

JC: Er, with what?

Tom: Metadata. The answer to your question is metadata. Metadata, metadata, metadata. Whatever your question is, the answer is metadata.

JC: Well, my question is, “How do I use metadata to fix this filing problem?”

Tom: Oh, right. Simple: SharePoint.

About SharePoint

Now a lot of good people viscerally hate SharePoint. And, to be sure, Microsoft seems to have gone out of its way, over 20 years, to make SharePoint as hard to love as it can. But at the same time, it has based its entire Office 365 suite on the SharePoint platform. It is, to be sure, monumentally confusing; the Teams integration is baffling. The utterly dismal online versions of its Office suite drive people righteously up the wall.

But, still, a good part of the enmity for SharePoint arises from this fundamental misunderstanding. SharePoint is the first, philosophically, digitally native operating system.

SharePoint has abandoned the desktop metaphor.

SharePoint uses the spreadsheet metaphor.

In SharePoint you organise by metadata, not by folders.

DO NOT USE FOLDERS IN SHAREPOINT.

Folders are top-down. Metadata is bottom-up. Folders prefer form over substance. Metadata prefers substance over form.

SharePoint allows you to do exactly the same thing with a document library as Excel allows you to do with a spreadsheet.

So it is odd — isn’t it? We intuitively understand the power of metadata when we are presented with a spreadsheet. But the same power does not occur to us when we are presented with a file management system. The desktop metaphor is burned on our retina.

Even though it is, in essence, a supercharged online spreadsheet, SharePoint continues to be resented by almost everyone.

See also

References

  1. It was well ahead of its time: the GUI would not become mainstream until Apple released its Macintosh a decade later, in 1984.
  2. Indeed, even if the metadata is wrong, the inconsistencies between the fields allow a user to triangulate and identify likely wrong — or problematic — material
  3. My favourite was 001.9.
  4. I have a lengthy essay about the gradual extraction of data from the substrate but can't for the life of me find it at the moment.