Intro
There are two ways of finding a file management solution:
- You define it 'in a lab'
- You learn it from someone
In both cases you end up with very specific scenarios in which the solution eventually detaches from your mental model.
Categorizing invoices by their invoice date makes sense, because that's how they arrive & how they're usually needed first: the monthly send-off to the accountant.
Invoices
- / 2024
- - / 01
- - - / 2024-01-13_apple-headphones_329712.pdf
- - / 02
- - / 03
- - ...
- - / 12
-
- / 2023
- - / 01
- - / 02
But this solution already falls apart when you need to find a specific type of invoice. You know it's from Apple, but you don't exactly remember the month, or even the year.
In this scenario you would want the structure to look like this:
Invoices
- / Apple
- - / Headphones
- - - / 2024-01-13_apple-headphones_329712.pdf
- - / Phones
- - / PCs
- ...
- / Google
- - / Subscriptions
- - / Phones
tl;dr
In such a scenario the solution falls apart, because its rigid structure can't provide what you need out of it at this moment.
Saving invoices by their invoice date makes sense, since that's often the most referred-to date when dealing with invoices. A potential file name schema could look like this:
Format
YYYY-MM-DD_description_invoice-nr.pdf
|description_invoice-nr_YYYY-MM-DD.pdf
In this example different bits of information (date, description, invoice nr.) are separated via `_`, while bits inside those are separated via `-`.
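As a rough sketch of how such a schema could be taken apart again: the names `InvoiceName` and `parse_invoice_name` are made up for illustration, and the pattern assumes that `_` never appears inside a description or invoice nr.

```python
import re
from dataclasses import dataclass
from datetime import date
from typing import List, Optional

# Hypothetical pattern for YYYY-MM-DD_description_invoice-nr.pdf.
# Assumption: "_" only ever separates the three fields.
NAME_PATTERN = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2})_(?P<description>[^_]+)_(?P<invoice_nr>[^_]+)\.pdf$"
)


@dataclass
class InvoiceName:
    invoice_date: date
    description: List[str]  # the "-" separated bits inside the description
    invoice_nr: str


def parse_invoice_name(file_name: str) -> Optional[InvoiceName]:
    """Return the parsed fields, or None if the name doesn't follow the schema."""
    match = NAME_PATTERN.match(file_name)
    if match is None:
        return None
    try:
        invoice_date = date.fromisoformat(match["date"])
    except ValueError:
        # Looks like a date but isn't one, e.g. month 13.
        return None
    return InvoiceName(
        invoice_date=invoice_date,
        description=match["description"].split("-"),
        invoice_nr=match["invoice_nr"],
    )
```

Calling `parse_invoice_name("2024-01-13_apple-headphones_329712.pdf")` yields the date 2024-01-13, the description bits `apple` and `headphones`, and the invoice nr. `329712`. Any name that deviates from the schema, like one using `-` instead of `_` right after the date, comes back as `None`, which already hints at how brittle a fixed format is.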
But what happens when you try to apply this naming format to regular bank statements, insurance documents etc.? There you might not have a specific date on the document, but rather a fiscal quarter or even just a year. Then your name format would have to accommodate all kinds of variations:
Format Variations
YYYY-QQ_description.pdf
YYYY-MM_description.pdf
YYYY_description.pdf
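To get a feeling for what "accommodating all variations" means in practice, here is a minimal sketch that turns each date prefix into the period it covers. The function name `period_from_name` is hypothetical, and it assumes the quarter is written as `Q1` to `Q4`.

```python
import re
from calendar import monthrange
from datetime import date
from typing import Optional, Tuple

# Hypothetical prefixes: YYYY, YYYY-MM, YYYY-MM-DD and YYYY-Qn.
# Assumption: "QQ" in the schema is written as "Q1".."Q4".
PREFIX_PATTERN = re.compile(
    r"^(?P<year>\d{4})"
    r"(?:-(?:Q(?P<quarter>[1-4])|(?P<month>\d{2})(?:-(?P<day>\d{2}))?))?_"
)


def period_from_name(file_name: str) -> Optional[Tuple[date, date]]:
    """Return the (first day, last day) covered by the date prefix, or None."""
    match = PREFIX_PATTERN.match(file_name)
    if match is None:
        return None
    year = int(match["year"])
    try:
        if match["quarter"]:
            first_month = 3 * int(match["quarter"]) - 2
            last_month = first_month + 2
        elif match["month"]:
            first_month = last_month = int(match["month"])
        else:
            first_month, last_month = 1, 12
        if match["day"]:
            day = date(year, first_month, int(match["day"]))
            return day, day
        last_day = monthrange(year, last_month)[1]
        return date(year, first_month, 1), date(year, last_month, last_day)
    except ValueError:
        # Prefix has the right shape but is not a real date, e.g. month 13.
        return None
```

Even this small helper already needs four branches, and every new kind of document (fiscal years that don't start in January, half-year reports etc.) would add another one.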
This is furthermore worsened by the inconsistency between allowed characters on different operating systems.
The date in the document might not be the most relevant date for that specific document.
You could be downloading 2023 Q4's Stock Portfolio Summary in January 2024.
Do you now sort this document into 2024/01, or into 2023/12, where it would be much better placed when you try to search for your "Q4 2023 stock summary"?
For this you would want to have two types of dates attached to each document (input + relevancy date).
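A minimal sketch of what that could look like as a data structure. The class and field names are made up, and modeling the relevancy date as a start/end period is an assumption on my part, so that a quarter or a whole year fits as well as a single day.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class StoredDocument:
    file_name: str
    input_date: date        # when the file entered the system
    relevancy_start: date   # first day of the period the content is about
    relevancy_end: date     # last day of that period


def relevant_in(documents, start: date, end: date):
    """Return the documents whose relevancy period overlaps [start, end]."""
    return [
        doc for doc in documents
        if doc.relevancy_start <= end and doc.relevancy_end >= start
    ]


summary = StoredDocument(
    file_name="stock-portfolio-summary.pdf",
    input_date=date(2024, 1, 8),        # downloaded in January 2024 (day is made up)
    relevancy_start=date(2023, 10, 1),  # Q4 2023
    relevancy_end=date(2023, 12, 31),
)
```

With this, a lookup for Q4 2023 via `relevant_in([summary], date(2023, 10, 1), date(2023, 12, 31))` finds the summary, while a lookup for January 2024 does not, even though that's when the file was saved.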
Result
It's in those moments, when your predefined system fails on either input or output, that it requires immediate attention & action to adjust the solution on the spot, so as not to break the flow of use.
Yet it's usually in exactly those moments that we have the least time & mental capacity to deal with it, which leads either to the rapid degradation of quality in our organization solution or to too much friction & the eventual fade-out of usage.
People have come up with all kinds of solutions for these problems. They mainly come in the form of additional meta data like tags, categories, descriptions, OCR/PDF text recognition to extract specific keywords from documents etc.
Fallacy
The idea is that adding more & finer meta data to a document will help on retrieval.
Adding tags like `Apple` & `Headphones`, and auto-extracting the document date, is supposed to help you find a file even if the organization system's structure isn't what you're looking for right now.
The problem is that for each of these bits of meta information & types of data you can attach, you need to remember how you structured them.
This added meta data can be very generic like Tags & Categories, or it can be very use case specific like Client, Billing Type, Invoice Status etc. Making the meta data more use case specific helps with aligning the stored structure with the mental model.
If you're trying to organize your outgoing invoices as a freelancer, being able to assign an invoice to a specific client, or to set an invoice status like pending or paid, is much closer to how you're storing that information in your head. If you're trying to remember whether you've already billed Client A for Project X, you're not asking your brain "which category is this invoice in, what tag did I mentally assign to that category+invoice combo?". You're asking your brain "which client did this Project X belong to? Oh, Client A. How much was Project X roughly quoted for? Probably something like 4k? Do I remember sending off an invoice for 4k this month?"
The more use case specific the meta data, generally the easier it is to align the organization structure with the mental model. At the same time it increases the number of different organization structures you need to remember, because you can't organize your vacation photos or your insurance documents by Client or Invoice Status.
Result
This leads either to remembering multiple complex structures, all with their own edge cases & nuances, or to relying on multiple premade solutions (apps/services) that have modeled a use case specific solution for you, making it harder & harder to keep all data in sync or have a single source for retrieval.
Even if you managed to build a program with very use case specific structures for each type of media/document you're organizing, classical file management UIs typically focus mainly on adding that data.
Because they need to be able to facilitate such a wide range of types of documents & types of meta data, their UIs are generally very generic, built from plain input fields, information in text labels, dropdowns etc.
They not only need to facilitate a wide range of documents & meta data; they also need to facilitate a wide range of mental models. The more users end up using your solution, the more different mental models you encounter & have to match. This means those solutions need to be made even more generic (like Obsidian) to allow each user to match their own mental model, which in turn puts more of the work of figuring things out on the user instead of taking work away from the user, which often is the initial premise/pitch of such a solution.
This becomes especially noticeable when dealing with living documents, as in note taking & project management solutions, where things like tags or categories can change as a document evolves. A user doesn't fully know what the document is going to be about when he starts to write, so tags/categories can only be applied once the document is done or the current working session has ended.
The problem is that this "gardening work" (of updating tags, creating new categories; realizing that now you've got two too similar categories, requiring you to unify them, go back to all other documents that had the old tag & rename/reapply the new tag etc.) isn't what people actually mean when they say "they want to organize their data".
What most people mean is they
"want to store & retrieve files with as little effort as possible, because they want to reduce the hassle of having to go through an annoying set of folders/hard drives etc."
The part that is ultimately required for the upkeep of a good organization solution is the exact part that most users want to avoid.
That's why file management solutions shouldn't focus on managing files, but on managing the "gardening work".
While giving the user a dropdown of categories to choose from is an effective way of allowing categorization during the input stage of document management, it is not a good way of retrieving documents from said system.
Emoji Tags Example
A very good way of organizing categories in this modern age is using Emojis for categories, because they skip the text-to-meaning part of our brain while making straight use of image-to-meaning, which is a lot faster.
You can use any tag/category system to create Unicode Emoji categories. This works fairly well on input: you just click on whatever image best describes the mental bit of information & don't have to remember all the exact categories that exist, because they are presented to you.
On retrieval this doesn't work as well. When trying to retrieve information you already kind of know what you're looking for, but you still need to scroll through the entire list of Emojis, because you don't exactly remember which Emoji you picked for "computer related invoices". Was it 💻/🖥️, or did you put it under 📱 because it was a tablet & there's no tablet Emoji?
During the input stage, the specificity of the available information is very high, because it is very fresh & you have the source material at hand. During the output/retrieval stage, it is very low, because you haven't seen the source material in forever & only have the rough mental representation that you can still remember.
I'm a Mess, so I'm Making My Own File Organizer [TagStudio] - YouTube
During my research I came across this video by Cyan Voxel, who was looking into organization solutions for a photo organization system he wanted to build. While he comes up with a clever tag/property system, his approach shows the first signs of failing at the Input Output chasm at the end of his video:
![[Pasted image 20240515164731.png]]
He says he wants to include tag related panels like "tag search", "recent tags", "top tags", "pinned tags". These UI components (built from generic UI) perfectly showcase the retrieval problem.
His tag/property system is so powerful, but also so complex, that the number of tags could quickly balloon, especially if you plan to organize a large number of file/media types in such a system. This makes it hard to remember which tags you've already added, could lead to duplicates, and requires a good amount of gardening work to keep clean & useful.
To combat this problem of being overwhelmed by the number of options to choose from, and to potentially help with remembering which tags you used in the past when trying to look up a document, he wants to introduce different ways of displaying the tags.
Top Tags & Pinned Tags both will eventually gravitate to be the same list.
Which Tags do you want to pin? – The ones you need the most often, it would be a waste of space to pin tags you only need once per year. By having a certain list of tags available at all times, it's very likely the user ends up using these tags the most often, because they're always in his mind & visually presented. This will lead to the pinned tags eventually being the most used tags & therefore duplicating themselves into the "Top Tags" panel.
Just a couple of weeks/months in, these two panels have become functionally identical, maybe differing by a couple of meta-tags.
"Recent Tags" will most likely also be filled with your most used ones, because tags that are used often tend to be among the last ones you've used.
To understand how a UI should be structured to help with the actual retrieval problem, we first need to define what the actual scenario looks like that such "tag panels" are trying to solve:
Problem Scenario
You've created a new document that you want to tag as related to "virtual reality".
So far you haven't stored a lot of VR related documents. You remember that you have at least one, but you don't remember if the tag was `vr`, `virtual reality`, or even if you used composed tags of `virtual` & `reality` as separate tags.
Problem To Solve
As a user I want to find out which tags in my system should be used for "virtual reality" related documents, with as little effort as possible.
The only currently proposed tag related panel that would help to solve this problem would be the "Tag Search".
It would require the user to manually search for `virtual` & see if any tags pop up. If the tag was `vr`, then no results would pop up, requiring a second search.
While TagStudio supports alternative names for tags & categories, so you could give "virtual" as an alias to the `vr` tag, this trick only works once & is as helpful as the "password hint" on a computer, which fails to do its job if you also forgot that hint or alternative name.
The ideal solution to the user's problem according to old paradigms would be something that automatically suggests the existing tag `vr` based on the document's contents, without the user having to use a search.
The 2nd best solution would be to have the search automatically return all semantically related tags, so that searching for `virtual` would automatically bring up tags like `vr`, `reality`, `digital`, `metaverse` etc. if they exist in the system.
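To make the shape of such a search a bit more concrete, here is a small sketch. It is not how TagStudio works: `RELATED_TERMS` is a hand-written, hypothetical stand-in for real semantic relatedness, which would presumably come from an embedding model or an LLM, and `search_tags` is a made-up name.

```python
from typing import List, Set

# Hypothetical, hand-written stand-in for "semantic relatedness".
# A real system would presumably ask an embedding model or an LLM instead.
RELATED_TERMS = {
    "virtual": {"vr", "virtual reality", "reality", "digital", "metaverse"},
    "vr": {"virtual", "virtual reality", "headset"},
}


def search_tags(query: str, existing_tags: Set[str]) -> List[str]:
    """Return existing tags that match the query literally or via a related term."""
    query = query.strip().lower()
    candidates = {query} | RELATED_TERMS.get(query, set())
    hits = {
        tag for tag in existing_tags
        if tag.lower() in candidates or query in tag.lower()
    }
    return sorted(hits)
```

With the existing tags `vr`, `invoice` & `digital`, a search for `virtual` returns `digital` and `vr`, even though neither tag contains the typed text. The important part is that only tags that already exist in the system are returned, so the user gets nudged towards reusing them instead of creating a duplicate.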
An even better solution to the user's problem would be if he didn't have to set any tags at all, but could just drop in the document about his VR-headset return confirmation. And when he wants to retrieve the file in the future, a search for "VR headset returned in 2024" would bring up the document in question.
Paradigm
The user doesn't care how it's organized. He just wants it to be organized.
Have a normal file organization UI, but always have tags, categories etc. suggested by a locally running LLM. When you try to add the tag "virtual", it will look in your system for existing/related tags & present them to you to pick from if you want. On save, it will automatically analyze the file & suggest new tags to add etc.
🚧
Now this could be solved via a well indexed search, but neither Windows nor macOS really provide such a well integrated & feature rich (or even working at all) search.
Most normal searches still require exact text entries, making a search for "apple airpods" not return any results on the example above, because it's only a contextual hit, not an actual one.
One would start to think that especially in the age of LLMs the ideal search query would look something like this:
Natural Language Query
I'm looking for an Apple invoice from somewhere around 2024, for a pair of AirPods Pro
But this natural sentence forming brings a lot of fluff with it. You could get the same information out in this query:
Effective Query
apple, somewhere 2024, airpods pro
You could store all files in a single folder. Have the LLM keep an organized database with relations, links, tags, categories, dates etc. You could have the LLM reveal its tags, categories if you want & make manual adjustments. You could prompt for any folder structure.
Example Folder Structure Prompt
I want all of my invoices organized by invoice date
Returned Folder Structure
Invoices
- / 2024
- - / 01
- - - best_suggested_file_name_format.pdf
- - / 02
- - ...
- / 2023
- - / 01
This would first filter for invoices only, then return them in an Explorer/Finder/Dropbox/Google Drive like folder structure. But the folders are completely made up on the spot by the LLM & are just visual trickery to show only a selection of files.
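The "visual trickery" part doesn't need an LLM to illustrate. Here is a minimal sketch, assuming the metadata already exists as simple records; the record fields and the second, third & fourth file names are made up for the example.

```python
from collections import defaultdict
from datetime import date

# Made-up records standing in for whatever the LLM-maintained database holds.
# All files live in one flat folder; only the metadata differs.
FILES = [
    {"name": "2024-01-13_apple-headphones_329712.pdf", "type": "invoice",
     "invoice_date": date(2024, 1, 13)},
    {"name": "2024-02-02_hosting_1001.pdf", "type": "invoice",
     "invoice_date": date(2024, 2, 2)},
    {"name": "2023-01-20_keyboard_0042.pdf", "type": "invoice",
     "invoice_date": date(2023, 1, 20)},
    {"name": "rental-contract.pdf", "type": "contract", "invoice_date": None},
]


def view_by_invoice_date(files):
    """Build {year: {month: [file names]}} without moving any file on disk."""
    tree = defaultdict(lambda: defaultdict(list))
    for record in files:
        if record["type"] != "invoice":
            continue  # the view only shows a selection of files
        invoice_date = record["invoice_date"]
        year, month = f"{invoice_date.year}", f"{invoice_date.month:02d}"
        tree[year][month].append(record["name"])
    return {year: dict(months) for year, months in tree.items()}
```

The returned dictionary is the whole "folder structure": nothing is created on disk, and the contract simply doesn't show up in this view.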
At any moment the user could prompt:
Example Folder Structure Prompt 2
all of my invoices, organized in folders by the company they're sent to, with subfolders for projects.
Returned Folder Structure 2
Invoices
- / Client A
- - / Project X
- - - best_suggested_file_name_format.pdf
- - / Project Y
- - ...
- / Client B
- - / Project Z
But you could also prompt for this:
Example Folder Structure Prompt 3
all of my invoices, organized by company with subfolders depending on the hours billed (0-50, 50-100, 100+)
Returned Folder Structure 3
Invoices
- / Client A
- - / 0-50
- - - best_suggested_file_name_format.pdf
- - / 50-100
- - / 100+
- ...
- / Client B
- - / 0-50
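Since every one of these prompts boils down to "group the same files by different keys", the idea can be sketched as one generic helper. Everything here is hypothetical: the records, the `hours` field, and the decision to put the boundary values 50 & 100 into the lower bucket, because the ranges in the prompt overlap there.

```python
# Made-up invoice records; "client" and "hours" are assumed metadata fields.
INVOICES = [
    {"name": "project-x.pdf", "client": "Client A", "hours": 32},
    {"name": "project-y.pdf", "client": "Client A", "hours": 120},
    {"name": "project-z.pdf", "client": "Client B", "hours": 50},
]


def hours_bucket(hours: float) -> str:
    """Map billed hours to a folder label (boundaries go to the lower bucket)."""
    if hours <= 50:
        return "0-50"
    if hours <= 100:
        return "50-100"
    return "100+"


def build_view(records, *keys):
    """Nest file names under one virtual folder level per key function.

    Expects at least one key function.
    """
    tree = {}
    for record in records:
        node = tree
        for key in keys[:-1]:
            node = node.setdefault(key(record), {})
        node.setdefault(keys[-1](record), []).append(record["name"])
    return tree


by_client_and_hours = build_view(
    INVOICES,
    lambda record: record["client"],
    lambda record: hours_bucket(record["hours"]),
)
```

Swapping the key functions is all it takes to get a different folder structure out of the same flat set of files. In the scenario described above, the LLM's job would be to translate the user's prompt into such keys & to supply the metadata they rely on.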