Conceptual background
In which I describe the thinking that underlies the system named Restagraph.
What makes this system different from the others
"Why don't you tell her... why you are the way you are."
Restagraph is designed around the idea of separating what a thing is from its relationships to other things. You might say that it distinguishes between a thing's intrinsic characteristics and the meaning derived from its context.
The most significant part of this is that one resource can have the same type of relationship to several other resources, and it can have several different kinds of relationships with a single other resource.
That is, a person might have written a book or an article - it's the same relationship, but with different kinds of things as its object.
Conversely, a person might have written a book or be reading a book. Or both - possibly even the same book. Different relationships to the same kind of thing; sometimes to the same thing.
This enables Restagraph to represent the real world as faithfully as possible.
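To make the distinction concrete, here is a minimal in-memory sketch in Python. It is not Restagraph's implementation or API: the `Resource` and `Graph` classes, the resourcetype names and the relationship names `WROTE` and `READING` are all hypothetical, chosen only to mirror the person/book/article example above.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Resource:
    """A thing, described only by what it is (hypothetical model)."""
    resourcetype: str  # e.g. "People" or "Books"; names are illustrative
    uid: str


@dataclass
class Graph:
    """Relationships are stored separately from the resources they connect."""
    edges: set = field(default_factory=set)

    def link(self, source, relationship, target):
        # Each edge is a (source, relationship, target) triple.
        self.edges.add((source, relationship, target))

    def targets(self, source, relationship):
        # Everything that `source` has this kind of relationship to.
        return {t for (s, r, t) in self.edges
                if s == source and r == relationship}

    def relationships(self, source, target):
        # Every kind of relationship from `source` to `target`.
        return {r for (s, r, t) in self.edges
                if s == source and t == target}


alex = Resource("People", "Alex")
book = Resource("Books", "A_Book")
article = Resource("Articles", "An_Article")

graph = Graph()
graph.link(alex, "WROTE", book)      # same relationship...
graph.link(alex, "WROTE", article)   # ...to a different kind of thing
graph.link(alex, "READING", book)    # different relationship, same thing

wrote = graph.targets(alex, "WROTE")
to_book = graph.relationships(alex, book)
```

Here `wrote` contains both the book and the article, while `to_book` contains both relationship names. Nothing about the person or the book had to change to express either fact.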
History, or "why even build this system?"
I wanted a system flexible enough to represent whatever things happen to be (wanted) in an IT environment, and whatever kinds of interconnections happen to exist (or be wanted) between them. I also found it valuable to separate a thing's identity from its relationships to the things around it - so instead of having "vendors" and "customers" you have people and organisations that can have "vendor" and/or "customer" relationships between them, allowing for one organisation to be both vendor and customer to another. The result was a system catalogue, which I called Syscat because I have no imagination.
I also wanted something with an API that was easy to build automation on. This meant it needed a predictable, regular form with the fewest feasible number of exceptions.
All these factors led to a system built on the concepts described in this document.
It also turned out to be a general-purpose knowledge-management engine in its own right, and that's how Restagraph came to be. Syscat is just Restagraph with a schema suited to the IT infrastructure domain.
Now, on with the key ideas that this system implements:
Other considerations
Sometimes it's useful to know what reasoning lies behind the design decisions. In no particular order, these are some of the more important things behind Restagraph being designed the way it is.
Designing a tool to extend the mind
This was always intended as a kind of second brain - as an extension of the mind, which happens to be implemented via a computer.
Although I seriously considered building Syscat on RDF, it was clear that RDF and I were approaching the problem of knowledge-management from opposite directions. RDF aims to make it as easy as possible to apply machine-reasoning to a body of data: a computer digests it according to our directions, and returns data for the user to inspect.
What I've tried to build is a tool that expands a person's ability to understand and digest a large body of information themself, increasing that person's knowledge and their ability to reason about it.
I also designed it to be very automation-friendly, but again I see that as an extension of the mind, where the automation just codifies the result of your understanding. To the extent that I made it easier for somebody to apply machine-learning to a database, that was an unintended side-effect.
Thoughtfulness
This is a system for hair-splitters, designed by a hair-splitter.
To properly use this system, you need to think carefully about the distinctions and connections you're making. E.g., in IT infrastructure, where most of the definitions are clear-cut, it's still easy to confuse "what should be" with "what is."
To construct your own subschemas, you need to think deeply as well as carefully. This won't be for everybody, and something like Obsidian will be a better fit for many people. This isn't a swipe at Obsidian - it has a great reputation, and not everybody needs this extra level of structure.
Front-loading the work
Broadly speaking, there are two approaches when it comes to imposing structure on an assortment of data:
- Chuck it all in there, then sift through it later to find whatever patterns emerge, probably with the help of machine-learning.
- Think through the structure you want ahead of time, then process the data accordingly as you put it into the database.
Both approaches have their place, though the second requires that you're already familiar with the subject area.
With this system, I optimised for the second approach. It's sometimes referred to as "front-loading," because it means you do most or all of the processing work up-front, rather than leaving it until later. I say "optimised for" rather than "opted for" because I also designed this system to allow the schema to be updated as your understanding of things changes, so it can evolve in response to new insights.
The first reason for this is that, in building Syscat, I was addressing a domain that already has well-established structure and definitions: IT infrastructure. The general-purpose engine was never part of the original design; it just emerged as the only practical way of building the thing.
Then, as I realised I'd built a general-purpose knowledge-management engine, it dovetailed with my desire for a more structured KMS than a vanilla wiki. In this domain, I firmly believe that if you digest the information on the way in, it repays the effort in several ways:
- You understand the new material more deeply.
- You remember it more easily, making it less necessary to consult the KMS.
- When you do need to consult the KMS, it's easier to find what you need.
Of course, you can progressively refine the information as your understanding changes over time, and you can also (and I did) create a Notes resourcetype for quickly recording stuff for later digestion.
Inclusion
Making pronouns part of the core schema was a deliberate decision.
If somebody balks at using this system because "ew, pronouns!" then it's doing its job. That's the kind of person I don't want benefitting from all this work.
Also, it's basic personal respect, and nobody's forcing you to use them in here anyway.
Allow for conflicting schemas
Another reason I decided not to adopt RDF was the underlying ethos of "one giant, distributed database."
There are practical reasons not to force everybody into using the same set of definitions: some disciplines simply use different terms to refer to the same thing, and sometimes you need to adapt the schema to an existing body of terminology, whether in knowledge and documentation or in legacy systems that your organisation depends on.
Why "restagraph"?
- It needed a name, and I'm bad at naming things.
- I was putting a REST-like API in front of a graph database: REST+graph -> Restagraph.