Conceptual overview
Before getting into the technical details of how the API works, it'll be useful to understand the thinking that underpins Restagraph. It's sort-of like REST, but veers off in its own direction.
The key difference, to emphasise before anything else, is that Restagraph regards the relationships between things as important in their own right.
Resources and resourcetypes
The "things."
If you're familiar with object-oriented programming, these map neatly enough onto classes and instances.
Resourcetypes are the abstract idea of a "type of thing," including its attributes. A person might have a display-name, year of birth, and a field for recording notes about them; Wikipages like this one have Title and Text, where the Title enables you to give it a friendlier, more readable title than the URI constraints allow (like non-ASCII characters).
Resources are concrete instances of a type - a specific person, or a specific book. A resource is identified by a combination of its resourcetype and its UID (Unique IDentifier).
Taking inspiration from the REST architecture, I use a convention of plural names for resourcetypes, with CamelCase where names are composed of two or more words.
UIDs are sanitised when resources are created, and the server includes the sanitised UID in its response. It actually returns the full identifying path, as per the examples earlier in this section.
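To make the resourcetype-plus-UID idea concrete, here's a minimal sketch of building an identifying path for a primary resource. The sanitisation rule is an assumption on my part (each run of whitespace becomes one underscore, going by UIDs such as Practical_Common_Lisp); the server's actual rules aren't described here, and both function names are hypothetical.

```python
import re


def sanitise_uid(raw: str) -> str:
    """Hypothetical rule: trim, then turn each run of whitespace into one underscore.

    The real server's sanitisation rules may well differ.
    """
    return re.sub(r"\s+", "_", raw.strip())


def resource_path(resourcetype: str, raw_uid: str) -> str:
    """Build the identifying path for a primary resource: /<resourcetype>/<UID>."""
    return f"/{resourcetype}/{sanitise_uid(raw_uid)}"


print(resource_path("Books", "Practical Common Lisp"))
```

Whatever the real rules turn out to be, the point stands: use the path the server returns, not the UID you sent.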
Dependent resources
Some things exist only in the context of some other thing, like chapters of a book or floors of a building.
In Restagraph, the "dependent" attribute of a resourcetype signifies whether it represents this kind of thing. Dependent resourcetypes can depend on other dependent types, and Restagraph puts no limit on how deeply you can nest them. This enables you to describe something in terms of its constituent parts (and their constituent parts, and so on), which is the intended way of thinking about them.
A resource that isn't dependent is called a primary resource.
Primary and dependent resources work the same way in all other respects.
Resource versions
Each time you update the user-definable attributes of a resource, the server creates a new version of its attribute-set. You can query the server for the list of versions of a resource, fetch the contents of a given version, and set the "current" version to be any of those. This enables you to roll back and forward between versions.
When you create a new version, you can record a comment describing the change, or the reason for it being made. When you're looking at a resource's history, this is more informative than the timestamp for each version, so I recommend using this feature.
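The versioning behaviour can be modelled in a few lines. This is an in-memory sketch only, not the server's implementation: the class and method names are hypothetical, and versions are identified here by list index, whereas the server's own identifiers may differ.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Version:
    attributes: Dict[str, str]
    comment: str = ""  # why the change was made


@dataclass
class VersionedResource:
    versions: List[Version] = field(default_factory=list)
    current: Optional[int] = None  # index of the "current" version

    def update(self, attributes: Dict[str, str], comment: str = "") -> int:
        """Each update appends a new version and makes it current."""
        self.versions.append(Version(dict(attributes), comment))
        self.current = len(self.versions) - 1
        return self.current

    def set_current(self, index: int) -> None:
        """Roll back or forward by pointing 'current' at an existing version."""
        if not 0 <= index < len(self.versions):
            raise IndexError(f"no such version: {index}")
        self.current = index

    def current_attributes(self) -> Dict[str, str]:
        if self.current is None:
            raise LookupError("resource has no versions yet")
        return self.versions[self.current].attributes
```

Note that rolling back doesn't discard anything: the later versions are still in the list, so you can roll forward again.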
Relationships
As you'd expect, these are named relationships between things. Note that they're directional - each relationship is from one specific resource, to another specific resource. In most cases, one relationship is complementary to another, reflecting how things work in the real world.
In Syscat, a relationship has a few attributes:
- Its name.
- The set of source resourcetypes, i.e. what types of resource this relationship can be from.
- The set of target resourcetypes, i.e. what types of resource this relationship can be to.
  - It's entirely legal for the same resourcetype to appear in both sets, e.g. to describe a relationship from one person to another.
- Cardinality. A relationship can be defined as being one of the following types:
  - Many to many.
    - This is the default.
  - One to many.
  - Many to one.
  - One to one.
    - This is useful for managing sets of attributes that might only exist under some circumstances. A computing device might have a 1:1 mapping to a set of SNMP attributes and/or a set of OSquery attributes, and adding all those fields directly to the Devices resourcetype would make it very unwieldy.
- Its type:
  - "dependent" means that it identifies the manner in which that resource depends on this one.
  - "self" means that both the source and target resources share a primary resource, so they're both part of the same composite thing.
  - "other" means that the source and target resources belong to (or are) different primary resources.
  - "any" means that none of the above three restrictions apply.
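The attributes listed above map naturally onto a small data structure. This is a sketch for illustration, not the schema format: the class names, field names and enum values are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet


class Cardinality(Enum):
    # The value strings are illustrative only.
    MANY_TO_MANY = "many:many"
    ONE_TO_MANY = "1:many"
    MANY_TO_ONE = "many:1"
    ONE_TO_ONE = "1:1"


@dataclass(frozen=True)
class RelationshipDef:
    name: str
    source_types: FrozenSet[str]
    target_types: FrozenSet[str]
    reltype: str  # "dependent", "self", "other" or "any"
    cardinality: Cardinality = Cardinality.MANY_TO_MANY  # the stated default

    def permits(self, source_type: str, target_type: str) -> bool:
        """True if this relationship may go from source_type to target_type."""
        return source_type in self.source_types and target_type in self.target_types


created = RelationshipDef(
    name="CREATED",
    source_types=frozenset({"People"}),
    target_types=frozenset({"ProgrammingLanguages"}),
    reltype="any",
)
```

Using sets for the source and target types is what allows one relationship name to cover several types at each end, including the same type at both.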
As a naming convention, I went with SCREAMING_SNAKE_CASE: apart from it having the single coolest name in the history of case-names, it makes it nice and easy to distinguish the various parts of a URI path in Syscat.
Thus, a complementary pair of paths might look like this:
/People/John_McCarthy/CREATED/ProgrammingLanguages/Common_Lisp
/ProgrammingLanguages/Common_Lisp/CREATED_BY/People/John_McCarthy
In case you're wondering whether you can chain any number of these identifiers and relationships together, the answer is yes... subject to limits on URL length. So don't bet on getting past 1024 bytes, including the base URL.
Relationships are not versioned, and there's no history outside of the application logs. It's not that I'm opposed to the idea; I just haven't yet seen a practical way of implementing it that works with this kind of application.
Important note about relationships
As I telegraphed in the opening paragraph, Syscat was designed for flexibility in describing relationships, to cover two important cases:
- The same kind of relationship from one resourcetype to any number of others.
  - E.g., a person might own computers, items of clothing, and musical instruments. Same relationship, different target types.
- Different relationships between the same pair of resources.
  - A person might be both the business owner of a system and its technical owner, or they might both own and recommend a particular model of drum machine.
Dependent resources
Some things exist only in the context of some other things, e.g. chapters of a book.
These are just like the resources described earlier, except that their resourcetype has the dependent attribute set to true. To complete the mechanism, there also needs to be at least one relationship to each dependent resourcetype, which also has dependent set to true. Those relationships are used to indicate that this resource is dependent on that one; unsurprisingly, the server will only permit one such relationship to any given dependent resource.
For practical reasons, each resource must have exactly one path that uniquely identifies it, which is called its canonical path. Each dependent resource must therefore have exactly one parent resource. It can have any number of other types of relationships to and from other resources, of course.
These can be dependent on other dependent resources, with no inherent limits beyond those imposed on URLs - you can chain them just as deeply as you need to.
To keep with the book-and-chapter theme, the path for a chapter might look like this: /Books/Practical_Common_Lisp/CONTAINS/Chapters/4_Syntax_and_Semantics.
The astute reader will notice that, to the casual reader, this looks no different from /People/Me/IS_READING/Chapters/4_Syntax_and_Semantics. For this reason, each time you GET a resource, its list of attributes includes the canonical path, so that you can positively identify it without difficulty.
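Because each dependent resource has exactly one parent, its canonical path can be derived by walking up the chain of parents until you reach a primary resource. Here's a minimal sketch of that idea; the Resource class and its field names are hypothetical, not part of the API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Resource:
    resourcetype: str
    uid: str
    # Both stay None for a primary resource.
    parent: Optional["Resource"] = None
    parent_relationship: Optional[str] = None  # the "dependent" relationship

    def canonical_path(self) -> str:
        """Parent's canonical path, then the dependent relationship, then this resource."""
        own = f"/{self.resourcetype}/{self.uid}"
        if self.parent is None:
            return own
        return f"{self.parent.canonical_path()}/{self.parent_relationship}{own}"


book = Resource("Books", "Practical_Common_Lisp")
chapter = Resource(
    "Chapters",
    "4_Syntax_and_Semantics",
    parent=book,
    parent_relationship="CONTAINS",
)
print(chapter.canonical_path())
```

The recursion handles any depth of nesting, so a section within a chapter within a book works the same way.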
Paths
These combine all of the above things to tell the Restagraph server where to find the thing(s) you want to act on.
Specifically, a "path" is a repeating resourcetype/UID/relationship sequence, which identifies one or more resources.
Paths can be of any arbitrary length greater than zero, but in practice they're subject to the maximum length of a URL that the HTTP client and server will support between them; don't count on getting past 1024 characters.
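The repeating structure is easy to see if you split a path into groups of three. This is a client-side sketch for illustration, not the server's parser; Step and parse_path are hypothetical names.

```python
from typing import List, NamedTuple, Optional


class Step(NamedTuple):
    resourcetype: str
    uid: Optional[str]           # None if the path stops at a resourcetype
    relationship: Optional[str]  # None if the path stops at a UID


def parse_path(path: str) -> List[Step]:
    """Split a path into resourcetype/UID/relationship groups."""
    parts = [p for p in path.split("/") if p]
    if not parts:
        raise ValueError("a path needs at least one element")
    steps = []
    for i in range(0, len(parts), 3):
        chunk: List[Optional[str]] = list(parts[i:i + 3])
        chunk += [None] * (3 - len(chunk))  # pad the final, possibly partial, group
        steps.append(Step(*chunk))
    return steps


for step in parse_path("/People/John_McCarthy/CREATED/ProgrammingLanguages/Common_Lisp"):
    print(step)
```

The last group is allowed to be incomplete, which is how one path shape covers "this resource", "all resources of this type" and "everything at the end of this relationship".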
Each resource has exactly one "canonical path" that uniquely identifies that specific resource (and no other). You need to use that path when creating, modifying or deleting resources, and when creating or deleting relationships between them.
This topic is explored in more depth in Raw_API.md, in the section on GET requests.
Visual examples
A picture being worth a thousand words, let's throw in some ASCII art to illustrate these ideas.
Primary resources
The path /Books/Practical_Common_Lisp represents a single primary resource that you might picture like so:
|----------------------------------------|
| Books | { UID: Practical_Common_Lisp } |
|----------------------------------------|
Dependent resources
The path /Books/Practical_Common_Lisp/CONTAINS/Chapters/4_Syntax_and_Semantics represents the same primary resource, but now a dependent resource, well, depends from it:
|----------------------------------------|
| Books | { UID: Practical_Common_Lisp } |
|----------------------------------------|
          |
       CONTAINS
          |
          V
|--------------------------------------------|
| Chapters | { UID: 4_Syntax_and_Semantics } |
|--------------------------------------------|
Relationships between resources
After creating relationships between resources, you can follow them from one to another in GET requests. For example, these paths all arise from the same set of resources and relationships.
/People/Peter_Seibel/AUTHOR_OF/Books/Practical_Common_Lisp/CONTAINS/Chapters/4_Syntax_and_Semantics
- From the author to one of his works, and then to a dependent resource within it.
/Books/Practical_Common_Lisp/HAS_AUTHOR/People/Peter_Seibel
- From the book to its author. Same pair of resources, with complementary relationships between them.
/People/Me/IS_READING/Chapters/4_Syntax_and_Semantics
- A new player enters the game.
/People/Me/IS_READING/Chapters/4_Syntax_and_Semantics/CONTAINED_IN/Books/Practical_Common_Lisp/HAS_AUTHOR/People/Peter_Seibel
- ...so who wrote the book that contains the chapter I'm reading, again?
|----------------------------------------| <--AUTHOR_OF--- |--------------------------------|
| Books | { UID: Practical_Common_Lisp } |                 | People | { UID: Peter_Seibel } |
|----------------------------------------| --HAS_AUTHOR--> |--------------------------------|
          |                   ^
       CONTAINS               |
          |             CONTAINED_IN
          V                   |
|--------------------------------------------|                 |----------------------|
| Chapters | { UID: 4_Syntax_and_Semantics } | <--IS_READING-- | People | { UID: Me } |
|--------------------------------------------|                 |----------------------|
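If it helps to think of this as a graph, following a path amounts to checking each hop against the set of stored relationships. Here's a toy in-memory model of that idea, using only relationships that appear in the example paths; the EDGES structure and the hops_exist function are hypothetical and say nothing about how the server actually stores or queries things.

```python
from typing import Dict, Set, Tuple

Node = Tuple[str, str]  # (resourcetype, UID)

# Directed, named relationships: (source, name) -> set of targets.
EDGES: Dict[Tuple[Node, str], Set[Node]] = {
    (("People", "Peter_Seibel"), "AUTHOR_OF"): {("Books", "Practical_Common_Lisp")},
    (("Books", "Practical_Common_Lisp"), "HAS_AUTHOR"): {("People", "Peter_Seibel")},
    (("Books", "Practical_Common_Lisp"), "CONTAINS"): {("Chapters", "4_Syntax_and_Semantics")},
    (("People", "Me"), "IS_READING"): {("Chapters", "4_Syntax_and_Semantics")},
}


def hops_exist(path: str) -> bool:
    """Return True if every relationship hop in the path is present in EDGES.

    A path with no relationships in it trivially returns True in this sketch.
    """
    parts = [p for p in path.split("/") if p]
    if len(parts) % 3 != 2:
        raise ValueError("this sketch expects a path that ends on a UID")
    node: Node = (parts[0], parts[1])
    for i in range(2, len(parts), 3):
        relationship = parts[i]
        target: Node = (parts[i + 1], parts[i + 2])
        if target not in EDGES.get((node, relationship), set()):
            return False
        node = target
    return True
```

Because relationships are directional, AUTHOR_OF and HAS_AUTHOR are stored as two separate entries rather than one edge read in both directions.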
Schemas
The set of resourcetypes and relationships defined in the Syscat server is called a schema, which is a direct reference to the relational/SQL-style schemas that partially inspired this system.
These are versioned, too, and you can roll forward and back in the same way as for resources. Unlike relational databases, though, changing the schema does not change the data already present in the database. This minimises the risk involved in testing new definitions, though I'll admit that renaming an attribute is a royal pain in the neck.
Definitions are added to the schema by uploading a JSON document, which means you can add your own resourcetypes and relationships, and even add attributes to existing resourcetypes. This enables you to seamlessly extend the system to cover your own needs and use-cases, without having to negotiate with a vendor and wait for them to release it. Because you do this using the same inbuilt mechanism as the pre-baked stuff, it's even vendor-supported!
Schemas are pretty involved, as you'd expect, so they're covered in more depth in Defining_a_schema.md. The Schema API is also pretty useful.
Files
Files are a special case in this system, warranting a separate API for uploading, downloading and deleting them. This is because clients want either the file itself or the metadata about it, but not both at the same time, and trying to cover both in the raw API would be as painful on the client side as it would be for me to implement.
Metadata about each file is stored in the database: the UID, creation date, title, notes and SHA-256 checksum. The Mediatype/MIME-type is determined by the server, using the POSIX file command, and the Files resource is automatically connected to the appropriate MediaTypes resource via the HAS_MEDIA_TYPE relationship. The complementary MEDIA_TYPE_OF relationship is also created in the other direction.
The file itself is stored separately in the filesystem, using a directory path and filename derived from the checksum. This provides a simple, consistent method to distribute the files across directories, to prevent the server getting bogged down in a search through thousands of files in one directory.
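Here's a sketch of how a storage location can be derived from a checksum. The exact scheme is an assumption for illustration: the two levels of two-character directories and the /var/lib/files root are hypothetical, and the server's real layout may differ.

```python
import hashlib
from pathlib import PurePosixPath


def storage_path(content: bytes, root: str = "/var/lib/files") -> PurePosixPath:
    """Derive a directory path and filename from the SHA-256 checksum of the content.

    Hypothetical layout: <root>/<first 2 hex chars>/<next 2 hex chars>/<remaining 60>.
    """
    digest = hashlib.sha256(content).hexdigest()  # 64 hex characters
    return PurePosixPath(root) / digest[:2] / digest[2:4] / digest[4:]


print(storage_path(b"example file contents"))
```

Since the path depends only on the content, identical files always map to the same location, and the leading characters of the digest spread files roughly evenly across the directories.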