Restagraph - a conceptual overview (Wikipages)

Conceptual overview

Before getting into the technical details of how the API works, it'll be useful to understand the thinking that underpins Restagraph. It's sort-of like REST, but veers off in its own direction.

The key difference, to emphasise before anything else, is that Restagraph regards the relationships between things as important in their own right.

Resources and resourcetypes

The "things."

If you're familiar with object-oriented programming, these map neatly enough onto classes and instances.

Resourcetypes are the abstract idea of a "type of thing," including its attributes. A person might have a display-name, year of birth, and a field for recording notes about them. Wikipages like this one have Title and Text, where the Title lets you give a page a friendlier, more readable title than the URI constraints allow (one containing non-ASCII characters, for example).

Resources are concrete instances of a type - a specific person, or a specific book. A resource is identified by a combination of its resourcetype and its UID (Unique IDentifier).

Taking inspiration from the REST architecture, I use a convention of plural names for resourcetypes, with CamelCase where names are composed of two or more words.

UIDs are sanitised when resources are created, and the server includes the sanitised UID in its response. It actually returns the full identifying path, as per the examples earlier in this section.

Dependent resources

Some things exist only in the context of some other thing, like chapters of a book or floors of a building.

In Restagraph, the "dependent" attribute of a resourcetype signifies whether it represents this kind of thing. Dependent resourcetypes can depend on other dependent types, and RG puts no limit on how deeply you can nest them. This enables you to describe something in terms of its constituent parts (and their constituent parts, and so on), which is the intended way of thinking about them.

A resource that isn't dependent is called a primary resource.

Primary and dependent resources work the same way in all other respects.

Resource versions

Each time you update the user-definable attributes of a resource, the server creates a new version of its attribute-set. You can query the server for the list of versions of a resource, fetch the contents of a given version, and set the "current" version to be any of those. This enables you to roll back and forward between versions.

When you create a new version, you can record a comment describing the change, or the reason for it being made. When you're looking at a resource's history, this is more informative than the timestamp for each version, so I recommend using this feature.
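
The version mechanism described above can be pictured with a small in-memory model. This is only a sketch of the idea, not Restagraph's storage layout or API: the class names, method names and the `displayname`/`notes` attributes are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Version:
    """One saved attribute-set, plus an optional comment describing the change."""
    attributes: dict
    comment: str = ""
    created: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


@dataclass
class VersionedResource:
    """Hypothetical stand-in for a resource and its list of versions."""
    versions: list = field(default_factory=list)
    current: int = -1  # index into `versions`; -1 means "no versions yet"

    def update(self, attributes: dict, comment: str = "") -> int:
        """Store a new version, make it current, and return its index."""
        self.versions.append(Version(dict(attributes), comment))
        self.current = len(self.versions) - 1
        return self.current

    def set_current(self, index: int) -> None:
        """Roll back or forward by pointing `current` at an existing version."""
        if not 0 <= index < len(self.versions):
            raise IndexError(f"no such version: {index}")
        self.current = index

    def attributes(self) -> dict:
        """Return the attribute-set of the current version."""
        return self.versions[self.current].attributes

    def history(self) -> list:
        """Return (index, comment) pairs, oldest first."""
        return [(i, v.comment) for i, v in enumerate(self.versions)]


person = VersionedResource()
person.update({"displayname": "Me"}, comment="Initial entry")
person.update({"displayname": "Me", "notes": "Reads Lisp books"}, comment="Added notes")
person.set_current(0)  # roll back to the first version; the second is still kept
```

Rolling back only moves the "current" pointer, so the later version stays available for rolling forward again, and `history()` shows why the comments are more useful than bare timestamps.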

Relationships

As you'd expect, these are named relationships between things. Note that they're directional - each relationship is from one specific resource, to another specific resource. In most cases, one relationship is complementary to another, reflecting how things work in the real world.

In Syscat, a relationship has a few attributes:

As a naming convention, I went with SCREAMING_SNAKE_CASE - apart from it having the single coolest name in the history of case-names, it makes it nice and easy to distinguish the various parts of a URI path in Syscat.

Thus, a complementary pair of paths might look like this:

In case you're wondering whether you can chain any number of these identifiers and relationships together, the answer is yes... subject to limits on URL length. So don't bet on getting past 1024 bytes, including the base URL.
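
As a rough sketch of how the naming conventions and the length limit fit together, the snippet below builds a complementary pair of paths and checks them against the 1024-byte rule of thumb. The HAS_AUTHOR/AUTHOR_OF pair is borrowed from the visual examples in this document; the helper names and the base URL are made up for illustration.

```python
MAX_URL_BYTES = 1024  # the conservative figure given above, including the base URL


def build_path(*segments: str) -> str:
    """Join resourcetype/UID/relationship segments into a URI path."""
    return "/" + "/".join(segments)


def fits_in_url(base_url: str, path: str) -> bool:
    """Check the combined length against the 1024-byte rule of thumb."""
    return len((base_url + path).encode("utf-8")) <= MAX_URL_BYTES


# CamelCase plurals are resourcetypes; SCREAMING_SNAKE_CASE marks relationships.
forward = build_path("Books", "Practical_Common_Lisp", "HAS_AUTHOR", "People", "Peter_Seibel")
reverse = build_path("People", "Peter_Seibel", "AUTHOR_OF", "Books", "Practical_Common_Lisp")

base_url = "https://example.com/api"  # hypothetical; substitute your server's base URL
print(forward, fits_in_url(base_url, forward))
print(reverse, fits_in_url(base_url, reverse))
```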

Relationships are not versioned, and there's no history outside of the application logs. It's not that I'm opposed to the idea; I just haven't yet seen a practical way of implementing it that works with this kind of application.

Important note about relationships

As I telegraphed in the opening paragraph, Syscat was designed for flexibility in describing relationships, to cover two important cases:

Dependent resources

Some things exist only in the context of some other things, e.g. chapters of a book.

These are just like the resources described earlier, except that their resourcetype has the dependent attribute set to true. To complete the mechanism, each dependent resourcetype also needs at least one relationship to it that itself has dependent set to true. Those relationships are used to indicate that this resource is dependent on that one; unsurprisingly, the server will only permit one such relationship to any given dependent resource.

For practical reasons, each resource must have exactly one path that uniquely identifies it, so each dependent resource must have exactly one parent resource. This is called its canonical path. It can have any number of other types of relationships to and from other resources, of course.

Dependent resources can themselves depend on other dependent resources, with no inherent limits beyond those imposed on URLs - you can chain them just as deeply as you need to.

To keep with the book-and-chapter theme, the path for a chapter might look like this: /Books/Practical_Common_Lisp/CONTAINS/Chapters/4_Syntax_and_Semantics.

The astute reader will notice that, to the casual reader, this looks no different from /People/Me/IS_READING/Chapters/4_Syntax_and_Semantics. For this reason, each time you GET a resource, its list of attributes includes the canonical path, so that you can positively identify it without difficulty.
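
The one-parent rule and the canonical path can be illustrated with a minimal in-memory model. This is a sketch under stated assumptions, not how the server implements it: the `Resource` class and its methods are hypothetical names.

```python
class Resource:
    """Hypothetical in-memory stand-in for a resource; not the server's data model."""

    def __init__(self, resourcetype: str, uid: str, dependent: bool = False):
        self.resourcetype = resourcetype
        self.uid = uid
        self.dependent = dependent
        self.parent = None  # (parent Resource, relationship name); dependents only

    def attach_to(self, parent: "Resource", relationship: str) -> None:
        """Record the single dependent relationship from `parent` to this resource."""
        if not self.dependent:
            raise ValueError("only dependent resources have a parent")
        if self.parent is not None:
            raise ValueError("a dependent resource can have only one parent")
        self.parent = (parent, relationship)

    def canonical_path(self) -> str:
        """Walk up the chain of parents to build the one identifying path."""
        own = f"{self.resourcetype}/{self.uid}"
        if not self.dependent:
            return "/" + own
        if self.parent is None:
            raise ValueError("dependent resource has no parent yet")
        parent, relationship = self.parent
        return f"{parent.canonical_path()}/{relationship}/{own}"


book = Resource("Books", "Practical_Common_Lisp")
chapter = Resource("Chapters", "4_Syntax_and_Semantics", dependent=True)
chapter.attach_to(book, "CONTAINS")
print(chapter.canonical_path())
```

Because the path is built by recursing through parents, the same method handles dependents of dependents to any depth.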

Paths

These combine all of the above things to tell the Restagraph server where to find the thing(s) you want to act on.

Specifically, a "path" is a repeating resourcetype/UID/relationship sequence, which identifies one or more resources.

Paths can be of any arbitrary length greater than zero, but in practice they're subject to the maximum length of a URL that the HTTP client and server will support between them; don't count on getting past 1024 characters.

Each resource has exactly one "canonical path" that uniquely identifies that specific resource (and no other). You need to use that path when creating, modifying or deleting resources, and when creating or deleting relationships between them.
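
To make the repeating resourcetype/UID/relationship structure concrete, here is a small sketch that splits a path into its steps. The function name is hypothetical, and it assumes that a path may stop after a resourcetype or a UID as well as after a relationship; it says nothing about how the server itself parses requests.

```python
def parse_path(path: str) -> list:
    """Split a path into (resourcetype, uid, relationship) steps.

    The final step may be incomplete, in which case its missing parts are None.
    """
    segments = [s for s in path.split("/") if s]
    if not segments:
        raise ValueError("a path must contain at least one segment")
    steps = []
    for i in range(0, len(segments), 3):
        chunk = segments[i:i + 3]
        chunk += [None] * (3 - len(chunk))  # pad an incomplete final step
        steps.append(tuple(chunk))
    return steps


for step in parse_path("/Books/Practical_Common_Lisp/CONTAINS/Chapters/4_Syntax_and_Semantics"):
    print(step)
```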

This topic is explored in more depth in Raw_API.md, in the section on GET requests.

Visual examples

A picture being worth a thousand words, let's throw in some ASCII art to illustrate these ideas.

Primary resources

The path /Books/Practical_Common_Lisp represents a single primary resource that you might picture like so:

   |----------------------------------------|
   | Books | { UID: Practical_Common_Lisp } |
   |----------------------------------------|

Dependent resources

The path /Books/Practical_Common_Lisp/CONTAINS/Chapters/4_Syntax_and_Semantics represents the same primary resource, but now a dependent resource, well, depends from it:

    |----------------------------------------|
    | Books | { UID: Practical_Common_Lisp } |
    |----------------------------------------|
                        |
                     CONTAINS
                        |
                        V
    |--------------------------------------------|
    | Chapters | { UID: 4_Syntax_and_Semantics } |
    |--------------------------------------------|

Relationships between resources

After creating relationships between resources, you can follow them from one to another in GET requests. For example, these paths all arise from the same set of resources and relationships.

    |----------------------------------------|  <--AUTHOR_OF---  |--------------------------------|
    | Books | { UID: Practical_Common_Lisp } |                   | People | { UID: Peter_Seibel } |
    |----------------------------------------|  --HAS_AUTHOR-->  |--------------------------------|
               |            ^
            CONTAINS        |
               |       CONTAINED_BY
               V            |
    |--------------------------------------------|                  |----------------------|
    | Chapters | { UID: 4_Syntax_and_Semantics } |  <--IS_READING-- | People | { UID: Me } |
    |--------------------------------------------|                  |----------------------|
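
The diagram can also be written down as data, which makes it easier to see how one set of relationships gives rise to many valid paths. The sketch below is a toy model only: the `EDGES` table and the `follow` function are hypothetical, it expects a path that starts and ends with a resourcetype/UID pair, and it does not check that the starting resource exists.

```python
# Each resource is written as a (resourcetype, UID) pair.
BOOK = ("Books", "Practical_Common_Lisp")
CHAPTER = ("Chapters", "4_Syntax_and_Semantics")
AUTHOR = ("People", "Peter_Seibel")
ME = ("People", "Me")

# Relationships from the diagram: (source, relationship name) -> list of targets.
EDGES = {
    (BOOK, "HAS_AUTHOR"): [AUTHOR],
    (AUTHOR, "AUTHOR_OF"): [BOOK],
    (BOOK, "CONTAINS"): [CHAPTER],
    (CHAPTER, "CONTAINED_BY"): [BOOK],
    (ME, "IS_READING"): [CHAPTER],
}


def follow(path: str):
    """Return the resource at the end of `path`, or None if any hop is missing."""
    segments = path.strip("/").split("/")
    current = (segments[0], segments[1])
    # After the starting resource, segments repeat as relationship/resourcetype/UID.
    for i in range(2, len(segments), 3):
        relationship = segments[i]
        target = (segments[i + 1], segments[i + 2])
        if target not in EDGES.get((current, relationship), []):
            return None
        current = target
    return current


# Two different paths that end at the same chapter:
print(follow("/Books/Practical_Common_Lisp/CONTAINS/Chapters/4_Syntax_and_Semantics"))
print(follow("/People/Me/IS_READING/Chapters/4_Syntax_and_Semantics"))
```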

Schemas

The set of resourcetypes and relationships defined in the Syscat server is called a schema, which is a direct reference to the relational/SQL-style schemas that partially inspired this system.

These are versioned, too, and you can roll forward and back in the same way as for resources. Unlike relational databases, though, changing the schema does not change the data already present in the database. This minimises the risk involved in testing new definitions, though I'll admit that renaming an attribute is a royal pain in the neck.

Definitions are added to the schema by uploading a JSON document, which means you can add your own resourcetypes and relationships, and even add attributes to existing resourcetypes. This enables you to seamlessly extend the system to cover your own needs and use-cases, without having to negotiate with a vendor and wait for them to release it. Because you do this using the same inbuilt mechanism as the pre-baked stuff, it's even vendor-supported!

Schemas are pretty involved, as you'd expect, so they're covered in more depth in Defining_a_schema.md. The Schema API is also pretty useful.

Files

Files are a special case in this system, warranting a separate API for uploading, downloading and deleting them. This is because clients want either the file itself or the metadata about it, but not both at the same time, and trying to cover both in the raw API would be as painful on the client side as it would be for me to implement.

Metadata about each file is stored in the database: the UID, creation date, title, notes and SHA-256 checksum. The Mediatype/MIME-type is determined by the server, using the POSIX file command, and the Files resource is automatically connected to the appropriate MediaTypes resource via the HAS_MEDIA_TYPE relationship. The complementary MEDIA_TYPE_OF relationship is also created in the other direction.

The file itself is stored separately in the filesystem, using a directory path and filename derived from the checksum. This provides a simple, consistent method to distribute the files across directories, to prevent the server getting bogged down in a search through thousands of files in one directory.
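
One common way to derive a directory path and filename from a checksum is sketched below. The exact layout the server uses isn't specified here, so the two-level split and the root directory are assumptions for illustration only.

```python
import hashlib
from pathlib import PurePosixPath


def storage_path(content: bytes, root: str = "/var/lib/files") -> PurePosixPath:
    """Derive a storage location from the SHA-256 checksum of `content`.

    Assumed layout (not confirmed by this document): the first two pairs of
    hex digits name nested directories, and the remainder is the filename.
    """
    digest = hashlib.sha256(content).hexdigest()
    return PurePosixPath(root, digest[:2], digest[2:4], digest[4:])


print(storage_path(b"hello, world"))
```

With a layout like this, each directory level has at most 256 subdirectories, and identical content always maps to the same location.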