Defining a schema (Wikipages)

Defining a Restagraph subschema

It'd be handy to know how to define subschemas of your own, that being the point of this thing.

Why "subschema"? Because each of these documents is a subset of the final schema that's actually in effect after they've all been installed. You can upload/install any number of subschemas, each of which can build on any previously-defined resourcetypes. The server combines them all into the composite schema that controls the behaviour of the API.

This page will make an awful lot more sense if you've already read the conceptual overview.

The format

In summary, a subschema is a JSON document defining a single object with three fields: name, resourcetypes and relationships.

It's entirely valid, if pointless, to define a subschema with neither resourcetypes nor relationships.
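As a rough illustration, here's how that smallest-possible subschema could be built and serialised from Python. The field names are taken from the example further down this page; the name "empty_example" is made up, and using empty lists for the two collections is an assumption rather than a confirmed server requirement.

```python
import json

# Field names follow the example subschema on this page.
# Assumption: empty lists are an acceptable way to say "nothing defined here".
minimal_subschema = {
    "name": "empty_example",  # hypothetical name
    "resourcetypes": [],
    "relationships": [],
}

# Serialise to the JSON text that would be uploaded.
document = json.dumps(minimal_subschema, indent=2)
print(document)
```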

To go into more detail:

Name

This is really only used for logging purposes at the moment.

Resourcetypes

Each resourcetype is defined via several key-value pairs, most of which are mandatory:

Name

The name of the resourcetype.

It's used as an identifier in the URI when interacting with the API, and as a node label when recording a resource in the database. Because it's used in the URI, it needs to be URI-safe.

Type: string.
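If you want to check names before uploading, something like the following might do as a client-side sanity check. It assumes "safe" means the name contains only RFC 3986 unreserved characters; the server's actual rule isn't stated here and may be stricter or looser. The function name is hypothetical.

```python
import re

# Assumption: "URI-safe" means only RFC 3986 unreserved characters,
# i.e. letters, digits, "-", ".", "_" and "~".
UNRESERVED = re.compile(r"[A-Za-z0-9._~-]+")


def looks_uri_safe(name: str) -> bool:
    """Return True if the name would need no percent-encoding in a URI path segment."""
    return UNRESERVED.fullmatch(name) is not None
```

With this check, a name like "Books" passes, while one containing a space or a slash doesn't.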

Dependent

A boolean value stating whether this is a "dependent" resourcetype, i.e. whether it exists only in the context of another one. E.g. a room only exists in the context of a building.

A dependent resourcetype can only be created in relationship to a "parent" type, via a relationship that is also defined as a dependent one, meaning that it defines the dependency between them. A dependent resourcetype can be dependent on another dependent one, e.g. the ceiling of a room, in a building.

Type: boolean. In accordance with Postel's Principle, acceptable values include true, "true", "True", false, "false" and "False". The preferred values are true and false.
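To make the lenient parsing concrete, here's a sketch of a normaliser that accepts exactly the six values listed above. It isn't the server's code; the function name is hypothetical, and raising ValueError for anything else is an assumption about what "not acceptable" should mean.

```python
def parse_dependent(value) -> bool:
    """Normalise the accepted spellings of a boolean to True or False."""
    if isinstance(value, bool):
        return value
    if value in ("true", "True"):
        return True
    if value in ("false", "False"):
        return False
    # Assumption: anything outside the documented list is rejected.
    raise ValueError(f"not an accepted boolean value: {value!r}")
```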

Description

Description of what this type represents, and possibly how it's to be used.

E.g. the description for the built-in resourcetype Organisations is "Any kind of organisation: professional, social or other."

Type: string.

Optional; the default is null.

Attributes

A list of attribute objects. Their main keys, i.e. those shared by all attribute types, are:

Attribute types:

In case you're wondering why there are two types of string variable, instead of just varchar(65535), it's for the benefit of GUI developers. This is a hint that a GUI can use to decide whether to present a one-line field or a resizeable box for editing the text of a given attribute.

For some attribute types, you can define further constraints on their values.

Relationships

A list of relationship objects. Their keys are:

I usually define them in the order name, source-type, target-type because that matches the way I think about them. In Cypher, Neo4j's native syntax, a relationship is represented as (:source-type)-[:name]->(:target-type). However, the server doesn't care about the order of those keys, so use whatever works best for you and your team.
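To show how a relationship definition lines up with that Cypher pattern, here's a small sketch that renders one as text. The keys "source-types" and "target-types" are the ones used in the example on this page; producing one pattern per source/target pair is my assumption about how the lists expand, and the function name is hypothetical.

```python
def cypher_patterns(relationship: dict) -> list[str]:
    """Render one Cypher-style pattern per (source, target) pair."""
    return [
        f"(:{source})-[:{relationship['name']}]->(:{target})"
        for source in relationship["source-types"]
        for target in relationship["target-types"]
    ]


author = {
    "name": "AUTHOR",
    "source-types": ["Books"],
    "target-types": ["People"],
}
print(cypher_patterns(author))  # ['(:Books)-[:AUTHOR]->(:People)']
```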

Cardinality in dependent relationships

Only two types of cardinality are permitted in a dependent relationship:

  1. 1:many (the default)
  2. 1:1

The reason for this is that it doesn't really make sense for a dependent resource to have multiple parents. In this kind of situation, it's almost certainly a primary resource with the same relationship to two other resources.
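The rule is simple enough to sketch as a pre-upload check. This is only an illustration: the function name is hypothetical, and the page doesn't say how the server itself responds to a disallowed cardinality, so the ValueError is an assumption.

```python
# The two cardinalities permitted in a dependent relationship.
ALLOWED_DEPENDENT_CARDINALITIES = {"1:many", "1:1"}


def dependent_cardinality(cardinality=None) -> str:
    """Return the effective cardinality for a dependent relationship.

    Falls back to the default of "1:many" when none is given.
    """
    effective = "1:many" if cardinality is None else cardinality
    if effective not in ALLOWED_DEPENDENT_CARDINALITIES:
        # Assumption: reject anything else; the server's behaviour may differ.
        raise ValueError(
            f"{effective!r} is not permitted in a dependent relationship"
        )
    return effective
```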

The expected use-case for a 1:1 dependent relationship is for managing a set of optional attributes. E.g. Files resources could be of any kind, so it's not practical to define that resourcetype with all possible attributes. Instead, you can define a dependent resourcetype containing the attributes for each given file format: JPEG images, Ogg Vorbis audio, etc. Because it only makes sense to have one such dependent resource for each file, and each one only makes sense in the context of one specific file, the 1:1 cardinality is a natural fit here.

Example

Here's an example, adding books and authors to the schema:

{
  "name": "example_schema",
  "resourcetypes": [
    {
      "name": "Books",
      "dependent": false,
      "description": "Stuff printed on the corpses of trees.",
      "attributes": [
        {
          "name": "description",
          "type": "text",
          "description": "",
          "values": null
        },
        {
          "name": "ISBN",
          "type": "varchar",
          "description": "International Standard Book Number. Should be a 10- or 13-digit number, optionally interspersed with hyphens.",
          "maxlength": 17
        }
      ]
    }
  ],
  "relationships": [
    {
      "name": "AUTHOR",
      "source-types": ["Books"],
      "target-types": ["People"],
      "cardinality": "many:many",
      "reltype": "any",
      "notes": "Link from the book to its author."
    },
    {
      "name": "AUTHOR_OF",
      "source-types": ["People"],
      "target-types": ["Books"],
      "cardinality": "many:many",
      "reltype": "any",
      "notes": "Link to a book this person wrote."
    }
  ]
}

Note that it assumes the existence of the People resourcetype. This is defined in the core schema, so you know it'll always be there. However, you can equally rely on resourcetypes created in other schemas, as long as they were installed before this one.

The server installs all resourcetypes defined in a subschema before trying to create the relationships.

Note: reference errors are handled quietly. If a subschema defines a relationship that refers to a resourcetype not already defined, the server will log the fact and move on. So it's fine to refer to resourcetypes defined in other subschemas (in fact, it's positively encouraged) but it's important to make sure you a) only make backward references, not forward ones, and b) upload subschemas in the order of their dependencies.
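Because the server won't complain loudly about a forward reference, it may be worth working out the upload order on the client side first. The sketch below does that with a topological sort from Python's standard library. All the function names are hypothetical, the set of core resourcetypes is something you supply yourself, and this is a pre-check of my own devising, not a description of what the server does.

```python
from graphlib import TopologicalSorter


def defined_types(subschema: dict) -> set[str]:
    """Names of the resourcetypes this subschema defines."""
    return {rt["name"] for rt in subschema.get("resourcetypes", [])}


def referenced_types(subschema: dict) -> set[str]:
    """Names of all resourcetypes mentioned by this subschema's relationships."""
    refs: set[str] = set()
    for rel in subschema.get("relationships", []):
        refs.update(rel["source-types"])
        refs.update(rel["target-types"])
    return refs


def upload_order(subschemas: list[dict], core_types: set[str]) -> list[str]:
    """Return subschema names ordered so each follows the ones it refers to."""
    # Map each resourcetype to the subschema that defines it.
    owner = {t: s["name"] for s in subschemas for t in defined_types(s)}
    sorter = TopologicalSorter()
    for s in subschemas:
        external = referenced_types(s) - defined_types(s) - core_types
        missing = external - set(owner)
        if missing:
            raise ValueError(
                f"{s['name']} refers to undefined types: {sorted(missing)}"
            )
        # Every subschema owning a referenced type must be uploaded first.
        sorter.add(s["name"], *(owner[t] for t in external))
    # Raises graphlib.CycleError if two subschemas refer to each other.
    return list(sorter.static_order())
```

Given the example subschema above plus a second one whose relationships point at Books, calling upload_order with core_types={"People"} would put the Books subschema first.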

Notes about the format and naming conventions