Restagraph - what is it?
Restagraph is an application that dynamically generates an HTTP API in front of a Neo4j graph database, based on a schema defined inside that database. The auto-generated API is regular and consistent, making it easy to build automation and GUIs on top of it.
The schema supports features such as:
- Declaring the type of data that can be stored in each attribute of a resource.
- Constraints on the relationships that can be created between two types of resource.
  - This includes cardinality constraints, i.e. 1:1, 1:many, many:1 and many:many relationships.
- Resources which only make sense in the context of other resources, e.g. interfaces on computers.
Thus, it gives you a relational-style schema and constraints, but with the flexibility that only a graph database can provide.
Benefits, a.k.a. the point of this thing
- Predictable, consistent data structure: you know for sure what kinds of data are stored, and how they can be connected to each other.
- Language independence: the REST API means that any language can be used to build applications on top of this structure.
- Development speed: with a single JSON file, you define both the database schema and the API.
- GUI independence: the rules that define and validate the data structures are not mixed in with the code that generates a particular GUI for the app.
  - You can build multiple GUIs, including mobile ones, that all automatically conform to the same rules.
  - You can build automation against the application, still with all the same guarantees about the data.
- Explicit schema gives you a clear, reliable picture of what the available structures are and how they relate to each other.
  - It's also much easier and less error-prone than grovelling through source-code and piecing it together by inference.
What does it look like?
Screenshots don't help much, because it's an HTTP API, so here's a quick demo of the very heart of it:
Create a JSON file called core_demo.json with the following contents:
{
  "name": "Movies",
  "resourcetypes": [
    { "name": "Movie" }
  ],
  "relationships": [
    {
      "name": "ACTED_IN",
      "source-types": ["People"],
      "target-types": ["Movie"],
      "cardinality": "many:many"
    }
  ]
}
This defines a resourcetype called Movie, and a relationship from People to Movie (and only in that direction) called ACTED_IN. For this relationship, any number of people can be recorded to have acted in a movie, and a person can have acted in many movies. You don't need to define the "People" resourcetype, because that's predefined in Restagraph.
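Before uploading a schema file, it can be handy to check that every relationship refers to a resourcetype that actually exists. The following is a minimal client-side sketch of such a check, not part of Restagraph itself. The function name `undefined_types` and the `PREDEFINED_TYPES` set are assumptions made for illustration; only "People" and "any" are mentioned in this README as predefined, and whether the server performs an equivalent check is not stated here.

```python
import json

# Resourcetypes assumed to come from the core schema. This README only
# mentions "People" and "any", so treat this set as illustrative.
PREDEFINED_TYPES = {"People", "any"}


def undefined_types(schema: dict) -> set:
    """Return type names used by relationships but not defined anywhere."""
    known = PREDEFINED_TYPES | {rt["name"] for rt in schema.get("resourcetypes", [])}
    used = set()
    for rel in schema.get("relationships", []):
        used.update(rel.get("source-types", []))
        used.update(rel.get("target-types", []))
    return used - known


# The demo schema, expressed as a Python dict.
demo_schema = {
    "name": "Movies",
    "resourcetypes": [{"name": "Movie"}],
    "relationships": [
        {
            "name": "ACTED_IN",
            "source-types": ["People"],
            "target-types": ["Movie"],
            "cardinality": "many:many",
        }
    ],
}

if __name__ == "__main__":
    missing = undefined_types(demo_schema)
    if missing:
        print("Undefined resourcetypes:", sorted(missing))
    else:
        # Safe to save this output as the JSON file to upload.
        print(json.dumps(demo_schema, indent=2))
```

A misspelt type name, such as a plural where the resourcetype is singular, shows up in the returned set rather than surfacing later as a failed request.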
Upload that file to the server, via a command like this:
curl -X POST --data-urlencode schema@core_demo.json http://192.0.2.1:4950/schema/v1
Now create a person and a movie, and link them:
curl -X POST -d 'uid=Keanu Reeves' http://192.0.2.1:4950/raw/v1/People
curl -X POST -d 'uid=The Matrix' http://192.0.2.1:4950/raw/v1/Movie
curl -X POST -d 'target=/Movie/The_Matrix' http://192.0.2.1:4950/raw/v1/People/Keanu_Reeves/ACTED_IN
Ask the server what movies Keanu has acted in:
curl -s http://192.0.2.1:4950/raw/v1/People/Keanu_Reeves/ACTED_IN
[{"type":"Movie","uid":"The_Matrix","createddate":3851927697}]
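The same three steps can be driven from any language with an HTTP client. Here is a rough Python sketch using only the standard library; it builds the requests without sending them, so you can inspect them first. The helper names (`create_resource`, `link_resources`, `list_related`) are made up for this example, and the base URL is the documentation address used in the curl commands.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Same placeholder server address as in the curl examples.
BASE_URL = "http://192.0.2.1:4950/raw/v1"


def create_resource(resourcetype: str, uid: str) -> Request:
    """Build a POST request that creates a resource with the given UID."""
    body = urlencode({"uid": uid}).encode("ascii")
    return Request(f"{BASE_URL}/{resourcetype}", data=body, method="POST")


def link_resources(source_path: str, relationship: str, target_path: str) -> Request:
    """Build a POST request linking the source resource to the target."""
    body = urlencode({"target": target_path}).encode("ascii")
    return Request(f"{BASE_URL}{source_path}/{relationship}", data=body, method="POST")


def list_related(source_path: str, relationship: str) -> Request:
    """Build a GET request for the resources at the far end of a relationship."""
    return Request(f"{BASE_URL}{source_path}/{relationship}", method="GET")


if __name__ == "__main__":
    requests = [
        create_resource("People", "Keanu Reeves"),
        create_resource("Movie", "The Matrix"),
        link_resources("/People/Keanu_Reeves", "ACTED_IN", "/Movie/The_Matrix"),
        list_related("/People/Keanu_Reeves", "ACTED_IN"),
    ]
    for req in requests:
        print(req.get_method(), req.full_url, req.data)
```

To actually send one of these, pass it to `urllib.request.urlopen()` and decode the response body with `json.loads()`.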
For full detail about what it's capable of, read the HTTP API docs; for a detailed walkthrough of how to use it with the Neo4j movies dataset, read the demo session.
Where to get it
Source code is in Equill/restagraph on Codeberg.
Docker images are on Docker hub.
License
AGPL 3.0
You're free to use it, and you're free to build on it. If you extend the code itself, I expect you to share your changes.
Quick start
The schema - how it works
Short version
The schema is stored in the database. Restagraph checks for it on startup; if one is present, it loads that into memory. If it doesn't find one, it installs the default (core) schema, then an additional schema if one is included, and loads the result into memory.
The additional schema is defined in a JSON file; you can upload any number of them via a separate API, and in fact this is the process for updating/upgrading a built-in one.
The core schema contains the essential resources and relationships on which the API itself relies, hence the name.
Elements of the schema
Resource-types
The types of things you can create via the API. If you're familiar with Object-Oriented Programming, resourcetypes correspond to classes and resources equate to instances.
The UID is a required attribute for all resourcetypes; you can't create a resource without one, so it isn't explicitly mentioned in the schema.
Attributes built into every resourcetype:
- The name
  - This is how the resourcetype is addressed via the API, so it needs to be UID-safe.
  - Following the Neo4j naming conventions, resourcetype names should be in PascalCase.
- Whether it's a dependent type
  - Dependent types only exist in relation to another resource. E.g. a room only exists in the context of a building.
- Notes about the resource-type, i.e. what kind of thing it represents, and how it's intended to be used.
- Whether its value is read-only.
  - This is really only useful for built-in attributes, e.g. the checksum that the server computes for each file that's uploaded.
User-defined resource attributes
Resourcetypes can have any number of user-defined attributes. Each of these is defined with three main characteristics:
name
- These are not specified in the Neo4j conventions, so I've gone with lowercase.
- Don't name them with a leading RG. Technically you can, but if it collides with a term used by the API for some other kind of filtering, the reserved term takes precedence. Filtered terms currently include:
  - RGoutbound
  - RGinbound
description
- This only appears in the schema. It's for clarifying what the attribute is for, or how it's to be used.
type
- What kind of data can be stored in this attribute. Available options include varchar (the default), text (for large stretches of text), integer and boolean.
- You can define further constraints on some types:
  - maximum length for a varchar
  - a list of acceptable values for a varchar (turning it into an enum type)
  - maximum and/or minimum values for an integer.
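To make the type constraints concrete, here is an illustrative sketch of how a client might pre-validate a value against an attribute definition. It is not Restagraph's own validation code. The definition keys `maxlength`, `values`, `minimum` and `maximum` are hypothetical names chosen for this example; check the HTTP API docs for the keys the schema format really uses.

```python
def validate_attribute(definition: dict, value) -> list:
    """Return a list of problems with `value`; an empty list means it passes.

    The constraint keys used here (maxlength, values, minimum, maximum)
    are assumed names, not taken from the Restagraph schema format.
    """
    attr_type = definition.get("type", "varchar")  # varchar is the default type
    errors = []
    if attr_type in ("varchar", "text"):
        if not isinstance(value, str):
            return [f"expected a string for type {attr_type}"]
        if attr_type == "varchar":
            maxlength = definition.get("maxlength")
            if maxlength is not None and len(value) > maxlength:
                errors.append(f"longer than {maxlength} characters")
            allowed = definition.get("values")
            if allowed is not None and value not in allowed:
                errors.append(f"not one of {allowed}")
    elif attr_type == "integer":
        # bool is a subclass of int in Python, so rule it out explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            return ["expected an integer"]
        minimum = definition.get("minimum")
        if minimum is not None and value < minimum:
            errors.append(f"less than minimum {minimum}")
        maximum = definition.get("maximum")
        if maximum is not None and value > maximum:
            errors.append(f"greater than maximum {maximum}")
    elif attr_type == "boolean":
        if not isinstance(value, bool):
            errors.append("expected a boolean")
    else:
        errors.append(f"unknown attribute type: {attr_type}")
    return errors


if __name__ == "__main__":
    rating = {"type": "integer", "minimum": 1, "maximum": 5}
    print(validate_attribute(rating, 3))
    print(validate_attribute(rating, 9))
```

Returning a list of problems rather than raising on the first one lets a GUI report every issue with a form in one pass.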
Relationships between resource-types
These define the relationships that the API will allow you to create from one resourcetype to another. Note that they're directional.
Mandatory attributes, which must be specified when defining one of these:
- name = the name used when referring to this relationship in a URI.
  - Following the Neo4j naming conventions, relationship names should be in SCREAMING_SNAKE_CASE.
    - This has nothing to do with it being the coolest case-name in the history of case-names, but it's a pleasing coincidence.
- source-types = the list of resourcetypes that the relationship can be from.
- target-types = the list of resourcetypes it can connect to.
- cardinality = how many relationships of this kind are to be permitted from an instance of a source-type, and how many to an instance of a target-type. Valid options are:
  - many:many
  - 1:many
  - many:1
  - 1:1
Optional attributes:
- reltype = what kind of relationship this is, and thus what constraints should apply. This currently only supports two cases:
  - dependent = this relationship is from a parent resource to a dependent one.
  - any = default case: no constraints apply.
- description = descriptive text, clarifying the intended meaning/purpose of this relationship.
  - default is null
Important note: The any resourcetype can be used to define relationships from "any" other resourcetype to this one, or from this one to "any" other. So it's important to remember that when you query the Schema API about a resourcetype, the relationships section combines outbound relationships from "any" with those from the specific type you're querying it about. The same applies when it's validating a request to create a relationship.
That's the any resourcetype's reason for existence; the server won't allow you to create an instance, or to query one; it's only there to make relationship definitions manageable.
It's possible to define any-to-any relationships. There are genuine cases where it makes sense, but over-use of it defeats the point of having meaningful relationships.
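The way "any" folds into a specific resourcetype's relationships can be sketched as follows. This is a simplified model of the behaviour described above, not the server's code: it ignores cardinality and reltype, and the function names are invented for the example. The TAGGED relationship and the Tag resourcetype are hypothetical, included only to show an any-sourced definition.

```python
ANY = "any"

# Relationship definitions in the same shape as the schema file.
# TAGGED and Tag are hypothetical, for illustration only.
RELATIONSHIPS = [
    {"name": "ACTED_IN", "source-types": ["People"], "target-types": ["Movie"]},
    {"name": "TAGGED", "source-types": [ANY], "target-types": ["Tag"]},
]


def outbound_relationships(relationships: list, resourcetype: str) -> list:
    """Relationships from this type, combined with those defined from "any"."""
    return [
        rel
        for rel in relationships
        if resourcetype in rel["source-types"] or ANY in rel["source-types"]
    ]


def may_link(relationships: list, source_type: str, name: str, target_type: str) -> bool:
    """True if the schema permits this relationship in this direction."""
    for rel in outbound_relationships(relationships, source_type):
        targets = rel["target-types"]
        if rel["name"] == name and (target_type in targets or ANY in targets):
            return True
    return False


if __name__ == "__main__":
    names = [rel["name"] for rel in outbound_relationships(RELATIONSHIPS, "People")]
    print(names)
    print(may_link(RELATIONSHIPS, "Movie", "ACTED_IN", "People"))
```

Because relationships are directional, the reverse ACTED_IN link from Movie to People is rejected even though both types appear in the definition.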
The API
That is, how you actually put data into the system, and get it back out again.
- dependent = whether this relationship is from a parent resource to a dependent one.
  - default is false
Resources
That is, instances of a resourcetype. Their attributes are:
- The UID (Unique IDentifier)
  - This is how the resource is addressed via the API, so it needs to be UID-safe. Restagraph automatically sanitises these on the way in.
    - The list of acceptable characters is the unaccented Latin alphabet, digits, plus four non-alphanumeric characters (-, _, . and ~). This is the set of "unreserved characters" from section 2.3 of RFC 3986.
  - This is the only attribute you're required to specify, when you create a resource.
- Original UID
  - The un-sanitised version of the requested UID, regardless of whether it's different from the sanitised version.
  - This is autogenerated, so you don't need to (or get to) specify it.
- Creation date/time
  - A datestamp in Unix epoch time, i.e. seconds since midnight at the start of January 01 1970, recording the time at which this resource was created.
  - Another autogenerated attribute.
- Version date/time
  - Also a datestamp, in the same format as createddate. This records the last time this resource was changed.
- User-defined attributes
  - Whatever attributes are defined in the schema.
  - These can be included when you create the resource with a POST request, or set/updated via PUT at any time after that.
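If you want to predict the UID a resource will end up with, the sanitising rule above can be approximated on the client side. The sketch below assumes that every character outside the RFC 3986 unreserved set is replaced with an underscore. The demo earlier in this README only shows that happening for spaces ("Keanu Reeves" becomes "Keanu_Reeves"), so the treatment of other characters is a guess, and `sanitise_uid` is a name invented for this example.

```python
import re

# Anything outside the RFC 3986 "unreserved" set: letters, digits, - _ . ~
_UNSAFE = re.compile(r"[^A-Za-z0-9\-_.~]")


def sanitise_uid(uid: str) -> str:
    """Replace each UID-unsafe character with an underscore.

    Assumption: underscore substitution applies to all unsafe characters;
    this README only demonstrates it for spaces.
    """
    return _UNSAFE.sub("_", uid)


if __name__ == "__main__":
    for requested in ("Keanu Reeves", "The Matrix", "already-safe_uid.v1~"):
        print(repr(requested), "->", repr(sanitise_uid(requested)))
```

The server's response remains the authority on what the UID actually is; use a helper like this for building URLs optimistically, not as a substitute for reading what comes back.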
Relationships between resources
The simplest of the lot, because they have no user-serviceable attributes inside.
Created via POST, as long as they meet the constraints defined in the schema at that moment in time.
Test suite
Two test suites are included:
- Client-side Python tests, using pytest.
- Internal tests of the implementation itself.
More information
There's more in the docs folder in this repo:
- For more detail about the HTTP API, read the HTTP API docs.
- The IPAM API (IP Address Management) gets a separate page: Restagraph IPAM API.