Restagraph Raw API (Wikipages)

Raw API reference

General overview

The purpose of this API is to ensure that data going into the database fits a schema. It's also the API you'll interact with most of the time.

Assuming you've already read the Conceptual Overview, the API is designed around three key things:

  1. Resources, or the "things" represented in the database.
  2. The types of those resources.
  3. Relationships between the resources in the database.

It uses regular HTTP semantics, in the vein of the REST architecture, but takes a very different approach. Rather than constructing many endpoints that each represent a thing, it has a single endpoint that dynamically validates incoming requests according to a schema defined within the database.

Thus, each request has two key elements:

  1. The HTTP method, a.k.a. the verb.
    • Methods supported are: POST, GET, PUT and DELETE.
  2. The object, i.e. the thing being acted on.
    • This is described in terms of the "path" to the object, which is written in the form of a URI. The path warrants a bit of elaboration, because it's central to how this thing works.

HTTP return codes are used according to RFC 7231, and the Content-Type header is set according to whether text or JSON is being returned. As a rule, JSON will be returned on success, and plain text for anything else. The one salient exception is when deleting a resource or relationship, where the MIME type is "text/plain" and the return code is 204 (No Content).
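As a rough illustration of those conventions, here's a minimal client-side sketch in Python. It assumes JSON responses are labelled "application/json" (the text above only says "JSON"), and decode_body is a hypothetical helper name, not part of Restagraph.

```python
import json


def decode_body(status, content_type, body):
    """Decode a response body according to the conventions described above.

    Assumption: JSON responses carry the MIME type "application/json".
    """
    # 204 (No Content) is what a successful delete returns: nothing to decode.
    if status == 204 or not body:
        return None
    # Drop any parameters, such as "; charset=utf-8", before comparing.
    mime = content_type.split(";", 1)[0].strip().lower()
    text = body.decode("utf-8")
    if mime == "application/json":
        return json.loads(text)
    # Anything else is treated as plain text, typically an error message.
    return text
```

A front-end would then branch on whether it got back a list, a string or nothing at all.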

Notes about the examples

For clarity and simplicity, the examples use the command-line utilities curl and jq.

They also assume that the Restagraph server is running as a Docker container with the bridge address 192.0.2.1, and listening on port 4950.

For real-world use I recommend putting a web proxy server such as Nginx in front of Restagraph, in which case you wouldn't need to specify the port number in the client requests. However, for instructional purposes I'm skipping such things and keeping it as direct as possible.

Authentication and sessions

In summary, if authentication is required for an API action, the client must do two things:

Thus, this document assumes that you've already created a valid session, and that you're including the cookie with each request.

To tell curl to apply the contents of a cookie file to the request, use -b to specify the cookie file, and -H to set the header. For example, if the domain is rgtest.onfire.onice and, when you created the current session, you instructed curl to save the cookies to /tmp/cookies.txt, you'd give those options as follows:

curl -b /tmp/cookies.txt -H 'Host: rgtest.onfire.onice' http://192.0.2.1:4950/raw/v1/People/Me

For more detail on how authentication and sessions work in Restagraph, see Authentication_API.md.

GET: Retrieve resources

I'm covering this first instead of POST because it's what you'll use most often, and because you can use it in more complex ways.

Unlike the other methods, GET is only used for one task. However, there's a variety of ways to use it, and its use of paths is the most involved.

GET /raw/v1/<path>

Optional parameters: this gets a little complicated, and is covered in the following subsections.

Return values:

Note that the path does not need to be canonical, and this is the reason why it returns a list rather than a single object. Technically, it could return a single object in response to a canonical path, but then people writing tools and front-ends would have to remember to manage that special case, and life is already confusing enough.

For a trivial example, you might want to fetch the details of an author:

$ curl http://192.0.2.1:4950/raw/v1/People/Peter_Seibel

[
  {
    "RGversion": 3915260741,
    "RGcreateddate": 3915260741,
    "RGcanonicalpath": "/People/Peter_Seibel",
    "uid": "Peter_Seibel"
  }
]

These URIs can be as long as the client supports, and can follow any relationship from one resource to another. As a hypothetical example, you could find out who wrote the book you're reading chapter 4 of:

curl http://192.0.2.1:4950/raw/v1/People/Me/IS_READING/Chapters/4_Syntax_and_Semantics/CONTAINED_IN/Books/Practical_Common_Lisp/HAS_AUTHOR/People/Peter_Seibel

You might then decide that you like his style, and you want to find out what other books this guy has written:

curl http://192.0.2.1:4950/raw/v1/People/Peter_Seibel/AUTHOR_OF/Books

If you leave the resourcetype /Books off the end of that URL, you'll get everything he's written in any form:

curl http://192.0.2.1:4950/raw/v1/People/Peter_Seibel/AUTHOR_OF

To improve predictability, the results are sorted by UID in ascending order.

Note that it will always return a list of results, even in the above case where you would expect there to only be one thing to find. This is because it's possible for a resource to have the same relationship to two or more dependent resources which have the same resourcetype and UID (but different parents). It also fits with the principle of least astonishment: you don't have to remember which cases return a list and which don't.

By default, it only returns the system-managed attributes plus the UID, as in the Peter_Seibel example above.

Specifying which resourcetype-specific attributes you want returned

Most resourcetypes have additional attributes. For example, People has "displayname" and "notes". Some of these contain large amounts of data, e.g. wikipages. If you want, say, a listing of UIDs and titles, it's wasteful to fetch dozens or hundreds of large blobs of text that you're not going to use; it also takes longer, making your UI slower to respond.

Use the RGattributes query parameter to specify which of the additional attributes you want it to return. Its value is a comma-separated list of attribute names - note that they're separated by commas only, not commas plus whitespace.

Example: fetch the details of a specific person, including the values for the displayname and notes attributes:

$ curl -s 'http://192.0.2.1:4950/raw/v1/People/Peter_Seibel?RGattributes=displayname,notes' | jq .

[
  {
    "RGversion": 3915280147,
    "notes": "Author and second-generation Lisp hacker.",
    "displayname": "Peter Seibel",
    "RGcreateddate": 3915260741,
    "RGcanonicalpath": "/People/Peter_Seibel",
    "uid": "Peter_Seibel"
  }
]

Remember that you're not restricted to canonical paths with GET requests. For example, if you're fetching the full list of People recorded in the database, and you want their displaynames and the notes, you'd make this query:

curl 'http://192.0.2.1:4950/raw/v1/People?RGattributes=displayname,notes'

If you only wanted the displaynames in that listing, you'd make this one:

curl 'http://192.0.2.1:4950/raw/v1/People?RGattributes=displayname'

The RG is short for RestaGraph, of course, and you'll see this prefix in use to distinguish such directives from attribute-names.
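If you're building these URLs in code rather than by hand, something like the following Python sketch would do it. The base URL is the one assumed throughout these examples, and attributes_url is a hypothetical helper name. The one subtlety is keeping the commas literal, to match the format described above.

```python
from urllib.parse import urlencode

# Assumption: the same server address used in the curl examples.
BASE_URL = "http://192.0.2.1:4950/raw/v1"


def attributes_url(path, attributes):
    """Build a GET URL that requests the named attributes for a path."""
    # Join with commas only: no whitespace between attribute names.
    value = ",".join(attributes)
    # safe="," stops urlencode from escaping the commas as %2C.
    query = urlencode({"RGattributes": value}, safe=",")
    return f"{BASE_URL}{path}?{query}"
```

For example, attributes_url("/People", ["displayname", "notes"]) produces the same URL as the first listing query above.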

Filtering the results of a GET request

You can add filters to this request, as parameters in the URL, e.g.:

curl 'http://192.0.2.1:4950/raw/v1/People?displayname=Foo.*'

Each of these can be applied as a negation, by prepending !. E.g., !uid=foo means "UID is not equal to 'foo'".

If you specify multiple parameters, the server combines them with an implicit AND. There's no way to specify that it should use OR instead; for that, you need to make multiple queries and combine the results on the client side.
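Combining results on the client side could look something like this Python sketch. It assumes each query's response has already been parsed from JSON into a list of dicts, and that each dict includes the RGcanonicalpath and uid attributes shown in the earlier examples; merge_results is a hypothetical name.

```python
def merge_results(*result_lists):
    """Emulate OR by merging the results of several GET requests."""
    merged = {}
    for results in result_lists:
        for resource in results:
            # Deduplicate on the canonical path, not the UID, because
            # dependent resources can share a UID under different parents.
            merged.setdefault(resource["RGcanonicalpath"], resource)
    # Sort by UID to mirror the ordering the server uses for one query.
    return sorted(
        merged.values(),
        key=lambda r: (r["uid"], r["RGcanonicalpath"]),
    )
```

A resource that matches more than one of the queries appears only once in the merged list.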

View the version history of a resource

GET /raw/v1/<resource-type>/<resource UID>?RGversion=list

Creating new versions is well and good, but it's also handy to be able to find out what versions exist.

This returns a JSON object of this form:

{
  "versions": {
    "3838824143": "Initial version",
    "3838824187": "Remove swear-words from the description."
  },
  "current-version": 3838824143
}

Note that you have to use a canonical path for this request, i.e. one that uniquely identifies a single resource.
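To make the shape of that response concrete, here's a small Python sketch that turns it into a chronological list. Note that the version identifiers are strings when used as keys, but "current-version" is an integer. The function name is hypothetical.

```python
import json


def list_versions(payload):
    """Return (version, comment, is_current) tuples, oldest first."""
    history = json.loads(payload)
    current = history["current-version"]
    # Keys are timestamps stored as strings, so sort on their integer value.
    ordered = sorted(history["versions"].items(), key=lambda kv: int(kv[0]))
    return [
        (int(version), comment, int(version) == current)
        for version, comment in ordered
    ]
```

A UI could use the is_current flag to highlight whichever version is currently active.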

To labour the obvious:

The reason for using timestamps instead of a 1, 2, 3... sequence is mostly that it's simpler this way, especially when it comes to preventing duplicates.

If a client tries to create a new version which would duplicate a timestamp, the server will wait 1.1 seconds to guarantee a new timestamp, and then create the new version. Even this should be an unlikely scenario, as this isn't intended to be used as a time-series database.

More advanced use of paths

While this API isn't a full graph-query language, you can use it to explore an awful lot.

The second path raises the question of "Chapter Four of which book, exactly?" Well, you can extend the query a bit to answer that question:

It gets worse if you're on chapter 4 of three books at the same time. That's why the server includes the canonical path for each resource, when it answers GET requests.

Now, if you've read the conceptual overview, you'll remember that Restagraph was designed for flexibility in describing relationships, and you might be wondering just how

Thus, we separate what a thing is, from its role in the scheme of things. There are no customer resourcetypes, just people or organisations that have customer or vendor relationships to other people or organisations.

The next important element is that you can trace the path of relationships from one resource to another, starting from any resource, and following any number of relationships. Among other things, this means that the hierarchy of dependent resources can go just as deep as necessary to represent your subject area.

Combining these two things, you get the resourcetype/UID/relationship/resourcetype/UID pattern. If you have two people who are friends, and one has a dog, you could trace the path to the dog via this URI: /People/PersonOne/FRIEND_OF/People/PersonTwo/HAS_PETS/Dogs/Fifi
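To illustrate that repeating pattern, here's a Python sketch that labels each segment of a path by its position. This is purely a client-side reading aid based on the pattern just described; it doesn't check anything against the schema, and parse_path is a hypothetical name.

```python
# The three roles repeat in this order along the path.
KINDS = ("resourcetype", "uid", "relationship")


def parse_path(path):
    """Label each segment of a path by its position in the pattern."""
    segments = [segment for segment in path.split("/") if segment]
    return [
        (KINDS[index % 3], segment)
        for index, segment in enumerate(segments)
    ]
```

Applied to the dog example, the last two entries come out as ("resourcetype", "Dogs") and ("uid", "Fifi").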

Why is the resourcetype always in the path, even if that relationship could only lead to one resourcetype?

POST

This is next, because GET isn't much use if you haven't created something to, well, get.

However, this method is used for more than just creating one kind of thing.

POST: Create a resource

POST /raw/v1/<resource-type>/

Mandatory parameters:

Optional parameters:

URL-encoding is recommended for anything non-trivial.

Example:

curl -X POST -d 'uid=Albert_Schweitzer' --data-urlencode 'notes=An Alsatian polymath: theologian, organist, musicologist, writer, humanitarian, philosopher and physician.' http://192.0.2.1:4950/raw/v1/People

In this form, the UID must be unique for each resource-type. That is, if you define a routers resource and a switches resource, no two routers can have the same UID, but a router and a switch can. Bear this in mind when designing your schema.

UIDs must also be URL-safe, so they're restricted to the set of "unreserved characters" from section 2.3 of RFC 3986: this is the unaccented Latin alphabet (a-z and A-Z), digits 0-9, and the four non-alphanumeric characters -, _, . and ~.
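If you want to check a UID on the client side before sending it, that character set translates directly into a regular expression. This Python sketch only covers the character rule stated above; whatever else the server checks is outside its scope, and is_valid_uid is a hypothetical name.

```python
import re

# RFC 3986 section 2.3 "unreserved" characters: letters, digits, - . _ ~
UID_PATTERN = re.compile(r"[A-Za-z0-9._~-]+")


def is_valid_uid(uid):
    """Return True if the UID is non-empty and uses only unreserved characters."""
    return UID_PATTERN.fullmatch(uid) is not None
```

So "Albert_Schweitzer" and "eth0" pass, while anything containing a space or an accented letter does not.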

On success, the server returns a status code of 201, and the URI for the newly-created resource, e.g. /People/Albert_Schweitzer.

POST: Create a dependent resource

This works the same way as with primary resources, except that you append the dependent relationship and resourcetype to an existing parent resource, e.g.:

curl -X POST -d 'uid=Basement' http://192.0.2.1:4950/raw/v1/Buildings/Xenon_Base/CONTAINS/Floors

UIDs for dependent resources must be unique within each parent resource, but are not required to be globally unique the way that primary resources are. For example, any number of network devices are likely to have an interface named eth0.

You can create resources that depend from other dependent resources, with no inbuilt limit to the depth. The only restriction is that you must use the canonical path to the resourcetype.

POST: Create a relationship from one resource to another

These are always directional, mostly due to the way Neo4j works.

POST /raw/v1/<path/to/source>

Mandatory parameters:

Both paths must be canonical, to ensure that both the source and destination resources can be positively and uniquely identified.

Return values:

POST: Move a dependent resource from one parent to another

POST /raw/v1/<path/to/dependent/resource>

Mandatory parameters:

Note that the new parent must be a valid parent resourcetype for the child's type, and the new relationship must also be a valid dependent relationship from parent to child.

PUT: Update one or more attributes of a resource

Update resource-specific attributes

PUT /raw/v1/<path>

This creates a new version of the resource, containing the updated set of attributes. If you supply the attribute "RGversioncomment", this will be recorded along with the version identifier. If you use this to leave a note about what you changed, the version history will make much more sense when you view it later.

The payload must be supplied in the request body, POST-style. This is mainly to get past the 1024-byte limit for GET-style parameters, which is a little short for something like a wiki page. For context, one of the tests for the database driver involves the entire text of the novel Dracula.
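As a sketch of what such a request might look like from Python's standard library, the following builds (but doesn't send) a PUT with the attributes in the body. It assumes the body is form-encoded in the same way as the curl -d examples elsewhere in this document, and that the server is at the address used in those examples; update_request is a hypothetical name.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: the same server address used in the curl examples.
BASE_URL = "http://192.0.2.1:4950/raw/v1"


def update_request(canonical_path, attributes, comment=None):
    """Build a PUT request carrying the attributes in the request body."""
    fields = dict(attributes)
    if comment is not None:
        # Recorded alongside the new version identifier.
        fields["RGversioncomment"] = comment
    # Assumption: form-encoded body, as produced by curl -d.
    body = urlencode(fields).encode("utf-8")
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return Request(
        BASE_URL + canonical_path,
        data=body,
        headers=headers,
        method="PUT",
    )
```

Passing the result to urllib.request.urlopen, along with the session cookie, would send it.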

This always returns a status code of 204 (no content) on success.

Although section 4.3.4 of RFC 7231 states that 201 must be returned "[i]f the target resource does not have a current representation and the PUT successfully creates one," this API provides for updating multiple attributes in a single request, making it entirely possible to create, update and delete attributes in a single transaction. It seems like a backward step to restrict clients to updating a single attribute per request, so I'm making the counter-argument that unpopulated attributes have the de facto representation of Null, so technically there aren't any valid resources lacking representation in the context of this method.

There's currently no way to delete the value of an attribute, but I do plan to add this feature.

Note that you need to use the canonical path for this or any other request that makes a change, i.e. the RGcanonicalpath attribute returned by GET requests, to unambiguously identify the resource in question.

Change the UID of a resource

This is a separate operation from changing other attributes, because it's the only user-modifiable attribute that is not subject to versioning.

PUT /raw/v1/<path/to/resource>?uid=<new-uid>

DELETE: Remove something

Remove a resource

DELETE /raw/v1/<path>

Returns 204 (no content) on success.

If the parameter recursive=true is supplied, all resources that depend on this one will also be deleted, from the bottom up. That parameter is accepted in both GET-style (within the URL) and POST-style (within the request body) forms.

If the parameter yoink=true is supplied, the resource's representation will be returned in the body of the reply in the same manner as a GET request.

This operation applies equally to primary and dependent resources, so you can recursively delete a dependent resource and all those that depend on it, should the occasion require it.
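For illustration, here's a Python sketch that builds (without sending) a DELETE request with those two options supplied GET-style. The text above only states that recursive is accepted in the URL; placing yoink there as well is my assumption, as are the server address and the delete_request name.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: the same server address used in the curl examples.
BASE_URL = "http://192.0.2.1:4950/raw/v1"


def delete_request(canonical_path, recursive=False, yoink=False):
    """Build a DELETE request, optionally recursive and/or returning the body."""
    params = {}
    if recursive:
        params["recursive"] = "true"
    if yoink:
        # Assumption: yoink is accepted in the URL, like recursive.
        params["yoink"] = "true"
    url = BASE_URL + canonical_path
    if params:
        url += "?" + urlencode(params)
    return Request(url, method="DELETE")
```

With neither option set, the URL is just the canonical path, and a successful reply is an empty 204.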

Remove a relationship to another object

DELETE /raw/v1/<resource-type>/<Unique ID>/<relationship>/

Mandatory parameter: 'target=/<resource-type>/<Unique ID>'

The reason for doing it this way is that it's the only way to distinguish between deleting the relationship vs deleting the resource at the far end of it.