Graphs for Identity


Today’s hyper-connected society is witnessing an explosive increase in networked services and resources. This is especially true of the Internet of Things (IoT) tsunami hitting our digital shores these days. Are current identity and access management (IAM) systems up to the task of securing the billions of new devices expected to come online by 2020?

This is a legitimate question to ask for two main reasons: volume and complexity.

A Question of Volume

Given the sheer number of things coming online, we need the ability to manage billions of new identities and their defining attributes and contexts. This includes device-related data such as:

  • owner(s)
  • other identities that may or may not have access to device data or functions
  • other services that manage telemetry data to or from devices

How many authentications and authorizations per second will a given IAM system have to handle when billions of things call home from all corners of the world? These are real use cases that we, at Nulli, have started to see with our global customers active in this field. We’re entering the realm of big data, where NoSQL backends make a lot of sense, including economic sense.

A Question of Complexity

Access rights and the access policies that support them are becoming more and more intricate for all these new connected things. Access policies for any given device are now determined not only by business requirements, but in many cases also by legal contracts. In such environments, defining a digital identity is not enough to satisfy the stated requirements. It’s becoming more and more necessary to evaluate the relationships that devices have with one another as well as with people and processes.

Furthermore, these relationships no longer span only the technical realm; they now also involve legal teams, human resources, finance, sales and marketing, to name a few. Who legally owns the device versus who owns the data collected or disseminated by the device? If the device is sold or the domain the device resides in is sold, does ownership of the device transfer with the sale? What about data collected under the previous owner? Who can approve access to it and under what conditions? Will the geolocation of the device determine access policies? Does the entity trying to access it have the right license? Is it a customer or a technician? Does the technician have the right qualifications to service the device? These are fascinating questions that cannot be overlooked. Additionally, relationships are central to the field of identity relationship management (IRM) and the work being done within the IRM Working Group at Kantara, which promises to encompass domains beyond the IoT as well.

Relationships Matter

In any case, new problems require new types of solutions and new technology stacks. Traditional SQL or LDAP backend systems have a difficult time modeling or representing these requirements and will crumble under close-to-real-time IoT environments. Nulli has seen that IoT IAM projects usually depend on modeling relationships. As a result, graph databases have proven to be the logical backend solution to support these relationship-based problems. Graph databases can handle the immense volume of relationship data while addressing the latency challenges presented by LDAP and SQL.

Furthermore, graph databases support full-fledged query languages. A few are available, but Nulli has settled on Cypher because it’s open source, has the support of popular graph databases, is relatively simple to learn and has worked to address our problems.

In this paradigm, graphs represent the access policies themselves, and we can ask access questions of this model through Cypher. But there can be many access policies, and these can also change over time.
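To make the idea of "asking an access question of a graph" concrete, here is a minimal sketch in plain Python. It is not Nulli's implementation: the node names, the relationship types (HOLDS, IS_MODEL, REQUIRES) and the policy itself are assumptions invented for illustration, and a tiny in-memory structure stands in for a real graph database. The comment shows roughly how the same question might be phrased in Cypher under those assumed labels.

```python
from collections import defaultdict


class PolicyGraph:
    """Tiny in-memory directed graph with typed edges (illustration only)."""

    def __init__(self):
        # source node -> list of (relationship type, target node)
        self._edges = defaultdict(list)

    def relate(self, source, rel_type, target):
        self._edges[source].append((rel_type, target))

    def related(self, source, rel_type):
        return [t for (r, t) in self._edges.get(source, []) if r == rel_type]


def can_service(graph, person, device):
    """Hypothetical policy: a person may service a device if they hold at
    least one qualification that the device's model requires.

    A comparable Cypher query, assuming these labels and properties, could be:
      MATCH (p:Person {id: $person})-[:HOLDS]->(q:Qualification)
            <-[:REQUIRES]-(:Model)<-[:IS_MODEL]-(d:Device {id: $device})
      RETURN count(q) > 0 AS allowed
    """
    held = set(graph.related(person, "HOLDS"))
    for model in graph.related(device, "IS_MODEL"):
        if held & set(graph.related(model, "REQUIRES")):
            return True
    return False


# Example data (all names are made up).
graph = PolicyGraph()
graph.relate("alice", "HOLDS", "hvac-cert")
graph.relate("thermostat-42", "IS_MODEL", "T1000")
graph.relate("T1000", "REQUIRES", "hvac-cert")

alice_allowed = can_service(graph, "alice", "thermostat-42")
bob_allowed = can_service(graph, "bob", "thermostat-42")
```

The point of the sketch is that the policy lives in the relationships: revoking Alice's qualification, or changing what the model requires, is an edge update rather than a code change.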

The access questions we ask today may not be relevant tomorrow. After all, laws change and business models evolve. We need to be able to future-proof our access management system in such a way that future policies have minimal impact on the array of users, apps, services and things that rely on these devices. And here we encounter another limitation of the traditional technology paradigm in widespread use today: REST APIs. A typical organization needs to write tens, hundreds or maybe thousands of RESTful APIs in order to expose its services, and also to protect them. New functionality may require a brand new API or changes to existing ones, which then impacts all the systems and components that rely on the changed APIs. Wouldn’t it be nice to have only one endpoint that could service all possible requests, now and in the future?

Well, that’s the promise of a relatively new paradigm and specification called GraphQL. Originally developed by Facebook in 2012, but released in 2015, it’s now a full-fledged query language. GraphQL isn’t actually tied to any specific backend data store, but we found that it’s well adapted to graph databases (which was also Facebook’s original intent).

A GraphQL deployment involves a client, a server, and a backend model/data store. The server uses definitions of the data modeled in the backend data store, and these definitions constitute the schema of that data. The schema also contains the definitions of the queries that can be run against the data store. These queries are backed by coded functions (commonly called resolvers) that can run anything on the backend, such as Cypher queries. As a result, the server exposes a single endpoint, typically over HTTP, that can service any query the schema defines.

A simple GraphQL query may look like this:

{
  user(id: "01") {
    name
  }
}

To which a server may return this result:

{
  "data": {
    "user": {
      "name": "Joe Doe"
    }
  }
}

The GraphQL client runs in the user-agent, issues GraphQL queries and makes sense of the responses sent by the server.
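As a rough sketch of what such a client does on the wire, the snippet below builds the HTTP request for a query and unpacks a response. It assumes the common convention of POSTing a JSON body with `query` and `variables` members and receiving a JSON document with `data` and, on failure, `errors`. The URL is a placeholder and the helper names are made up for this example. Nothing is actually sent over the network.

```python
import json
import urllib.request

GRAPHQL_URL = "https://example.com/graphql"  # hypothetical endpoint


def build_graphql_request(query, variables=None, url=GRAPHQL_URL):
    """Build (but do not send) a POST request carrying a GraphQL query."""
    payload = {"query": query}
    if variables:
        payload["variables"] = variables
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_data(response_text):
    """Return the 'data' member, raising if the server reported errors."""
    document = json.loads(response_text)
    if document.get("errors"):
        raise RuntimeError(document["errors"])
    return document.get("data")


# The same user lookup, with the id passed as a variable.
QUERY = "query ($id: ID!) { user(id: $id) { name } }"
request = build_graphql_request(QUERY, {"id": "01"})

# A canned response standing in for what a server might send back.
sample_response = '{"data": {"user": {"name": "Joe Doe"}}}'
user = extract_data(sample_response)["user"]
```

Sending the request would be a matter of passing it to `urllib.request.urlopen`. In practice a client library such as Apollo Client handles this, along with caching and UI integration.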

An advantage of this model is that one can easily modify the server schema (which is akin to modifying a configuration file) and run queries on it right away without deploying new REST endpoints or modifying existing APIs. Modeling our access policies in graphs and implementing the access requests through GraphQL has given us and our clients a lot of flexibility.
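The "change the schema, keep the endpoint" idea can be illustrated with a toy dispatcher. This is deliberately not a GraphQL implementation (a real server parses the query language and validates it against a typed schema); it only mimics the shape of the pattern. The request format, the `resolver` decorator and the sample data are all invented for this sketch.

```python
# Stand-in data; in the setup described above a resolver could instead run
# a Cypher query against the graph database.
USERS = {"01": {"name": "Joe Doe", "email": "joe@example.com"}}
DEVICES = {"d1": {"owner": "01", "model": "T1000"}}

# The toy "schema": field name -> resolver function.
RESOLVERS = {}


def resolver(field):
    """Decorator that registers a function as the resolver for a field."""
    def register(fn):
        RESOLVERS[field] = fn
        return fn
    return register


@resolver("user")
def resolve_user(args):
    return USERS.get(args["id"])


def endpoint(request):
    """Single entry point for every request.

    Expects {'field': str, 'args': dict, 'select': [str, ...]}.
    """
    field = request["field"]
    fn = RESOLVERS.get(field)
    if fn is None:
        return {"errors": [{"message": "Unknown field: " + field}]}
    result = fn(request.get("args", {}))
    if result is None:
        return {"data": {field: None}}
    # Return only the attributes the caller asked for.
    return {"data": {field: {key: result[key] for key in request["select"]}}}


first = endpoint({"field": "user", "args": {"id": "01"}, "select": ["name"]})


# Later: a new capability is added by registering one more resolver.
# endpoint() itself, and every existing caller, stay unchanged.
@resolver("device")
def resolve_device(args):
    return DEVICES.get(args["id"])


second = endpoint({"field": "device", "args": {"id": "d1"}, "select": ["owner"]})
```

Existing clients keep sending the same `user` request to the same entry point, while new clients can start asking for `device` as soon as its resolver is registered.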

The task of shifting to this new paradigm may not be all that daunting, as all the moving pieces have already been consolidated into a technology stack: GRANDstack. Consisting of the Neo4j graph database, Apollo for the GraphQL server and client, and React for the client/front-end UI layer, the tools are there and ready to use.

And as always, you just need the right tools for the task at hand.

 

By Alex Babeanu
Senior Identity Specialist, Nulli
