SCIM in a nutshell

System for Cross-domain Identity Management (SCIM) is a standard set of REST API routes you can add to your API, enabling an SSO provider to manage users in your application.

There are two SCIM RFCs. They work together to define a comprehensive SCIM implementation. Despite the breadth, the RFCs are surprisingly readable if you are a software developer with a few years of experience in web tech.

The RFCs are strongly recommended reading. Or, honestly, just skip this guide and go straight to the sources:

SCIM Protocol – RFC 7644
SCIM Data Schemas – RFC 7643

Why SCIM? Say you have a customer that’s a large enterprise. Rather than manually adding and removing users in your app, and drifting out of sync with their internal directory, your customer can use SCIM. SCIM syncs users from their own directory into your app. Your customer might use Microsoft Active Directory (AD) and set up SCIM so that all of their AD users are synced into your app. That way, your customer doesn’t have to figure out your user management system when they hire a new employee. And when an employee leaves for a new job, nobody has to find their way into your system to remove them.

SCIM’s “de-provisioning” of users helps your customer avoid a potentially large security hole. If a disgruntled employee leaves the company, they’ll almost certainly get their email disabled. But third-party apps, such as yours, could be forgotten. Enter SCIM: apps that support it can automatically remove inactive employees, saving time and reducing risk.

This overview of SCIM is written for a developer implementing SCIM user provisioning and de-provisioning in their SaaS application.

SCIM vs SAML

SCIM is different from Single Sign-On protocols like SAML. Both involve a third-party identity provider. SAML lets users log into your app through the identity provider. SCIM lets the identity provider add and remove users in your app.

SCIM Auth

A third-party identity provider can authenticate to your SCIM endpoints using a variety of methods, which the RFC says are outside its scope. In practice this will be OAuth, HTTP basic auth, or some sort of bearer token. Your server should return the WWW-Authenticate header, describing the authentication methods available, when it rejects a request with 401.
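As a rough illustration of the bearer-token case, the check might look like the sketch below. The names `EXPECTED_TOKEN` and `check_bearer_auth`, and the `realm` value, are made up for this example; a real server would load its secret from configuration and plug the check into its web framework.

```python
import hmac

# Hypothetical shared secret; in practice this comes from a secret store.
EXPECTED_TOKEN = "example-scim-token"


def check_bearer_auth(headers: dict) -> tuple:
    """Return (status, response_headers) for a request's Authorization header."""
    scheme, _, token = headers.get("Authorization", "").partition(" ")
    # compare_digest avoids leaking how many leading characters matched.
    token_ok = hmac.compare_digest(token.encode("utf-8"), EXPECTED_TOKEN.encode("utf-8"))
    if scheme.lower() == "bearer" and token_ok:
        return 200, {}
    # On failure, tell the client which scheme it is expected to use.
    return 401, {"WWW-Authenticate": 'Bearer realm="scim"'}
```

A request without credentials gets a 401 plus the header, which gives the identity provider enough information to retry with the right scheme.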

SCIM API Expected Endpoints

| Endpoint | Methods | Description |
|---|---|---|
| /Users, /Users/{id}, /Users/.search | GET, POST, PUT, PATCH, DELETE | Manage users. Search is POST only. |
| /Groups, /Groups/{id}, /Groups/.search | GET, POST, PUT, PATCH, DELETE | Manage groups. Search is POST only. |
| /Me | GET, POST, PUT, PATCH, DELETE | Alias for the User resource of the currently authenticated user |
| /ServiceProviderConfig | GET | Service provider (your API server) config information |
| /ResourceTypes | GET | List supported resource types |
| /Schemas | GET | List supported schemas |
| /Bulk | POST | Perform many updates at once, to one or more resources |

MIME type for SCIM

The MIME type for SCIM is application/scim+json, which can be used in the Accept and Content-Type headers. In other words, an identity provider will speak SCIM JSON to your server, and you must speak SCIM JSON back.

Response Error JSON

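In addition to normal HTTP status codes, your server will need to repeat the status code in the JSON body, per the RFC. RFC 7644 expresses status as a JSON string and tags the body with the Error schema URN. Two other fields, detail and scimType, are recommended.

{
  "schemas": ["urn:ietf:params:scim:api:messages:2.0:Error"],
  "status": "409",
  "detail": "this is a user readable message - user already exists",
  "scimType": "uniqueness"
}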

scimType applies to 400 and 409 status codes, and may be required in your API’s response.

| scimType error value | When to use it |
|---|---|
| invalidFilter | Bad filter syntax on a search, or a bad filter in a PATCH path. |
| tooMany | The server doesn’t want to return that many results, as it would be too many. It may be too resource intensive to process. |
| uniqueness | Creating or updating a resource cannot be done because there is already a resource with a value being given which is supposed to be unique. |
| mutability | Updating a resource cannot be done because a field being modified is supposed to be immutable (read only, cannot be changed). |
| invalidSyntax | The request body had bad syntax or did not conform to the request schema, for example in a search or bulk request. |
| invalidPath | The path attribute of a PATCH operation was invalid or malformed. |
| noTarget | The path of a PATCH operation does not match any attribute or value that could be operated on. |
| invalidVers | Wrong SCIM protocol version, or a version the API does not support. |
| sensitive | The request contained personal or private information in the URI (URL), and is not allowed per the SCIM spec. For example, a server may disallow filtering by name in the URL querystring as this is personal information. |
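A small helper keeps error bodies consistent across handlers. The sketch below is one possible shape, not a prescribed one; the function name `scim_error` is invented for illustration. It follows the examples in RFC 7644, which carry status as a JSON string and include the Error schema URN.

```python
import json
from typing import Optional

ERROR_SCHEMA = "urn:ietf:params:scim:api:messages:2.0:Error"


def scim_error(status: int, detail: str, scim_type: Optional[str] = None) -> str:
    """Serialize a SCIM error body; scimType is only added when provided."""
    body = {
        "schemas": [ERROR_SCHEMA],
        "status": str(status),  # the RFC examples use a string, not a number
        "detail": detail,
    }
    if scim_type is not None:
        body["scimType"] = scim_type
    return json.dumps(body)
```

For example, `scim_error(409, "user already exists", "uniqueness")` produces the body for a duplicate user, to be sent with HTTP status 409 and a Content-Type of application/scim+json.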

Uniqueness

A resource ID need not be globally unique, but it must be unique in combination with the externalId. The externalId is controlled by the identity provider. If you are using globally unique IDs for users, this should not be a problem. However, if you have constraints such as globally unique email addresses, and the SCIM partner wants those unique only per externalId, additional work may be required to modify your SaaS schema.

Creating and Modifying Resources

When the identity provider creates a user or a group, it sends a POST request to /Users or /Groups. If the user or group already exists, you must reply with status code 409 and scimType: uniqueness.

Create user request:

{
     "schemas":["urn:ietf:params:scim:schemas:core:2.0:User"],
     "userName":"bjensen",
     "externalId":"bjensen",
     "name":{
       "formatted":"Ms. Barbara J Jensen III",
       "familyName":"Jensen",
       "givenName":"Barbara"
     }
   }

An example response from your server:

{
     "schemas":["urn:ietf:params:scim:schemas:core:2.0:User"],
     "id":"2819c223-7f76-453a-919d-413861904646",
     "meta":{
       "resourceType":"User",
       "created":"2011-08-01T21:32:44.882Z",
       "lastModified":"2011-08-01T21:32:44.882Z",
       "location":
   "https://example.com/v2/Users/2819c223-7f76-453a-919d-413861904646",
       "version":"W\/\"e180ee84f0671b1\""
     },
     "name":{
       "formatted":"Ms. Barbara J Jensen III",
       "familyName":"Jensen",
       "givenName":"Barbara"
     },
     "userName":"bjensen",
     "emails":[
       {
           "value":"bjensen@example.com"
       },
       {
           "value":"babs@jensen.org"
       }
     ]
   }

Creating via POST, and updating via PUT (replace resource) or PATCH (modify only passed fields), must return the entire current resource and include an ETag header.

Creates must return 201 on success. If the create happens during a Bulk operation (below), that operation will carry a status of 201 in the response body, while the bulk request itself returns 200.
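To make the create flow concrete, here is a minimal sketch of a POST /Users handler backed by an in-memory dictionary. Everything named here (`USERS`, `create_user`, the base URL, the fixed version string) is an assumption for illustration; a real implementation would use your database and your framework's request and response objects.

```python
import uuid
from datetime import datetime, timezone

USERS = {}  # hypothetical in-memory store: id -> user resource


def create_user(payload: dict, base_url: str = "https://example.com/v2") -> tuple:
    """Return (status, headers, body) for a create-user request."""
    wanted = payload["userName"].lower()  # userName is compared case-insensitively
    if any(u["userName"].lower() == wanted for u in USERS.values()):
        return 409, {}, {
            "schemas": ["urn:ietf:params:scim:api:messages:2.0:Error"],
            "status": "409",
            "scimType": "uniqueness",
            "detail": "userName already exists",
        }
    user_id = str(uuid.uuid4())
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    location = f"{base_url}/Users/{user_id}"
    version = 'W/"1"'  # placeholder weak ETag; derive it from content in practice
    user = {
        **payload,
        "id": user_id,
        "meta": {
            "resourceType": "User",
            "created": now,
            "lastModified": now,
            "location": location,
            "version": version,
        },
    }
    USERS[user_id] = user
    # 201 plus the full resource, with Location and ETag headers.
    return 201, {"Location": location, "ETag": version}, user
```

Calling it twice with the same userName returns 201 the first time and the 409 uniqueness error the second time.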

Minimum Required Fields

Even if you are looking to do the bare minimum amount of integration work with SCIM, you must still include the required resource fields from RFC 7643.

The generally required format for any resource is:

{
  "id": "case-sensitive globally unique or unique when combined with externalId",
  "externalId": "case-sensitive set by provisioning client",
  "meta": {
    "resourceType": "User or Group or other resource name",
    "created": "when created on the responding server, format ISO8601-UTC timestamp string is probably best (2008-01-23T04:56:22Z)",
    "lastModified": "last modified by responding server, in same format as created",
    "location": "full URI of this specific resource - as if it were a GET request",
    "version": "the weak (prefixed with W/) ETag (entity-tag) of this resource, if it were to be fetched individually with a GET request"
  }
}
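One way to fill in the meta block is to derive the weak ETag from a hash of the resource content. The sketch below assumes that approach; the helper names `weak_etag` and `attach_meta` are invented here, and building the location by appending "s" to the resource type is a shortcut that only works for User and Group.

```python
import hashlib
import json


def weak_etag(resource: dict) -> str:
    """Derive a weak ETag from the resource content, ignoring its meta block."""
    content = {k: v for k, v in resource.items() if k != "meta"}
    canonical = json.dumps(content, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return f'W/"{digest[:16]}"'


def attach_meta(resource: dict, resource_type: str, base_url: str,
                created: str, last_modified: str) -> dict:
    """Return a copy of the resource with the common meta attributes filled in."""
    meta = {
        "resourceType": resource_type,
        "created": created,
        "lastModified": last_modified,
        # Naive pluralization: "User" -> "/Users", "Group" -> "/Groups".
        "location": f"{base_url}/{resource_type}s/{resource['id']}",
        "version": weak_etag(resource),
    }
    return {**resource, "meta": meta}
```

Because the hash skips meta, the version stays stable until an actual attribute changes, which is what a client comparing ETags would expect.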

Minimum Recommended Fields for User Resource

Only schemas and userName are required by the spec (in addition to the general resource properties above). However, omitting all names and email addresses may cause certain clients or servers to have undefined behavior. Thus it is strongly recommended to include either displayName or name, and emails.

The boolean property active is also not required, but for purposes of deprovisioning a user, may be expected by the identity provider. A user with "active": false should not be allowed to log in.

employeeNumber may be expected if the identity provider implements the schema extension “Enterprise” User urn:ietf:params:scim:schemas:extension:enterprise:2.0:User.

This is the minimum recommended set of fields, in addition to the resource fields above. It is not identical to what RFC 7643 minimally requires.

{
  "schemas": ["urn:ietf:params:scim:schemas:core:2.0:User"],
  "userName": "unique (may be only by externalId) but case-INSENSITIVE user name for authentication; could be an email address and under SAML SSO should be equal to the NameID",
  "displayName": "name of user suitable for display to users",
  "name": {
    "formatted": "if not passing first, last, and other parts of name",
    "familyName": "aka last name, if not passing formatted",
    "givenName": "aka first name, if not passing formatted"
  },
  "active": true,
  "emails": [{
    "value": "email address, canonicalized according to RFC5321",
    "type": "optional; 'work', 'home', or 'other' ('home' and 'other' may be less desirable)",
    "primary": true
  }],
  "urn:ietf:params:scim:schemas:extension:enterprise:2.0:User": {
    "employeeNumber": "include this object only if schemas also lists this extension URN; alphanumeric value unique to the externalId"
  }
}

Minimum response to /ServiceProviderConfig request

There are some assumptions below about what you have implemented in your server.

Even if a feature is disabled via the supported boolean, RFC 7643 still requires returning additional information about the feature. A few other notes:

  • maxPayloadSize is in bytes, integer (no scientific notation). The example shows 1 MB.
  • etag is required per the SCIM schema RFC for many responses, but can be disabled in ServiceProviderConfig, so we include it as enabled below.
  • authenticationSchemes is marked REQUIRED in RFC 7643, and the consuming client may depend on it because they may not have implemented the code to infer the scheme from a WWW-Authenticate header. The fields listed in the example are all required.

Assuming you are doing the minimum necessary and have many features omitted, here is an example:

{
  "schemas": ["urn:ietf:params:scim:schemas:core:2.0:ServiceProviderConfig"],
  "patch": { "supported": false },
  "bulk": { "supported": false, "maxOperations": 1, "maxPayloadSize": 1000000 },
  "filter": { "supported": false, "maxResults": 1 },
  "changePassword": { "supported": false },
  "sort": { "supported": false },
  "etag": { "supported": true },
  "authenticationSchemes": [{
    "type": "required - one of: 'oauth', 'oauth2', 'oauthbearertoken', 'httpbasic', 'httpdigest'",
    "name": "required - friendlier name of the value at type, like 'HTTP Basic Auth'",
    "description": "required - any additional information for an implementer to know"
  }]
}

Query Features

Sorting, pagination, attributes, and filters are encouraged, but optional.

Sorting queries may include a sortBy and sortOrder field. Default sorting must be ascending. Client example:
sortBy=userName&sortOrder=descending

Pagination queries include a startIndex integer, the 1-based position where results must start, and a count integer, the maximum number of results to return. Pagination does not require any locking or cursors to be stored. Clients must handle situations such as results being added or removed while they are paginating.
Client example: ?startIndex=1&count=10
Server response example:

{
     "totalResults":100,
     "itemsPerPage":10,
     "startIndex":1,
     "schemas":["urn:ietf:params:scim:api:messages:2.0:ListResponse"],
     "Resources":[{
       ...
     }]
   }
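The pagination arithmetic is easy to get off by one, since startIndex is 1-based. The sketch below slices an already-fetched list; the function name `paginate` and the default count of 100 are assumptions for this example. Per RFC 7644, a startIndex below 1 is treated as 1 and a negative count as 0.

```python
LIST_RESPONSE = "urn:ietf:params:scim:api:messages:2.0:ListResponse"


def paginate(resources: list, start_index: int = 1, count: int = 100) -> dict:
    """Build a ListResponse for one page of an in-memory result list."""
    start = max(start_index, 1)  # values below 1 are interpreted as 1
    size = max(count, 0)         # negative values are interpreted as 0
    page = resources[start - 1:start - 1 + size]
    return {
        "schemas": [LIST_RESPONSE],
        "totalResults": len(resources),  # total matches, not just this page
        "itemsPerPage": len(page),
        "startIndex": start,
        "Resources": page,
    }
```

With 25 results, `paginate(results, start_index=21, count=10)` returns the last five items, with totalResults still reported as 25.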

Attributes can specify which specific fields should be returned. Likewise attributes may be excluded from the default set. For a request to GET /Users/{id} the client may also include a querystring of ?attributes=displayName to only return display name. Or perhaps they do not want to see any of the name attributes, so they could pass ?excludedAttributes=name.
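A simplified projection for these two parameters could look like the following. It only handles top-level attribute names (no sub-attribute paths such as name.familyName), and it always keeps schemas and id, which is an assumption based on the response examples in RFC 7644. The function name `project` is hypothetical.

```python
ALWAYS_RETURNED = {"schemas", "id"}  # assumption: kept regardless of the request


def _names(param: str) -> set:
    """Split a comma-separated parameter into lowercase attribute names."""
    return {name.strip().lower() for name in param.split(",") if name.strip()}


def project(resource: dict, attributes: str = None,
            excluded_attributes: str = None) -> dict:
    """Apply ?attributes= or ?excludedAttributes= to a single resource."""
    if attributes:
        wanted = _names(attributes)
        return {k: v for k, v in resource.items()
                if k in ALWAYS_RETURNED or k.lower() in wanted}
    if excluded_attributes:
        unwanted = _names(excluded_attributes)
        return {k: v for k, v in resource.items()
                if k in ALWAYS_RETURNED or k.lower() not in unwanted}
    return dict(resource)
```

Attribute names are compared case-insensitively, so `?attributes=displayname` and `?attributes=displayName` behave the same.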

Filtering is encouraged, but optional. It uses short operators (mostly two characters) as a query search syntax that a client can pass to a resource request. The operators are:

| Operator | Meaning |
|---|---|
| eq | equal |
| ne | not equal |
| co | contains |
| sw | starts with |
| ew | ends with |
| pr | present; has value, is not null or empty |
| gt | greater than |
| ge | greater than or equal to |
| lt | less than |
| le | less than or equal to |
| and | logical AND linking two or more expressions |
| or | logical OR linking two or more expressions |
| not | logically inverts an expression |

Parentheses ( ) are used to group operations, and square brackets [ ] are used to filter the values of a multi-valued complex attribute, such as emails[type eq "work"].

Here are some example filters from RFC 7644:

filter=userName eq "bjensen"

filter=name.familyName co "O'Malley"

filter=userName sw "J"

filter=urn:ietf:params:scim:schemas:core:2.0:User:userName sw "J"

filter=title pr

filter=meta.lastModified gt "2011-05-13T04:42:34Z"

filter=meta.lastModified ge "2011-05-13T04:42:34Z"

filter=meta.lastModified lt "2011-05-13T04:42:34Z"

filter=meta.lastModified le "2011-05-13T04:42:34Z"

filter=title pr and userType eq "Employee"

filter=title pr or userType eq "Intern"

filter=
 schemas eq "urn:ietf:params:scim:schemas:extension:enterprise:2.0:User"

filter=userType eq "Employee" and (emails co "example.com" or
  emails.value co "example.org")

filter=userType ne "Employee" and not (emails co "example.com" or
  emails.value co "example.org")

filter=userType eq "Employee" and (emails.type eq "work")

filter=userType eq "Employee" and emails[type eq "work" and
  value co "@example.com"]

filter=emails[type eq "work" and value co "@example.com"] or
  ims[type eq "xmpp" and value co "@foo.com"]
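A full filter parser is a real project, but a deliberately tiny subset shows the idea. The sketch below handles only a single comparison such as `userName eq "bjensen"` or `title pr`, with no and/or/not, no brackets, no ordering operators, and case-sensitive attribute names. String comparison is case-insensitive, which assumes the attribute is not caseExact. The names `lookup` and `matches` are made up for this example.

```python
import re

# attribute path, operator, and an optional double-quoted operand
_SIMPLE_FILTER = re.compile(
    r'^\s*([\w.]+)\s+(eq|ne|co|sw|ew|pr)(?:\s+"(.*)")?\s*$', re.IGNORECASE
)


def lookup(resource: dict, path: str):
    """Follow a dotted path such as name.familyName; return None if absent."""
    value = resource
    for part in path.split("."):
        if not isinstance(value, dict):
            return None
        value = value.get(part)
    return value


def matches(resource: dict, filter_text: str) -> bool:
    """Evaluate one simple comparison; raise ValueError for anything else."""
    match = _SIMPLE_FILTER.match(filter_text)
    if match is None:
        raise ValueError("invalidFilter")
    path, op, operand = match.group(1), match.group(2).lower(), match.group(3)
    actual = lookup(resource, path)
    if op == "pr":
        if operand is not None:
            raise ValueError("invalidFilter")  # pr takes no operand
        return actual not in (None, "", [], {})
    if operand is None:
        raise ValueError("invalidFilter")  # every other operator needs one
    if not isinstance(actual, str):
        return op == "ne"  # a missing or non-string value only satisfies "ne"
    left, right = actual.lower(), operand.lower()
    if op == "eq":
        return left == right
    if op == "ne":
        return left != right
    if op == "co":
        return right in left
    if op == "sw":
        return left.startswith(right)
    return left.endswith(right)  # op == "ew"
```

A handler could catch the ValueError and reply with status 400 and scimType invalidFilter.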

Bulk

SCIM bulk operations are used to pass a variety of changes at once. It’s a way of stuffing what would be separate HTTP requests into a JSON array of objects. Each of the Operations will have a method (like POST, PUT, etc.) and a path, similar to a normal HTTP request.

Bulk requests will return HTTP status code 200, but may carry a different status code for each of the operations.

Example bulk request:

POST /v2/Bulk
   Host: example.com
   Accept: application/scim+json
   Content-Type: application/scim+json
   Authorization: Bearer h480djs93hd8
   Content-Length: ...

   {
     "schemas": ["urn:ietf:params:scim:api:messages:2.0:BulkRequest"],
     "Operations": [
       {
         "method": "POST",
         "path": "/Groups",
         "bulkId": "qwerty",
         "data": {
           "schemas": ["urn:ietf:params:scim:schemas:core:2.0:Group"],
           "displayName": "Group A",
           "members": [
             {
               "type": "Group",
               "value": "bulkId:ytrewq"
             }
           ]
         }
       },
       {
         "method": "POST",
         "path": "/Groups",
         "bulkId": "ytrewq",
         "data": {
           "schemas": ["urn:ietf:params:scim:schemas:core:2.0:Group"],
           "displayName": "Group B",
           "members": [
             {
               "type": "Group",
               "value": "bulkId:qwerty"
             }
           ]
         }
       }
     ]
   }

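The bulkId is a transient identifier that exists only for the duration of the bulk request. It lets one operation refer to a resource that another operation in the same request is creating, as in "value": "bulkId:ytrewq" above, before the server has assigned a real id.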
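Because the two groups in the request reference each other, the references cannot be resolved by processing operations strictly in order. One possible approach, sketched below, is two passes: first allocate a server id for every POST that carries a bulkId, then substitute those ids wherever a `bulkId:` reference appears. The function name `resolve_bulk_ids` and the use of UUIDs as ids are assumptions for this example.

```python
import uuid


def resolve_bulk_ids(operations: list) -> tuple:
    """Return (bulkId -> new id, operations with bulkId references replaced)."""
    # Pass 1: allocate an id for every resource the request will create.
    ids = {
        op["bulkId"]: str(uuid.uuid4())
        for op in operations
        if op.get("method") == "POST" and "bulkId" in op
    }

    def resolve(value):
        # Pass 2 helper: walk the JSON structure and swap references for ids.
        if isinstance(value, str) and value.startswith("bulkId:"):
            key = value[len("bulkId:"):]
            if key not in ids:
                raise KeyError(f"unknown bulkId: {key}")
            return ids[key]
        if isinstance(value, dict):
            return {k: resolve(v) for k, v in value.items()}
        if isinstance(value, list):
            return [resolve(v) for v in value]
        return value

    resolved = [{**op, "data": resolve(op.get("data", {}))} for op in operations]
    return ids, resolved
```

After this step each operation's data contains only real ids, so the individual creates can be applied in any order.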

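The bulk response itself uses the urn:ietf:params:scim:api:messages:2.0:BulkResponse schema and lists a status for each operation. In RFC 7644, a subsequent GET request for the groups created above returns: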

{
     "schemas": ["urn:ietf:params:scim:api:messages:2.0:ListResponse"],
     "totalResults": 2,
     "Resources": [
       {
         "id": "c3a26dd3-27a0-4dec-a2ac-ce211e105f97",
         "schemas": ["urn:ietf:params:scim:schemas:core:2.0:Group"],
         "displayName": "Group A",
         "meta": {
           "resourceType": "Group",
           "created": "2011-08-01T18:29:49.793Z",
           "lastModified": "2011-08-01T18:29:51.135Z",
           "location":
   "https://example.com/v2/Groups/c3a26dd3-27a0-4dec-a2ac-ce211e105f97",
           "version": "W\/\"mvwNGaxB5SDq074p\""
         },
         "members": [
           {
             "value": "6c5bb468-14b2-4183-baf2-06d523e03bd3",
             "$ref":
   "https://example.com/v2/Groups/6c5bb468-14b2-4183-baf2-06d523e03bd3",
             "type": "Group"
           }
         ]
       },
       {
         "id": "6c5bb468-14b2-4183-baf2-06d523e03bd3",
         "schemas": ["urn:ietf:params:scim:schemas:core:2.0:Group"],
         "displayName": "Group B",
         "meta": {
           "resourceType": "Group",
           "created": "2011-08-01T18:29:50.873Z",
           "lastModified": "2011-08-01T18:29:50.873Z",
           "location":
   "https://example.com/v2/Groups/6c5bb468-14b2-4183-baf2-06d523e03bd3",
           "version": "W\/\"wGB85s2QJMjiNnuI\""
         },
         "members": [
           {
             "value": "c3a26dd3-27a0-4dec-a2ac-ce211e105f97",
             "$ref":
   "https://example.com/v2/Groups/c3a26dd3-27a0-4dec-a2ac-ce211e105f97",
             "type": "Group"
           }
         ]
       }
     ]
   }

Implementing a partial SCIM API

The best way to know which resources and routes are required for a compliant SCIM API is to read RFC 7644. However there are some shortcuts which may make it possible to implement fewer parts of SCIM without catastrophic results. Only through testing will you know.

Perhaps your SaaS does not have user groups. Perhaps a special user should not be deleted. Perhaps you are feeling too lazy to implement the complex filtering language. Whatever the reason, for parts of SCIM you are not implementing for one reason or another, the following strategies may help:

  • List only the few supported operations of your server on the schemas and/or service provider config responses
  • Return 501 Not Implemented
  • Return 403 Forbidden for routes which should be implemented as part of the spec, but are not…(this isn’t recommended, but possible to do)
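For the 501 strategy, a simple lookup table of what the server actually supports is enough to answer everything else uniformly. The sketch below assumes a hypothetical `SUPPORTED` table and function name `route_status`; the choice of which resources and methods to list is only an example.

```python
import json

# Hypothetical table of what this server implements: resource -> methods.
SUPPORTED = {
    "Users": {"GET", "POST", "PUT", "DELETE"},
    "ServiceProviderConfig": {"GET"},
}


def route_status(method: str, path: str) -> tuple:
    """Return (status, error_body_or_None) for a request before handling it."""
    resource = path.strip("/").split("/")[0]  # "/Users/123" -> "Users"
    if method.upper() in SUPPORTED.get(resource, set()):
        return 200, None
    body = {
        "schemas": ["urn:ietf:params:scim:api:messages:2.0:Error"],
        "status": "501",
        "detail": f"{method.upper()} {path} is not implemented",
    }
    return 501, json.dumps(body)
```

Keeping the table in one place also makes it easier to keep the ServiceProviderConfig response in line with what the routes really do.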