CogTL Integration Guide

The external cogtl.properties file

By default, CogTL is configured to work with a containerized version of MongoDB and default credentials, in a largely standalone manner. As soon as you need to customize its integration with other systems, however, you must use a properties file that is stored outside the container (so that it survives updates) and made accessible to the server through a Docker volume.

To use an external cogtl.properties file, you have to:

Create a volume linking a local path outside the container with a path inside the container. We recommend using the following path: /opt/jboss/cogtl/props.
Define an environment variable named cogtl.properties.file, containing the path to your cogtl.properties file as seen from inside the container.

For example, in pure "vanilla" Docker, you will run the CogTL server container with a command like:

docker run -d --name cogtl-server --volume <local_path_to_props_file>:/opt/jboss/cogtl/props --env cogtl.properties.file=/opt/jboss/cogtl/props/cogtl.properties cognitechs/cogtl-server

Permissions

On Linux, you must take care of the permissions of the properties folder located outside the container. The WildFly container of the CogTL server runs as the user "jboss", which has a well-known uid/gid (1000/1000). The mapped volume must therefore be accessible to a host user with the same uid/gid.

To create this user, use the following commands on the host machine:

groupadd -r cogtl -g 1000
useradd -u 1000 -r -g cogtl -s /sbin/nologin -c "CogTL server" cogtl

Then create the directory (or change the owner of the existing directory):

chown -R cogtl:cogtl <local_path_to_props_file>
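
As a quick sanity check before starting the container, the ownership of the mapped directory can be verified from the host. The following is a minimal Python sketch: the helper name owned_by is hypothetical, and it only compares the numeric owner and group of the directory with the expected uid/gid (1000/1000), without inspecting the permission bits.

```python
import os

EXPECTED_UID = 1000  # uid of the container user, as stated above
EXPECTED_GID = 1000  # gid of the container user, as stated above


def owned_by(path: str, uid: int = EXPECTED_UID, gid: int = EXPECTED_GID) -> bool:
    """Return True if `path` is owned by the given numeric uid and gid."""
    info = os.stat(path)
    return info.st_uid == uid and info.st_gid == gid
```

Calling owned_by with the local path of the properties folder returns False when the chown step above has not been applied yet.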

Usage of secrets for sensitive properties

Sensitive properties (e.g. database/trust store credentials) should be carefully protected against intruders or - in some enterprises - technical administrators.

To achieve this, CogTL lets you add a level of indirection when retrieving property values. It requires the following elements:

Define the secrets storage strategy by setting the environment variable secrets.storage to docker or vault (vault is available from version 1.4.0).
Update your sensitive properties, by setting a value of the following form:

cogtl.sensitive.property=${secret:<secret_key>}

If using docker secrets storage strategy, set the Docker secret with the key secret_key, containing the sensitive value.
If using vault secrets storage strategy (v1.4.0+ of CogTL), refer to the Vault configuration below.
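
To make the indirection concrete, here is a minimal Python sketch of how a ${secret:<secret_key>} reference could be resolved with the docker strategy. It is an illustration only, not CogTL's actual implementation. It relies on the fact that Docker exposes each secret granted to a service as a file under /run/secrets/<secret_key> in Linux containers; the function name resolve_docker_secret is hypothetical.

```python
import re
from pathlib import Path

# Matches a whole value of the form "${secret:<secret_key>}" and captures the key.
SECRET_REF = re.compile(r"^\$\{secret:([^}:]+)\}$")


def resolve_docker_secret(value: str, secrets_dir: str = "/run/secrets") -> str:
    """Return the secret's content if `value` is a secret reference, else `value`."""
    match = SECRET_REF.match(value)
    if match is None:
        return value  # plain property value, nothing to resolve
    # The file content is returned verbatim, including any trailing newline.
    return (Path(secrets_dir) / match.group(1)).read_text()
```

Because the file content is read verbatim in this sketch, a trailing newline or carriage return stored in the secret would become part of the property value.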

Creation of a Docker secret

If you're on Windows, you must use a source file containing the value of the secret (unfortunately, creating it inline on Windows adds spaces and carriage returns to the value):

docker secret create <secret_key> source.txt

On Linux, you can do it inline:

echo -n "<secret_value>" | docker secret create <secret_key> -

Use Hashicorp Vault for secrets storage

With a secrets storage of type vault, you have to set up how CogTL connects to the Vault to get the secrets. First, you must define the Vault address in the cogtl.properties file:

vault.address: The address of the Vault.

Then CogTL must be able to present a security token to the Vault, so that it can retrieve the secrets.

For debugging, you can first directly provide a token that will be used by CogTL. This is a bad practice: anybody holding this token may be able to retrieve all the secrets, so using secret properties then adds virtually no value:

vault.token.direct: An explicit token to access the Vault (don't use in production!)

As a second debugging step, you can instead provide a JWT that allows CogTL to authenticate to the Vault and obtain a temporary token for secrets retrieval. This is still not really secure, as anybody holding the JWT only needs one additional exchange to retrieve all the secrets:

vault.token.jwt: The JWT to authenticate to the Vault (avoid using in production also).

In production, the JWT should be read from a file written by the orchestrator or by a sidecar container, at startup or at runtime (for OpenShift, for example, refer to the official documentation). Use the following property to indicate the location of this file:

vault.token.file: The file from which to read the JWT used to authenticate to the Vault. E.g. for OpenShift, this file should be /var/run/secrets/kubernetes.io/serviceaccount/token.

If you're using a system other than plain Kubernetes (e.g. GCP), you may want to adjust the following property:

vault.token.provider: By default kubernetes. This string is used in the URI to the Vault provider (vault.address + /v1/auth/[provider]/login).

Finally, you have to set the role used for the request to the Vault:

vault.token.role: Role used in the authentication request.
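
The way these properties fit together can be sketched as follows. This is a minimal Python illustration, not CogTL's actual code: it builds (without sending) the login request described above, using the URI vault.address + /v1/auth/[provider]/login. The JSON body with the role and jwt fields follows the login API of Vault's Kubernetes auth method; for other providers the expected fields may differ, which is an assumption to check against the Vault documentation. The function name build_vault_login_request is hypothetical.

```python
import json
from urllib.request import Request


def build_vault_login_request(address: str, jwt: str, role: str,
                              provider: str = "kubernetes") -> Request:
    """Build (without sending) the POST request exchanging a JWT for a Vault token."""
    # vault.address + /v1/auth/[provider]/login, as described above
    url = f"{address.rstrip('/')}/v1/auth/{provider}/login"
    body = json.dumps({"role": role, "jwt": jwt}).encode("utf-8")
    return Request(url, data=body, method="POST",
                   headers={"Content-Type": "application/json"})
```

In this sketch, address corresponds to vault.address, jwt to the content of the file designated by vault.token.file, role to vault.token.role, and provider to vault.token.provider.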

Now, you can use the Vault secrets in properties using the following syntax:

cogtl.sensitive.property=${secret:<secret_path>:<secret_key>}

Example:

mongo.connection.password=${secret:mysecrets/db/mongo:password}
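
The structure of such a reference can be illustrated with a short Python sketch that splits it into its path and key. The regular expression and the function name parse_vault_reference are assumptions made for this illustration; in particular, the sketch assumes that neither the path nor the key contains a colon.

```python
import re
from typing import Optional, Tuple

# "${secret:<secret_path>:<secret_key>}": the path may contain "/", but no ":".
VAULT_REF = re.compile(r"^\$\{secret:([^:}]+):([^:}]+)\}$")


def parse_vault_reference(value: str) -> Optional[Tuple[str, str]]:
    """Return (secret_path, secret_key), or None if `value` is not a Vault reference."""
    match = VAULT_REF.match(value)
    if match is None:
        return None
    return match.group(1), match.group(2)
```

Applied to the example above, the sketch yields the path mysecrets/db/mongo and the key password.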

Use an external MongoDB database

You can connect the CogTL server to an external (non-containerized) MongoDB database/cluster.

Requirements

The minimum required MongoDB version is 3.6.
The required storage capacity depends on the topology of the data imported in CogTL. A fair approximation of the required space is around 1 GB per million entities.

Setup

A database "cogtl" must be created on the MongoDB server/cluster. If you need to use another name, use the property mongo.connection.db described below.
On this database, a user must be created with dbAdmin and readWrite roles:

db.createUser(
    {
        user: "<username>",
        pwd: "<password>",
        roles: [
            {role: "dbAdmin", db:"cogtl"},
            {role: "readWrite", db:"cogtl"}
            ]
    }
   )

Note: the "dbAdmin" role is necessary because CogTL needs to be able to create collections and indexes on its own database.
Note 2: adjust the name of the database if you created it under another name in the previous step.

Set up the CogTL server to use this database with the right credentials. This configuration must be stored in the cogtl.properties file. The properties to fill in are the following:
- mongo.connection.host: Hostname or IP address of a single MongoDB server (not a replica set)
- mongo.connection.port: TCP port of a single MongoDB server (not a replica set)
- mongo.connection.hosts.list: Comma-separated list of hostname:port pairs identifying a replica set (e.g. server1:27017,server2:27017). If this list is provided, it replaces the individual definition of host and port.
- mongo.connection.db: The name of the database to use on the Mongo instance (by default cogtl)
- mongo.connection.username: The username of the database user created above, or a reference to a secret
- mongo.connection.password: The password of the database user created above, or a reference to a secret
- mongo.connection.admindb: Optionally, the administration database that contains this user
- mongo.connection.ssl: Set to 1 if you want to use a TLS connection to the MongoDB database. Refer to the chapter "Use custom certification authorities for SSL/TLS" to use a custom trust store.
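
To show how these properties relate to each other, here is a Python sketch that assembles a standard MongoDB connection URI from them. It is only an illustration: this guide does not state that CogTL builds such a URI internally. The fallback values localhost and 27017 are assumptions made for the sketch, and the function name mongo_uri is hypothetical.

```python
from urllib.parse import quote_plus, urlencode


def mongo_uri(props: dict) -> str:
    """Build a MongoDB connection URI from cogtl.properties-style keys."""
    # The replica-set list, when present, replaces the single host/port pair.
    hosts = props.get("mongo.connection.hosts.list")
    if not hosts:
        host = props.get("mongo.connection.host", "localhost")  # assumed fallback
        port = props.get("mongo.connection.port", "27017")      # assumed fallback
        hosts = f"{host}:{port}"

    credentials = ""
    if props.get("mongo.connection.username"):
        # Credentials must be percent-encoded inside a URI.
        user = quote_plus(props["mongo.connection.username"])
        password = quote_plus(props.get("mongo.connection.password", ""))
        credentials = f"{user}:{password}@"

    options = {}
    if props.get("mongo.connection.admindb"):
        options["authSource"] = props["mongo.connection.admindb"]
    if props.get("mongo.connection.ssl") == "1":
        options["tls"] = "true"

    database = props.get("mongo.connection.db", "cogtl")
    query = f"?{urlencode(options)}" if options else ""
    return f"mongodb://{credentials}{hosts}/{database}{query}"
```

The sketch makes the precedence rule visible: when mongo.connection.hosts.list is set, the values of mongo.connection.host and mongo.connection.port are ignored.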

Use custom certification authorities for SSL/TLS

If you need to connect to data sources using TLS, you may need to trust certificates issued by your company's PKI, or self-signed certificates. CogTL lets you specify a complementary trust store that is used along with the default JDK trust store (which contains the common public certification authorities).

To specify a custom trust store (as a JKS file), you must make it accessible to the CogTL server using a Docker volume (you can use the same volume as the one defined for the properties file). You must then tell CogTL to use it by adding the following properties to the cogtl.properties file:

custom.trust.store.filename: The filename of your trust store
custom.trust.store.password: The password of the trust store, or a reference to a secret

The login configuration defined on CogTL Server (i.e. in the cogtl.properties file) applies to both CogTL Admin UI (cogtl-web) and PeopleRisk (cogtl-people).

By default, CogTL allows "guests" to log in (i.e. anonymously), and you can define in System > Users and roles what they are allowed to do using the Guest role.

You may also want all users to be authenticated. To do so, use the following property in the cogtl.properties file:

security.authentication.mode: By default OPTIONAL, can be set as MANDATORY to automatically redirect all users to the login page when they aren't logged in.

Note: In MANDATORY mode, if you have set up SSO with a single provider, the user is automatically redirected to that provider to authenticate, without the login page being displayed.

By default, CogTL relies only on its own internal user directory. In an enterprise, however, you may want to authenticate users against an external directory (e.g. Active Directory / LDAP), or implement single sign-on. CogTL can therefore be configured to rely on one or more security providers, which can be of different types. For username/password authentication against an external source, you can use an external LDAP connection. For single sign-on, the OpenID Connect protocol is the de facto standard, and CogTL can be configured to trust one or more third-party identity providers to identify the users.

To configure security providers, you have to add some properties in the cogtl.properties file.

Every security provider must be referenced by the global property:

security.providers=<providerName>[,<providerName2>,<providerName3>,...]: Comma-separated list of provider IDs. The name that you give to a provider has no importance in terms of user interface, but it is the key used in the following properties, and the provider ID stored in the users database. You should therefore not change it afterwards, or all users of this provider will lose their configuration.

Every provider then has a type, defined by a property:

security.provider.<providerName>.type=<type>, where type can be OpenID or LDAP. See the following chapters for details on the configuration of the different provider types.

Every provider can also be allowed to create users automatically in the CogTL database the first time they authenticate. This is the default behaviour.

security.provider.<providerName>.create.users: Can be set to 0 if you want to deactivate this behaviour. In this case, you will have to create the users manually and assign them to a provider if needed, in the System > Users and roles screen.
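
The naming scheme of these properties can be illustrated with a small Python sketch that reads key=value lines and groups the type and create.users settings by provider ID. The parser is deliberately simplified (it does not implement the full Java properties syntax), and the function names parse_properties and list_providers are hypothetical.

```python
def parse_properties(text: str) -> dict:
    """Parse simple `key=value` lines, ignoring blank lines and `#` comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def list_providers(props: dict) -> dict:
    """Map each provider ID listed in security.providers to its basic settings."""
    raw = props.get("security.providers", "")
    names = [name.strip() for name in raw.split(",") if name.strip()]
    providers = {}
    for name in names:
        prefix = f"security.provider.{name}."
        providers[name] = {
            "type": props.get(prefix + "type"),
            # Automatic user creation is the default unless explicitly set to 0.
            "create_users": props.get(prefix + "create.users", "1") != "0",
        }
    return providers
```

With two providers named google and corp (hypothetical IDs), the result contains one entry per ID, each carrying its own type and user-creation flag.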

OpenID Connect configuration

Every OpenID Connect provider is configured by a set of properties, some of which are obtained by registering your CogTL server with the provider:

security.provider.<providerName>.issuer: The issuer of OIDC tokens
security.provider.<providerName>.client.id: The client ID that you got when registering CogTL with the provider
security.provider.<providerName>.client.secret: The client secret that you got when registering CogTL with the provider. You may use a reference to a secret there.

To auto-configure provider endpoints, you can use the following property:

security.provider.<providerName>.autoconf: Set to 1 if you want CogTL to automatically retrieve endpoints from the well-known OpenID URL (<issuer>/.well-known/openid-configuration).

Alternatively, you can set all endpoints manually (or override the values) using the following properties:

security.provider.<providerName>.endpoint.authorization: The URL to use for authorization (that is the URL where the end user will be redirected to be authenticated)
security.provider.<providerName>.endpoint.token: The token URL that CogTL server will use to get the identity and access token from the authorization code obtained by the user
security.provider.<providerName>.endpoint.userinfo: The userinfo URL that CogTL server will use to get the additional claims required to identify and provision the user
security.provider.<providerName>.endpoint.jwk: The URL that CogTL server will use to get the cryptographic keys used to verify that the received tokens are valid
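
The relationship between auto-configuration and the manual endpoint properties can be sketched in Python as follows. The field names authorization_endpoint, token_endpoint, userinfo_endpoint and jwks_uri are those defined by the OpenID Connect Discovery specification. Which field CogTL maps to which property is an assumption of this sketch, and the function names are hypothetical.

```python
def well_known_url(issuer: str) -> str:
    """Return the discovery URL for an issuer: <issuer>/.well-known/openid-configuration."""
    return issuer.rstrip("/") + "/.well-known/openid-configuration"


# Discovery document field -> suffix of the corresponding cogtl.properties key (assumed mapping)
DISCOVERY_TO_PROPERTY = {
    "authorization_endpoint": "endpoint.authorization",
    "token_endpoint": "endpoint.token",
    "userinfo_endpoint": "endpoint.userinfo",
    "jwks_uri": "endpoint.jwk",
}


def endpoint_properties(provider_name: str, discovery: dict) -> dict:
    """Translate a discovery document into the manual endpoint properties."""
    prefix = f"security.provider.{provider_name}."
    return {
        prefix + suffix: discovery[field]
        for field, suffix in DISCOVERY_TO_PROPERTY.items()
        if field in discovery
    }
```

The discovery document itself is a JSON object that can be downloaded from the URL returned by well_known_url, which is a convenient way to find the values to set manually when auto-configuration is not used.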

As an alternative to the JWK endpoint, you can also configure a specific certificate to use for token validation, using the following property:

security.provider.<providerName>.certificate: The filename of a DER-encoded X509 certificate to use for token validation. Note that the path is relative to the cogtl.properties file. Therefore, if the certificate file is stored in the same directory, you can just specify its filename without a path.

You can customize the authorization request by adding a resource parameter (may be useful for ADFS authentication):

security.provider.<providerName>.resource: An additional "resource" URL parameter added to the authorization URL.

Then we have to define what to request from the provider (the scopes) and how to translate the received information (the claims) into CogTL information:

security.provider.<providerName>.scopes: A comma-separated list of scopes that CogTL needs to get when identifying the user (by default, only openid is requested, meaning that there will be no display name or e-mail address for the users authenticated by this provider).
security.provider.<providerName>.claim.userid: The name of the claim to use as the user unique identifier (by default sub).
security.provider.<providerName>.claim.name: The name of the claim to use as the user "display name". By default, the userid will be used.
security.provider.<providerName>.claim.mail: The name of the claim to use as the user e-mail address. By default, it won't be set, and the user won't be able to receive e-mails.
security.provider.<providerName>.claim.roles: The name of the claim to use as the "roles" list. By default, it won't be set, and any user authenticated with this provider comes with a Guest role (and you can then add roles in System > Users and roles). This claim must be an array in the response of the provider. On authentication, CogTL will take every string returned by the provider and compare its value with the external mapping set for every Role in CogTL configuration. If the CogTL role has an external mapping set, then the user will get this role if, and only if, the external mapping is found in the string list returned by the provider.
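
The role mapping rule described for claim.roles can be sketched in Python as follows. This is an illustration of the rule as stated above, not CogTL's implementation; the role names and external mappings used in the usage note are hypothetical.

```python
def roles_from_claim(external_mappings: dict, claim_values: list) -> set:
    """Return the CogTL roles granted through the roles claim.

    external_mappings maps a CogTL role name to its external mapping,
    or to None / "" when no external mapping is set for that role.
    """
    # Only string entries of the claim array are considered.
    received = {value for value in claim_values if isinstance(value, str)}
    return {
        role
        for role, mapping in external_mappings.items()
        # A role with an external mapping is granted if and only if
        # that mapping appears in the list returned by the provider.
        if mapping and mapping in received
    }
```

For example, with an Administrator role mapped to cogtl-admins and an Auditor role mapped to cogtl-audit, a user whose claim contains only cogtl-admins gets the Administrator role but not the Auditor role. Roles without an external mapping are not affected by the claim in this sketch.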

Finally, a set of properties allows you to customize the user interface slightly:

security.provider.<providerName>.label: The "readable" name of the provider. By default, the providerName will be used.
security.provider.<providerName>.button.color.background: The background color of the login button for this provider. You can use any valid CSS value here (#hexa, named color, rgb(), etc). By default black.
security.provider.<providerName>.button.color.text: The color of the text of the login button for this provider. You can use any valid CSS value here (#hexa, named color, rgb(), etc). By default white.
security.provider.<providerName>.button.icon: CSS classes (FontAwesome) representing an icon that will be integrated next to the button label

To help debugging integration issues with an OpenID provider, you can also use the property:

security.provider.<providerName>.debug: If set to 1, more logs will be output in the server logs. Warning: ID tokens, access tokens, authorization codes will be output in the logs for the next authentication requests on this provider. Therefore, use it only to debug issues, but don't leave the server with this setting!

Example OpenID configuration

To authenticate users with their Google account for example, you must first obtain a client ID / secret from Google, following the documentation that you can find here: https://developers.google.com/identity/protocols/OpenIDConnect.

Then you can declare a provider in cogtl.properties with the following properties:

security.providers=google
security.provider.google.type=OpenID
security.provider.google.label=Google

security.provider.google.issuer=https://accounts.google.com
security.provider.google.client.id=<your client id>
security.provider.google.client.secret=<your client secret>

security.provider.google.autoconf=1
# If you don't want to use autoconf, uncomment the 4 following lines:
# security.provider.google.endpoint.authorization=https://accounts.google.com/o/oauth2/v2/auth
# security.provider.google.endpoint.token=https://oauth2.googleapis.com/token
# security.provider.google.endpoint.userinfo=https://openidconnect.googleapis.com/v1/userinfo
# security.provider.google.endpoint.jwk=https://www.googleapis.com/oauth2/v3/certs

security.provider.google.scopes=openid,profile,email
security.provider.google.claim.userid=email
security.provider.google.claim.name=name
security.provider.google.claim.mail=email

security.provider.google.button.color.background=#D62D20
security.provider.google.button.color.text=white
security.provider.google.button.icon=fab fa-google

Use multiple OIDC providers

If you configure CogTL to trust multiple third parties, you can also adapt the behaviour when several providers authenticate users with the same user ID. Depending on the nature of this user ID, you may want to accept that several providers authenticate a single user with this same user ID (for example, in the case of an e-mail address). In other situations, two providers giving the same user ID may be an unfortunate collision, and you may not want CogTL to accept it.

By default, CogTL refuses to authenticate a user with a provider other than the one that was used when the user was created. If you want to accept multiple providers for the same user ID, you can use the property:

security.accept.multiple.providers=true
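
The resulting decision can be summarized by a very small Python sketch. It only restates the rule described above; the function and parameter names are hypothetical.

```python
def provider_accepted(creating_provider: str, current_provider: str,
                      accept_multiple: bool = False) -> bool:
    """Decide whether an existing user may authenticate through `current_provider`.

    creating_provider: provider used when the user was created.
    accept_multiple:   value of security.accept.multiple.providers (false by default).
    """
    return accept_multiple or creating_provider == current_provider
```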

LDAP security providers configuration

LDAP providers validate the credentials of a user against an external LDAP directory (e.g. an Active Directory).

An LDAP provider requires the following properties:

security.provider.<providerName>.host: The host name of the LDAP directory
security.provider.<providerName>.port: The TCP port to use (generally 389 for ldap or 636 for ldaps)
security.provider.<providerName>.tls: Set to 1 if the connection must be made over TLS (by default 0)
security.provider.<providerName>.pattern: The pattern that will be used to generate a binding username from the received username. In this pattern, the character $ will be replaced by the provided username.

Example patterns are: cn=$,ou=Users,dc=acme,dc=com, or $@acme.com.

On authentication, CogTL will try to authenticate against this host/port using the pattern including your username, and the provided password.
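
The pattern substitution can be illustrated with a one-line Python sketch. It simply replaces every $ character with the username, as described above; escaping of characters that are special in LDAP distinguished names is not handled here, and the function name bind_name is hypothetical.

```python
def bind_name(pattern: str, username: str) -> str:
    """Generate the binding username by replacing `$` in the pattern."""
    return pattern.replace("$", username)
```

With the two example patterns above and the username jdoe, the sketch produces cn=jdoe,ou=Users,dc=acme,dc=com and jdoe@acme.com respectively.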

A simple cosmetic property can also be set:

security.provider.<providerName>.label: The "readable" name of the provider. By default, the providerName will be used.

Work with multiple LDAP providers

If you need to authenticate users against multiple LDAP directories, you can specify the following property for every provider:

security.provider.<providerName>.users.mask: This is a regular expression that will be applied to the username transmitted by the user. If the username doesn't match, this provider will ignore the authentication request. If multiple providers match, the first one will be used.
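
The selection rule can be sketched in Python as follows. Several points are assumptions of this sketch rather than documented behaviour: the expression is required to match the whole username, a provider without a mask accepts every username, and the order of the providers is the order in which they are passed to the function. The function name select_provider is hypothetical.

```python
import re
from typing import List, Optional, Tuple


def select_provider(username: str,
                    providers: List[Tuple[str, Optional[str]]]) -> Optional[str]:
    """Return the first provider whose users.mask matches `username`.

    providers is an ordered list of (provider_name, users_mask) pairs,
    where users_mask is a regular expression or None when no mask is set.
    """
    for name, mask in providers:
        if mask is None or re.fullmatch(mask, username):
            return name
    return None  # every provider ignored the authentication request
```

For example, with a first provider masked by .+@acme\.com and a second one masked by .+@partner\.org, the username jdoe@acme.com is handled by the first provider only.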

Use a custom TLS certificate for CogTL Admin interface, and/or for PeopleRisk

If you deploy CogTL within an enterprise and want to expose it over https (which is a good practice!), you will need to use a custom TLS certificate that is trusted by your workstations. By default, CogTL Admin and PeopleRisk come with a self-signed certificate that is not trusted by the clients (users will therefore see a security warning when opening CogTL or PeopleRisk over https).

To use a custom certificate, you must provide it as two files:

The full certificate chain, PEM-encoded (named fullchain.pem), which is not sensitive as it is what the server exposes.
The corresponding private key (named privkey.pem), which must be protected, because your encrypted traffic is at risk of interception if this key is stolen.

Then you must start the cogtl-web and cogtl-people containers with a mapped volume to the location where you have put those two files, and set the environment variable COGTL_TLS_DIRECTORY with the mapped location. For example, with a simple "Vanilla" Docker command:

docker run -d --name cogtl-web --volume <local_path_to_certs_file>:/opt/cogtl --env COGTL_TLS_DIRECTORY=/opt/cogtl -p 443:8443 cognitechs/cogtl-web

docker run -d --name cogtl-people --volume <local_path_to_certs_file>:/opt/cogtl --env COGTL_TLS_DIRECTORY=/opt/cogtl -p 443:8443 cognitechs/cogtl-people

Configure staging storage

CogTL server lets you upload files to a local staging area, which is a kind of versioned internal file system, in order to use them as knowledge sources instead of reading them from a remote location.

By default, the staging area is stored in an internal location of the cogtl-server container. However, if you want to use this feature, you really should externalize this storage to an external volume or location. Otherwise, you may lose the uploaded files every time CogTL server is updated or re-created by an orchestrator, as the staging directory starts empty in each new version of the container.

To externalize the staging storage, you must set the staging.path property in cogtl.properties to a value that you will declare as a volume. For example:

In cogtl.properties:

staging.path=/opt/jboss/cogtl/staging

... and in your startup configuration, for example:

docker run -d --name cogtl-server --volume <local_staging_path>:/opt/jboss/cogtl/staging [...other_options...] cognitechs/cogtl-server

Configure CogTL server JVM settings

Since v1.12.7, two environment variables can be set when starting the server to customize CogTL's JVM settings:

COGTL_HEAP_SIZE is set to 8192m by default (8 GB). It defines the maximum heap memory that the JVM is allowed to use. For big models, it may be necessary to set a higher limit.
ADDITIONAL_JAVA_OPTS can be used to insert other options to the JVM command line.