CogTL Basics
What is CogTL
CogTL is a calculation engine that allows you to manage risks or opportunities in a complex, evolving system.
CogTL is structured in three parts:
- Knowledge acquisition: connectors to data sources such as databases, extractions, configuration files, or other services, which transform the raw data into a semantic knowledge graph.
- Intelligence: complementing the acquired knowledge by deduction or principles, and performing automated analysis of the knowledge to detect policy violations, or virtually any condition that you can imagine.
- Imagination: envisioning scenarios that could happen in the observed system, based on the knowledge that has been acquired and complemented.
Those three parts are based on a common set of concepts, a dictionary that describes your business domain, grouped in the Structure part of CogTL.
Structure
CogTL organizes all the data as a semantic graph. This means that every element of knowledge, every concept, is represented as a node that we call an entity. Every node can have multiple relations with other entities. Those relations are directed, meaning that they have a source entity and a target entity.
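To make this concrete, here is a minimal sketch of such a directed semantic graph in plain Python. It is only an illustration with invented names, not CogTL code:

```python
from dataclasses import dataclass, field

@dataclass
class Entity:
    entity_type: str                     # e.g. "Person" or "Vehicle"
    properties: dict = field(default_factory=dict)

@dataclass
class Relation:
    relation_type: str                   # e.g. "owns"
    source: Entity                       # relations are directed:
    target: Entity                       # from a source entity to a target entity

# A tiny graph: a person who owns a vehicle.
alice = Entity("Person", {"name": "Alice"})
car = Entity("Vehicle", {"plate": "AB-123"})
owns = Relation("owns", source=alice, target=car)
```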
CogTL is completely agnostic of any business domain. Therefore you have to define the dictionary of your objects before importing any knowledge. This configuration is done in the Structure part, and consists of the following elements:
- Entity types: The nature of the concepts (called Entities) that will be present in your knowledge graph. For example, a Person, a Computer, a Contract, a Vehicle, etc. If you are familiar with object-oriented programming, you can think of an entity type as a kind of class.
- Relation types: The semantic nature of the relations that link entities together. They are generally named with a verb. For example, is member of, contains, belongs to, is a relative of, etc.
- Relation features: Abstract concepts that may be common to multiple relation types. For example, membership, connectivity, association. Using them is not mandatory, but they can ease the definition of complex systems with complex rules.
- Entity constraints: Rules that help CogTL recognize entities in different contexts (i.e. in different knowledge sources). If you are familiar with relational databases, you can think of them as a kind of primary key, but with more complex possibilities, as they can be conditional or scoped.
- Tags: As the name says, tags are labels that may be applied to your entities or relations, following intelligence rules.
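To illustrate the dictionary that these elements form, here is a purely hypothetical sketch. The names and the layout are invented; the real configuration is done in CogTL's Structure screens, not in Python:

```python
# Illustrative sketch only -- not CogTL's actual configuration format.
structure = {
    "entity_types": ["Person", "Computer", "Contract", "Vehicle"],
    "relation_types": {
        "is member of": {"feature": "membership"},     # feature mapping is an example
        "contains":     {"feature": "connectivity"},
        "belongs to":   {"feature": "association"},
    },
    # Entity constraints: roughly the equivalent of a primary key,
    # but they can be conditional or scoped.
    "entity_constraints": [
        {"entity_type": "Person", "unique_on": ["email"]},
    ],
    # Tags: labels applied to entities or relations by intelligence rules.
    "tags": ["VIP", "Non-compliant"],
}
```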
Knowledge
This part of the CogTL configuration is where you define knowledge sources, which are connectors to data sources of many different types, allowing CogTL to build its knowledge graph by merging all the information it gathers from the different sources.
This part also includes the definition of prepared queries, which are a way of teaching CogTL how to navigate the knowledge graph to search for specific patterns, or to get from one entity to others by following specific relations.
Knowledge sources
CogTL includes multiple data connectors allowing it to read a large variety of data formats, connect to different kinds of systems, or even accept incoming data pushed by other systems.
The idea, for every knowledge source, is to gather records, i.e. atomic elements with multiple properties, that CogTL converts into small knowledge graphs. These small graphs are merged together, and the result is merged into the knowledge graph once the source is completely loaded.
Record-based knowledge sources
Some kinds of knowledge sources are already fundamentally organized as sets of records with properties, in particular:
- LDAP directories
- LDIF files (extractions of LDAP data)
Table-based knowledge sources
In table-based knowledge sources, every record is a row, and the properties are the cells of those rows, named after the column headers (a short sketch follows the list below).
- CSV files (comma-separated values)
- Excel files
- Databases (MySQL / Microsoft SQL Server)
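As a minimal sketch, assuming a hypothetical CSV export of employee data, each row becomes one record whose properties are named after the column headers:

```python
import csv, io

# Hypothetical CSV source: each row becomes one record,
# and the column headers become the property names.
raw = io.StringIO(
    "name,department,computer\n"
    "Alice,Finance,PC-0042\n"
)
records = list(csv.DictReader(raw))
# records == [{"name": "Alice", "department": "Finance", "computer": "PC-0042"}]
```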
Structured hierarchical knowledge sources
In structured hierarchical knowledge sources, the source data is a tree, and you define which kind of node of this tree is to be considered as a record. The whole subtree of this node is then transformed into the properties of the record (see the sketch after the list below). This kind of parsing applies to:
- JSON files or streams (e.g. REST web services)
- XML files or streams (e.g. SOAP web services)
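Here is a rough illustration of that idea, assuming a hypothetical JSON payload in which each element under employees is treated as a record and its subtree is flattened into dotted property names (the flattening convention is an assumption made for this sketch, not necessarily what CogTL does):

```python
import json

# Hypothetical JSON stream: we decide that every element of "employees" is a record;
# its whole subtree is flattened into the record's properties.
payload = json.loads("""
{"employees": [
  {"name": "Alice", "contract": {"type": "permanent", "since": "2020-01-01"}}
]}
""")

def flatten(node, prefix=""):
    """Flatten a subtree into dotted property names."""
    props = {}
    for key, value in node.items():
        if isinstance(value, dict):
            props.update(flatten(value, prefix + key + "."))
        else:
            props[prefix + key] = value
    return props

records = [flatten(emp) for emp in payload["employees"]]
# [{"name": "Alice", "contract.type": "permanent", "contract.since": "2020-01-01"}]
```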
Non-structured (or unknown) knowledge sources
Some kinds of knowledge sources are non-structured (full text), or their structure is not known (e.g. configuration files). For example:
- Generic text files - where you will teach CogTL how to parse the document in order to get the data that you need.
- CogTL module - when more advanced and specific operations are required to get the data. It requires the implementation of a common API.
- Random data - used to generate random models, based on random datasets.
Reception-oriented knowledge sources
When integrating CogTL in a complex environment, it may be useful to be able to accept pushed knowledge. Therefore, CogTL can expose an API accepting records of data, potentially associated with a transaction mechanism to ensure that all data is received before merging the imported graph with the current knowledge (a sketch follows the list below).
- CogTL API (push)
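Purely as an illustration of that push-with-transaction pattern, a client could open a transaction, push its records, and then commit. The URL, endpoint paths and response shape below are invented for this sketch and are not CogTL's documented API:

```python
import requests  # third-party HTTP client, used here only for illustration

BASE = "https://cogtl.example.org/api"  # hypothetical base URL and endpoints

# Open a transaction so the graph is only merged once all data has arrived.
tx = requests.post(f"{BASE}/push/transactions").json()["id"]

# Push one or more batches of records into the transaction.
records = [{"name": "Alice", "department": "Finance"}]
requests.post(f"{BASE}/push/transactions/{tx}/records", json=records)

# Committing the transaction would trigger the merge into the current knowledge.
requests.post(f"{BASE}/push/transactions/{tx}/commit")
```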
Every knowledge source is composed of several parts (some of them depend on the type of source, as they may not always be pertinent); an illustrative outline follows the list below:
- Basic elements: The name of the source and which agent is in charge of loading the data
- Input configuration: How to access the source data, including the connectivity parameters, location of the files, and configuration of the data parsers
- Preloading actions: Actions that should be executed before loading the data
- Adapters: Modifiers of the input properties, to adapt or combine them
- Filter: Optional expression to select the records to import or discard
- Modelling: The final graph that will be constructed for every single record
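The outline below brings these parts together in one place. It is an illustrative sketch only: field names and expressions are invented, and the real configuration is done through CogTL's own Knowledge screens:

```python
# Hypothetical outline of a knowledge source definition (illustration only).
knowledge_source = {
    "name": "HR export",                          # basic elements
    "agent": "main-agent",
    "input": {"type": "csv", "path": "hr.csv"},   # input configuration
    "preload": ["clear previous import"],         # preloading actions
    "adapters": {                                 # adapt or combine input properties
        "full_name": "first_name + ' ' + last_name",
    },
    "filter": "status == 'active'",               # keep or discard records
    "modelling": {                                # graph built for every record
        "entities": [{"type": "Person", "name": "{full_name}"}],
        "relations": [],
    },
}
```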
The logic of CogTL is then to merge every "little" graph built from a record, based on entity constraints, thus leading to the construction of a bigger, complex graph. When all records are loaded, this complex graph, representing all the knowledge of a specific source, is merged into the core brain, the principal knowledge graph, by applying all entity constraints again.
More detailed information on the different elements composing the definition of a knowledge source can be found on this help page.
The merging algorithm
In CogTL, entities are constantly merged, most of the time because of the application of entity constraints. For example, if your model contains entities of type Country, you may have defined an entity constraint ensuring the uniqueness of Country entities based on their name. Now, whenever you load data from a knowledge source and build an entity of type Country, existing entities that live in the core brain may have to be merged with the entities just loaded by the source, as they represent the same country.
The algorithm proceeds the following way (a sketch follows the list below):
- If the entities don't have the same probability, the one with the highest probability is kept.
- If the entities have the same probability, all the information is merged. Existing tags are merged. Data from the existing entity is kept, unless another value is given by the imported entity (as that information is considered more recent and thus more up-to-date). The same logic applies to the factor values.
- Relations are redirected from the imported entity to the final entity, and potentially merged if their source and destination are both merged with the existing graph.
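The sketch below restates those rules in plain Python. It is only an illustration of the description above, not CogTL's actual implementation, and the field names are invented:

```python
def merge_entities(existing, imported):
    """Sketch of the merge rules described above (illustration only)."""
    # 1. Different probabilities: keep the entity with the highest probability.
    if existing["probability"] != imported["probability"]:
        return existing if existing["probability"] > imported["probability"] else imported

    # 2. Same probability: merge everything.
    merged = dict(existing)
    merged["tags"] = existing["tags"] | imported["tags"]            # tags are merged
    # Existing data is kept unless the import provides another (newer) value.
    merged["data"] = {**existing["data"], **imported["data"]}
    merged["factors"] = {**existing["factors"], **imported["factors"]}
    # 3. Relations of the imported entity would then be redirected to the merged entity.
    return merged

a = {"probability": 1.0, "tags": {"EU"}, "data": {"name": "France"}, "factors": {}}
b = {"probability": 1.0, "tags": {"G7"}, "data": {"name": "France", "iso": "FR"}, "factors": {}}
print(merge_entities(a, b))
```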
Prepared queries
Prepared queries are common, potentially complex, queries that can be performed over the knowledge graph. They can have different usages in CogTL:
- Represent a reusable complex query that will be used in other contexts (assertions, explorer navigation, OpenAPI function...)
- Expose a simple graphical user interface that allows client users to perform queries without having to learn CogTL
A prepared query may be based either on a simple CoQL query (more information here), or on navigations.
More details about prepared queries can be found on this page.
Intelligence
Intelligence is a somewhat abstract name for business logic, in the form of assertions that you can implement in CogTL, and that can have multiple goals:
- Complement the knowledge, by adding, modifying or merging entities or relations in the core brain (the central knowledge graph).
- Flag elements of the core brain with Tags. This can be the way to implement business or compliance rules, or can simply be used to speed up other assertions.
- Discover where Scenarios may be possible. This is a key element of the imagination part described below.
- Invoke a CogTL module to perform an action whenever a specific condition is met.
Based on all this, we can summarize assertions as a way of implementing specific features of our brain, like deduction capabilities, recognition of similar concepts in different contexts, application of principles to a specific case, compliance with rules, or action/reaction patterns.
Assertions are composed of three main parts (illustrated after the list below):
- Criteria used to identify subjects (i.e. entities or relations in the core brain).
- An optional condition that further refines the identified subjects.
- Consequences, i.e. actions that must be taken whenever a set of subjects matching the condition is found.
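As an illustration of these three parts, an assertion that tags outdated computers might be sketched as follows. The names and the expression syntax are hypothetical, not CogTL's actual configuration:

```python
# Illustrative only: a rough representation of the three parts of an assertion.
assertion = {
    # Criteria: identify the subjects (entities or relations) to look at.
    "criteria": {"subject": "entity", "type": "Computer"},
    # Optional condition that further refines the identified subjects.
    "condition": "subject.last_patch_date < today() - days(90)",
    # Consequences: actions taken for every matching set of subjects.
    "consequences": [{"action": "tag", "tag": "Non-compliant"}],
}
```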
The whole idea of CogTL is to let it orchestrate the evaluation of all assertions automatically, as they are - as much as possible - only evaluated on the pertinent portion of the knowledge graph. However, you may also want to schedule a full re-evaluation from time to time, to be sure to catch all subject combinations.
More information about assertions setup can be found on this page.
Imagination
The imagination layer of CogTL is where all probabilistic calculations happen. Imagination is composed of scenarios describing a "story" that could happen, and those scenarios may have an impact on entity factors, which are numeric values that can themselves be used to derive other factors.
Scenarios
A scenario is basically a "story" that could happen to a set of actors. It is described in a very abstract way that can be, to a large extent, completely independent of the structure of the knowledge brain.
For example, you can write a scenario named A person gets injured by an industrial machine, even when your structure contains entities of type Employee, Tool, etc. A scenario should be as close as possible to a human concept.
An actor can be a human being, but could also be any other concept. In fact, you can think of an actor as a kind of "variable" in your scenario. The same scenario could happen to different sets of actors.
Scenarios can be composed of multiple sub-scenarios, when dealing with complex successions of elements, and scenarios can also be specializations of a parent scenario.
In the previous example, you could imagine a parent scenario named A person cannot perform his job, which could be the parent of multiple variations (illness, injury, maternity leave, etc.).
The discovery of potential realizations of the different scenarios is done using assertions (described previously in the Intelligence part).
Factors
Factors are numeric values attached to an entity that are calculated through one of the following options:
- Based on the probabilities of one or multiple scenarios affecting this entity. A scenario affects an entity when this entity is an actor of this scenario.
- Based on other factors and potentially other elements of this entity (typically some properties), through an expression that can be a kind of mathematical formula (or something more complex).
Factors are typically used to designate and quantify a risk or an opportunity.
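As a worked illustration of both options: the way CogTL actually combines scenario probabilities is not specified here, so the independence assumption, the numbers and all names below are ours:

```python
# Illustrative only: two ways a factor value could be obtained.

# 1. From the probabilities of scenarios affecting the entity. As an assumption,
#    we treat the scenarios as independent and compute the probability that
#    at least one of them happens.
scenario_probabilities = [0.10, 0.05]     # e.g. "injury", "illness"
none_happens = 1.0
for p in scenario_probabilities:
    none_happens *= (1.0 - p)
unavailability_risk = 1.0 - none_happens  # ~0.145

# 2. From other factors and entity properties, through a formula-like expression.
criticality = 0.8                          # hypothetical property of the entity
business_impact = unavailability_risk * criticality
print(unavailability_risk, business_impact)
```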
Putting it all together
To summarize, CogTL...
- Maintains a model (knowledge graph) of your business domain, automatically by watching multiple sources of data.
- Complements this model through assertions, implementing "deduction" and "recognition" mechanisms, and/or business logic.
- Using the same assertions, discovers where and under what conditions your scenarios can happen.
- Accumulates all the scenario probabilities in factors.
The big advantage of CogTL is that every change in the underlying knowledge graph is propagated to the other parts: business logic is applied, deductions and recognitions are performed, possible scenarios are discovered or disappear, and factor values are re-calculated.
As a result, you can use:
- The internal notification system, to get alerts whenever a condition is fulfilled (e.g. an entity is marked with a tag by applying business logic, or a factor value is modified for an entity, or passes a specific threshold).
- The query system or dashboards, to get the top entities with the highest score for any factor.
- The Analytics view, to find outliers in the system, i.e. entities that are particularly different from the others.