Troubleshooting your COGNIGY.AI Installation


Did something go wrong while installing or upgrading COGNIGY.AI? Or are you unsure how to proceed? If so, this guide should help you solve the issue. The last section explains how to contact Cognigy if your issue is not described here.

Error in the Installer

If you get an error while running the Installer, note the step at which the error occurred.

Creating backup for cognigy related files

If you get an error while creating a backup for cognigy related files, then it's most likely because one of the files that need to be backed up is missing. In the image below, the stateful.yml file is missing. First check that the following three files are present in the /var/opt/cognigy directory:

  • stateful.yml
  • cognigy.yml
  • production.env

If any of these files are missing, then you should be able to find them in a backup folder from a previous upgrade. If not, then please contact Cognigy stating your issue and which files are missing.

If the files are there, then make sure that you are executing the installer within the /var/opt/cognigy directory.
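The file check described above can be sketched as a small shell function. This is only an illustration: the function name check_cognigy_files is hypothetical and not part of the Installer.

```shell
# Report which of the three required files are absent from a directory.
# The directory defaults to /var/opt/cognigy, as named in this guide.
# check_cognigy_files is a hypothetical helper, not an Installer command.
check_cognigy_files() {
  dir="${1:-/var/opt/cognigy}"
  missing=0
  for f in stateful.yml cognigy.yml production.env; do
    if [ ! -f "$dir/$f" ]; then
      echo "missing: $f"
      missing=1
    fi
  done
  # Exit status 0 means all three files were found.
  return "$missing"
}
```

For example, running check_cognigy_files && echo "all files present" prints the names of any absent files, or the confirmation message if none are absent.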


Missing stateful.yml file

Creating backup for all databases within your MongoDB server

If you get an error while doing database backups, then it means that your production.env file is corrupt or missing. First check whether you can find a copy of the file in one of the Installer backup folders from previous upgrades, then try running the Installer again. If this doesn't work, then you should immediately contact Cognigy. Please send a screenshot of the error message in the Installer and tell us whether you have the production.env file, and whether it contains passwords (e.g. that the env variable AI_DB_PASSWORD isn't empty). Do NOT send us the file itself, since it contains database passwords.
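To report whether a password variable is set without revealing it, a check along these lines may help. It is a sketch that assumes production.env uses plain NAME=value lines. The helper name env_var_is_set is hypothetical.

```shell
# Print whether a variable in an env file has a non-empty value,
# without ever printing the value itself.
# Assumes plain NAME=value lines; env_var_is_set is a hypothetical helper.
env_var_is_set() {
  file="$1"; name="$2"
  # Take the last assignment of NAME and strip any quote characters.
  value=$(grep -E "^${name}=" "$file" 2>/dev/null | tail -n 1 | cut -d= -f2- | tr -d "\"'")
  if [ -n "$value" ]; then
    echo "$name is set"
  else
    echo "$name is empty or missing"
    return 1
  fi
}
```

For example: env_var_is_set /var/opt/cognigy/production.env AI_DB_PASSWORD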


Unable to do database backups

Missing files

Missing cognigy.yml file

If you cannot find the cognigy.yml file to deploy after the first installation, then you should run the Installer once again. It is very important to run the Installer twice during the first installation to get all the required files.

Missing files after upgrade

After running the Installer, it will tell you to either deploy the cognigy.yml file or the stateful.yml file. The Installer always generates these files in the /var/opt/cognigy directory, so you have to make sure that you are in the /var/opt/cognigy directory when deploying these files.

In case you cannot find these files in the /var/opt/cognigy directory - don't worry. The Installer creates a backup of these files before modifying them, so you can easily recover them. There will be a backup folder in the /var/opt/cognigy directory with the name: backup-pre--. You should copy any missing files from here into the /var/opt/cognigy directory and run the installer again. If you still face issues, then follow the steps in the section below to contact Cognigy.
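The recovery step can be sketched as follows. The backup directory has to be passed in explicitly, since its exact name differs per upgrade. The function name restore_missing_files is hypothetical, and the sketch never overwrites a file that already exists in the target directory.

```shell
# Copy any of the three files that are missing in the target directory
# from a backup directory. Existing files are never overwritten.
# restore_missing_files is a hypothetical helper, not an Installer command.
restore_missing_files() {
  backup_dir="$1"
  target_dir="${2:-/var/opt/cognigy}"
  for f in stateful.yml cognigy.yml production.env; do
    if [ ! -f "$target_dir/$f" ] && [ -f "$backup_dir/$f" ]; then
      cp -p "$backup_dir/$f" "$target_dir/$f" && echo "restored: $f"
    fi
  done
}
```

Call it with the path of your backup folder as the first argument, then run the Installer again.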

Cognigy services not starting

Service-ai, service-flows, service-custom-modules or service-nlp is not starting


NFS misconfigured

If none of these services start, then your NFS is most likely misconfigured.
Run the command docker service ps cognigy_service-flows --no-trunc (or substitute one of the other services) and compare the output to the scenarios described below.

NFS connection timed out


NFS cannot connect

In the output of the command docker service ps cognigy_service-flows --no-trunc there should be an error message like in the image above. If the error is that the connection timed out, then chances are high that the IP Address you are trying to connect to is wrong, or that the server is not reachable. In the output of the command, there will be a line displaying the IP Address you are trying to connect to (e.g. o='addr=,rw'). If this IP Address is correct, then you need to ensure that your servers can reach each other (e.g. are in the same network). If the IP Address is wrong, then you need to go through the following steps:

  • Remove the cognigy stack by executing the command: docker stack rm cognigy.
  • Wait until all services have been removed (can take a minute). Run the command docker service ls and make sure no services from the cognigy stack are shown.
  • Remove the NFS volumes by executing the following command: docker volume rm flow-modules nlp-models config. If this throws an error that the volume is still in use, then manually delete the Docker Containers listed in the error message or wait another minute until they have been deleted.
  • Edit the file located at /var/opt/cognigy/installerConfig.json and change the nfsMasterIp to the correct IP address. Save the file.
  • Run the installer once again and redeploy the cognigy stack.
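The "wait until all services have been removed" step can be automated with a small polling loop. The sketch below assumes that every service of the stack is named with the prefix cognigy_, as in the commands above. The function name wait_for_stack_removal and the retry count are hypothetical.

```shell
# Poll until no service of the given stack is listed any more.
# Returns 0 once the stack is gone, 1 if it is still there after all tries.
# wait_for_stack_removal is a hypothetical helper.
wait_for_stack_removal() {
  stack="$1"; tries="${2:-30}"
  while [ "$tries" -gt 0 ]; do
    # --quiet prints only service IDs, so empty output means nothing is left.
    remaining=$(docker service ls --filter "name=${stack}_" --quiet)
    if [ -z "$remaining" ]; then
      return 0
    fi
    tries=$((tries - 1))
    sleep 2
  done
  return 1
}

# Possible usage, following the steps above:
#   docker stack rm cognigy
#   wait_for_stack_removal cognigy && docker volume rm flow-modules nlp-models config
```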

NFS permission denied


NFS permission denied

In the output of the command docker service ps cognigy_service-flows --no-trunc there should be an error message like in the image above. If the error is that the permission is denied, then some IP Addresses haven't been whitelisted to access certain directories in your NFS. You therefore need to make sure that you have the following lines in the /etc/exports file on the machine where the NFS server is running:

/var/opt/cognigy/data/models <node-ip>(rw,fsid=0,insecure,no_subtree_check,sync)
/var/opt/cognigy/data/flow-modules <node-ip>(rw,fsid=1,insecure,no_subtree_check,sync)
/var/opt/cognigy/data/config <node-ip>(rw,fsid=2,insecure,no_subtree_check,sync)

If just a single IP Address / folder combination is missing, then some services on some servers will not be able to connect. Here is an example of the NFS configuration for a Swarm cluster with two machines:



When you think the contents of the /etc/exports file are correct, save the file and run the following command:

sudo systemctl restart nfs-kernel-server

Then wait a few minutes and the services should start.
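To verify that a given node IP is listed for all three directories, a check like the following can be run on the NFS server. It is a sketch. The function name check_exports is hypothetical, and it only looks for entries written in the "<node-ip>(options)" form shown above.

```shell
# List the export directories from this guide that lack an entry
# for the given node IP in an exports file (default: /etc/exports).
# check_exports is a hypothetical helper.
check_exports() {
  ip="$1"
  file="${2:-/etc/exports}"
  status=0
  for d in models flow-modules config; do
    path="/var/opt/cognigy/data/$d"
    # A match needs the path in column 1 and a later field starting with "IP(".
    if ! awk -v p="$path" -v ip="$ip" '
          $1 == p { for (i = 2; i <= NF; i++) if (index($i, ip "(") == 1) found = 1 }
          END { exit !found }
        ' "$file" 2>/dev/null; then
      echo "no export of $path for $ip"
      status=1
    fi
  done
  return "$status"
}
```

Run it once per machine in the cluster, passing that machine's IP address as the first argument.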

Multiple server setup

If you have multiple servers joined into a cluster, then you have to make sure that they are all in the same private network, so that they can reach each other. In case your machines are not in the same network, but have to go through a firewall in order to communicate, then you have to make sure that IP protocol 50 (ESP) is allowed through the firewall. Note that ESP is an IP protocol number, not a TCP or UDP port. This is required because the Docker Networks used for communication between containers are encrypted for security reasons. If ESP traffic is blocked, the containers will not be able to connect to e.g. MongoDB.
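How you allow ESP traffic depends on your firewall. As one illustration, assuming the firewall on each machine is managed with iptables, the sketch below prints (but does not execute) one rule per peer machine. The function name print_esp_rules and the choice of the INPUT chain are assumptions. Review the output before applying anything.

```shell
# Print (not run) iptables rules that accept ESP traffic from each peer.
# Assumes an iptables-managed firewall; print_esp_rules is a hypothetical helper.
print_esp_rules() {
  for peer in "$@"; do
    echo "iptables -A INPUT -p esp -s $peer -j ACCEPT"
  done
}
```

Pass the IP addresses of the other machines in the cluster as arguments.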

Traefik Not Starting

Traefik is the reverse-proxy used by COGNIGY.AI and might not start if the TLS private key / certificate is misconfigured. To figure out what the issue is, it can be helpful to temporarily enable debug logging for Traefik. You can do this by opening /var/opt/cognigy/cognigy.yml, scrolling all the way to the bottom and adding the command flag --logLevel=DEBUG to the command field of the traefik service:

image: ''
command: '--logLevel=DEBUG --api --docker --docker.swarmmode --metrics.statsd --metrics.statsd.address= --entryPoints=''Name:http Address::80'' --defaultentrypoints=http'

Encrypted private key

Traefik currently doesn't support encrypted private keys, and will not start if such a key is used. If you enabled the debug logging for Traefik as described above, then you will see a log output similar to this:


Encrypted Private Key

If you are indeed using an encrypted private key, you first need to decrypt it:

openssl rsa -in private.key -out private.key.unencrypted

When you have the new private key, you have to recreate the docker secret used to store the private key:

docker stack rm cognigy && \
docker secret rm traefik.key && \
cat private.key.unencrypted | docker secret create traefik.key - && \
docker stack deploy -c /var/opt/cognigy/cognigy.yml --with-registry-auth cognigy

If docker complains that the secret traefik.key is still in use, then wait a minute and try again.
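If you are unsure whether your key is encrypted, you can inspect the key file before decrypting it. The sketch below assumes a PEM-encoded key. Encrypted PEM keys contain the word ENCRYPTED either in the BEGIN line or in a Proc-Type header. The function name is_key_encrypted is hypothetical.

```shell
# Succeed if a PEM private key file looks encrypted.
# Encrypted PEM keys carry "ENCRYPTED" either in the BEGIN line
# ("BEGIN ENCRYPTED PRIVATE KEY") or in a "Proc-Type: 4,ENCRYPTED" header.
# is_key_encrypted is a hypothetical helper and assumes PEM encoding.
is_key_encrypted() {
  grep -q 'ENCRYPTED' "$1"
}
```

For example: is_key_encrypted private.key && echo "key is encrypted"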

Proxy related issues

This section shows you how to debug and fix issues that arise when your installation sits behind a corporate proxy. If the COGNIGY.AI installation is only allowed to access the public network through a proxy, you need to slightly adjust your COGNIGY.AI setup. First, go through the following guide: Docker Proxy. You need to set the HTTP_PROXY and HTTPS_PROXY environment variables in the /var/opt/cognigy/production.env file.

HTTP Request Node not working


HTTP Node not using proxy

If you are trying to use an HTTP Request Node on your on-prem installation, and you get an EPROTO error like above complaining about either a wrong version number or an unknown protocol, then you are most likely trying to access an http proxy over https. To fix this, open the file /var/opt/cognigy/production.env in a text editor and make sure that the HTTPS_PROXY environment variable is set, and that the url says http:// and not https://. After changing the production.env file, run the command:

docker stack deploy -c /var/opt/cognigy/cognigy.yml --with-registry-auth cognigy

to redeploy the cognigy stack.
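The HTTPS_PROXY check described in this section can be sketched as a shell function. It assumes production.env uses plain NAME=value lines. The function name check_https_proxy is hypothetical.

```shell
# Check that HTTPS_PROXY in an env file is set and uses an http:// URL.
# Assumes plain NAME=value lines; check_https_proxy is a hypothetical helper.
check_https_proxy() {
  file="${1:-/var/opt/cognigy/production.env}"
  # Take the last HTTPS_PROXY assignment and strip any quote characters.
  value=$(grep -E '^HTTPS_PROXY=' "$file" 2>/dev/null | tail -n 1 | cut -d= -f2- | tr -d "\"'")
  case "$value" in
    "")        echo "HTTPS_PROXY is not set"; return 1 ;;
    https://*) echo "HTTPS_PROXY uses https://, expected http://"; return 1 ;;
    http://*)  echo "HTTPS_PROXY looks fine" ;;
    *)         echo "HTTPS_PROXY has no http:// scheme"; return 1 ;;
  esac
}
```

Run it without arguments to check /var/opt/cognigy/production.env, or pass a different file path.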

Unable to get local issuer certificate

If you get the error Unable to get local issuer certificate or something similar, then it means that your proxy is using an untrusted certificate. The best way to fix this is to use a trusted certificate in your proxy. However, if the proxy is used for many different projects across your organisation and therefore cannot be changed, then you can set the following environment variable in the /var/opt/cognigy/production.env file: NODE_TLS_REJECT_UNAUTHORIZED=0. Afterwards, you need to redeploy the cognigy stack:

docker stack deploy -c /var/opt/cognigy/cognigy.yml --with-registry-auth cognigy



Setting this environment variable disables TLS certificate verification, which poses a severe security risk to your application. It should only be used if you are sure that you are e.g. only sending http requests through a trusted proxy.

Unable to save some Flow Nodes


Unable to save Node

If you are unable to save certain Flow Nodes, and if you are getting the error above displayed in your Flow, then the issue is that your proxy doesn't allow encoded forward slashes. This can especially be an issue if you are using an Apache Proxy, and potential ways to fix it are described here.

Contacting Cognigy

Send an email to [email protected] with the following information:

  • A description of your problem
  • Which version of COGNIGY.AI you are currently running, and which version you are trying to upgrade to.

If there was an error in the Installer:

  • A screenshot of the entire Installer output

If certain services do not start:

  • A screenshot of the command docker service ls
  • A screenshot of the command cat /etc/exports
  • The IP Addresses of all servers in your cluster
  • A screenshot of the command docker service ps <service> --no-trunc for the services that are not starting
  • A screenshot of the command docker service logs <service> for the services that are not starting. If the log output is empty, then do not send us a screenshot but simply tell us that the service does not produce any logs.

If you cannot reach the installation:

  • A screenshot of the command cat /var/opt/cognigy/installerConfig.json
  • A screenshot of the command docker service ps cognigy_traefik --no-trunc
  • A screenshot of the command docker service logs -f cognigy_traefik

If some other error happened, then please send us a screenshot of that error.
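If a command's output does not fit in a single screenshot, it may help to also capture it as text. The sketch below saves the output of the commands from the "services do not start" list into a directory. The function name collect_diagnostics and the output file names are hypothetical, and whether text files are accepted in place of screenshots is an assumption. Do not add production.env to the collected files.

```shell
# Save the output of the diagnostic commands from this guide into text files.
# collect_diagnostics and the file names are hypothetical.
collect_diagnostics() {
  out_dir="$1"; shift
  mkdir -p "$out_dir"
  docker service ls > "$out_dir/service-ls.txt" 2>&1 || true
  cat /etc/exports > "$out_dir/exports.txt" 2>&1 || true
  # Remaining arguments are the names of the services that are not starting.
  for svc in "$@"; do
    docker service ps "$svc" --no-trunc > "$out_dir/$svc-ps.txt" 2>&1 || true
    docker service logs "$svc" > "$out_dir/$svc-logs.txt" 2>&1 || true
  done
  echo "diagnostics written to $out_dir"
}
```

For example: collect_diagnostics ./cognigy-diagnostics cognigy_service-flows cognigy_traefik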