- Create a failover cluster
- Add all necessary recording servers to the CORTROL Global configuration — manually or using server autodiscovery
- Assign server roles (primary recorder or failover node) and define their settings
- Put the servers into the cluster(s)
The order of these steps is not crucial, e.g., you can first add the servers and create the failover cluster later.
We shall consider a system with one cluster that contains two primary recording servers and two spare servers. All these servers (four of them) are located in the same local network and, as a result, have access to the same cameras. This network can be different from the Global server network.
In Configuration / Servers, gather all servers. Name one server "Failover Server" and another server "Reserve Server". Name all remaining servers "Recording Server X".
Setting Up the Recording Servers
Double-click a recording server to bring up its properties editing dialog box, and switch to the Failover tab (do not forget to fill in the rest of the tabs afterward). Here, it is necessary to choose the failover cluster that the server will belong to, and define failover settings for this particular server.
Click Change to choose a pre-created failover cluster from the list. Do not worry if you have not created a cluster beforehand: you can do so right now by clicking the button at the bottom of the cluster list.
Double-click a Recording Server. Click Failover. Add the Recording Server to a Failover Cluster (MostImportantClusterEver in the example).
Current failover server remains "none"; it will be assigned later, when the Reserve Server takes over.
The Current failover server field should read none at this point. It shows the currently active failover node, so it only becomes non-empty once the target recording server fails and is replaced by a failover server.
The rest of the settings explained:
- Failover timeout: how long the Global server waits after the last heartbeat message before it marks the target server faulty and replaces it with a failover node. We shall set this to the minimum, ten seconds;
- Central server connection timeout: how long the remote server keeps trying to receive its configuration from the Global server before giving up and starting with the last known good configuration. Also set to the minimum, one minute;
- Auto recovery: if enabled, the remote recording server will start operating automatically once it is back to life. If the central server connection is available at that point, the failover server will be stopped by the Global server. We need to enable this in order to ensure autonomous system operation (otherwise, the failover node will continue operating until manually replaced by the primary server);
- Recovery timeout: delay before the Global server activates the target server once it is back online; we shall set this to zero to make the recovered server resume operation without delay.
Repeat this for every primary recording server in the list.
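The way these four settings interact can be illustrated with a minimal Python sketch. It is not CORTROL code: the class, field, and function names are hypothetical, and it assumes that the failover is triggered as soon as the silence since the last heartbeat exceeds the failover timeout.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FailoverSettings:
    """Hypothetical container mirroring the four settings on the Failover tab."""
    failover_timeout_s: int = 10            # minimum named in the text: ten seconds
    central_connection_timeout_s: int = 60  # minimum named in the text: one minute
    auto_recovery: bool = True
    recovery_timeout_s: int = 0


def should_fail_over(now_s: float, last_heartbeat_s: float,
                     settings: FailoverSettings) -> bool:
    """True once the silence since the last heartbeat exceeds the failover timeout."""
    return (now_s - last_heartbeat_s) > settings.failover_timeout_s


def primary_resumes_at(back_online_s: float,
                       settings: FailoverSettings) -> Optional[float]:
    """Time at which a recovered primary is reactivated.

    Returns None when auto recovery is off, i.e. someone has to switch back manually.
    """
    if not settings.auto_recovery:
        return None
    return back_online_s + settings.recovery_timeout_s


settings = FailoverSettings()
print(should_fail_over(now_s=109.0, last_heartbeat_s=100.0, settings=settings))  # False
print(should_fail_over(now_s=111.0, last_heartbeat_s=100.0, settings=settings))  # True
print(primary_resumes_at(back_online_s=300.0, settings=settings))                # 300.0
```

With the values chosen above, a primary that goes silent is replaced after ten seconds and takes over again the moment it is back online.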
Setting Up the Reserve Server
For failover servers, the settings differ a bit. Double-click a server that is intended to be a failover node and stay on the Details tab: here, enable the Failover node role.
Double-click a Reserve Server.
Click the Details tab. Tick Failover node.
Next, click the Failover tab and add the Reserve Server to a Failover Cluster (MostImportantClusterEver in the example).
Track Server Status
Current server status and also hardware load (for connected machines) can be viewed in the Monitoring section of CORTROL Console, under Servers.
- If a recording server is offline and its duties are performed by a failover node, the failover server's status will be Substituted and its failover configuration will display the name of the primary recording server whose configuration is currently in use.
- At the same time, the faulty recording server will have an Unknown status and will be marked red as unavailable.
Also, for each primary recording server, its current failover substitute is displayed in the server settings, Failover tab (as shown in the snapshot above).
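As a rough illustration of how these two statuses follow from the cluster state, here is a small Python sketch. The function and the server names are hypothetical; only the labels "Substituted" and "Unknown" come from the description above, and "Other" is a placeholder for every state not covered here.

```python
from typing import Dict, Set


def monitoring_status(server: str, offline: Set[str],
                      substitutions: Dict[str, str]) -> str:
    """Return a status label for one server.

    offline:       names of servers the Global server cannot reach.
    substitutions: primary server name -> failover node currently replacing it.
    """
    if server in offline:
        return "Unknown"          # faulty server, shown in red as unavailable
    if server in substitutions.values():
        return "Substituted"      # failover node running a primary's configuration
    return "Other"                # placeholder: state not described in the text


offline = {"Recording Server 1"}
substitutions = {"Recording Server 1": "Reserve Server"}
for name in ("Recording Server 1", "Reserve Server", "Recording Server 2"):
    print(name, "->", monitoring_status(name, offline, substitutions))
```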
Intended Effect
Once you have clustered the servers and configured them as described above, CORTROL Global is prepared for recording server failures: whenever either of the two recording servers fails, it will be replaced by a failover server right away.
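The replacement logic for the example cluster (two primaries, two spares) can be sketched in a few lines of Python. This is only a model: the function and server names are made up, and picking the first idle node is an assumption, since the selection policy used by the Global server is not described here.

```python
from typing import Dict, List, Optional


def assign_failover(failed_primary: str, failover_nodes: List[str],
                    substitutions: Dict[str, str]) -> Optional[str]:
    """Pick the first idle failover node for a failed primary and record it.

    substitutions maps primary server name -> failover node replacing it.
    Returns the chosen node, or None if every node is already busy.
    """
    busy = set(substitutions.values())
    for node in failover_nodes:
        if node not in busy:
            substitutions[failed_primary] = node
            return node
    return None


def release_failover(recovered_primary: str,
                     substitutions: Dict[str, str]) -> Optional[str]:
    """Free the node that was replacing a primary once the primary is back."""
    return substitutions.pop(recovered_primary, None)


spares = ["Spare 1", "Spare 2"]
substitutions: Dict[str, str] = {}
print(assign_failover("Recorder A", spares, substitutions))  # Spare 1
print(assign_failover("Recorder B", spares, substitutions))  # Spare 2
print(release_failover("Recorder A", substitutions))         # Spare 1
print(substitutions)                                         # {'Recorder B': 'Spare 2'}
```

With two spares, both primaries can be down at the same time and still be covered; a third simultaneous failure would find no idle node in this model.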
In addition to the camera configuration, the failover server sustains the state of the server event & action configuration.
From the point of view of the connected clients — CORTROL Monitor, mobile apps, and others — all failover operations are transparent for the user’s convenience.
Clients receive the requested live streams and recordings, and present these to their users, so the latter may not even suspect something might have gone wrong with any of the servers.
Recordings that have been made on failover servers remain there until they are erased, having reached one of the quotas. Individual archive duration quotas set for certain channels will affect all servers, i.e., the outdated recordings will be erased from the failover servers as well (provided, of course, that these servers are online and connected to the Global server).
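The effect of a per-channel archive duration quota across primary and failover servers can be modeled as follows. This is an illustrative Python sketch with hypothetical names and data; it only covers individual duration quotas and treats "online" as meaning both running and connected to the Global server.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Set


@dataclass(frozen=True)
class Recording:
    server: str
    channel: str
    age_days: float


def recordings_to_erase(recordings: Iterable[Recording],
                        max_age_days: Dict[str, float],
                        online_servers: Set[str]) -> List[Recording]:
    """Select recordings older than their channel's archive duration quota.

    Only recordings on online servers are selected; channels without an
    individual quota are left alone in this sketch.
    """
    return [
        r for r in recordings
        if r.server in online_servers
        and r.channel in max_age_days
        and r.age_days > max_age_days[r.channel]
    ]


recordings = [
    Recording("Recorder A", "Lobby", 40),
    Recording("Spare 1", "Lobby", 35),
    Recording("Spare 2", "Lobby", 50),    # Spare 2 is offline, so this one stays
    Recording("Spare 1", "Parking", 90),  # no individual quota for this channel
]
expired = recordings_to_erase(recordings, {"Lobby": 30},
                              online_servers={"Recorder A", "Spare 1"})
print([(r.server, r.channel) for r in expired])
# [('Recorder A', 'Lobby'), ('Spare 1', 'Lobby')]
```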
Tips and Tricks
There are a few hints that can make your experience with CORTROL Global failover even more exciting:
- You can force a recording server to be replaced by a failover node before taking it down for maintenance, in this way eliminating the downtime (those several seconds necessary to re-initialize the streams). In this case, do not forget to set the recording server’s recovery timeout to be greater than zero so that you have time to turn it off! You can do this via Console > Configuration > Failover clusters > view cluster servers > Change failover server.
- The server role (failover/primary) can already be assigned at the recording server autodiscovery step.
- No matter how fancy the recording servers’ storage configuration is (different labels for per-disk channel grouping, etc.), failover servers can simply have one capacious storage with a Default label.
Now you are ready to set this up for your own CORTROL Global system. May your servers never break down!