Disaster Recovery refers to the procedures to restore a situation that has compromised the correct operation of the platform and where all the Business Continuity systems have failed with the consequent non-availability of services.
Disaster Recovery procedures can be divided into three parts:
- systems checklist;
- maintenance and verification;
- implementation of the procedures in case of malfunctions.
In order to successfully complete the Disaster Recovery procedures in case of failures it is essential that:
- all the data of MongoDB are backed up;
- all the Mia Platform custom configurations are under repository;
- the chain of continuous delivery and deployment are active and functioning;
- you have track of the versions of the docker images installed;
- update the deployment document with the description of the entire runtime infrastructure;
- the runtime infrastructure is generated automatically by scripts and the scripts are under repository.
The following actions must be carried out periodically:
- each checklist point is verified;
- the system is stressed by scheduling parts of the services in a programmed way;
- simulate, at least once every six months, a Recovery action on a test system.
In case of problems, the actions to restore the platform are:
- install the reference infrastructure with automatic scripts;
- install the Mia Platform nodes to the versions that were on the systems in production;
- install custom configurations;
- restore MongoDB backups.
Si consiglia di far eseguire tutte queste operazioni da script automatici.