Volume 14, Number 2

Building Reliable Cloud Systems through Chaos Engineering

  Authors

Yolam Zimba, University of Lusaka, Zambia

  Abstract

Cloud computing systems must be reliable so that they remain accessible and usable at any point in time. The complexity of cloud systems motivates research into novel ways of building them with reliability in mind. A cloud system is expected to cope with high demand and with unexpected events that affect its reliability and performance.

In this paper, chaos engineering is considered as a heuristic method for building reliable cloud systems. Chaos engineering aims to expose weaknesses in systems that are in production: by subjecting a running system to unexpected knocks and shocks, it helps identify where the system is weak and where it is strong.

Chaos engineering gives system developers and administrators insight into how a cloud system behaves when it is exposed to unexpected events.

  Keywords

Reliability, Cloud Computing, Chaos Engineering, Distributed Systems.