MongoDB – Replication

Replication is the process of synchronizing data across multiple servers.

Replica sets and scaling are used to achieve reliable and high performance deployments. Replica sets ensure multiple copies of data are available. They are build using multiple types of MongoDB nodes. Replica sets exists on odd numbers in order to allow election of a primary node. The Write operations go to primary node, reads can be distributed to the order nodes.

Replication benefits:

  • High Availability – automatic failover
  • Data Safety – durability, extra copies
  • Disaster recovery
  • Scaling (some situations)

Node types:

  • Primary node – writes always go to it
  • Regular node – function as secondary nodes and it can take over the role of primary node in the event of a failure.
  • Arbiter node – it doesn’t keep a copy of the data. It plays a role in the elections that select a primary if the current primary is unavailable.
  • Special purpose nodes – active backup

On regular nodes we can apply restrictions to keep the role of that node (only read so node will never be promoted as primary node).

Building a replica set – it requires installing MongoDB on additional hosts (I recommend an automatic tool to do the provisioning – eg: vagrant + ansible) or to use a cloud based solution. Please be aware it’s highly recommended to keep your data folder out of any container so mapping host folder to guest can be a good practice.

Replica Set

Create the replica set configuration document:

The configuration is showed below:

Verifying if failover works

First we’ll need to config that all nodes are running. The status of the replica set can confirm that. Then we’ll remove one server to simulate the failure and we expect the replica set to elect a primary and remain responsive.

We’ll connect to the one of the running mongoDB instances (eg: port 27018).

The previous elected primary node shows as “(not reachable/healthy)”.

You can see the second replica node was elected as primary in few seconds.

Read concern

After you finish the configuration of the replica set and everything is up and running you can start using it. If you’re performing an insert (on primary) via shell and you want to read the data from one of the secondary nodes, you will probably have an error:

You need to accept eventually inconsistent reads:

Sample exercise

ReadPreference modes – it affects consistency and speed by telling mongoDB how to route the read requests among nodes in a replica set:

  • primary – default, for high data consistency requirements, all data are read from the primary.
  • primaryPreferred – use secondary if primary node is not responding.
  • seconday – go to second for the reads as primary needs to be 100% for write.
  • secondaryPreferred – secondary is the top of the list for reads, but you can go to primary if requested (you cannot reach any secondaries).
  • nearest – based on network latency (recommended on remote data center)

You can see an implementation in nodeJS for the ReadPreference by checking reference 4.

Level of Write Concern

The level of write concern instructs MongoDB how to respond to writes by describing the level of acknowledgement requested from MongoDB for write operations in standalone, replica set or sharded clusters.

The Write concern is reflected in the consistency, redundancy and responsiveness of the entire mongoDB.

Write concern can tell MongoDB to act synchronous or asynchronous to persistence operations. If data consistency requirements are high then write concern should be synchronous (MongoDB will wait until data is replicated on all nodes).

  • Unacknowledged – async – send the write and immediately move. it doesn’t even wait the mongod process to confirm the request.
  • Acknowledged – it’s the default level – The mongo client will wait mongod process to confirm the request. It doesn’t mean mongod process has done anything with the request.
  • Journaled – instructs mongod process to respond only after the Write has been written to the journal on disk.
  • Replica acknowledgment – instructs mongod process to respond only after the Write has been written to the primary and to only 1 or more nodes from the replica set.

Values for w:

  • 0 – Requests no acknowledgment of the write operation.
  • 1 – Requests acknowledgement
  • n > 1 – copies of data spread on n nodes.

For large volumes of data, sharding is the preferred method to scale as replica set is increasing the traffic.

Single replica set has limitation of 12 nodes (edit: raised to 50 nodes as of MongoDB 3.0)

You can retrieve the errors from specific each host by using

Oplog – The oplog (operations log) is a special capped collection that keeps a rolling record of all operations that modify the data stored in your databases. MongoDB applies database operations on the primary and then records the operations on the primary’s oplog. The secondary members then copy and apply these operations in an asynchronous process. All replica set members contain a copy of the oplog, in the collection, which allows them to maintain the current state of the database.