MongoDB – Aggregation

Aggregation is a tool that transforms and summarizes data. Aggregation operations process data records and return computed results: they group values from multiple documents together and can perform a variety of operations on the grouped data to return a single result. Aggregation results can be computed in real time or pre-computed and cached.

In MongoDB there are three approaches to data aggregation:

  • Single purpose aggregation – Count, Group, Distinct
  • Aggregation Pipeline – single collection, multiple processing steps
  • MapReduce – multiple collections, shards, complete flexibility

Aggregation   Single Purpose Aggregation   Aggregation Pipeline   MapReduce
Complexity    Low                          Medium                 High
Speed         Very Fast                    Medium                 Medium / Slow
Usefulness    Narrow                       Broad                  All Inclusive
Real-time     Yes                          Yes / limitations      No / Pre-processed

1. Single purpose aggregation

Count examples:
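
A minimal sketch of a count in the mongo shell, assuming a hypothetical collection named logs with an action field (neither name comes from the original examples). Current shells offer countDocuments(); the older count() helper is deprecated.

```javascript
// Hypothetical filter: only documents whose "action" field equals "login".
const countFilter = { action: "login" };

// `db` exists only inside the mongo shell, so the calls are guarded.
if (typeof db !== "undefined") {
  // Count every document in the (hypothetical) "logs" collection.
  print(db.logs.countDocuments({}));
  // Count only the documents that match the filter.
  print(db.logs.countDocuments(countFilter));
}
```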

Distinct examples:
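
A minimal sketch of distinct(), again assuming the hypothetical logs collection; the field names and the 100 ms threshold are illustrative only.

```javascript
// Hypothetical field and filter, for illustration only.
const distinctField = "action";
const distinctFilter = { request_time_milliseconds: { $gt: 100 } };

// `db` exists only inside the mongo shell, so the calls are guarded.
if (typeof db !== "undefined") {
  // All distinct values of "action" in the collection.
  printjson(db.logs.distinct(distinctField));
  // Distinct values of "action" among slower requests only.
  printjson(db.logs.distinct(distinctField, distinctFilter));
}
```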

Group By examples:
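
A sketch of the arguments taken by the legacy db.collection.group() helper, assuming the hypothetical logs collection. Note that this helper was removed in MongoDB 4.2, so on current servers the $group pipeline stage is the way to get the same result; the call below is guarded accordingly.

```javascript
// Arguments for the legacy group() helper. Field names are hypothetical.
const groupArgs = {
  key: { action: 1 },               // group by the "action" field
  initial: { count: 0 },            // starting value of each group's result
  reduce: function (doc, result) {  // called once per document in a group
    result.count += 1;
  }
};

// Guarded: `db` exists only in the shell, and group() only in older versions.
if (typeof db !== "undefined" && typeof db.logs.group === "function") {
  printjson(db.logs.group(groupArgs));
}
```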

2. Aggregation pipeline

A pipeline operator is used for each stage of the transformation. Each stage works on the output of the previous stage, so order matters. Some stages are required to be the first or last in the pipeline (for example, $geoNear must be the first stage and $out must be the last).

2.1. $geoNear

The fields named by the distanceField and includeLocs options (here “dist.calculated” and “dist.location”) are added to each output document.
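
A sketch of a $geoNear stage that would produce the two fields named above. The collection name places, the coordinates, and the distance limit are assumptions made for illustration; the stage also needs a geospatial index on the collection.

```javascript
// $geoNear has to be the first stage of the pipeline.
const geoNearPipeline = [
  {
    $geoNear: {
      near: { type: "Point", coordinates: [-73.99279, 40.719296] }, // hypothetical point
      distanceField: "dist.calculated", // output field for the computed distance
      includeLocs: "dist.location",     // output field for the matched location
      maxDistance: 2000,                // metres when "near" is a GeoJSON point
      spherical: true
    }
  }
];

// `db` exists only inside the mongo shell, so the call is guarded.
if (typeof db !== "undefined") {
  printjson(db.places.aggregate(geoNearPipeline).toArray());
}
```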

2.2. $group

Note: a “$” prefix, as in $action or $request_time_milliseconds, refers to the value of that field in the current document.
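
A sketch of a $group stage over the two fields mentioned in the note. The collection name logs and the output field names (count, avg_time, max_time) are assumptions.

```javascript
const groupPipeline = [
  {
    $group: {
      _id: "$action",                                   // one group per distinct action
      count: { $sum: 1 },                               // documents in the group
      avg_time: { $avg: "$request_time_milliseconds" }, // average request time
      max_time: { $max: "$request_time_milliseconds" }  // slowest request
    }
  }
];

// `db` exists only inside the mongo shell, so the call is guarded.
if (typeof db !== "undefined") {
  printjson(db.logs.aggregate(groupPipeline).toArray());
}
```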

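A “$$” prefix refers to a variable rather than to a field of the document. For example, if an expression such as $filter or $map declares a variable named “round” for each element of an array, then $$round.raised_amount is the raised_amount of the current element, whereas $round would refer to a document field named “round”.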
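
A sketch of how the $$round variable could be declared and used, assuming a hypothetical companies collection whose documents carry a funding_rounds array; the output field big_rounds and the threshold are illustrative as well.

```javascript
const filterPipeline = [
  {
    $project: {
      name: 1,
      big_rounds: {
        $filter: {
          input: "$funding_rounds", // "$" prefix: an array field of the document
          as: "round",              // declares the variable referenced as $$round
          cond: { $gte: ["$$round.raised_amount", 100000000] }
        }
      }
    }
  }
];

// `db` exists only inside the mongo shell, so the call is guarded.
if (typeof db !== "undefined") {
  printjson(db.companies.aggregate(filterPipeline).toArray());
}
```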

2.3. $limit
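
$limit passes only the first n documents on to the next stage. A minimal sketch, assuming the hypothetical logs collection:

```javascript
// Keep only the first 5 documents that reach this stage.
const limitPipeline = [{ $limit: 5 }];

// `db` exists only inside the mongo shell, so the call is guarded.
if (typeof db !== "undefined") {
  printjson(db.logs.aggregate(limitPipeline).toArray());
}
```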

2.4. $match
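
$match filters documents with the same query syntax as find(). A minimal sketch, assuming the hypothetical logs collection and illustrative filter values:

```javascript
const matchPipeline = [
  {
    $match: {
      action: "login",                            // equality condition
      request_time_milliseconds: { $gte: 100 }    // range condition
    }
  }
];

// `db` exists only inside the mongo shell, so the call is guarded.
if (typeof db !== "undefined") {
  printjson(db.logs.aggregate(matchPipeline).toArray());
}
```

Placing $match as early as possible reduces the number of documents the later stages have to process.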

2.5. $project
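
$project reshapes each document: it can include or exclude fields and add computed ones. A minimal sketch, assuming the hypothetical logs collection; the computed field name seconds is illustrative.

```javascript
const projectPipeline = [
  {
    $project: {
      _id: 0,     // exclude the _id field
      action: 1,  // keep the action field as is
      // Computed field: request time converted from milliseconds to seconds.
      seconds: { $divide: ["$request_time_milliseconds", 1000] }
    }
  }
];

// `db` exists only inside the mongo shell, so the call is guarded.
if (typeof db !== "undefined") {
  printjson(db.logs.aggregate(projectPipeline).toArray());
}
```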

The output:

2.6. $sort
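
$sort orders the documents by one or more fields, with 1 for ascending and -1 for descending. A minimal sketch, assuming the hypothetical logs collection:

```javascript
const sortPipeline = [
  // Slowest requests first; ties are ordered alphabetically by action.
  { $sort: { request_time_milliseconds: -1, action: 1 } }
];

// `db` exists only inside the mongo shell, so the call is guarded.
if (typeof db !== "undefined") {
  printjson(db.logs.aggregate(sortPipeline).toArray());
}
```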

2.7. $unwind – explodes a document into multiple documents, creating one document for each element of an array field.
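
The sketch below shows the stage itself and, to make the effect visible without a server, a plain-JavaScript imitation of its default behaviour. The function unwindLike, the tags field, and the sample document are all hypothetical; the imitation is not how MongoDB implements the stage.

```javascript
// The stage as it would appear in a pipeline (field name is hypothetical).
const unwindStage = { $unwind: "$tags" };

// Imitation of the default behaviour: one output document per array element,
// with the array replaced by that element. Documents where the field is
// missing, null, or an empty array produce no output.
function unwindLike(docs, field) {
  return docs.flatMap(doc => {
    const value = doc[field];
    if (value === undefined || value === null) return [];
    const elements = Array.isArray(value) ? value : [value];
    return elements.map(element => ({ ...doc, [field]: element }));
  });
}

const unwindSample = [{ _id: 1, item: "shirt", tags: ["red", "cotton"] }];
const unwound = unwindLike(unwindSample, "tags");
// unwound is:
// [ { _id: 1, item: "shirt", tags: "red" },
//   { _id: 1, item: "shirt", tags: "cotton" } ]
```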

The partial output:

You can combine multiple stages in a single pipeline.
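
A sketch of a multi-stage pipeline, assuming the hypothetical logs collection with the action and request_time_milliseconds fields; the thresholds and output field names are illustrative. It keeps the slower requests, groups them by action, and returns the three actions with the highest average time.

```javascript
const combinedPipeline = [
  // 1. Filter first, so the later stages see fewer documents.
  { $match: { request_time_milliseconds: { $gte: 100 } } },
  // 2. One result document per action.
  {
    $group: {
      _id: "$action",
      count: { $sum: 1 },
      avg_time: { $avg: "$request_time_milliseconds" }
    }
  },
  // 3. Highest average time first.
  { $sort: { avg_time: -1 } },
  // 4. Keep the top three groups.
  { $limit: 3 }
];

// `db` exists only inside the mongo shell, so the call is guarded.
if (typeof db !== "undefined") {
  printjson(db.logs.aggregate(combinedPipeline).toArray());
}
```

Swapping the order changes the result: with $limit placed before $group, for instance, only three input documents would be grouped.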


In the next article I’m going to cover another option for data aggregation: MapReduce.