MongoDB Aggregation

Aggregation in MongoDB is an operation that processes documents and returns computed results. It is a good fit for gathering metrics from data stored in MongoDB.

Aggregation operations group values from multiple documents and perform operations on the grouped data to return a single combined result.

If you are familiar with SQL, MongoDB aggregation covers what you would do there with count(*), GROUP BY, and ORDER BY.

In this tutorial, we’ll explore how to perform basic data manipulation using aggregations, and then we’ll build more complex queries by chaining multiple data transformations together.

MongoDB Aggregation Example:

An aggregate operation groups the documents in a collection and can then compute the average ($avg), maximum ($max), minimum ($min), total ($sum), etc. for each group.

aggregate() is the method used to run an aggregation in MongoDB. Its syntax is shown below:

db.collection_name.aggregate(aggregate_operation)

Now, let’s see how to use the aggregate method in MongoDB. Most of the examples below use the “customers” collection.

If you don’t have a collection in the database, try creating a collection.

MongoDB Aggregation Pipeline

Starting an aggregation pipeline is simple: you call the aggregate function on a collection.

Let’s start by using a simple customer object:

{
  "id": "10",
  "userName": "rijothomas",
  "firstName": "Rijo ",
  "lastName": "Thomas",
  "phoneNumber": "011-7689415",
  "city": "New Delhi",
  "state": "Delhi",
  "zip": 121008,
  "email": "rijo@techyhunger.com"
}

The customers collection can contain any number of customers with a similar data format.
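If you want a few documents to experiment with, here is a minimal sketch for seeding the collection. The extra customers (asharma, mkhan) and their values are made up for illustration; only the first one is modelled on the sample object above. The insertMany call is guarded so that it only runs inside mongosh, where the global db object exists.

```javascript
// Hypothetical sample data: only the first customer is modelled on the sample document.
const sampleCustomers = [
  { id: "10", userName: "rijothomas", firstName: "Rijo", lastName: "Thomas", city: "New Delhi", state: "Delhi", zip: 121008 },
  { id: "11", userName: "asharma", firstName: "Anita", lastName: "Sharma", city: "New Delhi", state: "Delhi", zip: 121008 },
  { id: "12", userName: "mkhan", firstName: "Mohsin", lastName: "Khan", city: "Mumbai", state: "Maharashtra", zip: 400001 },
];

// Only runs inside mongosh, where the global `db` object is defined.
if (typeof db !== "undefined") {
  db.customers.insertMany(sampleCustomers);
}

console.log(sampleCustomers.length + " sample customers prepared");
```

Note that zip is stored as a number here, which matters later when we match on it.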

To start an aggregation on the customers collection, you need to call the aggregate function on it.

See the syntax below:

> db.customers.aggregate([ ... aggregation goes here ...]);

The aggregate function takes an array of data transformations (stages), which are applied in the order in which they are defined.

This means that the transformation defined first is executed first, its output becomes the input of the next transformation, and so on.
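To make the ordering idea concrete, here is a small sketch in plain JavaScript. It is not how MongoDB implements pipelines; it only models each stage as a function from an array of documents to a new array. The names runPipeline, keepActive and amountsOnly, and the data, are hypothetical.

```javascript
// Model a pipeline: feed the output of each stage into the next one, in order.
const runPipeline = (docs, stages) =>
  stages.reduce((current, stage) => stage(current), docs);

// Two hypothetical stages.
const keepActive = (docs) => docs.filter((d) => d.status === "Active");
const amountsOnly = (docs) => docs.map((d) => d.amount);

// Hypothetical input documents.
const docs = [
  { status: "Active", amount: 50 },
  { status: "Closed", amount: 20 },
  { status: "Active", amount: 30 },
];

const result = runPipeline(docs, [keepActive, amountsOnly]);
console.log(result); // [ 50, 30 ]

// Swapping the stages changes the result: after amountsOnly there is no
// status field left to filter on, so nothing passes keepActive.
const reversed = runPipeline(docs, [amountsOnly, keepActive]);
console.log(reversed); // []
```

The same holds for real pipelines: a stage can only work with what the previous stage passed on.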

Let’s understand the aggregation pipeline by example:

db.orders.aggregate([
   { $match: { status: "Active" } },
   { $group: { _id: "$customer_id", total: { $sum: "$amount" } } }
])

In the above example, we are aggregating on the orders collection, and the pipeline has two stages.

The first stage is $match and the second stage is $group.

In the first stage, $match filters the documents whose status is ‘Active’ and passes the result to the next stage, $group.

In the second stage, $group groups the documents by the customer_id field and calculates the sum of the amount field for each unique customer_id.
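To see what these two stages compute, the following sketch walks through the same logic in plain JavaScript on a handful of made-up orders. It is an illustration of the result, not MongoDB's implementation, and the order data is hypothetical.

```javascript
// Hypothetical orders.
const orders = [
  { customer_id: "A1", status: "Active", amount: 50 },
  { customer_id: "A1", status: "Active", amount: 25 },
  { customer_id: "B2", status: "Closed", amount: 40 },
  { customer_id: "B2", status: "Active", amount: 10 },
];

// Stage 1 ($match): keep only the orders whose status is "Active".
const matched = orders.filter((o) => o.status === "Active");

// Stage 2 ($group): add up amount for each distinct customer_id.
const totals = new Map();
for (const o of matched) {
  totals.set(o.customer_id, (totals.get(o.customer_id) || 0) + o.amount);
}

// Shape the output like $group does: one document per group.
const grouped = [...totals].map(([_id, total]) => ({ _id, total }));
console.log(grouped);
// [ { _id: 'A1', total: 75 }, { _id: 'B2', total: 10 } ]
```

The closed order for B2 never reaches the grouping step, so it is not part of the total. In MongoDB itself the order of the documents returned by $group is not guaranteed.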

MongoDB Matching Documents

In the above aggregation example, the first stage of the pipeline is a match. It lets us filter documents so that we manipulate only the documents we want to transform. The match expression looks and works like the query passed to the MongoDB find function, or a WHERE clause in SQL.

To find all customers in zip code 121008, the zip of our sample customer in New Delhi, we’ll add a $match stage to our aggregation pipeline:

db.customers.aggregate([
  { $match: { "zip": 121008 }}
]);

The above example returns the customers whose zip is 121008 using the $match stage. This is similar to calling the find method on the collection.
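One detail worth a quick sketch: an equality match is type-sensitive, so a number does not match the same digits stored as a string. The example below assumes zip is stored as a number, as in the sample customer document; the customer list is made up for illustration. The mongosh calls are guarded so the snippet also runs in plain Node.js.

```javascript
// Assumption: zip is stored as a number, as in the sample customer document.
const filter = { zip: 121008 };

// Inside mongosh (where the global `db` exists) both calls select the same documents.
if (typeof db !== "undefined") {
  printjson(db.customers.find(filter).toArray());
  printjson(db.customers.aggregate([{ $match: filter }]).toArray());
}

// Plain JavaScript stand-in for the equality match, on hypothetical data.
const customers = [
  { userName: "rijothomas", zip: 121008 },
  { userName: "asharma", zip: "121008" }, // zip stored as a string
  { userName: "mkhan", zip: 400001 },
];
const matches = customers.filter((c) => c.zip === filter.zip);
console.log(matches.map((c) => c.userName)); // [ 'rijothomas' ]
```

If a $match unexpectedly returns nothing, checking the stored type of the field is a good first step.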

Let’s explore some more stages of the aggregation pipeline.

MongoDB Grouping Documents

After filtering the documents, we can group them into useful subsets. The $group stage groups documents by a specified expression and outputs one document per distinct group to the next stage.

We can also use groups to perform operations on a common field in all documents, like calculating the total of a set of transactions and counting documents.

So, before diving into more complex operations, let’s begin with something simple such as counting the documents we matched in the previous example:

db.customers.aggregate([ 
  { $match: {"zip": 121008}},
  { 
    $group: {
      _id: null, 
      count: {
        $sum: 1
      }
    }
  }
]);

The $group stage allows us to group documents together and perform operations on each group.

In the above code, we have created a new field in the results called count, which adds 1 to a running sum for each document.

Here, the _id field is required: it specifies the expression to group by, and documents that produce the same _id value fall into the same group.

Since we only want a count of all the matched documents, we set _id to null, which puts every document into a single group.

{ "_id" : null, "count" : 37 }

Here you can see that we have used $sum. With a constant such as 1 it counts the documents in a group; with a field path such as "$amount" it adds up that field across the group.

We can group documents by any field we want and perform other types of computations as well.
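For instance, grouping by a field instead of null gives one count per distinct value. The sketch below counts customers per city; the customer data is made up, and the plain JavaScript loop only mirrors what the pipeline would compute. The aggregate call is guarded so it runs only in mongosh.

```javascript
// Pipeline that counts customers per city.
const pipeline = [
  { $group: { _id: "$city", count: { $sum: 1 } } },
];

// Only runs inside mongosh, where the global `db` object is defined.
if (typeof db !== "undefined") {
  printjson(db.customers.aggregate(pipeline).toArray());
}

// Plain JavaScript equivalent on hypothetical data.
const customers = [
  { userName: "rijothomas", city: "New Delhi" },
  { userName: "asharma", city: "New Delhi" },
  { userName: "mkhan", city: "Mumbai" },
];
const counts = {};
for (const c of customers) {
  counts[c.city] = (counts[c.city] || 0) + 1; // same idea as { $sum: 1 }
}
console.log(counts); // { 'New Delhi': 2, Mumbai: 1 }
```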

Some Expressions Used by the Aggregate Function

Expression   Description
$min         Returns the minimum value across the documents in each group.
$max         Returns the maximum value across the documents in each group.
$sum         Adds up the given values across the documents in each group.
$avg         Calculates the average of the values across the documents in each group.
$first       Returns the value from the first document in each group.
$last        Returns the value from the last document in each group.
$push        Appends the values to an array in the resulting document.
$addToSet    Appends the values to an array in the resulting document, without duplicates.
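To get a feel for what each expression produces, here is a rough sketch in plain JavaScript that applies the same ideas to the values of one field within a single group. The values are hypothetical, and this only mirrors the results, not how MongoDB computes them.

```javascript
// Hypothetical values of one field, taken from the documents of a single group.
const amounts = [10, 40, 10, 20];

const summary = {
  min: Math.min(...amounts),                                   // like $min
  max: Math.max(...amounts),                                   // like $max
  sum: amounts.reduce((a, b) => a + b, 0),                     // like $sum
  avg: amounts.reduce((a, b) => a + b, 0) / amounts.length,    // like $avg
  first: amounts[0],                                           // like $first
  last: amounts[amounts.length - 1],                           // like $last
  push: [...amounts],                                          // like $push
  addToSet: [...new Set(amounts)],                             // like $addToSet
};

console.log(summary);
```

Two caveats for the real thing: $first and $last are only meaningful when the documents reach $group in a defined order (for example after a $sort), and MongoDB does not guarantee the order of elements produced by $addToSet.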

MongoDB Sorting Documents using $sort

MongoDB provides the sort() method to sort the results of a query. It accepts a document that maps each field to sort by to 1 or -1, meaning ascending or descending order respectively.

Syntax for the sort function in MongoDB:

db.COLLECTION_NAME.find().sort({KEY : 1})

Suppose we have a users collection with multiple documents and we want to list them by username in ascending or descending order.

With sort() the syntax would be:

db.users.find().sort({username : 1}) // 1 denotes 'asc'
db.users.find().sort({username : -1}) // -1 denotes 'desc'

Inside an aggregation pipeline, the $sort stage takes the same kind of sort document.
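As a sketch of the pipeline form, the snippet below defines a $sort stage for each direction and mirrors the ordering in plain JavaScript. The users are made up, and the comparator is a simple stand-in: MongoDB's own string comparison rules (and collations) can differ for non-ASCII text. The aggregate call is guarded so it runs only in mongosh.

```javascript
// $sort stages: 1 for ascending, -1 for descending.
const sortAsc = [{ $sort: { username: 1 } }];
const sortDesc = [{ $sort: { username: -1 } }];

// Only runs inside mongosh, where the global `db` object is defined.
if (typeof db !== "undefined") {
  printjson(db.users.aggregate(sortAsc).toArray());
  printjson(db.users.aggregate(sortDesc).toArray());
}

// Plain JavaScript equivalent on hypothetical users.
const users = [{ username: "meera" }, { username: "arjun" }, { username: "zoya" }];

// direction is 1 (ascending) or -1 (descending), as in the sort document.
const byUsername = (direction) => (a, b) =>
  direction * (a.username < b.username ? -1 : a.username > b.username ? 1 : 0);

const asc = [...users].sort(byUsername(1)).map((u) => u.username);
const desc = [...users].sort(byUsername(-1)).map((u) => u.username);
console.log(asc);  // [ 'arjun', 'meera', 'zoya' ]
console.log(desc); // [ 'zoya', 'meera', 'arjun' ]
```

Because $sort is a regular stage, it can be placed after $match or $group to order their output.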

Conclusion

I hope you now know what aggregation in MongoDB is and what the aggregation pipeline does. There are many operations we have not covered here, but this tutorial should get you interested in analyzing the data stored in MongoDB.

That concludes this MongoDB Aggregation tutorial with examples. I hope you liked it; you can share your thoughts and suggestions in the comment box.
