MongoDB Basic Shell Commands (part-7)

Aggregation Framework

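In its simplest form, the aggregation framework is just another way to query data in MongoDB. Any query that can be written with the MongoDB Query Language (MQL) find() can also be written as an aggregation pipeline, and the framework can do more besides.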

Let us switch to the sample_airbnb database and query the listingsAndReviews collection for all documents that list Wifi among their amenities. Only the price and address fields should be included in the resulting cursor.

Using MQL

use sample_airbnb

db.listingsAndReviews.find({ "amenities": "Wifi" },
                           { "price": 1, "address": 1, "_id": 0 }).pretty()

Using Aggregation Framework

db.listingsAndReviews.aggregate([
                                  { "$match": { "amenities": "Wifi" } },
                                  { "$project": { "price": 1,
                                                  "address": 1,
                                                  "_id": 0 }}]).pretty()

Why `aggregate` instead of `find`:

Instead of only filtering or projecting data, aggregate lets us group documents, reshape the data in the cursor, and run calculations on it.

The aggregation framework works as a pipeline, where the order of the actions matters: each action is executed in the order in which we list it.

We feed our data into one end of the pipeline and describe, using aggregation stages, how the pipeline should treat it. The transformed data then emerges at the other end.

In the example above, the pipeline has two stages that both act as filters. The first is the $match stage, which keeps documents that do not list Wifi among their amenities from passing through to the next stage of the pipeline. The second is the $project stage, which filters out every field other than address and price from each document.
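
To make the pipeline idea concrete, here is a minimal plain-JavaScript sketch that mimics those two stages on an in-memory array. It does not talk to MongoDB: the `listings` sample data and the helper names `matchStage`, `projectStage`, and `runPipeline` are made up for illustration, and the real `$match` and `$project` stages support far more than this.

```javascript
// Hypothetical in-memory stand-in for a few listingsAndReviews documents.
const listings = [
  { _id: 1, price: 80, address: { country: "Portugal" }, amenities: ["Wifi", "Kitchen"] },
  { _id: 2, price: 120, address: { country: "Brazil" }, amenities: ["Kitchen"] },
  { _id: 3, price: 95, address: { country: "Portugal" }, amenities: ["TV", "Wifi"] },
];

// Mimics { "$match": { "amenities": "Wifi" } }: keep only documents whose
// amenities array contains "Wifi".
const matchStage = (docs) => docs.filter((doc) => doc.amenities.includes("Wifi"));

// Mimics { "$project": { "price": 1, "address": 1, "_id": 0 } }: keep only
// the price and address fields of each document.
const projectStage = (docs) => docs.map(({ price, address }) => ({ price, address }));

// A pipeline is an ordered list of stages; each stage receives the output
// of the previous one.
const runPipeline = (docs, stages) =>
  stages.reduce((current, stage) => stage(current), docs);

const result = runPipeline(listings, [matchStage, projectStage]);
console.log(result);
```

If the two stages were swapped, the filter would no longer have an amenities field to inspect, which is one way to see why the order of stages matters.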

With aggregation we can compute and reshape data, unlike MQL, which can only find, update, or delete it.

`$group` operator:

One of the most important operators in the aggregation framework is the $group operator.

Stages that do not filter, such as $group, do not modify the original data in the collection. Instead, they work with the data in the cursor.

Syntax:

{ $group: { _id: <expression>,
            <field>: { <accumulator>: <expression> }, ... } }

Let us project only the address field value for each document, then group all documents into one document per address.country value.

db.listingsAndReviews.aggregate([ { "$project": { "address": 1, "_id": 0 }},
                                  { "$group": { "_id": "$address.country" }}])

The resulting cursor contains one document per distinct address.country value, each with only an _id field holding the country name.

Now let us project only the address field value for each document, then group all documents into one document per address.country value, and count one for each document in each group.

db.listingsAndReviews.aggregate([
                                  { "$project": { "address": 1, "_id": 0 }},
                                  { "$group": { "_id": "$address.country",
                                                "count": { "$sum": 1 } } }
                                ])

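Each document in the resulting cursor now has an _id field holding the country and a count field holding the number of documents in that group.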

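Still confused about the $group and $sum operators? $group puts every document with the same _id expression value into one bucket, and { "$sum": 1 } adds 1 to that bucket's count for each document that lands in it.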
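
The grouping and counting can be sketched in plain JavaScript as follows. This is only an in-memory illustration under assumed sample data: `projected` and `groupByCountry` are hypothetical names, and the country values are invented rather than taken from the real collection.

```javascript
// Hypothetical documents, as they might look after the $project stage.
const projected = [
  { address: { country: "Portugal" } },
  { address: { country: "Brazil" } },
  { address: { country: "Portugal" } },
  { address: { country: "Canada" } },
  { address: { country: "Brazil" } },
  { address: { country: "Portugal" } },
];

// Mimics { "$group": { "_id": "$address.country", "count": { "$sum": 1 } } }.
function groupByCountry(docs) {
  const groups = new Map();
  for (const doc of docs) {
    const key = doc.address.country; // value of the "_id" expression
    if (!groups.has(key)) {
      // First document seen for this key: open a new group.
      groups.set(key, { _id: key, count: 0 });
    }
    // "$sum": 1 adds one to the group's count for every document.
    groups.get(key).count += 1;
  }
  return [...groups.values()];
}

const counts = groupByCountry(projected);
console.log(counts);
```

With this sample data, Portugal ends up with a count of 3, Brazil with 2, and Canada with 1. The sketch returns groups in the order they were first seen; the real $group stage makes no such promise about output order.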

So the aggregation framework is a really powerful tool.

`sort()` and `limit()` cursor methods:

sort() orders the documents in ascending or descending order, depending on the argument passed to it: 1 for ascending and -1 for descending. limit() limits the number of documents returned by the query.

The following commands use these cursor methods:

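// Switch to the sample_training database
use sample_training

// A document with the smallest population
db.zips.find().sort({ "pop": 1 }).limit(1)

// How many documents have a population of 0
db.zips.find({ "pop": 0 }).count()

// A document with the largest population
db.zips.find().sort({ "pop": -1 }).limit(1)

// The 10 documents with the largest populations
db.zips.find().sort({ "pop": -1 }).limit(10)

// All documents by ascending population; ties are ordered by city name, descending
db.zips.find().sort({ "pop": 1, "city": -1 })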
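
The compound sort in the last command is the least obvious one, so here is a rough plain-JavaScript sketch of what sort() and limit() do to a list of documents. The `zips` array, its city names, and the helpers `sortByPopThenCity` and `limit` are hypothetical, and the string comparison only approximates how MongoDB orders strings by default.

```javascript
// Hypothetical stand-in for a few documents from the zips collection.
const zips = [
  { city: "ALPHA", pop: 0 },
  { city: "DELTA", pop: 500 },
  { city: "BRAVO", pop: 0 },
  { city: "CHARLIE", pop: 12000 },
];

// Descending comparison of two strings.
const compareDesc = (x, y) => (x < y ? 1 : x > y ? -1 : 0);

// Mimics sort({ "pop": 1, "city": -1 }): ascending pop first, and for
// documents with equal pop, descending city. The array is copied so the
// input is left untouched.
const sortByPopThenCity = (docs) =>
  [...docs].sort((a, b) => a.pop - b.pop || compareDesc(a.city, b.city));

// Mimics limit(n): keep only the first n documents.
const limit = (docs, n) => docs.slice(0, n);

const sorted = sortByPopThenCity(zips);
const smallest = limit(sorted, 1);
console.log(sorted.map((z) => z.city));
console.log(smallest);
```

The second sort key only comes into play when two documents tie on the first one, which is why BRAVO is listed before ALPHA here even though both have a population of 0.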

Other cursor methods that we have already used in this series are pretty() and count().

Thanks for reading. Any corrections, recommendations, or questions are welcome.
