MongoDB Basic Shell Commands (part-7)

MongoDB Basic Shell Commands (part-7)

MongoDB Basic Shell Commands (part-6)

Aggregation Framework

Aggregation framework in its simple form is just another way to query data in MongoDB. It can do whatever the MongoDB Query Language (MQL) can do and more. image.png

Let us switch to the sample_airbnb database and query the listingsAndReviews collection and find all documents that have Wifi as one of the amenities. Only include price and address in the resulting cursor.

Using MQL

db.listingsAndReviews.find({ "amenities": "Wifi" },
                           { "price": 1, "address": 1, "_id": 0 }).pretty()

Using Aggregation Framework

db.listingsAndReviews.aggregate([
                                  { "$match": { "amenities": "Wifi" } },
                                  { "$project": { "price": 1,
                                                  "address": 1,
                                                  "_id": 0 }}]).pretty()

image.png

Why aggregate instead of find:

Instead of just filtering or projecting data we can aggregate in a group, modify our data in a cursor, or calculate them.

The aggregation framework works as a pipeline. where the order of actions in the pipeline matters. And each action is executed in the order in which we list it.

We give our data to the pipeline on one end, then we describe how this pipeline is going to treat our data using aggregation stages. And then the transformed data emerges at the end of the pipeline.

image.png

In the following case, we have two separate filters in the pipeline. The first filter is the $match stage, which acts as a filter that keeps all the amenities without Wifi from passing through to the next stage of the pipeline. The second filter is the project stage that filters out all the fields that are not address or price from each document. image.png

With aggregation, we can compute and reshape data, unlike MQL which only can find and update or delete. image.png

$group operator:

One of the most important operators in the aggregation framework is the $group operator.

image.png Not filtering stages like grouping in the aggregation pipeline do not modify the original data. Instead they work with the data in the cursor.

Syntax:

image.png

Let us project only the address field value for each document, then group all documents into one document per address.country value.

db.listingsAndReviews.aggregate([ { "$project": { "address": 1, "_id": 0 }},
                                  { "$group": { "_id": "$address.country" }}])

The resulting cursor may look like this:

image.png

Now Let us project only the address field value for each document, then group all documents into one document per address.country value, and count one for each document in each group.

db.listingsAndReviews.aggregate([
                                  { "$project": { "address": 1, "_id": 0 }},
                                  { "$group": { "_id": "$address.country",
                                                "count": { "$sum": 1 } } }
                                ])

The resulting cursor may look like this:

image.png

Still confused about the $group and $sum operator? Look at this picture:

image.png

So Aggregation Framework is really a powerful tool.

image.png

sort() and limit() cursor methods:

sort() will sort the data in a certain order (ascending or descending) depending on the argument passed into it. limit() limits the number of documents returned as the result of the query.

image.png

Followings are some commands containing those cursor:

use sample_training

db.zips.find().sort({ "pop": 1 }).limit(1)

db.zips.find({ "pop": 0 }).count()

db.zips.find().sort({ "pop": -1 }).limit(1)

db.zips.find().sort({ "pop": -1 }).limit(10)

db.zips.find().sort({ "pop": 1, "city": -1 })

Some other cursor methods are pretty() and count() that we have already used in this series.

image.png

Thanks for reading, any correction or recommendation or question is welcome.