MongoDB Basic Shell Commands (part-6)
Aggregation Framework
Aggregation framework in its simple form is just another way to query data in MongoDB. It can do whatever the MongoDB Query Language (MQL) can do and more.
Let us switch to the sample_airbnb
database and query the listingsAndReviews
collection and find all documents that have Wifi
as one of the amenities. Only include price
and address
in the resulting cursor.
Using MQL
db.listingsAndReviews.find({ "amenities": "Wifi" },
{ "price": 1, "address": 1, "_id": 0 }).pretty()
Using Aggregation Framework
db.listingsAndReviews.aggregate([
{ "$match": { "amenities": "Wifi" } },
{ "$project": { "price": 1,
"address": 1,
"_id": 0 }}]).pretty()
Why aggregate
instead of find
:
Instead of just filtering or projecting data we can aggregate in a group, modify our data in a cursor, or calculate them.
The aggregation framework works as a pipeline. where the order of actions in the pipeline matters. And each action is executed in the order in which we list it.
We give our data to the pipeline on one end, then we describe how this pipeline is going to treat our data using aggregation stages. And then the transformed data emerges at the end of the pipeline.
In the following case, we have two separate filters in the pipeline. The first filter is the $match
stage, which acts as a filter that keeps all the amenities without Wifi
from passing through to the next stage of the pipeline. The second filter is the project
stage that filters out all the fields that are not address
or price
from each document.
With aggregation, we can compute and reshape data, unlike MQL which only can find and update or delete.
$group
operator:
One of the most important operators in the aggregation framework is the $group
operator.
Not filtering stages like grouping in the aggregation pipeline do not modify the original data. Instead they work with the data in the cursor.
Syntax:
Let us project only the address
field value for each document, then group all documents into one document per address.country
value.
db.listingsAndReviews.aggregate([ { "$project": { "address": 1, "_id": 0 }},
{ "$group": { "_id": "$address.country" }}])
The resulting cursor may look like this:
Now Let us project only the address
field value for each document, then group all documents into one document per address.country
value, and count one for each document in each group.
db.listingsAndReviews.aggregate([
{ "$project": { "address": 1, "_id": 0 }},
{ "$group": { "_id": "$address.country",
"count": { "$sum": 1 } } }
])
The resulting cursor may look like this:
Still confused about the $group
and $sum
operator? Look at this picture:
So Aggregation Framework is really a powerful tool.
sort()
and limit()
cursor methods:
sort()
will sort the data in a certain order (ascending or descending) depending on the argument passed into it. limit()
limits the number of documents returned as the result of the query.
Followings are some commands containing those cursor:
use sample_training
db.zips.find().sort({ "pop": 1 }).limit(1)
db.zips.find({ "pop": 0 }).count()
db.zips.find().sort({ "pop": -1 }).limit(1)
db.zips.find().sort({ "pop": -1 }).limit(10)
db.zips.find().sort({ "pop": 1, "city": -1 })
Some other cursor methods are pretty()
and count()
that we have already used in this series.
Thanks for reading, any correction or recommendation or question is welcome.