Aggregation Pipeline In MongoDB and the use of $unwind & $lookup(Part 3)

Aggregation Pipeline In MongoDB and the use of $unwind & $lookup(Part 3)

Hello and welcome back, my fellow readers and sorry for the delay, in this part of the aggregation pipeline series where we deeply explore further operators that are used frequently in the pipeline operation, that can be performed in a certain dataset to get the desired output.

Before moving forward, let us first revisit what we have learned, the operators we have explored and the topics that have been covered in the previous part,

  1. An aggregation pipeline consists of one or more stages that process documents, the output of the first stage is fed as the input to the second stage and so on, till the desired output is achieved.

  2. Types of aggregation that can be used in MongoDB, such as Map-reduce function, single purpose aggregation and usage of pipeline.

  3. What are the $match operator and $group operator?

Let us now move forward in learning other operators that are usually used in the aggregation pipeline process, and exploring the power of MongoDb's aggregation operators, starting with $unwind.

$unwind:

As the name suggests, unwind means uncoiling something which is tangled, In the mongoDB case it is an array, for example, you have a certain document inside your collections and each document has an array with certain items inside it, then we can extract each item of an array, with a similar document.

Let us see the $unwind in action,

//the basic syntax of the unwind operator is as shown below
//includeArrayIndex & preserveNullAndEmptyArrays is an optional field
//which performs the function of providing the item's index and null 
//and empty array as well respectively
{
  $unwind:
    {
      path: <field path>,
      includeArrayIndex: <string>,
      preserveNullAndEmptyArrays: <boolean>
    }
}

let us suppose we have student collections, with the subject filed as an array, which indicates the subject that a particular student is enrolled for, we can unwind this subject field and extract, items inside it individually

//Consider a student collection,with each document having a subject field as an array
{ "_id": ObjectId("512bc95fe835e68f199c8686"), "student": "Dave Smith", "subject": ["English", "Maths", "Physics", "Biology"]}
{ "_id": ObjectId("512bc962e835e68f199c8687"), "student": "Sarah Tylor", "subject" :["Chemistry", "Physics", "Psychology"] }
{ "_id": ObjectId("55f5a192d4bede9ac365b257"), "student": "John Doe", "subject" :"History" }

//Now if we unwind the collections based on the subject
db.students.aggregate([{$unwind:"$subject"}])

//The operation returns the following results:
{ "_id": ObjectId("512bc95fe835e68f199c8686"), "student": "Dave Smith", "subject": "English" }
{ "_id": ObjectId("512bc95fe835e68f199c8686"), "student": "Dave Smith", "subject": "Maths" }
{ "_id": ObjectId("512bc95fe835e68f199c8686"), "student": "Dave Smith", "subject": "Physics" }
{ "_id": ObjectId("512bc95fe835e68f199c8686"), "student": "Dave Smith", "subject": "Biology" }
{ "_id": ObjectId("512bc962e835e68f199c8687"), "student": "Sarah Tylor", "subject": "Chemistry" }
{ "_id": ObjectId("512bc962e835e68f199c8687"), "student": "Sarah Tylor", "subject": "Physics" }
{ "_id": ObjectId("512bc962e835e68f199c8687"), "student": "Sarah Tylor", "subject": "Psychology" }
{ "_id": ObjectId("55f5a192d4bede9ac365b257"), "student": "John Doe", "subject": "History" }
//we will get each document for each item of the subject array,
//we can also note that although for student "John Doe", the subject was
//not an array but $unwind considers it as an array of a single element.

From the above discussion, it is clear that we can use the $unwind operator whenever we want to uncoil the array field inside our collection and bring them into a single separated document, But Before moving forward with our next operator, there are certain that we need to keep in mind while working with unwind operator.

  1. $unwind treats the subject field as a single element array if:

  2. the field is present,

  3. the value is not null, and

  4. the value is not an empty array.

The above rules are generally taken care of when there is a subject field which has a null value, or the subject field is not present in this case by default those documents will not be returned as output but we can include those null documents by making use our optional filed called preserveNullAndEmptyArrays.

//here is an example student collection with a null and empty array as the subject
{ "_id": ObjectId("512bc95fe835e68f199c8686"), "student": "Dave Smith", "subject": ["English", "Maths", "Physics", "Biology"]}
{ "_id": ObjectId("512bc962e835e68f199c8687"), "student": "Sarah Tylor", "subject" :["Chemistry", "Physics", "Psychology"] }
{ "_id": ObjectId("55f5a192d4bede9ac365b257"), "student": "John Doe", "subject":"History" }
{ "_id": ObjectId("55f5a192d4bede9ac365b258"), "student": "li ben", "subject": []}
{ "_id": ObjectId("55f5a1d3d4bede9ac365b25a"), "student": "Brendon Scott", "subject":null }
{ "_id": ObjectId("55f5a1d3d4bede9ac365b25b"), "student": "Emily Walker" }

//to include all the null and empty array documents as in our output
db.students.aggregate([{$unwind:{path:"$subject",preserveNullAndEmptyArrays: true}])

//The result of the operations as shown  below
{ "_id": ObjectId("512bc95fe835e68f199c8686"), "student": "Dave Smith", "subject": "English" }
{ "_id": ObjectId("512bc95fe835e68f199c8686"), "student": "Dave Smith", "subject": "Maths" }
{ "_id": ObjectId("512bc95fe835e68f199c8686"), "student": "Dave Smith", "subject": "Physics" }
{ "_id": ObjectId("512bc95fe835e68f199c8686"), "student": "Dave Smith", "subject": "Biology" }
{ "_id": ObjectId("512bc962e835e68f199c8687"), "student": "Sarah Tylor", "subject": "Chemistry" }
{ "_id": ObjectId("512bc962e835e68f199c8687"), "student": "Sarah Tylor", "subject": "Physics" }
{ "_id": ObjectId("512bc962e835e68f199c8687"), "student": "Sarah Tylor", "subject": "Psychology" }
{ "_id": ObjectId("55f5a192d4bede9ac365b257"), "student": "John Doe", "subject": "History" }
{ "_id": ObjectId("55f5a192d4bede9ac365b258"), "student": "li ben"}
{ "_id": ObjectId("55f5a1d3d4bede9ac365b25a"), "student": "Brendon Scott"}
{ "_id": ObjectId("55f5a1d3d4bede9ac365b25b"), "student": "Emily Walker"}
//from the above output we can see that this time the empty array and null
//subject filed documents are also been returned at the output.

$lookup:

The lookup operator as the name suggests, allows us to look into the other collections, from where we can extract certain fields, the lookup operator in MongoDB is similar to "join" operations performed in SQL operations, where we can join the fields of the two different tables.

The Basic Definition and Syntax for lookup operators are shown below

The $lookup stage adds a new array field to each input document. The new array field contains the matching documents from the "joined" collection. The $lookup stage passes these reshaped documents to the next stage.

//the syntax for operating using lookup operations
{
   $lookup:
     {
      //"from" field specify the outer collection
       from: <collection to join>, 
       localField: <field from the input documents>,
       foreignField: <field from the documents of the "from" collection>,
       as: <output array field>
     }
}

let's move forward to see the $lookup operator in action, where we have an order and Inventory collections.

//suppose we have an order collection with documents
{ "_id" : 1, "item" : "almonds", "price" : 12, "quantity" : 2 },
{ "_id" : 2, "item" : "pecans", "price" : 20, "quantity" : 1 },
{ "_id" : 3  }

//suppose we have an other collection named as inventory
 { "_id" : 1, "sku" : "almonds", "description": "product 1", "instock" : 120 },
   { "_id" : 2, "sku" : "bread", "description": "product 2", "instock" : 80 },
   { "_id" : 3, "sku" : "cashews", "description": "product 3", "instock" : 60 },
   { "_id" : 4, "sku" : "pecans", "description": "product 4", "instock" : 70 },
   { "_id" : 5, "sku": null, "description": "Incomplete" },
   { "_id" : 6 }

//we can pefroms the lookup operation here to see the number of inventory
//left for the paticular ordered item or not by writing the agregationa as
db.orders.aggregate([{
                    $lookup:
                    {
                    from: "inventory",
                     localField: "item",
                    foreignField: "sku",
                    as: "inventory_docs"
                     }
                    }])

//it will return the output as
{
   "_id" : 1,
   "item" : "almonds",
   "price" : 12,
   "quantity" : 2,
   "inventory_docs" : [
      { "_id" : 1, "sku" : "almonds", "description" : "product 1", "instock" : 120 }
   ]
}
{
   "_id" : 2,
   "item" : "pecans",
   "price" : 20,
   "quantity" : 1,
   "inventory_docs" : [
      { "_id" : 4, "sku" : "pecans", "description" : "product 4", "instock" : 70 }
   ]
}
{
   "_id" : 3,
   "inventory_docs" : [
      { "_id" : 5, "sku" : null, "description" : "Incomplete" },
      { "_id" : 6 }
   ]
}

From the above discussions, of the $lookup operator, we got a clear understanding of when to use the lookup operator and how it is useful in managing the different collection data to get the required output, such in this case managing the orders and inventory system, to validate the orders of a particular item by looking up into the inventory collections and thus this is the oer of MongoDB which we can use to simply fly the development process and ease the load from the frontend.
By manipulating the data at the backend and so on.

Conclusions:

In this Article, we now get a basic understanding of what is the $unwind and $lookup operators, why it is used and how they effective techniques for unwinding the data and also validating certain conditions by looking up in different collections which involve a tedious process the front end if they are not handled from the backend.

But MongoDB's aggregation pipeline can do this thing in merely two lines of command to extract the particular required data, and that is what we call the essence of MongoDB Aggregation Power which can solve a bigger problem.

I hope you like this article a lot and would appreciate my work, I have learned this thing from the internet as well do check the below link, and also stay tuned for further part of this series.

Did you find this article valuable?

Support Ganesh Yadav by becoming a sponsor. Any amount is appreciated!