Aggregate 10000 records takes more than an hour

I just aggregate collectionA(total 9k records) to collectionB ( total 10k records) like this but it takes more than 1-2 hours to finish

db.collectionB .aggregate( [
  {
		$match:{"ngayGDich":{$gt:new Date('2021-12-09T07:00'),$lte: new Date('2021-12-10T07:00')},
				"loaiGDich":{"$eq":"204"}
		}
	},
  {
		$lookup:{
			from: "collectionA",
			localField : "maGDichTChieu",
			foreignField: "maGDich",
			as: "data102"
		},
	},
  {
	$unwind:"$data102"
  },
  {
    $facet: {
      "soLuong200": [
        
        {$match: 	 
			{"data102.loaiGDich": {"$eq": "200"}},
		},{
			$count:"value"
		}
      ],
      "soLuong203": [
        {$match:
			
						{"data102.loaiGDich": {"$eq": "203"}},
                
		},{
			$count:"value"
		}
      ],
	  "soLuong300": [
        {$match:
						{"data102.loaiGDich": {"$eq": "300"}},
                    
		},{
			$count:"value"
		}
      ],
	  "soLuong400": [
        {$match:
						{"data102.loaiGDich": {"$eq": "400"}},
                    
		},{
			$count:"value"
		}
      ],
      
    }
  },
  
])
  • my db size about 300 MB
    Can another way resolve this issue?

Hello @le_an,

There are many factors that can affect the performance of a query. Browse through this MongoDB documentation topic on Aggregation Pipeline Optimization.

An important aspect that can affect the performance is the Indexes. See Indexing Strategies for a few ideas.

Tell about a few things about your data, like:

  • Are there any indexes created on the collections (db.collection.getIndexes() returns the list of indexes)
  • What is the version of the MongoDB you are working with

In case you have already defined indexes on the collections, you can also generate a Query Plan, which tells about any indexes used (or not) and also can have information about why the query is taking a long time. The db.collection.explain() can be used with your aggregation query to get a query plan (use this with the verbosity of “executionStats”).

1 Like

If I may add. It looks like you are doing with $facet what is really supposed to be done with $group.

I think you could replace your $facet with:

{ "$group" :
  { "_id" : "$data102.loaiGDich" , "value": { "$sum" : 1 } }
}
1 Like