Query Index Performance

Query index performance

Indexes dramatically improve the speed and the efficient resolution of read operations. Without indexes, MongoDB must scan every document of a collection to select those documents that match the query statement. This scan is highly inefficient and require the MongoDB to process a large volume of data.

Indexes store a small portion of the data set in an easy to traverse form. The index stores the value of a specific field or set of fields, ordered by the value of the field as specified in index.

CreateIndex command – is very similar to sort command (1 = ascending, -1 = descending). Compound index has multiple fields in it’s definition. You can index array contents, subdocuments and subfields.

Optional you can specify the name of the index. If unspecified, MongoDB generates an index name by concatenating the names of the indexed fields and the sort order.

If the field does not exists in any of the documents from the collection, MongoDB will create the index without any warning. A MongoDB index can have keys of different types (i.e., ints, dates, string).

Retrieve the list of indexes for a collection:

Delete an index:

ObjectID – unique, autogenerated (unless specified), 12 bytes, it provides valuable information about the document: eg. when was created:

Explain queries:

MongoDB allow to examine queries and see what indexes are used and figure out the performance of the query statement.

$explain

  • nReturned – indicates the number of documents matching returned
  • nscannedObjects indicates the total number of documents scanned
  • nscanned indicates the total number of documents or index entries scanned

We look for totalKeysExamined = 0 (no index yet) and totalDocsExamined = 2 and nReturned  = 2.

Another example, this time using 1.000.000 documents.

and output:

Collection scan (COLLSCAN) occurs when indexes are not involved:

$explain allows you to evaluate the write operations. When applying it does not perform the operations but it estimates the time required to perform it and figure out what indexes are used.

Output:

queryPlanner vs executionStats vs allPlansExecution

The behavior of db.collection.explain() and the amount of information returned depend on the verbosity mode. The default mode for $explain is queryPlanner.

executionStats includes queryPlanner and additional information:

  • time to execute the query
  • number of documents returned
  • documents examined.

The output:

MongoDB runs the query optimizer to choose the winning plan, executes the winning plan to completion, and returns statistics describing the execution of the winning plan.

If we drop the index and rerun the statement the output will change:

We see executionTimeMillis = 24ms, totalKeysExamined and totalDocsExamined but also the executionStages.

allPlansExecution - like executionStats but runs each available plan and returns the stats for each one.

The output will show all available execution plans with the stats:

If you look on allPlansExecution, the first plan returned nReturned = 0 documents and was stopped because it was overruled by secondPlan that was already finished.

$hint

Indexes are automatically used by mongoDB when performing queries (apart from using sparse index). The $hint operator forces the query optimizer to use the specified index to run a query. This is useful when you want to test performance of a query with different indexes.

Covered queries

As per the official MongoDB documentation, a covered query is a query in which:

  • all the fields in the query are part of an index and
  • all the fields returned in the query are in the same index

Since all the fields present in the query are part of an index, MongoDB matches the query conditions and returns the result using the same index without actually looking inside documents. Since indexes are present in RAM, fetching data from indexes is much faster as compared to fetching data by scanning documents.

Selectivity is the primary factor that determines how efficiently an index can be used. Ideally, the index enables us to select only those records required to complete the result set, without the need to scan a substantially larger number of index keys (or documents) in order to complete the query. Selectivity determines how many records any subsequent operations must work with. Fewer records means less execution time.

MongoDB Index types

In MongoDB there are few index types:

Sparse index is useful in case of rare fields. Sparse indexes only contain entries for documents that have the indexed field, even if the index field contains a null value. The index skips over any document that is missing the indexed field.

TTL index – MongoDB is using to automatically remove documents from a collection after a certain amount of time.

No TTL indexed document will be deleted before the expireAfterSeconds is passed since the last update in that document.

Geospatial index – supports only 2d dimensions (eg: locations: [longitude, latitude]).

Text indexes – the field you want to index must be a text.

 

Additional links:

  1. https://docs.mongodb.org/manual/reference/method/db.collection.createIndex/
  2. http://geojson.org/