Introduction to MongoDB Indexes

Introduction to MongoDB IndexesDvir AradBlockedUnblockFollowFollowingMay 26What are the MongoDB indexes?How do they work?What problem do indexes try to solve?Answer: Slow queriesWe can look at MongoDB collection’s indexes as books indexesCollection ScanIf you don’t use an index when we query our collection, then the database will have to look at every single document* Our collection grows in size, we will have to search through more and more documents to satisfy our query* Order of N operation* Linear running timeIndex ScanRather than searching through every document, we can search through the ordered index first.

Key-Value pair, where the key is the value of the field that we’ve indexed on, and the value of the key is the actual document itself.

* _id is automatically indexedIt is possible to have many indexes on the same collection.

You might create multiple indexes in different fields if you find that you have different queries for different fields.

MongoDB uses a data structure called a b-tree to store its indexesThe awesome query performance gain that we get with indexes doesn’t come for free.

With each additional index, we decrease our write speed for a collection.

If a document were to change, or if it was completely removed, one or more of our b-trees might need to be balanced.

This means that we need to be careful when creating indexes.

We don’t want to have too many unnecessary indexes in a collection because there would then be an unnecessary loss in insert, update, and delete performance.

You should have a good idea of what indexes are, their pros and cons, and how they work.

Single Field IndexThe simplest index that MongoDBdb.

<collection>.

createIndex({<field>:<direction>})Key features:* The keys are from only one field.

 * Can find a single value for the indexed field.

 * Can find a range of values.

 * Can use dot notation to index fields in subdocuments.

* Can be used to find several distinct values in a single queryExample: db.

getCollection(‘people’).

find({“ssn”:”829–00–2162"}).

explain(“executionStats”){“queryPlanner” : {… },“winningPlan” : {“stage” : “COLLSCAN”, …},“executionStats” : {“nReturned” : 52,“executionTimeMillis” : 30,“totalKeysExamined” : 0,“totalDocsExamined” : 20267,“nReturned” : 52,…}We can see that we use here a collection scan(COLLSCAN), we examined 0 index, 20,267 documents and return 52 documents.

The server worked about 30 milliseconds.

let’s add Index to our collectiondb.

people.

createIndex( { ssn : 1 } )run the same query again db.

getCollection(‘people’).

find({“ssn”:”829–00–2162"}).

explain(“executionStats”){“queryPlanner” : {… },“winningPlan” : {“stage” : “FETCH”,“inputStage” : {“stage” : “IXSCAN”, …},“executionStats” : {“executionTimeMillis” : 2,“totalKeysExamined” : 52,“totalDocsExamined” : 52,“nReturned” : 52,…}Now with Index scan(IXSCAN), we can see that we examined 52 keys and 52 documents and return 52 documents.

The server worked about 2 milliseconds.

Sorting with IndexIn memory sortingThe documents in our collections are stored on disk in an unknown order.

The server is going have to read the documents from disk into RAM.

In RAM, it will perform some kind of sorting algorithm on them.

Depending on how many documents you have, this might take a long timeMoreover, because sorting a large number of documents in memory might be an expensive operation, the server is going to abort sorting in memory when 32 megabytes of memory is being used.

Example:db.

getCollection(‘people’).

find({}).

sort({ssn:1}).

explain(“executionStats”){“queryPlanner” : {“winningPlan” : {“stage” : “SORT”,“sortPattern” : { “identity_unique_id” : 1.

0 },“inputStage” : {“stage” : “SORT_KEY_GENERATOR”,“inputStage” : {“stage” : “COLLSCAN”,“direction” : “forward” … },“executionStats” : {“nReturned” : 20267,“executionTimeMillis” : 202,“totalKeysExamined” : 0,“totalDocsExamined” : 20267}The server run as memory sorting and execut about 202 miliseceondIndex sortingThe keys are ordered according to the field specified during index creation.

The server can take advantage of this via sort.

The order of the documents returned is guaranteed to be sorted by the index keys.

This means that there is no need to reform an explicit sort, as the documents will be fetched from the server in the sorted orderExample:let’s add Index to our collectiondb.

people.

createIndex( { ssn : 1 } )db.

getCollection(‘people’).

find({}).

sort({ssn:1}).

explain(“executionStats”){“queryPlanner” : {“winningPlan” : {“stage” : “FETCH”,“inputStage” : {“stage” : “IXSCAN”,“keyPattern” : { “identity_unique_id” : 1 },“indexName” : “identity_unique_id_1”,“direction” : “forward” },“executionStats” : {“nReturned” : 20267,“executionTimeMillis” : 35,“totalKeysExamined” : 20267,“totalDocsExamined” : 20267,}The server run as index sorting and execut about 35 miliseceondCompound IndexesA compound index is an index on two or more fields, and it can support queries based on those fieldsNow, in MongoDB, indexes are b-trees and they have an ordering, and that ordering is flat.

So if you have two fields, you might somehow think that your indexes are two-dimensional in some way, but they’re not.

They’re actually one-dimensional.

You can actually think of index keys as an ordered list.

Index prefixIndex prefixes are the beginning subsets of indexed fields.

For example, consider the following compound index:{ “item”: 1, “location”: 1, “stock”: 1 }The index has the following index prefixes:* { item: 1 }* { item: 1, location: 1 }For a compound index, MongoDB can use the index to support queries on the index prefixes.

As such, MongoDB can use the index for queries on the following fields: The item field, the item field and the location field, the item field and the location field and the stock field.

MongoDB can also use the index to support a query on item and stock fields since the item field corresponds to a prefix.

However, the index would not be as efficient in supporting the query as would be an index on only item and stock.

However, MongoDB cannot use the index to support queries that include the following fields since, without the item field, none of the listed fields correspond to a prefix index:The location field, the stock field, or the location and stock fields.

If you have a collection that has both a compound index and an index on its prefix (e.

g.

{ last_name: 1, first_name: 1 } and {last_name: 1 }), if neither index has a sparse or unique constraint, then you can remove the index on the prefix (e.

g.

{last_name: 1 }).

MongoDB will use the compound index in all of the situations that it would have used the prefix index.

Sort orderIndexes store references to fields in either ascending (1) or descending (-1) sort order.

For single-field indexes, the sort order of keys doesn’t matter because MongoDB can traverse the index in either direction.

However, for compound indexes, sort order can matter in determining whether the index can support a sort operation.

.

. More details

Leave a Reply