Why is a regex prefix query on an indexed array slow in MongoDB?
I am trying to run a regex query on an array of strings in a MongoDB collection. The only limitation I could find is the following:
$regex can only use an index efficiently when the regular expression has an anchor for the beginning of a string (i.e. ^) and is a case-sensitive match.
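As a rough illustration of that rule, here is a minimal sketch in plain JavaScript. `canUseIndexPrefix` is a hypothetical helper, not part of MongoDB; it checks only the two conditions quoted above and ignores every other case the server may consider.

```javascript
// Hypothetical helper: checks only the two documented conditions,
// a leading ^ anchor and a case-sensitive match (no "i" flag).
function canUseIndexPrefix(regex) {
  const anchored = regex.source.startsWith('^');
  const caseSensitive = !regex.flags.includes('i');
  return anchored && caseSensitive;
}

console.log(canUseIndexPrefix(/^a_(0)?_12$/)); // anchored, case-sensitive
console.log(canUseIndexPrefix(/a_(0)?_12$/));  // not anchored
console.log(canUseIndexPrefix(/^a_(0)?_12$/i)); // case-insensitive
```

The regex used in the test satisfies both conditions, so the rule by itself does not explain the slow query.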
Let's do a test:
    > for (var i = 0; i < 100000; i++) ...
    > db.test.count()
    100000
    > db.test.ensureIndex({f: 1})
    > db.test.find({f: /^a_(0)?_12$/})
    { "_id" : ObjectId("514ac59886f004fe03ef2a96"), "f" : [ "a_0_12", "a_1_2" ] }
    > db.test.find({f: /^a_(0)?_12$/}).explain()
    {
        "cursor" : "BtreeCursor f_1 multi",
        "isMultiKey" : true,
        "n" : 1,
        "nscannedObjects" : 200000,
        "nscanned" : 200000,
        "nscannedObjectsAllPlans" : 200000,
        "nscannedAllPlans" : 200000,
        "scanAndOrder" : false,
        "indexOnly" : false,
        "nYields" : 0,
        "nChunkSkips" : 0,
        "millis" : 482,
        "indexBounds" : {
            "f" : [
                [ "a_", "a`" ],
                [ /^a_(0)?_12$/, /^a_(0)?_12$/ ]
            ]
        },
        "server" : "someserver:27017"
    }
The query is sloooow. On the other hand, this query is optimal (but it does not fit my use case):

    > db.test.find({f: 'a_0_12'}).explain()
    {
        "cursor" : "BtreeCursor f_1",
        "isMultiKey" : true,
        "n" : 1,
        "nscannedObjects" : 1,
        "nscanned" : 1,
        "nscannedObjectsAllPlans" : 1,
        "nscannedAllPlans" : 1,
        "scanAndOrder" : false,
        "indexOnly" : false,
        "nYields" : 0,
        "nChunkSkips" : 0,
        "millis" : 0,
        "indexBounds" : { ...

Why is the regex query scanning all (sub)records when it has an index? What am I missing?
Your test case has several characteristics that are unhelpful for regex and index usage:
- Each document contains an array of values that all start with "a_". Your regex /^a_(0)?_12$/ looks for a string that starts with "a_" followed by an optional "0", so the only usable literal prefix is "a_" and all index entries (200k values) are compared.
- The prefix range derived from your regex ("a_" to "a`") also covers a value that every document has (a_1_2), so every document ends up being examined regardless of the index.
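The two points above can be reproduced without a server. The following is a minimal sketch in plain JavaScript, not MongoDB's actual implementation: `literalPrefix` and `prefixBounds` are hypothetical helpers, and the document shape `['a_0_' + i, 'a_1_2']` is an assumption inferred from the single document shown in the question.

```javascript
// Hypothetical helper: longest literal prefix of an anchored regex source.
// Simplified: stops at the first metacharacter and drops the last literal
// when a quantifier makes it optional. Top-level alternation is not handled.
function literalPrefix(source) {
  if (source[0] !== '^') return null; // unanchored: no usable prefix
  let prefix = '';
  for (let i = 1; i < source.length; i++) {
    const ch = source[i];
    if ('\\.[](){}|*+?$^'.includes(ch)) {
      if ('*?{'.includes(ch)) prefix = prefix.slice(0, -1);
      break;
    }
    prefix += ch;
  }
  return prefix;
}

// Hypothetical helper: half-open key range [lower, upper) for a non-empty
// prefix, obtained by incrementing the last character of the prefix.
function prefixBounds(prefix) {
  const last = prefix.charCodeAt(prefix.length - 1);
  return [prefix, prefix.slice(0, -1) + String.fromCharCode(last + 1)];
}

const regex = /^a_(0)?_12$/;
const [lo, hi] = prefixBounds(literalPrefix(regex.source));

let scanned = 0;
const matchedDocs = new Set();
for (let i = 0; i < 100000; i++) {
  // Assumed document shape: two array values, hence two index keys per doc.
  for (const key of ['a_0_' + i, 'a_1_2']) {
    if (key >= lo && key < hi) {
      scanned++; // key lies inside the prefix range, regex must be tested
      if (regex.test(key)) matchedDocs.add(i);
    }
  }
}

console.log(lo, hi, scanned, matchedDocs.size);
```

Under these assumptions the sketch prints `a_ a` 200000 1`, which lines up with the bounds, "nscanned" and "n" values in the explain output from the question: every key falls inside the range, but only one document matches.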
Since you have a multikey index (an array index), the number of index comparisons is actually worse than a full table scan of the 100k documents. You can test with a $natural hint to see:
    db.test.find({f: /^a_(0|)12$/}).hint({$natural: 1}).explain()
    {
        "cursor" : "BasicCursor",
        "isMultiKey" : false,
        "n" : 0,
        "nscannedObjects" : 100000,
        "nscanned" : 100000,
        "nscannedObjectsAllPlans" : 100000,
        "nscannedAllPlans" : 100000,
        "scanAndOrder" : false,
        "indexOnly" : false,
        "nYields" : 0,
        "nChunkSkips" : 0,
        "millis" : 192,
        "indexBounds" : { }
    }
More random data or a more selective regex will result in fewer comparisons.
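To make the effect of selectivity concrete, here is a rough sketch in plain JavaScript (not run against a real server) that counts how many index keys fall inside the range implied by a short and a longer literal prefix. `countKeysInPrefixRange` is a hypothetical helper, and the data shape `['a_0_' + i, 'a_1_2']` is an assumption based on the document shown in the question.

```javascript
// Hypothetical helper: counts simulated index keys in [prefix, upper),
// where upper is the prefix with its last character incremented.
function countKeysInPrefixRange(prefix, docCount) {
  const last = prefix.charCodeAt(prefix.length - 1);
  const upper = prefix.slice(0, -1) + String.fromCharCode(last + 1);
  let count = 0;
  for (let i = 0; i < docCount; i++) {
    // Assumed document shape: two array values per document.
    for (const key of ['a_0_' + i, 'a_1_2']) {
      if (key >= prefix && key < upper) count++;
    }
  }
  return count;
}

// Prefix available from /^a_(0)?_12$/ : only "a_" is literal.
const broad = countKeysInPrefixRange('a_', 100000);
// Prefix available from a regex such as /^a_0_12/ : "a_0_12" is literal.
const narrow = countKeysInPrefixRange('a_0_12', 100000);

console.log(broad, narrow);
```

With the assumed data this prints `200000 1111`: the longer literal prefix restricts the range to keys that begin with "a_0_12", so far fewer entries need to be compared against the regex.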