How do I iterate over large Mongo collections without an OutOfMemoryError?

In my Spring application I want to iterate over a Mongo collection and perform some work on each entry. The collection can be quite large, so I can't simply fetch a list of all entries, as that would result in an OutOfMemoryError.

My latest attempt is this:

void m(MongoOperations ops, Set<String> ids) {
   Query query = new Query().addCriteria(Criteria.where("id").in(ids));
   CloseableIterator<Foo> it = ops.stream(query, Foo.class);
   it.forEachRemaining(foo -> {
       System.out.println(foo.getName());
   });
}

It surprised me to see that I am getting OutOfMemoryErrors here. It looks like all entries of Foo that match the query are loaded into memory as soon as it.forEachRemaining is called.

A heap dump shows that the CloseableIterableCursorAdapter holds a DBCursor, which holds a QueryResultIterator, which in turn holds an ArrayList with all entries.

Am I doing something wrong? Is stream() always loading all entries into memory? Do I have to implement paging?

Here is the relevant part of my heap dump's dominator tree.

Class Name                                                                                                           | Shallow Heap | Retained Heap | Percentage
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
org.example.server.OrganizationScopedThreadFactory$OrganizationScopedThread @ 0x81f71718  pool-1-thread-1 Thread|          128 | 1,453,308,456 |     87,50%
|- org.springframework.data.mongodb.core.MongoTemplate$CloseableIterableCursorAdapter @ 0x8b2df1c0                   |           24 | 1,432,708,656 |     86,26%
|  |- com.mongodb.DBCursor @ 0x8b3bb0f8                                                                              |           96 | 1,432,708,600 |     86,26%
|  |  |- com.mongodb.QueryResultIterator @ 0x8b5e7c70                                                                |           72 | 1,431,064,320 |     86,16%
|  |  |  |- java.util.ArrayList$Itr @ 0x8b5e7cb8                                                                     |           32 | 1,431,064,152 |     86,16%
|  |  |  |  '- java.util.ArrayList @ 0x8b5e7cd8                                                                      |           24 | 1,431,064,120 |     86,16%
|  |  |  |     '- java.lang.Object[30391] @ 0x8b5e8ed8                                                               |      121,584 | 1,431,064,096 |     86,16%
|  |  |  |        |- com.mongodb.BasicDBObject @ 0x8c2eed70                                                          |           64 |       123,528 |      0,01%
|  |  |  |        |  |- java.util.LinkedHashMap$Entry @ 0x8c2eef18                                                   |           40 |       122,240 |      0,01%
|  |  |  |        |  |  |- com.mongodb.BasicDBList @ 0x8c2eef78                                                      |           32 |       122,144 |      0,01%
|  |  |  |        |  |  |  '- java.lang.Object[10] @ 0x8c2eef98                                                      |           56 |       122,112 |      0,01%
|  |  |  |        |  |  |     '- com.mongodb.DBRef @ 0x8c2eefd0                                                      |           32 |       122,056 |      0,01%
|  |  |  |        |  |  |        |- com.mongodb.BasicDBObject @ 0xc5dbc778                                           |           64 |       121,992 |      0,01%
|  |  |  |        |  |  |        |- org.bson.types.ObjectId @ 0x8c2eeff0                                             |           32 |            32 |      0,00%
|  |  |  |        |  |  |        '- Total: 2 entries                                                                 |              |               |           
|  |  |  |        |  |  |- java.lang.String @ 0x8c2eef40  projects                                                   |           24 |            56 |      0,00%
|  |  |  |        |  |  '- Total: 2 entries                                                                          |              |               |           
|  |  |  |        |  |- java.util.LinkedHashMap$Entry @ 0x8c2eee00                                                   |           40 |         1,024 |      0,00%
|  |  |  |        |  |- java.util.LinkedHashMap$Entry @ 0x8c2eeea0                                                   |           40 |           120 |      0,00%
|  |  |  |        |  |- java.util.HashMap$Node[16] @ 0x8c2eedb0                                                      |           80 |            80 |      0,00%
|  |  |  |        |  '- Total: 4 entries                                                                             |              |               |           
|  |  |  |        |- com.mongodb.BasicDBObject @ 0x8bcfd4c0                                                          |           64 |       123,480 |      0,01%
|  |  |  |        |- com.mongodb.BasicDBObject @ 0x8c48e2c8                                                          |           64 |       113,520 |      0,01%
|  |  |  |        |- com.mongodb.BasicDBObject @ 0x8c66f668                                                          |           64 |       112,296 |      0,01%
|  |  |  |        |- com.mongodb.BasicDBObject @ 0x8c87afe0                                                          |           64 |       112,120 |      0,01%
|  |  |  |        |- com.mongodb.BasicDBObject @ 0x8c521008                                                          |           64 |       106,096 |      0,01%
|  |  |  |        |- com.mongodb.BasicDBObject @ 0x8cdc68d0                                                          |           64 |        99,576 |      0,01%
|  |  |  |        |- com.mongodb.BasicDBObject @ 0x8c8efa40                                                          |           64 |        90,456 |      0,01%
|  |  |  |        |- com.mongodb.BasicDBObject @ 0x8b8e0d18                                                          |           64 |        77,600 |      0,00%
|  |  |  |        |- com.mongodb.BasicDBObject @ 0x8bbc5a30                                                          |           64 |        77,600 |      0,00%
|  |  |  |        |- com.mongodb.BasicDBObject @ 0x8bc446e8                                                          |           64 |        77,600 |      0,00%
|  |  |  |        |- com.mongodb.BasicDBObject @ 0x8bcc0ca0                                                          |           64 |        77,600 |      0,00%
|  |  |  |        |- com.mongodb.BasicDBObject @ 0x8bda1d30                                                          |           64 |        77,600 |      0,00%
|  |  |  |        |- com.mongodb.BasicDBObject @ 0x8be46048                                                          |           64 |        77,600 |      0,00%
|  |  |  |        |- com.mongodb.BasicDBObject @ 0x8be462e8                                                          |           64 |        77,600 |      0,00%
|  |  |  |        |- com.mongodb.BasicDBObject @ 0x8beb24f8                                                          |           64 |        77,600 |      0,00%
|  |  |  |        |- com.mongodb.BasicDBObject @ 0x8beb2798                                                          |           64 |        77,600 |      0,00%
|  |  |  |        |- com.mongodb.BasicDBObject @ 0x8bee79d0                                                          |           64 |        77,600 |      0,00%
|  |  |  |        |- com.mongodb.BasicDBObject @ 0x8bf04f38                                                          |           64 |        77,600 |      0,00%
|  |  |  |        |- com.mongodb.BasicDBObject @ 0x8bf0eae8                                                          |           64 |        77,600 |      0,00%
|  |  |  |        |- com.mongodb.BasicDBObject @ 0x8bf0ed88                                                          |           64 |        77,600 |      0,00%
|  |  |  |        |- com.mongodb.BasicDBObject @ 0x8bf14220                                                          |           64 |        77,600 |      0,00%
|  |  |  |        |- com.mongodb.BasicDBObject @ 0x8bf3edf0                                                          |           64 |        77,600 |      0,00%
|  |  |  |        |- com.mongodb.BasicDBObject @ 0x8bf78640                                                          |           64 |        77,600 |      0,00%
|  |  |  |        |- com.mongodb.BasicDBObject @ 0x8bf7dd30                                                          |           64 |        77,600 |      0,00%
|  |  |  |        '- Total: 25 of 30.276 entries; 30.251 more                                                        |              |               |           
-----------------------------------------------------------------------------------------------------------------------------------------------------------------

Update: I have since tried implementing paging by sorting the query, setting a limit on it, and then repeatedly skipping entries until I have iterated over everything. While this does take care of my memory issues, it significantly reduces performance (it is about 100 times slower), probably because of the sorting.

I am using Spring Boot 1.3.7.

Answers


MongoDB applies a 16 MB limit to each batch a cursor returns, so you may need to write application-level looping that fetches batches of, say, 1000 documents using skip, limit, and a sort.

First fetch documents 1 to 1000, then 1001 to 2000, and so on until you reach the end of the documents. Make sure the query is sorted so that the pages are consistent.
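
The loop described above could look roughly like the following sketch. It is plain Java with no Mongo dependency: PageSource, SkipLimitPaging, forEachPaged and inMemory are hypothetical names invented for illustration, and the in-memory source only imitates a sorted query with skip and limit. In a real application the fetch call would run the sorted, limited query against the database.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in for "run the sorted query with this skip and limit".
interface PageSource<T> {
    List<T> fetch(int skip, int limit);
}

final class SkipLimitPaging {

    // Visits every element one page at a time and returns how many were
    // processed. Only one page is held in memory at any moment.
    static <T> long forEachPaged(PageSource<T> source, int pageSize, Consumer<T> work) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        long processed = 0;
        int skip = 0;
        while (true) {
            List<T> page = source.fetch(skip, pageSize);
            for (T item : page) {
                work.accept(item);
                processed++;
            }
            // A short (or empty) page means the end of the data was reached.
            if (page.size() < pageSize) {
                return processed;
            }
            skip += pageSize;
        }
    }

    // In-memory source for demonstration (an assumption, not a Mongo API):
    // sorts once, then slices, to imitate sort + skip + limit.
    static <T> PageSource<T> inMemory(List<T> data, Comparator<T> order) {
        List<T> sorted = new ArrayList<>(data);
        sorted.sort(order);
        return (skip, limit) -> {
            int from = Math.min(skip, sorted.size());
            int to = Math.min(from + limit, sorted.size());
            return new ArrayList<>(sorted.subList(from, to));
        };
    }
}
```

With a page size of 1000 this issues one fetch per 1000 entries, plus a final fetch that returns a short or empty page. Note that with a real database each fetch still has to step past all skipped documents, which is consistent with the slowdown reported in the question's update.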

Also be careful if you modify the field you are querying or sorting on, so that the same documents do not keep showing up again and again.
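
One way to sidestep the repeated-documents problem, which the answer above does not itself propose, is to page by an immutable key and resume after the last key seen instead of counting an offset. The sketch below is plain Java and only simulates the idea with a sorted map. KeyRangePaging and forEachByKey are hypothetical names, and the map stands in for a collection sorted by a key that never changes (in MongoDB, _id would be a natural candidate).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.function.BiConsumer;

final class KeyRangePaging {

    // Pages through the map in key order. Each round imitates a query of the
    // form "key > lastKey, sorted by key, limited to pageSize". Because the
    // position is a key rather than an offset, changing values while
    // iterating cannot shift entries between pages.
    static <V> long forEachByKey(NavigableMap<Long, V> collection,
                                 int pageSize,
                                 BiConsumer<Long, V> work) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        long processed = 0;
        Long lastKey = null;
        while (true) {
            NavigableMap<Long, V> rest =
                    (lastKey == null) ? collection : collection.tailMap(lastKey, false);

            // Copy the keys of the next page first, so the work callback may
            // update values without disturbing an open iterator.
            List<Long> pageKeys = new ArrayList<>();
            for (Long key : rest.keySet()) {
                if (pageKeys.size() == pageSize) {
                    break;
                }
                pageKeys.add(key);
            }
            if (pageKeys.isEmpty()) {
                return processed;
            }
            for (Long key : pageKeys) {
                work.accept(key, collection.get(key));
                processed++;
            }
            lastKey = pageKeys.get(pageKeys.size() - 1);
        }
    }
}
```

The design choice here is that the loop's position depends only on the key, so updates to any other field, including one used in the original filter, do not cause an entry to be visited twice. Whether this also recovers the lost performance depends on the key being indexed, which is an assumption about the real collection.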
