Why does MongoDB *client* use more memory than the server in this case?

Question:

I'm evaluating MongoDB. I have a small 20GB subset of documents. Each is essentially a request log for a social game along with some captured state of the game the user was playing at that moment.

I thought I'd try finding game cheaters. So I wrote a function that runs server side. It calls find() on an indexed collection and sorts according to the existing index. Using a cursor it goes through all documents in indexed order. The index is {user_id,time}. So I'm going through each user's history, checking if certain values (money/health/etc) increase faster than is possible in the game. The script returns the first violation found. It does not collect violations.
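For reference, the scan logic described above looks roughly like the following plain-JavaScript sketch. It is not the actual script: the field names (`user_id`, `time`, `money`) and the `maxGainPerSecond` limit are assumptions made up for illustration, and `docs` stands in for the cursor over the indexed collection.

```javascript
// Hypothetical sketch: find the first impossible increase in a stream of
// log documents that is already sorted by (user_id, time).
// Field names and the rate limit are assumptions, not the real schema.
function findFirstViolation(docs, maxGainPerSecond) {
  let prev = null; // previous document seen in the stream
  for (const doc of docs) {
    // Only compare consecutive documents that belong to the same user.
    if (prev !== null && prev.user_id === doc.user_id) {
      const seconds = (doc.time - prev.time) / 1000; // time in milliseconds
      const gain = doc.money - prev.money;
      if (gain > maxGainPerSecond * seconds) {
        // Stop at the first violation; nothing is collected.
        return { user_id: doc.user_id, from: prev, to: doc };
      }
    }
    prev = doc;
  }
  return null; // no violation found
}
```

The loop keeps a reference to only one previous document at a time, so the scan itself should need very little memory on whichever side it runs.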

The ONLY thing this script does on the client is define the function and call mymongodb.eval(myscript) against a mongod instance on another box.

The box that mongod is running on does fine. The one that the script is launched from starts losing memory and swap. Hours later: 8GB of RAM and 6GB of swap are being used on the client machine that did nothing more than launch a script on another box and wait for a return value.

Is the mongo client really that flaky? Have I done something wrong or made an incorrect assumption about mongo/mongod?

Answer1:

If you just want to open a client connection to a remote database, you should use the mongo command, not mongod. mongod starts a server on your local machine. I'm not sure what passing it a URL would do.

Try

mongo remotehost:27017

Answer2:

From the <a href="http://www.mongodb.org/display/DOCS/Server-side+Code+Execution" rel="nofollow">documentation</a>:

<blockquote>

Use map/reduce instead of db.eval() for long running jobs. db.eval blocks other operations!

</blockquote>

eval is a function that blocks the entire server if you don't use a special flag. Again, from the docs:

<blockquote>

If you don't use the "nolock" flag, db.eval() blocks the entire mongod process while running [...]

</blockquote>

You are kind of abusing MongoDB here. Your current routine is strange, because it returns the first violation found, but it will have to re-check everything when run the next time (unless your user ids are ordered and you store the last evaluated user id).
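The checkpoint idea in parentheses could look something like this. It is only a sketch under assumptions: the `user_id` field name and the `lastCheckedUserId` variable are hypothetical, user ids are assumed to be comparable and scanned in ascending order, and the second function is a plain in-memory stand-in rather than a real query.

```javascript
// Hypothetical sketch: resume a scan after the last user already evaluated.
// Assumes user ids are ordered and the scan visits them in ascending order.
function buildResumeFilter(lastCheckedUserId) {
  // First run: no checkpoint yet, so match everything.
  // Later runs: match only user ids greater than the stored checkpoint.
  return lastCheckedUserId === null
    ? {}
    : { user_id: { $gt: lastCheckedUserId } };
}

// In-memory stand-in that applies the same condition to an array of docs.
function docsAfter(docs, lastCheckedUserId) {
  return docs.filter(
    (doc) => lastCheckedUserId === null || doc.user_id > lastCheckedUserId
  );
}
```

The object returned by `buildResumeFilter` has the shape of a MongoDB query document, so it could be passed to find() in place of an empty query.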

Map/Reduce generally is the better option for a long-running task, but aggregating your data does not seem trivial. However, a map/reduce based solution would also solve the re-evaluation problem.

I'd probably return something like this from map/reduce:

user id -> suspicious actions

e.g.

2525454 -> [{logId: 235345435, t: ISODate("...")}]
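To make that output shape concrete, here is a rough plain-JavaScript simulation of the map and reduce steps. It does not call MongoDB at all, and everything in it is an assumption for illustration: the field names (`_id`, `user_id`, `time`, `money`), the `maxGainPerSecond` limit, and the function names. A real MongoDB map/reduce job places extra constraints on the reduce function (it may be invoked more than once per key), which this sketch ignores.

```javascript
// Hypothetical sketch of the map/reduce idea in plain JavaScript (no MongoDB).
// Field names and the rate limit are assumptions, not the real schema.

// "Map": key each log entry by its user id.
function mapLog(doc) {
  return [doc.user_id, { logId: doc._id, t: doc.time, money: doc.money }];
}

// "Reduce": order one user's entries by time and keep the suspicious ones.
function reduceUser(entries, maxGainPerSecond) {
  const sorted = [...entries].sort((a, b) => a.t - b.t);
  const suspicious = [];
  for (let i = 1; i < sorted.length; i++) {
    const seconds = (sorted[i].t - sorted[i - 1].t) / 1000; // t in milliseconds
    const gain = sorted[i].money - sorted[i - 1].money;
    if (gain > maxGainPerSecond * seconds) {
      suspicious.push({ logId: sorted[i].logId, t: sorted[i].t });
    }
  }
  return suspicious;
}

// Driver: group mapped values by user id, then reduce each group.
function suspiciousActionsByUser(docs, maxGainPerSecond) {
  const groups = new Map();
  for (const doc of docs) {
    const [key, value] = mapLog(doc);
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  }
  const result = {};
  for (const [userId, entries] of groups) {
    const hits = reduceUser(entries, maxGainPerSecond);
    if (hits.length > 0) result[userId] = hits; // only users with violations
  }
  return result;
}
```

Unlike the first-violation approach, this collects every suspicious log entry per user in one pass, so nothing has to be re-checked on the next run.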
