Creating a pagination index in CouchDB?

<strong>I'm trying to create a pagination index view in CouchDB that lists the doc._id for every Nth document found.</strong>

I wrote the following map function, but the <strong>pageIndex</strong> variable doesn't reliably start at 1 - in fact it seems to change arbitrarily depending on the emitted value or the index length (e.g. 50, 55, 10, 25 - all start with a different file, though I seem to get the correct number of files emitted).

    function(doc) {
      if (doc.type == 'log') {
        if (!pageIndex || pageIndex > 50) {
          pageIndex = 1;
          emit(doc.timestamp, null);
        }
        pageIndex++;
      }
    }

What am I doing wrong here? How would a CouchDB expert build this view?

Note that I don't want to use the "startkey + count + 1" method that's been mentioned elsewhere, for three reasons: I'd like to be able to jump to a particular page or the last page (user expectations and all), I'd like to have a friendly "?page=5" URI instead of "?startkey=348ca1829328edefe3c5b38b3a1f36d1e988084b", and I'd rather CouchDB did this work instead of bulking up my application, if I can help it.

Thanks!

Answer1:

View functions (map and reduce) are purely functional. Side-effects such as setting a global variable are not supported. (When you move your application to BigCouch, how could multiple independent servers with arbitrary subsets of the data know what pageIndex is?)
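To make the "changes arbitrarily" symptom concrete, here is a small simulation that runs outside CouchDB. It assumes that documents reach the map function in update order rather than in key order, which is my understanding of how the indexer works. The helpers `makeStatefulMap`, `runView` and `logs`, the sample timestamps, and the page size of 2 are all made up for illustration.

```javascript
// Wraps the question's map logic, which keeps a counter between calls.
function makeStatefulMap(pageSize, emit) {
  var pageIndex; // shared state: exactly what a view function cannot rely on
  return function (doc) {
    if (doc.type == 'log') {
      if (!pageIndex || pageIndex > pageSize) {
        pageIndex = 1;
        emit(doc.timestamp, null);
      }
      pageIndex++;
    }
  };
}

// Feeds docs to the map function in the given order, then sorts the
// emitted keys the way a view would present them.
function runView(docs, pageSize) {
  var keys = [];
  var map = makeStatefulMap(pageSize, function (key) { keys.push(key); });
  docs.forEach(function (doc) { map(doc); });
  return keys.sort(function (a, b) { return a - b; });
}

// Builds hypothetical log documents from a list of timestamps.
function logs(timestamps) {
  return timestamps.map(function (t) { return { type: 'log', timestamp: t }; });
}

var inKeyOrder = runView(logs([1, 2, 3, 4, 5, 6]), 2);
var inUpdateOrder = runView(logs([3, 1, 2, 6, 5, 4]), 2);
console.log(inKeyOrder);    // [ 1, 3, 5 ]
console.log(inUpdateOrder); // [ 2, 3, 5 ]
```

Only the first run lands on true page boundaries. The counter tracks the order in which documents were processed, not their position in the sorted view, so the emitted keys depend on write order.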

Therefore the answer will have to involve a traditional map function, perhaps keyed by timestamp.

    function(doc) {
      if (doc.type == 'log') {
        emit(doc.timestamp, null);
      }
    }

How can you get every 50th document? The simplest way is to add a skip parameter to the view query: skip=0, skip=50, skip=100, and so on. However, that is not ideal (see below).
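For reference, turning a friendly ?page=N into a skip value is a one-liner. This is only a sketch: `pageQuery` is a hypothetical helper, and the database, design document and view names in the URL are placeholders rather than anything from the question. It pairs skip with limit so that one request returns one page.

```javascript
// Builds the view query string for a 1-based page number using skip/limit.
function pageQuery(page, pageSize) {
  if (!Number.isInteger(page) || page < 1) {
    throw new RangeError('page must be a positive integer');
  }
  var skip = (page - 1) * pageSize;
  return 'skip=' + skip + '&limit=' + pageSize;
}

// Hypothetical database ("logs"), design document ("app") and view name.
var url = '/logs/_design/app/_view/by_timestamp?' + pageQuery(5, 50);
console.log(url); // /logs/_design/app/_view/by_timestamp?skip=200&limit=50
```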

A way to pre-fetch the exact IDs of every 50th document is a _list function which only outputs every 50th row. (In practice you could use Mustache.JS or another template library to build HTML.)

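    function(head, req) {
      var pageIndex = 0, first = true, row;
      send("[");
      while (row = getRow()) {
        if (pageIndex % 50 == 0) {
          // Separate rows with commas so the output is valid JSON.
          if (!first) { send(","); }
          send(JSON.stringify(row));
          first = false;
        }
        pageIndex += 1;
      }
      send("]");
    }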
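The same loop can be tried outside CouchDB by passing in stand-ins for getRow() and send(). This is a sketch only: `everyNthRow`, `runOver` and the sample rows are hypothetical, and the page size is a parameter so the example stays short. This version puts commas between rows so that the result parses as JSON.

```javascript
// The list-function loop, with getRow/send passed in instead of being
// the globals CouchDB provides inside a real _list function.
function everyNthRow(getRow, send, n) {
  var pageIndex = 0, first = true, row;
  send('[');
  while ((row = getRow())) {
    if (pageIndex % n === 0) {
      if (!first) { send(','); }
      send(JSON.stringify(row));
      first = false;
    }
    pageIndex += 1;
  }
  send(']');
}

// Runs the loop over an in-memory array of view rows and parses the output.
function runOver(rows, n) {
  var i = 0, out = '';
  everyNthRow(
    function () { return i < rows.length ? rows[i++] : null; },
    function (chunk) { out += chunk; },
    n
  );
  return JSON.parse(out);
}

// Five made-up view rows, sorted by key as a view would return them.
var rows = [10, 20, 30, 40, 50].map(function (t, idx) {
  return { id: 'doc' + idx, key: t, value: null };
});
var index = runOver(rows, 2);
console.log(index.map(function (r) { return r.id; })); // [ 'doc0', 'doc2', 'doc4' ]
```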

This will work for many situations; however, it is not perfect. Here are some considerations. They are not necessarily showstoppers, but it depends on your specific situation.

There is a reason the pretty URLs are discouraged. What does it mean if I load page 1, then a bunch of documents are inserted within the first 50, and then I click to page 2? If the data is changing a lot, there is no perfect user experience; the user must somehow feel the data changing.

The skip parameter and example _list function have the same problem: they do not scale. With skip you are still touching <strong>every</strong> row in the view starting from the beginning: finding it in the database file, reading it from disk, and then ignoring it, over and over, row by row, until you hit the skip value. For small values that's quite convenient, but since you are grouping pages into sets of 50, I have to imagine that you will have thousands of rows or more. That could make page views slow, as the database is spinning its wheels most of the time.

The _list example has a similar problem, except that you front-load all the work: you run through the entire view from start to finish and (presumably) send the relevant document IDs to the client so it can quickly jump around the pages. But with hundreds of thousands of documents (you call them "log" so I assume you will have a ton), that will be an extremely slow query which is not cached.
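If the client does keep that index around, a ?page=N request can become a startkey lookup instead of a skip. The following is a minimal sketch under assumptions: the index is an array of view rows with `id` and `key` fields, one per page, and `queryForPage` plus the sample entries are hypothetical. It uses the startkey, startkey_docid and limit view parameters.

```javascript
// Maps a 1-based page number to view query parameters using a prefetched
// page index (one view row per page, in view order).
function queryForPage(pageIndex, page, pageSize) {
  var entry = pageIndex[page - 1];
  if (!entry) {
    return null; // page is out of range for this index
  }
  // startkey is JSON-encoded; startkey_docid disambiguates duplicate keys.
  return 'startkey=' + encodeURIComponent(JSON.stringify(entry.key)) +
         '&startkey_docid=' + encodeURIComponent(entry.id) +
         '&limit=' + pageSize;
}

// Made-up index entries: the first row of page 1 and of page 2.
var pageIndex = [
  { id: 'a1', key: 1000, value: null },
  { id: 'b7', key: 2050, value: null }
];
console.log(queryForPage(pageIndex, 2, 50)); // startkey=2050&startkey_docid=b7&limit=50
console.log(queryForPage(pageIndex, 3, 50)); // null
```

The index still goes stale as documents are inserted, so this trades the per-request cost of skip for the cost of rebuilding the index from time to time.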

In summary, for small data sets, you can get away with the page=1, page=2 form; however, you will bump into problems as your data set gets big. With the release of BigCouch, CouchDB is even better for log storage and analysis, so (if that is what you are doing) you will definitely want to consider how high to scale.
