
What are the pros and cons of using the Hadoop NameNode, Checkpoint Node and Backup Node?

Question:

I'm currently evaluating Hadoop 1.0.2 for an in-house project.

The Hadoop docs say that

<blockquote>

<a href="http://hadoop.apache.org/common/docs/current/hdfs_user_guide.html#Secondary+NameNode" rel="nofollow">The Secondary NameNode has been deprecated. Instead, consider using the Checkpoint Node or Backup Node</a>

</blockquote>

There is information on what the three options <em>are</em> and what they <em>do</em>, but I'm having trouble finding information on which of the three options is <em>recommended</em> in which situations.

Answer1:

Basically, the Checkpoint Node is a new implementation of the Secondary NameNode, and the Backup Node is an interim step on the way to a warm standby for the NameNode. (The Backup Node can also currently offer a small performance boost by separating reads and writes: reads in the NameNode and writes in the Backup Node.)

From the <a href="https://issues.apache.org/jira/browse/HADOOP-4539" rel="nofollow">Backup Node documentation</a>, as explained by Konstantin Shvachko:

<blockquote>

This patch introduces two new types of name-nodes: a Checkpoint node and a Backup node.

<ul><li>The role of the Checkpoint node to checkpoint name-node meta-data by merging image and edits files.</li> <li>The Backup node extends functionality of the Checkpointer by that it can receive online updates of the file system meta-data, apply them to its memory state and persist them on disks just like the name-node does. Thus at any time the Backup node contains an up-to-date image of the namespace both in memory and on local disk(s). This also results in much more efficient checkpointing because backup node does not need to transfer files from the active name-node and does not need to replay (merge) edits.</li> <li>The Term Standby node is reserved for further extension of the backup node functionality, when cluster will be able to switch over to the new name-node if the active dies. This is mentioned in the "Warm standby provision" section of the design document.</li> </ul>

Typical use cases:

<ol><li>Run Checkpoint node only to create checkpoints. This should be used instead of the current SecondaryNameNode, which is deprecated by the patch. I reused a lot of the SecondaryNameNode code so this effort was not wasted, it just evolved.</li> <li>Run Backup node to support online streaming of edits and efficient checkpointing. This particularly targets eliminating NFS as a remote storage for edits.</li> <li>Run NameNode without persistent storage at all and delegate all "persisting" functionality to the Backup node. The trick here is to start name-node with -importCheckpoint option and then run the Backup node.</li> </ol></blockquote>
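To make the difference concrete, here is a toy model of the two approaches described in the quote. It is only a sketch: the dict-based "image", the tuple-based "edits" and every name in it (`apply_edit`, `checkpoint`, `BackupNode`) are made up for illustration and are not Hadoop's API or on-disk format. The Checkpoint-style function has to be handed the image and the edits and replay them, while the Backup-style object applies each edit as it arrives, so its checkpoint is just a dump of state it already holds.

```python
# Toy model only: names and data layout are hypothetical, not Hadoop's API.

def apply_edit(namespace, edit):
    """Apply one logged operation to an in-memory namespace (path -> metadata)."""
    op, path = edit[0], edit[1]
    if op == "create":
        namespace[path] = {"replication": edit[2]}
    elif op == "delete":
        namespace.pop(path, None)
    elif op == "rename":
        namespace[edit[2]] = namespace.pop(path)
    else:
        raise ValueError(f"unknown op: {op}")


def checkpoint(image, edits):
    """Checkpoint-node style: take a copy of the image, replay the edits, return the merged image."""
    merged = {path: dict(meta) for path, meta in image.items()}
    for edit in edits:
        apply_edit(merged, edit)
    return merged


class BackupNode:
    """Backup-node style: receive edits as they happen and keep the namespace current in memory."""

    def __init__(self, image):
        self.namespace = {path: dict(meta) for path, meta in image.items()}

    def on_edit(self, edit):
        # Each streamed edit is applied immediately.
        apply_edit(self.namespace, edit)

    def checkpoint(self):
        # No transfer and no replay: the current state is already in memory.
        return {path: dict(meta) for path, meta in self.namespace.items()}


image = {"/a": {"replication": 3}}
edits = [("create", "/b", 2), ("rename", "/a", "/c"), ("delete", "/b")]

merged = checkpoint(image, edits)

backup = BackupNode(image)
for edit in edits:
    backup.on_edit(edit)

print(merged == backup.checkpoint())
```

Both paths end with the same namespace; what differs is when the work happens and whether files have to be transferred from the active NameNode first.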
