Archive for the ‘zookeeper’ tag

Hadoop/HBase automated deployment using Puppet

with 27 comments


Deploying and configuring Hadoop and HBase across clusters is a complex task. In this article I will show what we do to make it easier, and share the deployment recipes that we use.

For the tl;dr crowd: go get the code here.

Cool tools

Before going into how we do things, here is the list of tools that we are using, and which I will mention in this article. I will try to put a link next to any tool-specific term, but you can always refer to its specific home-page for further reference.

  • Hudson – this is a great CI server, and we are using it to build Hadoop, HBase, Zookeeper and more
  • The Hudson Promoted Builds Plug-in – allows defining operations that run after the build has finished, manually or automatically
  • Puppet – configuration management tool We don’t have a dedicated operations team to hand off a list of instructions on how we want our machines to look like. The operations team helping us just makes sure the servers are in the rack, networked and powered up, but once we have a set of IPs (usually from IPMI cards) we’re good to go ourselves. We are our own devops team, and as such we try to automate as much as possible, where possible, and using the tools above helps a lot.

Written by Cristian Ivascu

May 21st, 2010 at 4:38 pm