Puppet provides a mechanism for managing a heterogeneous cluster of Unix-like machines using a central configuration system and a declarative scripting language for describing machine configuration. The declarative scripting language abstracts away the many differences in various Unix-like operating systems.
Puppet is used for server management by a number of startups including PowerSet, Joost and Slide.
In a system managed using Puppet, Pupper Master is the central system-wide authority for configuration and coordination. Manifests are propagated to the Puppet Master from a source external to the system. Each server in the system periodically polls the Puppet Master to determine if their configuration is up to date. If this is not the case, then the new configuration is retrieved and the changes described by the new manifest are applied. The Puppet instance running on each client can be considered to be made up of the following layers
Each manifest is described in the Puppet Configuration Language which is a high level language for describing resources on a server and what actions to take on them. Retrieving the newest manifests and applying the changes they describe (if any) is provided by the Transactional Layer of Puppet. The Puppet Configuration Language is actually an abstraction that hides the differences in various Unix-like operating systems. This abstraction layer maps the various higher level resources in a manifest to the actual commands and file locations on the target operating systems of the server.
The Puppet Master is the set of one or more servers that run the puppetmasterd daemon. This daemon listens for polling requests from the servers being managed by Puppet and returns the current configuration for the server to the machine. Each server to be managed by Puppet, must have the Puppet client installed and must run the puppetd daemon which polls the Puppet Master for configuration information.
puppetmasterd
puppetd
Each manifest can be thought of as a declarative script which contains one or more commands (called resources in Puppet parlance) and their parameters, dependencies along with the prerequisites to running each command. Collections of resources can be grouped together as classes (complete with inheritance) which can be further grouped together as modules. See below for examples
service { "apache": require => Package["httpd"] }
class apache { service { "apache": require => Package["httpd"] }
file { "/nfs/configs/apache/server1.conf": group => "www-data } }
class apache-ssl inherits apache { Service[apache] { require +> File["apache.pem"] } }
class webserver::apache-ssl inherits apache { Service[apache] { require +> File["apache.pem"] } }
node "webserver.example.com" { include webserver }
A node describes the configuration for a particular machine given its name. Node names and their accompanying configuration can be defined directly in manifests as shown above. Another option is to either use external node classifiers which provide a dynamic mechanism for determine a machine's type based on it's name or use an LDAP directory for storing information about nodes in the cluster.
FURTHER READING
Now Playing: Linkin Park - Cure For The Itch