<p><em>Stefan Schlesinger, I'm an Engineer! (http://sts.ono.at/)</em></p>
<h1>A Systems Policy</h1>
<p><em>2012-02-01, by Stefan Schlesinger. Tags: operations, policy, infra-talk. /blog/2012/02/01/a-systems-policy/</em></p>
<p>Recently I talked to a couple of friends, who all complained quite a bit about
their operations or internal IT departments.</p>
<p>Most of these teams had to fight with some very basic things. They lacked a
decent monitoring system, or any monitoring at all. They didn’t deploy systems,
they installed them by hand. Systems were not documented, and so on.</p>
<p>So here are some guidelines I aspire to with my team. This is by far not a
complete list of things you need to run successful operations, but it should
give you a fair hint of what it takes.</p>
<p>Also please note that you might want to adapt your own policy a bit to fit your
needs. I’m coming from the web industry, but we still run our own hardware, so
this might not fit a typical cloud-based infrastructure particularly well.</p>
<h1>Systems</h1>
<p><em>A system is considered the lowest part of our infrastructure and services. All rules defined here should be considered in all other policies.</em></p>
<p>A system…</p>
<ul>
<li>is documented at a central location.</li>
<li>is monitored and being graphed.</li>
<li>is backed up.</li>
<li>is updated regularly.</li>
<li>has a defined production level. (spare, pre-production, production)</li>
<li>has a defined owner and maintainer.</li>
<li>has a predefined maintenance level.</li>
<li>has a predefined availability.</li>
<li>has a physical location.</li>
<li>has a unique name, which is resolvable by DNS.</li>
<li>has only required software installed.</li>
<li>was installed with all currently available updates.</li>
<li>was inspected and approved by a second person before being released to production.</li>
<li>has all parts functional at all times; faults are documented right away and repaired as soon as possible.</li>
<li>is known in detail by at least two people.</li>
<li>has defined network access vectors.</li>
<li>has its configurations (including scripts) available somewhere other than just locally.</li>
<li>has its sensitive data protected.</li>
</ul>
<h1>Hardware</h1>
<p><em>A piece of hardware can be anything from a big server to a small temperature sensor in your server room.</em></p>
<p>A piece of hardware…</p>
<ul>
<li>has a maintenance contract or spare hardware available.</li>
<li>has an inventory number.</li>
<li>is labeled (hostname + inventory).</li>
<li>is physically secured (environmental and mechanical access control).</li>
<li>has its invoice documented at a central location.</li>
<li>should have redundant power supplies.</li>
<li>should have some kind of out-of-band management solution (OOB).</li>
<li>has at least one power circuit connected to an electrical circuit protected by an uninterruptible power supply (UPS).</li>
</ul>
<p>All tools needed to open and repair any part of the system are available.</p>
<h1>Servers</h1>
<p>A server…</p>
<ul>
<li>has at least two disks configured with RAID >= 1.</li>
<li>has at least two separate network interface cards (NICs).</li>
<li>has all RAID controllers backed with battery backed write caches (BBWC).</li>
<li>was dimensioned with adequate future-proof hardware.</li>
<li>has a lifetime of 2+ years.</li>
</ul>
<h1>Switches</h1>
<p>A switch…</p>
<ul>
<li>is manageable or at least configurable.</li>
<li>is supported by the configuration backup software in use (e.g. RANCID).</li>
<li>provides the following protocols: STP, SNMP, IPv6 support (mgmt+multicast), RADIUS for AAA</li>
<li>does not forward the default VLAN (1) on its uplink/trunk ports.</li>
<li>does have a description for every port in use (including hostname and interface, e.g.: server01#eth0, server01#oob, switch03#24)</li>
<li>does not have any enabled, unused ports: set them to disabled and remove any other configuration for this port.</li>
<li>blocks or does not forward any discovery protocols on its user ports.</li>
<li>is using AAA for authenticating users.</li>
<li>logs to a central syslog server.</li>
</ul>
<h1>Operating Systems</h1>
<p><em>An operating system (OS) is considered everything running on a server or instance to support a service or an application.</em></p>
<p>An Operating System…</p>
<ul>
<li>uses <em>OS-CHOICE-HERE/stable</em> as default distribution on servers.</li>
<li>uses <em>OS-CHOICE-HERE</em> as default on clients.</li>
<li>reboots without any manual intervention.</li>
<li>provides access by SSH.</li>
<li>does not permit root login via SSH.</li>
<li>has a root password set.</li>
<li>has its time synchronized with a time server and uses TIMEZONE-CHOICE-HERE as the time zone.</li>
<li>can resolve internal and internet names via DNS.</li>
<li>installs software by packages.</li>
<li>installs packages from a central internal repository and the official distribution repositories.</li>
<li>has packaged software conforming to the FHS.</li>
<li>has unpackaged software installed by a reproducible deployment process.</li>
<li>has sane defaults set for user and process environments (locales, shells, screen, some handy tools, etc.).</li>
<li>should not provide typical compiler tools (gcc, build-essential).</li>
<li>provides a manageable AAA concept (e.g. automated provisioning and de-provisioning of staff users).</li>
<li>sends mail destined for root to a central location.</li>
<li>provides a local mailer.</li>
</ul>
<h1>Hostnames</h1>
<p><em>Hostnames exist to identify every part of your infrastructure uniquely. They are used to refer to systems in your configurations and in discussions. You should think about a naming convention, but here are some rough guidelines.</em></p>
<p>Hostnames …</p>
<ul>
<li>have to be unique.</li>
<li>have to end with a number, which should never be reused and always be incremented.</li>
</ul>
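To make the increment-only rule concrete, here is a small sketch of how the next hostname could be picked. The function name and the inventory file format (one hostname per line) are just assumptions for illustration:

```shell
#!/bin/sh
# Pick the next hostname for a prefix: one past the highest number
# ever used, so numbers are never reused (sketch; the inventory file,
# one hostname per line, is an assumption).
next_hostname() {
    prefix="$1"; inventory="$2"
    last=$(grep "^${prefix}[0-9]" "$inventory" | sed "s/^${prefix}0*//" | sort -n | tail -1)
    printf '%s%02d\n' "$prefix" $(( ${last:-0} + 1 ))
}
```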
<h1>Services</h1>
<p><em>A service is considered everything running on a server’s operating system to provide continuous functionality (e.g. a script or an application).</em></p>
<p>A service…</p>
<ul>
<li>logs only errors and auditing information; application services may log more (e.g. the Apache access log).</li>
<li>has defined log retention times.</li>
<li>logs to syslog wherever possible.</li>
<li>authenticates only over secure connections.</li>
<li>has an adequately dimensioned, future-proof datastore.</li>
<li>was deployed in a reproducible way.</li>
</ul>
<h1>Networks</h1>
<p>A network is considered any part of the infrastructure which is used to interconnect servers or systems (layer 1, 2, 3, 4, …).</p>
<p>A Network…</p>
<ul>
<li>has clear entry and routing points.</li>
<li>has a diagram which describes access vectors, the logical and physical setup.</li>
<li>is deployed in adequate and future-proof dimensions (vlans, ip addresses, bandwidth).</li>
<li>uses structured cabling.</li>
<li>uses no cross-cabling, except in very rare situations (e.g. HA cabling).</li>
<li>should not be used for multiple purposes; at the very least, it should not mix the following classifications:</li>
</ul>
<table>
<thead>
<tr><th>Class</th><th>Description</th></tr>
</thead>
<tbody>
<tr><td>net </td><td>Internet/upstream network</td></tr>
<tr><td>mgmt </td><td>Management network (monitoring, remote access)</td></tr>
<tr><td>traffic </td><td>Site local traffic network</td></tr>
<tr><td>backup </td><td>Traffic network for backups</td></tr>
<tr><td>voip </td><td>VoIP telephony network</td></tr>
<tr><td>clients </td><td>A network with client workstations.</td></tr>
<tr><td>devel </td><td>A network with development machines.</td></tr>
<tr><td>staging </td><td>A network with staging equipment.</td></tr>
</tbody>
</table>
<ul>
<li>OOBs are easy to reach, even in case of an outage.</li>
<li>VLAN-IDs are considered global; keep a list of them.</li>
<li>All VLAN-IDs below 99 are switch-local.</li>
<li>VLANs have a name and a location.</li>
<li>All address space is considered global (vlans, ip- and mac addresses, including RFC1918)</li>
</ul>
<p>To round off my article, here is an example checklist we use to peer review new systems:</p>
<h1>Example Review Checklist</h1>
<p>Every newly deployed host or instance should undergo a peer-review process. The
checklist below provides a couple of basic acceptance criteria and
ensures a certain level of quality. Give it to another sysadmin
and ask them to check the system before it’s put into production.</p>
<ul>
<li>DNS works (including reverse dns) :</li>
<li>SSH login works :</li>
<li>Host+services monitored :</li>
<li>Host+services graphed :</li>
<li>All filesystems backed up :</li>
<li>Database dumps :</li>
<li>All Updates installed :</li>
<li>Host in HostDoc : </li>
<li>Puppet works :</li>
<li>Time is accurate :</li>
<li>Root mails are being delivered :</li>
<li>Firewall is active :</li>
<li>No unneeded services are reachable (nmap) :</li>
<li>Network configuration works (+ipv6) :</li>
<li>Syslog/dmesg/oob logs are clean of errors :</li>
</ul>
<p>-- Physical Host --</p>
<ul>
<li>Root password documented :</li>
<li>Root login works :</li>
<li>OOB password documented :</li>
<li>OOB login works :</li>
<li>OOB monitored :</li>
<li>Switch ports are labeled (+ documented) :</li>
<li>Hardware is labeled (+ documented in rack docu) :</li>
<li>Firmware up to date :</li>
<li>RAID level is > 1 and all disks OK :</li>
</ul>
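Some of these checklist items are easy to automate. Here is a sketch for the DNS item (forward and reverse lookup); the function is illustrative and not part of any tool mentioned above:

```shell
#!/bin/sh
# Sketch of the "DNS works (including reverse dns)" checklist item:
# the name resolves, and the resulting address resolves back to it.
check_dns() {
    host="$1"
    # Forward lookup: take the first address returned.
    addr=$(getent hosts "$host" | awk '{print $1; exit}')
    if [ -z "$addr" ]; then
        echo "FAIL: no address record for $host"; return 1
    fi
    # Reverse lookup must mention the hostname again.
    if getent hosts "$addr" | grep -q "$host"; then
        echo "OK: $host <-> $addr"
    else
        echo "FAIL: reverse lookup of $addr does not mention $host"; return 1
    fi
}
```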
<h1>OpenVZ API</h1>
<p><em>2011-10-26, by Stefan Schlesinger. Tags: openvz, infra-talk. /blog/2011/10/26/openvz-api/</em></p>
<p>I love OpenVZ. I think it’s one of the easiest-to-use virtualisation
technologies on the market, and it adds almost no overhead compared to other
technologies.</p>
<p>I’ve been using it for a couple of years now, and I always wanted a
nicer way to automate container creation, configuration and actions than writing
shell scripts. There are already a couple of web interfaces around, but I
didn’t like any of them.</p>
<p>Another possibility would be to use libvirt, but libvirt always felt a bit too
complex, since it’s a general API implementation covering several hypervisors.</p>
<p>So I started to implement my own API, taking a simple and
minimalistic approach. The project is hosted on GitHub, but you can also
install it via RubyGems.</p>
<pre>:::shell
gem install openvz
</pre>
<h1>Restart a container</h1>
<p>A small example that restarts a container:</p>
<pre>:::ruby
require 'rubygems'
require 'openvz'
container = OpenVZ::Container.new('109')
container.restart
</pre>
<h1>Provisioning Example</h1>
<p>Here is an example of how the whole provisioning of a new container could
look.</p>
<p>The script creates the container configuration and runs debootstrap, then sets the
nameserver, IP address and hostname. Afterwards it runs a couple of
commands to update the container and install Puppet.</p>
<pre>:::ruby
require 'rubygems'
require 'openvz'
container = OpenVZ::Container.new('110')
container.create(:ostemplate => 'debian-6.0-bootstrap',
                 :config     => 'vps.basic')
container.debootstrap(:dist   => 'squeeze',
                      :mirror => 'http://cdn.debian.net/debian')
container.set(:nameserver => '8.8.8.8',
              :ipadd      => '10.0.0.2',
              :hostname   => 'foo.ono.at')
container.start
# Update the system
container.command('aptitude update ; aptitude -y upgrade ; apt-key update')
# Install puppet
container.command('aptitude -o Aptitude::Cmdline::ignore-trust-violations=true -y install puppet')
# Run puppet
container.command('puppetd -t --server=puppet.ono.at')
</pre>
<h1>Upgrading HP Firmware</h1>
<p><em>2011-03-03, by Stefan Schlesinger. Tags: hp, firmware, infra-talk. /blog/2011/03/03/upgrading-hp-firmware/</em></p>
<p>Lately we bought a new HP blade chassis to replace a customer’s old database
server. All its services run on ~15 blades, split across two HP C7000
chassis.</p>
<p>The ProLiant BL460 G6 we bought came with much newer firmware revisions than
all the existing G1 blades – that part of the infrastructure hadn’t received
much sysadmin love for quite some time. :-)</p>
<p>Blades, ILO, chassis and controllers were all running badly outdated firmware,
and upgrading was highly recommended. According to HP, the resulting firmware
combinations hadn’t been tested, and the new blade wouldn’t even be detected.
They offered us an upgrade for about $2000 and 6 hours of downtime per chassis.</p>
<p>Here are some handy findings for doing the upgrade on your own:</p>
<h1>HP Firmware Compatibility Matrix</h1>
<p>HP tested certain sets of firmware for compatibility. Take a look at their
compatibility matrix and try to stay within the tested boundaries. This could
mean upgrading in more than one step if you are running an older release.</p>
<p>(http://h18004.www1.hp.com/products/blades/components/c-class.html)</p>
<h1>hp-firmware-catalog</h1>
<p>There is <a href="https://github.com/zeha/hp-firmware-catalog">Christian Hofstedtler’s great firmware upgrade script</a>,
which automatically downloads the latest and greatest HP firmware installation
packages. It even creates symlinks mapping cryptic firmware package
names to their corresponding hardware components.</p>
<p>You can run the packages from your OS as an online upgrade. Certain components
might still require a reboot to finish the “delayed upgrade”.</p>
<p>I would love to see HP maintaining this, since the approach is a good
example of giving customers a modern, automated way to upgrade and
monitor firmware for more recent releases.</p>
<h1>ILO Shell</h1>
<p>When upgrading many machines, it will save you a lot of time to use
the SSH shell for configuring a boot device and rebooting the server.</p>
<h2>Connect to ILO using SSH</h2>
<p>Make sure you send the right username; AFAIK it’s case-sensitive on the ILO:</p>
<pre>:::shell
ssh phx-vnode03.oob.ono.at -l Administrator
</pre>
<h2>Set an ILO Advanced Licence key</h2>
<pre>:::shell
cd /map1
set license=YOUR-LICENCE-KEY
</pre>
<p>The advanced licence key is required to enable virtual device firmware
features, e.g. to make use of the remote console or a virtual disk boot drive.</p>
<h2>Mount and configure a network hosted ISO image as boot device</h2>
<pre>:::shell
cd /map1
vm cdrom insert http://10.0.10.21/FW920B.2010_1129.2.iso
vm cdrom set boot_always
</pre>
<p>…be it a firmware upgrade or an OS installation disk. Make sure you run the following command to “eject” it again:</p>
<pre>:::shell
cd /map1
vm cdrom eject
</pre>
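When you have a whole chassis full of blades to upgrade, you can feed the same command sequence to every ILO over SSH. A sketch (the function name is ours; the hostnames and ISO URL in the commented loop are just the examples from above):

```shell
#!/bin/sh
# Emit the ILO command sequence from the sections above, so the same
# sequence can be piped over ssh to a whole list of blades (sketch).
ilo_mount_cmds() {
    iso_url="$1"
    cat <<EOF
cd /map1
vm cdrom insert $iso_url
vm cdrom set boot_always
EOF
}

# Example loop, hostnames as used earlier in this article:
# for h in phx-vnode03.oob.ono.at phx-vnode04.oob.ono.at; do
#     ilo_mount_cmds http://10.0.10.21/FW920B.2010_1129.2.iso | ssh -l Administrator "$h"
# done
```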
<h1>Monitoring</h1>
<p>To please your monitoring system as well, check out checkmk. They wrote a
couple of good <a href="http://mathias-kettner.de/checkmk_checks.html">SNMP checks</a> for
your HP or IBM bladecenter.</p>
<p>In the end I can highly recommend keeping your hardware firmware up to date. At
least HP, my vendor of choice, adds a lot of useful bug fixes.</p>
<p>HP currently informs customers about updates by e-mail newsletter; I would
love to see this in my monitoring system too, like all the other security
upgrades.</p>
<p>Try to plan the upgrade a bit or use existing downtimes to boot the HP Firmware
Maintenance image.</p>
<h1>munin-host-rename</h1>
<p><em>2011-02-16, by Stefan Schlesinger. Tags: monitoring, munin, infra-talk. /blog/2011/02/16/rename-munin-nodes/</em></p>
<p>Recently we decided upon a new host naming convention for our infrastructure
at my $dayjob. So I will soon be renaming a few hosts.</p>
<p>I noticed that <a href="http://munin-monitoring.org/">munin</a> still doesn't provide an
easy way to rename an existing node without losing historical data. So I wrote
a small shell script which does that for me and might be of use to anybody
else in the same situation.</p>
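The script itself is not reproduced here, but its core is simply renaming the per-node RRD files munin keeps on disk. A rough sketch, assuming a Debian-style /var/lib/munin layout, and that munin's cron-driven update is stopped while you run it:

```shell
#!/bin/sh
# Core of a munin node rename (sketch): munin stores one RRD file per
# data series, named <host>-<plugin>-<field>-<type>.rrd inside a
# per-group directory, so renaming those files preserves the history.
# The datadir layout is an assumption; munin.conf must be updated too.
rename_munin_node() {
    datadir="$1" group="$2" old="$3" new="$4"
    for f in "$datadir/$group/$old"-*.rrd; do
        [ -e "$f" ] || continue
        mv "$f" "$datadir/$group/$new-${f##*/$old-}"
    done
}

# rename_munin_node /var/lib/munin example.com oldname newname
# ...then rename the host entry in munin.conf as well.
```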
<h1>Synchronize Puppet with Git</h1>
<p><em>2010-12-22, by Stefan Schlesinger. Tags: puppet, git, infra-talk. /blog/2010/12/22/synchronize-puppet-with-git/</em></p>
<p>Puppet really shines at automating infrastructures. You will notice a sudden
change of working methodology, once you manage the first systems with it.</p>
<p>Instead of manually logging on to every single system and updating a certain
part of the configuration with shell commands, you will stop repeating
yourself and just update a single piece of code which describes the desired
configuration state for all systems.</p>
<p><a href="http://projects.puppetlabs.com/projects/1/wiki/Advanced_Puppet_Pattern">As recommended in the Puppet documentation</a>
you are well advised to keep your Puppet manifests under revision control.</p>
<p>I wrote a small script which will come in handy for keeping
your repository and the manifests on the master in sync, and it should fit most
environments out there.</p>
<p>Once installed, you can store the manifests for each Puppet environment in its
own Git branch, and every time you commit a new version to one of your branches,
it will automatically sync the most recent version and inform the Puppet
master process.</p>
<p>BTW, this could also be used to keep the manifests on multiple Puppet instances
in sync.</p>
<h1>Puppet-Sync</h1>
<p>Puppet-sync is a Ruby-based command line tool to synchronize every commit from
a central Git repository to your Puppet master instance. You should install it
on your Puppet master and configure a Git hook which calls the script over
SSH.</p>
<p>Puppet-sync takes some parameters to specify what the environment on the master
looks like. Run it with '--help' to get a list of available options. Here is an
example:</p>
<pre>:::shell
puppet-sync --branch master \
            --passenger \
            --destination /etc/puppet/environments \
            --repository ssh+git://git.ono.at/srv/git/puppet.git
</pre>
<p>By running the command above, the script connects to the Git repository,
fetches the manifests from the master branch and puts them into
/etc/puppet/environments/production. Since we use the master branch for
production, I added logic to translate the "master" branch to the
"production" environment.</p>
<h1>Installation on the master</h1>
<p>Instead of listing each and every shell command needed to set up the
environment for the script, I'd like to provide a simple Puppet manifest.
I think this is more readable, and you can either use it or figure out
the appropriate shell commands. ;-)</p>
<pre>:::ruby
file { "/usr/local/bin/puppet-sync":
    ensure => present,
    source => "file:///puppet-sync",
}

file { "/home/psync/.ssh":
    ensure  => directory,
    owner   => "psync",
    mode    => "700",
    require => User["psync"],
}

file { "/etc/puppet/environments":
    ensure  => directory,
    owner   => "psync",
    mode    => "775",
    require => User["psync"],
}

user { "psync":
    ensure     => present,
    home       => "/home/psync",
    managehome => true,
}

ssh_authorized_key { "puppet-sync-ssh-key":
    ensure  => present,
    key     => "AAAAB3.....lVBp0nPLNcs=",
    type    => "ssh-rsa",
    user    => "psync",
    require => File["/home/psync/.ssh"],
}
</pre>
<p>I have the following configuration in my puppet.conf to make Puppet aware of
each of the directories in /etc/puppet/environments:</p>
<pre>:::ini
...
[master]
.....
templatedir = /etc/puppet/environments/$environment/
modulepath = /etc/puppet/environments/$environment/modules/
manifest = /etc/puppet/environments/$environment/manifests/site.pp
manifestdir = /etc/puppet/environments/$environment/manifests
</pre>
<h2>Git Hook</h2>
<p>The only thing left is to create a Git hook in your repository. Here is the one
I use. I also created a psync user on the master, so I just need to store the
private SSH key in psync's home directory.</p>
<pre>:::shell
#!/bin/sh
#
# An example hook script to prepare a packed repository for use over
# dumb transports.
#
# To enable this hook, make this file executable by "chmod +x post-update".
branch=`echo $1 | awk -F/ '{print $3}'`
sudo -u psync ssh puppet.ono.at /usr/local/bin/puppet-sync \
    --passenger \
    --branch $branch
exec git update-server-info
</pre>
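For reference, the master-to-production translation described earlier boils down to a mapping like this. This is only a sketch in shell (puppet-sync itself is Ruby and may implement it differently):

```shell
#!/bin/sh
# Map a git branch to a Puppet environment: "master" deploys to
# "production", every other branch keeps its own name (sketch).
branch_to_environment() {
    case "$1" in
        master) echo production ;;
        *)      echo "$1" ;;
    esac
}
```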
<p>Perfect, it's installed and you are ready to use it. Go ahead and try to commit
a new version.</p>
<h1>Monitoring Puppet and Bacula</h1>
<p><em>2010-12-07, by Stefan Schlesinger. Tags: monitoring, checkmk, infra-talk, puppet. /blog/2010/12/07/checkmk-update/</em></p>
<p>Recently I blogged about my <a href="/blog/2010/10/10/checkmk-apt">CheckMK APT plugin</a>,
capable of checking for upgradeable packages.</p>
<p>Meanwhile I have written two more plugins, both adapting existing checks, and
migrated all plugins into a single <a href="https://github.com/sts/checkmk">'checkmk' repository</a> over at GitHub.</p>
<h1>Bacula</h1>
<p>This check adapts the idea of <a href="https://github.com/bmiklautz/bacula-utils">Bernhard Miklautz’s bacula-utils</a>.
The plugin will define four different services to be monitored on each Bacula server:</p>
<h2>bacula.errorvols</h2>
<p>Checks the state of available backup volumes and will report CRITICAL
if erroneous volumes are found. In case of an error, it will also
return the names of these volumes.</p>
<p>Use the bacula-clear-errvols script to resolve these issues.</p>
<h2>bacula.freshness</h2>
<p>Checks whether all clients have an associated backup within the last
30 hours.</p>
<h2>bacula.fullbackups</h2>
<p>Checks whether all clients have an associated full-backup.</p>
<h2>bacula.fullbackupspool</h2>
<p>Checks whether any volumes used for full-backups come from a full-backup pool.</p>
<h1>Puppet</h1>
<p>The second plugin will monitor the status of the puppet agent. It integrates
nicely with puppetstatus.py. This script was initially written by TMZ from the
Fedora Infrastructure Team and can be used to enable or disable the puppet
agent on a specific host.</p>
<p>You can disable Puppet on a server by running the following command:</p>
<pre>:::shell
sudo puppetstatus -d "Interesting things are about to happen."
</pre>
<p>The monitoring system will change the state of the Puppet check to WARNING and
will display this message along with the username of the person who disabled the
agent.</p>
<p>When the agent is not disabled on a host, the check will simply change to WARNING
after 3 hours and to CRITICAL after 4 hours.</p>
<p>I'd be glad to get some feedback on my plugins, please report any bugs by
sending me an e-mail or leave a comment below.</p>
<h1>APT Plugin for Check MK</h1>
<p><em>2010-10-10, by Stefan Schlesinger. Tags: debian, monitoring, checkmk, infra-talk. /blog/2010/10/10/checkmk-apt/</em></p>
<p>We are using <a href="http://mathias-kettner.de/check_mk.html">Check_MK</a> for
monitoring at work. It features a quite nice replacement for NRPE agents and
automatic Nagios configuration generation.</p>
<p>I wrote an APT plugin which refreshes the package cache on every agent
every 60 minutes and checks for new Debian upgrades or security updates.
Depending on the severity, it will return different Nagios status codes:</p>
<ul>
<li>OK - No upgrades are available.</li>
<li>WARNING - Only non-security upgrades are available.</li>
<li>CRITICAL - Security upgrades are available (might also involve normal upgrades).</li>
</ul>
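The severity mapping boils down to a few lines. This is only a sketch of the logic, not the plugin's actual code; the exit codes follow the Nagios convention (0=OK, 1=WARNING, 2=CRITICAL):

```shell
#!/bin/sh
# Sketch of the severity mapping described above: security upgrades
# always win over plain upgrades. Nagios convention for exit codes.
apt_state() {
    upgrades="$1" security="$2"
    if [ "$security" -gt 0 ]; then
        echo 2   # CRITICAL: security upgrades available
    elif [ "$upgrades" -gt 0 ]; then
        echo 1   # WARNING: only non-security upgrades available
    else
        echo 0   # OK: nothing to do
    fi
}
```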
<p>It’s hosted at <a href="https://github.com/sts/checkmk-apt">GitHub</a>.</p>
<h1>Agent installation</h1>
<p>Install the python-apt package and copy plugins/apt to your servers. You would
probably want to add this to Puppet.</p>
<pre>:::shell
aptitude install python-apt
git clone git://github.com/sts/checkmk-apt.git
cp checkmk-apt/plugins/apt /usr/lib/check_mk_agent/plugins/apt
chmod a+x /usr/lib/check_mk_agent/plugins/apt
</pre>
<h1>Installing on your Nagios Server</h1>
<p>Copy checks/apt to your checks directory and run a Check_MK inventory and
config update.</p>
<pre>:::shell
git clone git://github.com/sts/checkmk-apt.git
cp checkmk-apt/checks/apt /usr/local/share/check_mk/scripts/
chmod a+x /usr/local/share/check_mk/scripts/apt
# Check_MK Inventory+Generate Nagios Configuration
check_mk -I alltcp
check_mk -U
invoke-rc.d nagios3 restart
</pre>
<h1>Puppet + Passenger</h1>
<p><em>2010-08-31, by Stefan Schlesinger. Tags: puppet, infra-talk. /blog/2010/08/31/debian-puppet-passenger/</em></p>
<p>Puppet is a configuration management tool that has been under heavy development
for almost five years now. It became a major open source project in the last few
years, surrounded by a large community.</p>
<p>In most current environments, Puppet masters run either on Webrick or, in
larger environments, on Mongrel, which has been quite standard for a while now.</p>
<p>But Puppet can also run inside Apache or Nginx. In that case you
would be using mod_rails (aka <a href="http://www.modrails.com/">Phusion
Passenger</a>). This solution is known to scale best, but it has always been a bit
bulky to install.</p>
<p><img alt="puppet-passenger-diagram" src="/images/post-2010-08-31-debian-puppet-passenger.png" /></p>
<p>Debian's Puppet package maintainers have prepared the puppetmaster package in
Squeeze for an easy installation with mod_rails, so I think this could become the
new standard way to install the Puppet server.</p>
<p><a href="http://lists.debian.org/debian-announce/2010/msg00009.html">Recently</a> the
Debian project announced the freeze of its testing branch (codename "Squeeze"),
which means that no more new features will be added and all work
will be concentrated on polishing testing up to production level.</p>
<p>I thought it a good time to prepare the "Squeeze" upgrade of my
Puppet servers and write a short article about it.</p>
<p>Install the packages:</p>
<pre>:::shell
aptitude install puppetmaster libapache2-mod-passenger
</pre>
<p>Enable the Apache Modules:</p>
<pre>:::shell
a2enmod headers
a2enmod ssl
</pre>
<h2>Manually change config.ru</h2>
<p class="note">Update: Meanwhile you can skip this step.</p>
<p>I had to manually fix the rackup file which comes with the Squeeze
puppetmaster package. This file contains the logic for initializing puppetmaster
as a Rack application. It still tries to initialize <em>puppetmaster</em>,
although the Puppet server component was renamed to <em>master</em>.
<br/>
There is already an open Debian bug for this, and it should be fixed by
the final release.</p>
<p><a href="http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=593557">See bug #593557</a></p>
<p>Now change /usr/share/puppet/rack/puppetmasterd/config.ru to:</p>
<pre>:::ruby
# a config.ru, for use with every rack-compatible webserver.
# SSL needs to be handled outside this, though.
# if puppet is not in your RUBYLIB:
# $:.unshift('/opt/puppet/lib')
$0 = "puppetmasterd"
require 'puppet'
# if you want debugging:
# ARGV << "--debug"
ARGV << "--rack"
require 'puppet/application/master'
# we're usually running inside a Rack::Builder.new {} block,
# therefore we need to call run *here*.
run Puppet::Application[:master].run
</pre>
<h1>Configure Puppet</h1>
<p>When you launch Puppet for the first time, it will generate the SSL certificates.
Make sure you have configured the Puppet <em>certname</em> option to fit your hostname.</p>
<p>If puppetmaster started before you configured it, you can simply delete the ssl
directory (<em>/var/lib/puppet/ssl</em>) and restart the puppet master. It will
regenerate the directory automatically.</p>
<p>The following represents my puppet configuration in <em>/etc/puppet/puppet.conf</em>.</p>
<pre>:::ini
[main]
logdir=/var/log/puppet
vardir=/var/lib/puppet
ssldir=/var/lib/puppet/ssl
rundir=/var/run/puppet
factpath=$vardir/lib/facter
templatedir=$confdir/templates
[master]
certname=puppet.ono.at
ssl_client_header=SSL_CLIENT_S_DN
ssl_client_verify_header=SSL_CLIENT_VERIFY
[agent]
server=puppet.ono.at
[cert]
autosign=false
</pre>
<p>After the certificates are generated, you should disable the puppetmaster daemon in
<em>/etc/default/puppetmaster</em> by setting $START from yes to <em>no</em>.</p>
<h1>Apache Configuration</h1>
<p>Finally configure an Apache virtual host to listen on port <em>8140</em> and point
it to the ssl certificates generated by puppet. Put the following configuration
in <em>/etc/apache2/sites-available/puppetmaster</em>:</p>
<pre>
## Puppetmaster Apache Vhost Configuration
## Passenger Limits
PassengerHighPerformance on
PassengerMaxPoolSize 12
PassengerPoolIdleTime 1500
# PassengerMaxRequests 1000
PassengerStatThrottleRate 120
RackAutoDetect Off
RailsAutoDetect Off
Listen 8140
<VirtualHost *:8140>
    ServerName puppet.ono.at

    SSLEngine on
    SSLCipherSuite SSLv2:-LOW:-EXPORT:RC4+RSA
    SSLCertificateFile      /var/lib/puppet/ssl/certs/puppet.ono.at.pem
    SSLCertificateKeyFile   /var/lib/puppet/ssl/private_keys/puppet.ono.at.pem
    SSLCertificateChainFile /var/lib/puppet/ssl/ca/ca_crt.pem
    SSLCACertificateFile    /var/lib/puppet/ssl/ca/ca_crt.pem
    ## CRL checking should be enabled; if you have problems with
    ## Apache complaining about the CRL, disable the next line
    SSLCARevocationFile     /var/lib/puppet/ssl/ca/ca_crl.pem
    SSLVerifyClient optional
    SSLVerifyDepth  1
    SSLOptions +StdEnvVars

    ## The following client headers allow the same configuration
    ## to work with Pound.
    RequestHeader set X-SSL-Subject %{SSL_CLIENT_S_DN}e
    RequestHeader set X-Client-DN %{SSL_CLIENT_S_DN}e
    RequestHeader set X-Client-Verify %{SSL_CLIENT_VERIFY}e

    RackAutoDetect On
    DocumentRoot /usr/share/puppet/rack/puppetmasterd/public
    <Directory "/usr/share/puppet/rack/puppetmasterd">
        Options None
        AllowOverride None
        Order allow,deny
        allow from all
    </Directory>
</VirtualHost>
</pre>
<p>Now enable the virtual host configuration, enable all required modules and restart the Apache daemon:</p>
<pre>:::shell
a2ensite puppetmaster
a2enmod headers
a2enmod passenger
apache2ctl configtest
apache2ctl restart
</pre>
<h2>Final Step - Test</h2>
<p>After everything is in place, please test your setup. Open up a web browser
and point it to the following URL (adapt the hostname):</p>
<p>https://puppet.ono.at:8140</p>
<p>You should see a line stating:</p>
<p>"The environment must be purely alphanumeric, not ''"</p>
<h1>Happy Birthday!</h1>
<p><em>2010-08-16, by Stefan Schlesinger. Tags: infra-talk. /blog/2010/08/16/happy-birthday/</em></p>
<p>I was a bit surprised when I opened my inbox today. Today is my birthday, but
today is also Debian's Birthday! </p>
<p><a href="http://thank.debian.net/">thank.debian.net</a>
<br/>
Happy Birthday Debian!</p>