Nagios Monitoring
Nagios is a system for monitoring your services and server resources. It ensures you know when maintenance needs to be scheduled in response to an outage, and helps you prevent a warning from becoming a critical problem.
1. To start, update your apt repository list.
apt-get update
2. Search for available Nagios3 packages using the command given below
apt-cache search nagios3
3. Install only the below nagios3 packages and avoid the rest shown in the search
apt-get install nagios3 nagios3-common nagios3-core nagios3-doc check-mk-config-nagios3
4. Open a browser and enter the server IP address followed by /nagios3 to see the Nagios GUI.
http://192.168.8.115/nagios3
5. A dialog pops up asking for credentials. Enter nagiosadmin as the username, and as the password enter the one you set while installing the nagios3 packages in step 3.
Setting up services
Setting up NTP
Go to /etc/nagios-plugins/config, edit the ntp.cfg file and add these lines.
# 'check_ntp_time' command definition
define command{
command_name check_ntp_time
command_line /usr/lib/nagios/plugins/check_ntp_time -H $ARG1$ -w $ARG2$ -c $ARG3$
}
Now go to /etc/nagios3/conf.d, then find and edit the localhost_nagios2.cfg file by adding these lines.
# Define a service to check the time difference against the NTP server
define service{
use generic-service ; Name of service template to use
host_name localhost
service_description NTP Time Difference
check_command check_ntp_time!ntp.bict.lab!30!60
}
Check that your configuration is correct using the following command. If you have any errors or warnings, resolve these before continuing.
nagios3 -v /etc/nagios3/nagios.cfg
After all the changes have been made, restart Nagios by issuing this command
service nagios3 restart
Return to your Nagios monitoring webpage, and open the services section. Initially it will say that the check is pending, then after about a minute, it will update with the latest response data. Confirm that this service remains OK
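The OK, WARNING and CRITICAL states shown on the services page come from comparing a measured value against the thresholds passed to the plugin. The sketch below illustrates that idea for a clock offset in whole seconds, assuming 30 is the warning threshold and 60 the critical one. The function classify_offset is a hypothetical helper, not part of Nagios, and the real check_ntp_time plugin has its own comparison rules and works with fractional seconds.

```shell
#!/bin/bash
# Hypothetical helper (not part of Nagios): classify a clock offset in whole
# seconds against warning and critical thresholds, plugin-style.
# Return status follows the plugin convention: 0 = OK, 1 = WARNING, 2 = CRITICAL.
classify_offset() {
    local offset=$1 warn=$2 crit=$3
    # Compare the magnitude, so a clock that is behind counts the same as one ahead.
    if [ "$offset" -lt 0 ]; then offset=$(( -offset )); fi
    if [ "$offset" -ge "$crit" ]; then
        echo "CRITICAL - offset ${offset}s"; return 2
    elif [ "$offset" -ge "$warn" ]; then
        echo "WARNING - offset ${offset}s"; return 1
    fi
    echo "OK - offset ${offset}s"; return 0
}

# Example: a 45 second offset with warning at 30 and critical at 60.
rc=0
classify_offset 45 30 60 || rc=$?
echo "exit status: $rc"
```

Nagios reads the exit status to decide the state and shows the printed line as the status information.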
Debian Updates
The Nagios Exchange (http://exchange.nagios.org) provides a suitable plugin for checking that the Debian server has all the latest security updates. Download the plugin (check_debian_packages) and place it in the /usr/lib/nagios/plugins directory.
Edit root's crontab with the command below and add these lines, which refresh the package lists once a day at 01:00.
crontab -e
# Minute Hour Day of Month Month Day of Week Command
0 1 * * * apt-get update >> update
Next, install sudo.
apt-get install sudo
Add this line in /etc/sudoers
nagios ronald=NOPASSWD:/usr/lib/nagios/plugins/check_debian_packages
Now go to /etc/nagios3/conf.d, then find and edit the localhost_nagios2.cfg file by adding these lines. This service checks whether the Debian server has pending package updates.
# Define a service to check the Debian package update
define service{
use generic-service ; Name of service template to use
host_name localhost
service_description Debian Updates
check_command check_debian_packages
}
After all the changes have been made, restart Nagios by issuing this command
service nagios3 restart
Bind9
Create a file named check_bind.sh in /usr/lib/nagios/plugins and make it executable. Add the following lines to the script.
#!/bin/bash
# Capture the status text and the exit code of the service command itself.
STATUS=$(service bind9 status)
RC=$?
if [ "$RC" -eq 0 ] && [ "$STATUS" = "bind9 is running." ];
then
echo "$STATUS"
exit 0
else
echo "$STATUS"
exit 2
fi
Run this command to make it executable.
chmod +x /usr/lib/nagios/plugins/check_bind.sh
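Before wiring a plugin into Nagios, it helps to run it by hand and see which state its exit status maps to. The sketch below shows one way to do that. The helpers state_name and run_check are hypothetical, and fake_plugin is only a stand-in so the example runs on any machine. On the server you would pass /usr/lib/nagios/plugins/check_bind.sh instead.

```shell
#!/bin/bash
# Hypothetical helper: translate a plugin exit status into a Nagios state name.
# 0, 1 and 2 are the OK, WARNING and CRITICAL codes; anything else is shown
# here as UNKNOWN, which is a simplification.
state_name() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        *) echo "UNKNOWN" ;;
    esac
}

# Run a command, capture its output and exit status, and print both.
run_check() {
    local output rc=0
    output=$("$@" 2>&1) || rc=$?
    echo "$(state_name "$rc"): $output"
}

# Stand-in for a real plugin so the sketch runs anywhere.
fake_plugin() { echo "bind9 is not running"; return 2; }

run_check fake_plugin
```

The printed line mirrors what the services page shows: the state, followed by the first line of plugin output.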
Now go to /etc/nagios3/conf.d, then find and edit the localhost_nagios2.cfg file by adding these lines. This service checks that the Bind9 service is running.
# Define a service to check that BIND9 is working
define service{
use generic-service ; Name of service template to use
host_name localhost
service_description BIND9
check_command check_bind.sh
}
After all the changes have been made, restart Nagios by issuing this command
service nagios3 restart
Check DNS
The plugin (check_dns) is in /usr/lib/nagios/plugins. Add the lines below to the dns.cfg file in /etc/nagios-plugins/config.
# 'check_dns' command definition
define command{
command_name check_dns_hostname
command_line /usr/lib/nagios/plugins/check_dns -H '$ARG1$'
}
# 'check_dns' command definition
define command{
command_name check_dns_hostname_IP
command_line /usr/lib/nagios/plugins/check_dns -H '$ARG1$' -a '$ARG2$'
}
Now go to /etc/nagios3/conf.d, then find and edit the localhost_nagios2.cfg file by adding these lines. These services check the DNS hostname on its own, and the DNS hostname together with its IP address.
define service{
use generic-service ; Name of service template to use
host_name localhost
service_description DNS Hostname
check_command check_dns_hostname!_ldap._tcp.patan.t311
}
# Define a service to check the DNS hostname and IP Address
define service{
use generic-service ; Name of service template to use
host_name localhost
service_description DNS Hostname and IP Address
check_command check_dns_hostname_IP!gc._msdcs.patan.t311!192.168.8.115
}
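In a check_command value, the text before the first "!" names the command and each following "!"-separated field fills $ARG1$, $ARG2$ and so on in that command's command_line. The sketch below only illustrates that numbering. The function show_args is a hypothetical helper and does not reflect how Nagios itself parses the value.

```shell
#!/bin/bash
# Illustration only: split a check_command value on "!" to show which field
# ends up in which $ARGn$ macro.
show_args() {
    local IFS='!' parts i
    read -r -a parts <<< "$1"
    echo "command_name=${parts[0]}"
    for (( i = 1; i < ${#parts[@]}; i++ )); do
        echo "ARG${i}=${parts[i]}"
    done
}

show_args 'check_dns_hostname_IP!gc._msdcs.patan.t311!192.168.8.115'
```

For the value above, the hostname becomes $ARG1$ (passed to -H) and the IP address becomes $ARG2$ (passed to -a).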
After all the changes have been made, restart Nagios by issuing this command
service nagios3 restart
Return to your Nagios monitoring webpage, and open the services section. Initially it will say that the check is pending, then after about a minute, it will update with the latest response data. Confirm that this service remains OK
Nagios Notifications (Enable Outbound Email)
Run the command
dpkg-reconfigure exim4-config
Accept all the default values except the three listed below.
General type of mail configuration:
mail sent by smarthost; received via SMTP or fetchmail
Domains to relay mail for:
ucol.ac.nz
IP address or host name of the outgoing smarthost:
203.98.31.34 (UCOL’s mail server IP address)
E-mail can now be sent from the server. To test it, run the following command on the server:
echo "Test email Nagios" | mail -s Testing 1273356@ucol.ac.nz
This command delivered a message to my student mailbox, which confirms that outbound email is working on the server.
Now create and add some new contacts to the Nagios contacts file. Add the following lines to /etc/nagios3/conf.d/contacts_nagios2.cfg
define contact{
contact_name ronald
alias ronald
service_notification_period 24x7
host_notification_period 24x7
service_notification_options w,u,c,r
host_notification_options d,r
service_notification_commands notify-service-by-email
host_notification_commands notify-host-by-email
email 1273356@ucol.ac.nz
}
define contactgroup{
contactgroup_name admins
alias Nagios Administrators
members root, ronald
}
After all the changes have been made, restart Nagios by issuing this command
service nagios3 restart
To test that Nagios can send notification emails, I stopped the Bind9 service with this command:
service bind9 stop
Nagios performs 4 checks before it sends an email. Once Nagios showed the service as critical, I checked my student account and confirmed the email had arrived. I then restarted bind9, and Nagios sent a recovery email reporting that the service was up and running again.
Intercom Notification
Next, add a contact for the intercom to the Nagios contacts file. Add the following lines to /etc/nagios3/conf.d/contacts_nagios2.cfg, updating the existing admins contact group to include the new contact.
define contact{
contact_name intercom
alias ronaldp
service_notification_period 24x7
host_notification_period 24x7
service_notification_options w,u,c,r
host_notification_options d,r
service_notification_commands notify-service-by-phone
host_notification_commands notify-host-by-phone
email 1273356@ucol.ac.nz
}
define contactgroup{
contactgroup_name admins
alias Nagios Administrators
members root, intercom, ronald
}
Add the following definition to contacts_nagios2.cfg. The notification should escalate 5 minutes after your contact was initially notified.
define serviceescalation{
host_name localhost
service_description BIND9
first_notification 2
last_notification 3
notification_interval 1
contacts intercom
}
Next, install Festival to convert text to speech for the intercom:
apt-get install festival
To test it, run this command, which converts a text file to a .wav file:
text2wave ronald.txt -o ronald.wav
Next, install PJSUA, the software used to make VoIP calls and play a specified audio file. Start with its build dependencies:
apt-get update
apt-get install -y emacs tshark gcc make python git gnustep-gui-runtime alsa-utils linux-sound-base sox beep build-essential g++ gpp alsaplayer-daemon libasound2 libasound2-dev libpulse-dev portaudio19-dev libportaudio2 pulseaudio libasound2-plugins binutils binutils-dev
The above command generates an error because the linux-sound-base package has been renamed. Install the package under its current name:
apt-get install oss4-base
This resolves the error caused by linux-sound-base. Now download and build PJSUA:
wget http://www.pjsip.org/release/2.0.1/pjproject-2.0.1.tar.bz2
tar jxf pjproject-2.0.1.tar.bz2
cd pjproject-2.0.1
sudo make clean
./configure
make dep
make
sudo make install
ln -s ~/pjproject-2.0.1/pjsip-apps/bin/pjsua-i686-pc-linux-gnu ~/pjsua
cd
./pjsua --help
cat <<EOF >> ~/commands.txt
sleep 40000
q
EOF
cat ~/commands.txt | ~/pjsua --id sip:311@10.150.44.3 --registrar sip:10.150.44.3 --realm=10.150.44.3 --add-codec=gsm --srtp-secure=0 --duration=40 --auto-conf --auto-play --local-port=5061 --null-audio sip:3111@10.150.44.3 --username=311 --password=blah --play-file=/var/ronald.wav
The above command places a call using the specified parameters. Next, update the service and host notification commands by editing the /etc/nagios3/commands.cfg file.
# 'notify-host-by-phone' command definition
define command{
command_name notify-host-by-phone
command_line /var/lib/nagios/ronald.sh "%b***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\n\nService: $SERVICEDESC$\nHost: $HOSTALIAS$\nAddress: $HOSTADDRESS$\nState: $SERVICESTATE$\n\nDate/Time: $LONGDATETIME$\n\nAdditional Info:\n\n$SERVICEOUTPUT$\n"
}
# 'notify-service-by-phone' command definition
define command{
command_name notify-service-by-phone
command_line /var/lib/nagios/ronald.sh "%b***** Nagios ***** \n\nNotification Type: $NOTIFICATIONTYPE$\n\nService: $SERVICEDESC$\nHost: $HOSTALIAS$\nAddress: $HOSTADDRESS$\nState: $SERVICESTATE$\n\nDate/Time: $LONGDATETIME$\n\nAdditional Info:\n\n$SERVICEOUTPUT$\n"
}
Now copy pjsua and commands.txt into /var/lib/nagios/. Create the script /var/lib/nagios/ronald.sh, make it executable, and give it the following contents.
#!/bin/bash
# Quote the arguments so the asterisks in the message are not expanded as file names.
echo "$@" | text2wave -o /var/tmp/output.wav
cat /var/lib/nagios/commands.txt | /var/lib/nagios/pjsua --id sip:311@10.150.44.3 --registrar sip:10.150.44.3 --username=311 --password=blah --play-file=/var/tmp/output.wav --realm=10.150.44.3 --add-codec=gsm --srtp-secure=0 --duration=10 --auto-conf --auto-play --null-audio sip:3111@10.150.44.3
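The notification text handed to the script contains formatting meant for email, such as the leading %b, rows of asterisks and literal \n sequences, which a speech synthesiser may read aloud or stumble over. The sketch below shows one possible way to tidy the message before it is piped to text2wave. The function clean_message is a hypothetical addition and is not part of the original script.

```shell
#!/bin/bash
# Hypothetical helper: strip email-style formatting from a notification
# message so that only the spoken words remain.
clean_message() {
    printf '%s' "$1" | sed \
        -e 's/%b//g' \
        -e 's/\\n/ /g' \
        -e 's/\*//g' \
        -e 's/  */ /g' \
        -e 's/^ //' \
        -e 's/ $//'
}

# Example with a shortened version of the notification text.
clean_message '%b***** Nagios *****\n\nService: BIND9\nState: CRITICAL'
```

In the script, the cleaned text would be piped to text2wave in place of the raw arguments.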
To test this, run service bind9 stop. When the service goes down, the notification escalates to a voice call on the phone.
Downtime
If you are planning maintenance on a service, you can schedule downtime in Nagios, which will prevent notifications being sent out.
1. From the Services menu, select the service you are going to schedule downtime for
2. On the right hand side, select “Schedule downtime for this service”
3. Complete the form, then press commit to add it to the schedule.
4. By default you will receive an error, because Nagios is not yet set up to accept external commands. Follow the steps below to fix it.
5. Edit /etc/nagios3/nagios.cfg and set:
check_external_commands=1
6. Run these commands in the terminal:
/etc/init.d/nagios3 stop
dpkg-statoverride --update --add nagios www-data 2710 /var/lib/nagios3/rw
dpkg-statoverride --update --add nagios nagios 751 /var/lib/nagios3
/etc/init.d/nagios3 start
Now a downtime schedule can be committed.
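The web form works by submitting an external command to Nagios, which is why external commands have to be enabled. The sketch below builds the equivalent SCHEDULE_SVC_DOWNTIME line by hand for a fixed one-hour window. The function downtime_command is a hypothetical helper, and the command file location mentioned in the comment is an assumption to verify against command_file in nagios.cfg.

```shell
#!/bin/bash
# Build a SCHEDULE_SVC_DOWNTIME external command line for a fixed downtime.
# Arguments: timestamp host service start end author comment
# (start and end are Unix timestamps in seconds).
downtime_command() {
    local now=$1 host=$2 service=$3 start=$4 end=$5 author=$6 comment=$7
    local duration=$(( end - start ))
    # fixed=1 means a fixed window; trigger_id=0 means it is not triggered
    # by another downtime entry.
    printf '[%s] SCHEDULE_SVC_DOWNTIME;%s;%s;%s;%s;1;0;%s;%s;%s\n' \
        "$now" "$host" "$service" "$start" "$end" "$duration" "$author" "$comment"
}

now=$(date +%s)
downtime_command "$now" localhost BIND9 "$now" "$(( now + 3600 ))" nagiosadmin "Planned maintenance"
# To submit the line, it would be appended to the Nagios command file,
# assumed here to be /var/lib/nagios3/rw/nagios.cmd.
```

The sketch only prints the line, so it is safe to run on a machine without Nagios.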
Dependency
When you configure monitoring of hosts and services, some hosts and services will depend on each other. As a result, you can become swamped with notifications if a master host or service fails. In the current configuration, the NTP Time Difference service depends on BIND9 to resolve ntp.bict.lab, so a failure in BIND9 results in an unknown NTP Time Difference notification. Add the following to /etc/nagios3/conf.d/localhost_nagios2.cfg. The service descriptions must match the ones defined earlier exactly.
define servicedependency{
host_name localhost
service_description BIND9
dependent_host_name localhost
dependent_service_description NTP Time Difference
execution_failure_criteria c
notification_failure_criteria c
}
Nagios Integration
SNMP
SNMP is commonly used to obtain information from network devices. The connection details for the lab switch are as follows:
•IP = 10.150.200.83 (/16)
•Community = public
Your task for this section is to configure a Nagios host called “LabSwitch” and a service check for each port that raises a warning when the port reaches 75 Mbit/s and a critical notification when it exceeds 90 Mbit/s. Check which numerical unit the switch reports and convert these requirements to matching values.
1. Install the SNMP tools:
apt-get install snmp
2. Run the following command to view all object IDs (OIDs) available on the lab switch:
snmpwalk -v 2c -c public -m all -Os 10.150.200.83 iso
3. Visit the following site and search for the switch model number:
http://pro-networking-h17007.external.hp.com/us/en/support/converter/index.aspx
4. Select software downloads, then download the MIBs file.
5. Create the directories that will hold the MIB files. As root, run these commands from /root:
mkdir .snmp && cd .snmp
mkdir mibs && cd mibs
6. Extract all of the .mib files into /root/.snmp/mibs
7. Re-run the snmpwalk command from step 2. The MIB descriptions are now available, so the output is much easier to follow.
In the output, the ifHCOutOctets.1 entry is the important one: it is the outbound octet counter for interface index 1.
Nagios needs the numeric OID rather than the MIB name. Use snmptranslate to convert the name into numbers:
snmptranslate -On IF-MIB::ifHCOutOctets.1
The output should look like this:
.1.3.6.1.2.1.31.1.1.1.10.1
Go to /etc/nagios-plugins/config, edit the snmp.cfg file and add these lines.
# 'check_snmp' command definition
define command{
command_name check_snmp
command_line /usr/lib/nagios/plugins/check_snmp -H '$ARG1$' -o '$ARG2$' -w '$ARG3$' --rate -c '$ARG4$' -C '$ARG5$'
}
Go to /etc/nagios3/conf.d/ , edit the localhost_nagios2.cfg file and add these lines.
# Define a service rule to monitor the bandwidth
define service{
use generic-service ; Name of service template to use
host_name localhost
service_description SNMP
check_command check_snmp!10.150.200.83!.1.3.6.1.2.1.2.2.1.10.2!9834375!11801250!public
}
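The interface counters count octets (bytes), and with --rate the check works on the change per second, so the Mbit/s limits have to be expressed in bytes per second. The sketch below shows the arithmetic, assuming 1 Mbit = 1,000,000 bits. The function name mbit_to_bytes_per_sec is hypothetical.

```shell
#!/bin/bash
# Convert a rate in Mbit/s to bytes per second,
# taking 1 Mbit = 1,000,000 bits and 8 bits per byte.
mbit_to_bytes_per_sec() {
    echo $(( $1 * 1000000 / 8 ))
}

echo "warning:  $(mbit_to_bytes_per_sec 75)"   # 9375000
echo "critical: $(mbit_to_bytes_per_sec 90)"   # 11250000
```

The thresholds used in the service definition above, 9834375 and 11801250, are slightly higher because they work out to about 1,049,000 bits per Mbit, close to the binary value of 1,048,576.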
After all the changes have been made, restart Nagios by issuing this command
service nagios3 restart
Domain Authentication
Enable the authnz_ldap module with this command:
a2enmod authnz_ldap
Then restart Apache:
service apache2 restart
Edit the Nagios web configuration file, nagios.conf.
The location of the web config file is: /etc/apache2/conf.d/nagios.conf
<DirectoryMatch (/usr/share/nagios3/htdocs|/usr/lib/cgi-bin/nagios3|/etc/nagios3/stylesheets)>
Options FollowSymLinks
DirectoryIndex index.php index.html
AllowOverride AuthConfig
Order Allow,Deny
Allow From All
#AuthName "Nagios Access"
#AuthType Basic
#AuthUserFile /etc/nagios3/htpasswd.users
# nagios 1.x:
#AuthUserFile /etc/nagios/htpasswd.users
#require valid-user
AuthType Basic
AuthName "Nagios Domain"
AuthBasicProvider ldap
AuthLDAPURL "ldap://localhost:389/cn=Users,dc=patan,dc=t311?sAMAccountName?sub?(objectClass=*)"
AuthLDAPBindDN "cn=nagios,cn=Users,dc=patan,dc=t311"
AuthLDAPBindPassword PASSW0RD
AuthLDAPGroupAttribute member
AuthLDAPGroupAttributeIsDN on
require ldap-group CN=Administrators,CN=Builtin,DC=patan,DC=t311
</DirectoryMatch>
<Directory /usr/share/nagios3/htdocs>
Options +ExecCGI
</Directory>
Edit the CGI authorization file, cgi.cfg, which on this installation is /etc/nagios3/cgi.cfg. Comment out the seven lines that grant access to nagiosadmin, then add matching lines that grant access to “*”.
Run the following two commands to confirm the result. The first lists the commented-out nagiosadmin lines:
grep "#" /etc/nagios3/cgi.cfg | grep -i "nagiosadmin"
#authorized_for_system_information=nagiosadmin
#authorized_for_configuration_information=nagiosadmin
#authorized_for_system_commands=nagiosadmin
#authorized_for_all_services=nagiosadmin
#authorized_for_all_hosts=nagiosadmin
#authorized_for_all_service_commands=nagiosadmin
#authorized_for_all_host_commands=nagiosadmin
The second lists the active lines that now use “*”:
grep "=\*" /etc/nagios3/cgi.cfg | grep -v "#"
authorized_for_system_information=*
authorized_for_configuration_information=*
authorized_for_system_commands=*
authorized_for_all_services=*
authorized_for_all_hosts=*
authorized_for_all_service_commands=*
authorized_for_all_host_commands=*
Replacing “nagiosadmin” with “*” gives any authenticated LDAP user full functionality in Nagios.
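If you would rather make the substitution in one pass than edit each line by hand, a sed expression can do it. This sketch is not part of the original procedure: open_cgi_auth is a hypothetical helper that prints the rewritten file to standard output and leaves the input untouched, and the demonstration runs on a scratch file rather than the live cgi.cfg.

```shell
#!/bin/bash
# Print a copy of a cgi.cfg-style file with every
# "authorized_for_...=nagiosadmin" line rewritten to "=*".
# The input file itself is not modified.
open_cgi_auth() {
    sed 's/^\(authorized_for_[a-z_]*\)=nagiosadmin$/\1=*/' "$1"
}

# Demonstrate on a scratch file rather than the live configuration.
sample=$(mktemp)
printf '%s\n' \
    'authorized_for_system_information=nagiosadmin' \
    'authorized_for_all_hosts=nagiosadmin' \
    'use_authentication=1' > "$sample"
open_cgi_auth "$sample"
rm -f "$sample"
```

Review the output before copying it over the real file, and keep a backup of the original.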
Restart apache2 and nagios using the commands:
service apache2 restart
service nagios3 restart
You can now log in to the Nagios web interface as an LDAP-authenticated user from Active Directory.