Revision [206]

This is an old revision of AuditingWG made by JulianDemarchi on 2007-06-28 23:53:36.

 


Monitors OpenNIC systems and files bugs against any that aren't functioning properly.

Members
* JulianDemarchi, coordinator
* AvoYager

Status
The Audit team has been very busy organizing and testing the Auditing systems. We have finally
decided on the Auditing package we will be using. This email is purely to inform the mailing
list of the status of the Audit team, and hopefully bring some more life to OpenNIC.

The Audit team has decided to use a package named Nagios [0] for the auditing backend. We
believe that Nagios is the perfect package to use, as it gives us more power to monitor the
OpenNIC network and pinpoint problems more accurately.

By using Nagios, we can now have the system email the appropriate teams when a failure
occurs, and send a follow-up email when the system returns to normal.

So as you can see, the finished system is going to be very powerful for reporting on and
monitoring the overall health of the OpenNIC network.

What is Nagios
Nagios is a host and service monitor designed to inform you of network problems before your clients, end-users or managers do. It has been designed to run under the Linux operating system, but works fine under most *NIX variants as well. The monitoring daemon runs intermittent checks on hosts and services you specify using external "plugins" which return status information to Nagios. When problems are encountered, the daemon can send notifications out to administrative contacts in a variety of different ways (email, instant message, SMS, etc.). Current status information, historical logs, and reports can all be accessed via a web browser.
(Extract from the website - http://www.nagios.org/about/)
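To make the "plugins which return status information" idea concrete, here is a minimal sketch of a check written in Python. It relies on the standard Nagios plugin convention (one line of output, and an exit code of 0 for OK, 1 for WARNING, 2 for CRITICAL, 3 for UNKNOWN). The script name, the function names and the timing thresholds are assumptions made for illustration only; this is not a plugin the Audit team has chosen.

```python
#!/usr/bin/env python3
"""Sketch of a Nagios-style plugin: resolve a name and report how long it took."""
import socket
import sys
import time

# Exit codes defined by the Nagios plugin convention.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def classify(elapsed, warn=1.0, crit=3.0):
    """Map a lookup time in seconds to a status code (thresholds are illustrative)."""
    if elapsed >= crit:
        return CRITICAL
    if elapsed >= warn:
        return WARNING
    return OK


def check_resolve(hostname, resolver=socket.gethostbyname):
    """Return (status_code, one_line_message) for a single lookup."""
    start = time.monotonic()
    try:
        address = resolver(hostname)
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        return CRITICAL, f"DNS CRITICAL - {hostname}: {exc}"
    elapsed = time.monotonic() - start
    status = classify(elapsed)
    return status, f"DNS {LABELS[status]} - {hostname} -> {address} in {elapsed:.3f}s"


def main(argv):
    """Print the status line and return the exit code Nagios should see."""
    if len(argv) != 2:
        print("DNS UNKNOWN - usage: check_resolve.py HOSTNAME")
        return UNKNOWN
    status, message = check_resolve(argv[1])
    print(message)
    return status

# To run this as a real plugin, finish the script with: sys.exit(main(sys.argv))
```

Nagios would read the printed line as the status text and the exit code as the service state. The resolver is passed in as a parameter so the logic can be exercised without touching the network.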


But what features does it offer?
Nagios has a lot of features, making it a very powerful monitoring tool. Some of the major features are listed below:
* Monitoring of network services (SMTP, POP3, HTTP, NNTP, PING, etc.)
* Monitoring of host resources (processor load, disk and memory usage, running processes, log files, etc.)
* Monitoring of environmental factors such as temperature
* Simple plugin design that allows users to easily develop their own host and service checks
* Ability to define network host hierarchy, allowing detection of and distinction between hosts that are down and those that are unreachable
* Contact notifications when service or host problems occur and get resolved (via email, pager, or other user-defined method)
* Optional escalation of host and service notifications to different contact groups
* Ability to define event handlers to be run during service or host events for proactive problem resolution
* Support for implementing redundant and distributed monitoring servers
* External command interface that allows on-the-fly modifications to be made to the monitoring and notification behavior through the use of event handlers, the web interface, and third-party applications
* Retention of host and service status across program restarts
* Scheduled downtime for suppressing host and service notifications during periods of planned outages
* Ability to acknowledge problems via the web interface
* Web interface for viewing current network status, notification and problem history, log file, etc.
* Simple authorization scheme that allows you to restrict what users can see and do from the web interface
(Extract from the website - http://www.nagios.org/about/)
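The distinction between hosts that are down and hosts that are unreachable can be illustrated with a short sketch. The rule assumed here is a simplified reading of the Nagios documentation: a failing host counts as DOWN if it has no parents or at least one parent is UP, and as UNREACHABLE if every parent is itself failing. The host names are hypothetical, and this is not Nagios's own code.

```python
def classify_hosts(parents, failed):
    """Classify each host as UP, DOWN or UNREACHABLE.

    parents: dict mapping host -> list of parent hosts (assumed to be acyclic).
    failed:  set of hosts whose own check did not succeed.
    """
    states = {}

    def state(host):
        if host in states:
            return states[host]
        if host not in failed:
            states[host] = "UP"
        else:
            ups = parents.get(host, [])
            # No parents means the host sits next to the monitoring server,
            # so a failed check can only mean the host itself is down.
            if not ups or any(state(p) == "UP" for p in ups):
                states[host] = "DOWN"
            else:
                states[host] = "UNREACHABLE"
        return states[host]

    for host in set(parents) | set(failed):
        state(host)
    return states


# Hypothetical layout: two name servers behind one router.
layout = {"router": [], "ns1": ["router"], "ns2": ["router"]}
print(classify_hosts(layout, {"router", "ns1", "ns2"}))
```

With this layout, a router failure marks only the router as DOWN and the name servers behind it as UNREACHABLE, so the alert points at the actual fault rather than at every host behind it.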

To Do

* An email from all TLD admins to confirm they want alert emails sent
* Robin to set up our DNS records for the Audit servers
* Nagios howto (Debian, Fedora, FreeBSD and more)
* Nagios conf examples
* AuditingWG plan (In detail)


How can you help?

* Join the team. Send email to the OpenNIC mailing list expressing your interest
* Set up a monitoring server

The best help anyone can offer is to use our DNS servers. Help support OpenNIC and spread the word.

[0] - http://www.nagios.org


CategoryAuditing