Airwave - Identify and delete corrupt RRD files
Introduction: This article explains how to identify corrupt RRD files and how to remove them.
When corrupt RRD files are present, many web pages on Airwave may crash when they are loaded.
A sample error message is shown below.
The server has encountered an error while performing your request.
Please try again or email a copy of this message along with the
<https://18.104.22.168/display_log?log=/var/log/httpd/error_log> web server
log to customer support. Contact Aruba Technical Support at
firstname.lastname@example.org or
Time: Tue Dec 17 13:47:06 2013
[Tue Dec 17 13:47:06 2013] [error] [client 127.0.0.1]
AWRRDtool last error: (last: Failed to read any or all header bytes:
Successlast: see prior output for details)
One or more of your RRD files seems to be corrupted.
Please run: scripts/identify_broken_rrds at Mercury/Utility/Assert.pm line 45
at Mercury/Utility/Assert.pm line 45
59 ( Mercury/AWRRD/Base.pm: 151) ASSERT
58 ( Mercury/AWRRD/Base.pm: 121) Mercury::AWRRD::Base::check_for_errors([arg
57 ( Mercury/AWRRD/BandwidthBase.pm: 92) Mercury::AWRRD::Base::_tool([arg
56 ( Mercury/AWRRD/BandwidthBase.pm: 153)
55 ( Mercury/AWRRD/BandwidthBase.pm: 140)
54 ( Mercury/Client.pm: 520)
53 ( Mercury/Client.pm: 523)
The error message indicates that the identify_broken_rrds script should be run to find the broken RRD files.
Before identifying the corrupt or broken RRD files, take a backup of the server in case something goes wrong during the process described below.
To take a manual backup execute the command
Once the backup process is complete, retrieve the backup file, databackup.tar.gz, from the /alternative directory.
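Before relying on the backup, it can be worth confirming that the archive is readable. The following is a minimal sketch, not part of the Airwave tooling: backup_is_readable is a hypothetical helper name, and the check only confirms that tar can list the archive from start to end, not that its contents are complete.

```shell
#!/bin/sh
# Hypothetical helper (not an Airwave command): succeed only if the given
# gzip-compressed tar archive exists and can be listed without errors.
backup_is_readable() {
    archive=$1
    [ -f "$archive" ] && tar -tzf "$archive" > /dev/null 2>&1
}

# Example call, using the path named in this article:
# backup_is_readable /alternative/databackup.tar.gz && echo "backup looks readable"
```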
To run the script to identify broken RRD files, execute the following command.
./scripts/identify_broken_rrds > /tmp/rrd.txt
The above command saves the list of broken RRD files to the file rrd.txt in the /tmp directory.
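To get a quick sense of how many files are affected, the entries in the list can be counted. This sketch assumes the script writes one file path per line, which the article does not confirm. count_listed_rrds is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print the number of non-blank lines in a list file.
# Assumes one broken RRD path per line (an assumption about the output format).
count_listed_rrds() {
    awk 'NF { n++ } END { print n + 0 }' "$1"
}

# Example call, using the list file named in this article:
# count_listed_rrds /tmp/rrd.txt
```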
Create a folder named corruptedfiles in /tmp using mkdir /tmp/corruptedfiles
Move the broken RRD files to /tmp/corruptedfiles
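The move can be scripted when the list is long. The sketch below again assumes one path per line in the list file, and quarantine_listed_files is a hypothetical helper name. It recreates each file's original directory underneath the destination so that files with the same name from different folders do not overwrite each other, and so that the original location of each file stays on record.

```shell
#!/bin/sh
# Hypothetical helper: move every file named in a list (one path per line,
# an assumed format) into a quarantine directory, keeping each file's
# original directory path underneath that directory.
quarantine_listed_files() {
    list=$1
    dest=$2
    mkdir -p "$dest" || return 1
    # The "|| [ -n ... ]" part also handles a final line without a newline.
    while IFS= read -r path || [ -n "$path" ]; do
        # Skip blank lines and entries that are not existing regular files.
        [ -f "$path" ] || continue
        target_dir=$dest/$(dirname "$path")
        mkdir -p "$target_dir" && mv -- "$path" "$target_dir/" && echo "moved: $path"
    done < "$list"
}

# Example call, using the names from this article:
# quarantine_listed_files /tmp/rrd.txt /tmp/corruptedfiles
```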
Next, identify a known good RRD file and replace each broken RRD file with it using the copy command.
To identify a known good RRD file, pick a file that is not in the list saved in /tmp/rrd.txt.
For example, 04_46_65_46_FD_1A.in_bps.awrrd is a non-corrupt file in the folder client_bandwidth_1A. This file should be copied over the corrupt file, 04_46_65_03_CB_1A.in_bps.awrrd, using the copy command.
It is best to change into the directory that contains the broken RRD file and pick a file there that is not in the list saved in /tmp/rrd.txt.
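The replacement described above can be sketched as a small shell function. This is an illustration rather than the exact command from the original procedure: replace_with_known_good is a hypothetical helper name, and the safety check assumes the list file contains the file names of the broken RRD files. The function refuses to copy if the chosen "good" file itself appears in the list, and it uses cp -p so that the copy keeps the source file's mode and timestamps (and ownership where permitted).

```shell
#!/bin/sh
# Hypothetical helper: copy a known good RRD file over a broken one, but
# refuse if the "good" file's name appears in the list of broken files.
replace_with_known_good() {
    good=$1
    broken=$2
    list=$3
    [ -f "$good" ] || { echo "no such file: $good" >&2; return 1; }
    [ -f "$list" ] || { echo "no such list: $list" >&2; return 1; }
    # Fixed-string match on the file name; a substring match is deliberately
    # conservative and may refuse in borderline cases.
    if grep -qF -- "$(basename "$good")" "$list"; then
        echo "refusing: $good appears in $list" >&2
        return 1
    fi
    cp -p -- "$good" "$broken"
}

# Example call from inside the client_bandwidth_1A folder, using the file
# names from this article:
# replace_with_known_good 04_46_65_46_FD_1A.in_bps.awrrd \
#     04_46_65_03_CB_1A.in_bps.awrrd /tmp/rrd.txt
```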
Once all the broken RRD files have been replaced with known good files, the page crash should no longer occur.