04-09-2013 10:40 AM
We have two hardware appliances in an HA cluster running Amigopod/Clearpass 3.9.7. For the last few years I have had the DB maintenance set on the primary to run Saturday at 3:00AM. On the secondary box, it was set to run Saturday at 3:15AM. My assumption (possibly wrong) was that the primary DB maint. would sync the purges/changes over to the secondary, then 15 minutes later, the secondary would run its own DB maint. and find no data left to purge.
For the last month or so, when the secondary DB maint. runs, it gets hung up trying to purge accounts from the DB. The hang makes sense, because the DB on the secondary is read-only, although there shouldn't be any accounts left to purge there. This causes the secondary server to generate many syslog messages and run at higher than normal utilization. It may have started after we went from 3.7 to 3.9.
What is the correct way to configure the DB maint. with an HA cluster? Can I disable the DB maint. on the secondary server since it is part of a cluster? I am unsure if the DB maint. just purges old data from the DB or if there are any other useful tasks such that we would want to leave it enabled on the secondary.
04-10-2013 06:25 AM
Greetings Bryan. We have seen a number of high-load deployments where the maintenance run can take a very long time. This is primarily due to the re-indexing involved in selective deletes. I would start by separating these two schedules by a few hours if possible. 15 minutes is very short.
You are correct that the re-index on the primary will make its way to the node. The only thing that is actually going to be indexed is the logs. All configuration and RADIUS history is only writable on the primary.
Having just peeked at the code, it does not seem that the retention code is aware of the HA status, so it may try to do its purges on the node as well. This makes it vital that the primary's run has completed and synced over before the node tries to do it again.
Look on the primary for these two log entries:
- Processing scheduled database maintenance.
- Finished processing scheduled database maintenance
Technically you can just search for 'database maintenance' and catch them both. Look at the time delta between the two entries and that will be your starting point for setting the node's schedule.
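If you export the log to text, the delta can be worked out with a few lines of script. This is only a rough sketch: the two message strings are the ones quoted above, but the timestamp prefix (`YYYY-MM-DD HH:MM:SS` at the start of each line), the sample lines, and the one-hour safety margin are assumptions made for illustration, so adjust them to whatever your export actually looks like.

```python
from datetime import datetime, timedelta

# Message text as quoted in the post above (matching is case-sensitive).
START_MSG = "Processing scheduled database maintenance"
END_MSG = "Finished processing scheduled database maintenance"

# ASSUMPTION: each exported log line begins with a timestamp in this format.
TS_FORMAT = "%Y-%m-%d %H:%M:%S"
TS_LENGTH = 19  # length of "YYYY-MM-DD HH:MM:SS"


def parse_ts(line):
    """Parse the assumed leading timestamp of a log line."""
    return datetime.strptime(line[:TS_LENGTH], TS_FORMAT)


def maintenance_window(lines):
    """Return (start, duration) of the first complete maintenance run, or None."""
    start = None
    for line in lines:
        if END_MSG in line:
            if start is not None:
                return start, parse_ts(line) - start
            continue
        if START_MSG in line:
            start = parse_ts(line)
    return None


def suggested_node_start(start, duration, margin=timedelta(hours=1)):
    """Primary start + observed duration + a hypothetical safety margin."""
    return start + duration + margin


# Made-up sample lines, for illustration only.
sample = [
    "2013-04-06 03:00:01 Processing scheduled database maintenance.",
    "2013-04-06 03:40:10 some unrelated entry",
    "2013-04-06 04:12:31 Finished processing scheduled database maintenance",
]

window = maintenance_window(sample)
if window is not None:
    start, duration = window
    print("Primary run took:", duration)
    print("Earliest sensible node start:", suggested_node_start(start, duration))
```

Run it against a few weeks of entries rather than a single one, since the duration will vary with how much there is to purge, and base the node's schedule on the longest run you see.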