06-22-2016 05:34 AM
Well this is fun. Looking for advice ... see end
Upgrading from CPPM 6.5.6 to 6.6. The plan is to migrate one CPPM appliance at a time from a 6.5.6 cluster to a 6.6 cluster. So ....
build 6.5.6 master publisher vm and restore a backup of our real master publisher into it
upgrade to 6.6 and check everything works
leave for a week with some auths going through it
suddenly at 14:22 last Thursday the event log shows CPU IOWait errors
mention it to systems - and the clearpass VM is flatlining the CPU on our netapps
shut down clearpass, netapp back to normal; restart VM, netapp squeaks
back up 6.6 config and destroy VM
build new VM (clearpassm0) using the 6.6 OVA
test for normal operation and pass users to it, everything works.
install (new) hardware appliance running 6.6, make subscriber to above system (clearpass5)
everything works - now got cluster of 2 machines running 6.6
# Now for the bad bit
remove 6.5.6 subscriber from other cluster. (clearpass4)
Download 6.6 update
At home - 4:15 this morning clicked install on update, told it to keep its database configs. BTW backup file ~8 GB
6:30 - hasn't finished yet
At work 9:15 - ready for reboot - reboot VM
11:15 - comes back from reboot
Device working standalone just fine
Tell it to become subscriber to clearpassm0
# This goes apes**t!
master publisher stops authenticating and clearpass5 says clearpassm0 is down
clearpass4 says it's syncing
clearpass5 says it's now out of sync.
ssh into clearpassm0
type "cluster list"
Ctrl-C out of it
type another command - that hangs
VMWare says lots of CPU activity
can SSH into it and commands respond
no web interface on clearpassm0 after hours
web interface on clearpass4 tries to do something (at the welcome action page) but nothing appears
can ssh into clearpass4
What do I do now? Just leave it for another few hours?
Feeling is to start again, rebuild all the VMs and ditch the session data and insight stuff ..... which is what we actually want to keep!
06-22-2016 08:52 AM
OK, got things working, thank god for a command line. However, I have lost all insight data, and what was my new master publisher was totally trashed: none of the cluster ... commands would work, including the reset database one. Ended up rebuilding that VM.
Oh well, started the day with a 2-device cluster and a standalone device and ended the day with a 2-device cluster and a standalone device .... without any insight or historic data ...