Monday, October 12, 2009

For the last few years, there was this mysterious phenomenon in which the load average on the boxes would be higher for a day or two after updating to a new version of the software. I didn't play any part in figuring out the reason for it, but I found it mildly interesting. There were a bunch of configurations that needed to be sent to the clients whenever they changed, and doing so was a relatively expensive operation. To determine whether those configurations had changed, the timestamps of the configuration files were compared with the timestamps (or hashes based on them) sent from the clients. Normally, the clients would be up to date, and new configurations would not need to be sent. However, after an update, all the configuration files would have new timestamps, even if their contents hadn't changed, so every client would have to be updated, which would mostly occur over the next day or two. New procedures have been put in place to avoid updating the configuration files unless they have actually been changed.
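
I don't know what the new procedures actually look like, but one way to get that effect is to compare contents before writing, so that an unchanged file keeps its old timestamp. Here is a minimal Python sketch of the idea. The function names `content_digest` and `install_if_changed` are made up for illustration, and nothing here is taken from the real update scripts.

```python
import hashlib
import os
import tempfile


def content_digest(path):
    """Return the SHA-256 hex digest of a file's bytes, or None if it is missing."""
    try:
        with open(path, "rb") as f:
            return hashlib.sha256(f.read()).hexdigest()
    except FileNotFoundError:
        return None


def install_if_changed(path, new_bytes):
    """Write new_bytes to path only when the content differs (hypothetical helper).

    Returns True if the file was rewritten, and False if it was left
    alone, in which case its modification time is untouched.
    """
    if content_digest(path) == hashlib.sha256(new_bytes).hexdigest():
        return False

    # Write to a temporary file in the same directory, then rename it over
    # the target so readers never see a half-written configuration.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(new_bytes)
        os.replace(tmp, path)
    except BaseException:
        os.unlink(tmp)
        raise
    return True
```

With something like this in the update step, a configuration file that ships unchanged in the new version would keep its old timestamp, and a timestamp-based check against the clients would still find them up to date. The other obvious option would be to have the check itself hash file contents instead of timestamps, but that would presumably mean changing the clients as well.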
