As munin is currently architected, if munin-cron does not finish in under 5 minutes, all of your graphs will have a gap for the next 5-minute period. The first workaround for this problem is to break munin-cron into two steps: one for munin-update and munin-limits, the other for munin-graph and munin-html. This solves the issue in the 90% case, as generating the graphs normally takes far more time than munin-update does.
However, in a sufficiently large infrastructure, you will probably have some nodes in distress. In this case, a single node may take more than 5 minutes to respond to munin-update, which still blanks out the next 5-minute period in the graphs.
To solve this problem, I propose a few changes to munin:
1. We should make the poller (munin-cron) no longer run from cron. This will solve the issue of cron outsmarting the poller. In its place, we should have a well-timed daemon that handles munin's polling and update tasks.
2. We should make munin-update use a per-node lock instead of a single global lock. This will enable multiple munin-update runs to overlap in parallel, with only the slow-responding nodes being afflicted with gaps in the graphs (as opposed to the entire infrastructure).
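To make the two proposals concrete, here is a rough sketch in Python (not munin's own code, and not how munin-update currently works). It assumes a Unix host with flock(2) available. The names node_lock, poll_cycle, seconds_until_next_tick, and LOCK_DIR are all hypothetical, and update_node stands in for whatever actually polls a node.

```python
import fcntl
import os
import tempfile
from contextlib import contextmanager

# Hypothetical lock directory; a real daemon would use a fixed, configured path.
LOCK_DIR = tempfile.mkdtemp(prefix="munin-locks-")


@contextmanager
def node_lock(node, lock_dir=LOCK_DIR):
    """Yield True if the per-node lock was acquired, False if it is already held."""
    path = os.path.join(lock_dir, node + ".lock")
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o644)
    try:
        try:
            # Non-blocking: fail immediately instead of waiting on a slow node.
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            acquired = True
        except BlockingIOError:
            acquired = False
        yield acquired
    finally:
        os.close(fd)  # closing the descriptor releases the lock if we held it


def poll_cycle(nodes, update_node):
    """Poll every node once; return the nodes skipped because a poll is still running."""
    skipped = []
    for node in nodes:
        with node_lock(node) as acquired:
            if not acquired:
                skipped.append(node)  # only this node gets a gap
                continue
            update_node(node)
    return skipped


def seconds_until_next_tick(now, interval=300):
    """Seconds to sleep so that cycles start on fixed 5-minute boundaries."""
    return interval - (now % interval)
```

In this sketch a daemon would sleep for seconds_until_next_tick(time.time()), then start poll_cycle in a fresh process or thread without waiting for the previous cycle to finish. A node that is still being polled from an earlier cycle is simply skipped, so only that node misses a data point.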
Thoughts?