Debugging Munin plugins
0. Restart munin-node. A new plugin is not registered until munin-node has been restarted.
# sudo /etc/init.d/munin-node restart
1. On the host where Munin runs, run munin-update as the Munin user account.
This step will tell you whether munin (the server) is able to communicate with munin-node (the agent).
# su -s /bin/bash munin
# /usr/share/munin/munin-update --debug --nofork --stdout --host foo.example.com --service df
You should get a line like this:
Aug 11 22:39:51 - [6846] Updating /var/lib/munin/example.com/foo.example.com-df-_dev_hda1-g.rrd with 57
After this, replace df with the service you want to check (e.g. hddtemp_smartctl). If one of these steps does not work, something is probably wrong with the plugin or how munin-node talks to the plugin.
2. On the host where munin-node runs, check to see whether the plugin runs through munin-run. Test with and without config, and with and without --debug.
Regular run:
# munin-run df
_dev_hda1.value 83
Config run:
# munin-run df config
graph_title Filesystem usage (in %)
graph_args --upper-limit 100 -l 0
graph_vlabel %
graph_category disk
graph_info This graph shows disk usage on the machine.
_dev_hda1.label /
_dev_hda1.info / (ext3) -> /dev/hda1
_dev_hda1.warning 92
_dev_hda1.critical 98
3. If not, does the plugin run when executed directly? If it runs when executed as root but not through munin-run (as described in step 2), the plugin has a permission problem. See the article on environment files.
4. Does the plugin run through munin-node, with and without config? Hint: Telnet to port 4949.
Regular run:
# telnet foo.example.com 4949
Trying foo.example.com...
Connected to foo.example.com.
Escape character is '^]'.
# munin node at foo.example.com
fetch df
_dev_hda1.value 83
[...]
.
With config:
# telnet foo.example.com 4949
Trying foo.example.com...
Connected to foo.example.com.
Escape character is '^]'.
# munin node at foo.example.com
config df
graph_title Filesystem usage (in %)
graph_args --upper-limit 100 -l 0
graph_vlabel %
graph_category disk
graph_info This graph shows disk usage on the machine.
_dev_hda1.label /boot
_dev_hda1.info /boot (ext3) -> /dev/hda1
_dev_hda1.warning 92
_dev_hda1.critical 98
[...]
.
If the plugin does run with munin-run but not through telnet, you probably have a PATH problem. Tip: Set env.PATH for the plugin in the plugin's environment file.
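The telnet session in step 4 can also be scripted. The following is a minimal sketch, not part of Munin itself: the helper name query_node is hypothetical, and it assumes the node greets with a single banner line and ends each fetch or config reply with a line containing only a dot, as in the transcripts above.

```python
import socket


def query_node(host, command, port=4949, timeout=10.0):
    """Send one command (e.g. 'fetch df') to munin-node and return the reply lines.

    Assumption: the node prints one banner line on connect and terminates
    each reply with a line that contains only ".".
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with sock.makefile("rw", encoding="utf-8", newline="\n") as stream:
            stream.readline()  # banner, e.g. "# munin node at foo.example.com"
            stream.write(command + "\n")
            stream.flush()
            lines = []
            for line in stream:
                line = line.rstrip("\r\n")
                if line == ".":  # end-of-reply marker
                    break
                lines.append(line)
            return lines
```

Calling query_node("foo.example.com", "fetch df") and query_node("foo.example.com", "config df") from the Munin server covers the two runs shown above. An empty or unexpected reply here, when munin-run works locally, points at the node's environment rather than at the plugin logic.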
5. Does the plugin output contain too few, too many and/or illegal characters?
6. Does Munin (munin-cron and its children) write values into RRD files? Hint: rrdtool fetch [rrd file] AVERAGE
7. Does the plugin use legal field names? See Notes on Field names.
8. If you borrow data from other graphs, check that {fieldname}.type is set properly. See Munin file names for a quick reference on what any error messages in the logs might indicate.
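Steps 5 and 7 can be partly automated by checking the plugin's regular (non-config) output line by line. The sketch below is illustrative only. It assumes that a field name may contain only letters, digits and underscores and must not start with a digit, and that a value is either a plain number or U for unknown. Munin itself may accept more than this (multigraph plugins, for instance, emit extra lines that this check would flag).

```python
import re

# Assumption: field names use only [A-Za-z0-9_] and do not start with a digit.
FIELD_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
# Assumption: a value is "U" (unknown) or a plain integer or decimal number.
VALUE = re.compile(r"U|-?\d+(\.\d+)?")


def check_fetch_output(text):
    """Return a list of human-readable problems found in plugin fetch output."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue
        parts = line.split(None, 1)
        key = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        field, dot, attribute = key.rpartition(".")
        if dot != "." or attribute != "value":
            problems.append(f"line {number}: expected '<field>.value <number>', got {line!r}")
            continue
        if not FIELD_NAME.fullmatch(field):
            problems.append(f"line {number}: illegal field name {field!r}")
        if not VALUE.fullmatch(value):
            problems.append(f"line {number}: illegal value {value!r}")
    return problems
```

Feeding it the output of munin-run for the plugin under test should return an empty list. Anything else names the offending line.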
Cases
SELinux sometimes breaks Munin plugins
- See the documentation start page for links to SELinux rules for Munin.
munin-node seems to show sane values, but RRD files are filled with 0
- The plugin's output values are GAUGE values, but the plugin declares them as COUNTER or DERIVE. Note that GAUGE is the default type when a plugin does not declare one.
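Why a wrong type produces zeros can be seen with a deliberately simplified model. The function below is a hypothetical illustration, not RRDtool's actual algorithm: it assumes readings arrive exactly one step apart (300 seconds, matching Munin's five-minute interval) and ignores counter wrap-around and consolidation.

```python
def stored_values(samples, step=300, data_type="GAUGE"):
    """Approximate what gets stored for readings taken `step` seconds apart.

    Simplified model: GAUGE stores the reading itself, DERIVE stores the
    per-second rate of change between two consecutive readings.
    """
    if data_type == "GAUGE":
        return [float(value) for value in samples]
    # DERIVE: difference between consecutive readings divided by the interval.
    return [(later - earlier) / step for earlier, later in zip(samples, samples[1:])]
```

A disk that stays at 83 % full gives stored_values([83, 83, 83], data_type="DERIVE") == [0.0, 0.0], while the same readings as GAUGE are stored unchanged.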
munin-node seems to show sane values, but RRD files are filled with 'nan'
- Check that there are no invalid characters in the plugin's output.
- For new plugins, let Munin gather data for about 20 minutes before expecting values to appear.
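To see how much of an RRD file is 'nan', the output of rrdtool fetch can be summarized. This is a rough sketch with a hypothetical helper name. It assumes each data row looks like "timestamp: value [value ...]", that header and blank lines carry no such prefix, and that unknown values are printed as nan or -nan.

```python
def summarize_fetch(text):
    """Count rows with and without data in `rrdtool fetch` output.

    Returns a (known, unknown) tuple. Assumes rows of the form
    "<unix timestamp>: <value> ..." and skips every other line.
    """
    known = unknown = 0
    for line in text.splitlines():
        timestamp, colon, values = line.partition(":")
        if colon != ":" or not timestamp.strip().isdigit():
            continue  # header with data source names, or a blank line
        if all(v.lower().lstrip("-") == "nan" for v in values.split()):
            unknown += 1
        else:
            known += 1
    return known, unknown
```

If the second number keeps growing well after the plugin was installed, the problem is more likely in the plugin's output than in the warm-up period.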
munin-node is configured properly, but won't give any data
- Check that each field name in the plugin's output has the .value directive (yes, I managed to forget that recently).
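One way to catch a forgotten .value is to compare the two runs of the plugin. The sketch below uses hypothetical helper names and assumes that every field announced with a .label line in the config run should report a matching .value line in the regular run.

```python
def field_names(text, attribute):
    """Collect field names that appear as '<field>.<attribute> ...' lines."""
    names = set()
    for line in text.splitlines():
        key = line.split(None, 1)[0] if line.strip() else ""
        field, dot, attr = key.rpartition(".")
        if dot and attr == attribute:
            names.add(field)
    return names


def fields_without_value(config_text, fetch_text):
    """Fields labelled in the config output that never report a .value line."""
    labelled = field_names(config_text, "label")
    reported = field_names(fetch_text, "value")
    return sorted(labelled - reported)
```

Pass it the text printed by "munin-run df config" and by "munin-run df". A non-empty result lists the fields whose values munin-node will never deliver.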
munin-node sometimes returns valid data, sometimes not
- Check that no race conditions occur. A typical race condition is updating a file with crontab while the plugin is trying to read the file.
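A common way to avoid this particular race is to have the cron job write to a temporary file and rename it into place, so the plugin sees either the old file or the new one and never a partial write. The following is a sketch under assumptions: the function name is hypothetical, the rename is atomic only when both files are on the same POSIX filesystem, and the 0644 mode is a guess at what lets the plugin's user read the result.

```python
import os
import tempfile


def write_atomically(path, text, mode=0o644):
    """Write text to path so that readers never observe a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file next to the target so the rename stays on
    # one filesystem, which is what makes it atomic on POSIX systems.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(text)
        # mkstemp creates the file as 0600; assumed: the plugin runs as a
        # different user and needs read access.
        os.chmod(tmp_path, mode)
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

The plugin side needs no change: it keeps opening the same path and always reads a complete file.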
The graphs are empty
- The plugin's output values are GAUGE values, but the plugin declares them as COUNTER or DERIVE. Note that GAUGE is the default type when a plugin does not declare one.
- The files to be updated by Munin are owned by root or another user account rather than by the Munin user.
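The ownership check can be done by walking the data directory. This sketch is illustrative, and the helper name is hypothetical. It only compares the owning user ID and does not look at group or mode bits.

```python
import os


def files_not_owned_by(directory, uid):
    """Return paths under directory whose owner differs from the given uid."""
    wrong = []
    for root, _dirs, files in os.walk(directory):
        for name in files:
            path = os.path.join(root, name)
            if os.stat(path).st_uid != uid:
                wrong.append(path)
    return sorted(wrong)
```

Assuming the data lives under /var/lib/munin and the account is named munin, as in the examples on this page, the call would be files_not_owned_by("/var/lib/munin", pwd.getpwnam("munin").pw_uid) after importing the standard pwd module.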
Other mumbo-jumbo
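- Run the different stages of munin-cron manually with --debug, --nofork and --stdout, for example:
su - munin -c "/usr/lib/munin/munin-update --debug --nofork --stdout --host foo.example.com --service df"
The location of munin-update differs between installations (this page shows both /usr/share/munin and /usr/lib/munin); use the path that exists on your system.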
