Zabbix: How to detect changes to a web page

Checking whether the content of a certain web page has changed is not as obvious in Zabbix as it should be. My experience with Zabbix is limited, so I needed some time to figure out how to do this.

First of all you need a host, let's say "java-cup.de". On this host you define an item with a key constructed like this:

web.page.get[https://java-cup.de/]

This key simply retrieves the content of the web page, including all HTTP headers. web.page.get accepts more parameters, but we do not need them if we supply a complete URL as the first parameter.

There is also the key web.page.regexp, but I prefer to collect the full web page in this step because that leaves more flexibility afterwards.

Then we need a trigger that references the item we just created. It uses the change function to tell whether the page was modified since the last retrieval.

With just this setup, a change is detected on every retrieval because the headers are included. Headers such as Date and Expires typically carry the time of retrieval or expiry of the page, so the item value differs every time it is collected.
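To illustrate the problem outside of Zabbix, here is a minimal Python sketch. It does not call Zabbix at all; the two responses are made-up sample data, and the helpers changed and body_only are hypothetical names that only mimic the comparison the trigger performs.

```python
# Two hypothetical retrievals of the same page, one minute apart.
# Only the Date and Expires headers differ; the HTML body is identical.
FIRST = (
    "HTTP/1.1 200 OK\r\n"
    "Date: Mon, 01 Jan 2024 10:00:00 GMT\r\n"
    "Expires: Mon, 01 Jan 2024 11:00:00 GMT\r\n"
    "Content-Type: text/html\r\n"
    "\r\n"
    "<html><body><h1>Hello</h1></body></html>"
)
SECOND = FIRST.replace("10:00:00", "10:01:00").replace("11:00:00", "11:01:00")


def changed(previous: str, latest: str) -> bool:
    """Return True if the two collected values differ (stand-in for the trigger)."""
    return previous != latest


def body_only(response: str) -> str:
    """Drop the header block: everything up to the first empty line."""
    _headers, _separator, body = response.partition("\r\n\r\n")
    return body


# Comparing the raw values always reports a change because of the headers.
raw_changed = changed(FIRST, SECOND)
# Comparing only the bodies reports a change only when the page itself differs.
body_changed = changed(body_only(FIRST), body_only(SECOND))

print(raw_changed, body_changed)
```

The raw comparison reports a change even though the page itself is the same, which is exactly why the item value has to be reduced to the relevant content first.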

To avoid this, we have to extract from the page only the content that we want to watch. This can easily be done on the Preprocessing tab of the item, where we can employ a regexp to extract the data we are interested in.
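As a rough picture of what such a regexp step does, the following Python sketch applies a pattern with one capture group and returns the group as the new value. The sample value, the div with id "news" and the function regex_step are assumptions made for illustration; Zabbix uses its own regexp engine, so the pattern may need adjusting there.

```python
import re
from typing import Optional

# Hypothetical item value as collected by the item: headers followed by HTML.
VALUE = (
    "HTTP/1.1 200 OK\r\n"
    "Date: Mon, 01 Jan 2024 10:00:00 GMT\r\n"
    "\r\n"
    '<html><body><div id="news">Release 1.2 is out</div></body></html>'
)


def regex_step(value: str, pattern: str, output: str = r"\1") -> Optional[str]:
    """Mimic a regexp preprocessing step: match the pattern, emit the output template."""
    match = re.search(pattern, value)
    if match is None:
        return None  # no match: nothing to extract
    # Replace \1, \2, ... in the output template with the captured groups.
    return match.expand(output)


extracted = regex_step(VALUE, r'<div id="news">(.*?)</div>')
print(extracted)
```

Because the extracted text no longer contains the Date header, it stays the same between retrievals until the watched part of the page is actually edited.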

Even better, we can chain multiple preprocessing steps so that the extraction happens in small stages. Of course, using regexps on HTML isn't optimal, but there is no HTML parser available, and the XPath parser that we could use instead of the regexp will not work on most HTML pages since they are not well-formed XML.
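The staged approach can be sketched in Python as a list of pattern and output pairs that are applied one after another, each step working on the result of the previous one. The page content, the element ids and the helper run_steps are hypothetical; this only models the idea of chained steps and is not how Zabbix implements them.

```python
import re

# Hypothetical item value: headers plus a small HTML page.
PAGE = (
    "HTTP/1.1 200 OK\r\n"
    "Date: Mon, 01 Jan 2024 10:00:00 GMT\r\n"
    "\r\n"
    "<html>\n<body>\n"
    '<div id="nav">Menu</div>\n'
    '<div id="content">\n<h1>Version 2.0</h1>\n</div>\n'
    "</body>\n</html>"
)

# Each stage narrows the value a little further.
# (?s) lets "." match newlines, since the page spans several lines.
STEPS = [
    (r"(?s)<body[^>]*>(.*)</body>", r"\1"),         # 1: drop headers, keep the body
    (r'(?s)<div id="content">(.*?)</div>', r"\1"),  # 2: keep only the content block
    (r"<h1>([^<]*)</h1>", r"\1"),                   # 3: keep only the headline text
]


def run_steps(value: str, steps) -> str:
    """Apply the regexp stages in order, feeding each result into the next stage."""
    for pattern, output in steps:
        match = re.search(pattern, value)
        if match is None:
            raise ValueError(f"pattern did not match: {pattern}")
        value = match.expand(output)
    return value


result = run_steps(PAGE, STEPS)
print(result)
```

Keeping each pattern small makes it easier to see which stage stops matching when the page layout changes.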