PDML or PCAP: An Arpwatch replacement using MySQL
To make tracemac easier to use at my place of employment, I ran arpwatch on a SPAN port to keep track of IP/MAC mappings and imported them into a database every couple of minutes. The database was then used by a web utility that spawned tracemac, allowing the user to enter an IP address instead of the MAC address and VLAN that tracemac alone would require. This functionality was never built into tracemac since Cisco doesn't seem to have a MIB for looking at the ARP cache of a router, and even if they did, an ARP cache would not be the most reliable source for the information. So, I was stuck with arpwatch, which isn't pleasant to configure for multiple interfaces, and which is hard to stop from bombarding root with tons of e-mail. I set out to write a replacement with just the basic functionality I needed.
My first thought was to use PDML, a packet export format using XML. Ethereal can generate PDML, instead of its typical human-readable output, while monitoring a live interface or reading from a capture file. I wrote a PHP script (arpmon.php) to wrap around tethereal: tethereal would snarf up the ARP packets, and the PHP script would pick out the relevant bits from the easily parsed PDML output. In theory, this sounded pretty good. I could stick with a high-level scripting language with easy access to XML and database functions, parsing data that should remain more consistent than direct console output from tcpdump or Ethereal.
Unfortunately, there were problems. tethereal, although generally efficient, consumed a fair amount of CPU time to produce PDML from the ARP data. That isn't the worst of it though, as I have boxes speedy enough to handle a little processing. tethereal also has memory leak issues. Ethereal keeps track of state information as it goes, so that it can decode packets more meaningfully based on the state of their connections. The side effect is that this data accumulates over time, eventually leading to memory exhaustion. tethereal outputting PDML (or any other cooked output type, for that matter) is not suitable for use as a continuous input stream to a network monitoring script.
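To give an idea of what "picking out the relevant bits" involves, here is a minimal sketch of pulling one field's value out of a line of PDML. It is written in C rather than the PHP of arpmon.php, and it is not the original script's code. The function name pdml_field_show is made up for illustration, and the sketch assumes that each PDML field element sits on a single line, that it carries a show="..." attribute, and that the ARP sender fields are named arp.src.hw_mac and arp.src.proto_ipv4.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not from arpmon.php): copy the show="..." value of
 * the PDML field called `field` from `line` into `out`.
 * Returns 0 on success, -1 if the field is absent or `out` is too small.
 */
int pdml_field_show(const char *line, const char *field,
                    char *out, size_t outlen)
{
    char key[128];
    static const char attr[] = " show=\"";
    const char *p, *end;
    int n;

    assert(line != NULL && field != NULL && out != NULL);

    /* Build the attribute we expect to find: name="<field>" */
    n = snprintf(key, sizeof key, "name=\"%s\"", field);
    if (n < 0 || (size_t)n >= sizeof key)
        return -1;

    p = strstr(line, key);
    if (p == NULL)
        return -1;

    /* The leading space keeps us from matching showname="..." */
    p = strstr(p, attr);
    if (p == NULL)
        return -1;
    p += sizeof attr - 1;

    end = strchr(p, '"');
    if (end == NULL || (size_t)(end - p) >= outlen)
        return -1;

    memcpy(out, p, (size_t)(end - p));
    out[end - p] = '\0';
    return 0;
}
```

A wrapper would call this on each line read from tethereal's output, once for the sender MAC field and once for the sender IP field, and store the pair once both have been seen. A real XML parser would be more robust than string searching; this only shows the shape of the job.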
So, with PDML ruled out, I was left with having to code a lower-level solution. Using libpcap and libmysqlclient, I wrote a simple C program (arpmon.cpp) to do what needed to be done. And it does it fast.
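The core of the lower-level approach is small: once a capture library hands over a raw frame, the sender's MAC and IP sit at fixed offsets in the ARP packet. The following is a rough sketch of that step only, not code taken from arpmon.cpp. The names arp_mapping and parse_arp_frame are made up for illustration, and the sketch assumes untagged Ethernet frames (no 802.1Q VLAN tag) carrying IPv4 ARP.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record type; not taken from arpmon.cpp. */
struct arp_mapping {
    char mac[18];  /* "aa:bb:cc:dd:ee:ff" plus NUL */
    char ip[16];   /* "255.255.255.255" plus NUL */
};

/*
 * Extract the sender MAC/IP pair from a raw, untagged Ethernet frame
 * holding an IPv4-over-Ethernet ARP packet.
 * Returns 0 on success, -1 if the frame is too short or not such a packet.
 */
int parse_arp_frame(const uint8_t *frame, size_t len, struct arp_mapping *out)
{
    const uint8_t *arp, *sha, *spa;

    assert(frame != NULL && out != NULL);

    /* 14-byte Ethernet header + 28-byte ARP packet */
    if (len < 42)
        return -1;

    /* EtherType 0x0806 means ARP */
    if (frame[12] != 0x08 || frame[13] != 0x06)
        return -1;

    arp = frame + 14;

    /* Hardware type 1 (Ethernet), protocol type 0x0800 (IPv4),
       hardware address length 6, protocol address length 4 */
    if (arp[0] != 0x00 || arp[1] != 0x01 ||
        arp[2] != 0x08 || arp[3] != 0x00 ||
        arp[4] != 6 || arp[5] != 4)
        return -1;

    sha = arp + 8;   /* sender hardware address, 6 bytes */
    spa = arp + 14;  /* sender protocol address, 4 bytes */

    snprintf(out->mac, sizeof out->mac, "%02x:%02x:%02x:%02x:%02x:%02x",
             sha[0], sha[1], sha[2], sha[3], sha[4], sha[5]);
    snprintf(out->ip, sizeof out->ip, "%d.%d.%d.%d",
             spa[0], spa[1], spa[2], spa[3]);
    return 0;
}
```

In a full program, a function like this would presumably be called from the packet callback passed to libpcap's pcap_loop, with a capture filter of "arp" applied beforehand, and the resulting pair would be written out through libmysqlclient. Because the parser keeps no per-connection state, its memory use stays flat no matter how long it runs.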
Source Code Archive
All code referenced here is licensed under the GPL.
top output from PHP/PDML solution
PID    USER      PR  NI  VIRT   RES   SHR   S  %CPU  %MEM  TIME+     COMMAND
29315  root      16  0   15864  1744  1448  R  7.7   0.1   19:29.39  arpmon.php
29318  root      17  0   1727m  967m  5496  R  3.7   63.7  9:40.73   tethereal
29311  root      16  0   14256  1744  1448  S  1.0   0.1   5:21.29   arpmon.php
29314  root      15  0   479m   453m  5500  S  0.7   29.9  2:34.82   tethereal
top output from C/PCAP solution
679    root      15  0   5536   1336  1068  S  0.7   0.1   0:00.31   arpmon
1631   balleman  16  0   2984   940   756   R  0.3   0.1   0:00.40   top
1685   root      15  0   4452   1332  1068  S  0.3   0.1   0:00.03   arpmon