myNMS

Maintenance

Housekeeping

There is at present no automatic mechanism for removing old records from any of the tables which keep historical information. In the case of the IP_MAC table this is intentional as the information kept on past usage of IP addresses is considered useful, and the table is relatively small.
myNMS> echo "select count(*) from IP_MAC" | mysql -t NMS
+----------+
| count(*) |
+----------+
|    16145 |
+----------+

myNMS> echo "select count(*) from IP_MAC where time2<`DB_maint --showtime NOW - QUARTER`" | mysql NMS
+----------+
| count(*) |
+----------+
|     8824 |
+----------+

In the case of the IP_MAC_user table however the table itself is relatively large, and historical records form a greater proportion of the whole:

myNMS> echo "select count(*) from IP_MAC_user" | mysql NMS
+----------+
| count(*) |
+----------+
|   398065 |
+----------+

myNMS> echo "select count(*) from IP_MAC_user where mtime<`DB_maint --showtime NOW - QUARTER`" | mysql NMS
+----------+
| count(*) |
+----------+
|   270258 |
+----------+

myNMS>  echo "select count(*) from IP_MAC_user where mtime<`DB_maint --showtime NOW - QUARTER - QUARTER`" | mysql -t NMS
+----------+
| count(*) |
+----------+
|   208687 |
+----------+

For this table old records should be deleted periodically, e.g.:
myNMS>  echo "delete from IP_MAC_user where mtime < `DB_maint --showtime NOW - QUARTER - QUARTER`" | mysql NMS 


IP_MAC_user duplicates (or consecutive records)

For some reason the IP_MAC_user table may acquires duplicate records (with the same IP, MAC, user and time). It will also record the same IP, MAC and user with consecutive times where the times are separated by more than a week (this is due to the way LOG_proc works). A bad example of the problem looked like this:

This can be fixed using

myNMS>  DB_maint --undup NMS IP_MAC_user 
Whilst working DB_maint will report duplicates and consecutive time records it finds e.g.:
Duplicate records for IP/MAC/user/time: 192.171.166.6   0800208694df sgshoges 20010215170500
Duplicate records for IP/MAC/user/time: 192.171.166.6   0800208694df sgshoges 20010215170500
Duplicate records for IP/MAC/user/time: 192.171.166.6   0800208694df sgshoges 20010215170500
Duplicate records for IP/MAC/user/time: 192.171.166.6   0800208694df sgshoges 20010215170500
Duplicate records for IP/MAC/user/time: 192.171.166.6   0800208694df sgshoges 20010215170500
Duplicate records for IP/MAC/user/time: 192.171.166.6   0800208694df sgshoges 20010215170500
Duplicate records for IP/MAC/user/time: 192.171.166.6   0800208694df sgshoges 20010215170500
Duplicate records for IP/MAC/user/time: 192.171.166.6   0800208694df sgshoges 20010215170500
Duplicate records for IP/MAC/user/time: 192.171.166.6   0800208694df sgshoges 20010215170500
Duplicate records for IP/MAC/user/time: 192.171.166.6   0800208694df sgshoges 20010215170500
Duplicate records for IP/MAC/user/time: 192.171.166.6   0800208694df sgshoges 20010215170500
Consecutive records for IP, MAC, user : 192.171.166.6   0800208694df sgshoges
        mtime(1): 20010215170500
        mtime(2): 20010216180820
Consecutive records for IP, MAC, user : 192.171.166.6   0800208694df sgshoges
        mtime(1): 20010216180820
        mtime(2): 20010219180500
Consecutive records for IP, MAC, user : 192.171.166.6   0800208694df sgshoges
        mtime(1): 20010219180500
        mtime(2): 20010220121320
Consecutive records for IP, MAC, user : 192.171.166.6   0800208694df sgshoges
        mtime(1): 20010220121320
        mtime(2): 20010221181820
Consecutive records for IP, MAC, user : 192.171.166.6   0800208694df sgshoges
        mtime(1): 20010221181820
        mtime(2): 20010222134820
If this has to be done repeatedly it could be performed by a cron job daily, weekly or monthly.

The result will be that the query should produce the correct result:


mySQL table errors

Occasionally errors may occur in some of mySQL's tables. These may be indicated by errors in results returned by Query for e.g. SNMP devices, such as several records for the same IP address of a device's interface:

Another indication is messages in log files like:

DBD::mysql::db do failed: Duplicate entry '947188840-20' for key 2 at ./SNMP_info line 762.

DBD::mysql::db do failed: Incorrect key file for table: 'SNMP_ipAddrTable'. Try to repair it at /usr/local/myNMS/bin/SNMP_info line 788.
	
When these occur the tool isamchk can be used to check for and repair errors. To use the tool specify (or change into) the directory containing the files mySQL uses to store the affected table, and specify the table name (file name without any extension) e.g.:
myNMS> isamchk NMS/SNMP_ipAddrTable
Checking ISAM file: NMS/SNMP_ipAddrTable
Data records:    1356   Deleted blocks:       3
- check file-size
- check delete-chain
- check index reference
- check data record references index: 1
isamchk: error: Found 1350 keys of 1356
ISAM-table 'NMS/SNMP_ipAddrTable' is corrupted
Fix it using switch "-r" or "-o"

myNMS> isamchk -r NMS/SNMP_ipAddrTable
- recovering ISAM-table 'NMS/SNMP_ipAddrTable.ISM'
Data records: 1356
- Fixing index 1
- Fixing index 2
- Fixing index 3

myNMS> isamchk NMS/SNMP_ipAddrTable
Checking ISAM file: NMS/SNMP_ipAddrTable
Data records:    1356   Deleted blocks:       0
- check file-size
- check delete-chain
- check index reference
- check data record references index: 1
- check data record references index: 2
- check data record references index: 3
	

Another example:

DBD::mysql::st execute failed: Got error 127 from table handler at /export/www/noc/Query.cgi line 1637.
	
Examining the code of the Query script (symlinked as Query.cgi) shows that one of the tables being accessed is SNMP_idAddrTable, so isamchk is run to check this table.
myNMS> isamchk NMS/SNMP_ipAddrTable
Checking ISAM file: NMS/SNMP_ipAddrTable
Data records:    1859   Deleted blocks:      30
- check file-size
isamchk: error: Size of datafile is: 83028            Should be: 83116
- check delete-chain
isamchk: error: delete-link-chain corrupted
- check index reference
- check data record references index: 1
isamchk: error: Found key at page 57344 that points to record outside datafile
ISAM-table 'NMS/SNMP_ipAddrTable' is corrupted
Fix it using switch "-r" or "-o"
	

Alternatively isamchk can been run against all tables, using the -s or --silent flag to report only errors found :

myNMS> isamchk -s NMS/*.ISM
isamchk: ISAM file /DATA/mySQLdata/NMS/SNMP_ipAddrTable.ISM
isamchk: error: Size of datafile is: 58300            Should be: 112992
isamchk: error: Found key at page 1024 that points to record outside datafile
ISAM-table '/DATA/mySQLdata/NMS/SNMP_ipAddrTable.ISM' is corrupted
Fix it using switch "-r" or "-o"
	

Sometimes even the -r (--recover) option will not fix the problem (as shown by continuation of the symptoms, and by isamchk continuing to report errors in the table) and the -o or --safe-recover old recovery method ("slower than '-r' but can handle a couple of cases that '-r' cannot handle") is required:

myNMS> isamchk -o /DATA/mySQLdata/NMS/SNMP_ipAddrTable
- recovering ISAM-table '/DATA/mySQLdata/NMS/SNMP_ipAddrTable.ISM'
Data records: 1350
Data records: 1325
scinfo@sunm2> isamchk /DATA/mySQLdata/NMS/SNMP_ipAddrTable
Checking ISAM file: /DATA/mySQLdata/NMS/SNMP_ipAddrTable
Data records:    1325   Deleted blocks:       0
- check file-size
- check delete-chain
- check index reference
- check data record references index: 1
- check data record references index: 2
- check data record references index: 3
In extreme cases it may be necessary to shut down and restart mysql:
mysqladmin --user=root shutdown
then as unix root:
/etc/init.d/mysql.server start


Inconsistent data in database

There have been instances where data in a switch's SNMP_system and SNMP_ipAddrTable does not correspond with its SNMP_ifTable and dot1d table, as seen by comparing IP addresses in the SNMP_ipAddrTable and those in the dot1d table e.g.

This is probably caused by bugs in the SNMP_info functions updating devices' SNMP and dot1d table info when some aspect of the device's configuration (e.g. IP address) changed. To remedy this delete the device in the mysql tool:

myNMS> DB_maint --delete-device NMS 947197967
Deleting device 947197967 from database NMS tables:
        SNMP_system
        SNMP_ifTable
        SNMP_ipAddrTable
        dot1d
        MAC_connections
 - DONE

	
Then re-discover the device manually:
	SNMP_info --snmp {IP address of switch}
	SNMP_info --dot1d {IP address of switch}