Deleting NLM log and var log entries

Hi,

I would like to confirm an observed behaviour. I may be wrong here, but this is what I see. Sorry for the long post.

When traps are sent, they are logged in both the nlm log and var log tables. The nlm log table entry OID ends with a counter, and the corresponding var log entry has an OID with the same counter plus an index. This setup works as expected.

However, an issue arises during deletion. In the nlmLogEntry::check_limits function, when the notification log exceeds its set limit, it removes the oldest entry. When an nlm log entry is deleted, the corresponding var log entry should also be removed. For this, the nlmLogVariableEntry::row_delete function is called with the nlm log entry’s OID, ending with the counter.

Within nlmLogVariableEntry::row_delete, the OidListCursor::lookup function attempts to find the matching OID in var log using an OID like x.100. This is matched against entries in the var log table, which have OIDs like x.100.0. The function uses OidxPtrEntryPtrAVLMap::seek_inexact to find the closest match, typically locating x.100.0. However, a subsequent comparison in OidListCursor::lookup checks if the found OID is greater than the target OID (x.100). This comparison always returns true due to the length difference (according to Oid::nCompare logic), leading it to select the previous entry, x.99.y, resulting in a failed match. Consequently, the corresponding entry in var log is not deleted, causing memory to grow over time.
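
To make the ordering problem concrete, here is a minimal sketch that does not use AGENT++ at all. It assumes an OID can be modelled as a `std::vector<unsigned long>` (whose built-in comparison is lexicographic, so a prefix sorts before any longer OID) and it models the lookup as a "greatest key not above the target" search on a `std::map`. The names `Oid`, `floor_lookup`, `make_var_log` and `demonstrate_failed_match` are made up for illustration; the real `OidListCursor::lookup` and `seek_inexact` may differ in detail.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for an OID suffix: a sequence of sub-identifiers.
// std::vector compares lexicographically, so {100} < {100, 0}.
using Oid = std::vector<unsigned long>;

// Floor-style lookup: greatest key <= target, or end() if there is none.
template <typename Map>
typename Map::const_iterator floor_lookup(const Map& table, const Oid& target) {
    auto it = table.upper_bound(target);  // first key strictly greater than target
    if (it == table.begin()) {
        return table.end();               // every key is greater than target
    }
    return --it;                          // step back to the previous entry
}

// Toy var log table: two notifications (99 and 100) with two variables each,
// using the 0-based variable index described in the post.
std::map<Oid, std::string> make_var_log() {
    return {
        {Oid{99, 0}, "vb"}, {Oid{99, 1}, "vb"},
        {Oid{100, 0}, "vb"}, {Oid{100, 1}, "vb"},
    };
}

// Reproduces the failed match: looking up x.100 lands on x.99.1.
void demonstrate_failed_match() {
    const auto table = make_var_log();
    const auto hit = floor_lookup(table, Oid{100});
    assert(hit != table.end());
    assert(hit->first == (Oid{99, 1}));   // a row of the previous notification
}
```

In this model every row of notification 100 compares greater than the bare key x.100, so a floor-style search can only ever return a row that belongs to notification 99.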

One possible solution is to append a zero to the OID being looked up in var log. For instance, if the lookup key is x.100, change it to x.100.0 before matching.

Is there an alternative solution to this issue?

From the index definition of nlmLogVariableIndex, which is

nlmLogVariableIndex OBJECT-TYPE
        SYNTAX  Unsigned32 (1..4294967295)
        MAX-ACCESS not-accessible
        STATUS  current
        DESCRIPTION
                "A monotonically increasing integer, starting at 1 for a given
                nlmLogIndex, for indexing variables within the logged
                Notification."
        -- 1.3.6.1.2.1.92.1.3.2.1.1
        ::= { nlmLogVariableEntry 1 }

an entry with a 100.0 suffix should never exist. I am currently verifying whether and how the code needs to be fixed. I will get back to you soon…

The rule that nlmLogVariableIndex starts at 1 is violated by AGENT++ prior to the upcoming version 4.6.2.

I am now verifying the comparison on deletion.

Appending .0 to the nlmLogEntry index is not sufficient and does not really fix the bug. I have developed a fix that covers the case where the lower-bound search actually hits an entry with a lower OID than the first nlmLogVariableEntry of the deleted nlmLogEntry.
I will test it now, include it in the next release, and then post the patch here.
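
Until the patch is posted, the general idea of deleting all rows below an index prefix can be sketched as follows. This is not the AGENT++ fix, only a generic illustration on a `std::map`, again assuming an OID is modelled as a `std::vector<unsigned long>`. The names `Oid`, `has_prefix` and `erase_rows_with_prefix` are hypothetical. The point is that each candidate row is checked against the prefix, so the loop removes nothing if the search starts on a row of a different notification, and it works whether the variable index starts at 0 or at 1.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for an OID suffix: a sequence of sub-identifiers.
using Oid = std::vector<unsigned long>;

// True if `oid` begins with all sub-identifiers of `prefix`.
bool has_prefix(const Oid& oid, const Oid& prefix) {
    return oid.size() >= prefix.size() &&
           std::equal(prefix.begin(), prefix.end(), oid.begin());
}

// Erase every row whose key starts with `prefix`; returns the number removed.
std::size_t erase_rows_with_prefix(std::map<Oid, std::string>& table,
                                   const Oid& prefix) {
    assert(!prefix.empty());              // an empty prefix would match every row
    std::size_t removed = 0;
    auto it = table.lower_bound(prefix);  // first key >= prefix
    while (it != table.end() && has_prefix(it->first, prefix)) {
        it = table.erase(it);             // erase returns the following iterator
        ++removed;
    }
    return removed;
}
```

Because rows sharing a prefix are contiguous in lexicographic order, the loop stops at the first row that belongs to the next notification.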

Interesting. I didn’t read the definition that carefully.
Is there a patch that fixes the .0 index issue as well?

Thank you so much. Will wait for the fix.

Sure. In line 1290 of notification_log_mib.cpp, add “+1” to the index sub-identifier “i”:

		for (int i=0; i<vbcount; i++) {
			logVariableEntry->add_variable(newIndex, i+1, vbs[i]);
		}
