TableUtils, TableListener.next and rowCache

steffen · October 17, 2022, 3:31pm

We use the TableUtils class to retrieve SNMP tables. To be able to perform some specific checks while retrieving a table, we implemented our own TableListener. The additional checks are conducted in next(TableEvent).

A few weeks ago, Java produced a heap dump (-XX:+HeapDumpOnOutOfMemoryError) because our process was taking up too much RAM. One of my colleagues found that TableRequest’s internal row cache (rowCache) contained references to several million TableRow instances. Our custom TableListener has mechanisms to abort table retrieval, for example when a table becomes too large. It also has mechanisms that may pause table retrieval for a certain time under certain conditions. However, the heap dump gave us the impression that TableUtils can technically continue retrieving the table in the background before next(TableEvent) returns either true (to retrieve more data) or false (to finish table retrieval).

Is it possible that TableUtils keeps retrieving data without sending rows to our TableListener?

AGENTPP · October 17, 2022, 6:32pm

Yes, of course. TableUtils may not send a single row to the TableListener until the agent has returned at least all requested variable bindings for the first row. That is because TableUtils ensures that table events are returned in lexicographic order.
If the agent fails to return objects in correct lexicographic order, the table retrieval could get into an endless loop. To avoid this, the third lexicographic ordering error makes TableUtils stop the retrieval and return an error status.
For large tables, however, such lexicographic errors might be detected too late.
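The ordering check described above can be pictured with a small, self-contained model. This is only a sketch and not SNMP4J code: the class name LexicographicGuard, its methods, and the representation of OIDs as int arrays are assumptions made for illustration; only the "stop on the third error" rule comes from the post above.

```java
/** Simplified model of a manager-side lexicographic ordering check (hypothetical, not SNMP4J code). */
final class LexicographicGuard {
    private final int maxErrors; // the post above mentions stopping on the third error
    private int errors;

    LexicographicGuard(int maxErrors) {
        this.maxErrors = maxErrors;
    }

    /** Compares two OIDs sub-identifier by sub-identifier; a shorter prefix sorts first. */
    static int compareOids(int[] a, int[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            if (a[i] != b[i]) {
                // Sub-identifiers are unsigned 32-bit values.
                return Integer.compareUnsigned(a[i], b[i]);
            }
        }
        return Integer.compare(a.length, b.length);
    }

    /**
     * Records one response. A returned OID that is not strictly greater than the
     * requested OID counts as an ordering error. Returns true while retrieval may continue.
     */
    boolean accept(int[] requested, int[] returned) {
        if (compareOids(returned, requested) <= 0) {
            errors++;
        }
        return errors < maxErrors;
    }

    int errorCount() {
        return errors;
    }
}
```

Note that an agent which keeps producing strictly increasing indexes never trips a check of this kind, which is why the error counter can stay at 0 while the amount of retrieved data keeps growing.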

Are you encountering the issue with sparse or dense tables on the agent side?

steffen · October 18, 2022, 11:18pm

Thanks for your answer! Yes, actually we’ve been talking about that same idea. According to the heap dump, TableUtils$TableRequest.numLexicographicErrors is 0. Today, we collected additional data from the table request:

org.snmp4j.util.TableUtils.TableRequest takes 11G
org.snmp4j.util.TableUtils.TableRequest.rowCache > 8,000,000 entries
org.snmp4j.util.TableUtils.TableRequest.sparseTableMode = sparseTable
org.snmp4j.util.TableUtils.TableRequest.numLexicographicErrors = 0

We also count the number of variable bindings we receive via next(TableEvent), and it seems we got far fewer variable bindings than the cache holds. To me, this only makes sense if TableUtils keeps retrieving more data even though numLexicographicErrors = 0 and next() hasn’t returned. But this behavior would seem a bit strange to me.

AGENTPP · October 19, 2022, 12:11am

Of course, a lexicographic error can only be detected by the manager if an index less than or equal to the requested index is received. If the agent “generates” indexes on the fly or has millions of rows in a table (which I doubt), then of course the cache could contain a lot of rows.
The rows are only propagated to the caller of TableUtils.getTable if the first row in the cache is complete. Complete means that there was a response (or timeout) for all columns requested.

In theory, an agent could return fewer VariableBindings than requested, so that some column instances will never be returned.
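The release rule described above (rows leave the cache only while the first cached row is complete) can be sketched with a toy model. Everything in it is an assumption for illustration: RowCacheModel, integer row indexes, and String cell values are stand-ins, and the real TableUtils also treats timeouts and columns that have moved past a row as responses, which this sketch leaves out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Toy model of a row cache that releases rows strictly from the head (hypothetical, not SNMP4J code). */
final class RowCacheModel {
    private final int columnCount;
    // Rows sorted by index; a null cell means "no response for this column yet".
    private final TreeMap<Integer, String[]> rowCache = new TreeMap<>();
    private final List<Integer> released = new ArrayList<>();

    RowCacheModel(int columnCount) {
        this.columnCount = columnCount;
    }

    /** Stores one cell, then releases every complete row at the head of the cache. */
    void put(int rowIndex, int column, String value) {
        rowCache.computeIfAbsent(rowIndex, k -> new String[columnCount])[column] = value;
        while (!rowCache.isEmpty() && isComplete(rowCache.firstEntry().getValue())) {
            released.add(rowCache.pollFirstEntry().getKey());
        }
    }

    private static boolean isComplete(String[] row) {
        for (String cell : row) {
            if (cell == null) {
                return false;
            }
        }
        return true;
    }

    int cachedRows() {
        return rowCache.size();
    }

    List<Integer> releasedRows() {
        return released;
    }
}
```

In this model, a single missing cell in the first row keeps every later row in the cache, even when those later rows are complete. That matches the symptom in the heap dump: many cached rows, few rows delivered to the listener.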

You can set the retrieval mode to denseTableDoubleCheckIncompleteRows to make sure that missing column values are really missing, and not merely absent because the agent is not able to process GETNEXT or GETBULK correctly. In this mode, the row cache should always be quite small.

In any case, it would surprise me if this abnormal cache behaviour occurred with a standards-conforming agent. Therefore, it would be great if you could share the first couple of rows in the cache together with the column OIDs of the request. Even better would be the returned PDUs as a hex dump from the SNMP4J debug log.

AGENTPP · October 22, 2022, 9:47am

Do you have an update for me about your recent findings yet?

Meanwhile, I have created an additional unit test to better understand how this issue could occur and what the possible improvement options are.

Here is the test configuration:

    /**
     * The tested columns.
     */
    private static final int[] TEST_EXTREMELY_SPARSE_TABLE_COLUMS = { 1, 2, 3, 4, 5, 6 };
    /**
     * These are the requests that are sent on behalf of TableUtils when doing a table walk for
     * {@link #TEST_EXTREMELY_SPARSE_TABLE_RESPONSE_PDUS} simulated table data. For better readability, only the
     * suffixes of the OIDs are included. The required prefixes are added programmatically in the test.
     */
    private static final int[][][] TEST_EXTREMELY_SPARSE_TABLE_REQUEST_PDUS =
            {
                    { { 1 }, { 2 }, { 3 }, { 4 }, { 5 }, { 6 } },
                    { { 1,103 }, { 2,104 }, { 3,103 }, { 5,104 }, { 6,103 } },
                    { { 1,105 }, { 2,106 }, { 3,105 } },
            };
    /**
     * This table contains one GETBULK response PDU per first array dimension. In the second dimension, the
     * {@link OID} of each {@link VariableBinding} returned by the simulated agent to the tested
     * {@link TableUtils} is given. Again, only the suffixes of the OIDs are included here.
     */
    private static final int[][][] TEST_EXTREMELY_SPARSE_TABLE_RESPONSE_PDUS =
            {
                    { { 1,100 }, { 2,100 }, { 3,100 }, { 5,100 }, { 5,100 }, { 6,101 } ,
                      { 1,102 }, { 2,103 }, { 3,102 }, { 5,101 }, { 5,101 }, { 6,102 } ,
                      { 1,103 }, { 2,104 }, { 3,103 }, { 5,104 }, { 5,104 }, { 6,103 } },
                    { { 1,104 }, { 2,105 }, { 3,104 }, /*{ 5,105 },*/ { 5,106 }, { 6,104 },
                      { 1,105 }, { 2,106 }, { 3,105 }, /*{ 5,106 },*/ { 6,100 }, { 8,105 },
                      { 2,107 }, { 3,100 }, { 4,102 }, /*{ 6,100 },*/ { 7,101 } },
                    { { 2,107 }, { 3,105 }, { 4,104 },
                      { 2,108 }, { 3,106 }, { 4,105 },
                      { 2,109 }, { 3,107 }, { 4,106 } },

            };
    /**
     * The {@link #TEST_EXTREMELY_SPARSE_TABLE_EXPECTED_ROWS} array contains the {@link TableEvent} row structure
     * expected to be returned by the tested {@link TableUtils}.
     */
    private static final int[][][] TEST_EXTREMELY_SPARSE_TABLE_EXPECTED_ROWS = {
            { { 1,100 }, { 2,100 }, { 3,100 }, null, { 5,100 }, null },
            { null,      null,      null,      null, { 5,101 }, { 6,101 } },
            { { 1,102 }, null,      { 3,102 }, null, null,      { 6,102 } },
            { { 1,103 }, { 2,103},  { 3,103 }, null, null,      { 6,103 } },
            { { 1,104 }, { 2,104},  { 3,104 }, null, { 5,104 }, { 6,104 } },
            { { 1,105 }, { 2,105 }, { 3,105 }, null, null,      null },
            { null,      { 2,106 }, null,      null, { 5,106 }, null },
    };

The test succeeds and shows that TableUtils optimises pretty well when columns finish early: column 4 is no longer requested after the first request, and columns 5 and 6 are no longer requested after the second.

A couple of further improvements are possible, but all of them add complexity.

For your issue with the large row cache, the most relevant fact is this: if the last column of the first row in the cache is not returned by the agent (as in this test case), the row cache grows until the last row is received.
This behaviour can be improved as follows:

1. Whenever a column is recognised as finished, TableUtils must check whether TableEvents in the cache can be released to the listener, because those rows can now implicitly be taken as finished.
2. A global row cache limit could be implemented that forces the release of row TableEvents once the cache holds more than 2 times (configurable) as many rows as the maxRepetitions configured for GETBULK.

The second optimisation might cause sparse rows in large tables that are created dynamically during retrieval to be returned incomplete when column retrieval is distributed over several PDUs. But this is really an edge case, and it can be mitigated by the denseTableDoubleCheckIncompleteRows option.
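Both proposed improvements can be illustrated with a toy cache. This is a minimal sketch under stated assumptions, not the SNMP4J implementation: BoundedRowCache, finishColumn, and the limitFactor parameter are hypothetical names, row indexes are plain integers, and cell values are Strings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Toy row cache with finished-column release and a size limit (hypothetical, not SNMP4J code). */
final class BoundedRowCache {
    private final TreeMap<Integer, String[]> rowCache = new TreeMap<>();
    private final boolean[] columnFinished;
    private final int maxCachedRows;
    private final List<Integer> released = new ArrayList<>();

    BoundedRowCache(int columnCount, int maxRepetitions, int limitFactor) {
        this.columnFinished = new boolean[columnCount];
        // Improvement 2: limit expressed as a multiple of maxRepetitions.
        this.maxCachedRows = maxRepetitions * limitFactor;
    }

    void put(int rowIndex, int column, String value) {
        rowCache.computeIfAbsent(rowIndex, k -> new String[columnFinished.length])[column] = value;
        releaseRows();
    }

    /** Improvement 1: a finished column cannot deliver more cells, so re-check the head rows. */
    void finishColumn(int column) {
        columnFinished[column] = true;
        releaseRows();
    }

    private void releaseRows() {
        // Release head rows while they are done, or while the cache exceeds its limit
        // (in the second case a released row may still be incomplete).
        while (!rowCache.isEmpty()
                && (isDone(rowCache.firstEntry().getValue()) || rowCache.size() > maxCachedRows)) {
            released.add(rowCache.pollFirstEntry().getKey());
        }
    }

    private boolean isDone(String[] row) {
        for (int c = 0; c < row.length; c++) {
            if (row[c] == null && !columnFinished[c]) {
                return false;
            }
        }
        return true;
    }

    int cachedRows() {
        return rowCache.size();
    }

    List<Integer> releasedRows() {
        return released;
    }
}
```

With the limit in place, the cache size is bounded regardless of how many rows the agent produces, at the price that a row released because of the limit can be incomplete.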

I am going to implement the first proposed optimisation first…

steffen · October 24, 2022, 5:06pm

Hi Frank

Thanks for investigating this. And sorry for not answering right away: to avoid giving you vague or unrelated information, we tried to do some analysis up front. Here are some more details:

> If the agent “generates” indexes on the fly or has millions of rows in a table (which I doubt), then of course the cache could contain a lot of rows.

The table we’ve been retrieving was a TimeFilter table (RFC 4502: Remote Network Monitoring Management Information Base Version 2). This standard actually allows a device to generate indexes on the fly.

In general, we have two different methods for retrieving tables. The naive method simply asks for the entire table. The second method is bound to the index range new OID(0)…new OID(1). The naive approach is good for normal SNMP tables (including time-filtered tables where the firmware simply returns TimeFilter=0 for all rows). It also works well for devices whose TimeFilter implementation skips redundant values (as suggested by the RFC).

In our case, however, the naive approach was used and the device basically produced one row per table and per SysUpTime. Needless to say, devices can technically always return more data than we can process, e.g. because (strictly monotonically increasing) index values are generated on the fly. To deal with both problems (1: applying the wrong method for time-filtered tables, and 2: firmware bugs), we count the variable bindings we receive in next().
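The counting approach described above can be sketched as a small budget object that a listener consults for each delivered row. This is an illustrative assumption, not our production code: VariableBindingBudget and onRow are hypothetical names, and the boolean result is meant to be passed through as the return value of next(TableEvent) (true to continue, false to stop).

```java
/** Counts delivered variable bindings and signals when a budget is exhausted (hypothetical helper). */
final class VariableBindingBudget {
    private final long maxVariableBindings;
    private long received;

    VariableBindingBudget(long maxVariableBindings) {
        this.maxVariableBindings = maxVariableBindings;
    }

    /**
     * Call once per delivered row with the number of variable bindings it carries.
     * Returns true while the budget is not exceeded, false once retrieval should stop.
     */
    boolean onRow(int variableBindingsInRow) {
        received += variableBindingsInRow;
        return received <= maxVariableBindings;
    }

    long received() {
        return received;
    }
}
```

A counter like this only sees rows that actually reach the listener. Rows that are still held in the row cache are not counted, so it cannot bound the memory used by the cache itself.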

After seeing the huge rowCache, we actually looked for a TableUtils.setRowLimit() method. This is very close to what you proposed. But I’m not sure that the max-repetitions count plays a role here: in our heap dump, rowCache grew to > 8,000,000 entries.

I didn’t understand what you meant by “whenever a column is recognized to be finished”. (I assume you meant “row” instead of “column”.) I thought this was already the case. I’m also not sure if switching to a DenseTableRequest would help. If TableUtils endlessly receives the first n columns of a row, it will never get to the next columns, will it?

Best regards
Steffen

AGENTPP · October 25, 2022, 2:10pm

Hi Steffen,

I really meant column in the following snippet:

TableUtils retrieves table data by column. The row index is something that does not really exist in the SNMP table concept, because you can only retrieve NEXT instances and not complete rows.

The latest SNMP4J 3.7.4 snapshot contains code to release rows in the cache that are already finished by SNMP standard rules, based on the table data requested up to that point (i.e., if rows are dynamically added by the agent, the released rows might still be incomplete, but that cannot be improved by SNMP means except by using denseTableDoubleCheckIncompleteRows).

The TableUtils.setRowLimit method is implemented too, but it is not yet covered by a unit test (one will be added soon).

steffen · October 27, 2022, 4:13pm

Very cool, thanks!

> the released rows might still be incomplete

If that means that additional row data could come later (next() is called twice for the same row), this would not be a problem, since we already support merging row data. (That is due to another problem with a non-standards-conforming agent.)
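Merging row data that arrives in more than one delivery can be sketched as follows. This is a simplified illustration, not our actual code: RowMerger is a hypothetical name, row indexes are represented as dotted strings, and column values as Strings keyed by column number.

```java
import java.util.Map;
import java.util.TreeMap;

/** Merges partial deliveries of the same row, keyed by row index (hypothetical helper). */
final class RowMerger {
    // row index (dotted string) -> column number -> value
    private final Map<String, Map<Integer, String>> rows = new TreeMap<>();

    /** Adds the non-null cells of one delivery to whatever is already known for that row. */
    void onRow(String index, Map<Integer, String> columns) {
        Map<Integer, String> merged = rows.computeIfAbsent(index, k -> new TreeMap<>());
        for (Map.Entry<Integer, String> cell : columns.entrySet()) {
            if (cell.getValue() != null) {
                // A missing cell in a later delivery must not erase an earlier value.
                merged.put(cell.getKey(), cell.getValue());
            }
        }
    }

    Map<Integer, String> row(String index) {
        return rows.get(index);
    }

    int rowCount() {
        return rows.size();
    }
}
```

With a merge step like this, a second call for the same row index only fills in the cells that were missing from the first delivery.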

AGENTPP · October 27, 2022, 5:01pm

Yes, the missing row data would be returned in later TableEvents.

BTW, the row limit is now ready too. Unit test works. See the latest 3.7.4 snapshot.