Possible Memory Leak in SocketTimeout - TimerTask instances accumulate in Timer queue

Hello Frank / SNMP4J Community,

We are facing a potential memory leak in SocketTimeout when using DefaultTcpTransportMapping [SNMP4J version: 3.8.2] with connection timeout enabled. Over time, SocketTimeout instances accumulate in the Timer queue, leading to increased heap usage and eventually an OutOfMemoryError (after ~30 days) in our long-running application.

sh-4.4$ date
Thu Jan 22 10:39:54 UTC 2026
sh-4.4$ jcmd 27 GC.class_histogram | grep -i SocketTimeout
  23:          5633         270384  org.snmp4j.transport.SocketTimeout
sh-4.4$ date
Thu Jan 22 11:04:57 UTC 2026
sh-4.4$ jcmd 27 GC.class_histogram | grep -i SocketTimeout
  23:          5633         270384  org.snmp4j.transport.SocketTimeout

At a high level, it appears that when rescheduling (the else branch), a new SocketTimeout instance is created and scheduled, but the current SocketTimeout instance (the one executing run()) is never cancelled, nor is the entry set to null in the run() method.

public void run() {
    long now = System.nanoTime();
    if ((transportMapping.getSocketCleaner() == null) ||
            ((now - entry.getLastUse()) / SnmpConstants.MILLISECOND_TO_NANOSECOND >=
                    transportMapping.getConnectionTimeout())) {
        // Socket timed out - close it
    } else {
        // Socket still active - reschedule
        long nextRun = System.currentTimeMillis() +
                (now - entry.getLastUse()) / SnmpConstants.MILLISECOND_TO_NANOSECOND +
                transportMapping.getConnectionTimeout();
        
        SocketTimeout<A> socketTimeout = new SocketTimeout<A>(transportMapping, entry);
        entry.setSocketTimeout(socketTimeout);
        transportMapping.getSocketCleaner().schedule(socketTimeout, nextRun);
    }
}

As a workaround, we are considering disabling the socket cleaner by setting connectionTimeout to 0 on DefaultTcpTransportMapping. However, we believe this could cause other issues, such as accumulation of orphaned connections. Please share your recommendation.

Could you please help validate our analysis and identify any gaps in our understanding? Does SNMP4J have a mechanism to clean up these tasks that we haven't found?

Another minor point: in the same class, the log message reports the elapsed time as if it were milliseconds, but the now variable is computed in nanoseconds (long now = System.nanoTime();).

Regards,

Nihal

From my understanding, the SocketTimeout objects cannot accumulate, because the DefaultTimerFactory uses java.util.Timer, which removes TimerTasks (= SocketTimeout) from its internal queue when they are executed. These one-time-execution tasks do not need to be explicitly cancelled.

So I guess that the memory leak is caused by some other code. Maybe in your application code?
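This behavior of java.util.Timer can be checked with a minimal, self-contained sketch (the class name below is illustrative, not part of SNMP4J): once a one-shot TimerTask has run, the Timer has already removed it from its queue, so TimerTask.cancel() returns false.

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.CountDownLatch;

public class OneShotTimerDemo {
    public static void main(String[] args) throws InterruptedException {
        Timer timer = new Timer(true);
        CountDownLatch done = new CountDownLatch(1);
        TimerTask task = new TimerTask() {
            @Override public void run() { done.countDown(); }
        };
        // One-shot scheduling: Timer removes the task from its queue
        // before invoking run(), so no explicit cancel() is needed.
        timer.schedule(task, 10);
        done.await();
        // cancel() returns false because the one-shot task has already run,
        // i.e. there is nothing left in the queue to cancel.
        System.out.println("cancel() after execution: " + task.cancel()); // prints false
        timer.cancel();
    }
}
```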


Thanks @AGENTPP for taking a look at this. While re-reviewing the SocketTimeout logic, I noticed an additional issue that may explain the observed accumulation of SocketTimeout instances: an absolute timestamp is passed where a relative delay is expected.

In the rescheduling path, the following code computes nextRun:

long nextRun = System.currentTimeMillis() +
        (now - entry.getLastUse()) / SnmpConstants.MILLISECOND_TO_NANOSECOND +
        transportMapping.getConnectionTimeout();

transportMapping.getSocketCleaner().schedule(socketTimeout, nextRun);

However, CommonTimer.schedule(TimerTask task, long delay) ultimately delegates to java.util.Timer.schedule(task, delay), where delay is expected to be a relative delay in milliseconds, not an absolute timestamp.

Internally, Timer does:

sched(task, System.currentTimeMillis() + delay, 0);

As a result, the effective execution time becomes:

System.currentTimeMillis() + nextRun

which effectively schedules the task far in the future. This explains why:

  • Previously scheduled SocketTimeout tasks remain in the Timer queue

  • They are not executed nor eligible for cleanup

  • Heap usage grows over time in long-running systems
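To illustrate the magnitude of the error, here is a minimal arithmetic sketch (the elapsed time and timeout values are hypothetical): passing the absolute nextRun timestamp as a relative delay pushes the effective execution time roughly one full epoch, i.e. decades, into the future.

```java
public class DelayBugDemo {
    public static void main(String[] args) {
        long nowMillis = System.currentTimeMillis();
        long elapsedMillis = 500;        // hypothetical time since last socket use
        long connectionTimeout = 60_000; // hypothetical 60 s connection timeout

        // Buggy computation: this is an absolute wall-clock timestamp...
        long nextRun = nowMillis + elapsedMillis + connectionTimeout;

        // ...but Timer.schedule(task, delay) treats it as a relative delay,
        // so the effective execution time becomes currentTimeMillis() + nextRun.
        long effectiveExecution = nowMillis + nextRun;

        long millisPerYear = 1000L * 60 * 60 * 24 * 365;
        System.out.println("Task would fire ~"
                + (effectiveExecution - nowMillis) / millisPerYear + " years from now");
    }
}
```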

Please refer to the screenshot below. This is from my local debugging setup, where the next SocketTimeout task is rescheduled with a delay of 3538312883774 milliseconds, which is enormous.
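As a quick sanity check on that number, 3538312883774 ms corresponds to roughly 112 years, which is consistent with a value on the order of an epoch timestamp being passed where a small relative delay was expected:

```java
public class ObservedDelayCheck {
    public static void main(String[] args) {
        long observedDelayMillis = 3538312883774L; // delay seen in the debugger
        long millisPerYear = 1000L * 60 * 60 * 24 * 365;
        // 3538312883774 / 31536000000 ≈ 112
        System.out.println("~" + (observedDelayMillis / millisPerYear) + " years"); // prints ~112 years
    }
}
```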

Could you please check this and share your thoughts? Let me know if any additional information is needed from my end; I am happy to provide it.

You are right, this is a regression introduced in SNMP4J 3.0.0. The SocketTimeout objects accumulate if there are concurrent requests while the connection timeout is greater than 0, i.e., if the request frequency is higher than the timeout value.

The fixed run method of SocketTimeout is

    /**
     * Runs a timeout check and if the socket has timed out, it removes the socket from the associated
     * {@link org.snmp4j.TransportMapping}.
     */
    public void run() {
        long now = System.nanoTime();
        if ((transportMapping.getSocketCleaner() == null) ||
                ((now - entry.getLastUse()) / SnmpConstants.MILLISECOND_TO_NANOSECOND >=
                        transportMapping.getConnectionTimeout())) {
            if (logger.isDebugEnabled()) {
                logger.debug("Socket has not been used for " +
                        (now - entry.getLastUse()) +
                        " nanoseconds, closing it");
            }
            AbstractServerSocket<A> entryCopy = entry;
            try {
                transportMapping.close(entryCopy.getPeerAddress());
                logger.info("Socket to " + entryCopy.getPeerAddress() + " closed due to timeout");
            } catch (IOException e) {
                logger.error("Failed to close transport mapping for peer address " +
                        entry.getPeerAddress() + ": " + e.getMessage(), e);
            }
        } else {
            long nextRun = transportMapping.getConnectionTimeout() -
                    (now - entry.getLastUse()) / SnmpConstants.MILLISECOND_TO_NANOSECOND;
            if (nextRun < 0) {
                nextRun = transportMapping.getConnectionTimeout();
            }
            if (logger.isDebugEnabled()) {
                logger.debug("Scheduling " + new Date(System.currentTimeMillis() + nextRun));
            }
            SocketTimeout<A> socketTimeout = new SocketTimeout<A>(transportMapping, entry);
            entry.setSocketTimeout(socketTimeout);
            transportMapping.getSocketCleaner().schedule(socketTimeout, nextRun);
        }
    }
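
For comparison, a quick numeric check of the corrected formula (using the same hypothetical elapsed time and timeout values as above): the computed delay is now a small relative interval, which is what Timer.schedule() expects.

```java
public class FixedDelayDemo {
    public static void main(String[] args) {
        long connectionTimeout = 60_000; // hypothetical 60 s timeout, in ms
        long elapsedMillis = 500;        // hypothetical time since last use

        // Fixed computation: a relative delay rather than an absolute timestamp
        long nextRun = connectionTimeout - elapsedMillis;
        if (nextRun < 0) {
            nextRun = connectionTimeout;
        }
        System.out.println("Reschedule in " + nextRun + " ms"); // prints 59500
    }
}
```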

This bug will be fixed in version 3.9.7.


Thanks @AGENTPP for your response! Could you please let me know when we can expect this release?

Thanks Again!

Version 3.9.7 has already been released and includes some other important fixes regarding TLSTM.

Thanks! Are there any plans to backport this patch to previous versions, e.g., 3.8.x?

No, back-porting is not planned. You can easily upgrade from a 3.8.x version to the latest 3.9.x.