We are facing a potential memory leak in SocketTimeout when using DefaultTcpTransportMapping (SNMP4J 3.8.2) with connection timeout enabled. Over time, SocketTimeout instances accumulate in the Timer queue, driving up heap usage until our long-running app hits an OutOfMemoryError after roughly 30 days.
sh-4.4$ date
Thu Jan 22 10:39:54 UTC 2026
sh-4.4$ jcmd 27 GC.class_histogram | grep -i SocketTimeout
23: 5633 270384 org.snmp4j.transport.SocketTimeout
sh-4.4$ date
Thu Jan 22 11:04:57 UTC 2026
sh-4.4$ jcmd 27 GC.class_histogram | grep -i SocketTimeout
23: 5633 270384 org.snmp4j.transport.SocketTimeout
At a high level, it looks like the rescheduling path (the else branch) creates and schedules a new SocketTimeout instance, but the current instance (the one executing run()) is never cancelled, nor is its entry reference set to null in run().
public void run() {
    long now = System.nanoTime();
    if ((transportMapping.getSocketCleaner() == null) ||
        ((now - entry.getLastUse()) / SnmpConstants.MILLISECOND_TO_NANOSECOND >=
         transportMapping.getConnectionTimeout())) {
        // Socket timed out - close it
    } else {
        // Socket still active - reschedule
        long nextRun = System.currentTimeMillis() +
            (now - entry.getLastUse()) / SnmpConstants.MILLISECOND_TO_NANOSECOND +
            transportMapping.getConnectionTimeout();
        SocketTimeout<A> socketTimeout = new SocketTimeout<A>(transportMapping, entry);
        entry.setSocketTimeout(socketTimeout);
        transportMapping.getSocketCleaner().schedule(socketTimeout, nextRun);
    }
}
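To make the suspicion concrete, here is a minimal standalone sketch (not SNMP4J code, hypothetical values) contrasting the nextRun computed above, which is an absolute wall-clock timestamp, with the relative delay that java.util.Timer.schedule(task, delay) expects:

```java
import java.util.concurrent.TimeUnit;

// Minimal sketch (not SNMP4J code): the buggy nextRun is an absolute
// wall-clock time, but Timer.schedule(task, delay) expects a relative delay.
public class DelayBugSketch {

    // Mirrors the buggy computation: now + elapsed + timeout, all in millis.
    static long buggyNextRun(long nowMillis, long elapsedMillis, long connectionTimeoutMillis) {
        return nowMillis + elapsedMillis + connectionTimeoutMillis;
    }

    public static void main(String[] args) {
        long nowMillis = System.currentTimeMillis();
        long nextRun = buggyNextRun(nowMillis, 1_000, 60_000); // hypothetical values

        // Interpreted as a RELATIVE delay, nextRun pushes execution roughly
        // one whole Unix epoch (~55 years as of 2026) into the future:
        System.out.println("Interpreted as a delay: ~"
            + TimeUnit.MILLISECONDS.toDays(nextRun) + " days");
    }
}
```

With a correct relative delay, the argument would instead be on the order of the connection timeout itself (tens of seconds), not tens of thousands of days.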
As a workaround, we are considering disabling the socket cleaner by setting connectionTimeout to 0 on DefaultTcpTransportMapping. However, we believe this could cause other problems, such as orphaned connections accumulating. Please share your recommendation.
Could you please validate our analysis and point out any gaps in our understanding? Does SNMP4J have a mechanism to clean up these tasks that we haven't found?
Another minor issue in the same class: the log message reports the value as milliseconds, but the now variable is obtained in nanoseconds (long now = System.nanoTime();), so the logged duration is overstated.
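A tiny sketch of that unit mismatch (standalone code, using 1_000_000 as a stand-in for SnmpConstants.MILLISECOND_TO_NANOSECOND, which is presumably that value given how it is used as a divisor above):

```java
// Sketch of the unit mismatch: System.nanoTime() differences are nanoseconds
// and must be divided by 1_000_000 before being reported as milliseconds.
public class UnitMismatchSketch {
    static final long MILLISECOND_TO_NANOSECOND = 1_000_000L; // stand-in constant

    static long nanosToMillis(long elapsedNanos) {
        return elapsedNanos / MILLISECOND_TO_NANOSECOND;
    }

    public static void main(String[] args) throws InterruptedException {
        long start = System.nanoTime();
        Thread.sleep(50);
        long elapsedNanos = System.nanoTime() - start;
        // Logging elapsedNanos while the message says "milliseconds" would
        // overstate the duration by a factor of one million:
        System.out.println(elapsedNanos + " ns = " + nanosToMillis(elapsedNanos) + " ms");
    }
}
```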
From my understanding, the SocketTimeout objects cannot accumulate, because the DefaultTimerFactory uses java.util.Timer, which removes TimerTasks (= SocketTimeout) from its internal queue when they are executed. These one-time-execution tasks do not need to be cancelled explicitly.
So I guess that the memory leak is caused by some other code. Maybe in your application code?
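That behavior of java.util.Timer can be checked with a small standalone demo (not SNMP4J code): once a one-shot task has executed, the Timer has already dropped it from its queue, and TimerTask.cancel() consequently returns false because there is nothing left to cancel.

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.CountDownLatch;

// Demo: java.util.Timer removes one-shot tasks from its queue on execution,
// so cancel() on an already-executed one-shot task returns false.
public class OneShotTimerDemo {

    static boolean cancelAfterExecution() throws InterruptedException {
        Timer timer = new Timer(true);
        CountDownLatch ran = new CountDownLatch(1);
        TimerTask task = new TimerTask() {
            @Override
            public void run() {
                ran.countDown();
            }
        };
        timer.schedule(task, 10); // one-shot, 10 ms relative delay
        ran.await();              // wait until the task has started executing
        boolean cancelled = task.cancel();
        timer.cancel();
        return cancelled;         // false: the executed task was already removed
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("cancel() after execution: " + cancelAfterExecution());
    }
}
```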
Thanks @AGENTPP for taking a look at this. While re-reviewing the SocketTimeout logic, I noticed an additional issue that may explain the observed accumulation of SocketTimeout instances: an absolute timestamp is passed where a relative delay is expected.
In the rescheduling path, the code quoted above computes nextRun as an absolute wall-clock time: System.currentTimeMillis() plus the elapsed time plus the connection timeout.
However, CommonTimer.schedule(TimerTask task, long delay) ultimately delegates to java.util.Timer.schedule(task, delay), where delay is expected to be a relative delay in milliseconds, not an absolute timestamp.
As a result, the effective execution time becomes:
System.currentTimeMillis() + nextRun
which schedules the task far into the future rather than after the intended timeout. This explains why:
- Previously scheduled SocketTimeout tasks remain in the Timer queue
- They are never executed and never become eligible for cleanup
- Heap usage grows over time in long-running systems
Please refer to the screenshot below, taken from my local debugging setup: the next SocketTimeout task is rescheduled with a delay of 3538312883774 milliseconds, which is enormous.
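For scale, a back-of-the-envelope conversion of the delay value seen in the debugger shows it is on the order of a century, so the rescheduled task will effectively never fire:

```java
// Convert the observed delay from the debugger into years.
public class DelayMagnitude {

    static long millisToYears(long millis) {
        return millis / 1000L / 60 / 60 / 24 / 365;
    }

    public static void main(String[] args) {
        long observedDelayMs = 3_538_312_883_774L; // value from the screenshot
        System.out.println("~" + millisToYears(observedDelayMs) + " years"); // ~112 years
    }
}
```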
You are right, this is a regression introduced in SNMP4J 3.0.0. The SocketTimeout objects accumulate if there are concurrent requests while the connection timeout is greater than 0, i.e., if the request frequency is higher than the timeout value.
The fixed run method of SocketTimeout is:
/**
 * Runs a timeout check and if the socket has timed out, it removes the socket from the associated
 * {@link org.snmp4j.TransportMapping}.
 */
public void run() {
    long now = System.nanoTime();
    if ((transportMapping.getSocketCleaner() == null) ||
        ((now - entry.getLastUse()) / SnmpConstants.MILLISECOND_TO_NANOSECOND >=
         transportMapping.getConnectionTimeout())) {
        if (logger.isDebugEnabled()) {
            logger.debug("Socket has not been used for " +
                (now - entry.getLastUse()) +
                " nanoseconds, closing it");
        }
        AbstractServerSocket<A> entryCopy = entry;
        try {
            transportMapping.close(entryCopy.getPeerAddress());
            logger.info("Socket to " + entryCopy.getPeerAddress() + " closed due to timeout");
        } catch (IOException e) {
            logger.error("Failed to close transport mapping for peer address " +
                entry.getPeerAddress() + ": " + e.getMessage(), e);
        }
    } else {
        // Reschedule with a RELATIVE delay: the time remaining until the timeout expires
        long nextRun = transportMapping.getConnectionTimeout() -
            (now - entry.getLastUse()) / SnmpConstants.MILLISECOND_TO_NANOSECOND;
        if (nextRun < 0) {
            nextRun = transportMapping.getConnectionTimeout();
        }
        if (logger.isDebugEnabled()) {
            logger.debug("Scheduling " + new Date(System.currentTimeMillis() + nextRun));
        }
        SocketTimeout<A> socketTimeout = new SocketTimeout<A>(transportMapping, entry);
        entry.setSocketTimeout(socketTimeout);
        transportMapping.getSocketCleaner().schedule(socketTimeout, nextRun);
    }
}
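The corrected delay computation can be sketched in isolation (standalone code with hypothetical values): the new nextRun is the time remaining until the connection timeout expires, which is always a valid relative delay for Timer.schedule(task, delay).

```java
// Sketch of the fixed delay computation: time remaining until the connection
// timeout expires, clamped to a full timeout if the socket was somehow idle
// longer than the timeout at this point.
public class FixedDelaySketch {

    static long nextRun(long connectionTimeoutMs, long elapsedMs) {
        long nextRun = connectionTimeoutMs - elapsedMs;
        if (nextRun < 0) {
            nextRun = connectionTimeoutMs; // clamp, as the fixed run() does
        }
        return nextRun;
    }

    public static void main(String[] args) {
        System.out.println(nextRun(60_000, 1_000));  // 59000: remainder of the timeout
        System.out.println(nextRun(60_000, 70_000)); // 60000: clamped to a full timeout
    }
}
```

Unlike the buggy version, this value is bounded by the connection timeout, so each rescheduled SocketTimeout fires within one timeout period and is then removed from the Timer queue.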