I am a snmp4j and snmp4j-agent user, currently I found a in ThreadPool, let’s talk about the background.
We built up a snmp agent process to handle outer snmp request, but customer report that out process will no response after 10 hours, the only way to recover the system is restart the process. there is no any logs can help debug this issue. after 2 months struggle we found the root cause, there has an Exception by our product code but it not will handled, the exception popup until to TaskManger, there is also no try…catch in TaskManager’s run method, so it caused the thread dead. The worst is ThreadPool cannot recreate a new TaskManager process to replace the dead thread, so when the exceptions raise, there reduce a live thread from ThreadPool until the pool is empty, after that snmp cannot response outer requests.
the first issue I found from ThreadPool.execute. line 93-96, if set respawnThreads = true, the pool will create a new TaskManger to replace the old one, but the issue is not call the start() method.
The second issue is in TaskManager.run, there should have a try…catch to surround task.run(), and print out the exceptions when there has something wrong. you cannot trust sub-processes are always right.
My JAR package info:
Catching exceptions does not help, because the worker task cannot continue anyway. Only if you catch exceptions within the worker task you might have a chance to handle the issue and continue.
Therefore it is intended behavior to not catch exceptions of worker tasks in the ThreadPool in order to not hide such problems.
Regarding the respawning issue, I will check the code. Respawning is only possible, if a TaskManager ended regularly. I think this requirement was not met in your case, but I will nevertheless check it in detail.
Thanks for you reply. In this case, we implement the interface CommandProcessor to add our own logic in processRequest, but not well handled the exceptions make it throw out of the current task. So I agree with you that the task should handle its exceptions.
Please check the code in ThreadPool.execute when set respawnThreads = true, if a TaskManager thread was stopped, the execute method only create a new TaskManager instance to replace the dead one, but the thread still not in running state.
Yes, the respawned threads are not started. Because the respawning is by default deactivated, this probably was never recognised. Unfortunately, the JavaDoc contract did not explicitly mention, that the respawned thread is started too. That would be logical and I have fixed it that way for version 3.5.1.