RequestList::add_request not responding

Hello Frank,

We have a requirement to send continuous SNMP requests. What happens now is that the thread hangs in the Agentpp::RequestList::add_request(Agentpp::Request * req) method and does not respond.
Could you please let us know whether we need to add a timeout between requests?
How can we avoid duplicate requests while adding?
How can we print the logs to the console?

Regards
Rahul

Hi Rahul,

You can check the example agents in the “examples” directory of the AGENT++ source code on how to activate logging and how the RequestList::add_request method is being used.
The question about “timeout between requests” is not clear to me. Are you calling add_request yourself programmatically (i.e., not based on a decoded SNMP message from the network)?
In my opinion, RequestList::add_request will never hang. It might wait for a thread in the ThreadPool to be available to process the request. But that is all. You can increase the number of threads in the pool to get more concurrency.

Best regards,
Frank

Hi Frank,
Thanks for your quick response.
What I meant by timeout is: should we add a minimum delay between processing each request?
How do we get the current thread pool size, and which API should be used to increase the number of threads?
What is the maximum thread pool size?

Regards
Rahul

Hi Rahul,

You can use Mib::set_thread_pool. There is no maximum thread pool size other than the available resources of your system.

Best regards,
Frank

Hi Frank,

Thanks for your reply. The default thread pool size is 4. Now I have changed it to use our own thread pool of size 8. Looks like its performance is better now.

Regards
Rahul


Hello Frank,

The issue was reproduced after some more trials. I have enabled AGENT++ logging.
It shows the warning below:

19700101.00:30:55: -1388104624: (9)DEBUG  : Synchronized: try lock busy (id)(ptr): (354), (21119904)

Please find the last few portions of the logs below.

19700101.00:30:51: -1431309232: (7)DEBUG  : Vacm: Access requested for: (viewName) (oid): (defaultViewNtcip), (1.3.6.1.4.1.1206.4.2.7.4.1.0)
19700101.00:30:51: -1431309232: (7)DEBUG  : Vacm: isInMibView: (viewName) (subtree): (defaultViewNtcip), (1.3.6.1.4.1.1206.4.2.7.4.1.0)
19700101.00:30:51: -1431309232: (8)DEBUG  : VacmViewTreeFamilyTable: isInMibView: (viewName) (match): (defaultViewNtcip), (v1ReadView)
19700101.00:30:51: -1431309232: (8)DEBUG  : VacmViewTreeFamilyTable: isInMibView: (viewName) (match): (defaultViewNtcip), (v1WriteView)
19700101.00:30:51: -1431309232: (8)DEBUG  : VacmViewTreeFamilyTable: isInMibView: (viewName) (match): (defaultViewNtcip), (v1NotifyView)
19700101.00:30:51: -1431309232: (8)DEBUG  : VacmViewTreeFamilyTable: isInMibView: (viewName) (match): (defaultViewNtcip), (newView)
19700101.00:30:51: -1431309232: (8)DEBUG  : VacmViewTreeFamilyTable: isInMibView: (viewName) (match): (defaultViewNtcip), (testView)
19700101.00:30:51: -1431309232: (8)DEBUG  : VacmViewTreeFamilyTable: isInMibView: (viewName) (match): (defaultViewNtcip), (internet)
19700101.00:30:51: -1431309232: (8)DEBUG  : VacmViewTreeFamilyTable: isInMibView: (viewName) (match): (defaultViewNtcip), (restricted)
19700101.00:30:51: -1431309232: (8)DEBUG  : VacmViewTreeFamilyTable: isInMibView: (viewName) (match): (defaultViewNtcip), (defaultViewNtcip)
19700101.00:30:51: -1431309232: (9)DEBUG  : Vacm: isInMibView: access allowed
19700101.00:30:51: -1431309232: (9)DEBUG  : Synchronized created (id)(ptr): (3290), (-1389344736)
19700101.00:30:51: -1431309232: (9)DEBUG  : Synchronized created (id)(ptr): (3291), (-1431312296)
19700101.00:30:51: -1431309232: (2)DEBUG  : LockQueue: adding lock request (ptr): (-1389344736)
19700101.00:30:51: -1383947184: (8)DEBUG  : Synchronized: try lock success (id)(ptr): (3290), (-1389344736)
19700101.00:30:51: -1383947184: (8)DEBUG  : LockQueue: lock (ptr)(pending): (-1389344736), (1)
19700101.00:30:51: -1383947184: (9)DEBUG  : LockQueue: waiting for next event (pending): (0)

19700101.00:30:55: -1388104624: (9)DEBUG  : Synchronized: try lock busy (id)(ptr): (354), (21119904)
19700101.00:30:55: -1388104624: (9)DEBUG  : LockQueue: waiting for next event (pending): (1)
19700101.00:30:56: -1383947184: (9)DEBUG  : LockQueue: waiting for next event (pending): (0)
19700101.00:31:00: -1388104624: (9)DEBUG  : Synchronized: try lock busy (id)(ptr): (354), (21119904)
19700101.00:31:00: -1388104624: (9)DEBUG  : LockQueue: waiting for next event (pending): (1)
19700101.00:31:01: -1383947184: (9)DEBUG  : LockQueue: waiting for next event (pending): (0)

Could the add_request API be waiting in this scenario? I did not see any logs after that, although I had added some extra logging to my application.
We are using AGENT++ version 4.1.2. Are there any changes related to this scenario in the latest releases?
The issue is that if the thread pool size is 8 and more than 10 requests arrive at a time, the thread never returns from the add_request call.
Please let me know if you need any more information.

Hello Frank,

Can you please check my latest comment and provide your feedback?
I am stuck on this issue and do not know how to proceed further.

Link: https://forum.snmp.app/t/requestlist-add-request-not-responding/931/6?u=rahul

Regards
Rahul

Hi,

there are a few notes regarding threads in the change log of agent++: https://agentpp.com/download/changes_agent++.txt

It would help if you can connect with a debugger to the blocked application and show us the backtrace of all stack frames that include agent++ functions. In gdb this would be the output of the command thread apply all bt.

Kind regards,
Jochen

Most likely, you are creating a deadlock in your instrumentation code. That is why processing blocks forever, even with many threads.
Thus, please follow the actions suggested by Jochen and check whether your instrumentation code violates the lock order described in the AGENT++ FAQ entry “What is the lock order for Mib objects to safely access them in a multi-threaded agent?”.
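
To illustrate what keeping a lock order means in general, here is a minimal sketch in standard C++. It does not use AGENT++ classes and does not reproduce the lock order from the FAQ; `Container`, `Entry`, `update`, `read_value`, and `run_ordered_updates` are hypothetical names chosen for the example. The point is only that every code path takes the outer lock before the inner one, so two threads can never each hold one lock while waiting for the other.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-ins for an outer container and one of its entries.
struct Container {
    std::mutex m;
    int updates = 0;
};

struct Entry {
    std::mutex m;
    int value = 0;
};

// Every code path takes the outer lock first and the inner lock second.
void update(Container& c, Entry& e) {
    std::lock_guard<std::mutex> outer(c.m);
    std::lock_guard<std::mutex> inner(e.m);
    ++c.updates;
    ++e.value;
}

// Same order here, even though only the entry is read.
int read_value(Container& c, Entry& e) {
    std::lock_guard<std::mutex> outer(c.m);
    std::lock_guard<std::mutex> inner(e.m);
    return e.value;
}

// Runs several threads that all follow the same lock order.
int run_ordered_updates(int threads, int iterations) {
    assert(threads > 0 && iterations >= 0);
    Container c;
    Entry e;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&c, &e, iterations] {
            for (int i = 0; i < iterations; ++i) {
                update(c, e);
                (void)read_value(c, e);
            }
        });
    }
    for (auto& th : pool) th.join();
    return read_value(c, e);
}
```

If one of the two functions took the entry lock first and the container lock second, two threads could block each other permanently, no matter how many worker threads exist.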

Hello Frank,
Our code for processing requests is taken from the AGENT++ examples.
Please find the main loop below.

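    Request* req;
    while (run) {
        // Wait for the next incoming request (timeout argument: 2).
        req = reqList->receive(2);
        if (req) {
            // Hand the request over to the Mib for processing.
            mib->process_request(req);
        }
        else {
            // No request received: let the Mib do its housekeeping.
            mib->cleanup();
        }
    }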

The only difference is that we have extended the RequestList class to override the RequestList::receive and RequestList::answer methods to add some custom code.
So do we violate the lock order here?

Regards
Rahul

Hi Rahul,

No, (hopefully) not. That code isn’t the instrumentation of the Mib objects. Instrumentation code is what you put into overridden methods such as MibLeaf::commit_set_request and MibLeaf::get_value, for example.

Best regards,
Frank

Hello Frank,

We have added extra logging to the AGENT++ (version 4.1.2) source code. We have the following observations and queries.

  1. When we send continuous requests, the add_request() call blocks in pthread_mutex_lock(&monitor) inside Synchronized::lock() (src/threads.cpp). Because of this, it never enters the add_request function body.
  2. In that scenario, ThreadManager::end_synch() is not called either.
  3. We have increased the thread pool size to 8, but the thread still waits for the mutex to become available. We are not sure for how long.
  4. How does the thread pool behave if more requests than the thread pool size arrive within a short time interval?

Could you please provide your suggestion and feedback as soon as possible?

Regards
Rahul

First of all you should update AGENT++ to the latest version (4.6.0) to avoid observing bugs that have been fixed already.
The request list blocks until a thread of the pool becomes available again. So the blocking is indeed expected if the processing within the worker thread (instrumentation code) takes too long. But it will not block forever unless there is an endless loop or endless wait in the instrumentation code.
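
The behaviour described above (the caller blocks until a pool thread is free) can be sketched with standard C++ only. This is an illustration under assumptions, not the AGENT++ ThreadPool implementation; `BlockingPool`, `execute`, and `run_jobs` are hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical fixed-size pool: execute() blocks while all workers are busy.
class BlockingPool {
public:
    explicit BlockingPool(std::size_t workers) : idle_(workers) {
        assert(workers > 0);  // with zero workers execute() would never return
        for (std::size_t i = 0; i < workers; ++i) {
            threads_.emplace_back([this] { worker(); });
        }
    }

    ~BlockingPool() {
        {
            std::lock_guard<std::mutex> lk(m_);
            stop_ = true;
        }
        cv_.notify_all();
        for (auto& t : threads_) t.join();
    }

    // Waits until a worker is idle, then hands the task over to it.
    void execute(std::function<void()> task) {
        std::unique_lock<std::mutex> lk(m_);
        cv_.wait(lk, [this] { return idle_ > 0; });
        --idle_;
        tasks_.push(std::move(task));
        cv_.notify_all();
    }

private:
    void worker() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lk(m_);
                cv_.wait(lk, [this] { return stop_ || !tasks_.empty(); });
                if (tasks_.empty()) return;  // stop requested, nothing left
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task();  // runs outside the pool lock
            {
                std::lock_guard<std::mutex> lk(m_);
                ++idle_;  // this worker is free again
            }
            cv_.notify_all();
        }
    }

    std::mutex m_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> tasks_;
    std::size_t idle_;
    bool stop_ = false;
    std::vector<std::thread> threads_;
};

// Submits more jobs than there are workers and returns how many completed.
int run_jobs(std::size_t workers, int jobs) {
    std::atomic<int> done{0};
    {
        BlockingPool pool(workers);
        for (int i = 0; i < jobs; ++i) {
            pool.execute([&done] { ++done; });
        }
    }  // the destructor joins the workers, so all jobs have finished here
    return done.load();
}
```

In this sketch, submitting ten jobs to a pool of two workers completes because every job returns. If a single job never returned, its worker would never become idle again, and once all workers were stuck the next `execute` call would wait forever.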

Hello Frank,

As you suggested, we have updated agent++, agentx++, and snmp++ to the latest versions.
But the issue still occurs: the add_request() call blocks in pthread_mutex_lock(&monitor) inside Synchronized::lock() (src/threads.cpp). Could you please suggest a better way to handle continuous requests? What do we need to take care of?

Regards
Rahul

Hi,

without knowing the stack trace of each thread of your application, we cannot help. If you want to analyse it, check for every thread that is waiting for a lock which other thread is holding that lock. The same has to be done for the threads waiting for a condition.

Kind regards,
Jochen

Hi jkatz, Hi Frank,
We have collected the stack trace of each thread. Please find the trace below.

Starting program: /mnt/Repositories/RiseFirmware_V1.70/Projects/LegacyServer/Debug/LegacyServer 
warning: File "/lib/libthread_db-1.0.so" auto-loading has been declined by your `auto-load safe-path' set to "$debugdir:$datadir/auto-load".
To enable execution of this file add
	add-auto-load-safe-path /lib/libthread_db-1.0.so
line to your configuration file "//.gdbinit".
To completely disable this security protection add
	set auto-load safe-path /
line to your configuration file "//.gdbinit".
For more information about this security protection see the
"Auto-loading safe path" section in the GDB manual.  E.g., run from the shell:
	info "(gdb)Auto-loading safe path"
warning: Unable to find libthread_db matching inferior's thread library, thread debugging will not be available.


Thread 1 "LegacyServer" received signal SIGTSTP, Stopped (user).
0xb6fb85e8 in ?? () from /lib/libpthread.so.0

Thread 39 (LWP 23767):
#0  0xb6fb3e7c in pthread_cond_timedwait () from /lib/libpthread.so.0
#1  0xb6fb6a54 in ?? () from /lib/libpthread.so.0
#2  0x00000080 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 38 (LWP 23746):
#0  0xb6fb3e7c in pthread_cond_timedwait () from /lib/libpthread.so.0
#1  0xb6fb6a54 in ?? () from /lib/libpthread.so.0
#2  0x00000080 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 37 (LWP 23709):
#0  0xb6fb6c9c in ?? () from /lib/libpthread.so.0
#1  0xb6fb004c in pthread_mutex_lock () from /lib/libpthread.so.0
#2  0xb6fb004c in pthread_mutex_lock () from /lib/libpthread.so.0
#3  0xb6595048 in Agentpp::Synchronized::lock() () from /usr/lib/libagent++.so.46
#4  0xb6593bf8 in Agentpp::ThreadManager::start_synch() () from /usr/lib/libagent++.so.46
#5  0xb6593d68 in Agentpp::ThreadSynchronize::ThreadSynchronize(Agentpp::ThreadManager&) () from /usr/lib/libagent++.so.46
#6  0xb6578e50 in Agentpp::RequestList::add_request(Agentpp::Request*) () from /usr/lib/libagent++.so.46
#7  0x003ba4cc in LegacyServerApp::RequestListEx::receive (this=0x6cdfc8, timeOut=500) at agent++/Extended/RequestListEx.hpp:280
#8  0x00438590 in LegacyServerApp::NtcIpHandler::ProcessMibMessage (this=0x6d1010) at NTCIP/NtcIpHandler.cpp:1240
#9  0x0044d478 in Poco::RunnableAdapter<LegacyServerApp::NtcIpHandler>::run (this=0x6d10e0) at ../../Libs/poco/include/Poco/RunnableAdapter.h:64
#10 0x0044d3a0 in Poco::Activity<LegacyServerApp::NtcIpHandler>::run (this=0x6d10d8) at ../../Libs/poco/include/Poco/Activity.h:179
#11 0xb6c0cac4 in Poco::PooledThread::run() () from /usr/lib/libPocoFoundation.so.60
#12 0xb6c083d8 in Poco::ThreadImpl::runnableEntry(void*) () from /usr/lib/libPocoFoundation.so.60
#13 0xb6fad0c0 in ?? () from /lib/libpthread.so.0
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 36 (LWP 23708):
#0  0xb6143804 in epoll_wait () from /lib/libc.so.6
#1  0xb6a27980 in Poco::Net::SocketImpl::poll(Poco::Timespan const&, int) () from /usr/lib/libPocoNet.so.60
#2  0x0022f814 in Poco::Net::Socket::poll (this=0x1516a44, timeout=..., mode=1) at ../../Libs/poco/include/Poco/Net/Socket.h:374
#3  0x0024b0b8 in LegacyServerApp::UdpServerConnection::ReceiveData (this=0x1517130) at Connections/UdpServerConnection.hpp:223
#4  0x002618ac in Poco::RunnableAdapter<LegacyServerApp::UdpServerConnection>::run (this=0x1517160) at ../../Libs/poco/include/Poco/RunnableAdapter.h:64
#5  0x002617d4 in Poco::Activity<LegacyServerApp::UdpServerConnection>::run (this=0x1517158) at ../../Libs/poco/include/Poco/Activity.h:179
#6  0xb6c0cac4 in Poco::PooledThread::run() () from /usr/lib/libPocoFoundation.so.60
#7  0xb6c083d8 in Poco::ThreadImpl::runnableEntry(void*) () from /usr/lib/libPocoFoundation.so.60
#8  0xb6fad0c0 in ?? () from /lib/libpthread.so.0
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 35 (LWP 23707):
#0  0xb6143804 in epoll_wait () from /lib/libc.so.6
#1  0xb6a27980 in Poco::Net::SocketImpl::poll(Poco::Timespan const&, int) () from /usr/lib/libPocoNet.so.60
#2  0xb6a2d1b4 in Poco::Net::TCPServer::run() () from /usr/lib/libPocoNet.so.60
#3  0xb6c083d8 in Poco::ThreadImpl::runnableEntry(void*) () from /usr/lib/libPocoFoundation.so.60
#4  0xb6fad0c0 in ?? () from /lib/libpthread.so.0
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 34 (LWP 23706):
#0  0xb6fb3b44 in pthread_cond_wait () from /lib/libpthread.so.0
#1  0xb6b982f8 in Poco::EventImpl::waitImpl() () from /usr/lib/libPocoFoundation.so.60
#2  0xb6c0cb54 in Poco::PooledThread::run() () from /usr/lib/libPocoFoundation.so.60
#3  0xb6c083d8 in Poco::ThreadImpl::runnableEntry(void*) () from /usr/lib/libPocoFoundation.so.60
#4  0xb6fad0c0 in ?? () from /lib/libpthread.so.0
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 33 (LWP 23697):
#0  0xb6fb3b44 in pthread_cond_wait () from /lib/libpthread.so.0
#1  0xb6594864 in Agentpp::Synchronized::cond_timed_wait(timespec const*) () from /usr/lib/libagent++.so.46
#2  0xb65947ec in Agentpp::Synchronized::wait() () from /usr/lib/libagent++.so.46
#3  0xb6596d90 in Agentpp::TaskManager::run() () from /usr/lib/libagent++.so.46
#4  0xb6595d10 in Agentpp::thread_starter(void*) () from /usr/lib/libagent++.so.46
#5  0xb6fad0c0 in ?? () from /lib/libpthread.so.0
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 32 (LWP 23696):
#0  0xb6fb3b44 in pthread_cond_wait () from /lib/libpthread.so.0
#1  0xb6594864 in Agentpp::Synchronized::cond_timed_wait(timespec const*) () from /usr/lib/libagent++.so.46
#2  0xb65947ec in Agentpp::Synchronized::wait() () from /usr/lib/libagent++.so.46
#3  0xb6596d90 in Agentpp::TaskManager::run() () from /usr/lib/libagent++.so.46
#4  0xb6595d10 in Agentpp::thread_starter(void*) () from /usr/lib/libagent++.so.46
#5  0xb6fad0c0 in ?? () from /lib/libpthread.so.0
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 31 (LWP 23695):
#0  0xb6fb6c9c in ?? () from /lib/libpthread.so.0
#1  0xb6fb004c in pthread_mutex_lock () from /lib/libpthread.so.0
#2  0xb6fb004c in pthread_mutex_lock () from /lib/libpthread.so.0
#3  0xb6595048 in Agentpp::Synchronized::lock() () from /usr/lib/libagent++.so.46
#4  0xb6593bf8 in Agentpp::ThreadManager::start_synch() () from /usr/lib/libagent++.so.46
#5  0xb6593d68 in Agentpp::ThreadSynchronize::ThreadSynchronize(Agentpp::ThreadManager&) () from /usr/lib/libagent++.so.46
#6  0x003ba7c8 in LegacyServerApp::RequestListEx::answer (this=0x6cdfc8, pRequest=0xad601260) at agent++/Extended/RequestListEx.hpp:309
#7  0xb656161c in Agentpp::Mib::finalize(Agentpp::Request*) () from /usr/lib/libagent++.so.46
#8  0xb63f7b2c in Agentpp::MasterAgentXMib::finalize(Agentpp::Request*) () from /usr/lib/libagentx++.so.25
#9  0xb655f694 in Agentpp::Mib::do_process_request(Agentpp::Request*) () from /usr/lib/libagent++.so.46
#10 0xb65980b4 in Agentpp::MibTask::run() () from /usr/lib/libagent++.so.46
#11 0xb6596b98 in Agentpp::TaskManager::run() () from /usr/lib/libagent++.so.46
#12 0xb6595d10 in Agentpp::thread_starter(void*) () from /usr/lib/libagent++.so.46
#13 0xb6fad0c0 in ?? () from /lib/libpthread.so.0
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 30 (LWP 23694):
#0  0xb6fb6c40 in ?? () from /lib/libpthread.so.0
#1  0xb6fb6c74 in ?? () from /lib/libpthread.so.0
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 29 (LWP 23692):
#0  0xb6fb3e7c in pthread_cond_timedwait () from /lib/libpthread.so.0
#1  0xb6fb6a54 in ?? () from /lib/libpthread.so.0
#2  0x00000080 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 28 (LWP 23691):
#0  0xb6fb3e7c in pthread_cond_timedwait () from /lib/libpthread.so.0
#1  0xb6fb6a54 in ?? () from /lib/libpthread.so.0
#2  0x00000080 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 27 (LWP 23690):
#0  0xb6fb3e7c in pthread_cond_timedwait () from /lib/libpthread.so.0
#1  0xb6fb6a54 in ?? () from /lib/libpthread.so.0
#2  0x00000080 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 26 (LWP 23689):
#0  0xb6fb3b44 in pthread_cond_wait () from /lib/libpthread.so.0
#1  0xb6594864 in Agentpp::Synchronized::cond_timed_wait(timespec const*) () from /usr/lib/libagent++.so.46
#2  0xb65947ec in Agentpp::Synchronized::wait() () from /usr/lib/libagent++.so.46
#3  0xb6596d90 in Agentpp::TaskManager::run() () from /usr/lib/libagent++.so.46
#4  0xb6595d10 in Agentpp::thread_starter(void*) () from /usr/lib/libagent++.so.46
#5  0xb6fad0c0 in ?? () from /lib/libpthread.so.0
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 25 (LWP 23688):
#0  0xb6fb3b44 in pthread_cond_wait () from /lib/libpthread.so.0
#1  0xb6594864 in Agentpp::Synchronized::cond_timed_wait(timespec const*) () from /usr/lib/libagent++.so.46
#2  0xb65947ec in Agentpp::Synchronized::wait() () from /usr/lib/libagent++.so.46
#3  0xb6596d90 in Agentpp::TaskManager::run() () from /usr/lib/libagent++.so.46
#4  0xb6595d10 in Agentpp::thread_starter(void*) () from /usr/lib/libagent++.so.46
#5  0xb6fad0c0 in ?? () from /lib/libpthread.so.0
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 24 (LWP 23687):
#0  0xb6fb3b44 in pthread_cond_wait () from /lib/libpthread.so.0
#1  0xb6594864 in Agentpp::Synchronized::cond_timed_wait(timespec const*) () from /usr/lib/libagent++.so.46
#2  0xb65947ec in Agentpp::Synchronized::wait() () from /usr/lib/libagent++.so.46
#3  0xb6596d90 in Agentpp::TaskManager::run() () from /usr/lib/libagent++.so.46
#4  0xb6595d10 in Agentpp::thread_starter(void*) () from /usr/lib/libagent++.so.46
#5  0xb6fad0c0 in ?? () from /lib/libpthread.so.0
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 23 (LWP 23686):
#0  0xb6fb3b44 in pthread_cond_wait () from /lib/libpthread.so.0
#1  0xb6594864 in Agentpp::Synchronized::cond_timed_wait(timespec const*) () from /usr/lib/libagent++.so.46
#2  0xb65947ec in Agentpp::Synchronized::wait() () from /usr/lib/libagent++.so.46
#3  0xb6596d90 in Agentpp::TaskManager::run() () from /usr/lib/libagent++.so.46
#4  0xb6595d10 in Agentpp::thread_starter(void*) () from /usr/lib/libagent++.so.46
#5  0xb6fad0c0 in ?? () from /lib/libpthread.so.0
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 22 (LWP 23685):
#0  0xb613c434 in select () from /lib/libc.so.6
#1  0xb6150ad8 in ?? () from /lib/libc.so.6
#2  0x006cb738 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 21 (LWP 23645):
#0  0xb6fb7c30 in nanosleep () from /lib/libpthread.so.0
#1  0xb6fb6a54 in ?? () from /lib/libpthread.so.0
#2  0x000186a0 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 20 (LWP 23641):
#0  0xb6fb3e7c in pthread_cond_timedwait () from /lib/libpthread.so.0
#1  0xb6fb6a54 in ?? () from /lib/libpthread.so.0
#2  0x00000080 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 19 (LWP 23210):
#0  0xb6fb3e7c in pthread_cond_timedwait () from /lib/libpthread.so.0
#1  0xb6fb6a54 in ?? () from /lib/libpthread.so.0
#2  0x00000080 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 18 (LWP 22920):
#0  0xb6fb3e7c in pthread_cond_timedwait () from /lib/libpthread.so.0
#1  0xb6fb6a54 in ?? () from /lib/libpthread.so.0
#2  0x00000080 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 17 (LWP 22919):
#0  0xb6fb3e7c in pthread_cond_timedwait () from /lib/libpthread.so.0
#1  0xb6fb6a54 in ?? () from /lib/libpthread.so.0
#2  0x00000080 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 16 (LWP 22918):
#0  0xb6fb3e7c in pthread_cond_timedwait () from /lib/libpthread.so.0
#1  0xb6fb6a54 in ?? () from /lib/libpthread.so.0
#2  0x00000080 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 15 (LWP 22917):
#0  0xb6fb3e7c in pthread_cond_timedwait () from /lib/libpthread.so.0
#1  0xb6fb6a54 in ?? () from /lib/libpthread.so.0
#2  0x00000080 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 14 (LWP 22916):
#0  0xb6fb3e7c in pthread_cond_timedwait () from /lib/libpthread.so.0
#1  0xb6fb6a54 in ?? () from /lib/libpthread.so.0
#2  0x00000080 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 13 (LWP 22915):
#0  0xb6fb3e7c in pthread_cond_timedwait () from /lib/libpthread.so.0
#1  0xb6fb6a54 in ?? () from /lib/libpthread.so.0
#2  0x00000080 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 12 (LWP 22914):
#0  0xb6fb3e7c in pthread_cond_timedwait () from /lib/libpthread.so.0
#1  0xb6fb6a54 in ?? () from /lib/libpthread.so.0
#2  0x00000080 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 11 (LWP 22913):
#0  0xb6fb3e7c in pthread_cond_timedwait () from /lib/libpthread.so.0
#1  0xb6fb6a54 in ?? () from /lib/libpthread.so.0
#2  0x00000080 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 10 (LWP 22912):
#0  0xb6fb3e7c in pthread_cond_timedwait () from /lib/libpthread.so.0
#1  0xb6fb6a54 in ?? () from /lib/libpthread.so.0
#2  0x00000080 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 9 (LWP 22911):
#0  0xb6fb3e7c in pthread_cond_timedwait () from /lib/libpthread.so.0
#1  0xb6fb6a54 in ?? () from /lib/libpthread.so.0
#2  0x00000080 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 8 (LWP 22900):
#0  0xb6fb767c in recvfrom () from /lib/libpthread.so.0
#1  0x00215374 in System::UnixDomainServer::RunActivity (this=0x69ec20) at ../../Common/InterProcessComm/UnixDomainServer.cpp:201
#2  0x0021692c in Poco::RunnableAdapter<System::UnixDomainServer>::run (this=0x69ec80) at ../../Libs/poco/include/Poco/RunnableAdapter.h:64
#3  0x00216854 in Poco::Activity<System::UnixDomainServer>::run (this=0x69ec78) at ../../Libs/poco/include/Poco/Activity.h:179
#4  0xb6c0cac4 in Poco::PooledThread::run() () from /usr/lib/libPocoFoundation.so.60
#5  0xb6c083d8 in Poco::ThreadImpl::runnableEntry(void*) () from /usr/lib/libPocoFoundation.so.60
#6  0xb6fad0c0 in ?? () from /lib/libpthread.so.0
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 7 (LWP 22896):
#0  0xb6fb3e7c in pthread_cond_timedwait () from /lib/libpthread.so.0
#1  0xb6fb6a54 in ?? () from /lib/libpthread.so.0
#2  0x00000080 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 6 (LWP 22895):
#0  0xb6fb3e7c in pthread_cond_timedwait () from /lib/libpthread.so.0
#1  0xb6fb6a54 in ?? () from /lib/libpthread.so.0
#2  0x00000080 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 5 (LWP 22894):
#0  0xb6fb3e7c in pthread_cond_timedwait () from /lib/libpthread.so.0
#1  0xb6fb6a54 in ?? () from /lib/libpthread.so.0
#2  0x00000080 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 4 (LWP 22892):
#0  0xb6fb3e7c in pthread_cond_timedwait () from /lib/libpthread.so.0
#1  0xb6fb6a54 in ?? () from /lib/libpthread.so.0
#2  0x00000080 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 3 (LWP 22870):
#0  0xb6fb3e7c in pthread_cond_timedwait () from /lib/libpthread.so.0
#1  0xb6fb6a54 in ?? () from /lib/libpthread.so.0
#2  0x00000080 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 2 (LWP 22869):
#0  0xb6fb3e7c in pthread_cond_timedwait () from /lib/libpthread.so.0
#1  0xb6fb6a54 in ?? () from /lib/libpthread.so.0
#2  0x00000080 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 1 (LWP 22658):
#0  0xb6fb85e8 in ?? () from /lib/libpthread.so.0
#1  0xb6fb867c in sigwait () from /lib/libpthread.so.0
#2  0xb6977464 in Poco::Util::ServerApplication::waitForTerminationRequest() () from /usr/lib/libPocoUtil.so.60
#3  0x004b7560 in LegacyServerApp::MainApp::main (this=0xbefffad8, args=...) at main.cpp:172
#4  0xb695fea4 in Poco::Util::Application::run() () from /usr/lib/libPocoUtil.so.60
#5  0x004b353c in main (argc=1, argv=0xbefffcb4) at main.cpp:203

As you can see, threads 37 and 31 each show two “pthread_mutex_lock” calls. Since the “monitor” mutex is a normal mutex, trying to lock an already locked mutex will cause a deadlock.
P.S.: No crash was observed in this scenario, so we killed the application manually to collect the stack trace.
Could you please provide your feedback?

Regards
Rahul

Hi Rahul

agent++/Extended/RequestListEx.hpp

is not AGENT++ code and should be checked first. Because it is a C++ header file, inconsistent PTHREAD compiler switches might cause this error too. Please use the default setting for

#define NO_FAST_MUTEXES

If you compiled files with inconsistent definitions of this macro (or of other threading-related settings), severe problems like this may appear.
Other reasons for this behaviour could be:

  1. Your code (instrumentation, RequestListEx, main, etc.) has locked the RequestList instance and not unlocked it. To find the thread/stack trace that holds the lock you can use: Technical Collection - Debugging Mutex
  2. Memory corruption
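
As a generic illustration of finding the thread that holds a lock, the sketch below wraps a standard `std::timed_mutex` so that a waiter gives up after a timeout and can ask who the current owner is. It is not taken from AGENT++ or from the linked debugging article; `TrackedMutex`, `lock_for`, `holder`, and `waiter_sees_holder` are hypothetical names.

```cpp
#include <cassert>
#include <chrono>
#include <mutex>
#include <thread>

// Hypothetical wrapper that remembers which thread owns the lock.
class TrackedMutex {
public:
    // Tries to take the lock and gives up after `timeout` instead of
    // blocking forever.
    bool lock_for(std::chrono::milliseconds timeout) {
        if (!m_.try_lock_for(timeout)) return false;
        std::lock_guard<std::mutex> g(meta_);
        owner_ = std::this_thread::get_id();
        return true;
    }

    void unlock() {
        {
            std::lock_guard<std::mutex> g(meta_);
            owner_ = std::thread::id();  // no owner any more
        }
        m_.unlock();
    }

    // Id of the thread currently holding the lock (default id if none).
    std::thread::id holder() const {
        std::lock_guard<std::mutex> g(meta_);
        return owner_;
    }

private:
    std::timed_mutex m_;
    mutable std::mutex meta_;  // protects owner_
    std::thread::id owner_;
};

// The calling thread keeps the lock; a second thread times out and
// records which thread is holding it.
bool waiter_sees_holder() {
    TrackedMutex mtx;
    // try_lock_for may fail spuriously, so retry until the lock is taken.
    while (!mtx.lock_for(std::chrono::milliseconds(10))) {
    }
    const std::thread::id me = std::this_thread::get_id();
    assert(mtx.holder() == me);

    bool acquired = true;
    std::thread::id seen;
    std::thread waiter([&] {
        acquired = mtx.lock_for(std::chrono::milliseconds(50));
        if (acquired) {
            mtx.unlock();
        } else {
            seen = mtx.holder();  // a real program would log this id
        }
    });
    waiter.join();
    mtx.unlock();
    return !acquired && seen == me;
}
```

Logging the holder id on a timeout, and matching it against the thread list in the debugger, narrows the search to the one stack trace that took the lock and did not release it.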

Hello Frank,

As per your feedback, we rechecked our code and found that we were locking the RequestList instance without unlocking it properly.
We have now modified our code, and it looks like it is working perfectly. Thanks for your help and support.
Have a nice day!

Regards
Rahul
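
One common way to avoid this kind of forgotten unlock is a scoped guard that releases the lock on every exit path. The sketch below is an assumption-based illustration in standard C++, not the code from this thread: `LockableList` is a hypothetical stand-in for any object with lock() and unlock() members, and `handle` and `depth_after_handling` are made-up names.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-in for an object that offers lock() and unlock().
class LockableList {
public:
    void lock() {
        m_.lock();
        ++depth_;
    }
    void unlock() {
        --depth_;
        m_.unlock();
    }
    // Number of locks currently held; only read single-threaded here.
    int depth() const { return depth_; }

private:
    std::mutex m_;
    int depth_ = 0;
};

// std::lock_guard works with any type that has lock() and unlock(),
// and it unlocks in its destructor on every return path.
bool handle(LockableList& list, int request) {
    std::lock_guard<LockableList> guard(list);
    if (request < 0) {
        return false;  // early exit: the guard still unlocks
    }
    assert(list.depth() == 1);
    return true;
}

// After both calls the list must be fully unlocked again.
int depth_after_handling() {
    LockableList list;
    handle(list, -1);
    handle(list, 7);
    return list.depth();
}
```

With manual lock() and unlock() calls, the early return in `handle` would be exactly the kind of path where the unlock is easy to miss.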
