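I hit a completely unrelated seg-fault and started analyzing the core. Using gdb, I ran the command info threads and was quite surprised by the result. Several threads were reading from the map as expected, but strangely, several other threads were blocked in pthread_rwlock_rdlock() waiting on the rw_lock.
Here is the stack trace of a thread waiting on the lock: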

    #0  0xffffe430 in __kernel_vsyscall ()
    #1  0xf76fe159 in __lll_lock_wait () from /lib/libpthread.so.0
    #2  0xf76fab5d in pthread_rwlock_rdlock () from /lib/libpthread.so.0
    #3  0x0804a81a in DiameterServiceSingleton::getDiameterService(void*) ()

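With so many threads, it is hard to say how many were reading and how many were blocked, but I don't understand why any thread would be blocked waiting to read, given that other threads were already reading.
So here is my question: why are some threads blocked waiting to read-lock an rw_lock when other threads are already reading under it? It looks as though there is a limit on the number of threads that can read simultaneously.
I have looked at the pthread_rwlockattr_t functions and did not see anything related.
The OS is Linux, SUSE 11.
Here is the relevant code: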

    {
        pthread_rwlock_init(&serviceMapRwLock_, NULL);
    }

    // This method is called for each request processed by the threads
    Service *ServiceSingleton::getService(void *serviceId)
    {
        pthread_rwlock_rdlock(&serviceMapRwLock_);
        ServiceMapType::const_iterator iter = serviceMap_.find(serviceId);
        bool notFound(iter == serviceMap_.end());
        pthread_rwlock_unlock(&serviceMapRwLock_);

        if (notFound)
        {
            return NULL;
        }

        return iter->second;
    }

    // This method is only called when the app is starting
    void ServiceSingleton::addService(void *serviceId, Service *service)
    {
        pthread_rwlock_wrlock(&serviceMapRwLock_);
        serviceMap_[serviceId] = service;
        pthread_rwlock_unlock(&serviceMapRwLock_);
    }

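Update:
As MarkB mentioned in the comments, if I had used pthread_rwlockattr_setkind_np() to give priority to writers, and there were a writer blocked waiting, then the observed behavior would make sense. However, I am using the default value, which I believe gives priority to readers. I just verified that no threads are blocked waiting to write. I also updated the code as suggested by @Shahbaz in the comments and got the same results.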
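The lock kind discussed here can be checked rather than assumed. The following is a minimal sketch, assuming glibc's non-portable `_np` extensions are available; the helper names `default_rwlock_kind` and `init_rwlock_with_kind` are hypothetical and are not part of the code in this question.

```cpp
#include <pthread.h>

// Hypothetical helper: returns the lock kind stored in a freshly initialized
// attribute object, i.e. the kind a rwlock gets with default attributes.
// Returns -1 if any call fails. Assumes glibc (the *_np calls are GNU extensions).
int default_rwlock_kind() {
    pthread_rwlockattr_t attr;
    if (pthread_rwlockattr_init(&attr) != 0) return -1;
    int kind = -1;
    int rc = pthread_rwlockattr_getkind_np(&attr, &kind);
    pthread_rwlockattr_destroy(&attr);
    return rc == 0 ? kind : -1;
}

// Hypothetical helper: initializes `lock` with an explicitly requested kind,
// for example PTHREAD_RWLOCK_PREFER_READER_NP. Returns 0 on success.
int init_rwlock_with_kind(pthread_rwlock_t *lock, int kind) {
    pthread_rwlockattr_t attr;
    int rc = pthread_rwlockattr_init(&attr);
    if (rc != 0) return rc;
    rc = pthread_rwlockattr_setkind_np(&attr, kind);
    if (rc == 0) rc = pthread_rwlock_init(lock, &attr);
    pthread_rwlockattr_destroy(&attr);
    return rc;
}
```

Comparing the result of `default_rwlock_kind()` against `PTHREAD_RWLOCK_PREFER_READER_NP` would confirm whether the default really prefers readers on a given system.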
Solution
Edit: Reading the source, glibc uses lll_lock to protect critical sections within its own pthread library data structures. pthread_rwlock_rdlock checks several flags and increments counters, so it does those things while holding a lock. Once those are done, the lock is released with lll_unlock.
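To make that concrete, here is a toy model of a reader-preferring rwlock whose bookkeeping is guarded by a short internal lock. It is a simplified sketch and not glibc's code: the class `ToyRwLock` and all of its members are hypothetical, and `std::mutex` merely stands in for the low-level lock. It only illustrates why a reader can be seen waiting briefly inside the read-lock call even when no writer is involved.

```cpp
#include <condition_variable>
#include <mutex>

// Toy reader-preferring rwlock. NOT glibc's implementation: it only models
// the idea that the lock's own state is guarded by a short internal lock.
class ToyRwLock {
public:
    void read_lock() {
        // Internal lock: held only while the bookkeeping is updated.
        // Another reader arriving at this instant waits here briefly,
        // even though the rwlock itself is only read-locked.
        std::unique_lock<std::mutex> guard(internal_);
        changed_.wait(guard, [this] { return !writer_; });
        ++readers_;
    }  // internal lock released here; the caller now holds a read lock

    void read_unlock() {
        std::lock_guard<std::mutex> guard(internal_);
        if (--readers_ == 0) changed_.notify_all();
    }

    void write_lock() {
        std::unique_lock<std::mutex> guard(internal_);
        changed_.wait(guard, [this] { return !writer_ && readers_ == 0; });
        writer_ = true;
    }

    void write_unlock() {
        std::lock_guard<std::mutex> guard(internal_);
        writer_ = false;
        changed_.notify_all();
    }

    // Number of read locks currently held.
    int readers() {
        std::lock_guard<std::mutex> guard(internal_);
        return readers_;
    }

private:
    std::mutex internal_;              // plays the role of the low-level lock
    std::condition_variable changed_;  // signalled when readers or the writer leave
    int readers_ = 0;
    bool writer_ = false;
};
```

In this model any number of readers can hold the lock at once, yet each `read_lock()` call still passes through `internal_`. With many threads calling it for every request, a core dump can easily catch a few of them inside that short critical section.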
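To demonstrate, I implemented a short routine that sleeps after acquiring the rwlock. The main thread waits for all of the threads to finish, but before waiting it prints the concurrency they achieved.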

    #include <pthread.h>
    #include <unistd.h>
    #include <iostream>

    enum { CONC = 50 };

    pthread_rwlock_t rwlock;
    pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
    pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
    unsigned count;  // number of threads currently holding the read lock

    void *routine(void *arg)
    {
        int *fds = static_cast<int *>(arg);
        pthread_rwlock_rdlock(&rwlock);
        pthread_mutex_lock(&mutex);
        ++count;
        if (count == CONC) pthread_cond_signal(&cond);
        pthread_mutex_unlock(&mutex);
        sleep(5);  // keep the read lock held so the readers overlap
        pthread_rwlock_unlock(&rwlock);
        // Report this thread's id through the pipe so main can join it.
        pthread_t self = pthread_self();
        write(fds[1], &self, sizeof(self));
        return 0;
    }

And the main thread waits for the counter to reach 50:

    int main()
    {
        int fds[2];
        pipe(fds);
        pthread_rwlock_init(&rwlock, 0);
        pthread_mutex_lock(&mutex);
        for (int i = 0; i < CONC; i++) {
            pthread_t tid;
            pthread_create(&tid, NULL, routine, fds);
        }
        // Wait until all CONC threads hold the read lock at the same time.
        while (count < CONC) pthread_cond_wait(&cond, &mutex);
        pthread_mutex_unlock(&mutex);
        std::cout << "count: " << count << std::endl;
        // Join the threads in the order in which they finish.
        for (int i = 0; i < CONC; i++) {
            pthread_t tid;
            read(fds[0], &tid, sizeof(tid));
            pthread_join(tid, 0);
        }
        pthread_rwlock_destroy(&rwlock);
        pthread_exit(0);
    }

Edit: Simplified the example using C++11 thread support:

    #include <pthread.h>
    #include <unistd.h>
    #include <condition_variable>
    #include <iostream>
    #include <mutex>
    #include <thread>
    #include <vector>

    enum { CONC = 1000 };

    std::vector<std::thread> threads;
    pthread_rwlock_t rwlock;
    std::mutex mutex;
    std::condition_variable cond;
    unsigned count;  // number of threads currently holding the read lock

    void *routine(int self)
    {
        pthread_rwlock_rdlock(&rwlock);
        {
            std::unique_lock<std::mutex> lk(mutex);
            if (++count == CONC) cond.notify_one();
        }
        sleep(5);  // keep the read lock held so the readers overlap
        pthread_rwlock_unlock(&rwlock);
        return 0;
    }

    int main()
    {
        pthread_rwlock_init(&rwlock, 0);
        {
            std::unique_lock<std::mutex> lk(mutex);
            for (int i = 0; i < CONC; i++) {
                threads.push_back(std::thread(routine, i));
            }
            // Wait until all CONC threads hold the read lock at the same time.
            cond.wait(lk, [](){ return count == CONC; });
        }
        std::cout << "count: " << count << std::endl;
        for (int i = 0; i < CONC; i++) {
            threads[i].join();
        }
        pthread_rwlock_destroy(&rwlock);
        pthread_exit(0);
    }

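For a quicker check that does not need a thousand threads or a five-second sleep, the shared nature of read locks can also be probed from a single thread with the non-blocking calls. This is a minimal sketch, relying on a rwlock with default attributes and on POSIX allowing one thread to hold several read locks on the same rwlock; the function name `count_shared_read_locks` is hypothetical.

```cpp
#include <cerrno>
#include <pthread.h>

// Hypothetical helper: tries to take `n` read locks on one rwlock from the
// calling thread and returns how many were granted. While they are held,
// *write_rc receives the result of a non-blocking write-lock attempt.
int count_shared_read_locks(int n, int *write_rc) {
    pthread_rwlock_t lock;
    if (pthread_rwlock_init(&lock, NULL) != 0) return -1;

    int granted = 0;
    for (int i = 0; i < n; ++i) {
        if (pthread_rwlock_tryrdlock(&lock) == 0) ++granted;
    }

    // With readers present, a writer should be refused rather than admitted.
    *write_rc = pthread_rwlock_trywrlock(&lock);
    if (*write_rc == 0) pthread_rwlock_unlock(&lock);  // not expected to happen

    for (int i = 0; i < granted; ++i) pthread_rwlock_unlock(&lock);
    pthread_rwlock_destroy(&lock);
    return granted;
}
```

If every `pthread_rwlock_tryrdlock()` call succeeds and the write attempt reports `EBUSY`, the lock is admitting readers concurrently. Threads seen waiting inside `pthread_rwlock_rdlock()` are then more plausibly waiting on the library's internal lock than on the rwlock itself.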