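A few days ago rmmod failed on one of my test machines, and lsmod showed that the module's reference count was 1. I was certain nothing was using the module any more, so the stale count had to come from a bug in the code.
There is a very good article online that analyzes rmmod failures. It also provides source code that compiles into a .ko file: load that module and pass it the name of the problem module, and it forcibly clears the problem module's reference count, after which rmmod can remove it.
The article is at https://blog.csdn.net/gatieme...
The author also gives the GitHub address of the code: https://github.com/gatieme/LD...
However, the source code provided there targets kernel version 4 and above, while my machine runs 3.10, so the build fails with a compile error: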
    [root@controller22860 force_rmmod]# make
    echo /root/force_rmmod
    /root/force_rmmod
    echo 3.10.0-957.10.2.el7.x86_64
    3.10.0-957.10.2.el7.x86_64
    echo /lib/modules/3.10.0-957.10.2.el7.x86_64/build
    /lib/modules/3.10.0-957.10.2.el7.x86_64/build
    make -C /lib/modules/3.10.0-957.10.2.el7.x86_64/build M=/root/force_rmmod modules
    make[1]: Entering directory `/usr/src/kernels/3.10.0-957.10.2.el7.x86_64'
      CC [M]  /root/force_rmmod/force_rmmod.o
    /root/force_rmmod/force_rmmod.c: In function ‘force_cleanup_module’:
    /root/force_rmmod/force_rmmod.c:85:17: warning: format ‘%u’ expects argument of type ‘unsigned int’, but argument 4 has type ‘long unsigned int’ [-Wformat=]
                     mod->name ,mod->state, module_refcount(mod));
                     ^
    In file included from /root/force_rmmod/force_rmmod.c:6:0:
    /root/force_rmmod/force_rmmod.c:110:46: error: ‘struct module’ has no member named ‘refcnt’
             local_set((local_t*)per_cpu_ptr(&(mod->refcnt), cpu), 0);
                                                  ^
    include/asm-generic/local.h:29:43: note: in definition of macro ‘local_set’
     #define local_set(l,i) atomic_long_set((&(l)->a),(i))
                                               ^
    include/asm-generic/percpu.h:46:2: note: in expansion of macro ‘__verify_pcpu_ptr’
      __verify_pcpu_ptr((__p)); \
      ^
    include/linux/percpu.h:149:31: note: in expansion of macro ‘SHIFT_PERCPU_PTR’
     #define per_cpu_ptr(ptr, cpu) SHIFT_PERCPU_PTR((ptr), per_cpu_offset((cpu)))
                                   ^
    /root/force_rmmod/force_rmmod.c:110:29: note: in expansion of macro ‘per_cpu_ptr’
             local_set((local_t*)per_cpu_ptr(&(mod->refcnt), cpu), 0);
                                 ^
    compilation terminated due to -Wfatal-errors.
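So I took a quick look at the 3.10 source and found that its struct module really does not define a refcnt member. The code in force_rmmod.c therefore has to be modified. I changed the section that fails to compile into the following: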
    // Clear the driver's reference count
    int ref_cnt = 0;
    for_each_possible_cpu(cpu) {
        //local_set((local_t*)per_cpu_ptr(&(mod->refcnt), cpu), 0);
        //local_set(__module_ref_addr(mod, cpu), 0);
        // incs and decs are unsigned long in 3.10, so print them with %lu
        if (per_cpu_ptr(mod->refptr, cpu)->decs) {
            printk("module has dec %lu on cpu %d\n", per_cpu_ptr(mod->refptr, cpu)->decs, cpu);
            ref_cnt -= per_cpu_ptr(mod->refptr, cpu)->decs;
        }
        if (per_cpu_ptr(mod->refptr, cpu)->incs) {
            //module_put(mod);
            printk("module has inc %lu on cpu %d\n", per_cpu_ptr(mod->refptr, cpu)->incs, cpu);
            ref_cnt += per_cpu_ptr(mod->refptr, cpu)->incs;
        }
    }
    // Drop each outstanding reference through the normal module_put() path
    for (int i = 0; i < ref_cnt; i++) {
        module_put(mod);
    }
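The idea is simple: sum the per-CPU counters for the module (increments minus decrements) to get its current reference count, then call module_put that many times. That brings the reference count down to 0, after which rmmod works normally.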
I forked the author's repo and made this change there; the new file is at https://github.com/yzx1983/LD...