Redis: Locating and Troubleshooting Interface Errors Under High-Load Stress Testing

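Background:

The xx business interface mainly retrieves nationwide region information. In real production it is called frequently and returns a large amount of data, so the interface needs performance testing to confirm both its performance and how much load it can sustain.

Interface processing logic: retrieve nationwide region information. The first request reads the data from MySQL and writes it to Redis with HSET; every later request reads from Redis and returns the interface data.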
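
The read path described above can be sketched as follows. This is a minimal illustration, not the service's actual code: the class name AreaCacheSketch, the in-memory map standing in for the Redis hash, and the Supplier standing in for the MySQL query are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical sketch: an in-memory map stands in for the Redis hash and a
// Supplier stands in for the MySQL query; neither name comes from the article.
class AreaCacheSketch {
    private final Map<String, String> redisHash = new HashMap<>(); // stand-in for the Redis hash
    private final Supplier<Map<String, String>> mysqlLoader;       // stand-in for the MySQL query
    int dbLoads = 0;                                               // how many times the loader ran

    AreaCacheSketch(Supplier<Map<String, String>> mysqlLoader) {
        this.mysqlLoader = mysqlLoader;
    }

    Map<String, String> getAreaList() {
        if (redisHash.isEmpty()) {               // cache miss: only the first request
            dbLoads++;
            redisHash.putAll(mysqlLoader.get()); // write every field, as HSET would
        }
        return new HashMap<>(redisHash);         // later requests are served from the cache
    }

    public static void main(String[] args) {
        AreaCacheSketch cache =
                new AreaCacheSketch(() -> Map.of("110000", "Beijing", "310000", "Shanghai"));
        cache.getAreaList();
        cache.getAreaList();
        System.out.println("database loads: " + cache.dbLoads);
    }
}
```

Because only the first call reaches the loader, nearly every request in a stress test depends on borrowing a Redis connection, which is why the pool becomes the critical resource.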

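Problem:

With 20 concurrent users applying sustained load to the interface and the load rising steadily, the load generator observed error responses and connection failures (10 concurrent users ran normally).

The application service threw an exception: getList:merchant:area:list error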

redis.clients.jedis.exceptions.JedisConnectionException: Could not get a resource from the pool

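Investigation:

The exception shows that the application could not obtain another connection from the Redis connection pool. But 20 concurrent users is not heavy load, so the investigation continued:

Hardware resources: server utilization was normal. CPU, memory, and disk all had ample headroom, which rules out resource constraints.

Redis configuration and performance: the connection pool settings are redis.pool.maxIdle=300 and redis.pool.maxTotal=600. That maximum is more than enough for a sustained 20-concurrency test, yet the pool exception was still thrown, so some other factor must be involved.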

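The Redis connection count was normal: netstat -nap |grep redis |wc -l reported a little over 100 active connections.

redis-cli info showed normal connection information, also a little over 100 connections.

redis-cli monitor showed normal traffic: both GET and HGET returned data as expected.

The Redis log, however, contained the following warning: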

WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.

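Fix:

The warning in the Redis log shows that a Linux kernel parameter needs to be adjusted.

Run vim /etc/sysctl.conf, add vm.overcommit_memory=1, then run sysctl -p so that the configuration file takes effect.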

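Note: the kernel parameter vm.overcommit_memory sets the memory allocation policy and takes the value 0, 1, or 2:

0: the kernel uses a heuristic check to decide whether enough memory is available for the requesting process. If there is, the allocation is allowed; otherwise the allocation fails and an error is returned to the process.

1: the kernel always allows the allocation, regardless of the current memory state.

2: the kernel never overcommits. An allocation is refused if it would push total committed memory past swap plus a percentage of physical RAM set by vm.overcommit_ratio.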
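
To confirm that the change took effect after sysctl -p, the current value can be read back from /proc/sys/vm/overcommit_memory on Linux. The following is a small sketch under that assumption; the class name OvercommitCheck and the label strings are illustrative only.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Illustrative helper: reads the kernel's current overcommit mode on Linux.
class OvercommitCheck {
    // Maps the numeric mode to a short label matching the three policies above.
    static String describe(int mode) {
        switch (mode) {
            case 0: return "heuristic overcommit";
            case 1: return "always overcommit";
            case 2: return "no overcommit";
            default: return "unknown";
        }
    }

    public static void main(String[] args) throws IOException {
        Path p = Paths.get("/proc/sys/vm/overcommit_memory");
        if (!Files.isReadable(p)) {
            // Not a Linux host, or procfs is not available to this process.
            System.out.println("cannot read " + p);
            return;
        }
        int mode = Integer.parseInt(new String(Files.readAllBytes(p)).trim());
        System.out.println("vm.overcommit_memory=" + mode + " (" + describe(mode) + ")");
    }
}
```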

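Related information:

① When every Jedis instance in the JedisPool has been borrowed and the wait exceeds the configured MaxWaitMillis, "Could not get a resource from the pool" is thrown.

This can be seen in the borrowObject(long borrowMaxWaitMillis) method of the GenericObjectPool source: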

    if(p == null) {
        if(borrowMaxWaitMillis < 0L) {
            // negative wait: block until an idle object becomes available
            p = (PooledObject)this.idleObjects.takeFirst();
        } else {
            waitTime = System.currentTimeMillis();
            p = (PooledObject)this.idleObjects.pollFirst(borrowMaxWaitMillis, TimeUnit.MILLISECONDS);
            waitTime = System.currentTimeMillis() - waitTime; // p == null once the wait exceeds borrowMaxWaitMillis
        }
    }
    if(p == null) {
        throw new NoSuchElementException("Timeout waiting for idle object"); // caught by the Pool class
    }

Pool source:

    try {
        return this.internalPool.borrowObject();
    } catch (Exception var2) {
        throw new JedisConnectionException("Could not get a resource from the pool", var2);
    }
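
The two excerpts combine into one behavior: a timed wait on the idle queue returns null, that becomes a NoSuchElementException, and the wrapper rethrows it with the message seen in the application log. The sketch below reproduces that chain with only the JDK. TinyPoolSketch is a hypothetical stand-in, not the commons-pool2 or Jedis implementation, and it uses IllegalStateException where Jedis uses JedisConnectionException.

```java
import java.util.NoSuchElementException;
import java.util.concurrent.LinkedBlockingDeque;
import java.util.concurrent.TimeUnit;

// Simplified stand-in for a connection pool; not the commons-pool2 or Jedis code.
class TinyPoolSketch {
    private final LinkedBlockingDeque<String> idle = new LinkedBlockingDeque<>();

    TinyPoolSketch(int size) {
        for (int i = 0; i < size; i++) idle.addFirst("conn-" + i);
    }

    // Waits up to maxWaitMillis for an idle object; a negative value waits forever.
    String borrow(long maxWaitMillis) {
        try {
            String p = maxWaitMillis < 0
                    ? idle.takeFirst()
                    : idle.pollFirst(maxWaitMillis, TimeUnit.MILLISECONDS);
            if (p == null) { // the timed wait expired with nothing returned to the pool
                throw new NoSuchElementException("Timeout waiting for idle object");
            }
            return p;
        } catch (NoSuchElementException | InterruptedException e) {
            if (e instanceof InterruptedException) Thread.currentThread().interrupt();
            // Wrap the cause, as the Pool class does with JedisConnectionException.
            throw new IllegalStateException("Could not get a resource from the pool", e);
        }
    }

    void giveBack(String conn) { idle.addFirst(conn); }

    public static void main(String[] args) {
        TinyPoolSketch pool = new TinyPoolSketch(1);
        String held = pool.borrow(50);  // takes the only connection
        try {
            pool.borrow(50);            // nothing idle: times out after 50 ms
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage() + " <- " + e.getCause().getMessage());
        }
        pool.giveBack(held);
        System.out.println("after return: " + pool.borrow(50));
    }
}
```

The second borrow fails only because the first connection is still held when the wait expires. Connections that are held for a long time, or never returned, can therefore trigger the error even when maxTotal is far above the number of concurrent users.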

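So setting the Jedis MaxWaitMillis somewhat higher reduces the "Could not get a resource from the pool" errors that are caused by MaxWaitMillis. Setting it too high, however, costs a great deal of hardware capacity under highly concurrent load. Derive a reasonable value from the target concurrency of the stress test, so that performance is good without over-consuming server resources.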

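② Speed up borrowing Jedis instances from the JedisPool and returning them to it

Set both testOnBorrow and testOnReturn to false.

When these two settings are true, Jedis pings Redis every time an instance is borrowed or returned.

This can be seen in the borrowObject(long borrowMaxWaitMillis) method of the GenericObjectPool source: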

    if(p != null && (this.getTestOnBorrow() || create && this.getTestOnCreate())) {
        boolean validate = false;
        Throwable validationThrowable1 = null;
        try {
            validate = this.factory.validateObject(p); // check that the object is usable before handing it out
        } catch (Throwable var13) {
            PoolUtils.checkRethrow(var13);
            validationThrowable1 = var13;
        }
        // excerpt ends here; the rest of this block is not shown
    }

The validateObject(PooledObject<Jedis> pooledJedis) method in the JedisFactory source shows what that validation does:

    BinaryJedis jedis = (BinaryJedis)pooledJedis.getObject();
    try {
        // one PING round trip to Redis for every validation
        return jedis.isConnected() && jedis.ping().equals("PONG");
    } catch (Exception var4) {
        return false;
    }
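
To make the cost concrete, the sketch below counts how many validation calls a series of borrow and return cycles triggers with testOnBorrow on and off. It is an assumption-laden model rather than the real pool: ValidatingPoolSketch and countPings are hypothetical names, and a counting Predicate stands in for the PING round trip.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Predicate;

// Simplified stand-in for a pool with a testOnBorrow switch; not the commons-pool2 code.
class ValidatingPoolSketch {
    private final Deque<String> idle = new ArrayDeque<>();
    private final boolean testOnBorrow;
    private final Predicate<String> validator; // stand-in for the PING round trip

    ValidatingPoolSketch(int size, boolean testOnBorrow, Predicate<String> validator) {
        for (int i = 0; i < size; i++) idle.addFirst("conn-" + i);
        this.testOnBorrow = testOnBorrow;
        this.validator = validator;
    }

    String borrow() {
        String p = idle.pollFirst();
        if (p != null && testOnBorrow && !validator.test(p)) {
            return null; // a real pool would destroy p and try another object
        }
        return p;
    }

    void giveBack(String conn) { idle.addFirst(conn); }

    // Runs borrow/return cycles and reports how many validation calls they cost.
    static int countPings(boolean testOnBorrow, int cycles) {
        AtomicInteger pings = new AtomicInteger();
        ValidatingPoolSketch pool = new ValidatingPoolSketch(1, testOnBorrow, c -> {
            pings.incrementAndGet();
            return true; // pretend Redis always answers PONG
        });
        for (int i = 0; i < cycles; i++) {
            pool.giveBack(pool.borrow());
        }
        return pings.get();
    }

    public static void main(String[] args) {
        System.out.println("testOnBorrow=true:  " + countPings(true, 1000) + " validations");
        System.out.println("testOnBorrow=false: " + countPings(false, 1000) + " validations");
    }
}
```

In this model each borrow with testOnBorrow enabled costs exactly one extra validation call. Against a real Redis server that call is a network round trip, so its share of total request time depends on latency and on how much work each request does.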

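In most cases the changes above are enough to resolve "Could not get a resource from the pool".

With testOnBorrow and testOnReturn both set to false, Jedis operations ran about 1.4 times faster than with both set to true, although the problems this change may introduce deserve further study.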