Docker: Swap Memory for Containers

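On Linux, swap space is typically a disk partition. When memory is tight, data held in memory can be written out to the swap partition; when that data needs to be read or written again, it is read back from the swap partition into memory.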

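I. The downside of swap for containers

When a container uses swap, the Memory cgroup limit on the container's memory stops being effective.

For example, on a node with swap enabled, start a container:

The container's Memory cgroup limit is 100 MB;
A process in the container keeps allocating memory, 1 GB in total.

Although the container's memory is limited to 100 MB, swap lets the 1 GB allocation succeed (pages are swapped in and out), and the process is not OOM-killed.

In other words, when swap is in use, the Memory cgroup limit no longer caps the container's memory.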
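The arithmetic behind this example can be sketched in a few lines of Python. This is only a simplified model, not a measurement: it assumes the host has enough swap and that no separate memory+swap limit is set, so everything above the cgroup limit ends up in swap. The function name `split_usage` is made up for illustration.

```python
from typing import Tuple

MB = 1024 * 1024


def split_usage(requested_bytes: int, limit_bytes: int) -> Tuple[int, int]:
    """Return (resident, swapped) bytes for the simplified scenario.

    Assumption: swap is large enough and no memory+swap limit applies,
    so anything above the cgroup memory limit is pushed out to swap.
    """
    resident = min(requested_bytes, limit_bytes)
    swapped = requested_bytes - resident
    return resident, swapped


resident, swapped = split_usage(1024 * MB, 100 * MB)
print(f"resident: {resident // MB} MB, swapped: {swapped // MB} MB")
# prints "resident: 100 MB, swapped: 924 MB"
```

Under these assumptions, roughly nine tenths of the process's memory lives on disk, which is why the 100 MB limit never triggers an OOM kill.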

II. Disabling swap for a container (recommended)

When starting the container, add the flag --memory-swappiness=0 to stop the container from using swap:

--memory-swappiness details
A value of 0 turns off anonymous page swapping.
A value of 100 sets all anonymous pages as swappable.
By default, if you do not set --memory-swappiness, the value is inherited from the host machine.
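As a minimal sketch, the following Python helper assembles a `docker run` command line that carries this flag. The helper name `docker_run_no_swap`, the `busybox` image, and the 100m limit are illustrative assumptions; only the `--memory` and `--memory-swappiness` flags come from the text above. The code only builds and prints the command, it does not start a container.

```python
import shlex
from typing import List


def docker_run_no_swap(image: str, memory: str) -> List[str]:
    """Build a `docker run` argument list with anonymous page swapping off."""
    return [
        "docker", "run", "--rm",
        f"--memory={memory}",        # physical memory limit
        "--memory-swappiness=0",     # 0 turns off anonymous page swapping
        image,
    ]


cmd = docker_run_no_swap("busybox", "100m")  # image and limit are placeholders
print(shlex.join(cmd))
# prints "docker run --rm --memory=100m --memory-swappiness=0 busybox"
```

The resulting list can be passed to `subprocess.run` on a host where Docker is installed.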

Setting --memory and --memory-swap to the same value also prevents the container from using swap:

Prevent a container from using swap
If --memory and --memory-swap are set to the same value, this prevents containers from using any swap.
This is because --memory-swap is the amount of combined memory and swap that can be used, while --memory is only the amount of physical memory that can be used.
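Because --memory-swap is the combined total, the swap a container may use is the difference between the two values. The sketch below works through that subtraction. The handling of an unset, 0, or -1 value follows my reading of the Docker resource constraints documentation and should be checked against the Docker version in use; the function name `swap_allowance` is hypothetical.

```python
from typing import Optional

MB = 1024 * 1024


def swap_allowance(memory: int, memory_swap: Optional[int] = None) -> Optional[int]:
    """Return the swap (in bytes) a container may use, or None for unlimited.

    memory      -- value of --memory in bytes
    memory_swap -- value of --memory-swap in bytes (memory + swap combined)
    """
    if memory_swap == -1:
        return None  # unlimited swap, up to what the host has available
    if memory_swap is None or memory_swap == 0:
        return memory  # assumed default: as much swap as --memory
    if memory_swap < memory:
        raise ValueError("--memory-swap must not be smaller than --memory")
    return memory_swap - memory


print(swap_allowance(100 * MB, 100 * MB))         # prints 0: no swap at all
print(swap_allowance(300 * MB, 1024 * MB) // MB)  # prints 724
```

With both flags set to the same value the difference is zero, which matches the quoted statement.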

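III. Letting a container use swap

Some programs may need swap space so that an occasional sudden spike in memory usage does not get them OOM-killed.

So what does a container process's memory consist of, and when memory is tight, which part is written to swap first?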

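1. Container process memory

A container process's memory has two main parts:

RSS: Resident Set Size
- RSS memory is mainly memory obtained through malloc(), also called anonymous memory;
- With swap enabled, RSS memory can be written to swap space when memory is tight;
Page Cache:
- When files on disk are accessed, Linux tries to use the system's free memory as Page Cache to improve file read/write performance;
- Once memory runs short, memory reclaim kicks in and Page Cache is reclaimed;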

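So when memory is tight, how does Linux choose between releasing Page Cache and writing anonymous memory to swap?

That is determined by the container's swappiness parameter.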

2. The container's swappiness parameter

Under the Memory cgroup, every container has a memory.swappiness parameter:

# cat memory.swappiness
60
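The same value can be read programmatically. The sketch below assumes a cgroup v1 layout with the memory controller mounted at /sys/fs/cgroup/memory; that default path is an assumption and varies between hosts and between the inside and outside of a container. The helper name `read_swappiness` is illustrative.

```python
from pathlib import Path

# Assumption: cgroup v1 memory controller mounted at this location.
DEFAULT_PATH = "/sys/fs/cgroup/memory/memory.swappiness"


def read_swappiness(path: str = DEFAULT_PATH) -> int:
    """Parse the integer stored in a memory.swappiness file."""
    return int(Path(path).read_text().strip())


if __name__ == "__main__":
    try:
        print(read_swappiness())
    except (OSError, ValueError) as exc:
        # The file is missing on hosts that do not use this layout.
        print(f"memory.swappiness not readable: {exc}")
```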

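The value of this parameter sets the relative priority of releasing Page Cache versus writing anonymous memory to swap:

Default (60):
- releasing Page Cache is preferred;
When swappiness=100:
- releasing Page Cache and writing anonymous memory to swap have equal priority and proceed in equal proportion;
When swappiness=0:
- swap is disabled;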
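One rough way to picture these three cases is the weighting that older Linux kernels applied during reclaim, where anonymous pages were weighted by swappiness and file pages by 200 minus swappiness. The sketch below reproduces only that weighting. Newer kernels balance the two differently, and within a memory cgroup a value of 0 is treated as a special case (no swapping), so the numbers are an illustration, not a description of any specific kernel. The function name `reclaim_weights` is hypothetical.

```python
from typing import Tuple


def reclaim_weights(swappiness: int) -> Tuple[int, int]:
    """Return (anon_weight, file_weight) for a swappiness value in 0..100.

    Simplified model: anonymous pages are weighted by swappiness,
    Page Cache (file pages) by 200 - swappiness.
    """
    if not 0 <= swappiness <= 100:
        raise ValueError("swappiness must be between 0 and 100")
    return swappiness, 200 - swappiness


for value in (0, 60, 100):
    anon, file = reclaim_weights(value)
    print(f"swappiness={value}: anon={anon}, file={file}")
```

At 60 the file weight (140) exceeds the anonymous weight (60), so Page Cache is released first; at 100 the two weights are equal; at 0 the anonymous weight is zero.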

References

1. Docker swap parameters: https://docs.docker.com/config/containers/resource_constraint...