环境描述

自行下载rpm包本地安装或者wget创建PostgreSQL的源文件。本次记录为内网服务器,rpm本地安装,版本为PostgreSQL-9.6,经测试,9.6主从流复制可切换,9.4测试未果。

位于/share/rpms/psql-need/目录下。

机器规划

Master server

系统版本机器配置主机名IP
CentOS Linux release 7.4.170848C/256G/33TBBus-App-21190.176.35.170
CentOS Linux release 7.4.170848C/256G/33TBBus-App-22190.176.35.171

Master Server

以下命令在rpm包目录下执行。

# yum install ./postgresql96\*# mkdir -p /data/pgsql-9.6/archive# chown postgres.postgres /data/pgsql-9.6# chmod 0700 /data/pgsql-9.6# vim /usr/lib/systemd/system/postgresql-9.6.serviceEnvironment=pgsql-9.6=/data/pgsql-9.6/# /usr/pgsql-9.6/bin/initdb -D /data/pgsql-9.6# systemctl enable postgresql-9.6# systemctl start postgresq-9.6# passwd postgres# su postgres$ psql –U postgres=$ alter user postgres with password ‘Ur passwd’;=$\q# vim /data/pgsql-9.6/postgresql.conflisten_addresses = '*'max_connections = 500superuser_reserved_connections = 5shared_buffers = 128MBdynamic_shared_memory_type = posixwal_level = hot_standbysynchronous_commit = localarchive_mode = onarchive_command = 'cp %p /data/pgsql-9.6/archive/%f'max_wal_senders = 5# vim /data/pgsql-9.6/pg_hba.confhost    replication     postgres        127.0.0.1/32              md5host    replication     postgres        190.176.35.0/24                 md5# systemctl restart postgresql-9.6

Slave Server

以下命令在rpm目录下执行。

# yum install postgresql96*# vim /usr/lib/systemd/system/postgresql-9.6.serviceEnvironment=pgsql-9.6=/data/pgsql-9.6/# mkdir -p /data/pgsql-9.6/archive# chown postgres.postgres /data/pgsql-9.6#chmod 0700 /data/pgsql-9.6# su postgres$ pg_basebackup -h 190.176.35.170 -U postgres -D /data/pgsql-9.6 -P --xlog$ vim /data/pgsql-9.6/postgresql.confhot_standby = onmax_connections = 510 #于主库数值max_wal_senders = 5$ cp /usr/pgsql-9.6/share/recovery.conf.sample /data/pgsql-9.6/$ mv /data/pgsql-9.6/recovery.conf.sample /data/pgsql-9.6/recovery.conf$ vim /data/pgsql-9.6/recovery.confrecovery_target_timeline = 'latest'standby_mode = onprimary_conninfo = 'host=主库’s IP port=5432 user=postgres password=密码'trigger_file = '/data/pgsql-9.6/trigger.pg$ exit# systemctl enable postgresql-9.6# systemctl start postgresql-9.6

测试

在Master所在服务器查看状态:

$ psql -c "select application_name,state,sync_priority,sync_state from pg_stat_replication"application_name |   state   | sync_priority | sync_state------------------+-----------+---------------+------------ Bus-App-22       | streaming |             1 | sync(1 row)

测试复制功能

Keepalived-1.2

自行安装软件并测试使用

配置文件配置exm如下

# cat /etc/keepalived/keepalived.confglobal_defs {   smtp_server 127.0.0.1   smtp_connect_timeout 30   router_id pg}vrrp_script chk_pgsql {   script "/etc/keepalived/scripts/pgsql_check.sh"   interval 1   weight 2}vrrp_instance VI_1 {    state MASTER #备机为BACKUP    interface enp2s0f0    virtual_router_id 61    priority 100 #备机为80    nopreempt    advert_int 1    authentication {        auth_type PASS        auth_pass 1111    }    track_script {       chk_pgsql    }    virtual_ipaddress {        190.176.35.210    }}

测试vip的浮动

Keepalived脚本

脚本路径可自己在keeplived.conf中定义,注意脚本权限。

# cat /etc/keepalived/scripts/pgsql_check.sh#!/bin/bash#判断pg是否活着A=`ps -C postmaster --no-header | wc -l`#判断vip浮到哪里B=`ip a | grep 190.176.35.210 | wc -l`#判断是否是从库处于等待的状态C=`ps -ef | grep postgres | grep 'startup process' | wc -l`#判断从库链接主库是否正常D=`ps -ef | grep postgres | grep 'receiver' | wc -l`#判断主库连接从库是否正常E=`ps -ef | grep postgres | grep 'sender' | wc -l`#如果pg死了,将消息写入日记并且关闭keepalivedif [ $A -eq 0 ];then    echo "\`date "+%Y-%m-%d--%H:%M:%S"\` postgresql stop so vip stop " >> /etc/keepalived/log/check_pg.log    systemctl stop keepalivedelse        #判断出主挂了,vip浮到了从,创建trigger,pg备机进入recovery模式,赋予读写权限        if [ $B -eq 1 -a $C -eq 1 -a $D -eq 0 \];then                su - postgres -c "touch /data/pgsql-9.6/[trigger.pg](http://trigger.pg/)"                echo "`date "+%Y-%m-%d--%H:%M:%S"` standby promote " >> /etc/keepalived/log/check_pg.log        fi        #判断出自己是主并且和从失去联系        if [ $B -eq 1 -a $C -eq 0 -a $D -eq 0 -a $E -eq 0 ];then                                sleep 10                echo "`date "+%Y-%m-%d--%H:%M:%S"` can't find standby " >> /etc/keepalived/log/check_pg.log        fifi

以下命令为手动指定keepalived配置文件,并启动的命令

# /usr/local/keepalived/sbin/keepalived -D -f /usr/local/keepalived/etc/keepalived/keepalived.conf

异步主从流复制主从切换与恢复MASTER

以下红色标记的命令为流复制主从切换,与恢复MASTER的操作

模拟关闭主库,演练MASTER服务宕机

# systemcrl stop postgresql-9.6

SLAVE上操作

在从库创建trigger(此步骤部署于高可用脚本中,触发自动推读写数据库切换)

$ touch /data/pgsql-9.6/trigger.pg

旧备库recovery.conf变为recovery.done,旧备库成为新主库

恢复操作

MASTER上操作

$ mv /data/pgsql-9.6/recovery.done /data/pgsql-9.6/recovery.conf

如第一次触发HA,则将recovery.done拷贝到MASTER,并修改以下内容

primary_conninfo = 'host=新主库’s IP port=5432 user=postgres password=密码'

如第一次出发HA,则修改MASTER配置文件,MASTER现在为新备库,所以postgersql.conf中的max_connections应该大于新主库

$ vim /data/pgsql-9.6/postgresql.confhot_standby = on #检查是否开启max_connections = (每次恢复都必须修改新备机的这条配置,并比新主机的大)

启动新备库与新备库的keepalived

# systemctl restart postgresql-9.6

以上操作完后应该主备流复制MASTER为备receiver,SLAVE为主sender。使用命令# ps -ef | grep postgres可查看

以下一条操作VIP从SLAVE跳至MASTER

# systemctl restart keepalived# ip a #查看vip应跳回MASTER

SLAVE上操作

关闭新主库postgresql服务

# systemctl stop postgresql-9.6$ mv /data/pgsql-9.6/recovery.done /data/pgsql-9.6/recovery.conf$ vim /data/pgsql-9.6/postgresql.confmax_connections = (每次恢复都必须修改新备机的这条配置,并比新主机的大)# systemctl start postgresql-9.6# systemctl start keepalived

以上操作完后应该主备流复制MASTER为备sender,SLAVE为主receiver。使用命令# ps -ef | grep postgres可查看

在MASTER上查看流复制状态

$ psqlpostgres=# select client_addr,state,sync_state from pg_stat_replication;  client_addr   |   state   | sync_state----------------+-----------+------------ (主库’s IP) | streaming | async(1 row)

MASTER为发送端,不存在startup process并存在sender进程判定为主库

$ ps -ef | grep postgrespostgres   378     1  0 11:32 pts/0    00:00:00 /usr/pgsql-9.6/bin/postgres -D /data/pgsql-9.6postgres   379   378  0 11:32 ?        00:00:00 postgres: logger process                   postgres   381   378  0 11:32 ?        00:00:00 postgres: checkpointer process             postgres   382   378  0 11:32 ?        00:00:00 postgres: writer process                   postgres   383   378  0 11:32 ?        00:00:00 postgres: wal writer process               postgres   384   378  0 11:32 ?        00:00:00 postgres: autovacuum launcher process      postgres   385   378  0 11:32 ?        00:00:00 postgres: archiver process                 postgres   387   378  0 11:32 ?        00:00:00 postgres: stats collector process          postgres   464   378  0 11:47 ?        00:00:00 postgres: wal sender process postgres 192.168.40.185(50096) streaming 0/90038A0root      1293  1258  0 14:13 pts/1    00:00:00 su postgrespostgres  1294  1293  0 14:13 pts/1    00:00:00 bashpostgres  1417  1294  0 14:38 pts/1    00:00:00 ps -efpostgres  1418  1294  0 14:38 pts/1    00:00:00 grep postgres

SLAVE为接收端,存在startup process与wal receiver进程并存则判定为备库

$ ps -ef | grep postgresroot      1293  1258  0 14:13 pts/1    00:00:00 su postgrespostgres  1294  1293  0 14:13 pts/1    00:00:00 bashpostgres  1448     1  0 14:43 pts/1    00:00:00 /usr/pgsql-9.6/bin/postgres -D /data/pgsql-9.6postgres  1449  1448  0 14:43 ?        00:00:00 postgres: logger process                   postgres  1450  1448  0 14:43 ?        00:00:00 postgres: startup process   recovering 00000003000000000000000Apostgres  1451  1448  0 14:43 ?        00:00:00 postgres: checkpointer process             postgres  1452  1448  0 14:43 ?        00:00:00 postgres: writer process                   postgres  1453  1448  0 14:43 ?        00:00:00 postgres: stats collector process          postgres  1454  1448  0 14:43 ?        00:00:00 postgres: wal receiver process   streaming 0/A001060postgres  1457  1294  0 14:44 pts/1    00:00:00 ps -efpostgres  1458  1294  0 14:44 pts/1    00:00:00 grep postgres

测试新主库读写

测试数据同步功能

注意事项

  • 修改默认data目录时,要注意新目录权限必须为0700,属主属组必须为postgres。
  • 务必设置postgres用户的数据库密码:postgres=# alter user postgres with password ‘Ur passwd’;,否则配置主从流复制,备机pg_basebackup会出现认证错误。
  • 注意pg_hba.conf 文件的权限问题。
  • psql: FATAL: the database system is starting up.修改配置文件hot_standby = on。
  • Keepalived脚本中ps -C postmaster,如果是使用postgres用户启动则是ps -C postgres,其他用户为postmaster。