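Overview
Many scenarios call for sharing in-memory data across a set of nodes. For example, given a group of horizontally equivalent gateways, any gateway node should be able to read specific in-memory information from all the gateways.
The usual approach is to use a service that provides distributed consistency guarantees, such as ZooKeeper or etcd.
Using ZooKeeper or etcd to synchronize data between nodes certainly works. However:
- Erlang built-in data types, such as pid, need extra serialization / deserialization.
- I did not want to introduce another complex system.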
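To make the first point concrete, here is a minimal sketch of the round trip a pid would need before it could be stored in an external service as bytes. The module name PidCodec is hypothetical and exists only for illustration; :erlang.term_to_binary/1 and :erlang.binary_to_term/1 are the standard BIFs for the external term format.

```elixir
defmodule PidCodec do
  # Hypothetical helper for illustration only.

  # Encode any Erlang term (here, a pid) into a binary using the
  # external term format, so it can be stored as an opaque value.
  def encode(term), do: :erlang.term_to_binary(term)

  # Decode the binary back into the original term.
  def decode(binary) when is_binary(binary), do: :erlang.binary_to_term(binary)
end

pid = self()
binary = PidCodec.encode(pid)
IO.inspect(PidCodec.decode(binary) == pid)
```

Every reader and writer of the external store would have to go through such a step, whereas a store that holds Erlang terms natively needs no conversion.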
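I ended up sharing the data with a gossip protocol. It is simple and easy to control, it addresses both pain points above, and it can provide eventual consistency between nodes.
Erlang's native Mnesia also looks like a good fit for this scenario. When I made the original choice I did not understand Mnesia's implementation thoroughly, so this post explores whether Mnesia would have been viable and how it works:
- How are distributed transactions implemented?
- How is data synchronized when a new node joins?
- Is there a notion of a master node? How does it recover after a network partition?
- What level of consistency does it guarantee?
Sharing data between nodes with Mnesia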
~/platform/launcher(master*) » iex --sname t1
Erlang/OTP 21 [erts-10.3.5.6] [64-bit] [smp:8:8] [ds:8:8:10] [async-threads:1] [hipe]
Interactive Elixir (1.7.3) - press Ctrl+C to exit (type h() ENTER for help)
iex(t1@ubuntu)1> alias :mnesia, as: Mnesia
:mnesia
iex(t1@ubuntu)2> Mnesia.start()
:ok
iex(t1@ubuntu)3> Mnesia.create_table(Person, [attributes: [:id, :name, :job]])
{:atomic, :ok}
iex(t1@ubuntu)4> Mnesia.dirty_write({Person, 1, "Seymour Skinner", "Principal"})
:ok
iex(t1@ubuntu)5> Mnesia.dirty_read({Person, 1})
[{Person, 1, "Seymour Skinner", "Principal"}]
iex(t1@ubuntu)6> Mnesia.table_info(Person, :all)
[
access_mode: :read_write,
active_replicas: [:t1@ubuntu],
all_nodes: [:t1@ubuntu],
arity: 4,
attributes: [:id, :name, :job],
checkpoints: [],
commit_work: [],
cookie: {{1593853684922256987, -576460752303423391, 1}, :t1@ubuntu},
cstruct: {:cstruct, Person, :set, [:t1@ubuntu], [], [], [], 0, :read_write,
false, [], [], false, Person, [:id, :name, :job], [], [], [],
{{1593853684922256987, -576460752303423391, 1}, :t1@ubuntu}, {{2, 0}, []}},
disc_copies: [],
disc_only_copies: [],
external_copies: [],
frag_properties: [],
index: [],
index_info: {:index, :set, []},
load_by_force: false,
load_node: :t1@ubuntu,
load_order: 0,
load_reason: {:dumper, :create_table},
local_content: false,
majority: false,
master_nodes: [],
memory: 321,
ram_copies: [:t1@ubuntu],
record_name: Person,
record_validation: {Person, 4, :set},
size: 1,
snmp: [],
storage_properties: [],
storage_type: :ram_copies,
subscribers: [],
type: :set,
user_properties: [],
version: {{2, 0}, []},
where_to_commit: [t1@ubuntu: :ram_copies],
where_to_read: :t1@ubuntu,
where_to_wlock: {[:t1@ubuntu], false},
where_to_write: [:t1@ubuntu],
wild_pattern: {Person, :_, :_, :_}
]
Start t2
~/platform/launcher(master*) » iex --sname t2
Erlang/OTP 21 [erts-10.3.5.6] [64-bit] [smp:8:8] [ds:8:8:10] [async-threads:1] [hipe]
Interactive Elixir (1.7.3) - press Ctrl+C to exit (type h() ENTER for help)
iex(t2@ubuntu)1> alias :mnesia, as: Mnesia
:mnesia
iex(t2@ubuntu)2> Mnesia.start()
:ok
Copy the table to t2 and verify on t2.
iex(t1@ubuntu)9> Mnesia.add_table_copy(Person, :t2@ubuntu, :ram_copies)
{:atomic, :ok}
iex(t1@ubuntu)10> Mnesia.change_config(:extra_db_nodes, [:t2@ubuntu])
iex(t2@ubuntu)4> Mnesia.dirty_read({Person, 1})
[{Person, 1, "Seymour Skinner", "Principal"}]
Writes from t2 are also readable on t1.
iex(t2@ubuntu)5> Mnesia.dirty_write({Person, 2, "Homer Simpson", "Safety Inspector"})
:ok
iex(t1@ubuntu)11> Mnesia.dirty_read({Person, 2})
[{Person, 2, "Homer Simpson", "Safety Inspector"}]
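If t2 restarts, t1 has to run change_config again before t2 resynchronizes its data from t1. Presumably this is because the schema on t2 is kept only in RAM in this setup, so t2 no longer knows about the cluster after a restart.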
~/platform/launcher(master*) » iex --sname t2
Erlang/OTP 21 [erts-10.3.5.6] [64-bit] [smp:8:8] [ds:8:8:10] [async-threads:1] [hipe]
Interactive Elixir (1.7.3) - press Ctrl+C to exit (type h() ENTER for help)
iex(t2@ubuntu)1> alias :mnesia, as: Mnesia
:mnesia
iex(t2@ubuntu)2> Mnesia.start()
:ok
iex(t2@ubuntu)3> Mnesia.dirty_read({Person, 1})
** (exit) {:aborted, {:no_exists, [Person, 1]}}
(mnesia) mnesia.erl:355: :mnesia.abort/1
iex(t1@ubuntu)12> Mnesia.change_config(:extra_db_nodes, [:t2@ubuntu])
{:ok, [:t2@ubuntu]}
iex(t2@ubuntu)3> Mnesia.dirty_read({Person, 1})
[{Person, 1, "Seymour Skinner", "Principal"}]
iex(t2@ubuntu)4> Mnesia.dirty_read({Person, 2})
[{Person, 2, "Homer Simpson", "Safety Inspector"}]
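The manual steps used so far (start Mnesia, connect with change_config, then add_table_copy) could be wrapped in a small helper that a node calls on every start. This is only a sketch under assumptions: the module ClusterJoin, its function names, and the 5 second timeout are hypothetical, and it assumes the peer is already running Mnesia with the table created. It relies on change_config returning {:ok, []} when no listed node could be reached.

```elixir
defmodule ClusterJoin do
  # Hypothetical helper, not part of Mnesia.
  # Connects the local node to `peer` and keeps a RAM replica of `table`.
  def join(peer, table, timeout \\ 5_000) do
    :ok = :mnesia.start()

    case :mnesia.change_config(:extra_db_nodes, [peer]) do
      # At least one node was reached: the schema is now shared.
      {:ok, [_ | _]} -> copy_table(table, timeout)
      # No listed node was reachable.
      {:ok, []} -> {:error, :peer_unreachable}
      {:error, reason} -> {:error, reason}
    end
  end

  defp copy_table(table, timeout) do
    case :mnesia.add_table_copy(table, node(), :ram_copies) do
      {:atomic, :ok} ->
        :mnesia.wait_for_tables([table], timeout)

      # The local replica is already registered in the schema.
      {:aborted, {:already_exists, ^table, _node}} ->
        :mnesia.wait_for_tables([table], timeout)

      {:aborted, reason} ->
        {:error, reason}
    end
  end
end
```

On a restarted t2, a call such as ClusterJoin.join(:t1@ubuntu, Person) would stand in for the change_config that t1 ran by hand above; which side initiates the connection is a design choice of this sketch.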
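For a new node t3, running change_config on t1/t2 is enough; add_table_copy is not required. However, a node that has not been given a table copy cannot write on its own: after t1 and t2 are shut down, writes on t3 fail. If t1 is then restarted and change_config is run on t3, the schema is copied back to t1 and writes can be committed again, but all the earlier data is lost.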
iex(t3@ubuntu)6> Mnesia.dirty_write({Person, 4, "Person 4", "Safety Inspector"})
** (exit) {:aborted, {:no_exists, Person}}
(mnesia) mnesia.erl:355: :mnesia.abort/1
(mnesia) mnesia_tm.erl:1061: :mnesia_tm.dirty/2
iex(t3@ubuntu)6> Mnesia.table_info(Person, :all)
[
access_mode: :read_write,
active_replicas: [],
all_nodes: [:t2@ubuntu, :t1@ubuntu],
arity: 4,
attributes: [:id, :name, :job],
checkpoints: [],
commit_work: [],
cookie: {{1593853684922256987, -576460752303423391, 1}, :t1@ubuntu},
cstruct: {:cstruct, Person, :set, [:t2@ubuntu, :t1@ubuntu], [], [], [], 0,
:read_write, false, [], [], false, Person, [:id, :name, :job], [], [], [],
{{1593853684922256987, -576460752303423391, 1}, :t1@ubuntu},
{{3, 0}, {:t1@ubuntu, {1593, 853838, 540527}}}},
disc_copies: [],
disc_only_copies: [],
external_copies: [],
frag_properties: [],
index: [],
index_info: {:index, :set, []},
load_by_force: false,
load_node: :unknown,
load_order: 0,
load_reason: :unknown,
local_content: false,
majority: false,
master_nodes: [],
memory: 0,
ram_copies: [:t2@ubuntu, :t1@ubuntu],
record_name: Person,
record_validation: {Person, 4, :set},
size: 0,
snmp: [],
storage_properties: [],
storage_type: :unknown,
subscribers: [],
type: :set,
user_properties: [],
version: {{3, 0}, {:t1@ubuntu, {1593, 853838, 540527}}},
where_to_commit: [],
where_to_read: :nowhere,
where_to_wlock: {[], false},
where_to_write: [],
wild_pattern: {Person, :_, :_, :_}
]
iex(t3@ubuntu)7> Mnesia.change_config(:extra_db_nodes, [:t1@ubuntu])
{:ok, [:t1@ubuntu]}
iex(t3@ubuntu)8> Mnesia.table_info(Person, :all)
[
access_mode: :read_write,
active_replicas: [:t1@ubuntu],
all_nodes: [:t2@ubuntu, :t1@ubuntu],
arity: 4,
attributes: [:id, :name, :job],
checkpoints: [],
commit_work: [],
cookie: {{1593853684922256987, -576460752303423391, 1}, :t1@ubuntu},
cstruct: {:cstruct, Person, :set, [:t2@ubuntu, :t1@ubuntu], [], [], [], 0,
:read_write, false, [], [], false, Person, [:id, :name, :job], [], [], [],
{{1593853684922256987, -576460752303423391, 1}, :t1@ubuntu},
{{3, 0}, {:t1@ubuntu, {1593, 853838, 540527}}}},
disc_copies: [],
disc_only_copies: [],
external_copies: [],
frag_properties: [],
index: [],
index_info: {:index, :set, []},
load_by_force: false,
load_node: :unknown,
load_order: 0,
load_reason: :unknown,
local_content: false,
majority: false,
master_nodes: [],
memory: 0,
ram_copies: [:t2@ubuntu, :t1@ubuntu],
record_name: Person,
record_validation: {Person, 4, :set},
size: 0,
snmp: [],
storage_properties: [],
storage_type: :unknown,
subscribers: [],
type: :set,
user_properties: [],
version: {{3, 0}, {:t1@ubuntu, {1593, 853838, 540527}}},
where_to_commit: [t1@ubuntu: :ram_copies],
where_to_read: :t1@ubuntu,
where_to_wlock: {[:t1@ubuntu], false},
where_to_write: [:t1@ubuntu],
wild_pattern: {Person, :_, :_, :_}
]
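After t1 and t3 are shut down, writes on t2 still succeed.
Inferences
- Mnesia uses multi-master lock and commit.
- When a new node joins, data is actively copied from a specific node via change_config.
- There is no consistency guarantee, and split-brain is possible. (With majority: false, as shown in the table_info output, a write does not require more than half of the nodes to be alive.)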
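The majority: false entry in the table_info output corresponds to a table option that can be set at creation time. As a hedged sketch, the snippet below creates a table with majority: true on a single node and writes to it inside a transaction. The module name MajorityDemo and the table name :majority_demo are hypothetical. According to the Mnesia documentation, the option applies to non-dirty updates only, so the dirty_write calls used in the experiments above would not be covered by it.

```elixir
defmodule MajorityDemo do
  # Hypothetical demo module; run once per VM.
  def run do
    :ok = :mnesia.start()

    # Create a RAM table whose transactional updates require a majority
    # of the table's replicas to be available at commit time.
    {:atomic, :ok} =
      :mnesia.create_table(:majority_demo,
        attributes: [:id, :value],
        majority: true
      )

    # With a single replica, that replica alone is the majority,
    # so this transaction can commit.
    write_result =
      :mnesia.transaction(fn ->
        :mnesia.write({:majority_demo, 1, "hello"})
      end)

    {:mnesia.table_info(:majority_demo, :majority), write_result}
  end
end

IO.inspect(MajorityDemo.run())
```

With several replicas and a partition, transactions on the minority side would be expected to abort rather than diverge, which limits split-brain writes at the cost of availability. I have not rerun the experiments above with this option.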
Summary
Mnesia is not suitable for scenarios that require consistency.