Overview
Many scenarios call for sharing in-memory data across a set of nodes. For example, with a fleet of horizontally peered gateways, any gateway node should be able to read certain in-memory state of all the other gateways.

The usual approach is to use a service that provides distributed consistency guarantees, such as ZooKeeper or etcd.

Synchronizing data between nodes via ZooKeeper or etcd certainly works, but:

- Erlang's built-in data types, such as pids, would need extra serialization/deserialization handling.
- I did not want to introduce another complex system.

I ended up sharing the data with a gossip protocol: it is simple and easy to control, it removes both pain points above, and it still achieves eventual consistency across nodes.
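To show how little machinery this takes, here is a minimal sketch of the idea. The module name, message shapes, and the last-write-wins merge are illustrative assumptions rather than the production code, and the timestamps assume loosely synchronized wall clocks:

```elixir
defmodule Gossip do
  use GenServer

  @interval 1_000  # gossip round every second

  def start_link(_), do: GenServer.start_link(__MODULE__, %{}, name: __MODULE__)

  def put(key, value), do: GenServer.cast(__MODULE__, {:put, key, value})
  def get(key), do: GenServer.call(__MODULE__, {:get, key})

  @impl true
  def init(state) do
    schedule()
    {:ok, state}
  end

  @impl true
  def handle_cast({:put, key, value}, state) do
    # Wall-clock timestamp for last-write-wins; assumes loosely synced clocks.
    {:noreply, Map.put(state, key, {System.os_time(:millisecond), value})}
  end

  def handle_cast({:gossip, remote}, state) do
    # Merge a peer's view: for each key keep the entry with the newer timestamp.
    merged =
      Map.merge(state, remote, fn _k, {t1, _} = a, {t2, _} = b ->
        if t1 >= t2, do: a, else: b
      end)

    {:noreply, merged}
  end

  @impl true
  def handle_call({:get, key}, _from, state) do
    value =
      case Map.get(state, key) do
        {_ts, v} -> v
        nil -> nil
      end

    {:reply, value, state}
  end

  @impl true
  def handle_info(:tick, state) do
    # Push our full state to one random connected node; repeated rounds of
    # this anti-entropy exchange converge every node to the same view.
    case Node.list() do
      [] -> :ok
      peers -> GenServer.cast({__MODULE__, Enum.random(peers)}, {:gossip, state})
    end

    schedule()
    {:noreply, state}
  end

  defp schedule, do: Process.send_after(self(), :tick, @interval)
end
```

Because the gossip messages are ordinary Erlang terms sent over distribution, values such as pids travel between nodes with no extra serialization.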
Erlang's built-in mnesia also looks like a good fit for this scenario. When I made the original choice I did not understand mnesia's implementation thoroughly, so here I examine how feasible mnesia would have been and how it is actually implemented:

- How are distributed transactions implemented?
- When a new node joins, how is the data synchronized to it?
- Is there a notion of a master node? How is a network partition recovered from?
- What level of consistency does it guarantee?
Sharing data between nodes with mnesia
```
~/platform/launcher(master*) » iex --sname t1
Erlang/OTP 21 [erts-10.3.5.6] [source] [64-bit] [smp:8:8] [ds:8:8:10] [async-threads:1] [hipe]
Interactive Elixir (1.7.3) - press Ctrl+C to exit (type h() ENTER for help)
iex(t1@ubuntu)1> alias :mnesia, as: Mnesia
:mnesia
iex(t1@ubuntu)2> Mnesia.start()
:ok
iex(t1@ubuntu)3> Mnesia.create_table(Person, [attributes: [:id, :name, :job]])
{:atomic, :ok}
iex(t1@ubuntu)4> Mnesia.dirty_write({Person, 1, "Seymour Skinner", "Principal"})
:ok
iex(t1@ubuntu)5> Mnesia.dirty_read({Person, 1})
[{Person, 1, "Seymour Skinner", "Principal"}]
iex(t1@ubuntu)6> Mnesia.table_info(Person, :all)
[
  access_mode: :read_write,
  active_replicas: [:t1@ubuntu],
  all_nodes: [:t1@ubuntu],
  arity: 4,
  attributes: [:id, :name, :job],
  checkpoints: [],
  commit_work: [],
  cookie: {{1593853684922256987, -576460752303423391, 1}, :t1@ubuntu},
  cstruct: {:cstruct, Person, :set, [:t1@ubuntu], [], [], [], 0, :read_write, false, [], [], false, Person, [:id, :name, :job], [], [], [], {{1593853684922256987, -576460752303423391, 1}, :t1@ubuntu}, {{2, 0}, []}},
  disc_copies: [],
  disc_only_copies: [],
  external_copies: [],
  frag_properties: [],
  index: [],
  index_info: {:index, :set, []},
  load_by_force: false,
  load_node: :t1@ubuntu,
  load_order: 0,
  load_reason: {:dumper, :create_table},
  local_content: false,
  majority: false,
  master_nodes: [],
  memory: 321,
  ram_copies: [:t1@ubuntu],
  record_name: Person,
  record_validation: {Person, 4, :set},
  size: 1,
  snmp: [],
  storage_properties: [],
  storage_type: :ram_copies,
  subscribers: [],
  type: :set,
  user_properties: [],
  version: {{2, 0}, []},
  where_to_commit: [t1@ubuntu: :ram_copies],
  where_to_read: :t1@ubuntu,
  where_to_wlock: {[:t1@ubuntu], false},
  where_to_write: [:t1@ubuntu],
  wild_pattern: {Person, :_, :_, :_}
]
```
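Most of that dump can be ignored. For the experiments below, the items worth watching are the replica lists, which can also be queried individually (illustrative reads; both are documented table_info items):

```elixir
Mnesia.table_info(Person, :where_to_write)  # => [:t1@ubuntu], replicas a write must reach
Mnesia.table_info(Person, :where_to_read)   # => :t1@ubuntu, node that reads are served from
```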
Start t2:
```
~/platform/launcher(master*) » iex --sname t2
Erlang/OTP 21 [erts-10.3.5.6] [source] [64-bit] [smp:8:8] [ds:8:8:10] [async-threads:1] [hipe]
Interactive Elixir (1.7.3) - press Ctrl+C to exit (type h() ENTER for help)
iex(t2@ubuntu)1> alias :mnesia, as: Mnesia
:mnesia
iex(t2@ubuntu)2> Mnesia.start()
:ok
```
Copy the table to t2 and verify it there:
```
iex(t1@ubuntu)9> Mnesia.add_table_copy(Person, :t2@ubuntu, :ram_copies)
{:atomic, :ok}
iex(t1@ubuntu)10> Mnesia.change_config(:extra_db_nodes, [:t2@ubuntu])

iex(t2@ubuntu)4> Mnesia.dirty_read({Person, 1})
[{Person, 1, "Seymour Skinner", "Principal"}]
```
A write on t2 is readable on t1 as well:
```
iex(t2@ubuntu)5> Mnesia.dirty_write({Person, 2, "Homer Simpson", "Safety Inspector"})
:ok
iex(t1@ubuntu)11> Mnesia.dirty_read({Person, 2})
[{Person, 2, "Homer Simpson", "Safety Inspector"}]
```
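Everything above uses dirty operations, which bypass locking entirely. To exercise the distributed commit path (the first question in the overview), one would use the transactional API instead; a small illustrative example with a made-up record:

```elixir
# Unlike dirty_write, a transaction takes a write lock on every active
# replica before committing on all of them.
Mnesia.transaction(fn ->
  Mnesia.write({Person, 3, "Moe Szyslak", "Bartender"})
  Mnesia.read({Person, 3})
end)
# => {:atomic, [{Person, 3, "Moe Szyslak", "Bartender"}]}
```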
If t2 restarts, t1 must run change_config again before t2 resynchronizes the data from it:
```
~/platform/launcher(master*) » iex --sname t2
Erlang/OTP 21 [erts-10.3.5.6] [source] [64-bit] [smp:8:8] [ds:8:8:10] [async-threads:1] [hipe]
Interactive Elixir (1.7.3) - press Ctrl+C to exit (type h() ENTER for help)
iex(t2@ubuntu)1> alias :mnesia, as: Mnesia
:mnesia
iex(t2@ubuntu)2> Mnesia.start()
:ok
iex(t2@ubuntu)3> Mnesia.dirty_read({Person, 1})
** (exit) {:aborted, {:no_exists, [Person, 1]}}
    (mnesia) mnesia.erl:355: :mnesia.abort/1

iex(t1@ubuntu)12> Mnesia.change_config(:extra_db_nodes, [:t2@ubuntu])
{:ok, [:t2@ubuntu]}

iex(t2@ubuntu)3> Mnesia.dirty_read({Person, 1})
[{Person, 1, "Seymour Skinner", "Principal"}]
iex(t2@ubuntu)4> Mnesia.dirty_read({Person, 2})
[{Person, 2, "Homer Simpson", "Safety Inspector"}]
```
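This manual step is easy to forget. One way to paper over it is a small process that re-attaches peers whenever a node (re)connects; a sketch with a hypothetical module name:

```elixir
defmodule MnesiaRejoin do
  use GenServer

  def start_link(_), do: GenServer.start_link(__MODULE__, nil, name: __MODULE__)

  @impl true
  def init(_) do
    # Receive {:nodeup, n} / {:nodedown, n} messages from the kernel.
    :net_kernel.monitor_nodes(true)
    {:ok, nil}
  end

  @impl true
  def handle_info({:nodeup, _node}, state) do
    # Ask mnesia to merge schemas with every connected node; nodes that
    # are already attached are left as they are.
    :mnesia.change_config(:extra_db_nodes, Node.list())
    {:noreply, state}
  end

  def handle_info(_other, state), do: {:noreply, state}
end
```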
If a new node t3 joins, running change_config on t1/t2 is enough; add_table_copy is not required for reading. A node where add_table_copy has not been run, however, holds no replica of its own, so once t1 and t2 are both shut down, writes on t3 fail. If t1 is then restarted and t3 runs change_config, the schema is copied back to t1 and commits succeed again, but all the earlier data is lost.
```
iex(t3@ubuntu)6> Mnesia.dirty_write({Person, 4, "Person 4", "Safety Inspector"})
** (exit) {:aborted, {:no_exists, Person}}
    (mnesia) mnesia.erl:355: :mnesia.abort/1
    (mnesia) mnesia_tm.erl:1061: :mnesia_tm.dirty/2
iex(t3@ubuntu)6> Mnesia.table_info(Person, :all)
[
  access_mode: :read_write,
  active_replicas: [],
  all_nodes: [:t2@ubuntu, :t1@ubuntu],
  arity: 4,
  attributes: [:id, :name, :job],
  checkpoints: [],
  commit_work: [],
  cookie: {{1593853684922256987, -576460752303423391, 1}, :t1@ubuntu},
  cstruct: {:cstruct, Person, :set, [:t2@ubuntu, :t1@ubuntu], [], [], [], 0, :read_write, false, [], [], false, Person, [:id, :name, :job], [], [], [], {{1593853684922256987, -576460752303423391, 1}, :t1@ubuntu}, {{3, 0}, {:t1@ubuntu, {1593, 853838, 540527}}}},
  disc_copies: [],
  disc_only_copies: [],
  external_copies: [],
  frag_properties: [],
  index: [],
  index_info: {:index, :set, []},
  load_by_force: false,
  load_node: :unknown,
  load_order: 0,
  load_reason: :unknown,
  local_content: false,
  majority: false,
  master_nodes: [],
  memory: 0,
  ram_copies: [:t2@ubuntu, :t1@ubuntu],
  record_name: Person,
  record_validation: {Person, 4, :set},
  size: 0,
  snmp: [],
  storage_properties: [],
  storage_type: :unknown,
  subscribers: [],
  type: :set,
  user_properties: [],
  version: {{3, 0}, {:t1@ubuntu, {1593, 853838, 540527}}},
  where_to_commit: [],
  where_to_read: :nowhere,
  where_to_wlock: {[], false},
  where_to_write: [],
  wild_pattern: {Person, :_, :_, :_}
]
iex(t3@ubuntu)7> Mnesia.change_config(:extra_db_nodes, [:t1@ubuntu])
{:ok, [:t1@ubuntu]}
iex(t3@ubuntu)8> Mnesia.table_info(Person, :all)
[
  access_mode: :read_write,
  active_replicas: [:t1@ubuntu],
  all_nodes: [:t2@ubuntu, :t1@ubuntu],
  arity: 4,
  attributes: [:id, :name, :job],
  checkpoints: [],
  commit_work: [],
  cookie: {{1593853684922256987, -576460752303423391, 1}, :t1@ubuntu},
  cstruct: {:cstruct, Person, :set, [:t2@ubuntu, :t1@ubuntu], [], [], [], 0, :read_write, false, [], [], false, Person, [:id, :name, :job], [], [], [], {{1593853684922256987, -576460752303423391, 1}, :t1@ubuntu}, {{3, 0}, {:t1@ubuntu, {1593, 853838, 540527}}}},
  disc_copies: [],
  disc_only_copies: [],
  external_copies: [],
  frag_properties: [],
  index: [],
  index_info: {:index, :set, []},
  load_by_force: false,
  load_node: :unknown,
  load_order: 0,
  load_reason: :unknown,
  local_content: false,
  majority: false,
  master_nodes: [],
  memory: 0,
  ram_copies: [:t2@ubuntu, :t1@ubuntu],
  record_name: Person,
  record_validation: {Person, 4, :set},
  size: 0,
  snmp: [],
  storage_properties: [],
  storage_type: :unknown,
  subscribers: [],
  type: :set,
  user_properties: [],
  version: {{3, 0}, {:t1@ubuntu, {1593, 853838, 540527}}},
  where_to_commit: [t1@ubuntu: :ram_copies],
  where_to_read: :t1@ubuntu,
  where_to_wlock: {[:t1@ubuntu], false},
  where_to_write: [:t1@ubuntu],
  wild_pattern: {Person, :_, :_, :_}
]
```
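The dump explains the failure: with no active replica, where_to_write is empty, and t3 never held a copy of its own. For t3 to keep accepting writes when the other nodes are down, it would presumably need its own replica first, along the lines of:

```elixir
# Hypothetical follow-up on t3: hold a local ram copy so that t3 itself
# appears in where_to_write and writes survive the loss of t1/t2.
Mnesia.add_table_copy(Person, :t3@ubuntu, :ram_copies)
# expected: {:atomic, :ok}
```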
After t1 and t3 are shut down, t2 (which does hold its own ram copy) can still write successfully.
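On the partition question: mnesia does not resolve split-brain on its own, but it can report it. Subscribing to system events delivers an inconsistent_database event when mnesia reconnects to a node that kept running during a partition; recovery (for example via :mnesia.set_master_nodes/2 plus a restart) is left to the operator. A minimal sketch:

```elixir
# Detect (not resolve) a potential split-brain.
{:ok, _node} = :mnesia.subscribe(:system)

receive do
  {:mnesia_system_event, {:inconsistent_database, context, node}} ->
    IO.inspect({context, node}, label: "possible split-brain")
end
```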
Conjectures
- mnesia uses multi-master lock and commit.
- When a new node joins, it actively copies data from a specific node via change_config.
- There is no consistency guarantee, and split-brain is possible: writes do not require a majority of nodes to be alive (but see the sketch after this list).
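One nuance on the last point: mnesia does ship an opt-in mitigation, the per-table majority flag (visible as majority: false in the table_info dumps above). When set, transactional writes abort unless a majority of the table's replicas are active, trading availability for fewer split-brain writes; dirty writes still bypass the check. A sketch:

```elixir
# Require a majority of active replicas for transactions on Person.
Mnesia.change_table_majority(Person, true)

# With only a minority of replicas up, this now aborts instead of
# committing (the record is made up for illustration):
Mnesia.transaction(fn ->
  Mnesia.write({Person, 5, "Ned Flanders", "Neighbor"})
end)
# => {:aborted, ...} while too few replicas are active
```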
Summary
mnesia is not suitable for scenarios that require strong consistency.