Sharing memory between nodes with mnesia

Overview

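Many scenarios require sharing in-memory data across a set of nodes. For example, given a set of horizontally equivalent gateways, any gateway node should be able to read specific in-memory information about all the gateways.
The usual approach is to use a service that provides distributed consistency guarantees, such as zookeeper or etcd.
Synchronizing data between nodes through zookeeper or etcd certainly works. However: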

  • Erlang built-in data types, such as pid, need extra serialization / deserialization.
  • I did not want to bring in another complex system.
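
To illustrate the first point, here is a minimal sketch of the extra step a pid needs before it can be stored as a value in an external key-value store. It assumes the store only accepts bytes or strings and uses the External Term Format for the round trip; the variable names are illustrative only.

```elixir
# A pid is an opaque Erlang term with no natural string or JSON form,
# so it has to be encoded explicitly before leaving the VM.
pid = self()

# Encode with the External Term Format.
encoded = :erlang.term_to_binary(pid)

# Decode on the way back. The :safe option refuses to create new atoms,
# which is advisable when the bytes come from an external system.
decoded = :erlang.binary_to_term(encoded, [:safe])

IO.inspect(is_binary(encoded), label: "encoded is a binary")
IO.inspect(decoded == pid, label: "round trip preserved the pid")
```

With mnesia this step disappears, because records are stored as Erlang terms directly.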

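I ended up sharing the data with a gossip protocol: it is simple and easy to control, it addresses both pain points above, and it gives eventual consistency across nodes.
Erlang's native mnesia also looks like a good fit for this scenario. When I made the original choice I did not understand mnesia's implementation in depth, so this post explores whether mnesia would have been viable and how it is implemented: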

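  • How are distributed transactions implemented?
  • How is data synchronized when a new node joins?
  • Is there a notion of a master node? How does it recover after a network partition?
  • What level of consistency does it guarantee?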

Sharing data between nodes with mnesia

~/platform/launcher(master*) » iex --sname t1
Erlang/OTP 21 [erts-10.3.5.6]  [64-bit] [smp:8:8] [ds:8:8:10] [async-threads:1] [hipe]

Interactive Elixir (1.7.3) - press Ctrl+C to exit (type h() ENTER for help)
iex(t1@ubuntu)1> alias :mnesia, as: Mnesia
:mnesia
iex(t1@ubuntu)2> Mnesia.start()
:ok
iex(t1@ubuntu)3> Mnesia.create_table(Person, [attributes: [:id, :name, :job]])                
{:atomic, :ok}
iex(t1@ubuntu)4> Mnesia.dirty_write({Person, 1, "Seymour Skinner", "Principal"})
:ok
iex(t1@ubuntu)5> Mnesia.dirty_read({Person, 1})
[{Person, 1, "Seymour Skinner", "Principal"}]
iex(t1@ubuntu)6> Mnesia.table_info(Person, :all)
[
  access_mode: :read_write,
  active_replicas: [:t1@ubuntu],
  all_nodes: [:t1@ubuntu],
  arity: 4,
  attributes: [:id, :name, :job],
  checkpoints: [],
  commit_work: [],
  cookie: {{1593853684922256987, -576460752303423391, 1}, :t1@ubuntu},
  cstruct: {:cstruct, Person, :set, [:t1@ubuntu], [], [], [], 0, :read_write,
   false, [], [], false, Person, [:id, :name, :job], [], [], [],
   {{1593853684922256987, -576460752303423391, 1}, :t1@ubuntu}, {{2, 0}, []}},
  disc_copies: [],
  disc_only_copies: [],
  external_copies: [],
  frag_properties: [],
  index: [],
  index_info: {:index, :set, []},
  load_by_force: false,
  load_node: :t1@ubuntu,
  load_order: 0,
  load_reason: {:dumper, :create_table},
  local_content: false,
  majority: false,
  master_nodes: [],
  memory: 321,
  ram_copies: [:t1@ubuntu],
  record_name: Person,
  record_validation: {Person, 4, :set},
  size: 1,
  snmp: [],
  storage_properties: [],
  storage_type: :ram_copies,
  subscribers: [],
  type: :set,
  user_properties: [],
  version: {{2, 0}, []},
  where_to_commit: [t1@ubuntu: :ram_copies],
  where_to_read: :t1@ubuntu, 
  where_to_wlock: {[:t1@ubuntu], false},
  where_to_write: [:t1@ubuntu],
  wild_pattern: {Person, :_, :_, :_}
]
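
The session above uses dirty operations, which bypass mnesia's transaction manager. For comparison, a minimal sketch of the same write and read inside a transaction follows. It is not part of the original experiment and assumes a fresh single node with a RAM-only schema.

```elixir
# Start mnesia and recreate the table from the session above.
:ok = :mnesia.start()
{:atomic, :ok} = :mnesia.create_table(Person, attributes: [:id, :name, :job])

# Write and read inside one transaction. Mnesia acquires the locks it needs
# and aborts the whole function if any step fails.
result =
  :mnesia.transaction(fn ->
    :ok = :mnesia.write({Person, 1, "Seymour Skinner", "Principal"})
    :mnesia.read({Person, 1})
  end)

IO.inspect(result, label: "transaction result")
```

A successful transaction returns `{:atomic, value}`; a failed one returns `{:aborted, reason}` instead of raising an exit as the dirty calls do.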

Start t2

~/platform/launcher(master*) » iex --sname t2
Erlang/OTP 21 [erts-10.3.5.6]  [64-bit] [smp:8:8] [ds:8:8:10] [async-threads:1] [hipe]

Interactive Elixir (1.7.3) - press Ctrl+C to exit (type h() ENTER for help)
iex(t2@ubuntu)1> alias :mnesia, as: Mnesia
:mnesia
iex(t2@ubuntu)2> Mnesia.start()
:ok

Copy the table to t2 and verify on t2.

iex(t1@ubuntu)9> Mnesia.add_table_copy(Person, :t2@ubuntu, :ram_copies)
{:atomic, :ok}
iex(t1@ubuntu)10> Mnesia.change_config(:extra_db_nodes, [:t2@ubuntu]) 
iex(t2@ubuntu)4> Mnesia.dirty_read({Person, 1})
[{Person, 1, "Seymour Skinner", "Principal"}]

Writes made on t2 are also readable on t1

iex(t2@ubuntu)5> Mnesia.dirty_write({Person, 2, "Homer Simpson", "Safety Inspector"})
:ok
iex(t1@ubuntu)11> Mnesia.dirty_read({Person, 2})
[{Person, 2, "Homer Simpson", "Safety Inspector"}]

If t2 restarts, t1 has to run change_config again before t2 resynchronizes the data from t1.

~/platform/launcher(master*) » iex --sname t2
Erlang/OTP 21 [erts-10.3.5.6]  [64-bit] [smp:8:8] [ds:8:8:10] [async-threads:1] [hipe]

Interactive Elixir (1.7.3) - press Ctrl+C to exit (type h() ENTER for help)
iex(t2@ubuntu)1> alias :mnesia, as: Mnesia
:mnesia
iex(t2@ubuntu)2> Mnesia.start()
:ok
iex(t2@ubuntu)3> Mnesia.dirty_read({Person, 1})
** (exit) {:aborted, {:no_exists, [Person, 1]}}
    (mnesia) mnesia.erl:355: :mnesia.abort/1
iex(t1@ubuntu)12> Mnesia.change_config(:extra_db_nodes, [:t2@ubuntu])   
{:ok, [:t2@ubuntu]}
iex(t2@ubuntu)3> Mnesia.dirty_read({Person, 1})
[{Person, 1, "Seymour Skinner", "Principal"}]
iex(t2@ubuntu)4> Mnesia.dirty_read({Person, 2})
[{Person, 2, "Homer Simpson", "Safety Inspector"}]
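
The reconnect steps can be wrapped in a small helper that a node runs at startup. This is a sketch under assumptions: `ClusterJoin` and `join/3` are hypothetical names, the node initiates the connection itself (as t3 does later in this post), and the demonstration at the bottom runs on a single node with an empty seed list purely so that it is runnable on its own. In a real cluster the seed list would be something like `[:t1@ubuntu]` and the table would already exist on the seeds.

```elixir
defmodule ClusterJoin do
  # Hypothetical helper, not part of mnesia: connect to seed nodes, make sure
  # this node holds a RAM replica of `table`, then wait until it is loaded.
  def join(table, seed_nodes, timeout_ms \\ 5_000) do
    # change_config returns {:ok, nodes_actually_connected};
    # an empty list means none of the seeds could be reached.
    with {:ok, _connected} <- :mnesia.change_config(:extra_db_nodes, seed_nodes),
         :ok <- ensure_ram_copy(table) do
      :mnesia.wait_for_tables([table], timeout_ms)
    end
  end

  defp ensure_ram_copy(table) do
    case :mnesia.add_table_copy(table, node(), :ram_copies) do
      {:atomic, :ok} -> :ok
      # This node already holds a replica, nothing to do.
      {:aborted, {:already_exists, ^table, _node}} -> :ok
      {:aborted, reason} -> {:error, reason}
    end
  end
end

# Single-node demonstration only: create the table locally, then "join"
# with no seeds. With real seeds the create_table call would be dropped.
:ok = :mnesia.start()
{:atomic, :ok} = :mnesia.create_table(Person, attributes: [:id, :name, :job])

result = ClusterJoin.join(Person, [])
IO.inspect(result, label: "join result")
```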

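For a new node t3, running change_config on t1 or t2 is enough; add_table_copy is not required. However, a node that has not run add_table_copy holds no replica of its own and cannot write by itself: after t1 and t2 are shut down, writes on t3 fail. If t1 is then restarted and change_config is run on t3, the schema is copied back to t1 and writes can be committed again, but all of the earlier data is lost.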

iex(t3@ubuntu)6> Mnesia.dirty_write({Person, 4, "Person 4", "Safety Inspector"})
** (exit) {:aborted, {:no_exists, Person}}
    (mnesia) mnesia.erl:355: :mnesia.abort/1
    (mnesia) mnesia_tm.erl:1061: :mnesia_tm.dirty/2
iex(t3@ubuntu)6> Mnesia.table_info(Person, :all)                                
[
  access_mode: :read_write,
  active_replicas: [],
  all_nodes: [:t2@ubuntu, :t1@ubuntu],
  arity: 4,
  attributes: [:id, :name, :job],
  checkpoints: [],
  commit_work: [],
  cookie: {{1593853684922256987, -576460752303423391, 1}, :t1@ubuntu},
  cstruct: {:cstruct, Person, :set, [:t2@ubuntu, :t1@ubuntu], [], [], [], 0,
   :read_write, false, [], [], false, Person, [:id, :name, :job], [], [], [],
   {{1593853684922256987, -576460752303423391, 1}, :t1@ubuntu},
   {{3, 0}, {:t1@ubuntu, {1593, 853838, 540527}}}},
  disc_copies: [],
  disc_only_copies: [],
  external_copies: [],
  frag_properties: [],
  index: [],
  index_info: {:index, :set, []},
  load_by_force: false,
  load_node: :unknown,
  load_order: 0,
  load_reason: :unknown,
  local_content: false,
  majority: false,
  master_nodes: [],
  memory: 0,
  ram_copies: [:t2@ubuntu, :t1@ubuntu],
  record_name: Person,
  record_validation: {Person, 4, :set},
  size: 0,
  snmp: [],
  storage_properties: [],
  storage_type: :unknown,
  subscribers: [],
  type: :set,
  user_properties: [],
  version: {{3, 0}, {:t1@ubuntu, {1593, 853838, 540527}}},
  where_to_commit: [],
  where_to_read: :nowhere,
  where_to_wlock: {[], false},
  where_to_write: [],
  wild_pattern: {Person, :_, :_, :_}
]
iex(t3@ubuntu)7> Mnesia.change_config(:extra_db_nodes, [:t1@ubuntu])   
{:ok, [:t1@ubuntu]}
iex(t3@ubuntu)8> Mnesia.table_info(Person, :all)                       
[
  access_mode: :read_write,
  active_replicas: [:t1@ubuntu],
  all_nodes: [:t2@ubuntu, :t1@ubuntu],
  arity: 4,
  attributes: [:id, :name, :job],
  checkpoints: [],
  commit_work: [],
  cookie: {{1593853684922256987, -576460752303423391, 1}, :t1@ubuntu},
  cstruct: {:cstruct, Person, :set, [:t2@ubuntu, :t1@ubuntu], [], [], [], 0,
   :read_write, false, [], [], false, Person, [:id, :name, :job], [], [], [],
   {{1593853684922256987, -576460752303423391, 1}, :t1@ubuntu},
   {{3, 0}, {:t1@ubuntu, {1593, 853838, 540527}}}},
  disc_copies: [],
  disc_only_copies: [],
  external_copies: [],
  frag_properties: [],
  index: [],
  index_info: {:index, :set, []},
  load_by_force: false,
  load_node: :unknown,
  load_order: 0,
  load_reason: :unknown,
  local_content: false,
  majority: false,
  master_nodes: [],
  memory: 0,
  ram_copies: [:t2@ubuntu, :t1@ubuntu],
  record_name: Person,
  record_validation: {Person, 4, :set},
  size: 0,
  snmp: [],
  storage_properties: [],
  storage_type: :unknown,
  subscribers: [],
  type: :set,
  user_properties: [],
  version: {{3, 0}, {:t1@ubuntu, {1593, 853838, 540527}}},
  where_to_commit: [t1@ubuntu: :ram_copies],
  where_to_read: :t1@ubuntu,
  where_to_wlock: {[:t1@ubuntu], false},
  where_to_write: [:t1@ubuntu],
  wild_pattern: {Person, :_, :_, :_}
]
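
The two dumps differ in the fields that matter for writes: where_to_write is [] before change_config and [:t1@ubuntu] after it. A minimal sketch of checking just that field follows; `ReplicaCheck.writable?/1` is a hypothetical helper name, and the demonstration assumes a fresh single node.

```elixir
defmodule ReplicaCheck do
  # Hypothetical helper: a table is writable from this node only if mnesia
  # knows at least one active replica to send writes to.
  def writable?(table) do
    :mnesia.table_info(table, :where_to_write) != []
  catch
    # table_info exits with {:aborted, {:no_exists, ...}} for unknown tables.
    :exit, {:aborted, _reason} -> false
  end
end

:ok = :mnesia.start()
{:atomic, :ok} = :mnesia.create_table(Person, attributes: [:id, :name, :job])

IO.inspect(ReplicaCheck.writable?(Person), label: "Person writable")
IO.inspect(ReplicaCheck.writable?(:no_such_table), label: "unknown table writable")
```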

After t1 and t3 are shut down, writes on t2 still succeed.

Conjectures

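  • mnesia is multi-master and uses lock and commit.
  • When a new node joins, change_config makes it actively copy data from specific nodes.
  • There is no consistency guarantee and split brain is possible: writes do not require more than half of the nodes to be alive (the tables above have majority: false).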
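
The table_info output above shows majority: false, which is the default. mnesia does have a majority table option; the sketch below only shows how it is set and was not tested in the experiment above. According to the mnesia documentation it applies to non-dirty updates only, so the dirty_write calls used in this post would not be affected by it, and whether it is sufficient for a given consistency requirement is a separate question.

```elixir
# Assumes a fresh single node. With one replica, that replica is the majority.
:ok = :mnesia.start()

{:atomic, :ok} =
  :mnesia.create_table(Person, attributes: [:id, :name, :job], majority: true)

majority = :mnesia.table_info(Person, :majority)
IO.inspect(majority, label: "majority")

# Transactional writes are aborted unless a majority of replicas is reachable.
write =
  :mnesia.transaction(fn ->
    :mnesia.write({Person, 1, "Seymour Skinner", "Principal"})
  end)

IO.inspect(write, label: "transactional write")
```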

Summary

mnesia is not suitable for scenarios that have consistency requirements.