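I'm sure the answer to this is out there somewhere, but after several attempts I have not been able to find it or fix the problem. My use case is as follows:
1.> I have two EC2 instances in the same VPC but in different security groups.
2.> Both security groups open ports 22 and 80 to the public, and allow all traffic on all ports from the CIDR block 10.20.0.0/16.
3.> The internal IPs of the EC2 instances are 10.20.0.51 (server-1) and 10.20.0.202 (server-2).
4.> I am running two Dockerized Consul servers using the following commands: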
server-1 : docker run -it -p 8400:8400 -p 8500:8500 -p 8600:53/udp -p 8301:8301 -p 8300:8300 -h node1 progrium/consul -server -advertise 10.20.0.51 -bootstrap-expect 2
server-2 : docker run -it -p 8400:8400 -p 8500:8500 -p 8600:53/udp -p 8301:8301 -p 8300:8300 --name node2 -h node2 progrium/consul -server -advertise 10.20.0.202 -join 10.20.0.51
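5.> Both come up and see each other for a second, an election takes place and the first node is elected leader, but right after that server-2 starts saying "memberlist: Suspect node1 has failed, no acks received", and server-1 likewise says "memberlist: Suspect node2 has failed, no acks received".
The log on server-1 looks like this: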
2016/01/04 19:18:35 [INFO] serf: EventMemberJoin: node2 10.20.0.202
2016/01/04 19:18:35 [INFO] consul: adding server node2 (Addr: 10.20.0.202:8300) (DC: dc1)
2016/01/04 19:18:35 [INFO] consul: Attempting bootstrap with nodes: [10.20.0.51:8300 10.20.0.202:8300]
2016/01/04 19:18:35 [WARN] raft: Heartbeat timeout reached, starting election
2016/01/04 19:18:35 [INFO] raft: Node at 10.20.0.51:8300 [Candidate] entering Candidate state
2016/01/04 19:18:35 [WARN] raft: Remote peer 10.20.0.202:8300 does not have local node 10.20.0.51:8300 as a peer
2016/01/04 19:18:35 [INFO] raft: Election won. Tally: 2
2016/01/04 19:18:35 [INFO] raft: Node at 10.20.0.51:8300 [Leader] entering Leader state
2016/01/04 19:18:35 [INFO] consul: cluster leadership acquired
2016/01/04 19:18:35 [INFO] consul: New leader elected: node1
2016/01/04 19:18:35 [INFO] raft: pipelining replication to peer 10.20.0.202:8300
2016/01/04 19:18:35 [INFO] consul: member 'node1' joined, marking health alive
2016/01/04 19:18:35 [INFO] consul: member 'node2' joined, marking health alive
2016/01/04 19:18:37 [INFO] memberlist: Suspect node2 has failed, no acks received
2016/01/04 19:18:37 [INFO] agent: Synced service 'consul'
2016/01/04 19:18:39 [INFO] memberlist: Suspect node2 has failed, no acks received
2016/01/04 19:18:41 [INFO] memberlist: Suspect node2 has failed, no acks received
2016/01/04 19:18:42 [INFO] memberlist: Marking node2 as failed, suspect timeout reached
2016/01/04 19:18:42 [INFO] serf: EventMemberFailed: node2 10.20.0.202
2016/01/04 19:18:42 [INFO] consul: removing server node2 (Addr: 10.20.0.202:8300) (DC: dc1)
And on server-2:
2016/01/04 19:18:10 [INFO] serf: EventMemberJoin: node2 10.20.0.202
2016/01/04 19:18:10 [INFO] serf: EventMemberJoin: node2.dc1 10.20.0.202
2016/01/04 19:18:10 [INFO] raft: Node at 10.20.0.202:8300 [Follower] entering Follower state
2016/01/04 19:18:10 [INFO] agent: (LAN) joining: [10.20.0.51]
2016/01/04 19:18:10 [INFO] consul: adding server node2 (Addr: 10.20.0.202:8300) (DC: dc1)
2016/01/04 19:18:10 [INFO] consul: adding server node2.dc1 (Addr: 10.20.0.202:8300) (DC: dc1)
2016/01/04 19:18:10 [INFO] serf: EventMemberJoin: node1 10.20.0.51
2016/01/04 19:18:10 [INFO] agent: (LAN) joined: 1 Err: <nil>
2016/01/04 19:18:10 [ERR] agent: failed to sync remote state: No cluster leader
2016/01/04 19:18:10 [INFO] consul: adding server node1 (Addr: 10.20.0.51:8300) (DC: dc1)
2016/01/04 19:18:12 [INFO] memberlist: Suspect node1 has failed, no acks received
2016/01/04 19:18:14 [INFO] memberlist: Suspect node1 has failed, no acks received
2016/01/04 19:18:16 [INFO] memberlist: Suspect node1 has failed, no acks received
2016/01/04 19:18:17 [INFO] memberlist: Marking node1 as failed, suspect timeout reached
2016/01/04 19:18:17 [INFO] serf: EventMemberFailed: node1 10.20.0.51
2016/01/04 19:18:17 [INFO] memberlist: Suspect node1 has failed, no acks received
2016/01/04 19:18:17 [INFO] consul: removing server node1 (Addr: 10.20.0.51:8300) (DC: dc1)
2016/01/04 19:18:19 [INFO] serf: EventMemberJoin: node1 10.20.0.51
2016/01/04 19:18:19 [INFO] consul: adding server node1 (Addr: 10.20.0.51:8300) (DC: dc1)
2016/01/04 19:18:19 [INFO] consul: New leader elected: node1
2016/01/04 19:18:21 [INFO] memberlist: Suspect node1 has failed, no acks received
2016/01/04 19:18:22 [INFO] agent: Synced service 'consul'
2016/01/04 19:18:23 [INFO] memberlist: Suspect node1 has failed, no acks received
2016/01/04 19:18:25 [INFO] memberlist: Suspect node1 has failed, no acks received
2016/01/04 19:18:26 [INFO] memberlist: Marking node1 as failed, suspect timeout reached
2016/01/04 19:18:26 [INFO] serf: EventMemberFailed: node1 10.20.0.51
2016/01/04 19:18:26 [INFO] consul: removing server node1 (Addr: 10.20.0.51:8300) (DC: dc1)
2016/01/04 19:18:26 [INFO] memberlist: Suspect node1 has failed, no acks received
2016/01/04 19:18:40 [INFO] serf: attempting reconnect to node1 10.20.0.51:8301
2016/01/04 19:18:40 [INFO] serf: EventMemberJoin: node1 10.20.0.51
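What exactly am I doing wrong? All I want is to run two Consul Docker containers on two EC2 instances and have them communicate with each other without explicitly opening the ports in the security groups (if I open them explicitly, it works, of course!).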
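For reference, here is a minimal sketch of how TCP reachability of the Consul ports between the two hosts could be checked. The helper names `tcp_port_open` and `check_peer`, the timeout, and the port list are assumptions made for illustration; the ports are simply the TCP ones published in the docker run commands above. It only tests TCP, so it says nothing about UDP traffic on port 8301, which Consul's gossip layer also uses.

```python
import socket

# Hypothetical helper (not part of the original setup): returns True if a
# TCP connection to host:port can be established within `timeout` seconds.
def tcp_port_open(host, port, timeout=2.0):
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False

# TCP ports published by the docker run commands above.
CONSUL_TCP_PORTS = [8300, 8301, 8400, 8500]

def check_peer(host, ports=CONSUL_TCP_PORTS, timeout=2.0):
    """Map each port to whether a TCP connect to `host` succeeded."""
    return {port: tcp_port_open(host, port, timeout) for port in ports}

# Example (run on server-1 to probe server-2; swap the IP on server-2):
#   print(check_peer("10.20.0.202"))
```

A port reported as unreachable here would point to a network or port-publishing problem rather than to Consul itself; a port reported as open only confirms the TCP side.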
Can someone please help?
Thanks