
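I have a three-member MongoDB replica set running on Windows. When the primary server (S1) goes down, a secondary is correctly elected. When the original primary comes back up, it remains stuck in an invalid state as a replica set member: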

     {
            "state" : 10,
            "stateStr" : "REMOVED",
            "uptime" : 111,
            "optime" : Timestamp(1448462710, 6),
            "optimeDate" : ISODate("2015-11-25T14:45:10Z"),
            "ok" : 0,
            "errmsg" : "Our replica set config is invalid or we are not a member of it",
            "code" : 93
     }

After that, the secondary keeps flipping between PRIMARY and SECONDARY every few seconds, which makes the application unstable.

The only way to bring the primary server back is to run rs.reconfig(c).
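
A minimal sketch of that workaround, assuming `c` is simply the unmodified configuration returned by `rs.conf()` (this is not confirmed above). `reapplyCurrentConfig` and `rsHelper` are hypothetical names; `rsHelper` stands in for the mongo shell's global `rs` object so the function can also be tried against a stub.

```javascript
// Sketch only: re-apply the stored replica set configuration unchanged.
// `rsHelper` is a stand-in for the mongo shell's global `rs` helper.
function reapplyCurrentConfig(rsHelper) {
  // Assumption: `c` is the configuration exactly as currently stored.
  var c = rsHelper.conf();
  // rs.reconfig() has to be issued on the primary unless { force: true }
  // is passed as a second argument; this sketch does not force.
  return rsHelper.reconfig(c);
}
```

In the mongo shell this would be called as `reapplyCurrentConfig(rs)` while connected to whichever member is currently primary.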

I could not find anything wrong with the configuration file.

Any help would be greatly appreciated.

Update: here is the current configuration:

{
    "_id" : "companyName",
    "version" : 32593,
    "protocolVersion" : NumberLong(1),
    "members" : [
            {
                    "_id" : 1,
                    "host" : "arb.companyName.com:40000",
                    "arbiterOnly" : true,
                    "buildIndexes" : true,
                    "hidden" : false,
                    "priority" : 1,
                    "tags" : {

                    },
                    "slaveDelay" : NumberLong(0),
                    "votes" : 1
            },
            {
                    "_id" : 2,
                    "host" : "m3.companyName.com:40000",
                    "arbiterOnly" : false,
                    "buildIndexes" : true,
                    "hidden" : false,
                    "priority" : 11,
                    "tags" : {

                    },
                    "slaveDelay" : NumberLong(0),
                    "votes" : 1
            },
            {
                    "_id" : 4,
                    "host" : "m2.companyName.com:40000",
                    "arbiterOnly" : false,
                    "buildIndexes" : true,
                    "hidden" : false,
                    "priority" : 3,
                    "tags" : {

                    },
                    "slaveDelay" : NumberLong(0),
                    "votes" : 1
            }
    ],
    "settings" : {
            "chainingAllowed" : true,
            "heartbeatIntervalMillis" : 2000,
            "heartbeatTimeoutSecs" : 10,
            "electionTimeoutMillis" : 10000,
            "getLastErrorModes" : {

            },
            "getLastErrorDefaults" : {
                    "w" : 1,
                    "wtimeout" : 0
            },
            "replicaSetId" : ObjectId("573dfcd0e8ae6154ff80c50d")
    }
}

Should I be using IP addresses instead of host names?

Update 2:

This is the log from the moment the primary (m3.companyName.com - IP 1.1.1.1) was restarted until I went to the other server (m2.companyName.com - IP 2.2.2.2) and manually ran rs.reconfig().

2016-09-06T07:42:05.953Z I NETWORK  [HostnameCanonicalizationWorker] Starting hostname canonicalization worker
2016-09-06T07:42:05.953Z I FTDC     [initandlisten] Initializing full-time diagnostic data capture with directory 'c:/mongossl/data3/diagnostic.data'
2016-09-06T07:42:05.954Z I NETWORK  [initandlisten] waiting for connections on port 40000 ssl
2016-09-06T07:42:05.955Z W NETWORK  [ReplicationExecutor] getaddrinfo("arb.companyName.com") failed: errno:11001 No such host is known.
2016-09-06T07:42:05.955Z I NETWORK  [ReplicationExecutor] getaddrinfo("arb.companyName.com") failed: errno:11001 No such host is known.
2016-09-06T07:42:05.957Z W NETWORK  [ReplicationExecutor] getaddrinfo("m3.companyName.com") failed: errno:11001 No such host is known.
2016-09-06T07:42:05.957Z I NETWORK  [ReplicationExecutor] getaddrinfo("m3.companyName.com") failed: errno:11001 No such host is known.
2016-09-06T07:42:05.958Z W NETWORK  [ReplicationExecutor] getaddrinfo("m2.companyName.com") failed: errno:11001 No such host is known.
2016-09-06T07:42:05.959Z I NETWORK  [ReplicationExecutor] getaddrinfo("m2.companyName.com") failed: errno:11001 No such host is known.
2016-09-06T07:42:05.959Z W REPL     [ReplicationExecutor] Locally stored replica set configuration does not have a valid entry for the current node; waiting for reconfig or remote heartbeat; Got "NodeNotFound: No host described in new configuration 32592 for replica set companyName2 maps to this node" while validating { _id: "companyName2", version: 32592, protocolVersion: 1, members: [ { _id: 1, host: "arb.companyName.com:40000", arbiterOnly: true, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: 0, votes: 1 }, { _id: 2, host: "m3.companyName.com:40000", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 11.0, tags: {}, slaveDelay: 0, votes: 1 }, { _id: 4, host: "m2.companyName.com:40000", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 3.0, tags: {}, slaveDelay: 0, votes: 1 } ], settings: { chainingAllowed: true, heartbeatIntervalMillis: 2000, heartbeatTimeoutSecs: 10, electionTimeoutMillis: 10000, getLastErrorModes: {}, getLastErrorDefaults: { w: 1, wtimeout: 0 }, replicaSetId: ObjectId('573dfcd0e8ae6154ff80c50d') } }
2016-09-06T07:42:05.959Z I REPL     [ReplicationExecutor] New replica set config in use: { _id: "companyName2", version: 32592, protocolVersion: 1, members: [ { _id: 1, host: "arb.companyName.com:40000", arbiterOnly: true, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: 0, votes: 1 }, { _id: 2, host: "m3.companyName.com:40000", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 11.0, tags: {}, slaveDelay: 0, votes: 1 }, { _id: 4, host: "m2.companyName.com:40000", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 3.0, tags: {}, slaveDelay: 0, votes: 1 } ], settings: { chainingAllowed: true, heartbeatIntervalMillis: 2000, heartbeatTimeoutSecs: 10, electionTimeoutMillis: 10000, getLastErrorModes: {}, getLastErrorDefaults: { w: 1, wtimeout: 0 }, replicaSetId: ObjectId('573dfcd0e8ae6154ff80c50d') } }
2016-09-06T07:42:05.959Z I REPL     [ReplicationExecutor] This node is not a member of the config
2016-09-06T07:42:05.959Z I REPL     [ReplicationExecutor] transition to REMOVED
2016-09-06T07:42:05.959Z I REPL     [ReplicationExecutor] Starting replication applier threads
2016-09-06T07:42:06.651Z I NETWORK  [initandlisten] connection accepted from 2.2.2.2:53746 #1 (1 connection now open)
2016-09-06T07:42:06.760Z I NETWORK  [initandlisten] connection accepted from 2.2.2.2:53747 #2 (2 connections now open)
2016-09-06T07:42:06.864Z I NETWORK  [initandlisten] connection accepted from 2.2.2.2:53748 #3 (3 connections now open)
2016-09-06T07:42:06.993Z I ACCESS   [conn1]  authenticate db: $external { authenticate: 1, mechanism: "MONGODB-X509", user: "CN=m2.companyName.com,O=companyName,ST=ON,C=CA" }
2016-09-06T07:42:07.067Z I ACCESS   [conn2]  authenticate db: $external { authenticate: 1, mechanism: "MONGODB-X509", user: "CN=m2.companyName.com,O=companyName,ST=ON,C=CA" }
2016-09-06T07:42:07.159Z I ACCESS   [conn3]  authenticate db: $external { authenticate: 1, mechanism: "MONGODB-X509", user: "CN=m2.companyName.com,O=companyName,ST=ON,C=CA" }
2016-09-06T07:42:07.552Z I ASIO     [NetworkInterfaceASIO-0] Successfully connected to m2.companyName.com:40000
2016-09-06T07:42:07.627Z I REPL     [ReplicationExecutor] Member m2.companyName.com:40000 is now in state PRIMARY
2016-09-06T07:42:08.975Z I NETWORK  [conn1] end connection 2.2.2.2:53746 (2 connections now open)
2016-09-06T07:42:08.975Z I NETWORK  [conn2] end connection 2.2.2.2:53747 (2 connections now open)
2016-09-06T07:42:08.975Z I NETWORK  [conn3] end connection 2.2.2.2:53748 (2 connections now open)
2016-09-06T07:42:09.371Z I NETWORK  [initandlisten] connection accepted from 2.2.2.2:53763 #4 (1 connection now open)
2016-09-06T07:42:09.639Z I ACCESS   [conn4]  authenticate db: $external { authenticate: 1, mechanism: "MONGODB-X509", user: "CN=m2.companyName.com,O=companyName,ST=ON,C=CA" }
2016-09-06T07:42:13.059Z I NETWORK  [initandlisten] connection accepted from 3.3.3.3:58220 #5 (2 connections now open)
2016-09-06T07:42:13.127Z I ACCESS   [conn5]  authenticate db: $external { authenticate: 1, mechanism: "MONGODB-X509", user: "CN=arb.companyName.com,O=companyName,ST=ON,C=CA" }
2016-09-06T07:42:13.292Z I ASIO     [NetworkInterfaceASIO-0] Successfully connected to arb.companyName.com:40000
2016-09-06T07:42:13.301Z I REPL     [ReplicationExecutor] Member arb.companyName.com:40000 is now in state ARBITER
2016-09-06T07:42:13.974Z I NETWORK  [initandlisten] connection accepted from 2.2.2.2:53765 #6 (3 connections now open)
2016-09-06T07:42:14.433Z I ACCESS   [conn6] Successfully authenticated as principal appUser on companyName
2016-09-06T07:42:16.629Z I NETWORK  [initandlisten] connection accepted from 1.1.1.13:49162 #7 (4 connections now open)
2016-09-06T07:42:16.853Z I ACCESS   [conn7] Successfully authenticated as principal appUser on companyName
2016-09-06T07:42:17.703Z I ASIO     [ReplicationExecutor] dropping unhealthy pooled connection to m2.companyName.com:40000
2016-09-06T07:42:17.703Z I ASIO     [ReplicationExecutor] after drop, pool was empty, going to spawn some connections
2016-09-06T07:42:18.131Z I ASIO     [NetworkInterfaceASIO-0] Successfully connected to m2.companyName.com:40000
2016-09-06T07:42:18.206Z I REPL     [ReplicationExecutor] Member m2.companyName.com:40000 is now in state SECONDARY
2016-09-06T07:42:23.369Z I NETWORK  [initandlisten] connection accepted from 2.2.2.2:53767 #8 (5 connections now open)
2016-09-06T07:42:23.832Z I ACCESS   [conn8] Successfully authenticated as principal sa on admin
2016-09-06T07:42:28.356Z I REPL     [ReplicationExecutor] Member m2.companyName.com:40000 is now in state PRIMARY
2016-09-06T07:42:38.431Z I ASIO     [ReplicationExecutor] dropping unhealthy pooled connection to m2.companyName.com:40000
2016-09-06T07:42:38.431Z I ASIO     [ReplicationExecutor] after drop, pool was empty, going to spawn some connections
2016-09-06T07:42:38.861Z I ASIO     [NetworkInterfaceASIO-0] Successfully connected to m2.companyName.com:40000
2016-09-06T07:42:38.936Z I REPL     [ReplicationExecutor] Member m2.companyName.com:40000 is now in state SECONDARY
2016-09-06T07:42:49.086Z I REPL     [ReplicationExecutor] Member m2.companyName.com:40000 is now in state PRIMARY
2016-09-06T07:42:59.161Z I ASIO     [ReplicationExecutor] dropping unhealthy pooled connection to m2.companyName.com:40000
2016-09-06T07:42:59.161Z I ASIO     [ReplicationExecutor] after drop, pool was empty, going to spawn some connections
2016-09-06T07:42:59.590Z I ASIO     [NetworkInterfaceASIO-0] Successfully connected to m2.companyName.com:40000
2016-09-06T07:42:59.665Z I REPL     [ReplicationExecutor] Member m2.companyName.com:40000 is now in state SECONDARY
2016-09-06T07:43:09.814Z I REPL     [ReplicationExecutor] Member m2.companyName.com:40000 is now in state PRIMARY
2016-09-06T07:43:19.889Z I ASIO     [ReplicationExecutor] dropping unhealthy pooled connection to m2.companyName.com:40000
2016-09-06T07:43:19.889Z I ASIO     [ReplicationExecutor] after drop, pool was empty, going to spawn some connections
2016-09-06T07:43:20.317Z I ASIO     [NetworkInterfaceASIO-0] Successfully connected to m2.companyName.com:40000
2016-09-06T07:43:20.392Z I REPL     [ReplicationExecutor] Member m2.companyName.com:40000 is now in state SECONDARY
2016-09-06T07:43:30.542Z I REPL     [ReplicationExecutor] Member m2.companyName.com:40000 is now in state PRIMARY
2016-09-06T07:43:34.054Z I NETWORK  [initandlisten] connection accepted from 1.1.1.13:49188 #9 (6 connections now open)
2016-09-06T07:43:34.106Z I ACCESS   [conn9] Successfully authenticated as principal sa on admin
2016-09-06T07:43:40.617Z I ASIO     [ReplicationExecutor] dropping unhealthy pooled connection to m2.companyName.com:40000
2016-09-06T07:43:40.617Z I ASIO     [ReplicationExecutor] after drop, pool was empty, going to spawn some connections
2016-09-06T07:43:41.045Z I ASIO     [NetworkInterfaceASIO-0] Successfully connected to m2.companyName.com:40000
2016-09-06T07:43:41.120Z I REPL     [ReplicationExecutor] Member m2.companyName.com:40000 is now in state SECONDARY
2016-09-06T07:43:51.270Z I REPL     [ReplicationExecutor] Member m2.companyName.com:40000 is now in state PRIMARY
2016-09-06T07:43:51.277Z I NETWORK  [initandlisten] connection accepted from 1.1.1.13:49193 #10 (7 connections now open)
2016-09-06T07:43:51.339Z I ACCESS   [conn10] Successfully authenticated as principal sa on admin
2016-09-06T07:44:01.346Z I ASIO     [ReplicationExecutor] dropping unhealthy pooled connection to m2.companyName.com:40000
2016-09-06T07:44:01.346Z I ASIO     [ReplicationExecutor] after drop, pool was empty, going to spawn some connections
2016-09-06T07:44:01.775Z I ASIO     [NetworkInterfaceASIO-0] Successfully connected to m2.companyName.com:40000
2016-09-06T07:44:01.850Z I REPL     [ReplicationExecutor] Member m2.companyName.com:40000 is now in state SECONDARY
2016-09-06T07:44:12.001Z I REPL     [ReplicationExecutor] Member m2.companyName.com:40000 is now in state PRIMARY
2016-09-06T07:44:22.077Z I ASIO     [ReplicationExecutor] dropping unhealthy pooled connection to m2.companyName.com:40000
2016-09-06T07:44:22.077Z I ASIO     [ReplicationExecutor] after drop, pool was empty, going to spawn some connections
2016-09-06T07:44:22.506Z I ASIO     [NetworkInterfaceASIO-0] Successfully connected to m2.companyName.com:40000
2016-09-06T07:44:22.582Z I REPL     [ReplicationExecutor] Member m2.companyName.com:40000 is now in state SECONDARY
2016-09-06T07:44:32.732Z I REPL     [ReplicationExecutor] Member m2.companyName.com:40000 is now in state PRIMARY
2016-09-06T07:44:42.807Z I ASIO     [ReplicationExecutor] dropping unhealthy pooled connection to m2.companyName.com:40000
2016-09-06T07:44:42.807Z I ASIO     [ReplicationExecutor] after drop, pool was empty, going to spawn some connections
2016-09-06T07:44:43.237Z I ASIO     [NetworkInterfaceASIO-0] Successfully connected to m2.companyName.com:40000
2016-09-06T07:44:43.312Z I REPL     [ReplicationExecutor] Member m2.companyName.com:40000 is now in state SECONDARY
2016-09-06T07:44:53.462Z I REPL     [ReplicationExecutor] Member m2.companyName.com:40000 is now in state PRIMARY
2016-09-06T07:45:03.537Z I ASIO     [ReplicationExecutor] dropping unhealthy pooled connection to m2.companyName.com:40000
2016-09-06T07:45:03.537Z I ASIO     [ReplicationExecutor] after drop, pool was empty, going to spawn some connections
2016-09-06T07:45:03.966Z I ASIO     [NetworkInterfaceASIO-0] Successfully connected to m2.companyName.com:40000
2016-09-06T07:45:04.041Z I REPL     [ReplicationExecutor] Member m2.companyName.com:40000 is now in state SECONDARY
2016-09-06T07:45:14.191Z I REPL     [ReplicationExecutor] Member m2.companyName.com:40000 is now in state PRIMARY
2016-09-06T07:45:24.266Z I ASIO     [ReplicationExecutor] dropping unhealthy pooled connection to m2.companyName.com:40000
2016-09-06T07:45:24.266Z I ASIO     [ReplicationExecutor] after drop, pool was empty, going to spawn some connections
2016-09-06T07:45:24.700Z I ASIO     [NetworkInterfaceASIO-0] Successfully connected to m2.companyName.com:40000
2016-09-06T07:45:24.775Z I REPL     [ReplicationExecutor] Member m2.companyName.com:40000 is now in state SECONDARY
2016-09-06T07:45:34.925Z I REPL     [ReplicationExecutor] Member m2.companyName.com:40000 is now in state PRIMARY
2016-09-06T07:45:45.000Z I ASIO     [ReplicationExecutor] dropping unhealthy pooled connection to m2.companyName.com:40000
2016-09-06T07:45:45.000Z I ASIO     [ReplicationExecutor] after drop, pool was empty, going to spawn some connections
2016-09-06T07:45:45.428Z I ASIO     [NetworkInterfaceASIO-0] Successfully connected to m2.companyName.com:40000
2016-09-06T07:45:45.504Z I REPL     [ReplicationExecutor] Member m2.companyName.com:40000 is now in state SECONDARY
2016-09-06T07:45:55.654Z I REPL     [ReplicationExecutor] Member m2.companyName.com:40000 is now in state PRIMARY
2016-09-06T07:46:05.729Z I ASIO     [ReplicationExecutor] dropping unhealthy pooled connection to m2.companyName.com:40000
2016-09-06T07:46:05.729Z I ASIO     [ReplicationExecutor] after drop, pool was empty, going to spawn some connections
2016-09-06T07:46:06.157Z I ASIO     [NetworkInterfaceASIO-0] Successfully connected to m2.companyName.com:40000
2016-09-06T07:46:06.232Z I REPL     [ReplicationExecutor] Member m2.companyName.com:40000 is now in state SECONDARY
2016-09-06T07:46:16.382Z I REPL     [ReplicationExecutor] Member m2.companyName.com:40000 is now in state PRIMARY
2016-09-06T07:46:26.458Z I ASIO     [ReplicationExecutor] dropping unhealthy pooled connection to m2.companyName.com:40000
2016-09-06T07:46:26.458Z I ASIO     [ReplicationExecutor] after drop, pool was empty, going to spawn some connections
2016-09-06T07:46:26.889Z I ASIO     [NetworkInterfaceASIO-0] Successfully connected to m2.companyName.com:40000
2016-09-06T07:46:26.964Z I REPL     [ReplicationExecutor] Member m2.companyName.com:40000 is now in state SECONDARY
2016-09-06T07:46:37.115Z I REPL     [ReplicationExecutor] Member m2.companyName.com:40000 is now in state PRIMARY
2016-09-06T07:46:43.185Z I NETWORK  [initandlisten] connection accepted from 2.2.2.2:53847 #11 (8 connections now open)
2016-09-06T07:46:43.392Z I ACCESS   [conn11]  authenticate db: $external { authenticate: 1, mechanism: "MONGODB-X509", user: "CN=m2.companyName.com,O=companyName,ST=ON,C=CA" }
2016-09-06T07:46:43.541Z I NETWORK  [conn11] end connection 2.2.2.2:53847 (7 connections now open)
2016-09-06T07:46:44.370Z I NETWORK  [initandlisten] connection accepted from 3.3.3.3:58224 #12 (8 connections now open)
2016-09-06T07:46:44.434Z I ACCESS   [conn12]  authenticate db: $external { authenticate: 1, mechanism: "MONGODB-X509", user: "CN=arb.companyName.com,O=companyName,ST=ON,C=CA" }
2016-09-06T07:46:44.451Z I NETWORK  [conn12] end connection 3.3.3.3:58224 (7 connections now open)
2016-09-06T07:46:47.832Z I REPL     [ReplicationExecutor] New replica set config in use: { _id: "companyName2", version: 32593, protocolVersion: 1, members: [ { _id: 1, host: "arb.companyName.com:40000", arbiterOnly: true, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: 0, votes: 1 }, { _id: 2, host: "m3.companyName.com:40000", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 11.0, tags: {}, slaveDelay: 0, votes: 1 }, { _id: 4, host: "m2.companyName.com:40000", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 3.0, tags: {}, slaveDelay: 0, votes: 1 } ], settings: { chainingAllowed: true, heartbeatIntervalMillis: 2000, heartbeatTimeoutSecs: 10, electionTimeoutMillis: 10000, getLastErrorModes: {}, getLastErrorDefaults: { w: 1, wtimeout: 0 }, replicaSetId: ObjectId('573dfcd0e8ae6154ff80c50d') } }
2016-09-06T07:46:47.832Z I REPL     [ReplicationExecutor] This node is m3.companyName.com:40000 in the config
2016-09-06T07:46:47.832Z I REPL     [ReplicationExecutor] transition to STARTUP2
2016-09-06T07:46:47.907Z I REPL     [ReplicationExecutor] Scheduling priority takeover at 2016-09-06T03:46:57.907-0400
2016-09-06T07:46:48.040Z I REPL     [ReplicationExecutor] syncing from: m2.companyName.com:40000
2016-09-06T07:46:48.545Z I REPL     [SyncSourceFeedback] setting syncSourceFeedback to m2.companyName.com:40000
2016-09-06T07:46:48.977Z I ASIO     [NetworkInterfaceASIO-0] Successfully connected to m2.companyName.com:40000
2016-09-06T07:46:50.983Z I REPL     [ReplicationExecutor] transition to RECOVERING
2016-09-06T07:46:50.985Z I REPL     [ReplicationExecutor] transition to SECONDARY
2016-09-06T07:46:51.438Z I REPL     [ReplicationExecutor] could not find member to sync from
2016-09-06T07:46:57.907Z I REPL     [ReplicationExecutor] Canceling priority takeover callback
2016-09-06T07:46:57.907Z I REPL     [ReplicationExecutor] Starting an election for a priority takeover
2016-09-06T07:46:57.907Z I REPL     [ReplicationExecutor] conducting a dry run election to see if we could be elected
2016-09-06T07:46:57.916Z I REPL     [ReplicationExecutor] dry election run succeeded, running for election
2016-09-06T07:46:57.925Z I REPL     [ReplicationExecutor] election succeeded, assuming primary role in term 244
2016-09-06T07:46:57.925Z I REPL     [ReplicationExecutor] transition to PRIMARY
2016-09-06T07:46:58.345Z I ASIO     [NetworkInterfaceASIO-0] Successfully connected to m2.companyName.com:40000
2016-09-06T07:46:58.362Z I ASIO     [NetworkInterfaceASIO-0] Successfully connected to m2.companyName.com:40000
2016-09-06T07:46:58.440Z I REPL     [rsSync] transition to primary complete; database writes are now permitted

The most obvious thing I noticed is the "No such host is known" error. Could Mongo be trying to start before Windows is able to resolve names?
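
To see how widespread these two symptoms are in a longer log, a small Node.js sketch like the following could help. It only relies on the two message shapes visible in the log above (`getaddrinfo("...") failed` and `Member ... is now in state ...`); `summarizeLog` is a hypothetical helper name, not part of MongoDB.

```javascript
// Summarize two symptoms in a mongod log: hosts that failed name
// resolution, and how many state changes were reported per member.
function summarizeLog(logText) {
  const unresolvedHosts = new Set();
  const stateChanges = {};
  for (const line of logText.split(/\r?\n/)) {
    // e.g. getaddrinfo("m3.companyName.com") failed: errno:11001 ...
    const dnsMatch = line.match(/getaddrinfo\("([^"]+)"\) failed/);
    if (dnsMatch) unresolvedHosts.add(dnsMatch[1]);

    // e.g. Member m2.companyName.com:40000 is now in state PRIMARY
    const stateMatch = line.match(/Member (\S+) is now in state (\w+)/);
    if (stateMatch) {
      const member = stateMatch[1];
      stateChanges[member] = (stateChanges[member] || 0) + 1;
    }
  }
  return { unresolvedHosts: [...unresolvedHosts], stateChanges };
}
```

Reading the log file with `fs.readFileSync(path, 'utf8')` and passing the text to `summarizeLog` gives the list of host names that could not be resolved at startup and a per-member count of reported state changes.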


2 Answers


Delay the startup of mongo. That should resolve this issue.
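
One possible way to do this is to set the Windows service to the "Automatic (Delayed Start)" startup type. Another is a small wrapper that waits until every replica set host name resolves before launching `mongod`. The Node.js sketch below illustrates the second idea under stated assumptions: `waitForHosts` and `startMongodWhenResolvable` are hypothetical helper names, the retry counts are arbitrary, and `mongod` is assumed to be on the PATH. It uses `dns.lookup`, which goes through the operating system's resolver.

```javascript
const dns = require('dns');
const { spawn } = require('child_process');

// Retry name resolution until every host resolves or attempts run out.
// Resolves with the number of the attempt that succeeded.
async function waitForHosts(hosts, options = {}) {
  const {
    lookup = (host) => dns.promises.lookup(host), // injectable for testing
    attempts = 30,   // assumption: arbitrary retry budget
    delayMs = 2000,  // assumption: arbitrary pause between attempts
  } = options;

  for (let attempt = 1; attempt <= attempts; attempt++) {
    const results = await Promise.allSettled(hosts.map((h) => lookup(h)));
    if (results.every((r) => r.status === 'fulfilled')) return attempt;
    if (attempt < attempts) {
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  throw new Error('hosts still unresolvable after ' + attempts + ' attempts');
}

// Start mongod only once all replica set host names can be resolved.
async function startMongodWhenResolvable(hosts, mongodArgs, options) {
  await waitForHosts(hosts, options);
  return spawn('mongod', mongodArgs, { stdio: 'inherit' });
}
```

A call might look like `startMongodWhenResolvable(['arb.companyName.com', 'm3.companyName.com', 'm2.companyName.com'], ['--config', 'c:/path/to/mongod.cfg'])`, where the config path is a placeholder.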

Answered 2016-09-15T06:04:05.417