I followed the instructions at https://coreos.com/kubernetes/docs/latest/deploy-addons.html to install KubeDNS manually by creating the kube-dns replication controller and service.
The YAML is as follows:
apiVersion: v1
kind: Service
metadata:
  name: kube-dns
  namespace: kube-system
  labels:
    k8s-app: kube-dns
    kubernetes.io/cluster-service: "true"
    kubernetes.io/name: "KubeDNS"
spec:
  selector:
    k8s-app: kube-dns
  clusterIP: ${DNS_SERVICE_IP}
  ports:
  - name: dns
    port: 53
    protocol: UDP
  - name: dns-tcp
    port: 53
    protocol: TCP
---
apiVersion: v1
kind: ReplicationController
metadata:
  name: kube-dns-v11
  namespace: kube-system
  labels:
    k8s-app: kube-dns
    version: v11
    kubernetes.io/cluster-service: "true"
spec:
  replicas: 1
  selector:
    k8s-app: kube-dns
    version: v11
  template:
    metadata:
      labels:
        k8s-app: kube-dns
        version: v11
        kubernetes.io/cluster-service: "true"
    spec:
      containers:
      - name: etcd
        image: gcr.io/google_containers/etcd-amd64:2.2.1
        resources:
          limits:
            cpu: 100m
            memory: 500Mi
          requests:
            cpu: 100m
            memory: 50Mi
        command:
        - /usr/local/bin/etcd
        - -data-dir
        - /var/etcd/data
        - -listen-client-urls
        - http://127.0.0.1:2379,http://127.0.0.1:4001
        - -advertise-client-urls
        - http://127.0.0.1:2379,http://127.0.0.1:4001
        - -initial-cluster-token
        - skydns-etcd
        volumeMounts:
        - name: etcd-storage
          mountPath: /var/etcd/data
      - name: kube2sky
        image: gcr.io/google_containers/kube2sky:1.14
        resources:
          limits:
            cpu: 100m
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 50Mi
        livenessProbe:
          httpGet:
            path: /healthz
            port: 8080
            scheme: HTTP
          initialDelaySeconds: 60
          timeoutSeconds: 5
          successThreshold: 1
          failureThreshold: 5
        readinessProbe:
          httpGet:
            path: /readiness
            port: 8081
            scheme: HTTP
          initialDelaySeconds: 30
          timeoutSeconds: 5
        args:
        # command = "/kube2sky"
        - --domain=cluster.local
      - name: skydns
        image: gcr.io/google_containers/skydns:2015-10-13-8c72f8c
        resources:
          limits:
            cpu: 100m
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 50Mi
        args:
        # command = "/skydns"
        - -machines=http://127.0.0.1:4001
        - -addr=0.0.0.0:53
        - -ns-rotate=false
        - -domain=cluster.local.
        ports:
        - containerPort: 53
          name: dns
          protocol: UDP
        - containerPort: 53
          name: dns-tcp
          protocol: TCP
      - name: healthz
        image: gcr.io/google_containers/exechealthz:1.0
        resources:
          limits:
            cpu: 10m
            memory: 20Mi
          requests:
            cpu: 10m
            memory: 20Mi
        args:
        - -cmd=nslookup kubernetes.default.svc.cluster.local 127.0.0.1 >/dev/null
        - -port=8080
        ports:
        - containerPort: 8080
          protocol: TCP
      volumes:
      - name: etcd-storage
        emptyDir: {}
      dnsPolicy: Default
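The `${DNS_SERVICE_IP}` placeholder in the Service has to be filled in before the manifest is submitted. As a minimal sketch of that substitution (the helper name `render_manifest` is my own, and `10.254.0.2` is simply the DNS service IP I query in the `nslookup` tests further down, not a value taken from the CoreOS guide):

```python
from string import Template

# Fragment of the Service manifest above, with the placeholder left in.
SERVICE_FRAGMENT = """\
spec:
  selector:
    k8s-app: kube-dns
  clusterIP: ${DNS_SERVICE_IP}
"""


def render_manifest(text: str, dns_service_ip: str) -> str:
    """Fill in ${DNS_SERVICE_IP}.

    Template.substitute raises KeyError if the text contains any other
    placeholder that was not supplied, so a half-rendered manifest is
    never returned silently.
    """
    return Template(text).substitute(DNS_SERVICE_IP=dns_service_ip)


# 10.254.0.2 is an assumption: the DNS service IP used in this cluster.
rendered = render_manifest(SERVICE_FRAGMENT, "10.254.0.2")
print(rendered)
```

Rendering the text first makes it easy to confirm that no `${...}` is left in what gets passed to `kubectl create`.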
However, after creating the resources, checking the status returns the following:
[root@sc-master-1 pods]# kubectl get pods --namespace=kube-system
NAME                 READY     STATUS             RESTARTS   AGE
kube-dns-v11-fyug1   2/4       CrashLoopBackOff   2          36s
I narrowed it down to an etcd error:
[root@sc-client-2 jonathan]# docker logs bf0466c30b1d
2016-05-15 08:39:26.124819 I | etcdmain: etcd Version: 2.2.1
2016-05-15 08:39:26.124851 I | etcdmain: Git SHA: 75f8282
2016-05-15 08:39:26.124857 I | etcdmain: Go Version: go1.5.1
2016-05-15 08:39:26.124860 I | etcdmain: Go OS/Arch: linux/amd64
2016-05-15 08:39:26.135982 I | etcdmain: setting maximum number of CPUs to 1, total number of available CPUs is 1
2016-05-15 08:39:26.136562 I | etcdmain: listening for peers on http://localhost:2380
2016-05-15 08:39:26.136704 I | etcdmain: listening for peers on http://localhost:7001
2016-05-15 08:39:26.136746 I | etcdmain: listening for client requests on http://127.0.0.1:2379
2016-05-15 08:39:26.136814 I | etcdmain: listening for client requests on http://127.0.0.1:4001
2016-05-15 08:39:26.136931 I | etcdmain: stopping listening for client requests on http://127.0.0.1:4001
2016-05-15 08:39:26.136943 I | etcdmain: stopping listening for client requests on http://127.0.0.1:2379
2016-05-15 08:39:26.136951 I | etcdmain: stopping listening for peers on http://localhost:7001
2016-05-15 08:39:26.136957 I | etcdmain: stopping listening for peers on http://localhost:2380
2016-05-15 08:39:26.136967 C | etcdmain: mkdir /var/etcd/data/member: permission denied
I don't understand why this doesn't work. Also, because the container never stays up, I can't exec into it to create the folder manually.
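Update:
I used:
volumeMounts:
- name: etcd-storage
  mountPath: /var/etcd/data:z
to get around the etcd error about not having permission to write to the emptyDir, but I still can't bring up the DNS service. Below are the relevant logs from the containers in the kube-dns pod.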
[root@sc-master-1 pods]# kubectl get pods --namespace=kube-system
NAME READY STATUS RESTARTS AGE
kube-dns-v11-0bkop 3/4 Running 17 20m
skydns logs:
[root@sc-master-1 pods]# kubectl logs kube-dns-v11-0bkop skydns --namespace=kube-system
2016/05/18 17:01:11 skydns: falling back to default configuration, could not read from etcd: 100: Key not found (/skydns) [3]
2016/05/18 17:01:11 skydns: ready for queries on cluster.local. for tcp://0.0.0.0:53 [rcache 0]
2016/05/18 17:01:11 skydns: ready for queries on cluster.local. for udp://0.0.0.0:53 [rcache 0]
kube2sky logs:
[root@sc-master-1 pods]# kubectl logs kube-dns-v11-0bkop kube2sky --namespace=kube-system
I0518 17:02:17.693959 1 kube2sky.go:462] Etcd server found: http://127.0.0.1:4001
I0518 17:02:18.697702 1 kube2sky.go:529] Using http://localhost:8080 for kubernetes master
I0518 17:02:18.698071 1 kube2sky.go:530] Using kubernetes API <nil>
I0518 17:02:18.698199 1 kube2sky.go:598] Waiting for service: default/kubernetes
I0518 17:02:18.701663 1 kube2sky.go:604] Ignoring error while waiting for service default/kubernetes: yaml: mapping values are not allowed in this context. Sleeping 1s before retrying.
healthz logs:
[root@sc-master-1 pods]# kubectl logs kube-dns-v11-0bkop healthz --namespace=kube-system
2016/05/18 17:01:17 Worker running nslookup kubernetes.default.svc.cluster.local 127.0.0.1 >/dev/null
2016/05/18 17:02:12 Client ip 172.17.0.1:35440 requesting /healthz probe servicing cmd nslookup kubernetes.default.svc.cluster.local 127.0.0.1 >/dev/null
2016/05/18 17:03:22 Healthz probe error: Result of last exec: nslookup: can't resolve 'kubernetes.default.svc.cluster.local'
, at 2016-05-18 17:02:22.691812173 +0000 UTC, error exit status 1
2016/05/18 17:03:22 Client ip 172.17.0.1:35475 requesting /healthz probe servicing cmd nslookup kubernetes.default.svc.cluster.local 127.0.0.1 >/dev/null
etcd logs:
[root@sc-master-1 pods]# kubectl logs kube-dns-v11-0bkop etcd --namespace=kube-system
2016-05-18 17:01:02.478791 I | etcdmain: etcd Version: 2.2.1
2016-05-18 17:01:02.478825 I | etcdmain: Git SHA: 75f8282
2016-05-18 17:01:02.478831 I | etcdmain: Go Version: go1.5.1
2016-05-18 17:01:02.478846 I | etcdmain: Go OS/Arch: linux/amd64
2016-05-18 17:01:02.478851 I | etcdmain: setting maximum number of CPUs to 2, total number of available CPUs is 2
2016-05-18 17:01:02.485798 I | etcdmain: listening for peers on http://localhost:2380
2016-05-18 17:01:02.485931 I | etcdmain: listening for peers on http://localhost:7001
2016-05-18 17:01:02.485984 I | etcdmain: listening for client requests on http://127.0.0.1:2379
2016-05-18 17:01:02.486070 I | etcdmain: listening for client requests on http://127.0.0.1:4001
2016-05-18 17:01:02.486300 I | etcdserver: name = default
2016-05-18 17:01:02.486324 I | etcdserver: data dir = /var/etcd/data
2016-05-18 17:01:02.486329 I | etcdserver: member dir = /var/etcd/data/member
2016-05-18 17:01:02.486333 I | etcdserver: heartbeat = 100ms
2016-05-18 17:01:02.486337 I | etcdserver: election = 1000ms
2016-05-18 17:01:02.486341 I | etcdserver: snapshot count = 10000
2016-05-18 17:01:02.486350 I | etcdserver: advertise client URLs = http://127.0.0.1:2379,http://127.0.0.1:4001
2016-05-18 17:01:02.486356 I | etcdserver: initial advertise peer URLs = http://localhost:2380,http://localhost:7001
2016-05-18 17:01:02.486365 I | etcdserver: initial cluster = default=http://localhost:2380,default=http://localhost:7001
2016-05-18 17:01:02.523097 I | etcdserver: starting member 6a5871dbdd12c17c in cluster f68652439e3f8f2a
2016-05-18 17:01:02.523157 I | raft: 6a5871dbdd12c17c became follower at term 0
2016-05-18 17:01:02.523192 I | raft: newRaft 6a5871dbdd12c17c [peers: [], term: 0, commit: 0, applied: 0, lastindex: 0, lastterm: 0]
2016-05-18 17:01:02.523198 I | raft: 6a5871dbdd12c17c became follower at term 1
2016-05-18 17:01:02.523329 I | etcdserver: starting server... [version: 2.2.1, cluster version: to_be_decided]
2016-05-18 17:01:02.524093 N | etcdserver: added local member 6a5871dbdd12c17c [http://localhost:2380 http://localhost:7001] to cluster f68652439e3f8f2a
2016-05-18 17:01:03.323562 I | raft: 6a5871dbdd12c17c is starting a new election at term 1
2016-05-18 17:01:03.323722 I | raft: 6a5871dbdd12c17c became candidate at term 2
2016-05-18 17:01:03.323739 I | raft: 6a5871dbdd12c17c received vote from 6a5871dbdd12c17c at term 2
2016-05-18 17:01:03.323776 I | raft: 6a5871dbdd12c17c became leader at term 2
2016-05-18 17:01:03.323787 I | raft: raft.node: 6a5871dbdd12c17c elected leader 6a5871dbdd12c17c at term 2
2016-05-18 17:01:03.324154 I | etcdserver: setting up the initial cluster version to 2.2
2016-05-18 17:01:03.324251 I | etcdserver: published {Name:default ClientURLs:[http://127.0.0.1:2379 http://127.0.0.1:4001]} to cluster f68652439e3f8f2a
2016-05-18 17:01:03.473271 N | etcdserver: set the initial cluster version to 2.2
Update:
I got one step further by hardcoding the master in the env variable in the dns-addon.yml above.
Now I get:
[root@sc-master-1 pods]# kubectl logs kube-dns-v11-sgb1r -c kube2sky --namespace=kube-system
I0518 18:08:58.837758 1 kube2sky.go:462] Etcd server found: http://127.0.0.1:4001
I0518 18:08:59.839548 1 kube2sky.go:529] Using master for kubernetes master
I0518 18:08:59.839565 1 kube2sky.go:530] Using kubernetes API <nil>
I0518 18:08:59.839676 1 kube2sky.go:598] Waiting for service: default/kubernetes
Update: I was able to get this working by using the fqdn, i.e. http://:8080 instead of :8080.
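The difference between the two forms is easy to see with any URL parser. The sketch below uses Python's `urllib.parse`, not the Go parser that kube2sky itself uses, and `master.example` is a made-up hostname standing in for my master's FQDN. It only illustrates that an address written without a scheme does not parse into a host and port.

```python
from urllib.parse import urlsplit


def has_host(address: str) -> bool:
    """True if the address parses as a URL with both a scheme and a
    network location (host[:port])."""
    parts = urlsplit(address)
    return bool(parts.scheme) and bool(parts.netloc)


# "master.example" is a placeholder hostname, not the real master.
print(has_host("master.example:8080"))         # no "//", so no host is recognised
print(has_host("http://master.example:8080"))  # scheme and host are both present
```

I can't say whether this is exactly how kube2sky interprets the value, only that adding `http://` is what made it connect in my case.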
Using a busybox pod, I can now run:
[root@sc-master-1 jonathan]# kubectl exec busybox -- nslookup kubernetes.default.svc.cluster.local 10.254.0.2
Server: 10.254.0.2
Address 1: 10.254.0.2
Name: kubernetes.default.svc.cluster.local
Address 1: 10.254.0.1
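This works, but I noticed some strange behavior. DNS works as long as I query it from inside a pod. When I run the same lookup directly on the master host: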
[root@sc-master-1 jonathan]# nslookup kubernetes.default.svc.cluster.local 10.254.0.2
;; connection timed out; trying next origin
;; connection timed out; no servers could be reached
I get the error above, but the same command works when I run it on the node where kube-dns is scheduled:
[jonathan@sc-client-2 ~]$ nslookup kubernetes.default.svc.cluster.local 10.254.0.2
Server: 10.254.0.2
Address: 10.254.0.2#53
Name: kubernetes.default.svc.cluster.local
Address: 10.254.0.1
When I test from another node in the cluster, I get the same error as on the master:
[root@sc-client-1 jonathan]# nslookup kubernetes.default.svc.cluster.local 10.254.0.2
;; connection timed out; trying next origin
;; connection timed out; no servers could be reached
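To rule out `nslookup` itself, the same check can be done by sending a single DNS query over UDP to the service IP and seeing whether anything comes back. This is a rough sketch using only the Python standard library; the function names `build_query` and `probe` are my own, and it only looks at the response code rather than decoding the answer.

```python
import socket
import struct


def build_query(name: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query for an A record (RFC 1035 wire format)."""
    # Header: id, flags (recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Question name: each label prefixed by its length, ended by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE=1 (A), QCLASS=1 (IN).
    return header + qname + struct.pack("!HH", 1, 1)


def probe(server: str, name: str, timeout: float = 2.0) -> str:
    """Send one UDP query to server:53 and report whether a reply arrived."""
    query = build_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(query, (server, 53))
            reply, _ = sock.recvfrom(512)
        except socket.timeout:
            return "timeout"
        except OSError as exc:
            return f"error: {exc}"
    if len(reply) < 4:
        return "short reply"
    # The low 4 bits of the flags word hold the response code (0 = no error).
    flags = struct.unpack("!H", reply[2:4])[0]
    return f"reply, rcode={flags & 0x000F}"
```

Calling `probe("10.254.0.2", "kubernetes.default.svc.cluster.local")` on each host should show the same split as above if the problem is reachability of the service IP rather than the DNS server: a reply on sc-client-2 and a timeout on sc-master-1 and sc-client-1.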
What could be wrong?