Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
384 views
in Technique[技术] by (71.8m points)

monitoring - Sidecar containers in Kubernetes Jobs?

We use Kubernetes Jobs for a lot of batch computing here and I'd like to instrument each Job with a monitoring sidecar to update a centralized tracking system with the progress of a job.

The only problem is, I can't figure out what the semantics are (or are supposed to be) of multiple containers in a job.

I gave it a shot anyways (with an alpine sidecar that printed "hello" every 1 sec) and after my main task completed, the Jobs are considered Successful and the kubectl get pods in Kubernetes 1.2.0 shows:

NAME                                         READY     STATUS      RESTARTS   AGE
    job-69541b2b2c0189ba82529830fe6064bd-ddt2b   1/2       Completed   0          4m
    job-c53e78aee371403fe5d479ef69485a3d-4qtli   1/2       Completed   0          4m
    job-df9a48b2fc89c75d50b298a43ca2c8d3-9r0te   1/2       Completed   0          4m
    job-e98fb7df5e78fc3ccd5add85f8825471-eghtw   1/2       Completed   0          4m

And if I describe one of those pods

State:              Terminated
  Reason:           Completed
  Exit Code:        0
  Started:          Thu, 24 Mar 2016 11:59:19 -0700
  Finished:         Thu, 24 Mar 2016 11:59:21 -0700

Then GETing the yaml of the job shows information per container:

  status:
    conditions:
    - lastProbeTime: null
      lastTransitionTime: 2016-03-24T18:59:29Z
      message: 'containers with unready status: [pod-template]'
      reason: ContainersNotReady
      status: "False"
      type: Ready
    containerStatuses:
    - containerID: docker://333709ca66462b0e41f42f297fa36261aa81fc099741e425b7192fa7ef733937
      image: luigi-reduce:0.2
      imageID: docker://sha256:5a5e15390ef8e89a450dac7f85a9821fb86a33b1b7daeab9f116be252424db70
      lastState: {}
      name: pod-template
      ready: false
      restartCount: 0
      state:
        terminated:
          containerID: docker://333709ca66462b0e41f42f297fa36261aa81fc099741e425b7192fa7ef733937
          exitCode: 0
          finishedAt: 2016-03-24T18:59:30Z
          reason: Completed
          startedAt: 2016-03-24T18:59:29Z
    - containerID: docker://3d2b51436e435e0b887af92c420d175fafbeb8441753e378eb77d009a38b7e1e
      image: alpine
      imageID: docker://sha256:70c557e50ed630deed07cbb0dc4d28aa0f2a485cf7af124cc48f06bce83f784b
      lastState: {}
      name: sidecar
      ready: true
      restartCount: 0
      state:
        running:
          startedAt: 2016-03-24T18:59:31Z
    hostIP: 10.2.113.74
    phase: Running

So it looks like my sidecar would need to watch the main process (how?) and exit gracefully once it detects it is alone in the pod? If this is correct, then are there best practices/patterns for this (should the sidecar exit with the return code of the main container? but how does it get that?)?

** Update ** After further experimentation, I've also discovered the following: If there are two containers in a pod, then it is not considered successful until all containers in the pod return with exit code 0.

Additionally, if restartPolicy: OnFailure is set on the pod spec, then any container in the pod that terminates with non-zero exit code will be restarted in the same pod (this could be useful for a monitoring sidecar to count the number of retries and delete the job after a certain number (to workaround no max-retries currently available in Kubernetes jobs)).

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

You can use the downward api to figure out your own podname from within the sidecar, and then retrieving your own pod from the apiserver to lookup exist status. Let me know how this goes.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...