I ran into the same issue. Unfortunately, the only way I found to do it, short of switching to QTKit, is to create a separate subtitle layer (a CATextLayer) and add it as an appropriately positioned sublayer of the player layer. You then set up a periodic time observer that fires every second or so and updates the subtitle text, along with (optionally) any UI element you have that shows the video's elapsed time.
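To make the idea concrete, here is a minimal Swift sketch of the overlay approach. The `Cue` struct and `SubtitleOverlay` class are hypothetical names of my own, and the font size, frame, and 0.5-second interval are arbitrary assumptions you would tune for your player; only the AVFoundation and Core Animation calls are real API.

```swift
import AVFoundation
import QuartzCore

// Hypothetical cue type: one subtitle with its display window in seconds.
struct Cue {
    let start: TimeInterval
    let end: TimeInterval
    let text: String
}

// Hypothetical helper that owns the text layer and the time observer.
final class SubtitleOverlay {
    let textLayer = CATextLayer()
    private let player: AVPlayer
    private let cues: [Cue]
    private var observerToken: Any?

    init(player: AVPlayer, playerLayer: AVPlayerLayer, cues: [Cue]) {
        self.player = player
        self.cues = cues

        // Basic styling; all values here are placeholders.
        textLayer.alignmentMode = .center
        textLayer.isWrapped = true
        textLayer.fontSize = 24
        textLayer.foregroundColor = CGColor(gray: 1, alpha: 1)
        // Assumes the default macOS layer geometry (origin at bottom-left),
        // so a small y value places the text near the bottom of the video.
        textLayer.frame = CGRect(x: 0, y: 20,
                                 width: playerLayer.bounds.width, height: 60)
        playerLayer.addSublayer(textLayer)

        // Fire roughly twice a second on the main queue.
        let interval = CMTime(seconds: 0.5, preferredTimescale: 600)
        observerToken = player.addPeriodicTimeObserver(
            forInterval: interval, queue: DispatchQueue.main
        ) { [weak self] time in
            self?.update(at: time.seconds)
        }
    }

    private func update(at seconds: TimeInterval) {
        // Linear search is fine for a sketch; cues are assumed not to overlap.
        let cue = cues.first { $0.start <= seconds && seconds < $0.end }
        // Disable implicit animations so the text changes instantly.
        CATransaction.begin()
        CATransaction.setDisableActions(true)
        textLayer.string = cue?.text ?? ""
        CATransaction.commit()
    }

    deinit {
        if let token = observerToken {
            player.removeTimeObserver(token)
        }
    }
}
```

Keep a strong reference to the overlay object for as long as the player is on screen; the observer is removed when the overlay is deallocated. The same callback is a natural place to update an elapsed-time label.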
I wrote a basic SubRip (.srt) file parser class; you can find it here: https://github.com/sstigler/SubRip-for-Mac . Be sure to check the wiki for documentation. The class is released under the BSD license.
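If you just want to see what parsing involves, here is a rough Swift sketch. It is not the API of the class linked above; `SubtitleCue`, `parseTimestamp`, and `parseSRT` are hypothetical names, and it assumes well-formed input where each block is an index line, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` line, and one or more text lines, with blocks separated by a blank line. Anything that does not fit that shape is skipped.

```swift
import Foundation

// Hypothetical cue type: times are in seconds from the start of the video.
struct SubtitleCue: Equatable {
    let start: TimeInterval
    let end: TimeInterval
    let text: String
}

// Converts an SRT timestamp such as "00:01:02,500" to seconds.
func parseTimestamp(_ raw: String) -> TimeInterval? {
    let parts = raw.trimmingCharacters(in: .whitespaces)
        .replacingOccurrences(of: ",", with: ".")
        .split(separator: ":")
    guard parts.count == 3,
          let hours = Double(parts[0]),
          let minutes = Double(parts[1]),
          let seconds = Double(parts[2])
    else { return nil }
    return hours * 3600 + minutes * 60 + seconds
}

// Splits the file into blank-line-separated blocks and parses each one.
func parseSRT(_ contents: String) -> [SubtitleCue] {
    let normalized = contents.replacingOccurrences(of: "\r\n", with: "\n")
    return normalized.components(separatedBy: "\n\n").compactMap { block in
        let lines = block.split(separator: "\n").map(String.init)
        // Need at least: index, timing line, one line of text.
        guard lines.count >= 3 else { return nil }
        let times = lines[1].components(separatedBy: " --> ")
        guard times.count == 2,
              let start = parseTimestamp(times[0]),
              let end = parseTimestamp(times[1])
        else { return nil }
        let text = lines[2...].joined(separator: "\n")
        return SubtitleCue(start: start, end: end, text: text)
    }
}
```

The resulting array is what the periodic time observer would search on each tick to find the cue covering the current playback time.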
Another challenge you might run into is dynamically adjusting the height of the CATextLayer to account for subtitles of varying length and for varying widths of the containing view (if you choose to make it user-resizable). I found a great CALayoutManager subclass that does this, and revised it to work for what I was trying to do: https://github.com/sstigler/height-for-width .
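If you would rather not pull in a layout manager, one alternative (not what the linked project does, as far as this sketch is concerned) is to measure the wrapped text yourself with AppKit and set the layer's height from that. `subtitleHeight` and `resize` below are hypothetical helper names, and the sketch assumes the font you measure with matches the font the CATextLayer actually draws with.

```swift
import AppKit
import QuartzCore

// Returns the height needed to draw `text` wrapped to `width` in `font`.
func subtitleHeight(for text: String, font: NSFont, width: CGFloat) -> CGFloat {
    let attributed = NSAttributedString(string: text, attributes: [.font: font])
    let rect = attributed.boundingRect(
        with: NSSize(width: width, height: .greatestFiniteMagnitude),
        options: [.usesLineFragmentOrigin, .usesFontLeading])
    // Round up so the last line is never clipped.
    return ceil(rect.height)
}

// Resizes the subtitle layer to fit its text at the container's current width.
func resize(_ textLayer: CATextLayer, text: String, font: NSFont,
            containerWidth: CGFloat) {
    let height = subtitleHeight(for: text, font: font, width: containerWidth)
    CATransaction.begin()
    CATransaction.setDisableActions(true) // avoid an implicit resize animation
    textLayer.frame = CGRect(x: textLayer.frame.origin.x,
                             y: textLayer.frame.origin.y,
                             width: containerWidth,
                             height: height)
    CATransaction.commit()
}
```

You would call `resize` whenever the subtitle text changes and whenever the containing view is resized.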
I hope this helps.