Diving into ExoPlayer: Getting more control over the framework

Mahmoud Bahaa
Oct 26, 2018

Google’s ExoPlayer is a playback framework that supports a wide variety of streaming technologies on Android. Many engineers use it as a black box, and to start with, so did I. You only need to pass in a file (e.g. MP4, a manifest) or a URL and it plays. However, my goal was to make it play with a stream of audio and video access units rather than fragments (or chunks). To do this, I had to strip away much of the logic incorporated in the framework (e.g. manifest parsing, fragment demultiplexing, adaptive bitrate selection) and use it only to render frames. So, I’m writing this post to provide crucial clues on how to customize ExoPlayer. If you find yourself in the same situation, then this article is for you.

The Building Blocks

The easiest way to get started with ExoPlayer is to use SimpleExoPlayer, just to see some video playing (check the guide for details). But that only covers the simple case described above. It also helps to have ExoPlayer’s code (2.5.4) checked out so you can follow the article. Let’s look at my case, which was more complicated.

You are going to write your own logic, so start by ditching SimpleExoPlayer. To create the player instance use:

ExoPlayer player = ExoPlayerFactory.newInstance(renderers, trackSelector, loadControl);

Can I use SimpleExoPlayerView even when I’m not using SimpleExoPlayer? No, but you can use Android’s SurfaceView or TextureView and hand its surface to the video Renderer.

As the previous line of code shows, ExoPlayer has customizable components (Thanks Google!) that you need to inject. Google also provides good default implementations of them.

The following is an abstract overview of the main components and how they interact with each other.

The renderers are the components responsible for decoding (and decrypting, if needed) and outputting frames. So you want to have your own renderers? Easy! Just create a Renderer[] and pass it to the player (you probably need MediaCodecVideoRenderer and MediaCodecAudioRenderer). If you need custom rendering logic, you can extend these classes. You can also use a RenderersFactory.

What is a TrackSelector then? You guessed right, it selects the tracks for each renderer, and their RendererConfiguration. Similarly, either create your own class and extend TrackSelector or use the DefaultTrackSelector.

The LoadControl — as the name hints — defines the criteria on which ExoPlayer should stop or continue loading media (audio/video) data. I like to think of it as the media stream manager.
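
To make the “stop or continue loading” decision concrete, here is a minimal sketch of a watermark-based policy. It is not ExoPlayer’s own LoadControl: the class name WatermarkLoadControl and its two thresholds are assumptions made for illustration, and only the method name shouldContinueLoading is borrowed from the framework. Loading resumes when the buffered duration drops below the lower threshold and pauses once it reaches the upper one.

```java
/** Hypothetical, simplified stand-in for a LoadControl; not ExoPlayer's class. */
final class WatermarkLoadControl {
    private final long minBufferUs;
    private final long maxBufferUs;
    // Start in the loading state so an empty buffer gets filled.
    private boolean loading = true;

    WatermarkLoadControl(long minBufferUs, long maxBufferUs) {
        if (minBufferUs > maxBufferUs) {
            throw new IllegalArgumentException("minBufferUs must not exceed maxBufferUs");
        }
        this.minBufferUs = minBufferUs;
        this.maxBufferUs = maxBufferUs;
    }

    /** Returns whether loading should continue for the given buffered duration. */
    boolean shouldContinueLoading(long bufferedDurationUs) {
        if (bufferedDurationUs < minBufferUs) {
            loading = true;   // running low: (re)start loading
        } else if (bufferedDurationUs >= maxBufferUs) {
            loading = false;  // buffer is full enough: pause loading
        }
        // Between the two watermarks the previous decision is kept, which
        // avoids flipping between loading and idle on every check.
        return loading;
    }
}
```

With thresholds of 15 and 30 seconds, for example, a buffer that has filled to 30 seconds stops loading and only resumes once it drains below 15 seconds.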

The Media Flow

To recap, I wanted to utilize ExoPlayer’s rendering and A/V synchronization components to play my stream of access units. So, I needed:

  1. a buffer to hold the media that ExoPlayer can pull from (SampleStream).
  2. to represent this media in wrappers that ExoPlayer can understand (MediaSource, MediaPeriod).

So in this section I’ll focus on some aspects of customizing SampleStream, MediaSource and MediaPeriod.

Good, so we have a player instance and a basic understanding of the main components of the player. What’s next? We need to prepare the player for playback. The preparation configures the codecs for the incoming stream and hooks ExoPlayer components to each other. To prepare the player, do the following:

CustomMediaSource mediaSource = new CustomMediaSource(...);
player.prepare(mediaSource);

To play content, ExoPlayer needs a Timeline. The MediaSource holds the Timeline of the media and exposes the MediaPeriod. ExoPlayer provides various implementations of MediaSource and MediaPeriod, but nothing stops you from defining your own.

If you don’t need anything complicated, you can use a SinglePeriodTimeline as your Timeline. The Timeline is exposed by the MediaSource when player.prepare() invokes prepareSource() on the injected MediaSource. To expose the Timeline, just add the following line in your prepareSource():

// In your CustomMediaSource (implements MediaSource) class
@Override
public void prepareSource(ExoPlayer player, boolean isTopLevelSource, Listener listener) {
    ...
    // Hand the Timeline to ExoPlayer; there is no manifest here, hence null
    listener.onSourceInfoRefreshed(timeline, null);
    ...
}

Another important method in the MediaSource is createPeriod(). It is invoked by the player to get the MediaPeriod.

// In your CustomMediaSource (implements MediaSource) class
@Override
public MediaPeriod createPeriod(MediaPeriodId id, Allocator allocator) {
    // You can use any other MediaPeriod provided by ExoPlayer.
    return new CustomMediaPeriod(formats, allocator);
}

Remember the LoadControl? It doesn’t poll the amount of loaded data, so you need to tell ExoPlayer explicitly to check the data you have so far and manage the media feed. Therefore, you need to use the Callback that is passed to MediaPeriod prepare() as follows:

// In your CustomMediaPeriod (implements MediaPeriod) class
@Override
public void prepare(final Callback callback, long positionUs) {
    ...
    // Post this request to ExoPlayer when it fits.
    // This depends on your use case.
    final Runnable loadingRequest = new Runnable() {
        @Override
        public void run() {
            // Tells ExoPlayer to check the LoadControl. The argument is the
            // SequenceableLoader making the request, here this MediaPeriod.
            callback.onContinueLoadingRequested(CustomMediaPeriod.this);
        }
    };
    ...
}

This call tells ExoPlayer to query the LoadControl via shouldContinueLoading(). To listen to the resulting state changes, you need to implement the player’s EventListener. Each change is propagated to onLoadingChanged(isLoading), which is part of the EventListener interface.
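
The notification pattern can be sketched without any ExoPlayer dependency. The types below (LoadingListener, LoadingStateTracker) are hypothetical names for illustration, and the sketch assumes that listeners are told about the loading state only when it actually flips, not on every check.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical listener, shaped like the onLoadingChanged callback. */
interface LoadingListener {
    void onLoadingChanged(boolean isLoading);
}

/** Hypothetical helper that turns repeated load decisions into change events. */
final class LoadingStateTracker {
    private final List<LoadingListener> listeners = new ArrayList<>();
    private boolean isLoading;

    void addListener(LoadingListener listener) {
        listeners.add(listener);
    }

    /** Feed in the latest load decision; listeners hear only about changes. */
    void update(boolean shouldLoad) {
        if (shouldLoad == isLoading) {
            return; // same state as before: nothing to report
        }
        isLoading = shouldLoad;
        for (LoadingListener listener : listeners) {
            listener.onLoadingChanged(isLoading);
        }
    }
}
```

Feeding it the decisions true, true, false, false, true produces three callbacks: true, false, true.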

By now you’re probably thinking: when is he going to stop throwing new components at us? Well, not yet. MediaPeriod has several methods, but to keep things simple I want to focus on:

selectTracks(TrackSelection[] selections, boolean[] mayRetainStreamFlags, SampleStream[] streams, boolean[] streamResetFlags, long positionUs);

TrackSelection[] holds the formats (e.g. mime type, frame rate) of the media. Wait, what? How do I get the media formats? I assume that you already know the formats/configuration of your streams by other means (e.g. from the manifest). OK, how do I tell ExoPlayer the formats of the media I want to play? My bad! You need to expose it to ExoPlayer in your MediaPeriod getTrackGroups(). For this you will need to wrap them in a TrackGroupArray.

// In your CustomMediaPeriod (implements MediaPeriod) class
// The logic to configure the tracks does not have to be in the
// prepare method.
@Override
public void prepare(final Callback callback, long positionUs) {
    // Pass Format[] to MediaPeriod based on your application logic
    mediaTracks = buildTracks(formats);
    ...
}
// ExoPlayer calls this method early on to get the playable tracks
@Override
public TrackGroupArray getTrackGroups() {
    // These are the configured (formatted) media tracks
    return mediaTracks;
}
// Wraps each Format in its own TrackGroup
private TrackGroupArray buildTracks(Format[] formats) {
    TrackGroup[] trackGroups = new TrackGroup[formats.length];
    for (int i = 0; i < trackGroups.length; i++) {
        trackGroups[i] = new TrackGroup(formats[i]);
    }
    return new TrackGroupArray(trackGroups);
}
@Override
public long selectTracks(TrackSelection[] selections, boolean[] mayRetainStreamFlags,
        SampleStream[] streams, boolean[] streamResetFlags, long positionUs) {
    ...
    for (int i = 0; i < selections.length; i++) {
        // A renderer without a selected track has a null selection
        if (selections[i] == null) {
            continue;
        }
        Format format = selections[i].getSelectedFormat();
        CustomSampleStream stream = new CustomSampleStream(...);
        stream.configure(format);
        // ExoPlayer expects you to feed its SampleStream[] with what
        // you want the renderers to pull from.
        streams[i] = stream;
        // Flag the slot so ExoPlayer knows this is a new stream
        streamResetFlags[i] = true;
    }
    ...
}
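
The in-place contract of selectTracks() is easy to get wrong, so here is a self-contained sketch of it with the ExoPlayer types swapped out. TrackStream and StreamSelector are hypothetical names, and a plain String stands in for a TrackSelection. The sketch assumes the usual handling: a null selection clears its slot, a stream that may be retained is left alone, and every newly created stream is flagged in streamResetFlags.

```java
/** Hypothetical stand-in for a SampleStream bound to one track format. */
final class TrackStream {
    final String format;

    TrackStream(String format) {
        this.format = format;
    }
}

/** Hypothetical helper that mimics the array-filling style of selectTracks(). */
final class StreamSelector {
    static long selectTracks(String[] selections, boolean[] mayRetainStreamFlags,
            TrackStream[] streams, boolean[] streamResetFlags, long positionUs) {
        for (int i = 0; i < selections.length; i++) {
            // Drop an existing stream that is deselected or may not be retained.
            if (streams[i] != null
                    && (selections[i] == null || !mayRetainStreamFlags[i])) {
                streams[i] = null;
            }
            // Create a stream for every selection that has none, and flag it.
            if (streams[i] == null && selections[i] != null) {
                streams[i] = new TrackStream(selections[i]);
                streamResetFlags[i] = true;
            }
        }
        // The result is the position from which playback will actually start.
        return positionUs;
    }
}
```

Note that nothing is returned through the return value except the position; the streams reach the caller through the array it passed in.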

Going back to selectTracks(), there is another crucial parameter, the SampleStream array. The SampleStream is the buffer that holds the encoded (and encrypted) frames. For ExoPlayer to properly pull the data from your SampleStream, you need to make sure that selectTracks() fills this array with the SampleStream(s) that you’re feeding your encoded data to.

So, a new component: SampleStream! It’s the last one. Once the SampleStream(s) are exposed to ExoPlayer via selectTracks(), it passes them to the renderers. Let’s look at what happens in MediaCodecRenderer.java: the Renderer passes a codec buffer (DecoderInputBuffer) to the SampleStream and expects it to be filled with an encoded frame. Nothing complicated, it basically calls SampleStream.readData(). The frame and its presentation timestamp need to be fed to the SampleStream (take a look at SampleQueue and SampleMetadataQueue and their interaction with SampleStream), so that ExoPlayer knows when to render and how to synchronize your media (let’s say audio and video for simplicity). Finally, the Renderer feeds this buffer to the codec for decoding and, with the help of the player, times its output to the surface.
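
The producer/consumer relationship described above can be sketched as a small queue of timestamped access units. This is a simplified model, not ExoPlayer’s SampleQueue: AccessUnitQueue, InputBuffer, and ReadResult are hypothetical names, and format changes, key-frame flags, and encryption data are left out. Your application queues access units on one side, and the renderer’s read call drains them on the other.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;

/** Hypothetical stand-in for the buffer a renderer hands over to be filled. */
final class InputBuffer {
    ByteBuffer data;
    long timeUs;
}

/** Hypothetical FIFO of encoded access units with presentation timestamps. */
final class AccessUnitQueue {
    enum ReadResult { NOTHING_READ, BUFFER_READ }

    private static final class AccessUnit {
        final byte[] payload;
        final long ptsUs;

        AccessUnit(byte[] payload, long ptsUs) {
            this.payload = payload;
            this.ptsUs = ptsUs;
        }
    }

    private final ArrayDeque<AccessUnit> units = new ArrayDeque<>();

    /** Producer side: called by the application as access units arrive. */
    synchronized void queue(byte[] payload, long ptsUs) {
        units.addLast(new AccessUnit(payload.clone(), ptsUs));
    }

    /** Whether a read would currently return data. */
    synchronized boolean isReady() {
        return !units.isEmpty();
    }

    /** Consumer side: fills the buffer with the oldest unit, if there is one. */
    synchronized ReadResult readData(InputBuffer buffer) {
        AccessUnit unit = units.pollFirst();
        if (unit == null) {
            return ReadResult.NOTHING_READ;
        }
        buffer.data = ByteBuffer.wrap(unit.payload);
        buffer.timeUs = unit.ptsUs; // the timestamp drives rendering and A/V sync
        return ReadResult.BUFFER_READ;
    }
}
```

The methods are synchronized because, in this model, the thread that receives the access units is assumed to be different from the one that reads them.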

You have probably seen the word “customize” a lot in this article, and ExoPlayer does allow you to customize most of its components. However, you should avoid touching ExoPlayer’s sacred components (e.g. ExoPlayerImpl). If the class is package-private, then probably there is no need for you to change it, and there’s a “longer” way to achieve what you want.

Hopefully by now you understand how the media workflow in ExoPlayer works. I have not covered many of ExoPlayer’s areas, but I can write follow-up articles to zoom in on specific parts if needed.

Notes

  1. This post is based on ExoPlayer 2.5, so some of it may become irrelevant with newer ExoPlayer versions.
  2. Many details have been abstracted for simplicity, so feel free to reach out to me or ExoPlayer for questions.
