Let's Build Your Own Netflix

Anurag kumar
6 min read · Dec 12, 2020

In this article, I discuss the technology requirements and core functionality behind Netflix, YouTube, and other video and audio streaming platforms (not the requirements of AWS, Azure, or other cloud providers), and how you could build a Netflix-like application that is powerful and efficient while using fewer resources.

LET’S MAKE A NETFLIX

AN INTRO TO STREAMING MEDIA

To make a Netflix, you need a video! Of course you need a video :P

But that video is not always going to be an .mp4. Users can upload any video format: .mov (used by Apple), .mkv, and so on (see the list of available video extensions). It can be anything. So do you have to handle every video format your users upload?

No, you don't have to worry about every video format. Just convert all of them to one format: .mp4, the most widely preferred format, supported by almost every modern browser.

How will you convert to .mp4? This is where transcoding comes into the picture!

TRANSCODING

How to do this?

FFmpeg: an open-source library for handling video, audio, and other multimedia files and streams.

This library can do almost everything you need; just go through the documentation.

I wrote some video processing code in Node.js. It includes functions to convert different formats to one format, generate a small GIF preview of a video, create thumbnails, and so on.

You could also write this FFmpeg code in a language like Go for better performance.
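The conversion step can be sketched roughly as follows. This is a minimal sketch in Python, not the Node.js code mentioned above; the function names and the 5-second thumbnail offset are assumptions made for illustration, and it expects the ffmpeg binary to be on the PATH.

```python
import subprocess
from pathlib import Path


def mp4_command(src: str, dst: str) -> list[str]:
    """Build an FFmpeg command that transcodes any input to H.264/AAC MP4."""
    return [
        "ffmpeg", "-y",             # overwrite the output if it already exists
        "-i", src,                  # input in any container FFmpeg can read
        "-c:v", "libx264",          # H.264 video, widely supported in browsers
        "-c:a", "aac",              # AAC audio
        "-movflags", "+faststart",  # move the index to the front for web playback
        dst,
    ]


def thumbnail_command(src: str, dst: str, at_seconds: int = 5) -> list[str]:
    """Build an FFmpeg command that grabs a single frame as an image."""
    return ["ffmpeg", "-y", "-ss", str(at_seconds), "-i", src, "-frames:v", "1", dst]


def transcode(src: str, out_dir: str) -> Path:
    """Run the conversion; raises CalledProcessError if FFmpeg fails."""
    dst = Path(out_dir) / (Path(src).stem + ".mp4")
    subprocess.run(mp4_command(src, str(dst)), check=True)
    return dst
```

Keeping the command builders separate from the function that runs them makes the flags easy to inspect and test without invoking FFmpeg.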

But the problem is: how will you process videos uploaded by at least 100 users at the same time?

Video processing is a time- and resource-consuming task. It may take anywhere from 5 to 40 minutes, or maybe more; we can't know in advance.

And doing this for many users on a single machine is not feasible; it will crash your server. So, how do you do it?

Put the processing code in AWS Lambda. Each upload gets its own invocation, so you get parallel processing without managing a queue. It's a simple and powerful solution, though keep in mind that a single Lambda invocation can run for at most 15 minutes, so very long jobs will not fit in one invocation.
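One way this might look is sketched below. It assumes uploads land in an S3 bucket that is configured to trigger the function, which the article does not specify. process_video is a hypothetical placeholder for the download, transcode, and upload work.

```python
from urllib.parse import unquote_plus


def uploaded_objects(event: dict) -> list[tuple[str, str]]:
    """Extract (bucket, key) pairs from an S3 "object created" event."""
    pairs = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # S3 URL-encodes object keys in event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        pairs.append((bucket, key))
    return pairs


def process_video(bucket: str, key: str) -> str:
    """Hypothetical placeholder: download, transcode, and upload the result."""
    return f"processed s3://{bucket}/{key}"


def handler(event, context):
    """Lambda entry point: each upload notification triggers its own invocation."""
    return [process_video(bucket, key) for bucket, key in uploaded_objects(event)]
```

Because every upload starts a separate invocation, 100 simultaneous uploads become 100 independent runs instead of 100 jobs competing for one server.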

Let me know your solutions!

What next?

Codecs define how the media data is represented, or encoded. Some codecs are lossy, like MP3; others are lossless, like FLAC.

Now you have an .mp4 video file, and showing it to clients looks easy, right?

Just use the video HTML element and set the source URL to the video file.

Looks easy, right? But it's not :P

This will work for 1, 2, 3 … 20 users. After that, your server becomes slow and hot, and eventually it goes down under the load.

You have to think about bandwidth and latency.

In this case, the video HTML element tries to fetch the whole file at once, and that file may be 10 MB, 100 MB, or 1000 MB. I think you see the problem, right?

Issues: buffering, browser memory issues, delays in fetching the file, etc.

How to solve these big video file issues? How to send them to a browser or client efficiently?

Let me know your solutions!

Make it small in size

Split a 1000 MB video file into chunks of about 1 MB, with the video and audio tracks stored separately, and you get roughly 1000 small files. The total size stays about the same; it is just spread across many chunks.

Now split the video into smaller chunks, with audio and video kept separate. Small chunks are easy to send to browsers, right?

Now bigger problems arrive:

  1. How do you split the video into smaller chunks?
  2. How does the browser know which chunk to fetch next, and how does it combine the chunks into continuous video in real time?
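To get a feel for the scale, the sketch below cuts a file size into fixed-size byte ranges. It is only an illustration: real packagers cut on keyframes by duration rather than at fixed byte offsets, and the function name is made up for this example.

```python
def chunk_ranges(total_size: int, chunk_size: int) -> list[tuple[int, int]]:
    """Return inclusive (start, end) byte offsets covering total_size bytes."""
    if total_size < 0 or chunk_size <= 0:
        raise ValueError("total_size must be >= 0 and chunk_size > 0")
    return [
        (start, min(start + chunk_size, total_size) - 1)
        for start in range(0, total_size, chunk_size)
    ]


MB = 1024 * 1024
ranges = chunk_ranges(1000 * MB, 1 * MB)
print(len(ranges))  # 1000 chunks for a single 1000 MB stream
print(ranges[0])    # (0, 1048575)
```

The last chunk is simply shorter when the file size is not an exact multiple of the chunk size.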

How do you split the video into smaller chunks?

There is an open-source library named Bento4 that helps you do that.

And never forget to fragment your video first :) Bento4's mp4fragment tool does this before the file is packaged.

How does the browser know which chunk to fetch next, and how does it combine the chunks into continuous video in real time?
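The two Bento4 steps could be driven from a script like the one below. It is a sketch that assumes the mp4fragment and mp4dash tools are installed and on the PATH. The 2000 ms fragment duration and the helper names are illustrative choices, so check the flags against your Bento4 version.

```python
import subprocess


def fragment_command(src: str, dst: str, fragment_ms: int = 2000) -> list[str]:
    """Build the mp4fragment call that rewrites an MP4 as a fragmented MP4."""
    return ["mp4fragment", "--fragment-duration", str(fragment_ms), src, dst]


def dash_command(fragmented: str, out_dir: str) -> list[str]:
    """Build the mp4dash call that writes segments and a manifest to out_dir."""
    return ["mp4dash", "-o", out_dir, fragmented]


def package(src: str, out_dir: str) -> None:
    """Fragment first, then package; each step raises if the tool fails."""
    fragmented = src.rsplit(".", 1)[0] + "_frag.mp4"
    subprocess.run(fragment_command(src, fragmented), check=True)
    subprocess.run(dash_command(fragmented, out_dir), check=True)
```

The order matters: mp4dash expects fragmented input, which is why the fragmentation step comes first.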

1. Byte Range Requests

A browser will make Byte Range Requests on behalf of a media element to buffer content.

It makes the request with a header like Range: bytes=0-1023

The server then replies with 206 Partial Content and response headers like these:

  • Content-Length: 1024
  • Content-Range: bytes 0-1023/4096
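The server side of this exchange can be sketched as below. It is a minimal sketch that handles only a single byte range, and the helper names are assumptions for illustration rather than part of any framework.

```python
import re

RANGE_RE = re.compile(r"^bytes=(\d*)-(\d*)$")


def resolve_range(header: str, total: int) -> tuple[int, int]:
    """Turn a single-range Range header into inclusive (start, end) offsets."""
    match = RANGE_RE.match(header.strip())
    if not match or match.group(1) == match.group(2) == "":
        raise ValueError(f"unsupported Range header: {header!r}")
    first, last = match.groups()
    if first == "":  # suffix form: the last N bytes of the file
        start, end = max(total - int(last), 0), total - 1
    else:
        start = int(first)
        end = min(int(last), total - 1) if last else total - 1
    if start > end:
        raise ValueError("range not satisfiable")  # a server would answer 416
    return start, end


def partial_headers(header: str, total: int) -> dict[str, str]:
    """Headers a server would attach to a 206 Partial Content response."""
    start, end = resolve_range(header, total)
    return {
        "Content-Length": str(end - start + 1),
        "Content-Range": f"bytes {start}-{end}/{total}",
        "Accept-Ranges": "bytes",
    }


print(partial_headers("bytes=0-1023", 4096))
```

Both offsets are inclusive, which is why a request for bytes 0 to 1023 returns 1024 bytes.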

I have written a small core library on top of MSE (Media Source Extensions) that handles this.

But this library has some problems. It works on events and a queue, so any event that arrives out of sequence will crash your video application, and you don't want that.

  1. It cannot handle seeking to a different timestamp.
  2. It cannot delete already-watched chunks from the media source buffer, so memory use keeps growing.

These are a few of the things this library does not handle.
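The ordering, seeking, and memory problems above can be illustrated outside the browser with a small buffer model. This is not the library described here and not MSE code: in a real player the append step would be SourceBuffer.appendBuffer and the eviction step SourceBuffer.remove. The class name and the keep_behind parameter are hypothetical.

```python
import heapq


class InOrderBuffer:
    """Releases chunks strictly by sequence number, whatever order they arrive in."""

    def __init__(self, keep_behind: int = 2):
        self._pending: list[tuple[int, bytes]] = []  # min-heap keyed by sequence
        self._next = 0                               # next sequence to release
        self.appended: dict[int, bytes] = {}         # stand-in for the media buffer
        self._keep_behind = keep_behind              # watched chunks to retain

    def push(self, seq: int, data: bytes) -> None:
        """Queue a fetched chunk and release every chunk that is now in order."""
        if seq < self._next or any(s == seq for s, _ in self._pending):
            return  # duplicate or already appended: ignore it
        heapq.heappush(self._pending, (seq, data))
        while self._pending and self._pending[0][0] == self._next:
            ready_seq, ready = heapq.heappop(self._pending)
            self.appended[ready_seq] = ready
            self._next += 1

    def seek(self, seq: int) -> None:
        """Jump to a new position: discard queued chunks and restart from seq."""
        self._pending.clear()
        self._next = seq

    def evict_watched(self, playing_seq: int) -> None:
        """Drop chunks well behind the playhead to cap memory use."""
        for seq in [s for s in self.appended if s < playing_seq - self._keep_behind]:
            del self.appended[seq]
```

A chunk that arrives early simply waits in the heap until the chunks before it have been appended, so an out-of-order arrival no longer breaks playback.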

Writing everything from scratch is a nightmare. I tried, but …..

There are open-source libraries that handle all of this.

Shaka Player: this player supports DASH and HLS, two rival formats for delivering video over the web.

Almost everything is done:

Bento4 generates an .mpd file that describes the split chunks. Pass it to Shaka Player, and the player automatically detects the video's information and the sequence of its chunks.
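To see what kind of information the player reads from the manifest, here is a small sketch that lists the representations in an MPD. The sample manifest is hand-written and heavily trimmed for illustration (the ids and bandwidth values are made up); a real Bento4 manifest carries more detail, such as segment templates and codecs.

```python
import xml.etree.ElementTree as ET

NS = {"dash": "urn:mpeg:dash:schema:mpd:2011"}

# Hand-written, heavily trimmed manifest used only for this illustration.
SAMPLE_MPD = """<MPD xmlns="urn:mpeg:dash:schema:mpd:2011" type="static">
  <Period>
    <AdaptationSet mimeType="video/mp4">
      <Representation id="video-720" bandwidth="2500000" width="1280" height="720"/>
      <Representation id="video-480" bandwidth="1000000" width="854" height="480"/>
    </AdaptationSet>
    <AdaptationSet mimeType="audio/mp4">
      <Representation id="audio-en" bandwidth="128000"/>
    </AdaptationSet>
  </Period>
</MPD>"""


def list_representations(mpd_text: str) -> list[dict]:
    """Return one dict per Representation found in the manifest."""
    root = ET.fromstring(mpd_text)
    found = []
    for aset in root.findall("dash:Period/dash:AdaptationSet", NS):
        for rep in aset.findall("dash:Representation", NS):
            height = rep.get("height")
            found.append({
                "id": rep.get("id"),
                "mime": aset.get("mimeType"),
                "bandwidth": int(rep.get("bandwidth", "0")),
                "height": int(height) if height else None,
            })
    return found


for rep in list_representations(SAMPLE_MPD):
    print(rep)
```

Audio and video appear as separate adaptation sets, which matches the idea of keeping the two tracks in separate chunks.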

You also have to maintain versions of the video for mobile, desktop, and TV devices (1080p, 720p, 480p, 144p).
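Producing those versions could look something like the sketch below, which builds one FFmpeg command per resolution. The output naming scheme and the rule of skipping resolutions above the source height are assumptions for illustration, and bitrate settings are left out on purpose.

```python
HEIGHTS = [1080, 720, 480, 144]  # the resolutions named in the article


def rendition_command(src: str, height: int) -> list[str]:
    """Build an FFmpeg command that scales src to the given height."""
    stem = src.rsplit(".", 1)[0]
    return [
        "ffmpeg", "-y", "-i", src,
        "-vf", f"scale=-2:{height}",  # -2 keeps the aspect ratio with an even width
        "-c:v", "libx264", "-c:a", "aac",
        f"{stem}_{height}p.mp4",
    ]


def ladder(src: str, source_height: int) -> list[list[str]]:
    """Commands for every rendition that does not upscale the source."""
    return [rendition_command(src, h) for h in HEIGHTS if h <= source_height]


for cmd in ladder("movie.mp4", 720):
    print(" ".join(cmd))
```

Each output can then be fragmented and packaged so the player can switch between resolutions depending on the device and connection.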

Summary

  1. Convert all video formats to .mp4 (FFmpeg).
  2. Fragment the video file and split it into smaller chunks (Bento4).
  3. Bento4 gives you an .mpd file; pass it to Shaka Player.
  4. Shaka Player will now play your video easily.

Don't forget to watch the network requests to see how everything works. Beyond this, you can use tools like load balancers, ISP-level caching, and machine learning to predict the next chunks to fetch (as used at Netflix).

Thank you very much for reading.

Let me know what you think of this article in the comments :)

Anurag kumar

SDE @Rentomojo | Co-Founder & CTO @imorph.io | GSoC | CSE, IIT-Patna