What exactly is HLS chunking?
HLS chunking is a way to fetch a video stored on a CDN in small chunks so that the complete video file never has to be downloaded at once. The point of it is simply to reduce the waiting/loading time for the end user. Tools like AWS MediaConvert are used to convert MP4 files into their HLS-transcoded equivalents, and libraries like FFmpeg can also be used to split a file into its HLS chunks (a minimal FFmpeg sketch follows the file breakdown below).
An HLS output consists of two kinds of files:
- .m3u8 file => This is the playlist, the primary entry point for the transcoded video. It contains the links/references to all the .ts files, which are the chunked pieces the MP4 has been converted into.
- .ts files => These are the chunked video segments that are actually fetched over the network and played in a media player on the website.
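For illustration, here is a minimal sketch of producing those two kinds of files with FFmpeg, run from Node.js. The paths and the 6-second segment length are assumptions, not the exact settings used in the project.

```typescript
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const run = promisify(execFile);

// Turn an MP4 into an .m3u8 playlist plus .ts chunks (requires ffmpeg on PATH).
async function transcodeToHls(inputPath: string, outputDir: string): Promise<void> {
  await run("ffmpeg", [
    "-i", inputPath,
    "-codec", "copy",        // remux without re-encoding where possible
    "-start_number", "0",
    "-hls_time", "6",        // target ~6-second .ts segments
    "-hls_list_size", "0",   // keep every segment in the playlist (VOD style)
    "-f", "hls",
    `${outputDir}/index.m3u8`,
  ]);
}
```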
The big benefit of chunking is that it reduces wait and load times for the user, which improves the UX. It also helps with cost: instead of the full bandwidth of the video file being consumed, only the chunks that are actually watched get fetched, so less bandwidth is used.
Vercel's architecture:
Vercel uses a serverless architecture, which can be helpful in some cases, but it also puts a lot of limitations on how a particular process or piece of software can run. One of the most important libraries for media conversion, FFmpeg, is not supported by Vercel at all, so no transcoding is possible in that cloud environment. One workaround is ffmpeg-static, but its functionality is limited too. On top of that, Vercel has timeouts for API calls that exceed its limits, so all of the transcoding would have to be done on the frontend, which would mean longer upload times and therefore a bad UX.
The reason this problem was so interesting is Vercel itself. I had to change my approach several times because of its limitations. I even tried running a microservice over on Render just to transcode, but the turnaround time was still a bit too long, and the serverless function timeouts were a nuisance.
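As a rough illustration of that timeout ceiling (assuming a Next.js route handler on Vercel; the route path and limit value are hypothetical and depend on the plan), anything as heavy as transcoding simply will not fit inside a call like this:

```typescript
// app/api/transcode/route.ts (hypothetical)
export const maxDuration = 60; // seconds; Vercel cuts the function off after this

export async function POST(_req: Request): Promise<Response> {
  // Running FFmpeg here would routinely blow past the limit,
  // which is why the heavy lifting has to happen outside Vercel.
  return Response.json({ accepted: true });
}
```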
MediaConvert:
MediaConvert is an AWS service that transcodes media from one format into another. It takes the MP4 and converts it into the appropriate HLS output. Since the application was already using an S3 bucket to store the videos, the best option was to integrate MediaConvert so that the transcoding happens inside AWS and the appropriate files are produced without the user having to wait around. The user only waits until the MP4 finishes uploading to S3. After that, an API call that returns early triggers a MediaConvert job to convert that MP4 into its HLS form and store it in the same S3 bucket. This ensures the HLS files are stored in the S3 bucket and are fetched chunk by chunk whenever a user plays the video. A trimmed-down version of that trigger is sketched below.
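This is a minimal sketch of the early-returning trigger using the AWS SDK v3 MediaConvert client. The bucket names, key layout, role ARN, and most of the job settings are assumptions; a production job would carry fuller output settings (bitrate ladder, captions, and so on), and some accounts still need the account-specific MediaConvert endpoint configured on the client.

```typescript
import { MediaConvertClient, CreateJobCommand } from "@aws-sdk/client-mediaconvert";

const mediaConvert = new MediaConvertClient({ region: "us-east-1" });

// Kick off the MP4 -> HLS job and return immediately; AWS finishes it asynchronously.
export async function startHlsJob(videoId: string): Promise<void> {
  await mediaConvert.send(
    new CreateJobCommand({
      Role: process.env.MEDIACONVERT_ROLE_ARN, // IAM role MediaConvert assumes for S3 access
      Settings: {
        Inputs: [
          {
            FileInput: `s3://my-video-bucket/uploads/${videoId}.mp4`,
            AudioSelectors: { "Audio Selector 1": { DefaultSelection: "DEFAULT" } },
          },
        ],
        OutputGroups: [
          {
            Name: "HLS",
            OutputGroupSettings: {
              Type: "HLS_GROUP_SETTINGS",
              HlsGroupSettings: {
                Destination: `s3://my-video-bucket/hls/${videoId}/`,
                SegmentLength: 6, // seconds per .ts chunk
                MinSegmentLength: 0,
              },
            },
            Outputs: [
              {
                NameModifier: "_hls",
                ContainerSettings: { Container: "M3U8" },
                VideoDescription: {
                  CodecSettings: {
                    Codec: "H_264",
                    H264Settings: { RateControlMode: "QVBR", MaxBitrate: 5_000_000 },
                  },
                },
                AudioDescriptions: [
                  {
                    CodecSettings: {
                      Codec: "AAC",
                      AacSettings: {
                        Bitrate: 96_000,
                        CodingMode: "CODING_MODE_2_0",
                        SampleRate: 48_000,
                      },
                    },
                  },
                ],
              },
            ],
          },
        ],
      },
    })
  );
}
```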
Workflow (Solution):
The final solution, implemented with Vercel's limitations and the unavailability of FFmpeg in mind, was:
- The user uploads the video through an upload component directly to our S3 bucket using a presigned URL on the frontend, which avoids Vercel's server timeouts (see the presigned-URL sketch after this list).
- The playback URL is constructed from the appropriate parameters and stored in the MongoDB model; it points to the .m3u8 file, which in turn pulls in the video chunk by chunk.
- Once the upload completes, another API call goes to the backend with the ID of the uploaded video and triggers a MediaConvert job on the uploaded MP4.
- MediaConvert converts the MP4 into the appropriate HLS format and stores the output separately, in another folder of the same S3 bucket.
- The video becomes available after a few minutes to a few hours, depending on its length and size, since until then it is still being transcoded by MediaConvert.
- Once transcoding is complete, opening a video sends a request for the .m3u8 file, which links all the chunked .ts files together; only the chunks that are actually needed get fetched, so no extra bandwidth is used (see the playback sketch after this list).
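To make the upload step concrete, here is a minimal sketch of generating the presigned URL with the AWS SDK v3 (the bucket name, key layout, and expiry are assumptions). Because the browser then PUTs the file straight to S3, the Vercel function only performs this quick signing step and never comes near its timeout.

```typescript
import { S3Client, PutObjectCommand } from "@aws-sdk/client-s3";
import { getSignedUrl } from "@aws-sdk/s3-request-presigner";

const s3 = new S3Client({ region: "us-east-1" });

// Returns a short-lived URL the frontend can upload the MP4 to directly.
export async function getUploadUrl(videoId: string): Promise<string> {
  const command = new PutObjectCommand({
    Bucket: "my-video-bucket",
    Key: `uploads/${videoId}.mp4`,
    ContentType: "video/mp4",
  });
  return getSignedUrl(s3, command, { expiresIn: 900 }); // valid for 15 minutes
}

// Frontend side: upload the file straight to S3, bypassing the Vercel backend.
// await fetch(uploadUrl, { method: "PUT", body: file, headers: { "Content-Type": "video/mp4" } });
```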
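And on the playback side, a small sketch of how the chunked fetching actually happens in the browser, here using hls.js (the player library is an assumption; any HLS-capable player behaves the same way): the player downloads the .m3u8 playlist first and then pulls .ts segments only as they are needed.

```typescript
import Hls from "hls.js";

export function attachPlayer(video: HTMLVideoElement, m3u8Url: string): void {
  if (Hls.isSupported()) {
    const hls = new Hls();
    hls.loadSource(m3u8Url); // fetch the playlist
    hls.attachMedia(video);  // segments are requested on demand as playback advances
  } else if (video.canPlayType("application/vnd.apple.mpegurl")) {
    video.src = m3u8Url;     // Safari can play HLS natively
  }
}
```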
This is the solution that was implemented in the end. It does have a few downsides, mainly cost: MediaConvert is not free and follows a pay-as-you-use model, so it adds to the overall bill. Running FFmpeg ourselves could have been the better solution, but given the cloud limitations this was the optimal one.
Alternative:
One alternative to this approach is to set up a transcoding and uploading service on another cloud provider such as Render, and have the two servers communicate just enough to propagate the information needed to construct the URL and store it in the MongoDB model. This could increase the user's upload time a bit, but it would reduce cost. It would also make better use of S3 storage, since the HLS-formatted videos would be stored directly, without the original MP4, which would be discarded once FFmpeg finishes transcoding it successfully.
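A rough sketch of what that Render-side service could look like (the endpoint shape, callback route, and environment variables are all hypothetical): it accepts the job, responds early, and reports back to the main backend once the HLS files are in S3 so MongoDB can be updated.

```typescript
import express from "express";

const app = express();
app.use(express.json());

app.post("/transcode", async (req, res) => {
  const { videoId } = req.body as { videoId: string };
  res.status(202).json({ accepted: true }); // acknowledge early, keep working below

  // 1. Download the MP4 and run FFmpeg locally (same flags as the earlier sketch).
  // 2. Upload the resulting .m3u8 and .ts files to S3, then discard the original MP4.
  // 3. Tell the main backend where the playlist lives so it can update MongoDB.
  await fetch(`${process.env.MAIN_API_URL}/api/videos/${videoId}/hls-ready`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ m3u8Key: `hls/${videoId}/index.m3u8` }),
  });
});

app.listen(3000);
```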
One annoying consequence of the implemented solution is having to clean out the original MP4 files manually once a week, since they are never used again. This alternative would solve that problem as well.
Conclusion:
Implementing HLS chunking for this application was definitely an interesting challenge, primarily because of Vercel's limitations. The number of iterations it took during development to arrive at the optimal solution was tiring, but the brainstorming part of it was really fun. Thanks for reading this far.