What I learned is that, many years into HTML5 video, the landscape is bleak. The solutions that work are either complex or proprietary. On desktops, Flash is still the best option because support for the alternatives is poor. iOS supports only HLS. The default Android browser and Chrome for Android have buggy DASH support, so it's not really usable, which leaves WebM as the best choice for everyone else. So for a minimal setup, assuming you want users to be able to view your broadcast on desktop, iOS, and Android, you'll need to support RTMP/Flash (desktop), HLS (iOS), and WebM (everyone else).
In terms of open source solutions that support all of this, there's not much out there. The Red5 streaming server is probably the closest thing to a product that covers it; at least it has HLS available as a plugin. I chose instead to go with what is, as far as I can tell, the only other open source option, nginx-rtmp, for its simplicity. To summarize its capabilities: it can receive, record, broadcast, or rebroadcast one or more RTMP video streams, and it has built-in HLS support. It does not support WebM, so I had to use another product, stream-m, to serve that format.
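To give a sense of how little configuration nginx-rtmp needs, a minimal setup that accepts an incoming RTMP stream and exposes it over HLS looks roughly like this (the application name and paths are placeholders, not my production values):

```nginx
rtmp {
    server {
        listen 1935;               # standard RTMP port
        application live {
            live on;               # accept incoming RTMP publishes
            hls on;                # also write out HLS fragments
            hls_path /tmp/hls;     # serve this directory from a plain http block
            hls_fragment 5s;
        }
    }
}
```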
I was expecting up to 100 simultaneous viewers, which is not that many, but enough that I was worried about a single server handling all the load. So I put together a load-balanced solution with:
- 1 main media server
- 2 servers to serve clients using RTMP
- 2 servers to serve clients using HLS
- 2 servers to serve clients using WebM
The main server has a lot to do: receive the incoming video stream and, at a minimum, run several simultaneous instances of ffmpeg to transcode it to the various bitrates for HLS streaming. It also maintains the HLS playlist and rebroadcasts the RTMP stream four times, once to each of the RTMP and WebM servers. I used a giant c3.8xlarge instance for this, unfortunately. I could probably have gotten away with a c3.4xlarge, which is half the size, but during testing I got a dropped stream once or twice, and that was unacceptable. Dozens and dozens of ffmpeg threads running in real time just take a lot of cycles.
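The transcoding setup follows the pattern from the nginx-rtmp documentation: an `exec` directive spawns an ffmpeg per published stream, pushing one rendition per bitrate into an HLS application tied together by `hls_variant` entries. A sketch — the codecs, bitrates, and stream-name suffixes here are illustrative, not my exact settings:

```nginx
application src {
    live on;
    # one ffmpeg per published stream, producing two renditions
    exec ffmpeg -i rtmp://localhost/src/$name
        -c:a aac -b:a 64k  -c:v libx264 -b:v 256k  -f flv rtmp://localhost/hls/$name_low
        -c:a aac -b:a 128k -c:v libx264 -b:v 1024k -f flv rtmp://localhost/hls/$name_high;
}
application hls {
    live on;
    hls on;
    hls_path /tmp/hls;
    # the variant playlist lets clients switch bitrates adaptively
    hls_variant _low  BANDWIDTH=320000;
    hls_variant _high BANDWIDTH=1152000;
}
```

Each additional bitrate is another output on the `exec` line plus another `hls_variant`, which is why the ffmpeg load on the media server grows so quickly.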
Having clients connect to dedicated client access servers ensures a constant, predictable load on the main media server. This is great, as there is no chance of a traffic surge interrupting the main encoding process. For RTMP, each client access server is simply configured to rebroadcast the stream to any client that requests it. The media server needs to rebroadcast the stream only once per client server, and each client access server then serves as many clients as it can handle. That scales capacity up by a large constant factor for each stream added from the media server, and RTMP streams are extremely cheap with nginx-rtmp. If there were a huge broadcast, the servers could be arranged in a tree: the media server would rebroadcast to some number of second-tier servers, which would rebroadcast to some number of third-tier servers that clients connect to directly. That allows exponential scaling with each tier added, and I can imagine this setup growing as large as needed.
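The fan-out on the media server is nothing more than a `push` directive per RTMP client access server (hostnames here are hypothetical); each client access server needs little beyond `live on;` to rebroadcast to its clients:

```nginx
# media server: rebroadcast the incoming stream once per client access server
application live {
    live on;
    push rtmp://rtmp1.example.com/live;
    push rtmp://rtmp2.example.com/live;
}
```

A second-tier server in the tree arrangement would look the same, just pushing to third-tier hosts instead.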
HLS scaling is similar. The protocol is plain HTTP, so each client access server runs a vanilla nginx installation configured as a caching reverse proxy. The media server serves each HLS file to each of the HLS client access servers once; the client access server caches it and serves it to as many clients as it can until the next HLS fragments are ready. Really simple, and the same tree structure could be applied to these servers to scale to infinity.
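An HLS client access server is just stock nginx proxying and caching the media server's HLS directory. A sketch, with the hostname and cache parameters as assumptions rather than my exact values:

```nginx
proxy_cache_path /var/cache/nginx/hls levels=1:2 keys_zone=hls:10m inactive=1m;

server {
    listen 80;
    location /hls/ {
        proxy_pass http://media.example.com;
        proxy_cache hls;
        proxy_cache_valid 200 2s;  # playlists and fragments rotate quickly
        proxy_cache_lock on;       # collapse concurrent misses into one upstream fetch
    }
}
```

The `proxy_cache_lock` is the important part: however many clients request a fragment at once, the media server sees a single fetch.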
For WebM, stream-m takes an RTMP stream as input and outputs a WebM stream to clients. The media server serves one RTMP stream to each client access server, and each client access server serves as many clients as it can. In a tree arrangement for this protocol, every interior level would be nginx-rtmp instances passing the RTMP stream along, and only the leaves would run stream-m.
DNS then takes care of the rest. Each client access server gets an A record under rtmp.example.com, hls.example.com, or webm.example.com, and round-robin DNS balances load between them. This doesn't provide fault tolerance; getting that in a serious deployment would require either active monitoring that removes bad servers from DNS, or a load balancer that is comfortable with RTMP, HTTP, and WebM. (ELB, for the record, is fine with RTMP.)
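Concretely, the DNS side is just multiple A records per name; resolvers rotate through them, spreading clients across the servers. A zone-file sketch, with placeholder addresses from the documentation IP range:

```
; two client access servers per protocol; addresses are placeholders
rtmp.example.com.  300  IN  A  203.0.113.10
rtmp.example.com.  300  IN  A  203.0.113.11
hls.example.com.   300  IN  A  203.0.113.20
hls.example.com.   300  IN  A  203.0.113.21
webm.example.com.  300  IN  A  203.0.113.30
webm.example.com.  300  IN  A  203.0.113.31
```

The short TTL keeps the window small if you do end up swapping a dead server's record out by hand.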
Feel free to try out the solution on GitHub. It is a somewhat commoditized extract of the solution I actually used, which has much better security and niceties like automatically setting DNS records, but is too tightly tied to my own AWS/on-prem environment to be generally useful. It should be as easy as creating an AWS keypair and an identity, then uploading the provided CloudFormation template. Let me know how it goes if you try it.