For a couple of years now, WebRTC has captivated the communications industry and raised high expectations. While the interest has been constant, the reasons for the excitement have changed. Even the defining characteristics have changed. Initially, the point most often noted was that no downloads are needed. That claim slowly fell out of favor as many realized that although the codecs and media engine are not downloaded, the signaling procedure, a significant component, is. Attention then shifted to ease of development. But people subsequently realized that many developers who are strong in web technologies lack certain communication-oriented skills, such as signaling protocols and the use of STUN/TURN. Lately, with the emphasis on mobile devices and Apple’s continued silence on supporting WebRTC in the iOS environment, it has become necessary to focus on native apps for communication, avoiding the browser altogether. Given this necessity, it looks like any app can claim the WebRTC moniker as long as it uses even a small aspect of the original idea. Some apps that have no aspect of “web” at all are now routinely called WebRTC-enabled. The purpose of this post is to define what WebRTC is, why WebRTC is attractive, and how to use WebRTC capability. (The other three Ws, Who, When and Where, are really not that interesting.)
(Note: In the following, “media” refers not only to voice and video, but also to data transfer.)
What is WebRTC
This section enumerates the defining characteristics of WebRTC. Not all of them are mandatory. Since the use cases are varied, some characteristics are not needed for some use cases. But a couple of them are mandatory for all use cases. Accordingly, the list is arranged in order of importance.
- The session is initiated with an HTTP GET/POST. This is the natural way to do it within a browser. Though a native app is free to use any mechanism for self-contained initiation, it should allow this or a custom URI scheme for an external entity that requires the services of the app.
- Use of a triangular connection model. Both the initiator and the recipient of the session are served by the same entity, the “server”.
- The session control procedure is dynamically downloaded at the time of session initiation. This may not be required for some native apps that are dedicated to connecting only to pre-determined server(s) that use a fixed set of procedures.
- The link that carries session control messages and the link that carries session media traffic are separate.
- It is expected that the end-points have UI elements, such as a screen, where sophisticated and context-specific information can be displayed. This will of course depend further on the specific use case.
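As a rough illustration of the triangular model and of keeping session control apart from media, here is a minimal in-memory sketch in Python. The class names (`Endpoint`, `SignalingServer`), the message fields, and the placeholder payloads are all assumptions made for this example. A real deployment would carry these messages over a network connection and exchange actual session descriptions.

```python
from dataclasses import dataclass, field


@dataclass
class Endpoint:
    """A browser or native app; the name and fields are illustrative only."""
    name: str
    inbox: list = field(default_factory=list)  # session control messages received

    def receive(self, message: dict) -> None:
        self.inbox.append(message)


class SignalingServer:
    """The single entity that serves both the initiator and the recipient."""

    def __init__(self) -> None:
        self.endpoints: dict[str, Endpoint] = {}

    def register(self, endpoint: Endpoint) -> None:
        self.endpoints[endpoint.name] = endpoint

    def relay(self, sender: str, recipient: str, kind: str, payload: str) -> None:
        # Only session control passes through the server in this sketch;
        # media is assumed to travel on a separate link between the end-points.
        self.endpoints[recipient].receive(
            {"from": sender, "kind": kind, "payload": payload}
        )


server = SignalingServer()
alice, bob = Endpoint("alice"), Endpoint("bob")
server.register(alice)
server.register(bob)

# Placeholder payloads stand in for real session descriptions.
server.relay("alice", "bob", "offer", "<session description from alice>")
server.relay("bob", "alice", "answer", "<session description from bob>")
```

Because both parties talk to the same server, the server alone decides what the control messages look like, which is what makes the dynamically downloaded signaling procedure workable.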
Why WebRTC
This section elaborates on the rationale for the requirements listed in the previous section by pointing out their benefits.
- Traditionally, communication systems have been standalone, monolithic entities. Consequently, it has been difficult to integrate them with other processes or to interwork with other communication systems. For example, CEBP (communications-enabled business processes) has been touted for a long time. But communication systems required that session initiation be done within the system. If there is a need to bring in any historical information, which would naturally be held by the application associated with the business process, then the two systems must agree on the method and format of the exchange. This results in a logistical and coordination quagmire. Requirement 1 simplifies this enormously. The URI scheme makes passing information from the business process to the communication system straightforward. Additionally, the communication system can be upgraded or replaced by an alternate system with minimal disruption. For example, the business process can use an HTTP redirect to keep the original URI while pointing it at a different entity, with minimal administrative intervention.
- Use of the triangular connection model, coupled with dynamic download of the signaling procedure, eliminates interoperability and feature-compatibility issues. A server can introduce a new feature or try out a different UI without worrying about whether clients will be able to handle the change.
- If the session control messages use a link independent of the media link, then it becomes much easier and quicker to stop and restart the media flow without incurring the session setup time, which is longer because of the required authentication procedure.
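The last point can be sketched as a toy state model. The `Session` class, its method names, and the `auth_runs` counter are hypothetical and exist only to show that media can stop and restart while the control link, and the authentication behind it, stays in place.

```python
class Session:
    """Toy model of one session with separate control and media links."""

    def __init__(self) -> None:
        self.control_open = False   # signaling link, authenticated once
        self.media_flowing = False  # media link, can start and stop freely
        self.auth_runs = 0          # how many times the costly setup ran

    def open_control(self, credentials: str) -> None:
        # Stand-in for the slow part: authentication during session setup.
        if not credentials:
            raise ValueError("credentials required")
        self.auth_runs += 1
        self.control_open = True

    def start_media(self) -> None:
        if not self.control_open:
            raise RuntimeError("set up the session before starting media")
        self.media_flowing = True

    def stop_media(self) -> None:
        # The control link stays up, so no new setup is needed later.
        self.media_flowing = False


session = Session()
session.open_control("demo-token")
session.start_media()
session.stop_media()   # for example, the user puts the call on hold
session.start_media()  # resume without authenticating again
```

If media and control shared one link, stopping the media would tear down the authenticated session as well, and every restart would pay the setup cost again.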
How to use WebRTC
WebRTC apps can be designed for a wide variety of applications, at both the high end and the low end of the scale.
- Personal communication system: A small-footprint WebRTC app server that lets the host’s friends initiate a communication session from a browser as and when needed. This eliminates the need for a third-party provider. Here WebRTC helps to eliminate the need for a network effect at the application layer.
- Contact center: A potential customer browsing a website can initiate a communication session. The website can dynamically construct the session’s reach URL based on the customer’s profile, browsing history, and other cookie information. Since the reach URL is dynamically generated, the website owner can change the WebRTC app server very easily. In effect, the communication provider has been commoditized.
- Unified Communication System: An organization can decide to introduce a new UC system gradually. Unlike traditional systems, a system based on WebRTC allows for “guest access”. This way, early adopters do not suffer from the lack of a network effect. Indeed, early adopters can become advocates of the new system because they will be in a position to demonstrate its benefits to late adopters.
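To make the contact center case concrete, the following is a minimal sketch of how a website might build a reach URL that carries customer context. The host names, the `/session` path, and the query parameter names are assumptions for illustration, not part of any WebRTC specification.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical setting: the only place that names the communication provider.
APP_SERVER = "https://rtc.example.com"


def build_reach_url(app_server: str, customer: dict) -> str:
    """Encode what the website knows about the visitor into the session URL."""
    context = {
        "customer_id": customer["id"],
        "last_page": customer["last_page"],
        "topic": customer["topic"],
    }
    return f"{app_server}/session?{urlencode(context)}"


def read_context(reach_url: str) -> dict:
    """What the WebRTC app server could recover when the session starts."""
    query = parse_qs(urlsplit(reach_url).query)
    return {key: values[0] for key, values in query.items()}


visitor = {"id": "c-1042", "last_page": "/pricing", "topic": "billing"}
reach_url = build_reach_url(APP_SERVER, visitor)
context = read_context(reach_url)

# Switching providers touches one value; the page logic is unchanged.
other_url = build_reach_url("https://calls.example.net", visitor)
```

Because the provider appears only in the base address, swapping the WebRTC app server is a configuration change rather than an integration project.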