To start a WebRTC session, a browser has to visit an HTTP URI. Accordingly, EnThinnai allocates one for you, which we call your canonical WebRTC Call URI. You can share it with your friends, include it in your email signature, or embed it in an iframe in a web page, such as your ID page. But using just the canonical URI is limiting. We could add parameters to the canonical URI so that the application server or your browser can use them to provide additional features or information in real time.
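As a rough illustration of adding parameters to the canonical URI, here is a minimal Python sketch. The URI `https://example.org/call/alice` and the parameter names `topic` and `ref` are hypothetical; EnThinnai's actual parameters are not specified here.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_call_params(canonical_uri: str, **params: str) -> str:
    """Return the canonical call URI with extra query parameters appended."""
    parts = urlsplit(canonical_uri)
    # Keep any query parameters the canonical URI already carries.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.extend(params.items())
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical URI and parameter names, for illustration only.
uri = add_call_params(
    "https://example.org/call/alice",
    topic="invoice 42",
    ref="email-signature",
)
print(uri)
```

The server or the browser-side script could then read these parameters to decide, for example, what context to show when the session starts.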

By design, your instance of the EnThinnai server is not involved when you originate a WebRTC session. But you can store a contact’s information in your buddy list, so that when you hover the mouse over the contact’s name, a link appears for you to initiate a session with that person.

EnThinnai is effectively your portal for others to get in touch with you or to access information that you have shared with them. Architecturally, EnThinnai protects your information by operating under a philosophy called “Default Deny”. It means you have to explicitly identify who can access a piece of information or initiate a communication session with you; requests from anyone else are rejected. Note that giving a person permission to access one piece of information does not mean that they can access all the others as well. All the people you identify this way are collected in a list called the Buddy list.
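The “Default Deny” idea can be sketched in a few lines of Python. This is an illustration only, not EnThinnai's implementation; the `Resource` class, the resource names, and the buddy ids are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class Resource:
    """A piece of shared information or a communication endpoint."""
    name: str
    # Buddy ids explicitly allowed to access this resource (hypothetical model).
    allowed: Set[str] = field(default_factory=set)


def may_access(resource: Resource, requester_id: str) -> bool:
    # Default Deny: access is granted only if the requester was explicitly listed.
    return requester_id in resource.allowed


photos = Resource("vacation-photos", allowed={"https://alice.example/"})
call = Resource("webrtc-call", allowed={"https://alice.example/", "bob@example.com"})

print(may_access(photos, "https://alice.example/"))  # True
print(may_access(photos, "bob@example.com"))         # False: allowed to call, not to view photos
print(may_access(call, "stranger@example.net"))      # False: never listed anywhere
```

Each resource keeps its own list, so permission for one item never implies permission for another.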

You identify a buddy in one of three ways: first, by their IndieAuth page (if they happen to have one); second, by their email id, so they can be authenticated using a single-use password; and third, by a string that resembles a URL, which is not really authenticated but depends on “security via obscurity”. The last method is not really secure; it is there for your convenience, and you need to decide on a case-by-case basis whether it is OK to use it.

An implication of the privacy-oriented design is that the EnThinnai server is not in a position to suggest potential contacts for you to include in the buddy list. You need to bootstrap and populate the list yourself. One strategy is to graduate a person who first contacted you using an unauthenticated id to one who is identified by an IndieAuth page or an email id. If you are familiar with strong and weak connections, the first two forms of id are for strong connections and the third is for weak connections.

EnThinnai uses an HTTP(S) URI that conforms to the IndieAuth spec as the user id. In colloquial terms, it is a web page that you control and that links to one or more authentication providers, such as GitHub or a PGP key. To set up your page to conform to the IndieAuth spec, please follow these instructions.

This form of user id provides distinct advantages over the usual userid@dname.com that email and other services use. To begin with, the chosen scheme is user-centric and not beholden to the first provider you happen to use, whereas an email-format id is not easily portable from one provider to another. Technically speaking, the chosen scheme makes the host of the page, the authenticator, and the service provider independent of each other. Second, the EnThinnai server does not have to worry about maintaining the integrity of passwords, a potential security risk, especially since EnThinnai servers are designed to be administered by consumers. Finally, this scheme utilizes the concept of a single-use password.

The use of HTTP URI as user id is not that new. When I say my Twitter id is @aswath, it is really a shorthand for https://www.twitter.com/aswath. It is the same for Facebook.

One thing to note is that an email id explicitly identifies the domain name where the service is available for this user. In EnThinnai, we carry the information within the page by adding the following line in the HEAD section of the page:

<link href="https://dname.com" rel="et.server">
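To illustrate how a client might discover the server from such an id page, here is a minimal Python sketch that scans HTML for a `link` element whose `rel` includes `et.server`. It uses only the standard library; the helper names `ServerLinkParser` and `find_server` are hypothetical, and fetching the page over HTTP is left out.

```python
from html.parser import HTMLParser
from typing import Optional


class ServerLinkParser(HTMLParser):
    """Collects the href of the first <link rel="et.server"> element."""

    def __init__(self) -> None:
        super().__init__()
        self.server: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.server is not None:
            return
        attributes = dict(attrs)
        # rel may hold several space-separated values.
        rels = (attributes.get("rel") or "").split()
        if "et.server" in rels and attributes.get("href"):
            self.server = attributes["href"]


def find_server(page_html: str) -> Optional[str]:
    parser = ServerLinkParser()
    parser.feed(page_html)
    return parser.server


page = '<html><head><link href="https://dname.com" rel="et.server"></head><body></body></html>'
print(find_server(page))  # https://dname.com
```

A page without the link yields `None`, in which case the caller would have to fall back to some other way of locating the server.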

We recognize that not everyone will have a web page that they control, at least not initially. To accommodate these users, EnThinnai is designed to create an id page that can be used as the user id. But it should be recognized that these ids may not be portable from one provider to another, since the initial provider may decide to delete the id upon migration. So it is important to weigh the tradeoffs before using an id provided by an EnThinnai service provider.

This blog post started as a comment on a post by Olle E. Johansson on Facebook. But it grew so long that I decided to capture my thoughts here and post just a link as a comment. I have long held an opinion on this that I have expressed in other fora, but so far it has not gotten much traction. I even showed a working demo at VON Spring 2008. Yes, that was before WebRTC, but we had used prototypical ideas that were later included in Websocket and WebRTC. I will express them here as well, with the understanding that I am not being argumentative.
First of all, federation at the application level is unnecessary. We have all the federation we need at the Network Layer, as provided by the Internet. Whenever an application talks about federation, it invariably dies in real life, both for technical reasons and because of administrative policies. For example, two autonomous entities find it difficult to federate at the protocol level due to a mismatch of features. Introducing a new feature becomes a coordination nightmare.
Second, it is misleading to view WebRTC apps as small islands (slide 19); on the contrary, each one is a castle with a moat, since we individually decide whom we will allow in and the level of access we will allow. Even though we may each have our own WebRTC app, my app is not involved when I originate a conversation session with you; it is between the two of us and your app. It is the other way around when you originate a session. This eliminates any need for federation between our servers. Whereas federation requires a bilateral agreement, WebRTC allows a unilateral decision.
Third, I propose that our global identity be an HTTP URI. In 2007, we started with OpenID as the identity. Since it didn’t get much traction, we moved on to IndieAuth, with the plan to adopt WebID when it becomes widely adopted. The advantage of adopting IndieAuth is that portability is not a concern: one can easily change where the ID is “hosted” or who the provider of the WebRTC app is. Furthermore, the authenticator can be independent of the id provider. It is a true user-centric id.
Finally, in answer to what the calling card would look like, take a look at what I suggested in 2007. It closely resembles what Olle suggests in the last slide. Also, to see a working example of an id page, how it can be used to serve other contact information according to who is asking for it, and how it can be the place from which one can contact you, please see my id page.

3 Ws of W(ebRTC)

For a couple of years now, WebRTC has captivated the communications industry with high expectations. Although the interest has been constant, the reasons for the excitement have changed. Even the defining characteristics have changed. Initially, what was noted was that there is no need for any downloads. Slowly this claim was dropped, since many realized that though the codecs and media engine are not downloaded, the signaling procedure, a significant component, is. Then the attention shifted to ease of development. But subsequently, people realized that many developers strong in web technologies usually lack certain communication-oriented skills, like signaling protocols and the use of STUN/TURN. Lately, with the emphasis on mobiles and Apple’s continued silence on supporting WebRTC in the iOS environment, it has become necessary to focus on native apps for communication, avoiding the browser altogether. Given this necessity, it looks like any app can claim the WebRTC moniker as long as it uses just a small aspect of the original idea. Some apps that do not have any aspect of “web” are now routinely being called WebRTC-enabled. The purpose of this post is to define what WebRTC is, why WebRTC is attractive and how to use the WebRTC capability. (The other 3 Ws, namely Who, When and Where, are really not that interesting.)

(Note: In the following, “media” refers not only to voice and video, but also to data transfer.)

What is WebRTC

This section enumerates the defining characteristics of WebRTC. Not all of them are mandatory. Since the use cases are varied, some characteristics are not needed for some use cases. But a couple of them are mandatory for all use cases. Accordingly, the list is arranged in order of importance.

  1. The session is initiated with an HTTP GET/POST. This is the natural way to do it within a browser. Though a native app is free to use any mechanism for self-contained initiation, it should allow this or a custom URI scheme for an external entity that requires the services of the app.
  2. Use of triangular connection model. Both the initiator and the recipient of the session are served by the same entity – “server”.
  3. The session control procedure is dynamically downloaded at the time of session initiation. This may not be required for some native apps that are dedicated to connect to only pre-determined server(s) which use a fixed set of procedures.
  4. The link that carries session control messages and the link that carries session media traffic are separate.
  5. It is expected that the end-points have UI elements, like a screen, where sophisticated and context-specific information can be presented. This will of course further depend on the specific use case.

Why WebRTC

This section elaborates on the rationale for the requirements listed in the previous section, by pointing out the benefits.

  1. Traditionally, communication systems have been stand-alone, monolithic entities. Consequently, it has been difficult to integrate them with other processes or to interwork them with other communication systems. For example, CEBP (communications-enabled business processes) has been touted for a long time. But communication systems have required that session initiation be done within the system. If there is a need to bring in any historical information, which would naturally be held by the application associated with the business process, then the two systems must agree on the method and format of the exchange. This results in a logistical and coordination quagmire. Requirement 1 simplifies this enormously: the URI scheme makes it straightforward to pass information from the business process to the communication system. Additionally, the communication system can be upgraded or replaced with minimal disruption. For example, the business process can use an HTTP redirect to keep the original URI while pointing it at a different entity, with minimal administrative intervention.
  2. Use of the triangular connection, coupled with dynamic download of the signaling procedure, eliminates interoperability and feature-compatibility issues. A server can introduce a new feature or try out a different UI without worrying whether clients will be able to handle the change.
  3. If the session control messages use a link independent of the media link, then it becomes much easier and quicker to stop and restart the media flow without incurring session setup time, which will be longer due to the required authentication procedure.
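The redirect idea in item 1 can be sketched as follows. This is only an illustration under stated assumptions: the host names, the path layout, and the function `redirect_target` are hypothetical. The point is that the business process keeps publishing one stable URI and computes the `Location` of an HTTP redirect (for example a 302) from whichever communication provider is currently in use.

```python
from urllib.parse import urlsplit, urlunsplit


def redirect_target(request_uri: str, current_provider: str) -> str:
    """Map a stable, published call URI onto the provider currently in use.

    The path and query of the original request are preserved, so any
    context passed by the business process travels with the redirect.
    """
    req = urlsplit(request_uri)
    prov = urlsplit(current_provider)
    path = prov.path.rstrip("/") + req.path
    return urlunsplit((prov.scheme, prov.netloc, path, req.query, ""))


# Hypothetical URIs: the published URI never changes; only the provider does.
location = redirect_target(
    "https://shop.example/call/support?order=1234",
    "https://rtc-provider-b.example/apps/shop",
)
print(location)
```

Switching providers then amounts to changing the second argument; the URI printed on web pages and in emails stays the same.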

How to use WebRTC

WebRTC apps can be designed for a wide variety of applications, requiring both high-end and low-end scalability.

  1. Personal communication system: A small-footprint WebRTC app server that allows the host’s friends to initiate a communication session from a browser as and when needed. This eliminates the need for a 3rd party provider. Here WebRTC helps to eliminate the need for a network effect at the application layer.
  2. Contact center: A potential customer browsing a website can initiate a communication session. The website can dynamically construct the session’s reach URL based on the customer’s profile and browsing history and other cookie information. Since the reach URL is dynamically generated, the website owner can change the WebRTC app server very easily. The communication provider has been commoditized.
  3. Unified Communication System: An organization can decide to gradually introduce a new UC system. Unlike traditional systems, a system based on WebRTC allows for “guest access”. This way early adopters do not suffer from lack of network effect. Indeed early adopters can become advocates of the new system because they will be in a position to demonstrate the benefits to late adopters.

In a recent post by WebRTC “activists” on the impact of WebRTC on UC, Alan Quayle writes,

The application diversity being driven by the person we’re trying to communicate with and their preferences. So what impact will WebRTC have on UC? None. Because the problem is in federation of presence, not in the standardization of media codecs, and the lack of federation is driven more by commercial issues than lack of standardization.

There is a way for a WebRTC-based system to address the natural application diversity Alan identifies. There is a fundamental problem in the way Presence information is currently distributed. The problem arises because Presence information is usually pushed to the recipients. When federating, it is preferable, for various reasons, to share this information only selectively outside the local organization, and there is no dependable way to do that with a push model. If instead the Presence information is pulled, it becomes easy to share it selectively depending on the person querying it. A WebRTC-based system can universally support pull requests via HTTP.
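A pull model makes the selective sharing easy to express, because the answer is computed per request. The following Python sketch assumes an authenticated requester id is already available (how it is authenticated is out of scope); the field names, ids, and the function `presence_for` are hypothetical.

```python
from typing import Dict, Set

# Full presence record, known only to the local system (hypothetical fields).
PRESENCE: Dict[str, str] = {
    "status": "in a meeting",
    "until": "15:00",
    "location": "Room 4",
}

# Which fields each requester may see; anyone not listed sees nothing.
VISIBLE_FIELDS: Dict[str, Set[str]] = {
    "https://colleague.example/": {"status", "until", "location"},
    "https://partner.example/bob": {"status"},
}


def presence_for(requester_id: str) -> Dict[str, str]:
    """Answer a pull request with only the fields this requester may see."""
    fields = VISIBLE_FIELDS.get(requester_id, set())
    return {key: value for key, value in PRESENCE.items() if key in fields}


print(presence_for("https://colleague.example/"))
print(presence_for("https://partner.example/bob"))
print(presence_for("https://unknown.example/"))
```

In a deployment this function would sit behind an HTTP GET handler, so a requester outside the organization needs nothing more than a browser or an HTTP client.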

The next issue federation has to address is signaling protocol. But WebRTC tackles that by dynamically downloading the signaling procedure. This is why it is important to recognize the benefit of triangular connection afforded by WebRTC. Very often, WebRTC is credited with standardizing media codecs. But by allowing dynamic download of signaling procedure, it has eliminated the need to standardize the signaling protocol.

The final point Alan makes is very valid. Until now, federation between two organizations has meant that an elaborate organizational agreement has to be reached even before the administrative setup can be made. But a WebRTC-based system allows an organization to unilaterally give “Guest access” to some or all of the members of the partnering organization, as long as the partnering organization has a federating id mechanism like SSO. The local organization can enforce guest privileges using the federated id and by maintaining whitelists and blacklists.

Apart from Alan’s points, WebRTC is going to impact the UC market in a major way. Thus far, it has been very difficult to roll out a UC system incrementally. More often than not, users of a UC system get to utilize the full feature set only when they are interacting with other users of the same UC system. But “Guest access” allows for an incremental rollout. This is going to have an impact on the current players as well. Interestingly, Skype for Business and Cisco have announced their plans to offer “Guest access”. We have to wait and see how they will be impacted.

But there is a cautionary point that needs to be noted: there is a major gotcha for “Guest access”. If an enterprise does not allow UDP traffic out of its Intranet, then “Guest access” will fail. The current WebRTC/ICE mechanism does not allow the originating enterprise to be involved. There are proposals to address this point. It is critical that this gets resolved soon.

In a couple of hours there will be a VUC session on this topic. So I thought it would be useful to record some of my observations and outstanding questions.

  1. A user or the administrator of the local network must have a way to designate STUN and TURN servers that override the ones specified by the application. A STUN server is analogous to a DNS server, and just as we are at liberty to specify the DNS servers, we must be able to specify the STUN server. Depending on its security considerations, a network may be obligated to record all conversations. To facilitate that, a network may deploy a TURN server and require all RTC traffic to flow through it. This can be done simply if the browser tacitly uses its own TURN server and assigns the highest priority to the corresponding ICE candidate. This is analogous to using a SOCKS proxy for HTTP flows.
  2. Both the users and application providers should recognize that external STUN and TURN providers have access to session metadata.
  3. TURN adds overhead, and the overhead grows further when ReTURNs are used. TURN needs this additional overhead to multiplex multiple streams between a TURN client and server. Most WebRTC use cases will involve a single stream. I think it is a good tradeoff to consume the occasional additional port at the server rather than additional bandwidth for all the flows. So it might be worthwhile to use a simple relay server rather than a full-fledged TURN server.
  4. Some have expressed concern about sharing local addresses with other clients. Given that Trickle ICE is part of WebRTC, a modification to the listing of ICE candidates should be considered. Browsers should not include local addresses in the initial candidate set. Instead, they should be added if and only if the peer’s server-reflexive or peer-reflexive address matches the browser’s own and the connectivity test passes. Of course, we have to recognize that the call setup time may increase slightly.
  5. TURN is required only when both the end-points are behind symmetric NATs. If it is known a priori that this will not be the case (as when the session is always to the app’s own device/server), then we can dispense with relay addresses as ICE candidates. If we further know that the app’s own device/server will have a public Internet presence, then even STUN can be eliminated, since that device/server can use the peer-reflexive address it learns as part of Trickle ICE.
  6. As part of connectivity test, the two end-points must perform authentication of the other end before meaningful information is exchanged.
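The modification proposed in item 4 can be sketched as a small piece of candidate-selection logic. This is a simplified model, not browser code and not part of any WebRTC API: the `Candidate` type, the function names, and the addresses (taken from documentation ranges) are all hypothetical, and the connectivity test is represented by a boolean supplied by the caller.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Candidate:
    kind: str  # "host" (local), "srflx" (server-reflexive) or "relay"
    ip: str
    port: int


def initial_candidates(own: List[Candidate]) -> List[Candidate]:
    """Candidates sent first: everything except local (host) addresses."""
    return [c for c in own if c.kind != "host"]


def late_host_candidates(
    own: List[Candidate], peer_reflexive_ip: str, connectivity_ok: bool
) -> List[Candidate]:
    """Host candidates to trickle later.

    They are revealed only if the peer appears to sit behind the same
    public address as we do and the connectivity test has passed.
    """
    own_reflexive = {c.ip for c in own if c.kind == "srflx"}
    if connectivity_ok and peer_reflexive_ip in own_reflexive:
        return [c for c in own if c.kind == "host"]
    return []


own = [
    Candidate("host", "192.168.1.10", 50000),
    Candidate("srflx", "203.0.113.7", 50000),
    Candidate("relay", "198.51.100.9", 3478),
]
print(initial_candidates(own))
print(late_host_candidates(own, "203.0.113.7", connectivity_ok=True))
print(late_host_candidates(own, "198.51.100.200", connectivity_ok=True))
```

Under the assumption in item 5 that no relay is needed, the same filter could also drop `relay` entries from the initial set.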

In a post that prompted me to write this, Tsahi discusses different alternative signaling protocols one can use in a WebRTC-enabled app. In this post, I approach the issue from a different angle and I hope this sheds additional light and helps you to reach a choice appropriate for you.

Before we dig deep, we have to recognize that we have to decide on two independent matters: 1) how will the signaling messages be carried and 2) what will be the signaling protocol. There are very many variables that will affect the optimal answer for your scenario. So it is best that we discuss them in general and let you decide on a case by case basis.

First let us consider the transport mechanism.

  1. Pure HTTP: Since the app will be accessed from a browser, an easy choice would be to use HTTP as the transport. It works great if the browser is initiating a signaling procedure and the server responds.
  2. HTTP w Long Polling/Comet: But there are times when the server needs to initiate a message asynchronously. Some examples are when the server wants to notify one user of another’s action, like placing the mic or speaker on mute, or when the server would like to notify a user of an incoming call request. Since the server cannot autonomously initiate an HTTP session, an alternative is to use long polling or Comet. This may increase the load on the server due to excessive polling, or may introduce latency with its undesirable effect on UX.
  3. HTTP w Push Notification: Alternatively the server can use Push Notification offered by both Chrome and Firefox to push a notification and upon receiving such a notification, the browser can initiate an HTTP session to continue the procedure. Of course this addresses the server load, but does not address the latency issue, especially for “in-session” procedures. Worse, the latency is affected by a third party service.
  4. Websocket: This is where the use of Websocket has its advantages, since a Websocket starts as an HTTP session which is then converted to a persistent TCP session. Almost all browsers (in their most recent versions) support Websocket, and there are server implementations that are very efficient. So it addresses both issues.
  5. Websocket w Push Notification: If maintaining a Websocket connection during an idle period (so as to inform of an incoming session request) is undesirable, then one can use Push Notification during idle periods and use Websocket only during active sessions.
  6. Data Channel w X: The final choice is for the server not to be involved during an active session, but to let the browsers handle the signaling procedures directly between themselves via a WebRTC Data Channel. But this approach does not address how to handle notifications during idle periods.

As you can see there are many choices with each having its own trade-offs. But knowing the trade-offs, you can decide the appropriate transport for your use case.

Deciding which protocol to use is either a “no-brainer” or a “not so fast”. If the paramount objective is to work with an already deployed system and the WebRTC app is just another access mechanism, then there is nothing more to consider: it is optimal simply to use the signaling procedure of the deployed system. Otherwise, it is better to start from scratch and ask the questions differently. From the time of Q.931 in ISDN Basic Access up to and including SIP, the standards bodies have focused on defining the protocol so as to ensure interoperability between two autonomous systems. Since the end-points will have different capabilities and present different user experiences, the best a standard can do is to design a protocol that drives a basic user interface. Thus, for example, when the far-end places a call on hold, the near-end is not notified, because it is not clear how to abstract the notification so that all variations in the UI can be handled.

Next, let me quickly dismiss a faux use case, but one that is widely considered. It is known as the “trapezoidal connection”. In this connection, the two end-points are each connected to their own WebRTC app and the two apps federate between themselves. The fact that the two end-points are using WebRTC as access is incidental; the real crux is that the two apps are federating and have agreed on a protocol for this. So what the apps select as their protocol belongs to the “no-brainer” category: each app will select a protocol that is optimal for the agreed-upon federation protocol.

So the really interesting use case is where the end-points are directly connected to the app server, the so-called “triangular connection”. Since both end-points are directly connected to the app server and the server can dynamically download the signaling procedures via Javascript, it is in a position to offer a rich user experience by dynamically driving UI elements. The app designer can freely devise the needed signaling procedures; conforming to a standard is not critical. A good analogy is the choice between paint by number and free-form painting. At first glance, paint by number looks straightforward; but in fact it is tedious, leaves no room for error and is not very expressive. On the other hand, free-form painting, if you are good at it, is fluid, very expressive and gives lots of freedom. If the choice were only free-form painting, then I would have only a blank canvas; with paint by numbers, there is hope that I will have something that looks like a painting. So I say, to each his own.

Recently, Carl Ford was musing about potential ideas for a WebRTC Hackathon. One idea he had was exploring different UI designs associated with “Video on Hold”. This post is a summary of our design thoughts and the decisions we made for a WebRTC application that is part of EnThinnai.

He felt that the designs used in phone systems don’t work well for smartphones, so we should probably have a different approach for video calls. As an example, he was wondering how the user should be notified when she has gone to a different browser tab and the held video call is being retrieved by the other party.

In a followup post he elaborates his point. He suggests that we may imitate the idea used in 1A2 Key Systems phones. To see how far we can carry its design we need to go into a bit more detail.

These phones had some white buttons, each one controlling a line the phone has access to, and at most one button could be engaged at a time. There was a red button that placed the currently engaged line on hold. All these buttons could be lit and could also flash to signify the status of the line. For example, a quickly flashing light signified an incoming call, a slowly flashing light signified that the line had been put on hold, and a steady light indicated an active call. Subsequently, Avaya carried this design idea over to their digital sets as well. This concept of “call appearances” and an “active call appearance” is natural and very familiar in computer systems using windows. It is straightforward to observe that call appearances are nothing more than open windows and the active call appearance is the active window. When the user selects a window to be active, the OS tacitly places the other windows on hold.

But the analogy goes only so far. In a computer, even if a window is not active, activities can go on in it. For example, the user may be playing a YouTube video in an inactive window. We should also note that a 1A2 Key System phone indicates whether the local user has placed the call on hold; it does not know whether the far-end user has placed the call on hold or is retrieving the call, which is the use case Carl wants to explore.

There is one other fundamental difference between the 1A2 Key System and the environment a WebRTC app will find itself in. The phone can safely decide that when a call appearance becomes active, the call that was active must be placed on hold. But that may not be appropriate in the case of WebRTC. For example, the user may want to continue the call while viewing and interacting with the contents of another window. Or the user may have multiple WebRTC sessions going at the same time in an attempt to emulate a bridged call. So the only safe approach is to let the user explicitly select whether a video call must be placed on hold or not.

If we dig a bit deeper, we will question the basic need to place a call on hold in the first place. In PSTN systems, a call must be placed on hold if the user wants to attend another call because the access line can carry only one call at a time. But that is not the case in the case of WebRTC. The user can equivalently decide to turn off the camera or the display or both instead of placing the whole call on hold.

Recently, call centers have responded to the frustration callers express over excessive hold times by introducing a feature called “callbacks” or “virtual queuing”. A WebRTC app can offer a similar feature in an elegant manner by making the app multi-modal: the text chat session periodically updates the status and serves as the link for providing audio and video cues when an agent becomes available.

These thoughts are captured in the current user interface design used in EnThinnai:

1. Screenshot of the chat window, just before start of a video call


2. Screenshot during an active video call


3. Screenshot of a “held” or “virtually queued” call


4. Far-end sends a text message and creates audio sound by pressing the “b” button

Just as the main utility of a formal living room is to entertain visiting guests, the main utility of a WebRTC app is to let guests initiate a communication session with the subscriber of the app.

Many go to enormous lengths to furnish and decorate a space normally called the Living Room. Notwithstanding the expense involved and the name, it is used mostly when guests are visiting. When we are entertaining guests and they are using the amenities in that room, there is no question of whether the guests have a similar room and similar amenities in their own houses. The only requirement is that they visit you and that you are ready to host them.

So it is with WebRTC apps. The main reason for the app, and for you to sign up for one, is so that people can initiate a communication session with you. The only requirements are that your guests have a compatible browser and that you are willing to communicate with them.

Just because you have a lavish formal living room does not mean that you will experience similar luxury when you visit one of your friends. Similarly, subscribing to a WebRTC app does not imply that you can initiate a communication session to one of your friends. In this respect, WebRTC apps are for receiving only. This is critical. Anyone suggesting differently is misleading you.

The image is courtesy of AvaLiving.com
