Writing Your First WebRTC Application: Part 1
However, it’s proving to be much bigger and more complicated than I expected. It’s not a simple task to take the reader from square one all the way across the board and not require 30 or more pages to get him or her there.
So, like all good elephant feasts, I’ve decided to break this into smaller pieces and spend some quality time on each one of them. My goal is that when all is said and done, you will be able to string the pieces together to get a much better understanding than if I just threw them at you all at once.
Which brings me to Part One. I’ll try not to get too technical (which is a difficult task given the nature of this subject) and present the high-level concepts of writing a WebRTC application. For some of you, this will be enough. For others, it will leave you wanting more.
The Beginning of the Beginning
A WebRTC application can be divided into two halves. I will further divide up those halves, but for now, let’s stick with two.
The HTML code will handle input from the user and perform all the steps necessary to format the visual aspects of the webpage. This is where you ask the user what he or she wants to do along with defining the text and graphics to be displayed on the page. Of most importance to this discussion, you will use HTML to declare where video will be shown on the page.
For instance, to display CIF (Common Intermediate Format) video you will create a 384 pixels by 288 pixels container. QCIF (Quarter CIF) would only need 176 pixels by 144 pixels.
A WebSocket object is a fairly easy use and will consist of calls to create a connection, send data on the connection, receive data on the connection, and recognize when the connection closes.
The Signaling Server
If you read my article about WebRTC fundamentals (WebRTC for Beginners), you will know that the WebRTC specification does not specify a particular signaling server. It clearly states that one is necessary, but it puts few restrictions on what it is or how it is accessed.
This can be seen as a mixed blessing. By not prescribing what a signaling server is, developers can use the technology that best suits their needs. Do you want to use SIP? Go ahead and use SIP. Do you want to define your own custom signaling that’s perhaps easier to use than SIP? Go for it.
The con is that there is no standard way to perform your signaling. So, unless you can utilize someone else’s work, you are on the hook to write your own signaling server.
No matter whether you procure or write it, a signaling server needs to do two basic things.
- Exchange the metadata necessary to perform the signaling. This includes some form of addressing and Session Description Protocol (SDP) for each browser.
- Deal with Network Address Translation and firewalls.
For a closer look as the high level aspects of a signaling server, please refer to An Introduction to WebRTC Signaling.
Wrapping up Part One
Allow me to summarize what I just wrote.
- A WebRTC solution consists of two parts – a Web browser application and a signaling server.
- HTML will be used for user input and page display.
- A signaling server must exist, but WebRTC gives you a lot of leeway as to what it is.
In future installments, I will dig deeper into all these aspects. While you may never write your own WebRTC application, it is important to understand what is happening under the covers. By the time I am finished, I hope that you can do both with relative ease.
Stay tuned for more fun and games!