|By Peter Lubbers, Frank Greco||
|March 11, 2010 10:15 AM EST||
Lately there has been a lot of buzz around HTML5 Web Sockets, which defines a full-duplex communication channel that operates through a single socket over the Web. HTML5 Web Sockets is not just another incremental enhancement to conventional HTTP communications; it represents a colossal advance, especially for real-time, event-driven web applications.
HTML5 Web Sockets provides such a dramatic improvement from the old, convoluted "hacks" that are used to simulate a full-duplex connection in a browser that it prompted Google's Ian Hickson - the HTML5 specification lead - to say:
"Reducing kilobytes of data to 2 bytes...and reducing latency from 150ms to 50ms is far more than marginal. In fact, these two factors alone are enough to make Web Sockets seriously interesting to Google."
Let's look at how HTML5 Web Sockets can offer such an incredibly dramatic reduction of unnecessary network traffic and latency by comparing it to conventional solutions.
Polling, Long-Polling, and Streaming - Headache 2.0
Normally when a browser visits a web page, an HTTP request is sent to the web server that hosts that page. The web server acknowledges this request and sends back the response. In many cases - for example, for stock prices, news reports, ticket sales, traffic patterns, medical device readings, and so on - the response could be stale by the time the browser renders the page. If you want to get the most up-to-date "real-time" information, you can constantly refresh that page manually, but that's obviously not a great solution.
With polling, the browser sends HTTP requests at regular intervals and immediately receives a response. This technique was the first attempt for the browser to deliver real-time information. Obviously, this is a good solution if the exact interval of message delivery is known, because you can synchronize the client request to occur only when information is available on the server. However, real-time data is often not that predictable, making unnecessary requests inevitable and, as a result, many connections are opened and closed needlessly in low-message-rate situations.
With long-polling, the browser sends a request to the server and the server keeps the request open for a set period. If a notification is received within that period, a response containing the message is sent to the client. If a notification is not received within the set time period, the server sends a response to terminate the open request. It is important to understand, however, that when you have a high message volume, long-polling does not provide any substantial performance improvements over traditional polling. In fact, it could be worse, because the long-polling might spin out of control into an unthrottled, continuous loop of immediate polls.
With streaming, the browser sends a complete request, but the server sends and maintains an open response that is continuously updated and kept open indefinitely (or for a set period of time). The response is then updated whenever a message is ready to be sent, but the server never signals to complete the response, thus keeping the connection open to deliver future messages. However, since streaming is still encapsulated in HTTP, intervening firewalls and proxy servers may choose to buffer the response, increasing the latency of the message delivery. Therefore, many streaming Comet solutions fall back to long-polling in case a buffering proxy server is detected. Alternatively, TLS (SSL) connections can be used to shield the response from being buffered, but in that case the setup and tear down of each connection taxes the available server resources more heavily.
Ultimately, all of these methods for providing real-time data involve HTTP request and response headers, which contain lots of additional, unnecessary header data and introduce latency. On top of that, full-duplex connectivity requires more than just the downstream connection from server to client. In an effort to simulate full-duplex communication over half-duplex HTTP, many of today's solutions use two connections: one for the downstream and one for the upstream. The maintenance and coordination of these two connections introduces significant overhead in terms of resource consumption and adds lots of complexity. Simply put, HTTP wasn't designed for real-time, full-duplex communication as you can see in Figure 1, which shows the complexities associated with building a Comet web application that displays real-time data from a back-end data source using a publish/subscribe model over half-duplex HTTP.
Figure 1: The complexity of Comet applications
It gets even worse when you try to scale out those Comet solutions to the masses. Simulating bi-directional browser communication over HTTP is error-prone and complex and all that complexity does not scale. Even though your end users might be enjoying something that looks like a real-time web application, this "real-time" experience has an outrageously high price tag. It's a price that you will pay in additional latency, unnecessary network traffic, and a drag on CPU performance.
HTML5 Web Sockets to the Rescue!
Defined in the Communications section of the HTML5 specification, HTML5 Web Sockets represent the next evolution of web communications - a full-duplex, bidirectional communications channel that operates through a single socket over the Web. HTML5 Web Sockets provides a true standard that you can use to build scalable, real-time web applications. In addition, since it provides a socket that is native to the browser, it eliminates many of the problems Comet solutions are prone to. Web Sockets remove the overhead and dramatically reduce complexity.
To establish a WebSocket connection, the client and server upgrade from the HTTP protocol to the WebSocket protocol during their initial handshake, as shown in the following example:
Example 1: The WebSocket handshake (browser request and server response)
GET /text HTTP/1.1\r\n
HTTP/1.1 101 WebSocket Protocol Handshake\r\n
Once established, WebSocket data frames can be sent back and forth between the client and the server in full-duplex mode. Both text and binary frames can be sent full-duplex, in either direction at the same time. The data is minimally framed with just two bytes. In the case of text frames, each frame starts with a 0x00 byte, ends with a 0xFF byte, and contains UTF-8 data in between. WebSocket text frames use a terminator, while binary frames use a length prefix.
The Showdown: Comet vs HTML5 Web Sockets
How dramatic is that reduction in unnecessary network traffic and latency? Let's compare a polling application with a WebSocket application.
For the polling example, I created a simple web application in which a web page requests real-time stock data from a RabbitMQ message broker using a traditional publish/subscribe model. It does this by polling a Java servlet that is hosted on a web server. The RabbitMQ message broker receives data from a fictitious stock price feed with continuously updating prices. The web page connects and subscribes to a specific stock channel (a topic on the message broker) and uses an XMLHttpRequest to poll for updates once per second. When updates are received, some calculations are performed and the stock data is shown in a table as shown in Figure 2.
Note: The back-end stock feed actually produces a lot of stock price updates per second, so using polling at one-second intervals is actually more prudent than using a Comet long-polling solution, which would result in a series of continuous polls. Polling effectively throttles the incoming updates here.
It all looks great, but a look under the hood reveals there are some serious issues with this application. For example, in Mozilla Firefox with Firebug (a Firefox add-on that allows you to debug web pages and monitor the time it takes to load pages and execute scripts), you can see that GET requests hammer the server at one-second intervals. Turning on Live HTTP Headers (another Firefox add-on that shows live HTTP header traffic) reveals the shocking amount of header overhead that is associated with each request. The following two examples show the HTTP header data for just a single request and response:
Example 2: HTTP request header
GET /PollingStock//PollingStock HTTP/1.1
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:184.108.40.206) Gecko/20091102 Firefox/3.5.5
Cookie: showInheritedConstant=false; showInheritedProtectedConstant=false; showInheritedProperty=false; showInheritedProtectedProperty=false; showInheritedMethod=false; showInheritedProtectedMethod=false; showInheritedEvent=false; showInheritedStyle=false; showInheritedEffect=false
Example 3: HTTP response header
HTTP/1.x 200 OK
Server: Sun Java System Application Server 9.1_02
Date: Sat, 07 Nov 2009 00:32:46 GMT
Just for fun, I counted all the characters. The total HTTP request and response header information overhead contains 871 bytes and that doesn't even include any data. Of course, this is just an example and you can have less than 871 bytes of header data, but I have also seen cases where the header data exceeded 2,000 bytes. In this example application, the data for a typical stock topic message is only about 20 characters long. As you can see, it's effectively drowned out by the excessive header information, which wasn't even required in the first place.
What happens when you deploy this application to a large number of users? Let's take a look at the network throughput for just the HTTP request and response header data associated with this polling application in three different use cases.
- Use case A: 1,000 clients polling every second:
Network throughput is (871 x 1,000) = 871,000 bytes = 6,968,000 bits per second (6.6 Mbps)
- Use case B: 10,000 clients polling every second:
Network throughput is (871 x 10,000) = 8,710,000 bytes = 69,680,000 bits per second (66 Mbps)
- Use case C: 100,000 clients polling every 1 second:
Network throughput is (871 x 100,000) = 87,100,000 bytes = 696,800,000 bits per second (665 Mbps)
That's an enormous amount of unnecessary network throughput. If only we could just get the essential data over the wire. Well, guess what? You can with HTML5 Web Sockets. I rebuilt the application to use HTML5 Web Sockets, adding an event handler to the web page to asynchronously listen for stock update messages from the message broker (check out the many how-tos and tutorials on www.tech.kaazing.com/documentation for more information on how to build a WebSocket application). Each of these messages is a WebSocket frame that has just two bytes of overhead (instead of 871). Take a look at how that affects the network throughput overhead in our three use cases.
- Use case A: 1,000 clients receive 1 message per second:
Network throughput is (2 x 1,000) = 2,000 bytes = 16,000 bits per second (0.015 Mbps)
- Use case B: 10,000 clients receive 1 message per second:
Network throughput is (2 x 10,000) = 20,000 bytes = 160,000 bits per second (0.153 Mbps)
- Use case C: 100,000 clients receive 1 message per second:
Network throughput is (2 x 100,000) = 200,000 bytes = 1,600,000 bits per second (1.526 Mbps)
As you can see in Figure 3, HTML5 Web Sockets provide a dramatic reduction of unnecessary network traffic compared to the polling solution.
Figure 3: Comparison of the unnecessary network throughput overhead between the polling and the WebSocket applications
What about the reduction in latency? Take a look at Figure 4. In the top half, you can see the latency of the half-duplex polling solution. If we assume, for this example, that it takes 50 milliseconds for a message to travel from the server to the browser, then the polling application introduces a lot of extra latency, because a new request has to be sent to the server when the response is complete. This new request takes another 50ms and during this time the server cannot send any messages to the browser, resulting in additional server memory consumption.
In the bottom half of the figure, you see the reduction in latency provided by the WebSocket solution. Once the connection is upgraded to WebSocket, messages can flow from the server to the browser the moment they arrive. It still takes 50 ms for messages to travel from the server to the browser, but the WebSocket connection remains open so there is no need to send another request to the server.
Figure 4: Latency comparison between the polling and WebSocket applications
HTML5 Web Sockets and the Kaazing WebSocket Gateway
Today, only Google's Chrome browser supports HTML5 Web Sockets natively, but other browsers will soon follow. To work around that limitation, however, Kaazing WebSocket Gateway provides complete WebSocket emulation for all the older browsers (I.E. 5.5+, Firefox 1.5+, Safari 3.0+, and Opera 9.5+), so you can start using the HTML5 WebSocket APIs today.
WebSocket is great, but what you can do once you have a full-duplex socket connection available in your browser is even greater. To leverage the full power of HTML5 Web Sockets, Kaazing provides a ByteSocket library for binary communication and higher-level libraries for protocols like Stomp, AMQP, XMPP, IRC and more, built on top of WebSocket.
For example, if you use a high-level library for protocols such as Stomp or AMQP (supplied with the Kaazing Gateway), you can talk directly to back-end message brokers like the RabbitMQ broker described in the example. By having a direct connection to services, there is no need for additional application server logic to translate those bi-directional, full-duplex TCP backend protocols to uni-directional, half-duplex HTTP connections; the browser can simply understand these protocols natively.
Figure 5: Kaazing WebSocket Gateway extends TCP-based messaging to the browser with ultra high performance
HTML5 Web Sockets provides an enormous step forward in the scalability of the real-time web. As you have seen in this article, HTML5 Web Sockets can provide a 500:1 or - depending on the size of the HTTP headers - even a 1000:1 reduction in unnecessary HTTP header traffic and 3:1 reduction in latency. That is not just an incremental improvement; that is a revolutionary jump - a quantum leap.
Kaazing WebSocket Gateway makes HTML5 WebSocket code work in all the browsers today, while providing additional protocol libraries that allow you to harness the full power of the full-duplex socket connection that HTML5 Web Sockets provides and communicate directly to back-end services. For more information about Kaazing WebSocket Gateway, visit www.kaazing.com and the Kaazing technology network at tech.kaazing.com.
- HTML5 Web Sockets Specification: http://dev.w3.org/html5/websockets/
- WebSockets.org: http://www.websockets.org
- Kaazing corporation: http://www.kaazing.com
- Download the Kaazing WebSocket Gateway: http://www.kaazing.com/download
- Skills Matter HTML5 Communication Training Courses: http://skillsmatter.com/expert-profile/java-jee/peter-lubbers
- Book: Pro HTML5 Programming: Powerful APIs for Richer Internet Application Development (Link)
|PeterLubbers 03/15/10 03:55:00 PM EDT|
An additional problem with streaming is that you can get unexpected results if there are intermediaries such as buffering proxy servers in the way. Streaming solutions therefore often fall back to long-polling if they detect such intermediaries (if they are smart enough to detect them at all). With WebSocket and WebSocket secure you have a better chance of traversing proxy servers and firewalls.
|tbayes 03/11/10 10:11:00 AM EST|
Hi Peter and Frank,
Thank you for this clear and well-written article, which does much to effectively explain the benefits of the WebSocket protocol.
If I may point out one minor correction: In the WebSocket section of Example 3, Use case B and C should show overhead rates of 0.153Mbps and 1.526Mbps respectively, as opposed to 0.153Kbps and 1.526Kbps.
Perhaps a more natural comparison would be between Web Sockets and Comet-streaming (via, for example, the "Forever IFrame" technique), as I feel it a little unsporting to compare Web Sockets directly to polling or long-polling! At around 20 bytes per message, the overhead associated with clients receiving messages via Comet-style streaming is still more than WebSocket's 2 bytes per message, but is still one to two orders of magnitude less than the overhead associated with polling techniques. Similarly, latency whether receiving messages using Comet-streaming or WebSocket is the same. Of course, clients sending messages will gain the advantages you describe only with Web Sockets, as all other approaches do indeed require new HTTP requests.
I completely agree, however, that WebSocket is a definite improvement over the Comet IFrame-approaches, which have an air of string and glue about them.
I hope that we will see WebSocket support in all production mainstream browsers very soon.
SYS-CON Events announced today that robomq.io will exhibit at SYS-CON's @ThingsExpo, which will take place on June 9-11, 2015, at the Javits Center in New York City, NY. robomq.io is an interoperable and composable platform that connects any device to any application. It helps systems integrators and the solution providers build new and innovative products and service for industries requiring monitoring or intelligence from devices and sensors.
Apr. 1, 2015 06:00 PM EDT Reads: 1,654
The 17th International Cloud Expo has announced that its Call for Papers is open. 17th International Cloud Expo, to be held November 3-5, 2015, at the Santa Clara Convention Center in Santa Clara, CA, brings together Cloud Computing, APM, APIs, Microservices, Security, Big Data, Internet of Things, DevOps and WebRTC to one location. With cloud computing driving a higher percentage of enterprise IT budgets every year, it becomes increasingly important to plant your flag in this fast-expanding business opportunity. Submit your speaking proposal today!
Apr. 1, 2015 05:15 PM EDT Reads: 684
SYS-CON Events announced today that Litmus Automation will exhibit at SYS-CON's 16th International Cloud Expo®, which will take place on June 9-11, 2015, at the Javits Center in New York City, NY. Litmus Automation’s vision is to provide a solution for companies that are in a rush to embrace the disruptive Internet of Things technology and leverage it for real business challenges. Litmus Automation simplifies the complexity of connected devices applications with Loop, a secure and scalable cloud platform.
Apr. 1, 2015 04:00 PM EDT Reads: 834
While not quite mainstream yet, WebRTC is starting to gain ground with Carriers, Enterprises and Independent Software Vendors (ISV’s) alike. WebRTC makes it easy for developers to add audio and video communications into their applications by using Web browsers as their platform. But like any market, every customer engagement has unique requirements, as well as constraints. And of course, one size does not fit all. In her session at WebRTC Summit, Dr. Natasha Tamaskar, Vice President, Head of Cloud and Mobile Strategy at GENBAND, will explore what is needed to take a real time communications ...
Apr. 1, 2015 03:45 PM EDT Reads: 759
As Marc Andreessen says software is eating the world. Everything is rapidly moving toward being software-defined – from our phones and cars through our washing machines to the datacenter. However, there are larger challenges when implementing software defined on a larger scale - when building software defined infrastructure. In his session at 16th Cloud Expo, Boyan Ivanov, CEO of StorPool, will provide some practical insights on what, how and why when implementing "software-defined" in the datacenter.
Apr. 1, 2015 03:45 PM EDT Reads: 704
SYS-CON Events announced today that Solgenia will exhibit at SYS-CON's 16th International Cloud Expo®, which will take place on June 9-11, 2015, at the Javits Center in New York City, NY, and the 17th International Cloud Expo®, which will take place on November 3–5, 2015, at the Santa Clara Convention Center in Santa Clara, CA. Solgenia is the global market leader in Cloud Collaboration and Cloud Infrastructure software solutions. Designed to “Bridge the Gap” between Personal and Professional Social, Mobile and Cloud user experiences, our solutions help large and medium-sized organizations dr...
Apr. 1, 2015 03:00 PM EDT Reads: 3,128
Internet of Things (IoT) will be a hybrid ecosystem of diverse devices and sensors collaborating with operational and enterprise systems to create the next big application. In their session at @ThingsExpo, Bramh Gupta, founder and CEO of robomq.io, and Fred Yatzeck, principal architect leading product development at robomq.io, will discuss how choosing the right middleware and integration strategy from the get-go will enable IoT solution developers to adapt and grow with the industry, while at the same time reduce Time to Market (TTM) by using plug and play capabilities offered by a robust I...
Apr. 1, 2015 01:45 PM EDT Reads: 893
Wearable technology was dominant at this year’s International Consumer Electronics Show (CES) , and MWC was no exception to this trend. New versions of favorites, such as the Samsung Gear (three new products were released: the Gear 2, the Gear 2 Neo and the Gear Fit), shared the limelight with new wearables like Pebble Time Steel (the new premium version of the company’s previously released smartwatch) and the LG Watch Urbane. The most dramatic difference at MWC was an emphasis on presenting wearables as fashion accessories and moving away from the original clunky technology associated with t...
Apr. 1, 2015 12:00 PM EDT Reads: 1,661
The world's leading Cloud event, Cloud Expo has launched Microservices Journal on the SYS-CON.com portal, featuring over 19,000 original articles, news stories, features, and blog entries. DevOps Journal is focused on this critical enterprise IT topic in the world of cloud computing. Microservices Journal offers top articles, news stories, and blog posts from the world's well-known experts and guarantees better exposure for its authors than any other publication. Follow new article posts on Twitter at @MicroservicesE
Apr. 1, 2015 12:00 PM EDT Reads: 1,632
After making a doctor’s appointment via your mobile device, you receive a calendar invite. The day of your appointment, you get a reminder with the doctor’s location and contact information. As you enter the doctor’s exam room, the medical team is equipped with the latest tablet containing your medical history – he or she makes real time updates to your medical file. At the end of your visit, you receive an electronic prescription to your preferred pharmacy and can schedule your next appointment.
Apr. 1, 2015 12:00 PM EDT Reads: 1,078
SYS-CON Events announced today the IoT Bootcamp – Jumpstart Your IoT Strategy, being held June 9–10, 2015, in conjunction with 16th Cloud Expo and Internet of @ThingsExpo at the Javits Center in New York City. This is your chance to jumpstart your IoT strategy. Combined with real-world scenarios and use cases, the IoT Bootcamp is not just based on presentations but includes hands-on demos and walkthroughs. We will introduce you to a variety of Do-It-Yourself IoT platforms including Arduino, Raspberry Pi, BeagleBone, Spark and Intel Edison. You will also get an overview of cloud technologies s...
Apr. 1, 2015 11:00 AM EDT Reads: 2,290
Containers and microservices have become topics of intense interest throughout the cloud developer and enterprise IT communities. Accordingly, attendees at the upcoming 16th Cloud Expo at the Javits Center in New York June 9-11 will find fresh new content in a new track called PaaS | Containers & Microservices Containers are not being considered for the first time by the cloud community, but a current era of re-consideration has pushed them to the top of the cloud agenda. With the launch of Docker's initial release in March of 2013, interest was revved up several notches. Then late last...
Apr. 1, 2015 09:15 AM EDT Reads: 2,334
The WebRTC Summit 2015 New York, to be held June 9-11, 2015, at the Javits Center in New York, NY, announces that its Call for Papers is open. Topics include all aspects of improving IT delivery by eliminating waste through automated business models leveraging cloud technologies. WebRTC Summit is co-located with 16th International Cloud Expo, @ThingsExpo, Big Data Expo, and DevOps Summit.
Apr. 1, 2015 09:00 AM EDT Reads: 1,771
SOA Software has changed its name to Akana. With roots in Web Services and SOA Governance, Akana has established itself as a leader in API Management and is expanding into cloud integration as an alternative to the traditional heavyweight enterprise service bus (ESB). The company recently announced that it achieved more than 90% year-over-year growth. As Akana, the company now addresses the evolution and diversification of SOA, unifying security, management, and DevOps across SOA, APIs, microservices, and more.
Apr. 1, 2015 08:30 AM EDT Reads: 2,144
The list of ‘new paradigm’ technologies that now surrounds us appears to be at an all time high. From cloud computing and Big Data analytics to Bring Your Own Device (BYOD) and the Internet of Things (IoT), today we have to deal with what the industry likes to call ‘paradigm shifts’ at every level of IT. This is disruption; of course, we understand that – change is almost always disruptive.
Mar. 30, 2015 04:00 PM EDT Reads: 1,228
SYS-CON Events announced today that SafeLogic has been named “Bag Sponsor” of SYS-CON's 16th International Cloud Expo® New York, which will take place June 9-11, 2015, at the Javits Center in New York City, NY. SafeLogic provides security products for applications in mobile and server/appliance environments. SafeLogic’s flagship product CryptoComply is a FIPS 140-2 validated cryptographic engine designed to secure data on servers, workstations, appliances, mobile devices, and in the Cloud.
Mar. 30, 2015 11:00 AM EDT Reads: 1,594
GENBAND has announced that SageNet is leveraging the Nuvia platform to deliver Unified Communications as a Service (UCaaS) to its large base of retail and enterprise customers. Nuvia’s cloud-based solution provides SageNet’s customers with a full suite of business communications and collaboration tools. Two large national SageNet retail customers have recently signed up to deploy the Nuvia platform and the company will continue to sell the service to new and existing customers. Nuvia’s capabilities include HD voice, video, multimedia messaging, mobility, conferencing, Web collaboration, deskt...
Mar. 30, 2015 01:00 AM EDT Reads: 1,562
SYS-CON Media announced today that @WebRTCSummit Blog, the largest WebRTC resource in the world, has been launched. @WebRTCSummit Blog offers top articles, news stories, and blog posts from the world's well-known experts and guarantees better exposure for its authors than any other publication. @WebRTCSummit Blog can be bookmarked ▸ Here @WebRTCSummit conference site can be bookmarked ▸ Here
Mar. 29, 2015 10:00 PM EDT Reads: 1,918
SYS-CON Events announced today that Cisco, the worldwide leader in IT that transforms how people connect, communicate and collaborate, has been named “Gold Sponsor” of SYS-CON's 16th International Cloud Expo®, which will take place on June 9-11, 2015, at the Javits Center in New York City, NY. Cisco makes amazing things happen by connecting the unconnected. Cisco has shaped the future of the Internet by becoming the worldwide leader in transforming how people connect, communicate and collaborate. Cisco and our partners are building the platform for the Internet of Everything by connecting the...
Mar. 29, 2015 07:00 PM EDT Reads: 5,335
Temasys has announced senior management additions to its team. Joining are David Holloway as Vice President of Commercial and Nadine Yap as Vice President of Product. Over the past 12 months Temasys has doubled in size as it adds new customers and expands the development of its Skylink platform. Skylink leads the charge to move WebRTC, traditionally seen as a desktop, browser based technology, to become a ubiquitous web communications technology on web and mobile, as well as Internet of Things compatible devices.
Mar. 29, 2015 06:00 PM EDT Reads: 1,918