Tiledmedia’s Solutions Rely on Standards

Tiledmedia’s ClearVR solutions rely on international standards. This makes our technology straightforward to deploy to existing devices over existing content distribution channels. We believe in the interoperability that good standards bring, which benefits both consumers (things just work) and content providers (it significantly reduces their cost).

We rely on the standardized HEVC decoders found in consumer devices and personal computing systems. We also rely on HEVC encoders, which must operate under certain restrictions that are likewise defined in the specification.

Tiledmedia is an enthusiastic member of the VR Industry Forum, VRIF, which seeks to facilitate the widespread adoption of VR services by working on quality and interoperability. Rob Koenen, one of Tiledmedia’s Founders, is the President of VRIF, representing TNO where he works part-time.

Tiled Streaming Comes in Different Flavors

It is important to understand that there are many different implementations of tiled streaming, of which ClearVR is one. MPEG’s OMAF (Omnidirectional MediA Format) specification contains another form of tiled streaming, as well as a “viewport-dependent media profile” that relies on tiled streaming.

Tiledmedia’s ClearVR is currently not compatible with all aspects of this OMAF profile, which is a deliberate choice. We believe that ClearVR, the result of more than five years of tiled streaming R&D, is significantly ahead of the technology that the current version of OMAF specifies. A brief summary follows below; in short, it is the much deeper integration of media processing and the networking stack that determines the performance of the ClearVR solution.

At the same time, Tiledmedia is convinced that good-quality standards create markets, and we seek to provide standards-based solutions to those markets.

We want to adopt, and help improve, any standards that help our customers. We’ll do two things to move in that direction. First, we’ll adopt more and more parts of relevant standards as we evolve our platform. As an example, we’re adding support for Common Encryption (CENC), so that existing packagers can be used in a ClearVR-enabled deployment. While this will bring ClearVR closer to standards-compliance, the second part of our approach is to contribute our ideas to the relevant (MPEG) standards, bringing the standard closer to our solution. Again by way of example, we plan on proposing additional elements to the ISO Base Media File Format that will help to significantly decrease tile switching latency.

In the end, we expect to meet in the middle.

Differences Between ClearVR and Current Standards

The main advantages of ClearVR over the current generation of standards are:

  • Efficiency: ClearVR can reduce the bitrate requirement by a factor of up to 5 when compared to full-sphere streaming; current standards reach about a factor of two.
  • Switching Latency: On a good-quality CDN, the ClearVR Client can switch to high-resolution imagery within one or two frames, which is unnoticeable to the user. In a standards-based solution, switching after head motion relies on GOP boundaries and can take hundreds of milliseconds or more, which is very visible. While this can be alleviated to some extent, it will be at the expense of creating more encoded versions of the same content, adding inefficiencies and cost to processing and distribution. (Note that this is not supported by the current spec text.)
  • Flexibility: With a single representation, ClearVR can cater to all HMDs and various types of flat screens, regardless of their viewport angle. With a fully standards-based solution, the content distributor needs to provide separate representations for each viewport angle – again a significant cost factor.
  • Graceful degradation: With ClearVR, each tile forms an independent stream, which the client combines with other tiles to create a single HEVC-compliant bitstream. Such client-side processing allows the ClearVR library to make last-second decisions, and to dynamically replace tiles with their low-resolution equivalent on a frame-by-frame basis. This helps when data is not yet available and prevents buffering. Current standards hard-code all tile combinations in the bitstream during content production. When data for a single tile is not (yet) available, the client can only resort to buffering.
  • Bitrate variability: Bitrate spikes in ClearVR are limited, even with extensive head motion, because ClearVR doesn’t require clearing the decoding buffer whenever the viewport changes. In contrast, existing specs rely on field-of-view-specific metadata hard-coded in the bitstream, forcing the client to download a new batch of data whenever the field-of-view changes even slightly. This causes a significant spike in required bandwidth and gives a very noticeable motion-to-high-resolution latency.
  • User interaction: ClearVR does all processing client-side instead of during content preparation, which makes complex forms of user interaction possible without adding any latency. Examples are dynamic zooming, field-of-view adjustments, pause with the ability to look around and still get high-quality imagery, and fast seeking. None of this is supported by any available standard.
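
To make the switching-latency and graceful-degradation points above more concrete, here is a minimal Python sketch. It is not ClearVR code and does not reflect any real ClearVR or OMAF API: the names `TileChoice`, `choose_tiles`, `frames_to_ms` and `worst_case_gop_wait_ms`, as well as the 30 fps frame rate and 30-frame GOP used in the example, are assumptions made purely for illustration. The first part shows the general idea of a per-frame decision that falls back to a low-resolution tile when the high-resolution one has not arrived; the second part shows the arithmetic behind waiting for a GOP boundary versus switching within a couple of frames.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class TileChoice:
    """Hypothetical record of which quality is used for one tile in one frame."""
    tile_id: int
    quality: str  # "high" or "low"


def choose_tiles(viewport_tiles: Iterable[int],
                 high_res_ready: Set[int],
                 low_res_ready: Set[int]) -> List[TileChoice]:
    """Pick a quality for every tile in the viewport, for a single frame.

    Prefer the high-resolution tile; if its data has not arrived yet,
    fall back to the low-resolution equivalent instead of stalling.
    """
    choices = []
    for tile_id in viewport_tiles:
        if tile_id in high_res_ready:
            choices.append(TileChoice(tile_id, "high"))
        elif tile_id in low_res_ready:
            choices.append(TileChoice(tile_id, "low"))
        else:
            # Assumption: with no data at all for a tile, the frame cannot be built.
            raise LookupError(f"no data available for tile {tile_id}")
    return choices


def frames_to_ms(frames: int, fps: float) -> float:
    """Duration of a number of frames in milliseconds."""
    return 1000.0 * frames / fps


def worst_case_gop_wait_ms(gop_frames: int, fps: float) -> float:
    """Upper bound on the wait for the next GOP boundary.

    If head motion happens right after a GOP starts, a client that can
    only switch at GOP boundaries waits close to one full GOP duration.
    """
    return frames_to_ms(gop_frames, fps)


if __name__ == "__main__":
    # Tile 2 has not arrived in high resolution, so it degrades for this frame only.
    print(choose_tiles([1, 2, 3], high_res_ready={1, 3}, low_res_ready={1, 2, 3}))
    # Assumed example numbers: 30 fps video with a 30-frame GOP.
    print(worst_case_gop_wait_ms(30, 30.0))  # wait bound when tied to GOP boundaries
    print(frames_to_ms(2, 30.0))             # a two-frame switch at the same frame rate
```

With these assumed numbers, a switch tied to GOP boundaries can take up to a full second, while a two-frame switch takes well under a tenth of a second. Shorter GOPs narrow that gap, but at the cost of coding efficiency.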