How We Test TVs
Televisioninfo.com tests HDTVs with a rigorous set of scientific methods, using the same tools and techniques that the manufacturers themselves use to test their own products. Rather than just looking at an ad hoc set of images and videos on the screen, we perform an in-depth quantitative analysis using advanced instrumentation and professional tools that look at the performance of the HDTV, determining in extreme detail how the display produces on-screen images.
While other sites watch a couple of movies and discuss how grizzled the hero looks in a particular scene, we determine via extensive measurements and data analysis the true extent of the color gamut, examine the transfer function for all of the primary colors (as well as white), determine how accurately the color temperature of the whites is maintained over the entire luminance range, and examine how the display scales lower resolution video sources to appear on the screen. And that’s just some of the testing we do, which is described in greater detail below; we also evaluate the remote control, the speakers, documentation, the ease of use, and all aspects of display performance and picture quality and accuracy.
To develop this testing methodology, we worked with Dr. Raymond Soneira, the creator of DisplayMate, an advanced industry-standard diagnostic program for displays that helps consumers, technicians and manufacturers set up, calibrate and test their TVs. Working with Dr. Soneira, we developed a comprehensive testing process that we consider to be the most in-depth and authoritative in the world.
DisplayMate includes a very large set of incisive, challenging and sensitive test patterns to check and optimize display performance, to show the effects of a display's internal processing, and to highlight the differences between displays (see below for a few examples). We use the scripting capabilities of DisplayMate to automate many of our tests, and use the extensive library of test screens and test photos that it offers to highlight the performance of the display being tested and determine its strengths and weaknesses. We use the advanced professional DisplayMate Multimedia with Photos Edition and the DisplayMate Multimedia with Motion Edition for our tests. There are many other editions of DisplayMate, including basic tutorial versions for novice consumers.
Various test screens from the DisplayMate software
To analyze the photometry and colorimetry properties of the HDTV display, we also use a Konica Minolta CS-200 ChromaMeter, a laboratory instrument that provides extremely accurate luminance and color measurements for all display technologies. It has a very narrow one-degree acceptance angle, which is very important for the accurate measurement of LCDs. For more details of this device, see here. The CS-200 can measure light sources in the range of 0.005 to 200,000 cd/m2, with an absolute accuracy of +/- 0.02 cd/m2. It is significantly more accurate than the instruments used by many other reviewers, which rely on a set of color filters rather than the light spectrum and generally have a wide acceptance angle that contaminates the luminance measurements. The CS-200 connects to a PC via a USB port, and every data sample is logged.
Our testing process involves capturing many thousands of individual data points, which is done using a customized scripting system that automates the testing process. We then use a number of sophisticated mathematical tools to analyze this data and produce the results and scores that you see in the reviews on this site. For more details of what we test and how we analyze these results, scroll down to the individual tests below.
Almost all HDTVs arrive with preset picture modes that are chosen by the manufacturer so that the HDTV looks best in a brightly lit retail showroom. As a result, the HDTVs are set for maximum brightness and contrast rather than maximum picture quality. We adjust all of the user controls to deliver the best and most accurate picture quality by using a series of DisplayMate test patterns together with advanced instrumentation measurements and user control adjustments. This enables us to find the user control settings that produce the best balance of performance.
It's important to note that we approach calibration from a "colors first" perspective. By this, we mean that our calibration settings are directed primarily at getting the best color performance. Other review sites and magazines may sometimes focus more on getting the best contrast. In this, we simply differ in opinion. Unfortunately, it's nearly impossible to find settings that maximize all aspects of a TV's performance.
Our process does, of course, involve using DisplayMate to find the optimum settings for the brightness and contrast controls, in order to accurately set the digital black level, the peak luminance without saturation or clipping, the sharpness control, and many others. This generally results in a significant reduction in peak brightness in order to deliver peak picture accuracy and quality. As a result, we're often running the displays at settings below those that provide the maximum luminance. In this case, we also discuss in the review the maximum possible luminance of the display and the consequence of these settings; many displays can provide extremely high levels of brightness, but these settings involve serious compromises in color accuracy and image quality.
We do not use any controls that are hidden or require special access codes or equipment to access (such as those designed for professional installers or for use in calibrating the TV at the factory); if the control is not easily accessible to an everyday user, we don’t use it in calibration. We do this because we want to get the same experience that a user would get if they bought the display and then set it up, and most users will not be able to get access to the service menus. As part of the calibration process, we also set the backlight control to maximum.
To measure the black level of the display, we put up an all-black screen in DisplayMate and measure the luminance at the center of the screen, in candelas per square meter (cd/m2). We measure the black level several times during the testing of the display, report on any variance we see between these measurements, and discuss any dynamic backlight or local dimming feature of the display that affects the black level. However, the main figure that we quote is for the black level at our calibrated settings, with the backlight on maximum for LCDs. Our score is based on how dark the black is: the lower the luminance, the higher the score.
To measure the brightest white the TV can achieve, we set the display to show a small area of white (about 4% of the screen) at the center of the screen and measure the luminance in candelas per square meter with a CS-200 ChromaMeter set for a one degree acceptance angle. We do this after calibrating the HDTV as described above. Our score is based on how bright the white is after calibration; the brighter the white, the higher the score. When the peak white varies with the size of the test pattern area, as in the case of plasma displays, we perform several measurements with different areas, each with a different APL (Average Picture Level).
To calculate the contrast that the screen can achieve, we divide the peak white luminance by the deepest black luminance the display can produce when showing normal video and not in a standby mode. So, if a display has a deepest black of 0.4 cd/m2, and a peak white of 400 cd/m2, the contrast ratio is 1000:1. Our score here is based on how high the ratio is; the higher the better. Note that our tests differ from the approach that manufacturers use to determine the contrast ratio; they test the peak white with the backlight on full, then the deepest black with it on the lowest attainable setting (often called a dynamic contrast ratio). Our test determines the true contrast ratio with the backlight on full during the test (often called the static full field contrast ratio).
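The arithmetic in the example above can be sketched in a few lines of Python. This is only an illustration, not the scripts used in our testing; the function name is hypothetical and the two luminance values are the ones quoted in the example.

```python
def contrast_ratio(peak_white_cd_m2: float, black_cd_m2: float) -> float:
    """Static contrast ratio: peak white luminance divided by black luminance."""
    if black_cd_m2 <= 0:
        raise ValueError("black level must be a positive luminance")
    return peak_white_cd_m2 / black_cd_m2


# Values from the worked example: 400 cd/m2 peak white, 0.4 cd/m2 black.
ratio = contrast_ratio(400.0, 0.4)
print(f"{ratio:.0f}:1")
```

Both readings must be taken at the same backlight setting for the result to be a static rather than a dynamic contrast ratio.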
For direct view LCD and Plasma displays, the ANSI checkerboard contrast ratio is generally within a few percent of the full field contrast ratio above. Reviewers that find a significant discrepancy between the two are in fact measuring the veiling glare light contamination of their measuring instrument rather than the performance of the HDTV. See below.
The tests above tell us about the performance of the screen showing just pure whites and pure blacks, but not in the more real world situation of mixed white and blacks on screen. Some displays have problems here: with these areas of high contrast, the whites bleed into the blacks, making them appear brighter than they should and reducing color saturation at the same time. To measure this, we do a test where a variable width outer rectangular frame on the screen is set to peak white, and we then measure the luminance of a small black area at the center of the screen to see how much light bleeds to the center as the frame expands closer to the center. Some other sites have a much simpler test using a checkerboard pattern (and refer to this as checkerboard contrast), but our test gives much more information on how the increasing amount of white bleeds into the black area. Other sites also forget one important technical aspect of this test: that having white on the screen can lead to some of the light from the white screen area reaching the measuring instrument and creating an artificially high reading for the black (a problem called veiling glare, which produces very large measurement errors that lead to erroneous conclusions). We avoid this by using a special black Duvetyne mask to block the white areas of the display, so that any light that reaches the measuring device has come directly from the center target on the screen, not from the surrounding area. The score a display gets is based on how constant the black level remains; a constant black gets a higher score.
Another issue with peak white is that power management issues on some displays (particularly plasmas) require a reduction in peak white levels when the average screen brightness gets too high. We test this by putting up a number of images with varying amounts of white and measuring the luminance of the peak white. Our scoring for this test is based on how much the luminance varies with the different amounts of white on the screen.
This test examines the uniformity of the screen, looking at how even the lighting is across an entirely black or entirely white screen. We use the DisplayMate uniformity test screens to look for irregularities anywhere on-screen, which can either be hot spots (too bright) or cold spots (too dim) or mottled screens with widespread irregularities. We pinpoint and measure the irregularities with the CS-200 set for a narrow one degree acceptance angle. Points are deducted for corners or spots on screen that are not uniform, and also for any changes in luminance that are not gradual.
We determine the gamma of the grayscale transfer function by measuring the luminance of screens with varying signal intensities of gray from 0 to 255. The gamma is determined by measuring the slope of the transfer function on a logarithmic graph between 30 and 70 percent of peak signal, avoiding the bottom and top ends of the curve, which often include spurious irregularities.
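As a rough illustration of this slope measurement, the Python sketch below fits a straight line to log(luminance) against log(signal) using only the samples between 30 and 70 percent of peak signal. It is a minimal sketch, not our analysis software: the function name, the least-squares fit and the synthetic readings are all assumptions made for the example, and no correction for the display's black level is applied.

```python
import math


def estimate_gamma(samples, lo=0.30, hi=0.70):
    """Estimate gamma as the least-squares slope of log(luminance) versus
    log(signal fraction), using only samples between lo and hi of peak signal.

    samples: iterable of (signal_level, luminance) pairs, signal_level in 0..255.
    """
    points = [
        (math.log(level / 255.0), math.log(lum))
        for level, lum in samples
        if lo <= level / 255.0 <= hi and lum > 0
    ]
    if len(points) < 2:
        raise ValueError("need at least two samples inside the fitting window")
    mean_x = sum(x for x, _ in points) / len(points)
    mean_y = sum(y for _, y in points) / len(points)
    num = sum((x - mean_x) * (y - mean_y) for x, y in points)
    den = sum((x - mean_x) ** 2 for x, _ in points)
    return num / den


# Synthetic readings (not measured data) for an ideal display with
# gamma 2.2 and a 200 cd/m2 peak white, sampled every 15 signal levels.
synthetic = [(level, 200.0 * (level / 255.0) ** 2.2) for level in range(0, 256, 15)]
gamma = estimate_gamma(synthetic)
print(f"estimated gamma: {gamma:.2f}")
```

Restricting the fit to the middle of the curve keeps irregularities near black and near peak white from distorting the slope.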
We test resolution scaling by examining a number of DisplayMate test screens in a variety of non-native resolution formats for the display under test. The test screens are designed to examine the way that the HDTV processes the screens and scales them to fit the screen, highlighting any problems such as moiré pattern interference or dithering patterns that compromise legibility.
The color of white that an HDTV produces can vary significantly with factory settings and picture modes. The exact color of white is specified precisely by its CIE chromaticity coordinates, and more commonly by its correlated color temperature, which is a rough approximation to the light given off by a laboratory black body at a temperature of 5,000 to 15,000 kelvins.
The Konica Minolta CS-200 ChromaMeter that we use can measure the chromaticity coordinates and correlated color temperature very accurately. We use this to measure the performance of the display being tested, measuring the red, green and blue primaries as well as the D65 point. We test by setting the display as close as possible to D65, which is a television and photographic industry standard. D65 approximates the color of daylight at noon on an overcast day and includes components of both the blue sky and direct sunlight.
For color and grayscale tracking, we display a number of screens at intensity levels between 255 (the brightest white) and 0 (complete black), measuring both the color temperature and color coordinates of each point in the range. The scoring for this test is based on the amount of variance from the maximum intensity chromaticity values, measured in the CIE 1976 uniform color space (u’, v’). Although we feature both the color temperature variation and the CIE 1976 color space distance in our review, the score is based on the latter, as this provides a better measure of how the white of the display shifts within the color space. We discount any shift of less than 0.004, as this is not noticeable by most observers. This distance is shown on our charts by the red circle.
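The distance calculation can be sketched as follows. The conversion from CIE 1931 (x, y) to CIE 1976 (u’, v’) is the standard formula; everything else (the function names, the assumption that readings arrive as x, y pairs, and the sample readings themselves) is hypothetical and for illustration only.

```python
import math

NOTICEABLE_SHIFT = 0.004  # shifts below this are discounted, as described above


def xy_to_uv(x, y):
    """Convert CIE 1931 (x, y) chromaticity to CIE 1976 (u', v')."""
    d = -2.0 * x + 12.0 * y + 3.0
    return 4.0 * x / d, 9.0 * y / d


def delta_uv(xy_measured, xy_reference):
    """Euclidean distance between two chromaticities in (u', v')."""
    u1, v1 = xy_to_uv(*xy_measured)
    u2, v2 = xy_to_uv(*xy_reference)
    return math.hypot(u1 - u2, v1 - v2)


# Hypothetical readings: the level-255 white is taken as the reference
# (here it happens to sit exactly on D65), followed by dimmer grays.
reference = (0.3127, 0.3290)
readings = {192: (0.3130, 0.3295), 128: (0.3100, 0.3250), 64: (0.3050, 0.3180)}

for level, xy in readings.items():
    shift = delta_uv(xy, reference)
    verdict = "noticeable" if shift >= NOTICEABLE_SHIFT else "discounted"
    print(f"level {level:3d}: shift = {shift:.4f} ({verdict})")
```

Working in (u’, v’) rather than (x, y) matters because equal distances in the 1976 space correspond more closely to equal perceived color differences.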
We determine the transfer function of a display for each of the primary colors by measuring the luminance of a screen for the range of signal intensities from 0 to 255. We then analyze the curve to determine the granularity and other characteristics. Our scoring is based on this analysis; issues such as excessive stepping, clipping and uneven response cost the display points.
We test how closely the display matches the standard primary colors of ITU-R BT.709 (generally referred to as Rec.709), which defines the color gamut of high definition TV signals. The scoring for this test is based on the distance between the measured and standard values; the greater the distance, the lower the score. We plot the measured and recommended gamut in the CIE 1976 Lu’v’ color space. The full Rec.709 standard can be downloaded here. Note that a color gamut that is greater than the standard is also undesirable; this will produce colors that are outside of the standard gamut, producing incorrect colors that are too saturated and not as the content producer intended.
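As a sketch of the gamut comparison, the Python below measures how far each primary lands from its Rec.709 target in (u’, v’). The Rec.709 chromaticities are the published (x, y) values for the standard; the function names and the measured values are hypothetical, and the way the distances feed into a score is not shown.

```python
import math

# Rec.709 primaries as CIE 1931 (x, y) chromaticities.
REC709_XY = {
    "red": (0.640, 0.330),
    "green": (0.300, 0.600),
    "blue": (0.150, 0.060),
}


def xy_to_uv(x, y):
    """Convert CIE 1931 (x, y) chromaticity to CIE 1976 (u', v')."""
    d = -2.0 * x + 12.0 * y + 3.0
    return 4.0 * x / d, 9.0 * y / d


def primary_errors(measured_xy):
    """Distance in (u', v') between each measured primary and its Rec.709 target."""
    errors = {}
    for name, target in REC709_XY.items():
        mu, mv = xy_to_uv(*measured_xy[name])
        tu, tv = xy_to_uv(*target)
        errors[name] = math.hypot(mu - tu, mv - tv)
    return errors


# Hypothetical measurements of a display with an oversaturated green.
measured = {"red": (0.642, 0.331), "green": (0.285, 0.630), "blue": (0.150, 0.062)}
for name, err in primary_errors(measured).items():
    print(f"{name:5s}: distance from Rec.709 = {err:.4f}")
```

A distance counts against the display in either direction, which matches the point above that a gamut larger than the standard is as much an error as one that is too small.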
Our motion test uses a variety of test screens and video sources, including the Multimedia with Motion edition of DisplayMate and a number of movie sequences. We use these to judge the quality of the motion on the display, looking for issues with ghosting, shadowing, smearing and other common artifacts.
* For reviews published after March 5, 2011, Motion Smoothness and Motion Artifacting have been combined into a single section called “Motion Performance.” The score displayed in this section is the sum of the scores for Motion Smoothness and Motion Artifacting.
We test the 3:2 pulldown processing (which is also known as 2:3 pulldown) capabilities of the display with the HQV Benchmark test disc. We also evaluate the performance of the display with a video source that has been processed with the telecine effect.
To test the performance of the display with a 24 frames per second signal, we use a PlayStation 3 configured to output a 24 frames per second signal playing a Blu-ray disc.
Our viewing angle test examines the contrast ratio and color shift of the display at different viewing angles. We measure the contrast ratio at 5 degree increments from 0 degrees (straight on) to ±85 degrees. Our scoring for this test is based upon the point at which the contrast ratio has fallen by 50 percent from the maximum we measured at 0 degrees. This means that our ranges of satisfactory viewing angles are very different from the ones the manufacturers publish, which are generally based on the angle at which the contrast ratio falls to 10:1. We feel that this is far too low, since most displays have a face-on contrast ratio of over 1000:1, making a 10:1 contrast ratio unwatchable.
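The 50 percent point can be located from a series of readings as in the sketch below. The function name and the sample contrast ratios are hypothetical, only one side of the screen is handled, and the linear interpolation between the two neighboring 5 degree readings is an assumption added for the example.

```python
def half_contrast_angle(angles_deg, contrast_ratios):
    """Return the angle at which contrast falls to 50% of its 0-degree value.

    angles_deg must start at 0 and increase; contrast_ratios holds the
    reading at each angle. Returns None if contrast never drops that far.
    """
    target = 0.5 * contrast_ratios[0]
    for i in range(1, len(angles_deg)):
        c_prev, c_cur = contrast_ratios[i - 1], contrast_ratios[i]
        if c_cur <= target:
            # Interpolate linearly between the last reading above the
            # target and the first reading at or below it.
            frac = (c_prev - target) / (c_prev - c_cur)
            return angles_deg[i - 1] + frac * (angles_deg[i] - angles_deg[i - 1])
    return None


# Hypothetical readings at 5 degree increments, starting face-on.
angles = [0, 5, 10, 15, 20, 25]
contrast = [1000.0, 900.0, 700.0, 400.0, 250.0, 150.0]
angle = half_contrast_angle(angles, contrast)
print(f"contrast halves at about {angle:.1f} degrees off axis")
```

With these sample readings the 10:1 point used by manufacturers is never reached within the measured range, which illustrates how much wider a published viewing angle can look than one based on the 50 percent criterion.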
We examine how reflective the screen is, considering how much light is reflected from the screen surface in a standard light setting. The points for this test are based upon how much the reflection interferes with the screen image.
We test power consumption using a Watts Up Pro power meter connected to a computer. In order to make the test results comparable between displays with different luminance levels, we calibrate the monitor backlight or other controls to produce a peak luminance of 200 cd/m2. If a display cannot reach that luminance, we get as close as possible. We then test the power consumption playing back a standard video sequence of 10 minutes of 1080i video recorded from a Comcast digital cable signal, measuring the power drawn at several points during the playback and averaging the result.
We then use these figures to calculate the typical cost of using this HDTV, working on the basis of electricity costing 10.7 cents per kilowatt-hour (this is the 12-month average for the cost of electricity in the USA up to April 2008 from the EPA), with the viewer watching the TV for five hours a day, seven days a week, and leaving it in standby mode the rest of the time.
For LCDs we also record the wattage draw with the backlight on the minimum and maximum settings to provide a minimum and maximum figure for the power usage of the display. The weekly and yearly running costs for these figures are calculated in the same way.
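The running cost calculation can be sketched as follows, using the electricity rate and viewing hours stated above. The function name and the wattage figures are hypothetical; a year is taken as 365 days.

```python
RATE_USD_PER_KWH = 0.107   # 10.7 cents per kilowatt-hour
HOURS_ON_PER_DAY = 5.0     # viewing time; the rest of the day is standby


def running_cost(on_watts, standby_watts, days):
    """Electricity cost in US dollars for the given number of days."""
    on_kwh = on_watts * HOURS_ON_PER_DAY * days / 1000.0
    standby_kwh = standby_watts * (24.0 - HOURS_ON_PER_DAY) * days / 1000.0
    return (on_kwh + standby_kwh) * RATE_USD_PER_KWH


# Hypothetical display: 150 W average while playing video, 1 W in standby.
weekly = running_cost(150.0, 1.0, days=7)
yearly = running_cost(150.0, 1.0, days=365)
print(f"weekly: ${weekly:.2f}, yearly: ${yearly:.2f}")
```

The same function can be called with the minimum-backlight and maximum-backlight wattages to produce the low and high cost figures for an LCD.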
Every test that we perform results in a score, which allows us to compare displays directly, even if they are not tested side by side. Our rigorous, scientific scoring system ensures that our results are consistent, accurate and represent the strengths and weaknesses of a display. Many of our scores are open-ended; the score can climb beyond the nominal maximum of 10 as the performance of new models improves. Most reviewers use a fixed 1 to 10 scoring system, but this means that once a product has earned a top score, there is nowhere else to go; the reviewer has to reset the scoring system and start again. Our infinite score system allows us to keep going, so if a new technology comes along that provides radically better color or a more accurate color gamut, we can still score it, and compare it with other models that we tested before the new technology arrived.
* For reviews published after March 5, 2011, Motion Smoothness and Motion Artifacting have been combined into a single section called “Motion Performance.” The score displayed in this section is the sum of the scores for Motion Smoothness and Motion Artifacting. The sections Input Ports, Output Ports, Other Parts and Media have been combined into a single section called “Connectivity.” The score displayed in this section is a sum of the scores for Input Ports, Output Ports, Other Parts and Media. The Photo Playback and Music & Video Playback have been combined into a section called “Local Media Playback.” The score displayed in this section is a sum of the scores for Photo Playback and Music & Video Playback.
To create our overall score for each display, each individual score is multiplied by a weighting, which is based on how important we think the individual factor is to the typical consumer. The weightings are listed below: the majority of the score is based upon the performance tests outlined above, but the features that the display offers also play a part.