Gamma is basically the nonlinear way in which the light output, or luminance, of a TV screen (a.k.a. its intensity) represents the many possible levels of red, green, and blue, the three primary colors in the video signal.
The higher the display's gamma, within a range of roughly 1.0 to 2.5, the darker and more contrasty the image.
Imagine a "ramp" test pattern:
In it, moving from the left edge of the screen to the right, the input signal rises from the minimum possible level, for black, to the maximum possible level, for white, with all three color primaries present in equal amounts. As the signal level goes up, luminance lags behind. Because of this lag, the luminance output at, say, the halfway point across the screen is actually much lower than visual inspection would suggest.
That's because this lag is not apparent to the eye: the eye's lightness perception of the various levels of luminance is itself nonlinear. The human visual system tends to exaggerate the lightness variations at lower levels of luminance and compress those at higher levels. That's why this test pattern seems to reach its middle level of lightness right in the center of the left-to-right sweep.
If input signal level is V — for voltage, with analog signals; for video level, with digital signals — then the screen luminance L of the TV is given by the equation
L = V^Ɣ
where Ɣ, the Greek letter gamma, represents gamma. This function is in effect computed three times for each separate pixel in the image: once for red, once for green, and once for blue.
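To make that per-primary computation concrete, here is a tiny illustrative sketch of my own (not drawn from any particular video standard), with signal levels normalized to 0.0-1.0 and the nominal CRT display gamma of 2.5 assumed:

```python
# Sketch: how a display's gamma maps signal level to luminance,
# computed once per primary for each pixel. Levels are normalized
# to 0.0-1.0; the display gamma of 2.5 is the nominal CRT figure.

GAMMA = 2.5

def display_luminance(v, gamma=GAMMA):
    """Screen luminance for normalized signal level v: L = v ** gamma."""
    return v ** gamma

def apply_gamma(rgb, gamma=GAMMA):
    """Compute the function three times: once for R, once for G, once for B."""
    return tuple(display_luminance(v, gamma) for v in rgb)

# A half-level gray signal yields far less than half luminance:
print(apply_gamma((0.5, 0.5, 0.5)))  # each channel comes out near 0.177
```

Note how a 0.5 input lands near 0.177 on the output side; that is exactly the luminance "lag" the ramp pattern illustrates.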
But why? Why not use the simpler function L = V, where the gamma exponent is effectively 1?
There are several reasons, it turns out. The most basic is that cathode ray tubes are inherently nonlinear: they operate with an intrinsic gamma of 2.5 or thereabouts.
That's the most fundamental reason, then, why signals intended for display on CRTs have always been gamma-corrected. A video camera creates V for each color primary according to a function something like
V = L^(1/Ɣ)

where L is here the scene luminance reaching the camera's image sensor and Ɣ is (approximately) the assumed gamma of the TV.
I say "approximately" because the actual denominator of the exponent in the gamma-correction equation is typically 2.2, not 2.5. (Moreover, for very low values of L the power function shown above is replaced with a straight-line segment, which slightly alters the effective overall exponent; I'll ignore that nuance for now.)
Because the camera's actual gamma-correction exponent is roughly 1/2.2, or 0.45, gamma correction at the video camera serves to almost but not quite neutralize the gamma of a standard CRT display, which is nominally 2.5. Because the neutralization is incomplete, the final displayed image appears to have slightly more contrast than it would if the gamma correction were complete.
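The arithmetic of that partial neutralization is easy to check. This little sketch, my own illustration using the nominal figures from the text, multiplies the exponents end to end:

```python
# End-to-end transfer: scene -> camera gamma correction -> CRT display,
# using the nominal figures from the text (illustration only).

camera_exponent = 1 / 2.2   # ~0.45, the standard correction exponent
display_gamma = 2.5         # nominal intrinsic gamma of a CRT

# Camera output: V = L ** (1/2.2); screen output: V ** 2.5 = L ** (2.5/2.2)
end_to_end = display_gamma * camera_exponent
print(round(end_to_end, 3))  # 1.136: a bit above 1, hence the extra contrast

# A mid-gray scene luminance of 0.5 is reproduced slightly darker:
scene = 0.5
reproduced = (scene ** camera_exponent) ** display_gamma
print(round(reproduced, 3))  # about 0.455, not 0.5
```

An overall exponent of exactly 1.0 would mean complete neutralization; the residual 1.136 is the deliberate contrast boost described above.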
There are several reasons why the camera's gamma-correction exponent doesn't, and shouldn't, fully offset the actual display gamma. One reason is that a video display's luminance levels are tiny fractions of the real-world luminances that arrive at the camera's image sensor from the original scene. Another is that we customarily frame video images in unnaturally dark surrounds and view them in darkened or semi-darkened rooms. A third is that our TVs usually cannot achieve the ultra-wide contrast ratios found in nature.
All three of these reasons result in the need to "goose" image contrast. The best way to do that is to ensure that the gamma correction that is done in the camera does not fully offset the gamma of the display.
Yet according to Dr. Raymond Soneira's four-part series "Display Technology Shootout" in Widescreen Review magazine, Sept.-Dec. 2004, studio CRT monitors used in tweaking video images before they are broadcast or rendered on DVD typically have decoding gammas of 2.2, not 2.5. "Current CRTs," he writes in the second part of the series (WR, Oct. 2004, p. 68), "typically have a native gamma in the range of 2.3 to 2.6, so the gamma of 2.20 for Sony (and Ikegami) CRT studio monitors is actually the result of signal processing."
Whatever it's the result of, a gamma of 2.20 seems to violate the maxim that camera inverse-gamma ought not to fully compensate the gamma of the display. If the camera exponent is approximately 1/2.2, or 0.45, and the display gamma is 2.2, then (roughly speaking, at least) full compensation does occur.
I simply can't yet explain why studio CRT-based monitors, using digital signal processing, alter their native gamma figure, which is nominally 2.5, to 2.2.
I mentioned above that the basic reason why gamma correction is done in the video camera and for all other video sources is that CRTs are inherently nonlinear: the luminance they produce is not a linear function of signal voltage.
I also mentioned that the eye's lightness response to luminance is itself nonlinear, such that lower/darker levels of luminance are exaggerated, in terms of their apparent lightness, while higher/brighter levels are more compressed. That is why the ramp test pattern described above seems to place its middle lightness level right in the center of its horizontal sweep, although the actual luminance at that point is far less than half the luminance of white at the right edge of the ramp.
By a strange coincidence, the eye's own version of "gamma correction," a perceptual trick by which its lightness response is not a linear function of luminance, approximately matches that done in video signal encoding to offset the gamma inherent in CRT picture tubes!
That is, a scientist's graph of the eye's lightness response to luminance has very close to the same shape (and thus the relevant equation has approximately the same exponent, 0.4 or 1/2.5) as a graph of the video gamma correction function.
As a result, a gamma-corrected video signal bears an approximately linear relationship to perceived lightness — though, as I have already said, not to measurable luminance. This is a second reason why gamma correction is done in video. A camera that does gamma correction responds to luminance patterns focused on its image sensor much as the human visual system responds to luminance patterns focused on its retina.
Another way of stating it is to say that gamma-corrected video has perceptual uniformity. (I am drawing here from Charles Poynton's excellent book, Digital Video and HDTV: Algorithms and Interfaces.) Each successive increase in gamma-corrected digital code value over the available range from 0-255 (or 16-235) boosts perceived lightness (though not physical luminance) by the same barely detectable amount. (A similar statement is true for analog video signals expressed in IRE units from 0-100, though each step up in signal level — say, from 50 IRE to 51 IRE — involves a more-than-minimally-detectable boost in lightness.)
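That uniformity claim can be sanity-checked numerically. In this rough sketch of my own, the CIE L* formula stands in for perceived lightness and a display gamma of 2.5 is assumed; neither assumption appears in the text itself:

```python
# Sketch: successive gamma-corrected code values yield roughly equal
# steps in perceived lightness. The CIE L* formula is used here as a
# stand-in for the eye's lightness response (an assumption of this
# illustration, not something the text specifies).

def luminance(code, gamma=2.5):
    """Relative luminance a gamma-2.5 display produces for an 8-bit code."""
    return (code / 255) ** gamma

def lightness(Y):
    """CIE L* lightness for relative luminance Y (valid for Y > 0.008856)."""
    return 116 * Y ** (1 / 3) - 16

# One-code steps low on the scale and high on the scale:
low_step = lightness(luminance(51)) - lightness(luminance(50))
high_step = lightness(luminance(201)) - lightness(luminance(200))

# Both steps come out as small fractions of one L* unit, similar in size.
print(round(low_step, 2), round(high_step, 2))
```

By contrast, one-code steps in a linear-light encoding produce a several-times-larger lightness jump near the bottom of the scale than near the top, which is precisely the non-uniformity gamma coding avoids.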
Perceptual uniformity in gamma correction works out nicely for two reasons. First, the visibility of video noise, especially troublesome in darker parts of the scene having luminances at the low end of the available range, is effectively minimized.
In the absence of display nonlinearity and gamma correction in the camera, digital video might instead be encoded in a "linear-light" domain. If 8 bits per primary color per pixel were used, the range of available code values would be (at most) 0-255. Suppose the "correct" code value for a gray pixel (ignoring color) were 50, but due to noise in the circuits of the video camera, it is instead encoded as 51. That seemingly tiny difference in code value would produce a 51/50 = 102/100 = 1.02 = 102% ratio of actual luminance to intended luminance.
That is, the actual luminance on the monitor screen would be 2% higher than it ought to be. But differences in luminance of just 1% can be detected by the eye, at least under certain conditions. So the noise in the non-perceptually uniform, linear-light signal is apt to be noticeable.
However, gamma correction of the luminance at the camera's image sensor into a perceptually uniform signal domain, according to a roughly 1/2.5 power function, also suppresses what I'll call the low-end noise. The luminance-plus-noise quantity which in the above example prompted an erroneous code value of 51 might, with gamma correction, yield a fractional value like 50.4. But since only integer codes are allowed, this would be rounded to 50, and the noise would disappear!
A similar logic also applies to analog video signals. Without gamma correction, low-end noise would be more of a problem than it is, simply because the eye is more sensitive to lightness variations at the low end of the tonal range than at the high end.
Noise at the high end of the tonal scale, or lightness range, is much less of a problem. Again, imagine an 8-bit tonal scale with codes 0-255. If camera noise takes a "correct" pixel value up one level, from 200 to 201, the ratio is just 201/200 = 100.5/100 = 1.005 = 100.5%. A mere 0.5% rise in luminance is not detectable by the eye, which even under the best of circumstances needs about a 1% jump in luminance for differences to be visible.
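The two bits of arithmetic above (codes 50 to 51, and 200 to 201) can be checked in a few lines; the 1% visibility threshold is the figure the text itself uses:

```python
# The linear-light noise arithmetic from the text, checked directly.
# A one-code error changes luminance by a ratio of (n+1)/n, so the same
# noise is far more visible at the bottom of the scale than at the top.

VISIBILITY_THRESHOLD = 0.01  # ~1% luminance change is just detectable

def noise_ratio(code):
    """Fractional luminance error from a one-code bump in linear-light coding."""
    return (code + 1) / code - 1

print(round(noise_ratio(50), 3))   # 0.02: a 2% jump, visible
print(round(noise_ratio(200), 3))  # 0.005: a 0.5% jump, below threshold

print(noise_ratio(50) > VISIBILITY_THRESHOLD)   # True
print(noise_ratio(200) > VISIBILITY_THRESHOLD)  # False
```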
But there is a separate problem which affects the middle portion and high end of the tone scale in digital video. Poynton calls it the "code 100" problem, and it is the second reason why digital video needs to be gamma-corrected into a perceptually uniform domain.
The "code 100" problem has to do with the need to provide a minimum of a 30:1 contrast ratio between the brightest-possible parts of a scene and the darkest-possible parts. Without gamma correction, the codes from 0 to 100 in an 8-bit encoding scheme, with values from 0-255, have to be thrown out, for reasons similar to the discussion of noise above. That is, each successive code increment (say, code 50 to code 51) provides a much-more-than-barely-detectable boost in output luminance, in a linear-light encoding system.
Accordingly, what should be shades of color or gray that blend indistinguishably into one another instead exhibit banding or false contouring: visible striations that were not present in the original subject matter.
Thus, the codes from 0-100 have to be thrown out and never used. Black has to be identified with code 100, not code 0. (Remember, we are talking here about a hypothetical linear-light method of 8-bit digital encoding, not what is actually done in the real world of digital video.)
If white is at code 255 and black is at code 100, then the ratio between the two is only 255/100, or 2.55:1. That's way too low, when 30:1 is considered the minimum acceptable ratio.
In order to get a contrast ratio that meets or exceeds 30:1, you have to go to 12-bit linear-light coding. Then white is at code 4095, not 255, and black is at 100, for a full 40.95:1 contrast ratio.
But many of the available codes are in effect wasted; they're not perceptually useful. For example, the eye can't see the difference between any two codes in the range from 4001 to 4040, because the luminance associated with the code at the top of the range is less than 1% above that associated with the code at the bottom.
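The bit-depth and contrast-ratio arithmetic of the last few paragraphs can likewise be verified directly. In this quick sketch, the 30:1 target and the black-at-code-100 floor are the text's own figures:

```python
# The "code 100" arithmetic from the text: in a linear-light scheme where
# black must sit at code 100, how many bits does a 30:1 contrast ratio need?

def contrast_ratio(bits, black_code=100):
    """White at the top code of the scale, black pinned at code 100."""
    white_code = 2 ** bits - 1
    return white_code / black_code

print(contrast_ratio(8))   # 2.55  -- 255/100, far too low
print(contrast_ratio(12))  # 40.95 -- 4095/100, acceptable

# Smallest bit depth that clears the 30:1 minimum:
bits_needed = next(b for b in range(8, 17) if contrast_ratio(b) >= 30)
print(bits_needed)  # 12

# And the wasted codes near the top: 4001 -> 4040 is still under a 1% jump,
# so the eye cannot distinguish any two codes in that span.
print(4040 / 4001 - 1 < 0.01)  # True
```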
Gamma-correcting the signal into a perceptually uniform lightness domain allows the same amount of perceptually useful information, with a similarly acceptable contrast ratio, to be shoehorned into pixels of just 8 bits per primary color, not 12 bits. The approximately 1/2.5 power function that converts camera luminance amounts into a gamma-corrected video signal effectively "squeezes out" the wasted code levels. This, then, is the solution to the "code 100" problem.
It solves that problem while also dealing with the low-level noise that would also plague a linear-light 8-bit system, if codes below 100 weren't tossed out. That's why Poynton says gamma correction allows video signals to make maximum effective use of digital channel bandwidths.
So we have seen several reasons for gamma-correcting a video signal:
- To precompensate for the nonlinearity of a CRT
- To suppress low-level video noise
- To code for perceptual uniformity
- To avoid the "code 100" problem
- To avoid wasted digital codes
- To reduce the number of bits needed per pixel
- To maximize effective use of the bandwidth of the digital channel
- To maximize effective use of the capacity of a digital recording device
Many of these are, of course, merely ways of saying the same things in different words, when you come right down to it. The first three apply to analog and digital video, while the others are digital-specific. In fact, the last six, all of which have to do with perceptual uniformity, show why gamma correction would need to be done even if the luminance-to-voltage curve of a CRT were perfectly linear.
That is, gamma correction would have to be done even if a CRT display's native gamma exponent were a linear 1.0, rather than around 2.5, simply because in the digital video age coding for perceptual uniformity pays off so handsomely.
At this point in the discussion we may justifiably take a moment to thank our lucky stars. For it is an extremely fortunate coincidence that the gamma-correction equation which best imposes perceptual uniformity on the digital video signal is for all intents and purposes identical to the equation which best precompensates the gamma of a CRT picture tube!
If this were not so, video engineers would have to choose between a camera transfer function with an exponent which best precompensates a CRT's inherent gamma (which under these hypothetical assumptions would not be 2.5) and one which, in Poynton's words on p. 258, "makes maximum perceptual use of the channel." The latter constraint, says Poynton, requires video coding of an image in such a way as "to minimize the visibility of noise, and to make effective perceptual use of a limited number of bits per pixel" — while at the same time sidestepping the "code 100" problem as it relates both to a too narrow contrast ratio and to wasted code values.
But since a CRT electron gun's intrinsic response to signal voltage very neatly mimics the eye's perceptual response to scene luminance, both gamma-correction goals can be served by the same camera transfer function!