10.2 Pixels and Bits
Occasionally it will be useful to
remember that pixels—whether those seen on your computer screen or
those in the underlying image—are represented by bits. For those having only a
moderate understanding of computer architecture, a very brief
introduction to bits and color representation will be worthwhile.
A bit is
the fundamental unit of storage in a digital computer. There are
only two numeric values that can be stored in a single bit: 0 and
1. A ‘0’ bit represents the number zero, and a ‘1’ bit represents the number
one. To represent numbers greater than one, we need to use
multiple bits in combination. For example, the number 1001001 in
binary represents the number 73 in decimal (decimal is the number system that
we use in everyday life, while binary
is the number system used internally by computers). For a given
number of bits, say eight, there is a smallest
number that can be
represented, and a largest
number that can be represented. The smallest 8-bit binary
number (in the encoding most often used for representing colors) is
00000000, which represents zero, while the largest is 11111111, which
represents 255. Thus, given only eight bits, we can represent
only 256 different numbers: 0, 1, 2, 3, 4, ... , 252, 253, 254, and
255. A 16-bit binary number, on the other hand, can
represent the whole numbers between 0 and 65535, which is a vastly
larger
range than 0 to 255, even though sixteen bits is only twice as many
bits as eight.
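If you'd like to verify these conversions yourself, the little sketch below shows the same facts using nothing but Python's built-in functions; the numbers are the ones from the text.

```python
# Binary/decimal conversions from the text, using Python's built-ins.
print(int("1001001", 2))   # 73: the binary number 1001001 in decimal
print(bin(73))             # '0b1001001': and back again

# The largest value representable in n bits is 2**n - 1:
print(2**8 - 1)            # 255   (8 bits: 256 values, 0 through 255)
print(2**16 - 1)           # 65535 (16 bits: 65536 values, 0 through 65535)
```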
Fig.
10.2.1: Bits per channel. Each pixel is encoded via blending of
three channels: red,
green, and blue. Each additional bit available for encoding pixel
colors doubles the
number of hues that can be specified. Using more bits can result
in smoother color gradients.
Every time you
increase the number of bits you’re
working with by just one bit, you double
the number of values that can
be encoded. This is extremely relevant to digital
imaging, since binary numbers are used internally by the computer to
represent images. Each pixel of a digital image is represented by
a binary number, typically either an 8-bit number or 16-bit
number. For an 8-bit image—that
is, one in which each pixel component
(i.e., the red, green, and blue
components in an RGB color space) is encoded by an 8-bit binary
number—each pixel component can assume one of at most 256 different
values, while for a 16-bit image each pixel component can assume one of
65536 different values. These red, green, and blue pixel
components are then blended into a single color hue as represented
(with greater or lesser fidelity) on the final rendering medium.
The practical consequence of this is that 16-bit
images enjoy a much, much larger color space than their 8-bit
counterparts, and therefore offer both greater color fidelity and
greater resilience to the accumulation of certain artifacts that can
show up in the image after repeated use of certain postprocessing
operations. As we’ll see shortly, many image processing tasks
involve stretching the image’s histogram in various ways (histograms were introduced in
section 2.6), and this stretching can cause
visible artifacts in the
image; the occurrence of these artifacts is often greatly reduced by
simply working in a 16-bit color space rather than 8-bit. We’ll
show how to do that shortly. First, let’s consider some concrete
examples of how bit depth (number of bits per channel used to
encode an image) can affect image quality.
In the figure below we show a warbler image rendered
in 1, 2, 3, 4, 5, and 8 bits per
channel. The per channel
part refers to the fact that RGB images have three distinct channels: red, green, and blue (hence RGB). These three
channels get blended together to produce the other colors of the
rainbow that are needed in order to render a full-spectrum image.
As you’ll notice below, larger numbers of bits lend a very definite
improvement to image quality.
Fig.
10.2.2: Using more bits can result not only in smoother color
gradients in background regions, but also more details in the bird,
since details are encoded via finely-scaled color differentials.
For the first image, at 1 bit per channel, there are a total of three
bits (one each for the red, green, and blue channels). Because
each bit can only be in the 0 or 1 state, and because the three bits
can vary independently, there are a total of 2³ = 8
different color hues that can be rendered; as is attested by the image
above, this isn’t enough hues to produce an aesthetically pleasing
image. In the second and third images (2 and 3 bits per channel),
there are significantly more details visible, and the colors begin to
approach what we’d expect to see in nature, though the backgrounds are
severely banded, due to what we call posterization.
As we’ll see shortly, posterization can be a problem even when a
relatively large number of bits are used to encode the image.
Note especially the difference in background
smoothness for the 5 bits-per-channel and 8 bits-per-channel images in
the figure above. For the 5 bits-per-channel image, it’s still
possible to see banding, or posterization, in the background regions of
the image, while for the 8-bit image the banding should be nonexistent
on most computer screens. The most important concept to
understand here is that for a given image, a larger number of bits can
give rise to appreciably smoother background gradients than some
smaller number of bits, and also more subtle details in the bird.
As we’ll see shortly, this issue comes into play not only for smaller
numbers of bits, but also for images that have been subjected to
certain forms of digital manipulations (especially repeated
applications of those digital manipulations), since some postprocessing
operations reduce the effective bit
utilization even when many bits are technically available in the
color space.
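You can simulate this banding yourself. The following sketch (Python, using the NumPy library; it's a toy model for illustration, not anything Photoshop actually does internally) quantizes a smooth 8-bit gradient down to various bit depths and counts how many distinct values survive at each depth.

```python
import numpy as np

def quantize(channel, bits):
    """Collapse an 8-bit channel down to the given number of bits,
    keeping the 0-255 scale so the banding would be visible on screen."""
    step = 256 // (2 ** bits)          # e.g. 3 bits -> 8 levels, steps of 32
    return (channel // step) * step    # snap every value to its bin

# A smooth gradient, standing in for an out-of-focus background:
gradient = np.linspace(0, 255, 1024).astype(np.uint8)

for bits in (1, 2, 3, 4, 5, 8):
    q = quantize(gradient, bits)
    print(bits, "bits:", len(np.unique(q)), "distinct values")
```

At 5 bits the gradient collapses to just 32 distinct levels, which is the banding visible in the figure above; at 8 bits all 256 levels survive.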
Before we continue, we need to review the definition
of a histogram. In the
following figure we show a photograph with its intensity histogram plotted in the
lower-left corner. The horizontal (x) axis of this graph corresponds
to pixel brightness, while the vertical (y) axis corresponds to the
frequency, or prevalence, of each brightness in the image. That
means that for a given pixel brightness measure (position along the x-axis), the height of the graph at
that point indicates how many pixels in the image have that
brightness.
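In code, building such a histogram amounts to nothing more than counting pixels per brightness value. Here's a minimal sketch (Python/NumPy; the random array is just a stand-in for a real grayscale image):

```python
import numpy as np

# Stand-in for an 8-bit grayscale image (for an RGB image you would
# first reduce the three channels to a single luminance value):
image = np.random.default_rng(0).integers(0, 256, (400, 600), dtype=np.uint8)

# counts[b] = number of pixels whose brightness is exactly b; this is
# the height of the histogram at position b on the x-axis.
counts = np.bincount(image.ravel(), minlength=256)
print(counts.sum())   # 240000 = 400 * 600; every pixel lands in one bin
```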
Fig. 10.2.3:
An image and its histogram. The horizontal
axis of the histogram corresponds to
pixel brightness; the vertical axis indicates the relative number of
pixels in the image that have that
brightness. Many postprocessing operations affect the
histogram. A large spike at either end of the
histogram indicates clipping, which can result in loss of detail.
In this particular example, the histogram indicates that most of the
image pixels are of medium brightness, though the left tail of the
distribution is fatter than the right tail, indicating a slight bias
toward the prevalence of darker pixels in the image. Notice also
that there is a peak in the graph right at the leftmost edge of the
histogram, indicating that some clipping of dark colors has occurred
(most likely in this warbler’s black throat patch, where details have
been largely obliterated due to the clipping). A much smaller
peak at the rightmost edge of the graph indicates a tiny amount of
clipping of detail in bright regions, which in this case appears to be
entirely negligible. Except for the issue of clipping, I almost
never consult the histogram during post-processing (or even when
shooting in the field—though I do rely crucially on my camera’s
blinking highlight alerts).
However, observing the effects of various post-processing operations on
the histogram can be very useful as you begin to learn about the many
caveats involved in applying these various operations.
In the figure below we show a histogram similar to
the one from the previous figure, but this time we also show the
per-channel (red, green, blue) decomposition below the main
histogram. As you can see, the clipping at the leftmost end of
the spectrum is confined to the blue channel. Depending on the
image you’re processing and the distribution of colors among the
subject and background regions of the image, you may or may not care
whether the blue channel (for example) is clipped at the left (or
right). The point is that when clipping is observed in the
composite histogram, you may want to view the individual channels to
see exactly what’s going on. As we’ll see in succeeding chapters,
repairing clipped shadows or highlights is sometimes possible, though
with greater or lesser amounts of effort, depending on the amount and
location (in the image) of the clipping.
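If you ever want to check the channels numerically rather than by eye, a few lines of Python/NumPy will do it; this is just a convenience sketch, not a Photoshop feature:

```python
import numpy as np

def clipping_report(rgb):
    """Report, per channel, the fraction of pixels pinned at either
    extreme of an 8-bit H x W x 3 image -- the spikes at the left and
    right edges of the histogram."""
    n = rgb.shape[0] * rgb.shape[1]
    for i, name in enumerate(("red", "green", "blue")):
        ch = rgb[..., i]
        dark = np.count_nonzero(ch == 0) / n
        bright = np.count_nonzero(ch == 255) / n
        print(f"{name:5}  dark-clipped: {dark:.2%}  bright-clipped: {bright:.2%}")
```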
Fig. 10.2.4:
Color components of a histogram. Top: the composite
histogram.
Bottom three panes: red, green, and blue components making up the
composite
histogram. In this example the blue channel is clipped at the
left end.
In the histograms shown above, the x-axis was said to represent pixel intensity (meaning
brightness). You can also think of individual points along the x-axis as corresponding to
particular bit combinations—i.e., particular binary numbers encoding
pixel intensities in the three RGB channels. Keep this in mind as
you consider the histograms for the two images shown in the figure
below. The top part of the figure shows an image before a Levels operation is applied (in
Photoshop), while the bottom shows the image after the operation has
completed.
Fig. 10.2.5:
Histogram fragmentation. Top: an image before histogram
adjustment.
Right: stretching the histogram via the Levels tool in Photoshop.
Bottom: the image
after the adjustment. Notice the white gaps in the histogram;
these indicate ranges
of pixel values that are not used in the image. Successive
operations can widen these
gaps. If they get wide enough, the result can be visible
posterization—obvious color
discontinuities in regions of the image that should appear smooth (not
shown here).
As you can see, the bottom image
is somewhat brighter, and on some monitors may look a bit more
pleasing. However, the histogram for the bottom image has some
disturbing characteristics. Whereas the top image’s histogram is
relatively smooth and continuous, the bottom histogram is fragmented due to the insertion of
gaps in the graph. These gaps correspond to bit combinations that
do not occur anywhere in the image—that is, they represent colors (or
in this case intensities) that aren’t used in the image. For a
small number of such missing colors, the image aesthetics may be
entirely unaffected, especially for small print sizes. However,
as more image manipulations are performed and as these thin slices turn
into wide valleys of missing colors/intensities in the histogram,
visual manifestations can appear in the form of posterization.
While the image may still technically be encoded in 8 or 16 bits, those
bits are not being fully utilized, because some (perhaps many) bit
combinations are going unused. As noted earlier, posterization
typically shows up in background regions of an image that should
feature a smooth gradient, but instead appear choppy and banded.
What causes this effect? In short, image
processing causes it. Because there are only a fixed number of
discrete “bins” into which pixel values can fall,
and because image processing works by manipulating these hues and
therefore moving pixels from one bin to another, there is always the
potential to adversely affect the overall distribution of pixel hues
(i.e., the histogram). The main culprits, however, are exposure
transformations (such as Levels
and Curves in Photoshop) that
stretch or compress portions of the histogram. A useful mental
exercise is to consider what would happen if you first apply a
transformation that would compress the histogram into half its width,
and then expand it back to its original extent. Because of the
discrete nature of histogram bins, when the histogram is expanded back
to its original range you’d see a considerable loss of continuity in
the histogram.
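That mental exercise is easy to run for real. In the sketch below (Python/NumPy), we take every possible 8-bit intensity, compress the range to half its width with integer arithmetic, stretch it back, and count the survivors.

```python
import numpy as np

values = np.arange(256, dtype=np.uint8)   # every possible 8-bit intensity

compressed = values // 2                  # squeeze the range into 0..127
restored = compressed * 2                 # stretch it back out to 0..254

print(len(np.unique(restored)))           # 128: half the bins are now empty
```

Every other bin comes back empty, producing exactly the comb-like gaps seen in the fragmented histogram of the figure above.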
At this point it’s worth revisiting a related issue
that was noted much earlier in this book: the fact that brighter pixels
are encoded with more bits (on average) than darker pixels. This
was one of the justifications for the ETTR (Expose To The Right) and BETTR (Bird Exposed To The Right)
philosophies expounded earlier (section 6.2).
In the histograms
shown above, each pixel intensity was represented by a point on the
horizontal (x) axis of the
histogram. In the computer, this axis is split into a small
number of bins, or
intervals. All of the pixel intensities falling in a single bin
share the same number of bits in their encoding, but different bins use
different numbers of bits for their pixels. What’s important here
is that bins further to the right on the histogram use more bits for
their pixels than bins further to the left, so they can represent more
distinct color shades. Each additional bit used in pixel
encodings doubles the number of color shades (hues) that can be
represented. Now, consider what happens when you underexpose an
image and then try to compensate for the underexposure by increasing
the brightness via software. Pixels from darker bins will be
re-mapped to lighter bins, as desired, but groups of pixels moved into
the brighter bins won’t magically experience a gain of information: if
there were only 32 different values represented in that dark bin, there
will still be only 32 different values when they get re-mapped to the
lighter bin, even if the lighter bin is capable of representing 64
distinct values. In this case we again have poor bit
utilization. This is why it’s better to expose to the right (in
the camera): if you later need to darken the image in software, pixels
will be moving from light bins to darker bins, and there won’t be any
information deficit involved in performing that mapping (just make sure
you don’t clip the highlights!). But going the other way
(brightening dark pixels artificially) often produces unpleasant image
artifacts, such as posterization or noise.
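The arithmetic of that information deficit is easy to see in a sketch (Python/NumPy; the bin sizes echo the 32-versus-64 example above):

```python
import numpy as np

# A dark bin holding only 32 distinct values (intensities 0..31):
dark_values = np.arange(32, dtype=np.uint16)

# Brighten by one stop (double the values). The pixels now sit in a
# range with room for 64 distinct levels...
brightened = dark_values * 2

# ...but no new information appears: still just 32 distinct values,
# with an unused gap at every odd level.
print(len(np.unique(brightened)))   # 32
```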
Fig. 10.2.6:
Working in 16-bit mode can be useful even if your original
image is only 8-bit (e.g., a JPEG), because it reduces the magnitude and
impact of rounding errors in the computer.
One thing you can do to maximize the number of bits
available to you is to shoot in RAW instead of JPEG. Whereas JPEG
encodes images using only 8 bits per channel, most cameras’ RAW formats
utilize 12 or 14 bits per channel. When these are imported into
Photoshop, they’ll be represented internally in 16 bits per
channel. Note that even if your original image is an 8-bit JPEG,
it can be useful to convert it to a 16-bit file before performing your
postprocessing. As shown in the figure above, converting to
16-bit is as easy as selecting a menu option in Photoshop.
Converting an 8-bit image to 16-bit doesn’t actually add any more
information to the image, but it does give the computer more “headroom” for working with the image when
performing mathematical transformations on the pixel values. Many
people have found that this reduces the incidence of
posterization. Although there are a few operations that Photoshop
can only perform in 8-bit, I’ve rarely found that to be a problem; you
can always convert back to 8-bit via the same menu in Photoshop if
necessary.
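Here's a small demonstration of that headroom effect (Python/NumPy; a toy model of repeated exposure edits, not Photoshop's actual arithmetic). We darken every 8-bit level to 40% and then brighten it back, once rounding to 8 bits at every step and once after scaling up to 16 bits first.

```python
import numpy as np

src = np.arange(256, dtype=np.uint8)                    # all 8-bit levels

# Staying in 8 bits: every intermediate result is rounded to 256 levels.
darker8 = np.round(src * 0.4).astype(np.uint8)
back8 = np.round(darker8 / 0.4).clip(0, 255).astype(np.uint8)

# Converting to 16 bits first (scaling 0..255 up to 0..65535):
src16 = src.astype(np.uint16) * 257
darker16 = np.round(src16 * 0.4).astype(np.uint16)
back16 = np.round(darker16 / 0.4).clip(0, 65535).astype(np.uint16)

print(len(np.unique(back8)))    # ~103: over half the levels have collapsed
print(len(np.unique(back16)))   # 256: every original level is still distinct
```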
An additional thing to note about JPEG files is that
they utilize a lossy compression
scheme, which can result in loss of image information during the
conversion to JPEG. If, for example, you convert a RAW image to
8-bit format, or if you load an image from an 8-bit TIFF file, and then
save it as a JPEG file, when you load the JPEG back into Photoshop
later you might notice some subtle distortions as compared to the
original uncompressed 8-bit image. JPEG files are typically ideal
for images that are to be displayed on a web page (i.e., on the
internet), because the compression scheme which they utilize often
results in small file sizes and therefore fast data transfer over the
internet. For your personal image archives, however, I recommend
using either RAW files or some other lossless format such as TIFF (or
Photoshop’s proprietary format, PSD). These files will be larger
and will therefore fill up your computer’s hard drive faster, but just
remember that larger files have more bits, and more bits means more
information (or at least the potential for more information).
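If you'd like to see JPEG's lossiness for yourself, the round trip takes only a few lines with the Pillow and NumPy libraries (the file names here are placeholders for your own files):

```python
from PIL import Image
import numpy as np

original = Image.open("original.tif").convert("RGB")   # lossless source
original.save("copy.jpg", quality=85)                  # lossy JPEG encode
reloaded = Image.open("copy.jpg").convert("RGB")

diff = np.abs(np.asarray(original, dtype=int) - np.asarray(reloaded, dtype=int))
print(diff.max(), diff.mean())   # nonzero: the round trip altered pixels
```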
Let's now very briefly revisit the issue
of color spaces. As
mentioned earlier, when working in an RGB color space, each pixel is
encoded by some number of bits (typically 8 or 16) per channel, with
the three channels being red, green, and blue. In other color
spaces (such as CMYK), colors will generally be encoded using a
different mapping, so that a particular bit string (e.g.,
1011101001011011) may encode a completely different color in different
color spaces. Photoshop renders this issue largely transparent,
since it converts the bit representations automatically for you when
you switch between color spaces, but it’s worthwhile to at least be
aware of the fact that a single bit encoding for a pixel can represent
either subtly or drastically different colors in different color
spaces. One context in which this becomes a practical issue is
when dealing with different computer platforms and rendering
devices. The same image may appear lighter or darker, or more or less saturated (see section 10.4), on two
different computer systems, due to the slightly different ways in which
those systems map pixel values to actual colors. This is of
particular concern for users of Apple Macintosh computers, which, prior
to OS X version 10.6, applied a different gamma transformation to
displayed images than most other computer systems did; the issue of
gamma is discussed in section 16.2.4.
Finally, it’s worthwhile at this juncture to refine
a concept introduced much earlier in this book. In section 2.5 we
described how photons are collected by the silicon photosites that
reside on the imaging sensor, and how these photons are counted and
their counts converted into analog pixel intensity signals that are
then converted into digital numbers and stored on the memory
card. What we didn’t explain was how colors are detected and
represented. Briefly, each photon, which can be interpreted
equivalently as an electromagnetic wave of a particular frequency,
encodes a particular value in the frequency spectrum. The
biological apparatus in the human eye and brain converts these
frequencies into what we perceive psychologically as colors. To
achieve this conversion (of frequency into color) in a digital sensor,
engineers have come up with a solution based on what’s known as the Bayer pattern, which is illustrated
below. This pattern would be repeated across the imaging sensor
so as to cover all photosites.
Fig. 10.2.7: The Bayer pattern.
Different photosites in a Bayer-style sensor
measure only one primary color. Interpolating
these into continuous pixel values is the job of the
demosaicing algorithm which is applied later
during RAW conversion. Note that the pattern
shown in this figure would be repeated across
the sensor so that each cell corresponds to one
photosite.
Each cell in the Bayer pattern corresponds to
one photosite. The color of a cell indicates which frequencies
are permitted to pass through the Bayer filter into the photosite lying
beneath the cell. Thus, red photosites count only those photons
having a frequency that falls within (what humans perceive as) the red
portion of the color spectrum, and similarly for the blue and green
sites. This of course reduces sample sizes and therefore
introduces a potential source of noise. More relevant to the
current discussion is the fact that only three colors are registered
under a Bayer system: pure red, pure green, and pure blue. In
order to reconstruct an image composed of many different shades and
hues, an algorithm has to be applied which interpolates between the
neighboring red, green, and blue intensities measured under the Bayer
pattern. This process is referred to as demosaicing, and has a number of
subtle implications. For example, since noise may be more
prevalent in one of the three channels than the others, de-noising
algorithms operating on the RAW Bayer values can be more effective than
those applied to the post-interpolation data. The important point
is that RAW files contain the original red, green, and blue values
measured directly from the Bayer pattern (after ISO amplification and
digitization), and therefore contain more information than even a
high-bit TIFF file. As we’ll see in the next chapter, this is one
of the reasons why it’s beneficial to perform certain image processing
tasks during RAW conversion, rather than post-conversion.
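To make the demosaicing idea concrete, here is a toy sketch (Python/NumPy) that builds an RGGB mosaic from a full-color array and fills in the missing values by naive neighborhood averaging. Real RAW converters use far more sophisticated interpolation; this shows only the principle.

```python
import numpy as np

def bayer_masks(h, w):
    """Boolean masks marking where each channel is measured (RGGB layout)."""
    m = np.zeros((h, w, 3), dtype=bool)
    m[0::2, 0::2, 0] = True                      # red photosites
    m[0::2, 1::2, 1] = True                      # green photosites (even rows)
    m[1::2, 0::2, 1] = True                      # green photosites (odd rows)
    m[1::2, 1::2, 2] = True                      # blue photosites
    return m

def demosaic(mosaic, masks):
    """Naive bilinear demosaic: keep measured values, fill the rest with
    the average of the measured values in each 3x3 neighborhood."""
    h, w = mosaic.shape
    out = np.zeros((h, w, 3))
    for c in range(3):
        vals = np.pad(np.where(masks[..., c], mosaic, 0.0), 1)
        hits = np.pad(masks[..., c].astype(float), 1)
        num = sum(vals[i:i+h, j:j+w] for i in range(3) for j in range(3))
        den = sum(hits[i:i+h, j:j+w] for i in range(3) for j in range(3))
        out[..., c] = np.where(masks[..., c], mosaic, num / np.maximum(den, 1))
    return out

# Usage: flatten a full-RGB test image to a mosaic, then rebuild it.
rgb = np.random.default_rng(1).random((8, 8, 3))
masks = bayer_masks(8, 8)
mosaic = np.where(masks, rgb, 0.0).sum(axis=-1)  # one measured value per site

rebuilt = demosaic(mosaic, masks)
print(rebuilt.shape)                             # (8, 8, 3)
print(np.abs(rebuilt - rgb).mean())              # nonzero: interpolation guesses
```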
Though the Bayer pattern is currently dominant among
consumer imaging devices, other patterns are in use and/or currently
under development. One such pattern splits the red, green, and
blue channels into three spatially separated layers of silicon.
Other patterns permit the binning
(pooling) of photon counts between neighboring photosites to reduce
sampling error and therefore reduce noise, at the cost of resolution.
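The statistical benefit of binning is easy to verify in a sketch (Python/NumPy, modeling photon arrivals as Poisson counts):

```python
import numpy as np

# Photon counts at each photosite, with Poisson "shot" noise:
counts = np.random.default_rng(2).poisson(100, (512, 512)).astype(float)

# 2x2 binning: pool each block of four neighbors into one pixel.
binned = (counts[0::2, 0::2] + counts[0::2, 1::2] +
          counts[1::2, 0::2] + counts[1::2, 1::2]) / 4

print(counts.std())   # ~10: shot noise of a single photosite
print(binned.std())   # ~5: averaging four sites roughly halves the noise
```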