Light, Vision and Imagery
By: David Wood - November 28, 2001
Considering how important light and our sense of vision are to us human beings, it is not surprising that for thousands of years humans have been finding new ways to store information and ideas visually. Cave paintings were among the first pictures made by human beings, drawn with charcoal or scraped directly into the rock wall. Some cave paintings are 15-20 thousand years old and depict creatures like the mammoth, which became extinct around 10,000 years ago. Today we can study these pictures for clues about human behavior and what the world was like back then. The fact that we can do this, thousands of years after the people who witnessed these events died, shows just how significant this new skill was to the human race.
Over time drawn images evolved into writing, where pictures no longer simply represented real-world images or events, but instead were symbols that represented meanings or phonetic sounds. The best-known ancient writing system is Egyptian hieroglyphics.
Hieroglyphics was developed around 4000 BC and is made up of signs that represent single sounds, combinations of sounds, or entire words. The meaning of hieroglyphics was lost for over 1500 years, until Napoleon's men discovered the Rosetta Stone in 1799. The Rosetta Stone displays the same inscription in hieroglyphic, Demotic and Greek. Since Egyptologists were already familiar with Greek, they could finally learn the meaning of the hieroglyphic symbols. Because of the vast quantity of Egyptian literature, we now know more about Egyptian society than about most other ancient cultures.
The Egyptians also had a way of carrying the written word around with them - paper. "Paper" derives from "papyrus", a plant that grew along the lower reaches of the Nile River. The Egyptians would harvest this plant and peel it into strips; the strips were then layered together and pounded to form a single sheet. Around AD 105 a Chinese man called Ts'ai Lun invented what we know today as paper. He took tree bark and bamboo fibers, mixed them with water, then pounded the mixture into a pulp, which he poured onto a piece of cloth and let set. The water drained away, the fibers dried, and paper was the result.
Until a few thousand years ago the only information a human being could learn came from their own experiences and what their parents taught them. Human knowledge was poorly communicated and often forgotten. Once people began to draw and write what they learnt human knowledge began to accumulate and a permanent record of human history was started. Today we can walk into a library, pick up a book, and learn from a person who died hundreds or even thousands of years ago.
"A picture is worth a thousand words" as they say, and for many things this is true. The written word is useful for recording ideas, emotions and dialog, but when representing a visual image of a scene or object it is still easier to simply draw it than it is to describe it through writing. The problem with drawing a picture is that it takes time, and the more realistic the picture, the more time it takes to paint. Few people even had the skills to paint a complex scene with a high degree of realism.
In 1727 Johann Heinrich Schulze discovered that silver nitrate darkened on exposure to light. He was not researching a way of making photographic images, but this was the first breakthrough necessary to make photography possible. In 1802 Thomas Wedgwood (son of Josiah Wedgwood, who built the world-famous Wedgwood pottery business) worked with Sir Humphry Davy and wrote a paper entitled "An account of a method of copying paintings upon glass and of making profiles by the agency of light upon nitrate of silver." Wedgwood and Davy almost made the breakthrough into modern photography; however, they never solved the problem that the silver nitrate would continue to expose under light. Any image that Wedgwood or Davy made would turn completely black after a few minutes, so they were restricted to viewing their images by candlelight. The first person to "fix" a photograph (to prevent it from exposing any further once the image is visible) was Louis Daguerre, who in 1829 developed a process for making photographic plates that could be fixed by immersing them in salt.
Not everyone was excited about the invention of photography. Many artists saw it as a threat to their livelihood, and many others considered it blasphemy. A report in a newspaper called the Leipzig City Advertiser read: "The wish to capture evanescent reflections is not only impossible... but the mere desire alone, the will to do so, is blasphemy. God created man in His own image, and no man-made machine may fix the image of God. Is it possible that God should have abandoned His eternal principles, and allowed a Frenchman... to give to the world an invention of the Devil?" Work continued, however, throughout the 1800s, and by the close of the century photographic technology was available for many more people to use, thanks in part to George Eastman, who introduced flexible film in 1884 and the first box camera in 1888.
All the above methods were monochromatic, producing "black and white photographs", as the common saying goes. In the 1930s Kodak introduced the first color film, which consisted of three layers of emulsion, each sensitive to a different color. The idea itself was proposed by Ducos du Hauron in 1869; however, there were no emulsions with which he could test his theory at the time.
The predecessor to what we know as the movie theatre was called "Phantasmagoria"; it was a show that consisted of images projected using lanterns and waxed sheets. One of the first phantasmagoria professionals was Étienne-Gaspard Robert, who put on demonstrations in old run-down chapels, conjuring up phantoms and demons. To the people of the time, who had never seen a projected image, the effect was quite frightening.
Cinema was the result of the combination of photography, image projection, and celluloid film. Thomas Edison and his assistant William Dickson were the first people to really use celluloid film for motion pictures. In 1890 they developed a motion-picture camera called the Kinetograph, which recorded images on a 50 foot loop of 35mm film. To view the movie they invented a machine called the Kinetoscope, which used a revolving shutter to show a glimpse of each image in turn. When the viewer watched the images of a moving scene in sequence, persistence of vision allowed them to be perceived as a fluid moving image.
Motion pictures added another dimension to imaging technology: not only could an instant image of a scene be recorded, but now the motion of the scene could be recorded over a span of time. Film projection technology has been refined over the years and is another imaging technology that has become commonplace in every household.
The Cathode Ray Tube
A cathode ray tube (CRT) is a vacuum tube that emits a beam of electrons. When these electrons strike a phosphorescent surface, a spot of visible light is produced. The first cathode ray tube display was the cathode-ray oscilloscope, invented in 1897 by Karl Ferdinand Braun. The cathode ray can be aimed and focused using an electromagnet; by altering the current that passes through it, we can alter the angle of the beam. CRTs are the core component of most television sets and computer monitors.
In a modern television set the beam moves in a scan-line fashion, and its intensity is altered to draw the image. This is done 25-30 times a second to produce fairly fluid motion. Modern TVs are also interlaced: every even line is drawn on one frame and every odd line on the next. Interlacing produces an image that appears to have twice the resolution of that shown on a non-interlaced TV.
The inside of a color television screen is coated with groups of three phosphor dots that radiate light at different frequencies: red, green and blue. A color television also has three cathode rays, one for each color. By mixing various levels of red, green and blue, any color in the visual spectrum can be produced. These picture elements (or "pixels" for short) are very small, often less than a millimeter across, so at a distance they can't be seen individually. Patterns of these illuminated picture elements make up the image that you see on your TV.
Liquid Crystal Displays
CRTs are very good at what they do and remain the most popular electronic display technology for video images. They have gone through many refinements and improvements over the years, yet one thing that is difficult to do with a CRT is miniaturize it; for example, we will likely never see a CRT built into a wristwatch. For making devices with light and compact displays, designers turned to another emerging technology: liquid crystals.
Liquid crystals were first discovered by Friedrich Reinitzer in 1888. He noticed that melting cholesteryl benzoate started as a cloudy liquid but then cleared as its temperature rose; when cooled, the liquid turned blue before freezing. Liquid crystal is a state that certain materials reach at the right temperature, where the molecules flow freely like a liquid yet still tend to arrange themselves in a particular pattern, like a crystal. With some forms of liquid crystal you can align the pattern of the crystals by passing an electric current through them; this is what makes them usable for electronic displays.
The first LCD display was created by RCA in 1968. To make an LCD display you need two polarized panels sandwiching a layer of liquid crystal.
Light that passes through the rear screen is polarized, and its orientation is rotated as it passes through the liquid crystal (see picture above, left). This allows the light to pass through the front screen, which is polarized at a 90 degree angle to the rear one. If a current is passed through the crystals (see picture above, right) then the polarized light passes through them unaltered. Since this light is still polarized at a 90 degree angle to the polarized screen at the front of the display, it will not pass. This is what happens on a digital watch when a segment of one of the numbers turns black.
An LCD display is made up of many segments like the one described above, which together make up the image, similar to how a CRT displays an image made of pixels on a phosphor screen. Liquid crystals themselves do not radiate light, so to see them in a display either a light source is placed behind them (as with LCD monitors) or the back of the display is made of a reflective material (typical of digital watches and pocket calculators). To make a color LCD display, the segments are placed in groups of three, with red, green and blue filters placed over the top of them.
An interesting variant of the LCD display is the LCD projector. Instead of producing a large LCD display, a small one is made, a very powerful backlight is installed, and a lens is placed in front of the LCD to focus the image onto a large screen in front of it. LCD projectors differ from regular projectors in that they can project images and video from an electronic source (e.g. a VCR or computer) rather than requiring celluloid film.
There are several other electronic video displays in development, such as gas plasma; however, these are currently young technologies and costly to produce.
When these new electronic display technologies met the rapidly evolving world of computers, a media revolution began. A CRT combined with a few kilobytes of computer program created a form of media that didn't exist before: media that was interactive in real time. Stored information could now be interacted with, giving users more control over retrieving the piece of information they are interested in. An internet search engine is an excellent example of this; a student no longer has to flick through an entire biology book to learn what photosynthesis is, he or she can simply ask the computer "what is photosynthesis?" and the information will be retrieved and sorted by relevance automatically.
Apart from its educational benefits, interactive media quickly became recognized for its entertainment value. Computer games are now a multi-billion dollar industry and have evolved rapidly from the early simple blocky graphics, to the high resolution video that we see on CRT or LCD displays today.
The real-time "3D" video displayed in today's computer games is the result of billions of data operations per second, made possible only by the incredible rate at which computer technology has evolved.
Despite the high performance and capabilities of today's computers, for the purposes of imagery they are still lacking, because these "3D" real-time images are still translated into two dimensions for display on a 2D device (e.g. a CRT monitor). Using these display devices, the best images we could ever create would look like two-dimensional photographs. We can recreate some 3D effects like shadows and parallax scrolling; however, the main cues for depth (e.g. stereopsis) cannot be shown on traditional display devices, at least not without a few tricks…
The most common way of creating an illusion of depth is to display a stereo pair of images, where each image represents the same scene from a slightly different position, as viewed from each eye. This slight difference between the two images is called "binocular disparity"; Charles Wheatstone discovered it in 1833 by drawing stereo images by hand. The technical problem is in making sure that each eye sees only the image meant for it and not the one meant for the other eye. There are, however, a couple of techniques for viewing these stereo image pairs without the need for any extra equipment. These techniques are called "free-viewing", and there are two types: "parallel" and "cross-eye".
The cross-eye technique, as the name suggests, involves going cross-eyed to see the effect. The image shown on the right is meant for your left eye, and the image on the left is meant for your right eye. To see the effect, go cross-eyed and adjust the distance at which you focus until the two images join to form a third image between them. You may need to tilt your head slightly to the left or right to get the two images to align vertically. Once you have the third image centered, try to maintain it and hold still. Here comes the hardest part: your eyes need to remain aligned in this cross-eyed position (as if you are looking at something close to your nose), but your eye lenses need to focus on the images at their real distance (on the screen or paper). You are effectively asking your visual system to tune itself to a scene at two different distances simultaneously. This is difficult because your visual system isn't used to doing this, and your brain isn't used to consciously controlling your eyes in this way either (depth perception is a mostly subconscious process).
The other technique is called parallel viewing, where the left image is meant for the left eye and the right image for the right. To see this illusion you must look through the picture, again trying to form a third image in the center (although this time the third image will look further away). This technique asks your eyes to position themselves as if staring off into the distance, yet at the same time asks the lenses in your eyes to focus on the actual images at their real distance. Parallel image pairs are quite small because the two images can't be further apart than the distance between your eyes.
Free-viewing is difficult, takes practice, and you may never manage to see one or both of the techniques. It can also cause eyestrain and even temporary short-sightedness. An easier way to view stereo images is to use a viewing machine that uses lenses and mirrors to ensure that each eye only sees the image meant for it; these machines also make focusing easier. Charles Wheatstone made the first such viewer; a more modern example would be the View-Master™ machines that can still be found in today's toy stores.
Autostereograms are a variation of the free-viewing technique ("auto" because a single, self-contained image carries the whole 3D effect). Instead of displaying a pair of images, a stereogram displays a pattern built from the stereo pair. Some stereograms work by slightly offsetting parts of a repeating pattern; it is only when you look through the image, as with the parallel free-viewing technique, that the patterns reassemble and the shape of the 3D object appears to emerge. This stereogram displays the words "Hello world" when viewed correctly. Not everyone can see stereograms (myself included).
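The pattern-offsetting idea is simple enough to sketch in code. The snippet below is a hypothetical illustration, not the method behind any particular stereogram: it builds a text-mode random-dot stereogram in which each row repeats a random pattern, and the repeat distance is shortened wherever a hidden rectangle should appear closer.

```python
import random

WIDTH, HEIGHT = 79, 18
PERIOD = 10  # base repeat distance of the pattern, in characters

def depth(x, y):
    """Hypothetical depth map: a raised rectangle in the middle."""
    return 3 if 20 <= x < 60 and 5 <= y < 13 else 0

random.seed(1)
rows = []
for y in range(HEIGHT):
    row = []
    for x in range(WIDTH):
        # Closer points repeat at a shorter distance, creating disparity.
        shift = PERIOD - depth(x, y)
        if x < shift:
            row.append(random.choice(".:*o"))
        else:
            row.append(row[x - shift])
    rows.append("".join(row))

print("\n".join(rows))
```

Viewed with the parallel technique, the rectangle appears to float above the background; the same trick applied per pixel to a full image produces the stereograms described above.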
Anaglyphs use color to separate the stereo images from a single picture. If we take the stereo pair shown previously, color the image meant for the left eye in red, the image meant for the right in blue-green, and then overlay the two we will get an image that looks something like this:
The above picture can be viewed with a pair of red / blue-green 3D glasses; these are the most common type of 3D glasses and are often distributed with comic books. The red filter over the left eye allows the red parts of the image to be seen, and the blue-green filter allows only the blue-green parts of the image to be seen by the other eye. The viewer's brain then fuses the stereo pair into a single image with depth. Note: if you are viewing the above image and the 3D effect seems odd, try flipping the glasses around.
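The channel-mixing step behind an anaglyph can be sketched in a few lines. The code below is a minimal illustration using made-up synthetic images (lists of (R, G, B) tuples), not any particular tool's method: it takes the red channel from the left-eye view and the green and blue channels from the right-eye view.

```python
def make_view(width, height, offset):
    # Hypothetical synthetic scene: a bright square on a dark background,
    # shifted horizontally by `offset` pixels to simulate disparity.
    img = [[(30, 30, 30)] * width for _ in range(height)]
    for y in range(4, 12):
        for x in range(4 + offset, 12 + offset):
            img[y][x] = (200, 200, 200)
    return img

def anaglyph(left, right):
    # Red channel from the left-eye image; green and blue from the right.
    return [[(l[0], r[1], r[2]) for l, r in zip(lrow, rrow)]
            for lrow, rrow in zip(left, right)]

left = make_view(20, 16, 0)   # left-eye view
right = make_view(20, 16, 2)  # right-eye view, shifted by the disparity
combined = anaglyph(left, right)
```

Through the glasses, each eye picks out its own channel from the combined image and the brain fuses the pair into one scene with depth.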
The first noticeable side effect of this 3D technique is that the image is monochrome, because color is being used to carry the 3D effect. It is possible to make a full-color anaglyph, where a third, central image carries the color and the anaglyph effect is applied only to the edges of objects, but this technique works only on images of fairly simple scenes with clearly defined edges.
The ChromaDepth technique also uses glasses and color to represent depth, but with a difference: in the ChromaDepth method, the color of an object represents its distance. Red things look close, blue things look far away, and yellow and green look somewhere in between. ChromaDepth glasses are clear and do not colorize the image. Here is an example ChromaDepth image; a drop shadow has been added to enhance the effect:
Some people can see a depth effect in pictures like the one above without using the glasses. This is because the eye focuses different colors of light at slightly different distances. Usually the brain compensates for this difference, and most people cannot see the effect unaided. ChromaDepth glasses enhance the effect: their filters refract light at a different angle depending on its color.
It is very easy to make a simple 3D picture for viewing with ChromaDepth glasses; in fact, a child with some colored crayons could draw one. To create a picture of a real scene, however, the photographer would need to ensure that everything is colored according to its distance, which is often impractical. In a piece of video, an object that moved closer or further away would change color as it moved, and in a complex piece of interactive media (like a computer game) the overall result would simply be ugly.
The Pulfrich effect is named for Carl Pulfrich, who in 1922 discovered that the photoreceptors in the retina take slightly longer to send a signal to the brain under low-light conditions. Pulfrich theorized that if a person viewed a swinging pendulum with a light filter over one eye and a dark filter over the other, they would perceive binocular disparity, and the pendulum would appear to swing in a circle rather than back and forth in a straight line. Pulfrich himself could never see the effect, as he was blind in one eye, but others verified his theory. Pulfrich 3D glasses typically have a dark purple filter over one eye and a light green filter over the other.
The green filter isn't really necessary, it just helps compensate for the way that the darker filter makes everything look purple. The Pulfrich effect can work on any piece of video as long as the scene is moving in a horizontal manner or rotating around an object in a set direction.
In the above illustration a ball is moving from right to left in a piece of video. The left eye, behind the light filter, is seeing frame 2, but the dark filter is delaying the image perceived by the right eye, so it is still seeing frame 1. The two different frames are perceived simultaneously, and the disparity between them is perceived as depth. The faster the ball moves, the closer to the viewer it will seem. If the ball moved from left to right the effect would be reversed and the ball would look further away (unless the viewer flips his glasses around, in which case the ball will look closer again).
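The frame-delay idea can be put into numbers. The sketch below uses illustrative values only; the exact one-frame delay is an assumption (in reality the lag depends on how dark the filter is). It tracks the ball's x-position over successive frames and computes the disparity the two eyes perceive:

```python
# x-position of the ball on successive video frames, moving right to left
positions = [100, 94, 88, 82, 76, 70]

DELAY = 1  # assumed: the dark filter delays perception by one frame

disparities = []
for frame in range(DELAY, len(positions)):
    bright_eye = positions[frame]        # eye behind the light filter
    dark_eye = positions[frame - DELAY]  # eye behind the dark filter, lagging
    disparities.append(dark_eye - bright_eye)

print(disparities)  # constant speed gives constant disparity
```

A faster-moving ball yields a larger disparity, which is why faster objects appear closer; motion in the opposite direction yields a negative disparity, and the ball appears further away.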
Pulfrich glasses are only good for certain video clips where everything is moving in the right way and direction, but they can provide a few sweet moments when watching some sports like NASCAR racing, where the cars all go around the track in the same direction.
Polarized 3D uses two projectors to project a pair of images onto a single screen. Each projection is polarized at a 90 degree angle to the other. The viewer wears a pair of glasses whose polarized filters are likewise set at a 90 degree angle to each other. Each filter allows only the image meant for the eye beneath it to be seen.
The main cost of a polarized 3D system is the screen, which is coated with a silver compound; a normal projection screen would scatter the light and it would lose its polarization. Because of the cost, polarized 3D systems are usually only found in places like amusement parks. It is, however, possible to hook up a pair of projectors to a PC with dual-monitor capabilities to see 3D in this fashion.
LCD Shutter Glasses
The original shutter glasses were actual mechanical shutter mechanisms. They would display an image to one eye for a brief fraction of a second, then cover it and show an image to the other eye, then cover it and repeat. Modern shutter glasses use an LCD that darkens the filter over each eye in turn to achieve the same result.
LCD shutter glasses are commonly used with computers, partly because a traditional TV can't switch between the left and right images fast enough to prevent the flicker from being visible. Since the computer flicks between the left and right images on alternate frames, the refresh rate is effectively halved. When this effective refresh rate drops below around 70 Hz (70 frames a second), the flicker is perceivable. For this reason a monitor with a refresh rate of at least 140 Hz was required for early LCD shutter glasses. Modern shutter glasses are able to interlace between frames so that the flicker is much less noticeable, even at a 100 Hz refresh rate. Shutter glasses have the advantage of being able to display a 3D image in full color, and they are compatible with most existing 3D software. Now priced below 100 dollars, LCD shutter glasses have become the most popular form of 3D imaging for home computer users.
The basic idea behind virtual reality was to strap two CRT monitors to a person's head, one over each eye. In fact, early prototypes of virtual reality were exactly this; the CRTs were suspended from the ceiling with counterweights so that the user could wear the display without breaking his neck. During the late 80s and early 90s, LCD technology allowed the displays to be miniaturized enough to be contained within a wearable headset. A virtual reality display contains a position-tracking device, so that when the user turns his or her head the simulated video image responds accordingly. A common accessory to the virtual reality (or "VR") headset was the data glove. The glove monitors the position of the user's hand and fingers, allowing the wearer to grab, push, drag, and otherwise interact with the virtual objects they see. Virtual reality computer games became popular in amusement arcades during the early 90s and were a popular sight in science fiction movies, which usually portrayed visual effects far more advanced than the real thing.
Virtual reality has since faded in popularity, mainly because its first incarnation was based on technology that was still in its infancy. Once the novelty faded, people lost interest. VR headsets were bulky and uncomfortable to wear for more than a few minutes. Add to that the fact that the graphics were crude, blocky, polygon-based renderings that were unimpressive even 10 years ago. LCD technology and computer graphics have advanced a lot in recent years, and it seems likely that if virtual reality were to re-emerge it would look a lot more like what science fiction originally imagined.
Lenticular images are made by interweaving a stereo image pair vertically, then overlaying the resulting image with a lenticular sheet.
The lenticular sheet is a clear piece of plastic with thousands of ridges across its surface. When the viewer looks at the image, the lenticular screen refracts the interweaved sections of the image so that each eye sees only the segments of one image and not the other. Lenticular pictures can also interweave several frames of animation together, and the animation clip can be "played" by tilting the picture from side to side (or by keeping the image still and tilting your head). This form of lenticular imaging is sometimes seen on the covers of videotapes and CDs.
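The interweaving itself is just column slicing. In the toy sketch below, strings stand in for rows of image pixels, and the strict alternating-column scheme is a simplification of real lenticular layouts (which interleave at the pitch of the lenses): even columns come from the left-eye view and odd columns from the right.

```python
def interleave(left, right):
    # Even columns from the left-eye image, odd columns from the right.
    return ["".join(l if x % 2 == 0 else r
                    for x, (l, r) in enumerate(zip(lrow, rrow)))
            for lrow, rrow in zip(left, right)]

left = ["LLLLLL"] * 3   # stand-in for the left-eye image
right = ["RRRRRR"] * 3  # stand-in for the right-eye image

print(interleave(left, right))
```

The ridges of the lenticular sheet then refract the even columns toward one eye and the odd columns toward the other, so each eye reassembles its own view.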
The interweaved image behind a lenticular sheet does not have to be a printed image; it can also be a live computer-generated image on an LCD display. Several lenticular LCD displays are already available, although they currently cost around $5000 and up. The advantage of a lenticular monitor is that it can display a full-color 3D image without the need to wear special glasses. The disadvantage is that the user must view the display from straight on; if he or she moves their head more than an inch or so left or right, the illusion is lost.
A hologram is a recording of the set of light waves that originate from a three-dimensional object. A hologram reflects or refracts a beam of light in a way that recreates the light wave pattern as it was reflected from the original object. The first holograms were created in 1947 by Dennis Gabor. He was searching for a way to increase the resolution of the electron microscope, and while doing so conducted an experiment that exploited the similarities between electron beams and light beams (which are more practical to work with). Gabor's method involved exposing an object to a beam of light, which then reflected off it onto a photosensitive plate. This beam was called the object beam. What made this different from photography was that Gabor simultaneously exposed the plate to a beam of light that originated from the same source but did not reflect from the object. This beam was called the reference beam.
When the object beam and the reference beam converged, they produced an interference pattern on the photographic plate. This interference pattern is called a hologram. To see a hologram correctly, you must shine a beam of light on it from a particular angle: the same angle as the original reference beam. The plate will then reflect light as the original object did.
To make a hologram, the beam that is split into the reference and object beams must be coherent. Gabor used a mercury-arc lamp, although lasers later proved much more useful for this purpose. With the development of cheap semiconductor diode lasers, like those found in laser pointers, it is now possible to create your own holograms on a budget of less than 100 dollars.
Holograms produce an excellent three-dimensional effect when properly illuminated. They do not require special glasses, or even binocular vision to see. Holograms are one of the few 3D techniques that have native parallax, meaning that if you had a hologram of a cup you could, to a certain degree, peer over the edge of the cup and look inside it.
Holograms can be very detailed and accurate images. It is even possible to take medical measurements from the hologram of a person's anatomy. In theory the resolution of a hologram is only limited by the wavelength of visible light itself. It is the extremely high amount of data contained within holograms that also makes ideas like holographic television a very difficult thing to accomplish, even with today's technology.
Volumetric displays project an image within a three-dimensional area of space. The image shown within a volumetric display is truly three-dimensional. In contrast, the 3D in all the methods described previously was only illusionary. There are numerous volumetric display techniques in development, only one of which will be discussed here.
The photograph to the right is of a volumetric display called "Helios", developed by Actuality Systems Inc. This crystal ball like display is currently the highest resolution volumetric display in the world. It can be viewed from any angle, by any number of people, without the need for glasses or any other viewing equipment.
The display captures 3D data from a traditional computer source such as a 3D modeling package or CAD program. A computer inside the display renders the model as 198 two-dimensional image slices. Inside the "crystal ball" itself is a screen that rotates at approximately 730 RPM. As the screen rotates, the slices of the image are projected onto it in sequence. Persistence of vision takes care of the rest, allowing the viewer to see a floating three-dimensional object.
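A little back-of-the-envelope arithmetic, using only the figures quoted above, shows how fast those slices must be drawn:

```python
RPM = 730     # rotation speed of the internal screen
SLICES = 198  # 2D slices rendered per revolution

revolutions_per_second = RPM / 60            # how often the whole volume is redrawn
slices_per_second = revolutions_per_second * SLICES

print(round(revolutions_per_second, 1), slices_per_second)
```

The whole volume is redrawn roughly 12 times a second, which works out to about 2,400 slice projections every second.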
The resolution of the display is approximately 100 million voxels (think of a "voxel" as a three-dimensional pixel). The display uses 6 gigabits of computer memory (roughly 20 times the amount of memory in the graphics cards inside today's personal computers). Needless to say, volumetric displays this advanced would have been too expensive to produce with the technology available 10 years ago.
Volumetric displays won't allow us to live inside a computer simulation like in a science fiction movie, but they do allow us to view three-dimensional objects from outside in a convenient comfortable manner. There are many fields which would immediately benefit from this technology including molecular visualization and medicine.
The above image is of a sugar molecule. Traditionally, scientists have used plastic modeling kits to visualize a molecule like this. Later, computer technology allowed molecules to be drawn on a computer display, yet the image was still shown in two dimensions on a CRT monitor. The viewer would then need to instruct the computer to rotate the image so that he or she could mentally reconstruct how the molecule looked in three dimensions. By contrast, with the display by Actuality Systems the molecule can simply be shown in 3D, and the viewer can walk around it if they choose to.
Displays like these could also one day be used for delicate "keyhole" surgery. Currently the surgeon has to try to make sense of multiple two-dimensional slices of a CAT scan or X-ray, and it is very difficult to judge exactly where a tumor is inside a patient's head from dozens of two-dimensional slices. With a volumetric display, the location of the tumor and of the surgeon's scalpel inside the patient's skull could be seen in 3D, perhaps even in real time!