FP-RobinHeld
From CS294-10 Visualization Fa07
Proposal
Group Members
- Robin Held
Description
- Introduction / Summary
- Most 3D, or “stereoscopic,” displays suffer from perceived distortions that limit their usefulness to technical and general audiences. Analyzing the causes of these distortions should lead to better stereoscopic display systems, so the distortions have received attention from the stereo-cinema and virtual-reality communities, who are keen to make their displays more accessible. As I explain below, these researchers have taken a quantitative, geometric approach to predicting stereoscopic distortions. However, most of their analysis has been limited to objects lying within a single plane in 3D space, and their illustrations of the distortions are hard to follow. For this project, I will create an intuitive graphical user interface that makes the causes of stereoscopic distortions easier to understand. The program will be based on OpenGL, which will be used to create 3D models of the distortions inherent to stereoscopic displays. The interface will allow a user to change various conditions relating to the capture and display of stereoscopic images and see their effects on the distortions in real time. The combination of interactivity and 3D visualization should make the topic of stereoscopic distortions easier to understand and explore.
- Background: Existing Research
- Accurate perception of the contents of stereoscopic photographs depends on the relationship between the capture parameters (how the cameras were set up) and the viewing parameters (the location of the viewer's eyes relative to the photo or display) (Masaoka et al., 2006; Wartell et al., 2002; Woods et al., 1993; Yamanoue et al., 2002). Specifically, the photos must be set up so that the rays of light delivered to the eyes, or the “light fields,” mimic what the observer would have seen had he or she been present at the original scene. To do so, the eyes must be placed at the centers of projection (CoPs) of the photos (Vishwanath et al., 2005). The photos must also be positioned relative to one another so that they preserve the vergence angles that were necessary to fixate on objects within the original scene. If these conditions are met, geometric distortions are eliminated. However, viewers rarely place their eyes at the CoPs of the images; most often people view stereoscopic images obliquely, especially in large audiences, and this creates geometric distortions. The magnitude and nature of these distortions are of particular concern to 3D display designers and 3D content creators. To approach the problem, the existing virtual-reality and stereoscopic-cinema literature has produced a geometric model based on capture and viewing parameters such as the spacing, orientation, and focal length of the cameras, and the position of the viewer relative to the screen (Masaoka et al., 2006; Wartell et al., 2002; Woods et al., 1993; Yamanoue et al., 2000; Yamanoue et al., 2002). The model is based on a series of transformations that begins with an object’s coordinates in the original scene and ends with its 3D location as perceived by the viewer.
- In the transition from an original scene to a viewer’s percept, four transformations are employed. First, when the photo is taken, the object’s coordinates in 3D space are transformed to coordinates on the charge-coupled devices (CCDs) within each of the stereo cameras. Second, when the photos are displayed, those CCD coordinates are magnified to picture coordinates. Third, when the viewer looks at the photos, the picture coordinates are transformed to retinal coordinates. In the fourth and final step, the brain uses the retinal coordinates to form a 3D percept of the displayed scene. Approached geometrically, the last step is a matter of finding the intersections of pairs of rays, with one member of each pair coming from each eye. To illustrate this concept, consider a point source of light in 3D space. For each eye, there is a ray of light that emanates from the point source and hits the retina. Working backwards, one could find the location of the point source by beginning with the point images on the retinas, constructing rays from those points through the centers of the eyes and out into space, and finding the rays’ intersection. With traditional 3D displays, the brain is assumed to use a similar method. The rays are constructed from the centers of the eyes through the disparate points on the screen, and the intersection of the rays again gives the perceived location of that point in space. The current stereo-cinema literature uses this geometric technique to predict perceived distortions. Some vision science research suggests that this approach may not accurately reflect human perception under certain viewing conditions (Backus et al., 1999), but that discussion is beyond the scope of this project.
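- The ray-intersection step described above can be sketched in a few lines of Python. This is only an illustration, not the project's code: the function name `perceived_point` and its arguments are hypothetical. It also makes one assumption the text does not state: when the two rays are skew and never meet exactly (for example, when the screen points carry a vertical offset), it returns the midpoint of the shortest segment joining them.

```python
def perceived_point(left_eye, right_eye, left_img, right_img):
    """Estimate the 3D point specified by a pair of on-screen image points.

    Each ray starts at an eye center and passes through that eye's image
    point. Rays that meet return their intersection; skew rays return the
    midpoint of their shortest connecting segment (an assumption made for
    this sketch). Parallel rays return None.
    """
    def sub(a, b):
        return tuple(x - y for x, y in zip(a, b))

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    def along(origin, direction, t):
        return tuple(o + t * d for o, d in zip(origin, direction))

    d1 = sub(left_img, left_eye)    # direction of the left-eye ray
    d2 = sub(right_img, right_eye)  # direction of the right-eye ray
    w = sub(left_eye, right_eye)    # offset between the ray origins

    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        return None  # parallel rays: the point lies at infinity

    t1 = (b * e - c * d) / denom    # parameter of closest point on left ray
    t2 = (a * e - b * d) / denom    # parameter of closest point on right ray
    p1 = along(left_eye, d1, t1)
    p2 = along(right_eye, d2, t2)
    return tuple((x + y) / 2 for x, y in zip(p1, p2))
```

- If the image points are the projections of a single scene point toward each eye, the function returns that scene point. Moving the eyes while leaving the image points fixed moves the result, which is the distortion this project visualizes.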
- Selected Visualization Features
- OpenGL-based
- Written with Cocoa to take advantage of graphical user interface controls
- Will allow the user to control stereoscopic capture and viewing parameters and observe the effects on the distortions in real time
- The visualization's OpenGL view will be adjustable, so distortions can be investigated from multiple viewpoints
- Multiple stereoscopic stimulus types (frontoparallel plane, square, sphere, etc.) will be available
- References
- Backus, B.T., et al., Horizontal and vertical disparity, eye position, and stereoscopic slant perception. Vision Res, 1999. 39(6): p. 1143-70.
- Masaoka, K., et al., Spatial distortion prediction system for stereoscopic images. Journal of Electronic Imaging, 2006. 15(1): p. 13002-13002.
- Vishwanath, D., A.R. Girshick, and M.S. Banks, Why pictures look right when viewed from the wrong place. Nature Neuroscience, 2005. 8(10): p. 1401-10.
- Wartell, Z., L.F. Hodges, et al., A geometric comparison of algorithms for fusion control in stereoscopic HTDs. IEEE Transactions on Visualization and Computer Graphics, 2002. 8(2): p. 129-143.
- Woods, A., T. Docherty, and R. Koch, Image Distortions in Stereoscopic Video Systems. Proc. SPIE, 1993. 1995: p. 36-47.
- Yamanoue, H., M. Okui, and I. Yuyama, A study on the relationship between shooting conditions and cardboard effect of stereoscopic images. IEEE Transactions on Circuits and Systems for Video Technology, 2000. 10(3): p. 411-416.
- Yamanoue, H., et al., Geometrical analysis of puppet theater and cardboard effects in stereoscopic images. J. Inst. Image Inf. TV Engineers, 2002. 56(4): p. 575-582.
Initial Problem Presentation
- Proposal Slides: Media:HeldProposal.pdf
Midpoint Design Discussion
- As of 11/18/07, the following progress has been made on the project:
- 1. Tabbed Controls
- The various controls for the software have been incorporated using a tabbed interface. The tabs are arranged roughly in the order of the geometric approach to analyzing stereoscopic distortions. That is, they begin with the stimulus, then image capture, and finally image display. See the screenshots below:
- The stimulus parameters control what type of example object is being captured by the stereoscopic cameras. The options are a plane, hinge, cube, cylinder, or sphere. The user can adjust the size and position of the stimulus, as well as its rotation relative to the cameras if it is a plane, hinge, or cube. The hinge angle can also be adjusted. Rendering settings have also been implemented that allow the user to choose between lines and individual dots to illustrate the stimulus. The next tab, labeled "Capture," allows the user to control how the stereoscopic images are acquired. The cameras' optical axes can be parallel or converging, and the camera spacing and focal length can be adjusted. The convergence distance, convergence angle, and CCD sensor offset all affect the degree to which the cameras' optical axes are converged. The final tab, titled "Viewing," provides controls for the simulated 3D display and human observer. The magnification and projector offset controls affect the size and relative placement of the disparate images. Interpupillary distance is the distance between the viewer's eyes. The rest of the controls affect the position and orientation of the viewer.
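- The three convergence controls mentioned above are related by simple geometry, as the minimal sketch below shows. It assumes that both cameras converge symmetrically on a point on the midline between them. The function names are hypothetical, and whether the software's "convergence angle" is measured per camera or as the total angle between the two axes is not stated here; the sketch uses the per-camera angle.

```python
import math


def toe_in_angle(camera_spacing, convergence_distance):
    """Inward rotation of each camera, in radians, that makes both optical
    axes cross on the midline at the given convergence distance."""
    return math.atan2(camera_spacing / 2.0, convergence_distance)


def sensor_offset(camera_spacing, focal_length, convergence_distance):
    """Lateral sensor shift that gives parallel cameras the same
    convergence distance without rotating them."""
    return focal_length * camera_spacing / (2.0 * convergence_distance)
```

- Both functions describe the same line from a camera to the convergence point, so `focal_length * math.tan(toe_in_angle(t, d))` equals `sensor_offset(t, focal_length, d)` for any spacing `t` and distance `d`.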
- 2. Prediction Engine
- The ray-intersection geometric approach to stereoscopic distortions (described above) is employed in the software. The program sequentially transforms each point in the original stimulus to its predicted location in 3D space relative to a simulated human observer.
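- As a rough illustration of such a sequential transformation, the sketch below follows one point through capture, display, and viewing in the horizontal (x, z) plane only. It is not the program's actual code. The `Setup` fields and the `predict` function are hypothetical names, the cameras are assumed to be parallel pinhole cameras at z = 0, the screen is assumed to be frontoparallel to the viewer, and the retinal step is folded into the ray construction because each ray passes through the center of an eye.

```python
from dataclasses import dataclass


@dataclass
class Setup:
    camera_spacing: float = 0.065    # distance between the two cameras (m)
    focal_length: float = 0.05       # camera focal length (m)
    magnification: float = 40.0      # sensor-to-screen magnification
    projector_offset: float = 0.065  # separation of the two image centers (m)
    ipd: float = 0.065               # viewer's interpupillary distance (m)
    viewing_distance: float = 2.0    # eye-to-screen distance (m)
    viewer_x: float = 0.0            # sideways shift of the viewer (m)


def predict(x, z, s):
    """Map a scene point (x, z) to its predicted perceived location."""
    half_t = s.camera_spacing / 2.0
    # 1. Scene -> sensor: pinhole projection in each camera.
    sensor_l = s.focal_length * (x + half_t) / z
    sensor_r = s.focal_length * (x - half_t) / z
    # 2. Sensor -> screen: magnify, then shift each image sideways.
    screen_l = s.magnification * sensor_l - s.projector_offset / 2.0
    screen_r = s.magnification * sensor_r + s.projector_offset / 2.0
    # 3 and 4. Screen -> percept: intersect the rays from each eye
    # through its own screen point.
    eye_l = s.viewer_x - s.ipd / 2.0
    disparity = screen_r - screen_l
    if abs(s.ipd - disparity) < 1e-12:
        return None  # rays are parallel: no finite intersection
    k = s.ipd / (s.ipd - disparity)  # k <= 0 would mean diverging eyes
    return eye_l + k * (screen_l - eye_l), k * s.viewing_distance
```

- With the defaults, the viewing geometry matches the capture geometry (magnification equals viewing distance divided by focal length, and the spacing, offset, and interpupillary distance are equal), so `predict(0.1, 3.0, Setup())` returns approximately `(0.1, 3.0)`. Shifting the viewer sideways with `Setup(viewer_x=0.3)` leaves the predicted depth unchanged but moves the predicted x to approximately -0.05, a shear of the perceived scene.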
- 3. Five Available Stimuli
- As mentioned above, five different stimuli are now available. The inclusion of 3D objects aside from a simple plane is important to the visualization. Specifically, if the user chooses to display a sphere or cube, the program can show how the overall 3D shape is distorted under various stereoscopic viewing situations.
- 4. Save and Load Functionality
- Text-based save and load routines have been implemented. Settings are saved as .txt files, with the name of each variable listed along with its value.
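- A name-and-value text format of this kind could look like the sketch below. The exact layout of the program's files is not given here, so the `name = value` line format, the function names, and the restriction to numeric values are all assumptions made for illustration.

```python
def save_settings(path, settings):
    """Write one 'name = value' line per setting (assumed format)."""
    with open(path, "w", encoding="utf-8") as handle:
        for name, value in settings.items():
            handle.write(f"{name} = {value!r}\n")


def load_settings(path):
    """Read 'name = value' lines back into a dict of floats."""
    settings = {}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            name, _, value = line.partition("=")
            settings[name.strip()] = float(value)
    return settings
```

- A plain-text format like this is easy to inspect and edit by hand, which suits a tool meant for exploring parameter values.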
- Future Work
- The basic program has now been implemented. However, the usability of the interface needs to be improved. To accomplish this, I will add macros that move the OpenGL camera to preset viewpoints illustrating particular concepts in the analysis of stereoscopic distortions. For instance, to demonstrate the image capture step, the eyeballs and distorted image will be turned off, leaving only the stereo cameras and original stimulus on display. The camera will also automatically move to a location that shows all of the pertinent objects. Explanatory text will also help the user progress through each step in the stimulus-capture-viewing sequence.
- Presentation
- Midterm Presentation Slides: Media:RHELDMid.pdf
Final Deliverables
- Final Poster: Media:RHELDPoster.pdf
- Final Report: Media:RHELDFinal.pdf
- Final code delivered via email.
