Conveying Shape: Lighting, Shading, Texture
From CS294-10 Visualization Sp11
Lecture on Apr 25, 2011
- Perceiving Shape from Shading. Ramachandran (pdf)
- Conveying Shape and Features with Image-Based Relighting. Akers et al. (html)
- The lit sphere: a model for capturing NPR shading from art. Sloan et al. (html)
- Automatic lighting design using a perceptual quality metric. Shacked and Lischinski. (pdf)
- Maximum entropy light source placement. Gumhold. (ieee)
- Light Collages. Lee et al. (html)
Julian Limon - Apr 25, 2011 05:18:39 pm
The Schuman et al. paper that was discussed in class today reminded me of low-fidelity prototyping. Schuman et al. ran an experiment with architects to evaluate different computer-generated images (namely, CAD plots, shaded images, and sketches). They discovered that people tend to associate sketches with preliminary drafts and CAD plots with final presentations. They also found that sketches stimulate significantly more discussion and active changes than CAD plots and shaded images. This reminds me of why napkin-like prototyping is so powerful. When people look at these kinds of prototypes, they are more likely to engage with them, request changes, and discuss them. On the other hand, when prototypes are too realistic or pixel-perfect, people might reserve their comments. Specifically for our final project, I think it makes sense to gather feedback using low-fidelity sketches. Subjects would be less attached to the sketches and would be more likely to give honest comments. Even if some parts of the system are already built, effort can be made to make them "look" more like a sketch if we want to obtain useful feedback.
On a totally different note, Photomontage was criticized today because it is a user-driven system and is not fully automated. I see this as a pro rather than a con. Ultimately, artists know better than computers how they want to convey a certain story. I like Photomontage because, instead of trying to automate the whole process, it provides tools for artists to determine how they want the result to look. Fully automated techniques might lose some of the nuances that artists want to convey and might not be suited to some cases.
Brandon Liu - Apr 25, 2011 06:19:22 pm
The tradeoffs between artistic considerations and automation in the photomontage and image-based relighting systems are interesting. Specifically, if one were to automate such a system, how well would it perform? This reminds me of the 'Auto-' color/contrast/levels functions in Photoshop, which optimize for some quantity in an image. Could Photomontage take a similar approach and optimize detail in the final image? One strong argument against this is that the boundaries of interesting regions couldn't be determined automatically; instead, it is up to the human viewer to interpret which parts form a whole and can be meaningfully depicted.
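For concreteness, the "optimize for some quantity" idea can be illustrated with a simple histogram-based auto-contrast heuristic: stretch luminance so that chosen low/high percentiles map to 0 and 1. This is a common technique, not Photoshop's actual algorithm, and the percentile choices here are illustrative.

```python
# A minimal auto-contrast sketch: stretch luminance so the 1st/99th
# percentiles map to 0 and 1, clipping the tails. This is a generic
# heuristic, not Photoshop's proprietary "Auto" algorithm.
import numpy as np

def auto_contrast(luminance, lo_pct=1.0, hi_pct=99.0):
    """Rescale a luminance array so its percentile range fills [0, 1]."""
    lo, hi = np.percentile(luminance, [lo_pct, hi_pct])
    return np.clip((luminance - lo) / (hi - lo), 0.0, 1.0)
```

An automated Photomontage pass could, in the same spirit, score candidate composites by some detail measure; the hard part, as noted above, is that "interesting detail" is not easily reduced to one such quantity.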
Michael Hsueh - Apr 26, 2011 12:54:31 am
The paper by Sloan et al. extracts lit spheres from objects in source images. This is done by having the user fit spherical triangles onto scene objects (ultimately covering a sufficient distribution of normals). The spherical triangles are then mapped onto a lit sphere model that is used to shade geometry. This is a very neat technique for extracting unusual (in particular, NPR) shading models that would be otherwise quite difficult to define, especially mathematically. The only concerns I had were regarding the ease and accuracy with which users can apply these triangles to scene objects, and whether these objects would typically exhibit enough information to construct the spheres, though the paper indicates these were not serious problems. Anyway, the technique's application to NPR reminded me of some work I've seen before that does a similar, though less interactive, extraction of brush stroke characteristics (http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=4545847). That particular work does not go so far as to produce generative methods for the brush strokes, but it seems like it could.
Matthew Can - Apr 26, 2011 03:48:12 pm
Julian and Brandon brought up good points about fully automatic approaches to choosing lighting and shading conditions that best convey shape. I agree with Julian that this is not a one-size-fits-all problem, and that artists might want more control over the images they produce. At the same time, novices like me would be satisfied with an image that is good enough. And there are more novices than experts in the world, so it’s worth pursuing research in fully automated methods. The challenge is how to quantitatively encode how well an image reveals an object’s structure. If that were easy, we could try to optimize over lighting and shading conditions. Gumhold’s paper finds the position for a light source that maximizes the information in the resulting image. It’s not clear to me how well this information theoretic approach matches the objective of conveying structure. In any case, this is still an open and interesting problem.
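Gumhold's objective can be sketched in a toy form: render the scene under each candidate light position and keep the position whose image carries the most information, measured as the Shannon entropy of the luminance histogram. This is an illustrative simplification, not the paper's actual optimization procedure.

```python
# Toy sketch of entropy-based light placement (a simplification of
# Gumhold's approach): evaluate each candidate light position by the
# Shannon entropy of the rendered image's luminance histogram.
import numpy as np

def image_entropy(luminance, bins=256):
    """Shannon entropy (in bits) of a luminance image's histogram."""
    hist, _ = np.histogram(luminance, bins=bins, range=(0.0, 1.0))
    p = hist / hist.sum()
    p = p[p > 0]                      # drop empty bins: 0 * log 0 = 0
    return float(-(p * np.log2(p)).sum())

def best_light(render, candidates):
    """render(pos) -> luminance array; a hypothetical renderer callback."""
    return max(candidates, key=lambda pos: image_entropy(render(pos)))
```

Matthew's concern shows up directly in this sketch: a flat image scores zero and a noisy one scores high, yet high histogram entropy does not by itself guarantee that the revealed variation corresponds to the object's structure.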
Saung Li - Apr 26, 2011 07:03:15 pm
The ideas in the Akers et al. paper are very important, as they help address the issue of photographs not fully revealing the shape of an object. By compositing several photographs into one image, features that were previously hidden can be emphasized. Some of the composite images, such as the one of the baboon skull, struck me as unnatural in their use of lighting and shadows. However, it is the shadows that help viewers get a sense of depth in the object; we can clearly see where there are dents in the composite image. One thing I'm worried about is modified images such as these composites being mistaken for real photographs. I also really like Ramachandran's findings. The demos in class where spheres become holes when rotated blew my mind. Our brains work in particular ways, such as assuming where light comes from, so we can take advantage of studies in perception to make better visualizations.
Michael Cohen - Apr 27, 2011 01:41:47 am
It's always intriguing to me how difficult it is to study perceptual artifacts rigorously. I think Ramachandran does about as good a job as one can do, but he still has to resort to calling effects "stronger" and "weaker" at different orientations, and in fact some places where he saw "weaker" effects many people in class didn't see any effect... or, more insidiously, didn't see any effect until the effect was suggested. I would be interested to see some of his assertions tested cross-culturally or with young children. I took a perception class in undergrad (in the psych department) and remember reading that many of the 3D projection conventions we take for granted (isometric views, etc.) don't look 3D at all to groups who live in circular or irregularly shaped homes and don't have experience with contemporary illustrations -- they just look like flat lines on a page. I would guess that Ramachandran's spherical pop-outs are more universal, but I imagine they could also be culturally specific (or at least "stronger") to those of us who are accustomed to processing 2D projections.
Sally Ahn - Apr 27, 2011 02:16:51 am
Regarding human perception, I wonder if we would gain anything by studying the features of images that look just barely "wrong". In image manipulation systems, the advantage of seeing more details (or more informative details) of an object diminishes when the final image starts to look "infeasible," and it takes some human skill to avoid this. At what point of manipulation does an image cross this line of feasibility? I imagine such a point would be specific to the image content, but maybe groups of similar photographs/images share features that we could use to quantify this notion of "feasibility."
Shacked and Lischinski's paper describes a system that automatically lights 3D scenes by optimizing a set of perception-based features with the goal of creating "comprehensible, communicative" images. They use the luminance values of the image to find a lighting scheme that produces a desired distribution in the image histogram. Since the system requires a precomputed map of edge, surface, and background pixels, this step could become a bottleneck for complex objects. On the other hand, the authors note that a threshold resolution for performance speed is around 62,500 (250^2) pixels, so complex objects might not be well-suited to this system anyway. I also noticed that all of the examples are shown on a black background, but this usually isn't the case with most images we see. I wonder how the color of the background might affect our perception of rendered scenes lit with this system.
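The histogram part of such an objective can be sketched simply: penalize the deviation of the image's cumulative luminance histogram from a target distribution. The target and distance function below are illustrative assumptions, not the paper's exact formulation, which combines several perceptual terms.

```python
# Sketch of one histogram-based term in a Shacked/Lischinski-style
# lighting objective: mean squared deviation of the image's cumulative
# luminance histogram from a target CDF. Lower values are better; an
# optimizer would minimize this (plus other perceptual terms) over
# light parameters. The target CDF and distance are illustrative.
import numpy as np

def histogram_term(luminance, target_cdf, bins=32):
    """Penalty for luminance in [0, 1] deviating from a target CDF."""
    hist, _ = np.histogram(luminance, bins=bins, range=(0.0, 1.0))
    cdf = np.cumsum(hist) / hist.sum()
    return float(np.mean((cdf - target_cdf) ** 2))
```

With a uniform target CDF (`np.linspace(1/bins, 1, bins)`), a well-spread luminance distribution scores near zero while a washed-out or underexposed rendering is penalized heavily.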
Siamak Faridani - Apr 27, 2011 05:10:22 am
Ramachandran's paper made me think about how we can trick our brains into better interpreting visualizations. It seems to me that there is still no explanatory model of visual perception. The Light Collages paper seemed to exploit such perceptual effects much more, but I am interested to know whether we can abuse the perception model to make better visualizations. For example, 3D stereograms seem to do exactly that by tricking the brain into interpreting a 2D image as 3D. There are examples on the web of how optical illusions can easily break a visualization, but I was not able to find anything on optical illusions helping visualization.
The interesting part of the lit sphere paper was where they recreated Cezanne's shading style. I think it might be interesting to apply learning to different paintings and try to replicate them, or, even more interestingly, to transform one style of painting into another.
Karl He - Apr 27, 2011 06:16:54 am
The topics this time tied into my reflections on the previous lecture. The techniques for conveying 3D shape focus on the apparent properties of the shape. Since it isn't possible to completely convey 3D on a 2D surface, what matters most is what the viewer believes to be true about the rendering.
I like the discussion about which renderings of the same shape look more "final" or more like a draft. A topic discussed in CS160 is that lo-fi prototypes should be used because they encourage people to suggest more drastic changes.