ShadowDraw: Real-Time User Guidance for Freehand Drawing
Yong Jae Lee, Larry Zitnick, and Michael Cohen
We present ShadowDraw, a system for guiding the freeform drawing of objects. As the user draws, ShadowDraw dynamically updates a shadow image underlying the user's strokes. The shadows are suggestive of object contours that guide the user as they continue drawing. This paradigm is similar to tracing, with two major differences. First, we do not provide a single image from which the user can trace; rather ShadowDraw automatically blends relevant images from a large database to construct the shadows. Second, the system dynamically adapts to the user's drawings in real-time and produces suggestions accordingly. ShadowDraw works by efficiently matching local edge patches between the query, constructed from the current drawing, and a database of images. A hashing technique enforces both local and global similarity and provides sufficient speed for interactive feedback. Shadows are created by aggregating the top retrieved edge maps, spatially weighted by their match scores. We test our approach with human subjects and show comparisons between the drawings that were produced with and without the system. The results show that our system produces more realistically proportioned line drawings.
ShadowDraw includes three main components: (1) the construction of an inverted file structure that indexes a database of images and their edge maps; (2) a query method that, given user strokes, dynamically retrieves matching images, aligns them to the evolving drawing and weights them based on a matching score; and (3) the user interface, which displays a shadow of weighted edge maps beneath the user's drawing to help guide the drawing process.
Examples of database images and corresponding edge maps
We use a set of approximately 30,000 natural images collected from the internet via approximately 40 categorical queries such as “t-shirt”, “bicycle”, “car”, etc. Although such images have many extraneous backgrounds, objects, framing lines, etc., the expectation is that, on average, they will contain edges a user may want to draw.
We process each image in three stages and add to an inverted file structure. First, we extract edges from the image using the long edge detector described in [Bhat et al., TOG 2009]. Next, we extract patches uniformly on a grid and compute Binary Coherent Edge (BiCE) descriptors for each patch [Zitnick, ECCV 2010]. Finally, for each patch, we compute concatenated sets of min-hashes called sketches and add them to the database. The database is stored as an inverted file, in other words, indexed by sketch value, which in turn points to the original image and the patch location.
Image Matching and Retrieval
The figure above shows the real-time matching pipeline between the database images and the user's drawing. Our hashing scheme allows for efficient (real-time) image queries, since we only need to consider images with matching sketches. Initially, we use the inverted file structure to obtain a set of top 100 candidate matches. Next, we align and score each candidate match to the user's drawing. This two step matching procedure is necessary for computational efficiency, since only a small subset of the database images need to be finely aligned and weighted. We use the scores from the alignment step to compute a set of spatially varying weights for each edge image. The output is a shadow image resulting from the weighted average of the edge images. Finally, we display the shadow image to the user beneath his/her drawing.
We take advantage of the fact that the user's strokes change gradually over time to increase performance. At each time step, only votes resulting from sketches derived from patches that have changed are updated. The candidate image set contains a set of images with approximate matching locations arising from the discretization of the offsets in the grid of patches. We refine these offsets using a 1D variation of the Generalized Hough transform. The use of a spatial weighting results in shadows that are a composite of multiple distinct edge images, creating the appearance of an object that does not exist in a single database image.
The user can draw or erase strokes using a stylus. The user sees their own drawing formed with pen strokes superimposed on a continuously updated shadow image. To make the shadow visible to the user, while not distracting from the user's actual drawing, we filter the shadow image to remove noisy and faint edges. Finally, we weight the shadow image to have higher contrast near the user's cursor position.
All our experiments are run using a database of approximately 30,000 images. Using our large database, the user can receive guidance for a variety of object categories, including specific types of objects such as office chairs, folding chairs, or rocking chairs. On average, a new shadow image is computed every 0.4 to 0.9 seconds depending on the number of new strokes. A fast response is critical in creating a positive feedback loop in which the user obtains suggestions while still in the process of drawing a stroke. As the user draws new strokes and erases others, the shadows dynamically update to best match the user's drawing. The last row of the figure above demonstrates the scoring function's robustness to clutter. Even when many spurious strokes are drawn, the correct images are given high weight in the shadow image. Please refer to our video to see the sessions in action and for more results.
We conducted a user study to assess the effectiveness of ShadowDraw on untrained drawers. In the first stage, subjects produced quick 1-3 minute drawings with and without ShadowDraw. In a second stage, a separate set of subjects evaluated the drawings.
Overall, ShadowDraw achieves significantly higher scored drawing results for the “average” group. The users in this group are able to draw the basic shapes and rough proportions of the objects correctly, but have difficulty applying exact proportions and details essential for producing compelling drawings, which is precisely where ShadowDraw can help.
The figure above shows some examples of the users' drawings of faces before and after practice with ShadowDraw. There is a noticeable change towards realistic proportions in the drawings for those with poor skill (left) and good skill (right). Notice how the subject's personal style is maintained between drawings, and that the more proficient drawers are not simply tracing the shadows.
Yong Jae Lee, Larry Zitnick,
and Michael Cohen
To appear, ACM Transactions on Graphics (Proceedings of SIGGRAPH), Vancouver, Canada, August 2011.