Freestyle integration into Blender

March 18, 2011

Development updates on March 17

Filed under: Update — The dev team @ 2:48 AM

View map construction is the most time-consuming part of the Freestyle rendering process.  In fact, view map construction was impossibly slow for large scenes.  This was a known issue that, mainly because of the lack of development resources, was meant to be addressed only after all to-do items were finished and the long-awaited merge into the trunk was done.  Last December, however, the Freestyle integration project received a big code contribution from Alexander Beels, whose focus was precisely on optimizing the view map construction code.  He achieved this by identifying the performance bottlenecks and carefully redesigning the internal data processing.  Starting from the initial version in December, Alex’s optimization patch underwent several rounds of code review and testing in collaboration with the dev team.  He put substantial effort into examining different optimization strategies and implementations.  After more than three months of dedicated research and development, his work has resulted in highly efficient view map construction code that is now available in revision 35525 of the Freestyle branch.  As illustrated with several test scenes below, the optimized view map code offers surprising performance gains.  Many thanks to Alex for the excellent achievement!

Technical overview of the optimization

Now let us describe Alex’s optimization work from a technical perspective.  The performance improvements mainly concern two major components of the view map construction: silhouette edge detection and edge visibility calculation.

Silhouette edge detection finds the feature edges of interest that form the basis of line drawing in Freestyle.  To optimize this data processing for speed, Alex made careful changes to the code base to exploit the compilers’ optimization capabilities.  This modification shortens rendering time regardless of the spatial structure of a given scene.
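To give a rough idea of what this stage computes, here is a minimal, purely illustrative sketch (not Freestyle’s actual code): an edge shared by two faces is a silhouette edge when exactly one of the adjacent faces points toward the camera, i.e. the dot products of the two face normals with the view direction have opposite signs.

```python
# Illustrative sketch, not Freestyle's actual code: an edge shared by two
# faces is a silhouette edge when exactly one adjacent face points toward
# the camera.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def is_silhouette_edge(n1, n2, view_dir):
    """True if the edge between faces with normals n1, n2 is a silhouette.

    Opposite signs of the two dot products mean one face is front-facing
    and the other back-facing with respect to the camera.
    """
    return dot(n1, view_dir) * dot(n2, view_dir) < 0.0

# One face front-facing, one back-facing: a silhouette edge.
print(is_silhouette_edge((0, 0, 1), (0, 0, -1), (0, 0, 1)))   # True
# Both faces front-facing: not a silhouette edge.
print(is_silhouette_edge((0, 0, 1), (0, 0.5, 1), (0, 0, 1)))  # False
```

Since this test runs once per shared edge over the whole mesh, keeping the inner loop simple and branch-light is exactly the kind of change that lets a compiler optimize it well.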

The detected feature edges then undergo the edge visibility calculation, in which an integer value expressing quantitative visibility is assigned to each feature edge.  Edge visibility is determined by a raycasting algorithm with the help of a spatial grid structure that allows efficient traversal of the faces in the 3D scene (i.e., tris and quads) from the viewpoint of a camera.  The non-optimized old view map code relied on a grid data structure implemented under the assumption that the faces were evenly distributed within the 3D bounding box.  In addition, the old code internally created a list of occluders by copying polygons over and over, which slowed down the view map construction.  Alex devised new grid data structures (SphericalGrid for perspective cameras and BoxGrid for orthographic cameras, switched automatically based on the camera type) that do not assume an ideal even distribution of polygons, as well as a heuristic grid density calculation algorithm that adaptively populates the grid structures with polygons.  Moreover, an iterator is employed to obtain the occluders on demand instead of making a local copy of them.
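The two ideas can be sketched as follows.  This is a hypothetical illustration: only the class names SphericalGrid and BoxGrid come from the patch; the fields, the `build_grid` helper, and the placeholder predicate are invented for the example.

```python
# Hypothetical sketch of two ideas from the optimization: picking the grid
# type from the camera, and iterating over candidate occluders lazily
# instead of copying them into a local list.  Only the class names
# SphericalGrid/BoxGrid come from the patch; everything else is invented.

class BoxGrid:          # for orthographic cameras (parallel view rays)
    def __init__(self, faces):
        self.faces = faces

class SphericalGrid:    # for perspective cameras (rays diverge from the eye)
    def __init__(self, faces):
        self.faces = faces

def build_grid(faces, camera_is_perspective):
    grid_cls = SphericalGrid if camera_is_perspective else BoxGrid
    return grid_cls(faces)

def iter_occluders(grid, ray):
    """Yield candidate occluders one at a time; no local copy is made."""
    for face in grid.faces:
        # A real grid would only visit the cells along the ray; here a
        # placeholder predicate stands in for that spatial filtering.
        if face.get("may_occlude", True):
            yield face

grid = build_grid([{"id": 0}, {"id": 1, "may_occlude": False}],
                  camera_is_perspective=True)
hits = [f["id"] for f in iter_occluders(grid, ray=None)]
print(type(grid).__name__, hits)  # SphericalGrid [0]
```

The generator is the key point: raycasting can stop early as soon as enough occluders are found, without ever materializing the full occluder list.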

Based on these internal improvements, two optimized edge visibility calculation algorithms have been added: a “traditional” algorithm that emulates the old visibility algorithms, and a “cumulative” algorithm that provides improved, more consistent line visibility.  Both exploit the new spatial grid data structures for fast raycasting.  Each optimized algorithm comes in two variations: culled and unculled.  The culled variants exclude most of the feature edges outside the image boundary, resulting in much faster visibility calculation at the cost of possible alterations to the results of edge chaining.  The unculled variants take all feature edges into account, which is useful when off-image edges matter for chaining.
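A minimal sketch of what “culled” means in practice (again not Freestyle’s actual code): a feature edge whose 2D projection lies entirely outside the image rectangle is skipped before the expensive visibility raycasting.

```python
# Minimal sketch (not Freestyle's code) of edge culling: an edge whose
# projected endpoints lie entirely outside the image rectangle is skipped
# before the expensive visibility raycasting.

def is_culled(edge_2d, width, height):
    """edge_2d: ((x0, y0), (x1, y1)) in image coordinates."""
    (x0, y0), (x1, y1) = edge_2d
    return (max(x0, x1) < 0 or min(x0, x1) > width or
            max(y0, y1) < 0 or min(y0, y1) > height)

def visibility_candidates(edges_2d, width, height, culled=True):
    if not culled:
        return list(edges_2d)   # unculled: keep everything for chaining
    return [e for e in edges_2d if not is_culled(e, width, height)]

edges = [((10, 10), (50, 60)),    # inside the image
         ((-30, 5), (-10, 20))]   # entirely left of the image
print(len(visibility_candidates(edges, 640, 480, culled=True)))   # 1
print(len(visibility_candidates(edges, 640, 480, culled=False)))  # 2
```

The culled list is what feeds the raycaster, which is why culling speeds things up but can change chaining: the off-image edge is simply never seen by later stages.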

New rendering options

A new option “Raycasting Algorithm” has been added to the Freestyle tab to allow users to choose a raycasting algorithm.  Available choices are:

  • Normal Ray Casting
  • Fast Ray Casting
  • Very Fast Ray Casting
  • Culled Traditional Visibility Detection
  • Unculled Traditional Visibility Detection
  • Culled Cumulative Visibility Detection
  • Unculled Cumulative Visibility Detection

The first three algorithms are those available in the original Freestyle (although the “normal” raycasting was used unconditionally).  The “fast” and “very fast” raycasting algorithms achieve a faster calculation at the cost of reduced visibility accuracy.  The other four are the newly introduced optimized options.  The visibility accuracy of the “traditional” algorithms is the same as that of the “normal” algorithm, while the accuracy of the “cumulative” algorithms is expected to be the same or better.  Performance improvements over the old algorithms depend on the scene being rendered.  The recommended options for branch users are the culled/unculled cumulative visibility algorithms; these two options are intended to replace the other algorithms in the future.
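The difference between “traditional” and “cumulative” visibility can be illustrated with a toy sketch.  This is an invented analogy, not the actual algorithms: each feature edge is split into segments, and raycasting gives each segment an occluder count; a traditional-style pass can depend on the order in which segments are examined, while a cumulative-style pass aggregates over all segments and is order-independent.

```python
# Invented illustration (not the actual algorithms) of why "cumulative"
# visibility is more consistent: aggregating over all of an edge's
# segments is order-independent, while taking the first segment's answer
# is not.

from collections import Counter

def traditional_style(segment_counts):
    # Takes whichever segment happens to come first: order-dependent.
    return segment_counts[0]

def cumulative_style(segment_counts):
    # Majority vote over all segments: same answer in any order.
    return Counter(segment_counts).most_common(1)[0][0]

counts = [1, 0, 0, 0]   # one segment disagrees with the rest
print(traditional_style(counts), traditional_style(counts[::-1]))  # 1 0
print(cumulative_style(counts), cumulative_style(counts[::-1]))    # 0 0
```

The toy “traditional” answer flips when the segments are reversed; the “cumulative” answer does not, which is the consistency property the new algorithm aims for.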

Performance results

To conclude this brief introduction of Alex’s wonderful optimization work, some performance results are shown below using four test scenes.  In all cases, the non-optimized “normal” visibility algorithm in revision 35506 was compared with the culled cumulative visibility algorithm.

Ichiotsu (orthographic)

73448 vertices, 72260 faces, orthographic camera.

                               Normal        Culled cumulative
                               Time [sec]    Time [sec]    Speedup
    Silhouette edge detection  5.062         3.993         1.268
    View map building          14.27         1.126         12.67
    Sum                        19.33         5.119         3.777

Ichiotsu (perspective)

73448 vertices, 72260 faces, perspective camera.

                               Normal        Culled cumulative
                               Time [sec]    Time [sec]    Speedup
    Silhouette edge detection  5.084         4.182         1.216
    View map building          15.98         1.301         12.28
    Sum                        21.06         5.483         3.841

Lily

64011 vertices, 60346 faces, perspective camera.

                               Normal        Culled cumulative
                               Time [sec]    Time [sec]    Speedup
    Silhouette edge detection  17.66         0.4280        41.27
    View map building          13.07         2.422         5.397
    Sum                        30.74         2.850         10.78

A scene by Greg Sandor (from Lighting Challenge #15: Film Noir)

109180 vertices, 189191 faces, perspective camera.

                               Normal        Culled cumulative
                               Time [sec]    Time [sec]    Speedup
    Silhouette edge detection  0.5670        0.3080        1.841
    View map building          2045.0        10.74         190.5
    Sum                        2045.5        11.04         185.2

The first two examples indicate that the new grid data structures for perspective and orthographic cameras have comparable performance.  The third test showed a huge performance gain in silhouette edge detection.  The last, big scene resulted in an incredible speedup in the view map building phase.  Remarkably, these performance results are due solely to Alex’s optimization work.  Again, thank you Alex!  This is really great!

Branch users are encouraged to test the new raycasting algorithms from both performance and artistic perspectives.  Performance reports and rendering results are highly welcome, for example through the Freestyle thread at BlenderArtists.org.

18 Comments »

  1. Fantastic speed up. Thanks very much Alex :)

    Comment by Big Fan — March 18, 2011 @ 7:41 AM

  2. This is Insane!
    Great work developers!

    Comment by Tungee — March 18, 2011 @ 1:18 PM

  3. Amazing..

    Comment by yoff — March 18, 2011 @ 1:30 PM

  4. That’s an amazing work, thanks Alex, and also the rest of the team :D

    Comment by Cloud_GL — March 18, 2011 @ 6:09 PM

  5. You guys ROCK! I tried rendering a while back and this was the weakest section on my big scenes since it took a LONG time to render anything, I just might try this again:)

    Comment by NRK — March 19, 2011 @ 2:26 AM

    • Indeed. That was the reason why I started this patch in the first place. I was curious about Freestyle and gave it a try, but Freestyle took soooo long to render my scene that I was ready to give up on it. But instead of deleting Freestyle I decided to poke around, because I just didn’t believe that Freestyle had to take 15 minutes on an image that the Blender renderer could process in six seconds. Once I started poking around I found things that could be improved.

      The whole point, though, was that I figured a lot of other users out there were giving up on Freestyle because it was just too slow on large scenes (mine was 2,000,000 faces), and those users who stayed with Freestyle were limiting themselves to small scenes for the same reasons. If the speedup makes it possible for you to come back to Freestyle, then I have achieved my goal.

      I love open source!

      Comment by arbeels — March 19, 2011 @ 5:59 AM

  6. Thank you, Tamito, for all your kind words. I read your post on the bus home, and all the other passengers were looking at me thinking: who’s the guy with that stupid smile on his face!

    A few words about the optimization, for artists and for developers:

    For Artists:

    * It is very important that we can confirm that the optimized version of Freestyle produces the same results as the old version of Freestyle. For these purposes, please compare the results of “Normal Ray Casting” (the legacy code you are used to) with the results of “Unculled Traditional Visibility Detection”. The results should be identical.

    * The dev team and I would appreciate any and all feedback you can give about the difference between “traditional” and “cumulative” visibility detection. “Traditional” visibility detection is what you are used to. It has some subtle quirks where the decision about whether an edge is visible or not can depend on the order in which the program looks at the segments that make up the edge. That did not seem reasonable to me, so I came up with “cumulative” visibility detection, which is almost identical to “traditional” but more consistent. The assumption is that Freestyle’s legacy ray casting code was coded incorrectly, and “cumulative” tries to provide the results the original developers intended. It is possible, however, that the original developers got it exactly right, and “cumulative” is a step in the wrong direction. In the end, all that really matters is what gives the artists the best results. It should be difficult to find real-world cases where the two algorithms differ, but if you do find differences, please speak up and let us know which algorithm looks right to you. If artists consistently prefer one over the other, or if no one can tell the difference between the two, then the dev team will probably standardize on only one algorithm some time in the future.

    * Many artists will choose to stay away from the “culled” options, because eliminating outlying edges does change the way that chaining brushes look. Other artists do not use chaining brushes and will appreciate the speedup that culling provides. If artists using non-chaining brushes notice a difference between the output of a “culled” render and its equivalent “unculled” render, please let the dev team know. Finally, I encourage even artists who use chaining brushes to consider rendering with one of the “culled” options when making draft images of a small piece of a large scene. The speedup will improve your creative turnaround time, and you can switch back to “unculled” for the final product.

    For developers:

    * Anyone who wants to help out with optimization is encouraged to take a long, hard look at the silhouette edge detection code. Right now for many scenes this is the biggest computational bottleneck we have. I was able to speed up silhouette edge detection by only 20% using standard techniques. (The huge speedup in Tamito’s “Lily” example is due to a common degenerate case, not to a fundamental change in the way silhouette edge detection works.) I don’t know how much room there is for further optimization, but every bit helps! Silhouette edge detection is begging for a breakthrough.

    * One of the problems with the old visibility detection code was that it was not amenable to multithreading. I have tried to design the optimized visibility detection code with an eye to allowing multithreading. If anyone wants to try implementing multithreaded visibility detection, it’s an instant 2x or 4x speedup! I believe the silhouette edge detection code can also be multithreaded without too much effort.

    * The next frontier for optimization is memory usage. The dev team has already done much work here, but it looks like memory usage could still be cut down considerably. Freestyle builds four overlapping representations of the scene right now. We should be able to get by with only two. Also, there seems to be a longstanding serious memory leak in the mesh loading and/or silhouette edge detection code. Now that visibility detection has gotten faster, the main obstacle to rendering big scenes is memory usage.

    Alex

    Comment by arbeels — March 19, 2011 @ 7:14 AM

  7. Have zero talent with regards to animation, blender, coding (the list goes), but everyday I’m amazed at the things going on, in and around the Blender community!

    cracking work Alex……

    regards

    TFS

    Comment by The Fatsnacker — March 28, 2011 @ 8:30 AM

  8. THANK YOU SO MUCH FREESTYLE TEAM! FREESTYLE ROCKZZZZZZ!!!!

    Animated .GIF, APNG coming soon:

    Yours,

    Ortiz

    Comment by Francisco Ortiz — March 31, 2011 @ 6:28 AM

  9. UPDATE:

    512p, APNG version. You need Firefox or Opera to see this one (Warning: 4mb large image).

    Yours,

    Ortiz

    Comment by Francisco Ortiz — March 31, 2011 @ 6:21 PM

  10. UPDATE 2: Thankx to Max, from the Apng community I compressed the above picture. Exactly the same quality now 2.3mb, Link already updated.

    Yours,

    Ortiz

    Comment by Francisco Ortiz — April 1, 2011 @ 4:17 PM

  11. I’ve got to tip my hat off to you developers. This code is very fine. Even in this still-early state it is already a life saver for me. Especially the new mesh culling and those really fantastic parameters driven line settings are making a real difference in a small project I’m working on. As a small observation, I think it would be a very positive thing if there was some way to show a ‘preview’ of the effect that one would obtain with each of the classic effects. At this time it is always necessary to run a test render in order to have a good idea of what will happen. That burns up a lot of time.

    Comment by Ignatz — April 5, 2011 @ 1:45 PM

  12. Thanks for all your hard work Alex!!! Your contributions to Blender/Freestyle are greatly appreciated!

    Comment by Blenderificus — April 9, 2011 @ 3:03 PM

  13. awesome result!

    Comment by max puliero — April 10, 2011 @ 5:57 AM

  14. WooW

    thankyou guys!

    Comment by max puliero — April 10, 2011 @ 5:58 AM

  15. Hello. First of all I have to say that your job is fantastic! The only thing that I miss in freestyle is the possibility to choose an edge and render it always. It should be fantastic.
    Ciao!

    Comment by Emiliano Bonaccorso — May 1, 2011 @ 5:54 PM

  16. Hi,

    Are there any updates on the project status?

    Thanks,

    Barts

    Comment by barts706 — June 14, 2011 @ 6:58 PM

  17. Hi, umm, I really would appreciate it to get some update report, I’m getting nervous because I always think that when there are no reports then this must mean that developers stopped developing Freestyle. I hope I’m wrong

    Comment by Marcus — July 19, 2011 @ 8:02 PM

