John Ratcliff's Code Suppository

A place where I insert my code into the anus of the Internet.

 

Friday, December 09, 2011

New Implementation of the HACD Algorithm : Julio Jerez


So, it turns out that Julio Jerez, author of the Newton Physics Engine, did his own implementation of Khaled Mamou's HACD algorithm (hierarchical approximate convex decomposition).

The big problem with Khaled's version is that it was extremely slow due to how he went about performing a distance calculation.  Julio's implementation does not suffer from this, and is much, much faster.

In addition, I added my own merge-hulls code as an optional post process step.

You can download the new version from this Google Code Project page.

Once you are synced you can try running say the complex segmented bug model.

Run: TestHACD hornbug.obj

This will produce a 58 convex hull model saved out to the file 'ConvexDecomposition.obj'

If you want one that is even more precision you can try something like:

TestHACD hornbug.obj -c 0.1

This sets the concavity value to 0.1 from the default of 0.2.

This will produce 73 convex hulls.

However, if you want to run the post-process merge phase, you can reduce the number of hulls by the following.

Run: TestHACD hornbug.obj -m 40

That will recursively collapse the convex hulls (starting with the smallest to the largest) until there are no more than 40 hulls left.

Take a look at the resulting file 'ConvexDecomposition.obj' and you will see how nicely the convex hulls were collapsed.

This version is *extremely* fast and can be used even on meshes with high vert counts without taking forever.

While, in general, Julio's implementation seems to be working great, we have found a couple of edge cases it fails on.

In the project you will find two wavefront OBJ files, 'box-thick.obj' and 'cylinder.obj'.

'Box-thick.obj' is a hollowed out box, which Khaled's implementation handles correctly, but Julio's fails on.  Same thing for the cylinder.

Hopefully we can get this tracked down soon. 

To use this source code in your own project, just copy the sources from the sub-directory 'hacd'.  The rest of your application only needs access to the folder 'hacd\public'.

Modify the header file 'PlatformConfigHACD.h' to change the global memory allocation routines if you want them to vector to your own methods.

11 Comments:

  • At 3:51 AM, Blogger Foxtox said…

    I'm using the convex hull generation available in the bullet physics library, it seems fairly fast but I've never compared it against any other options.

    Any idea how yours compares to bullets?

     
  • At 4:26 AM, Blogger Foxtox said…

    Also does this algorithm generate conservative meshes(contained within the original mesh)? I assume not, but is there any chance it could be modified to do so?

     
  • At 8:34 AM, Blogger John said…

    The convex hull implementation is plenty fast. I haven't compared it head to head with the Bullet version, but I have no reason to assume it's particularly slow. Also, to be clear, it's not 'mine' but written by Julio Jerez. I just repackaged it here.

    Not sure what you mean by 'conservative' meshes. Do you mean something like a skin-width that controls whether the convex hulls is exactly on top of the source mesh, slightly inside, or slightly outside?

    This can be accomplished by modifying the convex hull slightly if its something you want as a result.

    John

     
  • At 12:02 PM, Blogger Khaled said…

    Hi John,
    Thanks for sharing this re-factored version of Julio Jerez's code.

    Please, would it be possible to give more details about the part of the algorithm that were optimized or modified?
    --Khaled

     
  • At 12:46 PM, Blogger John said…

    Khaled,

    You would have to talk with Julio to get the details. What happened is that Julio's first implementation did not do *any* raycasting, which made it much faster than your version.

    However, we then found that it failed on some edge cases, like a hollow box, or hose. So, Julio had to refactor his code to take these edge cases into account, and in doing so he had to introduce raycasting into the system, thus making his go a lot slower.

    What I've asked him to do now is make two versions. The 'fast' version without raycasting (but fails on hollow shapes) and then, of course, the slower version which takes the edge conditions into account.

    In addition, I have also added support for my legacy ACD implementation because, if you only want a very quick and dirty crude approximation, mine works fine and it much, much, faster. Especially with my current implementation.

    A lot of the reasons that Julio's is faster is not simply algorithmic, but rather because he just writes faster code. Coming from the game industry Julio is used to writing extremely high performance code and, for example, his winged edge data structure allows him to build convex hulls and iterate on meshes extremely quickly.

    I'll post an update when I get a new version which supports the 'fast-path'.

    Currently the tools guys at NVIDIA are working a version of their plugin which supports both your version and Julio's version as an option.

    My work is much lower-level where I have a lot greater code restrictions.

    John

     
  • At 1:33 PM, Blogger Julio Jerez said…

    hi: John and Khaled
    I can offered some of the explanations you want.
    If you remember back in may 2011 you registered on my forum and we were discussing your algorithm
    and some of the issues I had with it. I proposed to you to work together on an implementation.

    http://newtondynamics.com/forum/viewtopic.php?f=9&t=4607&p=47305&hilit=Khaled#p47305

    we continue the conversation off line but we only got one or two email. I do not know if I did something wrong and I offended you because you never came back, therere and I abandoned the project. It was until Ratcliff mention to me that he was replacing his ACD with HACD which generate better decomposition but that it was too slow, and he would like to be more optimized.

    I mention to him that I had a version that almost works in linear time with the number of faces and he decided to try it out.
    I also mention these concerns but that I have the solutions, it was just that I had not work on theme.

    Anyway, back them if you remember we were talking about the deep concavity problem.
    but that was not the only problem I had with it.

    the second problem was that you metric is not a monotonic function, so using in a priority heap in steepest descend or hill climbing faction leads to unexpected results with difficult clusters.

    The third problem the metric does not takes into account the balance of the cluster.

    I will explain these two and the solution, and you will see that it is not the ray cast that make it slower. In fact the ray cast part is totally negligible.
    basically the ray cats part time complexity is a true (N * log (N)) where N is the number of faces in the cluster. you can deduce this as follow.
    To find the back facing face you shot a number of rays per face. each ray use a bounding box hierarchical with for perfect balance AABB the time complexity is log(n).
    Newton's BVH are optimally balance in the same sense that a RedBlack tree is optimally balanced, in fact the algorithm is inspired by the same technique used to balance Red Black trees.

    Ok I will go over the problem with perimeter/Area metrics.

    your concavity metrics has two components, at least it had too component when your published the paper.

    Tet us write an expression here so that we are all on the same page.

    c = d + r;

    c = the concavity of a cluster,
    d = a distance concavity.
    r = a perimeter square over area of the cluster area.

    d is measured as the maximum distance from the front teh positive side of a face to the interstion of the convex Hull.
    this is a Monotonic function in the sense that for tow clusters the distance concavity if the combined cluster is either equal or larger that the individual concavity of each sub cluster.
    because this is a monotonic function this function can be uses on any steeps decent algorithm and it will lead to an optimal function.
    That part of the metric has linear dimensions which is also very nice.

    continue...

     
  • At 1:35 PM, Blogger Julio Jerez said…

    It is the second part of the metric that is problematic.
    the second part is given by the ration of the squared of perimeter of a cluster and area of the cluster.
    This part is dimensionless, to make it has linear dimension you propose weigh factor one tenth of the ratio, but it could be anything.
    this explain why the algorithm is not scale invariant, but that can also be solve by pre scaling the mesh into a unit box. so that is not a real big problem.

    however the metric is not a monotonic function and that is a real problem, to see that the metric is not monotonic we can
    you can simple write the symbolic ratio concavity of two cluster of one face each.
    Ignoring the constant this can be written like this:

    for two triangles sharing and edge
    for each triangle r = (3 * L)^2 / (L^2) = 9
    the combined concavity is r = (2 * L + 2 *L)^2 / 2 * (L^2) = 8
    concavity goes down

    now let us use two regular quad that share one edge and calculate the ratio concavity
    for one quad r = (4 * L) ^2 / (L^2) = 16
    the combined concavity is r = (3 * L + 3 *L)^2 / 2 * (L^2) = 36 / 2 = 19
    concavity goes up.

    you can see how two identical configuration produces point of inflextion on the slope of the function.
    think for example these faces were part of shape that was convex to begin with.

    if the shape is convex we know that each intermediate subs cluster will have a zero distance concavity.
    this mean the presses will be control only by the ratio concavity.
    we also know the final decomposition should be the convex hull of the shape.
    what happen is the as cluster are merged some new merge cluster will lower the concavity but some merge will actually increased (this is what the algorithm expect).
    Potential pairs that increase concavity will do down on the priority heap let it merge only the cluster with lower metric.
    Are some point the algorithm will be left with a number of cluster that if merge they will increase concavity above the user specified value, and it will terminate with an array of cluster by no necessarily the best solution. Basically the hill climbing terminate on a local minimal.

    The solution to local minimal on hill climb problems is Anelining.

    Basically Anelining is a process that allow the hill climb to take a step that increases the entropy but it only validate that step if the result of that step when taking another step lower the entropy at some point.

    This is a recursive process because the combined of the combined can also increase entropy and the next combine lowered, and so on.
    As you can see adding this to the algorithm turn it into an all Cluster build combination and that is why is become slower.

     
  • At 1:38 PM, Blogger Julio Jerez said…

    The way I implemented this was by making a bottom up hierarchical tree.

    The algorithm start with each face being a leaf node. each time it merge two nodes it checks if the combined concavity is higher than the individual concavity.
    if it is it mean that the parent node is a lower resolution version of the two children.
    the process in control by the priory merge loop.

    the loop run until all pairs are merged. at some point the concavity of a merge cluster will have lower equal or lower concavity than the two children and if this happens the children are deleted.

    the algorithm terminate with a tree with one root, which is the convex hull of the mesh,
    the two children are the tow best partition of, and so on.

    so to get any number of partition all we need to do is to find the N node that fall bellow some criteria,
    can be a concavity threshold of a number of Hulls.

    as you can see if we run the algorithm of say a perfect sphere, at some point it will have two node each with a number of child sub nodes, the combined of this tow nodes will delete the two children because the combined cluster will have zero concavity. so the algorithm terminate with a tree with one node.

    This is what make my variation generate very niece partition, at the expense of doing more work.


    and I was telling Ratcliff there are ton of way to make it better but simple making so simple observations.

    the first observation is that that the last stage is more expensive that all previous stages combined.
    the is explained because the last stage will calculate the concavity of all faces in teh mesh again teh more complex hulls.

    take for example a sphere.

    the last two nodes will have a combine with proximally large the number of faces than the final hull,
    and will also have clusters with about half the number of faces.

    so the concavity of the last cluster is proportional to N * N where N is the number of face in the mesh
    the concavity of the children is proportional to (N / 2 * N /2 ) = N * N / 4

    but there are two children so it is N ^ 2 / 2

    if we calculate this regression we can see that the time complexity of building all the children of the root is approximately equal to the time complexity of build the root alone.

    however we know that the solution is the convex Hull, so we can simple terminate when the algorithm have the two last nodes and combine then unconditionally under one root node.

     
  • At 1:41 PM, Blogger Julio Jerez said…

    this can be extended or generalized as follows.
    if we keep the face on a cluster sorted by increasing linear concavity, can bail out as soon as the concavity is worse than some upper bound.
    This can be justified because as I was saying in the beginning, the linear concavity is a monotonic function, therefore if we have the concavity of two clusters with a bad concavity, we know that the combined concavity cannot be lower therefore with can terminated if we take the faces sorted by their last known concavity and we start we test those face first. We keep testing as long and the new concavity do not become much worse. The worse that can happen is that for example two cluster can have a two bad concavity that do no increases but the last face is the one that become the larger offender. in the case it test all the faces, but that is not the norm, in more cases a bad face will because a worse face and we can use that value as a reprehensive concavity.

    This should reduce the time complexity by about half order of magnitude;

    I should point out than even with building all clusters, my implementation of your algorithm is still faster than your implementation, and this is simply because the intermediate data structures that I use are more efficient than the one you use.

    There are other improvement I added to the metric, like an Balance cost, but I can explain that on another post if you are interested.

    I will be glad if you at least check it out and, this is your signature work afterward.
    those were the things I was think back where I proposed to you to work together.

    I hope my poor English do not get in the way of what I was trying to explain.

     
  • At 10:34 AM, Blogger Khaled said…

    Hi Julio,
    Thanks for the details. I will need to read carefully all this.
    I think there is a big misunderstanding. Probably you have not received the email I sent you the on the 25th of May (“Great! Just register on sourceforge and send me your login.”) Or maybe I haven’t received your replay.
    I would be glad to work with you and with anybody willing to provide opensource code/tools/libraries.
    I hope this will clear up things.

    Cheers,
    --Khaled

     
  • At 5:49 PM, Blogger carl can said…

    This is a good post. This post give truly quality information.I’m definitely going to look into it.Really very useful tips are provided here.thank you so much.Keep up the good works. krk realty

     

Post a Comment

<< Home