Coder Social home page Coder Social logo

Error with Alpha filtration about cechmate HOT 6 OPEN

scikit-tda avatar scikit-tda commented on May 27, 2024
Error with Alpha filtration

from cechmate.

Comments (6)

ctralie avatar ctralie commented on May 27, 2024 1

from cechmate.

ctralie avatar ctralie commented on May 27, 2024

from cechmate.

ctralie avatar ctralie commented on May 27, 2024

Hello Kiran,
My apologies for the incredibly long lag on this. I suspect you have moved onto other things now, but I still want to fix this, and also to add the weighted capabilities that you opened in the other pull request (the code there is good overall, but I want to try to address some potential numerical stability issues with your implementation). I finally have a little bit of time, so I'm going to try to knock this out over the next couple of weeks.

First, starting with this issue, I noticed that the points are not in general position; they must have been sampled from some sort of grid.

Points

So this is causing numerical problems both with our library and with GUDHI actually (which you may have already noticed).
One way to deal with this problem is to perturb the points slightly. Here's some code I wrote to do this:

import numpy as np
import matplotlib.pyplot as plt
import cechmate as cm
import gudhi
from persim import plot_diagrams

def convertGUDHIPD(pers):
    Is = [[], [], []]
    for i in range(len(pers)):
        (dim, (b, d)) = pers[i]
        Is[dim].append([b, d])
    return [np.sqrt(np.array(I)) for I in Is]

def alphaBug():
    np.random.seed(0)
    X = np.loadtxt("points.txt")
    XP = X + 0.01*np.random.randn(X.shape[0], X.shape[1])

    alpha = cm.Alpha()
    alpha_filtration = alpha.build(XP)
    dgmsalpha = alpha.diagrams(alpha_filtration)

    alpha_complex = gudhi.AlphaComplex(points=XP)
    simplex_tree = alpha_complex.create_simplex_tree()
    dgmsalpha2 = simplex_tree.persistence()
    dgmsalpha2 = convertGUDHIPD(dgmsalpha2)

    plt.figure(figsize=(8, 4))
    plt.subplot(121)
    if len(dgmsalpha) > 0:
        plot_diagrams(dgmsalpha)
    plt.title("Cechmate")
    plt.subplot(122)
    plot_diagrams(dgmsalpha2)
    plt.xlim([-1, 6])
    plt.ylim([-1, 6])
    plt.title("GUDHI")
    plt.tight_layout()
    plt.savefig("Perturbed.png", bbox_inches='tight')

alphaBug()

When I run this code, I get a completely satisfactory answer that agrees with GUDHI
Perturbed

If I go back and run both cechmat and GUDHI on the original point cloud, GUDHI also has some issues. In particular, it has some classes that are born in H1 and H2 very late but which die almost instantly
Original

if I zoom into the different range, though, the rest of the diagram looks fine:
Original_Zoom
So overall, GUDHI definitely handles this more gracefully, but it still has some strange numerical issues.

Anyway, this problem could be pretty difficult to resolve, but I would like to figure it out. I noticed that this example also throws Cech into an infinite loop!

For now, if you want to use alpha, you should just add some random noise to this dataset. But that's clearly not a satisfactory answer in general

from cechmate.

ctralie avatar ctralie commented on May 27, 2024

Hello @kiranvad,
Here's an update after a lot of work today.

First of all, I found a bug with the simplex condition (which was the first problem you noticed)/

But also, it's more and more clear that the points you provided are not in general position. I verified that a lot of the tetrahedra it tries to add into the filtration are numerically flat. Here's one, for example:
DegenerateTetrahedron

Bearing this in mind, I did a few things to improve numerical precision. The biggest help was to normalize the simplices before computing the circumcenter, and then to scale back at the end

scaleSqr = np.max(np.sum(Y**2, 1))

I also avoid adding a simplex if it's numerically flat after this normalization.

Here's what happens for slightly perturbed points (same as before)
Bugfix_Perturbed

Here's the result for both of the libraries now (notice how cechmate is not crashing anymore
Bugfix

Zooming in, though, there's a problem I didn't notice before with GUDHI, and there's a different problem with cechmate
Bugfix_Zoom
Cechmate has two extra points of non-negligible persistence in H1, while GUDHI has three extra points of non-negligible persistence in H2. So it seems we're on par with GUDHI now in the ways that we both mess up.

I still want to leave this issue open, though, because there are definitely more elegant ways of dealing with this. I want to see if I can use "Simulations of Simplicity" when computing circumradii, which is a more systematic way of perturbing points with theoretical guarantees
https://arxiv.org/pdf/math/9410209.pdf
For now, if you want to use either cechmate or GUDHI, you should be aware of this issue, and you should perturb your points slightly manually.

from cechmate.

kiranvad avatar kiranvad commented on May 27, 2024

Hi Chris,
Thanks a lot for your response. This is really helpful.
It is interesting that it adds flat tetrahedron simplices although the point cloud I had this issue with has a nice geometric structure to it.
Perturbing points makes sense for the data I was using for this work. That's a physically meaningful solution.

from cechmate.

ctralie avatar ctralie commented on May 27, 2024

from cechmate.

Related Issues (10)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.