ddansabelenda / doc-clusterizer
DocClusterizer is a Java desktop application that analyzes documents and clusters them by content similarity. It uses the Apache Lucene and Apache Tika libraries to process several file formats, including txt, pdf, docx, and pptx.
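To make "content similarity" concrete, here is a minimal sketch of one common way to compare two documents: cosine similarity over word counts. It is an illustration only and is not taken from the DocClusterizer source. The class name `SimilaritySketch` and its methods are hypothetical, the choice of a bag-of-words cosine measure is an assumption, and the sketch uses plain Java collections instead of Lucene or Tika. It assumes the text has already been extracted from each file.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration; not part of DocClusterizer.
public class SimilaritySketch {

    // Count how often each lowercase word occurs in the text.
    static Map<String, Integer> termFrequencies(String text) {
        Map<String, Integer> counts = new HashMap<>();
        // Split on any run of characters that are neither letters nor digits.
        for (String token : text.toLowerCase().split("[^\\p{L}\\p{Nd}]+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Euclidean length of a term-frequency vector.
    static double norm(Map<String, Integer> vector) {
        double sumOfSquares = 0.0;
        for (int count : vector.values()) {
            sumOfSquares += (double) count * count;
        }
        return Math.sqrt(sumOfSquares);
    }

    // Cosine of the angle between two term-frequency vectors.
    // Returns 0.0 when either document has no words.
    static double cosineSimilarity(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0.0;
        for (Map.Entry<String, Integer> entry : a.entrySet()) {
            dot += (double) entry.getValue() * b.getOrDefault(entry.getKey(), 0);
        }
        double normA = norm(a);
        double normB = norm(b);
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (normA * normB);
    }

    public static void main(String[] args) {
        Map<String, Integer> report = termFrequencies("Quarterly sales report for the sales team");
        Map<String, Integer> summary = termFrequencies("Sales summary for the quarterly review");
        Map<String, Integer> recipe = termFrequencies("Chop onions and simmer gently");

        System.out.println("report vs summary: " + cosineSimilarity(report, summary));
        System.out.println("report vs recipe:  " + cosineSimilarity(report, recipe));
    }
}
```

A clustering step could then group documents whose pairwise similarity is above a chosen threshold. How DocClusterizer itself scores and groups documents is not stated here.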