
simhash_similarity

Project

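Text similarity based on SimHash: each text is reduced to a list of weighted keywords, the keywords are hashed into a 32-bit fingerprint, and two texts are compared by the Hamming distance between their fingerprints.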

About text similarity

https://cloud.tencent.com/developer/article/1389446
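A minimal sketch of the fingerprint step, for readers who want to see the idea in code. It is not this project's implementation: the `WordWeight` type, the `fingerprint` function, and the choice of FNV-1a as the per-word hash are all assumptions made for illustration. Only the 32-bit width and the `{word weight}` input shape are taken from the test output.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// WordWeight pairs a keyword with its weight. Hypothetical type that mirrors
// the {word weight} pairs printed by the test.
type WordWeight struct {
	Word   string
	Weight float64
}

// fingerprintBits matches the 32-element fingerprints in the test output.
const fingerprintBits = 32

// fingerprint builds a SimHash-style fingerprint as a slice of 0/1 values.
// FNV-1a is an assumption here; the project may use a different hash.
func fingerprint(words []WordWeight) []int {
	var acc [fingerprintBits]float64
	for _, w := range words {
		h := fnv.New32a()
		h.Write([]byte(w.Word))
		sum := h.Sum32()
		// Each hash bit votes for (+weight) or against (-weight) its position.
		for i := 0; i < fingerprintBits; i++ {
			if sum&(1<<uint(i)) != 0 {
				acc[i] += w.Weight
			} else {
				acc[i] -= w.Weight
			}
		}
	}
	// A position ends up 1 only if the weighted votes are positive.
	bits := make([]int, fingerprintBits)
	for i, v := range acc {
		if v > 0 {
			bits[i] = 1
		}
	}
	return bits
}

func main() {
	words := []WordWeight{
		{"区块链", 58.69602153541771},
		{"货币", 42.49228769},
		{"分布式", 31.1229513822},
	}
	fmt.Println(fingerprint(words))
}
```

Because every bit is decided by a weighted vote, heavily weighted keywords dominate the fingerprint, so texts that share their most important keywords tend to produce fingerprints that differ in only a few positions.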

How to test

$ go test -v -test.run TestSimHashSimilar

=== RUN TestSimHashSimilar

srcWordsWeight: [{区块链 58.69602153541771} {货币 42.49228769} {分布式 31.1229513822} {比特 30.7892744766} {密码学 26.4150609428} {数字 25.9012790598} {虚拟 25.18603834812} {数据结构 21.285162228} {链式 21.066763644} {利用 20.01581093792} {方式 19.20596931748} {数据 19.12542671356} {顺序 15.15263737918} {来讲 14.73377762808} {基础架构 12.8020653633} {一种 12.394908587969999} {2009 11.739204307083542} {中本聪 11.739204307083542} {最早 11.7133864415} {区块 11.5027823792} {保证 11.32904398058} {不可 10.95866957244} {数据传输 10.604840786} {账本 10.1871055853} {组合成 10.0720362555} {以太 10.0505300503} {篡改 9.96885201925} {莱特 9.87532596124} {编程 9.84023464143} {发明者 9.69598503258}]

dstWordsWeight: [{区块链 58.69602153541771} {篡改 29.906556057750002} {数据 19.12542671356} {技术 18.87782871428} {节点 18.29417492174} {信息 15.76158207831} {金融 15.752110047990001} {缺陷 15.05813160948} {交易 14.735571450600002} {互联网 14.24378550858} {信任 14.15260796386} {痛点 12.8020653633} {中心化 12.8020653633} {有三大 11.739204307083542} {区块 11.5027823792} {假冒伪劣 11.2616203224} {不可 10.95866957244} {解决目前 10.9049453784} {领域 10.82459108482} {大有裨益 10.604840786} {分布式 10.3743171274} {银团 10.3171587135} {工业 10.1355963834} {记账 10.1164880181} {资产 10.08319666818} {内置 9.96885201925} {数据链 9.89334446674} {生命周期 9.72629038208} {金融业务 9.62401153296} {讲课 9.61021821083}]

srcWords:[{区块链 58.69602153541771} {货币 42.49228769} {分布式 31.1229513822} {比特 30.7892744766} {密码学 26.4150609428} {数字 25.9012790598} {虚拟 25.18603834812} {数据结构 21.285162228} {链式 21.066763644} {利用 20.01581093792} {方式 19.20596931748} {数据 19.12542671356} {顺序 15.15263737918} {来讲 14.73377762808} {基础架构 12.8020653633} {一种 12.394908587969999} {2009 11.739204307083542} {中本聪 11.739204307083542} {最早 11.7133864415} {区块 11.5027823792} {保证 11.32904398058} {不可 10.95866957244} {数据传输 10.604840786} {账本 10.1871055853} {组合成 10.0720362555} {以太 10.0505300503} {篡改 9.96885201925} {莱特 9.87532596124} {编程 9.84023464143} {发明者 9.69598503258}]

dstWords:[{区块链 58.69602153541771} {篡改 29.906556057750002} {数据 19.12542671356} {技术 18.87782871428} {节点 18.29417492174} {信息 15.76158207831} {金融 15.752110047990001} {缺陷 15.05813160948} {交易 14.735571450600002} {互联网 14.24378550858} {信任 14.15260796386} {痛点 12.8020653633} {中心化 12.8020653633} {有三大 11.739204307083542} {区块 11.5027823792} {假冒伪劣 11.2616203224} {不可 10.95866957244} {解决目前 10.9049453784} {领域 10.82459108482} {大有裨益 10.604840786} {分布式 10.3743171274} {银团 10.3171587135} {工业 10.1355963834} {记账 10.1164880181} {资产 10.08319666818} {内置 9.96885201925} {数据链 9.89334446674} {生命周期 9.72629038208} {金融业务 9.62401153296} {讲课 9.61021821083}]

srcFingerPrint: [1 0 1 1 0 1 1 0 0 1 1 1 1 0 1 0 1 0 0 1 0 1 1 0 0 0 0 0 0 1 0 1]

dstFingerPrint: [1 0 1 1 0 1 0 1 0 1 1 0 1 0 0 0 1 1 0 1 1 1 1 0 0 0 1 1 0 1 0 1]

--- PASS: TestSimHashSimilar (0.85s)

similarity_test.go:56: SimHashSimilar distance: 8

PASS
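The reported distance can be checked by hand: counting the positions where `srcFingerPrint` and `dstFingerPrint` differ gives 8, consistent with a Hamming distance. The sketch below does that count on the two fingerprints copied from the output above. The function name `hammingDistance` is hypothetical and is not taken from the project.

```go
package main

import "fmt"

// Fingerprints copied verbatim from the test output.
var srcFingerPrint = []int{1, 0, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 1, 0, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1}
var dstFingerPrint = []int{1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0, 1, 1, 0, 1, 0, 1}

// hammingDistance counts the positions at which two equal-length bit slices
// differ. It returns -1 when the lengths do not match.
func hammingDistance(a, b []int) int {
	if len(a) != len(b) {
		return -1
	}
	d := 0
	for i := range a {
		if a[i] != b[i] {
			d++
		}
	}
	return d
}

func main() {
	fmt.Println(hammingDistance(srcFingerPrint, dstFingerPrint)) // prints 8
}
```

A smaller distance means the two texts are more alike; identical fingerprints give a distance of 0.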

Contributors

netkiddy
