SHA objects (sha.jl issue, open, 7 comments)

juliacrypto commented on July 17, 2024
SHA objects

from sha.jl.

Comments (7)

simonbyrne commented on July 17, 2024

I think the main advantages are:

  1. It adds semantic information: I know the bytes correspond to the output of a specific hash. This is helpful when variables are just named commit_hash.
  2. It can be a bitstype, which has some performance advantages, and makes it easier for C interop.
  3. There are many cases where hashes are passed as hexadecimal strings: having dedicated hash objects makes the conversions easier, e.g.
    • the various registry packages (Registrator.jl, RegistryTools.jl, RegistryCI.jl) all represent hashes as strings, as that is how they are represented in the TOML files.
    • interfacing with git: it's helpful to be able to do
      hash = SHA1("...")
      run(`git checkout $hash`)
      and have it work as expected
  4. It appears to be what people do anyway, but we end up with the same thing defined in multiple places: I didn't realize Base had an SHA1 type, but LibGit2 should use this rather than define its own GitHash type. Similarly, GitHub.jl should use this instead of String, etc.
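
The points above can be illustrated with a minimal sketch. `DemoSHA1` is a hypothetical type invented for this example (it is not `Base.SHA1`, `LibGit2.GitHash`, or anything exported by SHA.jl). It stores the digest as a fixed-size tuple so the type is a bits type, parses hexadecimal strings, and prints as hex so that interpolating it into a command produces the expected argument.

```julia
# Hypothetical digest type, for illustration only.
struct DemoSHA1
    bytes::NTuple{20, UInt8}
end

# Parse a 40-character hexadecimal string, as found in TOML files or git output.
# `hex2bytes` throws an ArgumentError if the string contains non-hex characters.
function DemoSHA1(s::AbstractString)
    length(s) == 40 || throw(ArgumentError("expected 40 hex characters, got $(length(s))"))
    return DemoSHA1(Tuple(hex2bytes(s)))
end

# Printing as hex makes `string(h)` and command interpolation yield the hex form.
Base.print(io::IO, h::DemoSHA1) = print(io, bytes2hex(collect(h.bytes)))

h = DemoSHA1("da39a3ee5e6b4b0d3255bfef95601890afd80709")

# The command is only constructed here, not run.
cmd = `git checkout $h`

println(isbitstype(DemoSHA1))  # the tuple field keeps the type a plain bits type
println(string(h))
println(cmd)
```

Because the field is an `NTuple` rather than a `Vector{UInt8}`, the object has a fixed memory layout, which is what the C interop point relies on.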


simonbyrne commented on July 17, 2024

See GitHash object in LibGit2:
https://github.com/JuliaLang/julia/blob/972f55feedd12b6549002604fb08fde5206bfe37/stdlib/LibGit2/src/types.jl#L13-L24


inkydragon commented on July 17, 2024

A quick proof-of-concept implementation:

If we really need this, I'd like to add it to Base and have SHA, MD5, CRC32, GitHash... all reuse this code.

hash_obj.jl

# SPDX-License-Identifier: MIT
abstract type AbstractHash end

"""
    HashBytes{N}

A hash object identifier, stored as an `N`-byte string.
"""
struct HashBytes{N} <: AbstractHash
    val::NTuple{N, UInt8}
    HashBytes(val::NTuple{N, UInt8}) where N = new{N}(val)
end

HashBytes{N}() where N = HashBytes(ntuple(i->zero(UInt8), N))
HashBytes(h::HashBytes) = h
function HashBytes{N}(u8::Vector{UInt8}) where N
    # Reject inputs whose length does not match the hash size.
    N == length(u8) || throw(ArgumentError("expected $N bytes, got $(length(u8))"))
    HashBytes(ntuple(idx->u8[idx], N))
end
HashBytes(s::AbstractString) = error("not impl")


# Display the hash as a lowercase hexadecimal string.
function Base.show(io::IO, hash_bytes::HashBytes{N}) where N
    hash = bytes2hex(collect(hash_bytes.val))
    print(io, "HashBytes{$N}($(repr(hash)))")
end



# ==== Generate Hash Type Definitions for All SHA Types
# Examples:
#   const Sha1Hash = HashBytes{20}
#   const Sha3_512Hash = HashBytes{64}
using SHA
for (sha_prefix, sha_type) in [(:Sha1, :SHA1_CTX),
                 (:Sha224, :SHA224_CTX),
                 (:Sha256, :SHA256_CTX),
                 (:Sha384, :SHA384_CTX),
                 (:Sha512, :SHA512_CTX),
                 (:Sha2_224, :SHA2_224_CTX),
                 (:Sha2_256, :SHA2_256_CTX),
                 (:Sha2_384, :SHA2_384_CTX),
                 (:Sha2_512, :SHA2_512_CTX),
                 (:Sha3_224, :SHA3_224_CTX),
                 (:Sha3_256, :SHA3_256_CTX),
                 (:Sha3_384, :SHA3_384_CTX),
                 (:Sha3_512, :SHA3_512_CTX)]
    hashsha_type = Symbol(sha_prefix, :Hash)
    # Define the alias directly so that no stray global is left behind.
    @eval const $(hashsha_type) = HashBytes{SHA.digestlen($sha_type)}
end


# ---- examples:
Sha1Hash(sha1(""))
Sha3_256Hash(sha3_256(""))
Sha3_512Hash(sha3_512(""))

Example output:

julia> Sha1Hash(sha1(""))
HashBytes{20}("da39a3ee5e6b4b0d3255bfef95601890afd80709")

julia> Sha3_256Hash(sha3_256(""))
HashBytes{32}("a7ffc6f8bf1ed76651c14756a061d662f580ff4de43b49fa82d80a4b80f8434a")

julia> Sha3_512Hash(sha3_512(""))
HashBytes{64}("a69f73cca23a9ac5c8b567dc185a756e97c982164fe25859e0d1dcc1475c80a615b2123af1f5f94c11e3e9402c3ac558f500199d95b6d3e301758586281dcd26")
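
The string constructor is left as "not impl" in the proof of concept. One way it might be filled in is sketched below, using `hex2bytes` from Base. `DemoHash` is a stand-in for `HashBytes`, renamed so that the block is self-contained, and `hexstring` is a helper name invented here; neither is part of the proposal itself.

```julia
# Stand-in for the `HashBytes{N}` type, renamed to keep this block self-contained.
struct DemoHash{N}
    val::NTuple{N, UInt8}
end

# Build a hash object from a hexadecimal string of any even length.
# `hex2bytes` throws an ArgumentError for odd lengths or non-hex characters.
# N is only known at run time here, so this method is not type-stable.
DemoHash(s::AbstractString) = DemoHash(Tuple(hex2bytes(s)))

# Variant that also checks the expected length, e.g. DemoHash{20}("da39...").
function DemoHash{N}(s::AbstractString) where N
    bytes = hex2bytes(s)
    length(bytes) == N ||
        throw(ArgumentError("expected $(2N) hex characters, got $(length(s))"))
    return DemoHash{N}(ntuple(i -> bytes[i], N))
end

# Hypothetical helper: convert back to a lowercase hexadecimal string.
hexstring(h::DemoHash) = bytes2hex(collect(h.val))

h = DemoHash("da39a3ee5e6b4b0d3255bfef95601890afd80709")
println(typeof(h))
println(hexstring(h))
```

The length-checked form is probably the one to prefer in practice, since the caller usually knows which hash it expects and the result type is then fixed.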


staticfloat commented on July 17, 2024

> It would be helpful if for each hash there was an object representing a hash (e.g. SHA1, SHA256 etc), similar to UUID

Can you explain a bit more about what you want and why it would be useful? I have heard strong arguments both for hashes being objects, and for hash contexts being objects, but the hashes themselves being just arrays of bytes. I'd like to hear your argument for why it's better that they are their own objects.

If it's just for dispatch, I think a higher-level package like AbstractHashing or something similar may be a better fit for these kinds of concerns. I myself wanted something that lives higher level than SHA.jl (and can work with MD5 and whatnot) so I wrote this mini package to make dealing with different hashes easier. You can then constrain things to only take a certain hash type via snippets like this.
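
The linked package and snippet are not reproduced here. As a rough idea of what constraining by hash type through dispatch could look like, here is a minimal sketch; every type and function name in it (`DemoDigest`, `DemoSHA1Digest`, `DemoSHA256Digest`, `matches`, `verify_tarball`) is invented for illustration and is not taken from that package. Only `sha1` and `sha256` come from the SHA standard library.

```julia
using SHA  # standard library

# Hypothetical wrapper types; the names are invented for this sketch.
abstract type DemoDigest end

struct DemoSHA1Digest <: DemoDigest
    bytes::Vector{UInt8}
end

struct DemoSHA256Digest <: DemoDigest
    bytes::Vector{UInt8}
end

# Dispatch selects the hash function from the type of the expected digest.
matches(data, expected::DemoSHA1Digest) = sha1(data) == expected.bytes
matches(data, expected::DemoSHA256Digest) = sha256(data) == expected.bytes

# A function can restrict its argument to one specific hash type.
verify_tarball(data, expected::DemoSHA256Digest) = matches(data, expected)

println(matches("", DemoSHA1Digest(sha1(""))))
println(verify_tarball("abc", DemoSHA256Digest(sha256("abc"))))
```

Passing a `DemoSHA1Digest` to `verify_tarball` raises a `MethodError`, which is the kind of constraint a plain `Vector{UInt8}` cannot express.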


staticfloat commented on July 17, 2024

Yes, so my main point would be that we probably want an AbstractHashType that is more than just SHA hashes, and then we have two options for implementation:

  1. Bottom-up: define AbstractHashType in some bare-bones package; packages like MD5.jl can then define their types as inheriting from the abstract type and get all the goodness defined in the abstract package's generic methods.

  2. Top-down: create an AbstractHashes package that imports SHA.jl, MD5.jl, and every other hash type, then defines the shared functionality right there in terms of the things it has imported.

I think the bottom-up organization is better, but I don't think we want AbstractHashType to be tied to Julia releases as a stdlib. So perhaps the best way forward is to have a kind of middle ground, where AbstractHashes.jl is meant to be a bottom-up package, but it includes functionality for SHA.jl since it knows that will always be a part of your environment?
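
A minimal sketch of the bottom-up layout, collapsed into one file for brevity. The comments mark which hypothetical package each part would live in. The names `DemoAbstractHashType`, `rawbytes`, `hexdigest`, and `DemoMD5Hash` are assumptions made for this example, not an existing API.

```julia
# --- would live in the bare-bones abstract package (hypothetical) ---
abstract type DemoAbstractHashType end

# Interface: each concrete hash type returns its raw bytes.
function rawbytes end

# Generic functionality, written once against the interface.
hexdigest(h::DemoAbstractHashType) = bytes2hex(collect(rawbytes(h)))
Base.show(io::IO, h::DemoAbstractHashType) =
    print(io, nameof(typeof(h)), "(\"", hexdigest(h), "\")")

# --- would live in a downstream package such as MD5.jl (hypothetical type) ---
struct DemoMD5Hash <: DemoAbstractHashType
    bytes::NTuple{16, UInt8}
end
rawbytes(h::DemoMD5Hash) = h.bytes

# Placeholder bytes 0x01 through 0x10, not a real MD5 digest.
h = DemoMD5Hash(ntuple(i -> UInt8(i), 16))
println(hexdigest(h))
println(h)
```

The downstream package only subtypes and implements `rawbytes`; the hex conversion and display come from the abstract package's generic methods.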


simonbyrne commented on July 17, 2024

Possibly? This is complicated somewhat by the fact that SHA1 is defined in Base: ideally that would use the same machinery, otherwise we end up with multiple implementations again.

What if we add AbstractHashType and SHA1Hash (and make Base.SHA1 an alias) in Base, adding them to Compat.jl for existing releases, and add the remaining hash objects here?


staticfloat commented on July 17, 2024

> What if we add AbstractHashType and SHA1Hash (and make Base.SHA1 an alias) in Base, adding them to Compat.jl for existing releases, and add the remaining hash objects here?

The downside to this is that it's then only available in Julia v1.12+, and if we want to change something about how hash functions work, we have to wait for a new Julia version. I think it's actually better to have an AbstractHash.jl that just implements whatever adapters are needed for the SHA that happens to be shipped with Julia, and then has maybe package extensions for MD5 and other hash types. Truly the only reason SHA is a stdlib is that Pkg needs to be able to hash things to verify their contents; we should not introduce more code into the stdlib if at all possible.

