multiformats / rust-multiaddr


multiaddr implementation in rust

Home Page: https://crates.io/crates/multiaddr

License: Other

Rust 100.00%

rust-multiaddr's Introduction

rust-multiaddr

multiaddr implementation in Rust.

Install
Usage
Maintainers
Contribute
License

Install

First add this to your Cargo.toml

[dependencies]
multiaddr = "*"

then run cargo build.

Usage

extern crate multiaddr;

use multiaddr::{Multiaddr, multiaddr};

let address = "/ip4/127.0.0.1/tcp/1234".parse::<Multiaddr>().unwrap();
// or with a macro
let other = multiaddr!(Ip4([127, 0, 0, 1]), Udp(10500u16), QuicV1);

assert_eq!(address.to_string(), "/ip4/127.0.0.1/tcp/1234");
assert_eq!(other.to_string(), "/ip4/127.0.0.1/udp/10500/quic-v1");

Maintainers

Captain: @dignifiedquire.

Contribute

Contributions welcome. Please check out the issues.

Check out our contributing document for more information on how we work, and about contributing in general. Please be aware that all interactions related to multiformats are subject to the IPFS Code of Conduct.

Small note: If editing the README, please conform to the standard-readme specification.

License


rust-multiaddr's Issues

Consider an inline representation for small multiaddrs

Using the same approach, for a total size of 64 bytes, you get 62 bytes of data. That is enough to store a multiaddr like /ip4/<32 bit addr>/tcp/<port>/ipfs/<256 bit node hash> or even /ip6/<128 bit addr>/tcp/<port>/ipfs/<256 bit node hash> inline, which should be quite useful.
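
The size claim can be checked with a little arithmetic. The sketch below is not part of the crate: the helper names are hypothetical, and it assumes the standard multiaddr binary layout (varint protocol code, then the value, with the peer hash carried as a length-prefixed SHA-256 multihash of 2 + 32 bytes) and the protocol codes 4 (ip4), 41 (ip6), 6 (tcp) and 421 (ipfs/p2p).

```rust
/// Inline capacity quoted in the issue: 64 bytes total, 62 bytes of data.
const INLINE_CAPACITY: usize = 62;

/// A SHA-256 multihash: 1 byte hash code + 1 byte digest length + 32 byte digest.
const SHA256_MULTIHASH_LEN: usize = 2 + 32;

/// Number of bytes an unsigned varint needs to encode `value`.
fn uvarint_len(mut value: u64) -> usize {
    let mut len = 1;
    while value >= 0x80 {
        value >>= 7;
        len += 1;
    }
    len
}

/// Encoded size of a component with a fixed-size value (ip4, ip6, tcp).
fn fixed_component_len(code: u64, value_len: usize) -> usize {
    uvarint_len(code) + value_len
}

/// Encoded size of a component whose value carries a varint length prefix.
fn prefixed_component_len(code: u64, value_len: usize) -> usize {
    uvarint_len(code) + uvarint_len(value_len as u64) + value_len
}

/// Size of /ip4/<addr>/tcp/<port>/ipfs/<sha256 multihash>.
fn ip4_tcp_ipfs_len() -> usize {
    fixed_component_len(4, 4)
        + fixed_component_len(6, 2)
        + prefixed_component_len(421, SHA256_MULTIHASH_LEN)
}

/// Size of /ip6/<addr>/tcp/<port>/ipfs/<sha256 multihash>.
fn ip6_tcp_ipfs_len() -> usize {
    fixed_component_len(41, 16)
        + fixed_component_len(6, 2)
        + prefixed_component_len(421, SHA256_MULTIHASH_LEN)
}
```

Under these assumptions the ip4 form needs 45 bytes and the ip6 form 57 bytes, so both fit into the 62 byte inline buffer.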

Rust-libp2p nodes frequently store peer ids multiple times, so this would be a nice optimisation.

Improve docs and publish them to github pages

Auto publish on travis push to gh pages
Readme badge and website set
Improve inline docs

Move to Multiformats?

Hey @dignifiedquire - want to move this to the multiformats org?

Tracking issue, here: multiformats/multiformats#4.

`/webrtc` `/webrtc-direct` rename proceeding

Tracking next steps to do the /webrtc -> /webrtc-direct rename (multiformats/multiaddr#150 (comment)) with potential for collision with the introduction of /webrtc (browser-to-browser) #79.

Release #84 as non-breaking change
Release breaking change from Protocol::WebRTC to Protocol::WebRTCDirect #78.
Release #79 as a non-breaking change.

Publish to crates.io

Add ToMultiaddr trait

Implement for

&str
String
IPv4Addr
IPv6Addr

Ref http://doc.rust-lang.org/src/std/net/addr.rs.html#308-327

Change Multiaddr internal structure to store parsed address

As of now, Multiaddr stores a byte array, representing the serialized version of the address.

Although this allows very fast serialization (array copy), I believe this is not the most appropriate format.
Indeed, operations on Multiaddr (eg. decapsulate) actually work on the list of protocols, and not on the byte representation. The bug mentioned in PR #18 is a good example of it.

I see two possible representations:

Make Multiaddr store a Vec<Addr>, with Addr being an Enum for all supported protocols
Make Multiaddr only store an Addr and an Option<Multiaddr> for the inner and/or outer multiaddr.
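
A minimal sketch of the first option, assuming a toy Addr enum with only three protocols. The type and method names are illustrative and not the crate's actual API; the point is that list operations become trivial while serialization is done on demand.

```rust
use std::net::Ipv4Addr;

/// Toy protocol enum; a real one would list every supported protocol.
#[derive(Debug, Clone, PartialEq)]
enum Addr {
    Ip4(Ipv4Addr),
    Tcp(u16),
    Udp(u16),
}

/// Stores the parsed components instead of the serialized bytes.
#[derive(Debug, Clone, PartialEq)]
struct Multiaddr {
    components: Vec<Addr>,
}

impl Multiaddr {
    /// Serialization is no longer a plain copy: each component is encoded
    /// as its protocol code (a varint) followed by its value.
    fn to_bytes(&self) -> Vec<u8> {
        let mut out = Vec::new();
        for component in &self.components {
            match component {
                Addr::Ip4(ip) => {
                    out.push(4); // ip4 protocol code
                    out.extend_from_slice(&ip.octets());
                }
                Addr::Tcp(port) => {
                    out.push(6); // tcp protocol code
                    out.extend_from_slice(&port.to_be_bytes());
                }
                Addr::Udp(port) => {
                    out.extend_from_slice(&[0x91, 0x02]); // udp code 273 as a varint
                    out.extend_from_slice(&port.to_be_bytes());
                }
            }
        }
        out
    }

    /// Removing the outermost protocol works on the list, not on raw bytes.
    fn pop(&mut self) -> Option<Addr> {
        self.components.pop()
    }
}
```

The trade-off is the one the issue describes: operations on protocols get simpler, while to_bytes has to walk the list.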

schemars support

Add support for schemars to allow people to generate JSON schemas / Swagger.

Add Multiaddr.to_string()

Let's make fewer breaking changes

A Multiaddr is a core primitive of libp2p and thus an almost guaranteed dependency of everything libp2p. Every breaking change in this library will trigger a cascade of breaking changes across the ecosystem because Multiaddr constantly appears in public type signatures.

Just within our own crates, a breaking change here causes a breaking change in: libp2p-core -> libp2p-swarm -> libp2p-identify (for example) -> libp2p. There are several users of libp2p that build their own libraries on top which continues the chain.

We should do everything possible to:

Harden the API against breaking changes
Only make them very selectively (4 breaking changes in 1 year is pretty bad)

After a quick survey, I see the following problems:

Protocol is not non_exhaustive: There will always be more protocols, we should future proof the API for that.
multihash is re-exported and part of the public API despite being < 1.0: This is a problem. A fundamental crate like multiaddr should only have stable dependencies in its API.
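
To illustrate the first point, here is a sketch of a non_exhaustive enum using a reduced, hypothetical variant list. The attribute forces code in other crates to include a wildcard arm, so adding a variant later does not break their builds.

```rust
/// With `#[non_exhaustive]`, downstream crates cannot match this enum
/// exhaustively, so new variants can be added in a minor release.
#[non_exhaustive]
#[derive(Debug, Clone, PartialEq)]
pub enum Protocol {
    Tcp(u16),
    Udp(u16),
    Tls,
}

/// What a downstream match looks like. Outside the defining crate the
/// wildcard arm is mandatory; inside it, it is merely allowed.
pub fn describe(protocol: &Protocol) -> &'static str {
    match protocol {
        Protocol::Tcp(_) => "tcp",
        Protocol::Udp(_) => "udp",
        _ => "other",
    }
}
```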

I am not sure how to deal with multihash. It is useful to represent Certhash and P2p in a type-safe way. One thing we can do is just not update to the latest version as quickly unless we actually need a specific feature.

Update: After looking at multihash in more detail and opening an issue there, I think the best way forward would be to split multihash into two crates: one for the definition of multihash and one for all the implementations of hash algorithms, the custom derive, etc.

Making the `/p2p` protocol type-safe

The /p2p protocol can only be followed by a PeerId: https://github.com/libp2p/specs/blob/master/addressing/README.md#the-p2p-multiaddr

As per spec, a PeerId can only be a multihash with either SHA256 or the identity hash in case the encoded public key is less than 42 bytes.

Currently, /p2p exposes a Multihash which is more general than that.

I see several options of how we can improve this situation:

Extract the PeerId type from rust-libp2p into a dedicated crate and have multiaddr depend on that
Move the entire identity module of libp2p-core into its own crate (keys + PeerId)
Move the rust-libp2p PeerId type into multiaddr
Define a minimal PeerId type within multiaddr that encodes the above invariants
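
As a rough idea of what the last option could look like, the sketch below validates the two invariants on raw multihash bytes. All names are hypothetical. It assumes the multihash codes 0x12 (SHA-256) and 0x00 (identity), assumes code and length each fit in a single varint byte, and treats the 42 byte limit as inclusive, which is also an assumption.

```rust
const IDENTITY: u8 = 0x00;
const SHA2_256: u8 = 0x12;
/// Assumed inclusive upper bound for keys stored via the identity hash.
const MAX_INLINE_KEY_LEN: usize = 42;

#[derive(Debug, PartialEq, Eq)]
pub enum PeerIdError {
    Truncated,
    LengthMismatch,
    InvalidDigestLength(usize),
    UnsupportedHash(u8),
}

/// A multihash that is known to satisfy the PeerId invariants.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct PeerId {
    multihash: Vec<u8>,
}

impl PeerId {
    pub fn from_multihash_bytes(bytes: &[u8]) -> Result<Self, PeerIdError> {
        // Layout: <hash code><digest length><digest>.
        let (code, len, digest) = match bytes {
            [code, len, digest @ ..] => (*code, *len as usize, digest),
            _ => return Err(PeerIdError::Truncated),
        };
        if digest.len() != len {
            return Err(PeerIdError::LengthMismatch);
        }
        match code {
            SHA2_256 if len == 32 => {}
            IDENTITY if len <= MAX_INLINE_KEY_LEN => {}
            SHA2_256 | IDENTITY => return Err(PeerIdError::InvalidDigestLength(len)),
            other => return Err(PeerIdError::UnsupportedHash(other)),
        }
        Ok(PeerId { multihash: bytes.to_vec() })
    }

    pub fn as_bytes(&self) -> &[u8] {
        &self.multihash
    }
}
```

With such a type, /p2p could carry a PeerId instead of an arbitrary Multihash, and invalid hashes would be rejected at construction time.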

All of the above have their pros and cons.

Extracting PeerId into its own crate feels a bit odd because it would be a very small crate. On the other hand, it would encode a very important concept in a concise form so it may be worth it.

The next step up would be a crate that encapsulates everything around keys in libp2p into its own crate, i.e. libp2p-identity. That is basically this module + the peer_id module.

Personally, I'd be in favor of option (2). I think it makes sense to break out this part into a separate crate. We can heavily feature-flag that one to the point where the multiaddr crate itself only depends on the bits that define the PeerId and doesn't come with any other dependencies.

Input welcome!

cc @mxinden @dignifiedquire

Tooling for unparsable `Multiaddr`

Sample use-case:

Roll out of new Protocol::X.
Old DHT nodes should forward Multiaddrs with Protocol::X of new nodes even though they can't parse the Multiaddr. Currently they don't.

Considerations:

Instead of providing e.g. an Unparsable type in multiaddr, each user that cares could also carry a Either<Multiaddr, Vec<u8>>.
One could add a Protocol::Unparsable(Vec<u8>) containing the remaining unparsable bytes. This might break existing implementations as they depend on Multiaddr either to succeed or fail, but not fail partially.
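
The first consideration can be sketched on the user side as follows. TcpAddr and parse_known are toy stand-ins for Multiaddr and its parser (they only understand a bare /tcp/<port>), and MaybeParsed plays the role of Either<Multiaddr, Vec<u8>>; none of these names exist in the crate.

```rust
/// Toy stand-in for a fully parsed Multiaddr.
#[derive(Debug, Clone, PartialEq)]
struct TcpAddr {
    port: u16,
}

/// Toy parser: understands only `<tcp code 6><port, big endian>`.
fn parse_known(bytes: &[u8]) -> Option<TcpAddr> {
    match bytes {
        [6, hi, lo] => Some(TcpAddr { port: u16::from_be_bytes([*hi, *lo]) }),
        _ => None,
    }
}

/// Either a parsed address or the raw bytes we could not make sense of.
#[derive(Debug, Clone, PartialEq)]
enum MaybeParsed {
    Parsed(TcpAddr),
    Opaque(Vec<u8>),
}

impl MaybeParsed {
    /// Never fails: unparsable input is kept verbatim.
    fn from_wire(bytes: Vec<u8>) -> Self {
        match parse_known(&bytes) {
            Some(addr) => MaybeParsed::Parsed(addr),
            None => MaybeParsed::Opaque(bytes),
        }
    }

    /// Bytes to forward to other nodes; opaque addresses pass through unchanged.
    fn to_wire(&self) -> Vec<u8> {
        match self {
            MaybeParsed::Parsed(addr) => {
                let [hi, lo] = addr.port.to_be_bytes();
                vec![6, hi, lo]
            }
            MaybeParsed::Opaque(bytes) => bytes.clone(),
        }
    }
}
```

An old node using this pattern would forward the Opaque bytes as they are, without having to understand the new protocol.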

multiformats/multiaddr#155

Use `impl From` instead of `trait ToMultiaddr`

It seems to me that there is too much of to_* and from_* when it should be enough to use .into() and single impl From.
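
A sketch of what this could look like, using a toy text-only Multiaddr wrapper rather than the crate's real type. The dial function is a hypothetical caller that accepts anything convertible into an address.

```rust
use std::net::{IpAddr, Ipv4Addr, Ipv6Addr};

/// Toy stand-in that only stores the textual form.
#[derive(Debug, Clone, PartialEq)]
struct Multiaddr(String);

impl From<Ipv4Addr> for Multiaddr {
    fn from(ip: Ipv4Addr) -> Self {
        Multiaddr(format!("/ip4/{}", ip))
    }
}

impl From<Ipv6Addr> for Multiaddr {
    fn from(ip: Ipv6Addr) -> Self {
        Multiaddr(format!("/ip6/{}", ip))
    }
}

impl From<IpAddr> for Multiaddr {
    fn from(ip: IpAddr) -> Self {
        match ip {
            IpAddr::V4(v4) => v4.into(),
            IpAddr::V6(v6) => v6.into(),
        }
    }
}

/// Callers pass any supported type; the conversion is a single `.into()`.
fn dial(addr: impl Into<Multiaddr>) -> Multiaddr {
    addr.into()
}
```

Each From impl also provides the matching Into for free, so no separate to_* and from_* methods are needed.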

Add Encapsulate and Decapsulate

Update to multihash v0.19

See https://github.com/multiformats/rust-multihash/milestone/1.

Mark `/wss` as deprecated

According to the multiaddr spec, /wss is deprecated, see https://github.com/multiformats/multiaddr/blob/master/protocols.csv#L33.

We should reflect this in the code such that downstream users can upgrade accordingly. I'd suggest we mark it as #[deprecated] first, do a point release and remove it in 0.19.

Use trait object

I believe all the multiformats are meant to be extensible, so a Multiaddr trait object is probably more appropriate than a closed enum.

Remove `from_url` module

As discussed in #71, we should remove the from_url module to reduce the surface area of this crate's public API.

Users are encouraged to implement the from_url functionality themselves and/or create their own crate if they need to reuse the code in different projects.

src/protocol.rs: Add WebTransport

See corresponding specification pull request multiformats/multiaddr#126.

Add Multiaddr::starts_with

There is Multiaddr::ends_with, but for matching addresses with or without a trailing P2P component it would be convenient to have a Multiaddr::starts_with method as well.
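
The intended semantics can be sketched as a component-wise prefix check. The Component enum below is a toy model, and the free functions are illustrative only; how the crate implements ends_with internally is not assumed here.

```rust
/// Toy component list standing in for a parsed Multiaddr.
#[derive(Debug, Clone, PartialEq)]
enum Component {
    Ip4([u8; 4]),
    Tcp(u16),
    P2p(String),
}

/// True if `addr` begins with all components of `prefix`, in order.
fn starts_with(addr: &[Component], prefix: &[Component]) -> bool {
    addr.starts_with(prefix)
}

/// Counterpart for the suffix, mirroring the existing `ends_with`.
fn ends_with(addr: &[Component], suffix: &[Component]) -> bool {
    addr.ends_with(suffix)
}
```

With this, an address with a trailing /p2p component and the same address without it both match the shared transport prefix.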

src/protocol: Add support for `tls` protocol

multiformats/multiaddr#109 specifies tls as a valid Multiaddr protocol component. The Rust Multiaddr implementation should support it as well.

See multiformats/go-multiaddr#153 for the corresponding change in the Golang implementation.

Implement `/ipcidr`

Add support for /ipcidr.

See specification added with multiformats/multiaddr#129.

Motivation is to specify address ranges to (a) not announce or (b) not dial in https://github.com/mxinden/rust-libp2p-server/ used as an IPFS bootstrap node.

feat: validate onion3 addresses

You can currently create invalid onion3 addresses:

   let mut b = [0u8; 35];
   OsRng.fill_bytes(&mut b);
   let addr = multiaddr::Onion3Addr::from((b, 12345u16));
   let invalid = multiaddr!(Onion3(addr));

Resulting in these error logs in tor

Jun  5 10:10:44 thor Tor[692]: Service address "pd6sf3mqkkkfrn4rk5odgcr2j5sn7m523a4tm7pzpuotk2b7rpuhaeym" invalid checksum.
Jun  5 10:10:44 thor Tor[692]: Invalid onion hostname pd6sf3mqkkkfrn4rk5odgcr2j5sn7m523a4tm7pzpuotk2b7rpuhaeym.onion; rejecting

Question: Should we be validating the checksum and/or the public key in this library? Defined as follows:

  onion_address = base32(PUBKEY | CHECKSUM | VERSION) + ".onion"
     CHECKSUM = H(".onion checksum" | PUBKEY | VERSION)[:2]

     where:
       - PUBKEY is the 32 bytes ed25519 master pubkey of the hidden service.
       - VERSION is a one byte version field (default value '\x03')
       - ".onion checksum" is a constant string
       - CHECKSUM is truncated to two bytes before inserting it in onion_address

source: https://github.com/torproject/torspec/blob/main/rend-spec-v3.txt#LL2258C6-L2258C6
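
A sketch of the check described by this definition, operating on the 35 raw bytes (32 byte PUBKEY, 2 byte CHECKSUM, 1 byte VERSION). The hash function H is passed in as a parameter so the example needs no dependencies; in practice it would be a SHA3-256 implementation, which is the sha3 dependency mentioned in this issue. The function and error names are hypothetical, and the public key itself is not validated.

```rust
const PUBKEY_LEN: usize = 32;
const CHECKSUM_LEN: usize = 2;
const ONION3_VERSION: u8 = 0x03;

#[derive(Debug, PartialEq, Eq)]
enum Onion3Error {
    BadVersion(u8),
    BadChecksum,
}

/// Verifies version and checksum of a raw onion3 address.
/// `hash` stands in for H from the spec excerpt above.
fn check_onion3(
    raw: &[u8; 35],
    hash: impl Fn(&[u8]) -> [u8; 32],
) -> Result<(), Onion3Error> {
    let (pubkey, rest) = raw.split_at(PUBKEY_LEN);
    let (checksum, version) = rest.split_at(CHECKSUM_LEN);
    let version = version[0];
    if version != ONION3_VERSION {
        return Err(Onion3Error::BadVersion(version));
    }

    // CHECKSUM = H(".onion checksum" | PUBKEY | VERSION)[:2]
    let mut input = b".onion checksum".to_vec();
    input.extend_from_slice(pubkey);
    input.push(version);
    let digest = hash(&input);
    if &digest[..CHECKSUM_LEN] != checksum {
        return Err(Onion3Error::BadChecksum);
    }
    Ok(())
}
```

A fallible TryFrom impl, as proposed in the issue, could call a check like this before constructing the address.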

This would introduce a number of dependencies: ~~base32~~(edit: not required due to existing data-encoding dep), sha3 and an ed25519 library if we decided to validate the PUBKEY.

In many cases the user can leave it to tor, but you may not want to e.g. include the address in a database if it is invalid.

I wanted thoughts on if this is in scope for this library or if this validation should be left up to the user.

I'd be happy to work on a PR for this if this is in scope. Changes should be fairly minor: Fallible TryFrom impl, decoding the address and verifying the checksum, version and possibly the ed25519 key, and adding some new tests.

Another rust-multiaddr in rust-libp2p/misc/multiaddr

I was wondering about rust-libp2p's bundled version of rust-multiaddr, but it turns out it's already marked for being replaced by the upstream rust-multiaddr: libp2p/rust-libp2p#758

(Filing this issue for visibility)

Improve error handling

Re-export libp2p_identity::PeerId?

While updating a project with dependencies on multiaddr and multihash, I noticed that libp2p_identity::PeerId isn't re-exported but is in two public interfaces: MultiAddr::with_p2p and Protocol::P2p. Any objections to a re-export (or a wrapper tuple struct with Deref/DerefMut)? I'm happy to make a PR, but I figured I'd ask first as it has dependency/API ramifications.

I know the libp2p-identity crate is small so I could add it, but it'd be nice to have this library be self contained. I have an API that looks like a tiny bit of IPFS, which is why I want address info without actually pulling in libp2p proper.

Make decapsulate return an error if the protocol is not found

The doc of decapsulate currently reads: “Returns the original if the passed in address is not found”.

Could you change the behavior of decapsulate to return an error if the protocol is not found? This way we have a way to know the protocol was not in the multiaddr.
Additionally, it forces users of decapsulate to explicitly handle this case when calling it.
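
A sketch of the requested behavior on a toy component list. The Component enum, the NotFound error and the single-component signature are assumptions made for illustration; the crate's actual decapsulate signature is not reproduced here.

```rust
/// Toy component list standing in for a parsed Multiaddr.
#[derive(Debug, Clone, PartialEq)]
enum Component {
    Ip4([u8; 4]),
    Tcp(u16),
    Ws,
}

/// Returned when the requested protocol is not part of the address.
#[derive(Debug, PartialEq)]
struct NotFound;

/// Cuts the address at the last occurrence of `target`, dropping `target`
/// and everything after it. Fails instead of returning the original.
fn decapsulate(addr: &[Component], target: &Component) -> Result<Vec<Component>, NotFound> {
    match addr.iter().rposition(|component| component == target) {
        Some(index) => Ok(addr[..index].to_vec()),
        None => Err(NotFound),
    }
}
```

Callers then have to handle the Err case explicitly, for example with `?` or a match.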

Build fails on rust stable due to CID crate

$ cargo --version
cargo 1.42.0 (86334295e 2020-01-31)
$ git rev-parse --short HEAD
9682623
$ cargo build
   Compiling multiaddr v0.3.1 (/Users/jameshageman/Documents/Github/school/fydp/rust-multiaddr)
warning: trait objects without an explicit `dyn` are deprecated
  --> src/errors.rs:13:22
   |
13 |     ParsingError(Box<error::Error + Send + Sync>),
   |                      ^^^^^^^^^^^^^^^^^^^^^^^^^^ help: use `dyn`: `dyn error::Error + Send + Sync`
   |
   = note: `#[warn(bare_trait_objects)]` on by default

warning: trait objects without an explicit `dyn` are deprecated
  --> src/errors.rs:34:32
   |
34 |     fn cause(&self) -> Option<&error::Error> {
   |                                ^^^^^^^^^^^^ help: use `dyn`: `dyn error::Error`

warning: use of deprecated item 'std::error::Error::description': use the Display impl or to_string()
  --> src/errors.rs:18:21
   |
18 |         f.write_str(error::Error::description(self))
   |                     ^^^^^^^^^^^^^^^^^^^^^^^^^
   |
   = note: `#[warn(deprecated)]` on by default

error[E0277]: the trait bound `cid::Error: std::error::Error` is not satisfied
  --> src/errors.rs:50:33
   |
50 |         Error::ParsingError(err.into())
   |                                 ^^^^ the trait `std::error::Error` is not implemented for `cid::Error`
   |
   = note: required because of the requirements on the impl of `std::convert::From<cid::Error>` for `std::boxed::Box<dyn std::error::Error + std::marker::Send + std::marker::Sync>`
   = note: required because of the requirements on the impl of `std::convert::Into<std::boxed::Box<dyn std::error::Error + std::marker::Send + std::marker::Sync>>` for `cid::Error`

error: aborting due to previous error

For more information about this error, try `rustc --explain E0277`.
error: could not compile `multiaddr`.

To learn more, run the command again with --verbose.

Just thought I'd flag the build failure - it looks like #37 is addressing it.

Add ipfs and onion support

IPFS
Onion

avoid breaking network compatibility

As witnessed in libp2p/rust-libp2p#3244 (comment) adding a new protocol to the multiaddr implementation breaks communication between nodes using the new feature and nodes using an older version of this library. This would be warranted if understanding the meaning of a multiaddr is always mandatory, but there are cases (like the identify protocol) where that is not the case.

Breakage for such cases could be avoided by adding a new layer of validation: besides syntactically invalid and fully understood there could be a class that is syntactically valid but not fully understood.

Due to the design of multiaddr syntax, this is not a trivial question: a protocol segment may have arguments, like /tcp/1234, and without understanding the protocol name it is impossible to know the number of arguments. It would have been possible to choose different separator characters (like /tcp=1324 or some such), but that ship has sailed. So the only way to express syntactically valid but not fully understood addresses is to add a variant like Protocol::Unknown(Cow<'a, str>), where /tcp2/1234 would lead to two unknown segments (with tcp2 and 1234 payloads, respectively).
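
The following sketch shows how such a lenient text parser could behave. It is a toy that only knows tcp and udp; the parse_lenient name and the reduced Protocol enum are assumptions, and only the Unknown(Cow<'a, str>) variant is taken from the proposal above.

```rust
use std::borrow::Cow;

#[derive(Debug, PartialEq)]
enum Protocol<'a> {
    Tcp(u16),
    Udp(u16),
    /// A segment that is syntactically valid but not understood.
    Unknown(Cow<'a, str>),
}

/// Parses known protocols together with their argument and keeps every
/// other segment as `Unknown`, since its arity cannot be known.
fn parse_lenient(input: &str) -> Vec<Protocol<'_>> {
    let mut out = Vec::new();
    let mut segments = input.split('/').filter(|s| !s.is_empty()).peekable();
    while let Some(name) = segments.next() {
        // Look ahead for a port without consuming it yet.
        let port = segments.peek().and_then(|s| s.parse::<u16>().ok());
        match (name, port) {
            ("tcp", Some(port)) => {
                segments.next();
                out.push(Protocol::Tcp(port));
            }
            ("udp", Some(port)) => {
                segments.next();
                out.push(Protocol::Udp(port));
            }
            _ => out.push(Protocol::Unknown(Cow::Borrowed(name))),
        }
    }
    out
}
```

As described, /tcp2/1234 yields two unknown segments, while /tcp/1234 is fully understood.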

The alternative to handling this in this library is to always deserialize a multiaddr property as String first and then check whether it can be fully parsed if needed. However, given that multiaddr aims to offer an abstraction over various addressing schemes, I think it is reasonable to expect that this scheme itself is extensible and handles extensions in a graceful fashion.

Links to the doc are broken

Links to the doc in the Github project description and the README are https://dignifiedquire.github.io/rust-multiaddr/multiaddr/struct.Multiaddr.html , which returns a not found error.

Purpose of Arc in Multiaddr

I was searching for why Arc was used after my overnight test showed an additional allocation tracing back to the Arc in Multiaddr (probably unrelated, though it did catch my eye). The previous commit that made the change did not mention the reason for using it, and looking over the code, removing the Arc should work just fine without being a breaking change.

`Multiaddr::from_bytes` panics on output of `to_bytes()`/`.as_slice()`

Wanted to add parsing support for an external library, added a test, found that:

        let address: Multiaddr = "/ipv4/127.0.0.1".parse().unwrap();
        let wtf = Multiaddr::from_bytes(address.to_bytes()).unwrap();

panics. Same with as_slice().to_vec(). Is that expected behavior?

The docs don't really state/show how these APIs are supposed to be used, neither do any tests show that any output of the bytes*-functions of Multiaddr can be parsed by from_bytes...

Should we add an empty MA to the failure test?

Seems that the spec doesn't allow an empty MA.

Human-readable multiaddr: (/<protoName string>/<value string>)+
Example: /ip4/127.0.0.1/udp/1234
Machine-readable multiaddr: (<protoCode uvarint><value []byte>)+
Same example: 0x4 0x7f 0x0 0x0 0x1 0x91 0x2 0x4 0xd2
Values are usually length-prefixed with a uvarint
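
The machine-readable example can be reproduced with a few lines of code. This is a sketch with a toy two-protocol enum, not the crate's encoder; it assumes protocol codes 4 (ip4) and 273 (udp) and reads the `+` in the grammar as "at least one component", so an empty list is rejected.

```rust
use std::net::Ipv4Addr;

enum Component {
    Ip4(Ipv4Addr),
    Udp(u16),
}

/// Appends `value` as an unsigned varint: 7 bits per byte, low bits first,
/// high bit set on every byte except the last.
fn write_uvarint(mut value: u64, out: &mut Vec<u8>) {
    while value >= 0x80 {
        out.push(((value as u8) & 0x7f) | 0x80);
        value >>= 7;
    }
    out.push(value as u8);
}

/// Encodes `(<protoCode uvarint><value []byte>)+`; `None` for an empty list.
fn encode(components: &[Component]) -> Option<Vec<u8>> {
    if components.is_empty() {
        return None;
    }
    let mut out = Vec::new();
    for component in components {
        match component {
            Component::Ip4(ip) => {
                write_uvarint(4, &mut out);
                out.extend_from_slice(&ip.octets());
            }
            Component::Udp(port) => {
                write_uvarint(273, &mut out);
                out.extend_from_slice(&port.to_be_bytes());
            }
        }
    }
    Some(out)
}
```

For /ip4/127.0.0.1/udp/1234 this produces the nine bytes listed above: the udp code 273 becomes the two varint bytes 0x91 0x2, and port 1234 becomes 0x4 0xd2.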

Publishing v0.18

Is there anything blocking the publication of v0.18?

tests/lib: Update `impl Arbitrary for Proto`

Arbitrary implementation seems to be missing some enum variants.

Protocol has 30 variants. Arbitrary implementation generates 25.

Ideally this would be fixed by removing the need to list every Protocol variant. Though I can not think of a clean way of doing that. Simply updating the Arbitrary implementation seems fine for now.

rust-multiaddr/tests/lib.rs

Lines 88 to 139 in 1cfb923

    
    impl Arbitrary for Proto {
        fn arbitrary<G: Gen>(g: &mut G) -> Self {
            use Protocol::*;
            match u8::arbitrary(g) % 26 {
                // TODO: Add Protocol::Quic
                0 => Proto(Dccp(Arbitrary::arbitrary(g))),
                1 => Proto(Dns(Cow::Owned(SubString::arbitrary(g).0))),
                2 => Proto(Dns4(Cow::Owned(SubString::arbitrary(g).0))),
                3 => Proto(Dns6(Cow::Owned(SubString::arbitrary(g).0))),
                4 => Proto(Http),
                5 => Proto(Https),
                6 => Proto(Ip4(Ipv4Addr::arbitrary(g))),
                7 => Proto(Ip6(Ipv6Addr::arbitrary(g))),
                8 => Proto(P2pWebRtcDirect),
                9 => Proto(P2pWebRtcStar),
                10 => Proto(P2pWebSocketStar),
                11 => Proto(Memory(Arbitrary::arbitrary(g))),
                // TODO: impl Arbitrary for Multihash:
                12 => Proto(P2p(multihash(
                    "QmcgpsyWgH8Y8ajJz1Cu72KnS5uo2Aa2LpzU7kinSupNKC",
                ))),
                13 => Proto(P2pCircuit),
                14 => Proto(Quic),
                15 => Proto(Sctp(Arbitrary::arbitrary(g))),
                16 => Proto(Tcp(Arbitrary::arbitrary(g))),
                17 => Proto(Udp(Arbitrary::arbitrary(g))),
                18 => Proto(Udt),
                19 => Proto(Unix(Cow::Owned(SubString::arbitrary(g).0))),
                20 => Proto(Utp),
                21 => Proto(Ws("/".into())),
                22 => Proto(Wss("/".into())),
                23 => {
                    let a = iter::repeat_with(|| u8::arbitrary(g))
                        .take(10)
                        .collect::<Vec<_>>()
                        .try_into()
                        .unwrap();
                    Proto(Onion(Cow::Owned(a), std::cmp::max(1, u16::arbitrary(g))))
                }
                24 => {
                    let a: [u8; 35] = iter::repeat_with(|| u8::arbitrary(g))
                        .take(35)
                        .collect::<Vec<_>>()
                        .try_into()
                        .unwrap();
                    Proto(Onion3((a, std::cmp::max(1, u16::arbitrary(g))).into()))
                }
                25 => Proto(Tls),
                _ => panic!("outside range"),
            }
        }
    }

Move to varint for the code bytes

Add linting rules

See https://github.com/maidsafe/QA/blob/master/Documentation/Rust%20Lint%20Checks.md for ideas
