aw's People

Contributors

bzlw, divyakoshy, jazg, loongy, rahulghangas, ross-pure, roynalnaruto, susruth, tok-kkk, vinceau

aw's Issues

IP address doesn't get deleted from table if custom receive is used

If a custom peer.Receive() call is used with peer.transport.Run() instead of just calling peer.Run(), the IP address might never get deleted from the hash table. This is because IP address deletion happens in peer discovery and a custom receive function might not use peer discovery at all.

Transport dial method will potentially spam dials if dial context is exceeded or cancelled

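The transport's dial method will spam dial attempts if the dial context has expired or been cancelled while the connection is still not established. Once dialCtx is done, the select below returns immediately on every iteration, so the loop appears to retry without any delay: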

if err != nil {
    t.opts.Logger.Error("dial", zap.String("remote", remote.String()), zap.String("addr", remoteAddr.String()), zap.Error(err))
    select {
    case <-retryCtx.Done():
    case <-dialCtx.Done():
        if !t.IsConnected(remote) {
            continue
        }
    }
}
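
One possible shape for a fix is to stop retrying as soon as the dial context is done. The sketch below is not aw's Transport code: dialWithRetry, runDemo and the dial callback are hypothetical names, and the IsConnected check from the snippet above is left out for brevity.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// dialWithRetry is a hypothetical helper, not aw's dial method. It calls dial
// until it succeeds, waiting retryDelay between attempts, and gives up as soon
// as ctx is done instead of looping again.
func dialWithRetry(ctx context.Context, retryDelay time.Duration, dial func(context.Context) error) (int, error) {
	attempts := 0
	for {
		attempts++
		if err := dial(ctx); err == nil {
			return attempts, nil
		}
		timer := time.NewTimer(retryDelay)
		select {
		case <-timer.C:
			// The retry delay has elapsed, so try again.
		case <-ctx.Done():
			// The dial context is finished: stop instead of retrying.
			timer.Stop()
			return attempts, ctx.Err()
		}
	}
}

// runDemo dials a peer that never answers, with a 50ms dial context and a
// 10ms retry delay, and reports how many attempts were made.
func runDemo() (int, error) {
	ctx, cancel := context.WithTimeout(context.Background(), 50*time.Millisecond)
	defer cancel()
	unreachable := func(context.Context) error { return errors.New("connection refused") }
	return dialWithRetry(ctx, 10*time.Millisecond, unreachable)
}

func main() {
	attempts, err := runDemo()
	fmt.Println("attempts:", attempts, "error:", err)
}
```

With this structure the number of attempts is bounded by the dial timeout divided by the retry delay, rather than by how fast the loop can spin.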

Untested modules tracker

This issue tracks untested modules or modules that require further testing.

  • Peer Discovery (peer/peerdiscovery.go)
    • Full Testing
  • Pull based gossiping (peer/sync.go, peer/gossip.go)
    • Full Testing
  • Filtering (channel/filter.go)
    • Full Testing
  • Channels (channel/channel.go)
    • Broken tests

Rate-limiting with exponential blacklisting

Currently, rate-limiting in aw is required to be implemented at the application-level. This assumes that the application has the best information about per-message rate limits, and leaves it to the application to apply them. While this remains true, it is still worth implementing a basic rate-limiter.

The rate-limit should be implemented using a standard rate-per-second with temporary burst. If the rate-limit is violated, then the offending IP address is blacklisted. This drops the existing connection, and refuses connections from this IP address until the end of the blacklist timeout. If the IP address attempts connections during this blacklisted period, then the period is extended by the back-off factor (multiplying the current time left by the back-off factor).

  • Add an option to the peer for per-protocol rate limiting (see #25)
  • Add an option for the blacklist timeout (default: 30 seconds)
  • Add an option for the blacklist timeout back-off (default: 1.6)
  • Violation of the rate-limit drops the connection and blacklists the IP address

Recent offenders should be stored in-memory, but also saved on-disk in case of an unexpected reboot. Assuming all IP addresses are stored as 128 bits (16 bytes), an in-memory limit of 1 MiB (1,048,576 bytes) allows for 65,536 offenders before the server begins to drop them from the list. When this limit is reached, the least recent offender is dropped from the list (implying that the offender is forgiven).

  • Store offenders in-memory (default: 65,536 addresses)
  • Save offenders on-disk and load them at boot
  • Forgive least recent offenders when in-memory limit is reached
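
A minimal sketch of the in-memory part of this proposal, assuming a Blacklist type with Punish and Allow methods (all names here are hypothetical and not part of aw). The rate-per-second limiter itself and the on-disk persistence are left out; only the timeout, the back-off extension and the forgiveness of the least recent offender are shown.

```go
package main

import (
	"container/list"
	"fmt"
	"time"
)

// offender records when an IP address stops being blacklisted.
type offender struct {
	ip      string
	expires time.Time
}

// Blacklist is a hypothetical in-memory offender list; it is not part of aw.
type Blacklist struct {
	timeout time.Duration            // initial blacklist period
	backoff float64                  // multiplier applied to the time left
	limit   int                      // maximum number of offenders kept (at least 1)
	order   *list.List               // front = most recent offender
	entries map[string]*list.Element // IP address -> element in order
}

func NewBlacklist(timeout time.Duration, backoff float64, limit int) *Blacklist {
	return &Blacklist{timeout: timeout, backoff: backoff, limit: limit,
		order: list.New(), entries: map[string]*list.Element{}}
}

// Punish blacklists ip after a rate-limit violation. When the list is full,
// the least recent offender is forgiven to make room.
func (b *Blacklist) Punish(ip string, now time.Time) {
	if el, ok := b.entries[ip]; ok {
		o := el.Value.(*offender)
		if exp := now.Add(b.timeout); exp.After(o.expires) {
			o.expires = exp
		}
		b.order.MoveToFront(el)
		return
	}
	if b.order.Len() >= b.limit {
		oldest := b.order.Back()
		delete(b.entries, oldest.Value.(*offender).ip)
		b.order.Remove(oldest)
	}
	b.entries[ip] = b.order.PushFront(&offender{ip: ip, expires: now.Add(b.timeout)})
}

// Allow reports whether a connection attempt from ip is accepted at time now.
// An attempt made while blacklisted multiplies the time left by the back-off.
func (b *Blacklist) Allow(ip string, now time.Time) bool {
	el, ok := b.entries[ip]
	if !ok {
		return true
	}
	o := el.Value.(*offender)
	left := o.expires.Sub(now)
	if left <= 0 {
		delete(b.entries, ip)
		b.order.Remove(el)
		return true
	}
	o.expires = now.Add(time.Duration(float64(left) * b.backoff))
	return false
}

// scenario replays the proposed defaults (30s timeout, 1.6 back-off) with a
// limit of two offenders and returns the result of each Allow call.
func scenario() []bool {
	start := time.Unix(0, 0)
	at := func(seconds int) time.Time { return start.Add(time.Duration(seconds) * time.Second) }
	b := NewBlacklist(30*time.Second, 1.6, 2)
	b.Punish("10.0.0.1", at(0))
	results := []bool{
		b.Allow("10.0.0.1", at(10)), // blocked: 20s left is extended to about 32s
		b.Allow("10.0.0.1", at(35)), // still blocked because of the extension
		b.Allow("10.0.0.1", at(60)), // the extended period has passed
	}
	b.Punish("10.0.0.2", at(61))
	b.Punish("10.0.0.3", at(62))
	b.Punish("10.0.0.4", at(63)) // list is full, so 10.0.0.2 is forgiven
	return append(results, b.Allow("10.0.0.2", at(64)), b.Allow("10.0.0.4", at(64)))
}

func main() {
	fmt.Println(scenario())
}
```

Keeping the offenders in a list ordered by recency makes forgiving the least recent one a constant-time operation, at the cost of one extra pointer structure per entry on top of the 16 bytes assumed above.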

Return port as additional return value in Listen and Dial functions

When listening for incoming connections, specifying a port does not guarantee that it will be available. Supplying a port value of 0 picks an available port, which can be queried as follows:

addr := ":0"
listener, _ := net.Listen("tcp", addr)
port := listener.Addr().(*net.TCPAddr).Port
fmt.Println("Port:", port)

The return value of the functions Dial and Listen would change from error to (uint16, error). Does this seem like a feature we would like? It would also solve our testing bugs implicitly, since we would no longer have to use different ports for different tests.
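
As a rough illustration of the proposal, here is a hypothetical listen wrapper (not aw's Listen) that reports the port actually bound. Unlike the proposed (uint16, error) signature, it also returns the listener so that the example can close it.

```go
package main

import (
	"fmt"
	"net"
)

// listen is a hypothetical wrapper, not aw's API. It binds addr and returns
// the listener together with the port that was actually assigned.
func listen(addr string) (net.Listener, uint16, error) {
	listener, err := net.Listen("tcp", addr)
	if err != nil {
		return nil, 0, err
	}
	tcpAddr, ok := listener.Addr().(*net.TCPAddr)
	if !ok {
		listener.Close()
		return nil, 0, fmt.Errorf("unexpected address type %T", listener.Addr())
	}
	return listener, uint16(tcpAddr.Port), nil
}

func main() {
	// Port 0 asks the operating system for any free port.
	listener, port, err := listen("127.0.0.1:0")
	if err != nil {
		fmt.Println("listen failed:", err)
		return
	}
	defer listener.Close()
	fmt.Println("Port:", port)
}
```

Tests could then pass port 0 everywhere and read the assigned port from the return value instead of hard-coding a different port per test.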

Receive functions not registered properly when having more than one receiver.

Inside the channel package, the client.Receive() function won't register receiver functions if you call it more than once.
Calls after the first one will just exit and do nothing (since client.receiversRunning has already been set to true).

I'm thinking to add

		select {
		case <-ctx.Done():
		case client.receivers <- receiver{ctx: ctx, f: f}:
		}

between

aw/channel/client.go

Lines 171 to 172 in 27ccfc3

if client.receiversRunning {
client.receiversRunningMu.Unlock()

but I realize it could cause a potential deadlock.

I'll let you guys decide how to fix this.

Migrate to GitHub Actions from CircleCI

We want to migrate all of our CI/CD into GitHub Actions so that we can minimize the number of different services that we are using, but still maximize the integration of different features. Moving to GitHub Actions means that we can drop CircleCI (which we have been having issues with), but still allows for a deep integration with GitHub.

Tracking

  • Translate the CircleCI config file into one that is compatible with GitHub Actions.
  • Check that Actions is passing all checks.
  • Require status checks from Actions to be passing before allowing PRs to merge.

Inefficient removal in sendToSubnet while gossiping

Function sendToSubnet in gosspier.go currently uses re-slicing to remove elements. Re-slicing shifts every element after the point of removal one position to the left. Since there is an exponential bias towards signatories earlier in the queue, a linked list, although heavier and not cache-friendly, makes up for this with constant-time removal. That is, performance depends on alpha and not on the number of signatories. In some initial testing, the performance hit was about 5x for 1000 signatories, based on the default bias and alpha.

Benchmark - https://gist.github.com/rahulghangas/e823bb3c45229c0a4460c44898299d05
can be run with go run test.go numSignatories alpha
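
For reference, the two removal strategies being compared look roughly like this. This is a standalone sketch using the standard container/list package, not the code from sendToSubnet or from the benchmark; all helper names are hypothetical.

```go
package main

import (
	"container/list"
	"fmt"
)

// removeFromSlice removes index i by re-slicing. Every element after i is
// copied one position to the left, so the cost grows with len(s) - i.
func removeFromSlice(s []string, i int) []string {
	return append(s[:i], s[i+1:]...)
}

// newQueue builds a linked list holding values in order.
func newQueue(values ...string) *list.List {
	q := list.New()
	for _, v := range values {
		q.PushBack(v)
	}
	return q
}

// removeFromList unlinks el from q in constant time and returns its value.
func removeFromList(q *list.List, el *list.Element) string {
	return q.Remove(el).(string)
}

// listToSlice copies the list into a slice, for printing and comparison.
func listToSlice(q *list.List) []string {
	var out []string
	for el := q.Front(); el != nil; el = el.Next() {
		out = append(out, el.Value.(string))
	}
	return out
}

func main() {
	fmt.Println(removeFromSlice([]string{"a", "b", "c", "d"}, 1))

	q := newQueue("a", "b", "c", "d")
	removed := removeFromList(q, q.Front().Next())
	fmt.Println(removed, listToSlice(q))
}
```

Note that the linked list only makes the unlinking itself constant-time; reaching the element still requires walking from the front, which is cheap when the selection is biased towards the start of the queue.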

Negotiate aw and protocol versions during the handshake

Right now, peers engage in a simple handshake immediately after establishing a connection to verify their identities. This issue proposes extending the handshake to negotiate the aw version being used, and the application-specific protocols being sent via aw.

Negotiate aw version

After establishing identities, the client is expected to send their aw version to the server. This establishes whether or not the client and server are compatible based on the aw version. If the version is accepted by the server, then an acknowledgement must be sent back. Otherwise, the highest version supported by the server must be sent back, and the client must send an acknowledgement or drop the connection. Semver must be used to determine compatibility.

  • Implemented
  • Tested
  • Documented
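
The server side of this exchange could be sketched as follows. The issue does not define what "compatible" means, so the rule used here is an assumption: the major versions must match and the client's minor version must not be newer than the server's. The names version, parseVersion, compatible and respond are hypothetical, and only plain major.minor.patch strings are handled.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// version is a plain major.minor.patch triple (no pre-release or build tags).
type version struct{ major, minor, patch int }

// parseVersion parses strings such as "1.4.0".
func parseVersion(s string) (version, error) {
	parts := strings.Split(s, ".")
	if len(parts) != 3 {
		return version{}, fmt.Errorf("invalid version %q", s)
	}
	var nums [3]int
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil || n < 0 {
			return version{}, fmt.Errorf("invalid version %q", s)
		}
		nums[i] = n
	}
	return version{nums[0], nums[1], nums[2]}, nil
}

// compatible encodes the assumed rule: same major version, and the client's
// minor version is not newer than the highest one the server supports.
func compatible(client, highest version) bool {
	return client.major == highest.major && client.minor <= highest.minor
}

// respond is the server's reply in this hypothetical exchange. It either
// acknowledges the client's version or answers with its own highest version,
// after which the client must acknowledge or drop the connection.
func respond(clientVersion, serverVersion string) (accepted bool, reply string, err error) {
	client, err := parseVersion(clientVersion)
	if err != nil {
		return false, "", err
	}
	server, err := parseVersion(serverVersion)
	if err != nil {
		return false, "", err
	}
	if compatible(client, server) {
		return true, clientVersion, nil
	}
	return false, serverVersion, nil
}

func main() {
	for _, client := range []string{"1.2.0", "1.5.0", "2.0.0"} {
		accepted, reply, err := respond(client, "1.4.1")
		fmt.Println(client, accepted, reply, err)
	}
}
```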

Negotiate protocol versions

If the aw versions are compatible, then the client should send a list of the "application-level" protocols that it will be communicating via aw. For example: in RenVM, the client might send the protocols hy-1.4.0 and zo-1.0.3 to establish the application-specific messages that can appear within the aw messages. If all protocol versions are accepted by the server, then an acknowledgement must be sent back. Otherwise, the highest version supported by the server for each protocol must be sent back, and the client must send an acknowledgement or drop the connection. Semver should be used to determine compatibility, but it is not required.

  • Protocol versions standardised
  • Implemented
  • Tested
  • Documented

Unified timeout for sending a message

Currently the Send method in the peer module and, consequently, the Send method in transport use a user-supplied context for the timeout. This causes multiple timing issues across gossipping and syncing if it is not configured with proper times. Also, using two different timeouts can get confusing. There are two potential solutions that come to mind:

  1. Remove the context parameter and have a single timeout for Send that gets configured while creating the transport layer.
  2. Since some messages might be quite big and might not finish within the timeout specified in the transport layer, create two methods for sending a message:
    • one where the timeout field from the transport layer is used
    • one where the user supplies the context with a custom timeout
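
The second option could look roughly like the sketch below. The Transport type, its write field and the method name SendWithContext are hypothetical stand-ins, not aw's actual API; slowWrite only simulates a message that takes a while to send.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// Transport is a hypothetical stand-in for aw's transport layer.
type Transport struct {
	timeout time.Duration // default timeout, configured at construction
	write   func(ctx context.Context, msg []byte) error
}

// Send uses the timeout configured on the transport.
func (t *Transport) Send(msg []byte) error {
	ctx, cancel := context.WithTimeout(context.Background(), t.timeout)
	defer cancel()
	return t.write(ctx, msg)
}

// SendWithContext lets the caller supply its own deadline, for example for
// messages that are too big for the default timeout.
func (t *Transport) SendWithContext(ctx context.Context, msg []byte) error {
	return t.write(ctx, msg)
}

// slowWrite simulates a write that takes d unless ctx ends first.
func slowWrite(d time.Duration) func(context.Context, []byte) error {
	return func(ctx context.Context, _ []byte) error {
		select {
		case <-time.After(d):
			return nil
		case <-ctx.Done():
			return ctx.Err()
		}
	}
}

// demo sends a message that needs 100ms over a transport whose default
// timeout is 20ms, first with Send and then with a 2s custom context.
func demo() (defaultErr, customErr error) {
	t := &Transport{timeout: 20 * time.Millisecond, write: slowWrite(100 * time.Millisecond)}
	defaultErr = t.Send([]byte("large message"))

	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	defer cancel()
	customErr = t.SendWithContext(ctx, []byte("large message"))
	return defaultErr, customErr
}

func main() {
	defaultErr, customErr := demo()
	fmt.Println("Send:", defaultErr)
	fmt.Println("SendWithContext:", customErr)
}
```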

Sync message always gets denied when pulling

The Gossiper.didReceivePush() function will deny the content id in the filter after the context is done.

aw/peer/gossip.go

Lines 104 to 112 in 27ccfc3

go func() {
	<-ctx.Done()
	g.subnetsMu.Lock()
	delete(g.subnets, string(msg.Data))
	g.subnetsMu.Unlock()
	g.filter.Deny(msg.Data)
}()

But the context is cancelled by a deferred call, so it is cancelled as soon as the pull message has been sent:

aw/peer/gossip.go

Lines 86 to 87 in 27ccfc3

ctx, cancel := context.WithTimeout(context.Background(), g.opts.Timeout)
defer cancel()

This causes the incoming sync message to always be rejected.

Sync context always timing out

There seem to be at least two issues in the syncer:

  1. We're overriding the global context on the following lines, when instead we should probably be defining an inner context:

aw/peer/sync.go

Lines 110 to 117 in ac3835b

ctx, cancel := context.WithTimeout(ctx, syncer.opts.Timeout)
defer cancel()
err := syncer.transport.Send(ctx, peer, wire.Msg{
	Version: wire.MsgVersion1,
	Type:    wire.MsgTypePull,
	Data:    contentID,
})

  2. The context always times out, even when the DidReceiveMessage function is called and the pending content flag is signalled. Is there a reason we can't have something as simple as the following?

type pendingContent struct {
	// content is nil while synchronisation is happening. After synchronisation
	// has completed, content will be set.
	content chan []byte
}

func (pending *pendingContent) wait() <-chan []byte {
	return pending.content
}

func (pending *pendingContent) signal(content []byte) {
	pending.content <- content
}
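
A variant of the same idea that can be run on its own is sketched below. It makes two assumptions that are not in the snippet above: the channel is buffered with capacity one, so that signal never blocks when nobody is waiting yet, and waiting is done through a context so that a timeout is still possible. The names newPendingContent and waitTimeout are hypothetical.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// pendingContent delivers synchronised content through a channel.
type pendingContent struct {
	content chan []byte
}

// newPendingContent uses a buffer of one so that signal does not block when
// it is called before anyone is waiting.
func newPendingContent() *pendingContent {
	return &pendingContent{content: make(chan []byte, 1)}
}

// signal delivers content to the waiter, if it has not been delivered yet.
func (pending *pendingContent) signal(content []byte) {
	select {
	case pending.content <- content:
	default:
		// Content was already delivered; drop the duplicate.
	}
}

// wait returns the content, or the context's error if ctx ends first.
func (pending *pendingContent) wait(ctx context.Context) ([]byte, error) {
	select {
	case content := <-pending.content:
		return content, nil
	case <-ctx.Done():
		return nil, ctx.Err()
	}
}

// waitTimeout waits with a fresh context that has its own timeout.
func (pending *pendingContent) waitTimeout(d time.Duration) ([]byte, error) {
	ctx, cancel := context.WithTimeout(context.Background(), d)
	defer cancel()
	return pending.wait(ctx)
}

func main() {
	pending := newPendingContent()
	go pending.signal([]byte("hello"))
	content, err := pending.waitTimeout(time.Second)
	fmt.Println(string(content), err)

	// Nothing ever signals this one, so the wait ends with a timeout error.
	_, err = newPendingContent().waitTimeout(10 * time.Millisecond)
	fmt.Println(err)
}
```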

Incorrect usage of timeouts in gossip

While gossipping, two different timeouts are used:
  • one defined by the context supplied by the user
  • one defined by the value of a field in the GossipOptions struct

The first one defines the total time for a round of gossipping, while the second one defines the timeout for sending each message. When a sync message is received, the gossipper tries to propagate the new message by re-gossipping it. However, the gossipper now has no notion of the timeout supplied by the user (and that particular context has probably gone out of scope!). Currently we use the second timeout for the call to Gossip, but the gossipping might fail if the first attempt to message a peer fails (since that uses the same timeout).
