revtr,aistein

*** What is this project all about? ***

Primary Goal
* Make the reverse traceroute system better by minimizing the number of RR-pings required,
* and by maximizing the amount of information we get from each one.

Why Rank VPs?
* First of all, to do reverse traceroute, we need to find VPs that are RR-reachable
* i.e. there are empty slots in the RR header by the time that the VP->Dst ping reaches
* Dst. Second of all, we do not want to issue so many RR-pings into a network that
* rate-limiting is triggered. Third of all, it would be great if the VPs chosen were
* as close as possible to the destination, so as to maximize the number of free slots
* in the RR header.

VP Ranking Methods
* 1) Destination Cover
+ Overview:
- Say we have a new destination with prefix 'P'. Given a set 'S' of VPs for which we
- already have some RR-reachable measurements to other destinations within P, rank
- each VP 'v' in S in terms of how many such destinations it reaches. This is a greedy
- set-covering approximation algorithm which just picks the "set" v that has the largest
- number of RR-reachable destinations in P.
+ Drawbacks:
- 1) It has no "terminating condition"; i.e. just because it supplies us with a rank-ordering
- of VPs, we may still have to issue probes from each VP moving down the list towards our
- new target destination. This could end up having us send pings from every single VP in S!
- 2) It does not take into account the number of hops away from the destination a VP sits.
- By extension, it does not maximize the amount of information we can get back from an RR-ping.
* 2) Top-K / Restricted Set
+ Overview:
- A small number of M-Lab sites can to a large majority of RR-reachable destinations. As shown
- in Brian's paper, using just the top 10 M-Lab sites reached 95% of RR-reachable destinations
- , within the same number of hops, as reached by the full set of M-Lab + PlanetLab nodes.
- In light of this, restricting the set of VPs to just the top-K can give most of the results,
- while greatly reducing the amount of probing required.
+ Drawbacks:
- ???
* 3) Ingress-Based
+ Overview:
- Define some notion of "network" -- we choose BGP-routable prefix in our work -- and find for each
- RR-pingable measurement the point of "ingress." That could be either the last hop before entering
- the given prefix, or the first hop after entering the prefix -- as long as the choice is consistent.
- Now just deliver a ranking based on the number of hops away from an ingress that a VP sits. If several
- VPs enter the network via the same ingress, just keep the one with the smallest hop-count, and discard
- the rest. This is okay because Internet routing is destination-based: any distinct routes from S to D
- going through some common router R will likely be exactly the same from R to D.
+ Benefits:
- 1) This approach does have a "terminating condition" in that many VPs from the original set S will be
- eliminated from the delivered list of rankings. This contributes towards the overall goal of minimizing
- the number of probes sent overall during the RTR process.
- 2) The distance from VP to ingress is taken into account for these rankings. This effectively maximizes
- the amount of information we are likely to receive from a successful RR-ping when using top-ranked VPs.
+ Drawbacks:
- ???

Evaluation
* We will use existing measurements from 2011 in conjunction with BGP-dumps from the same time period to evaluate the
* efficacy of each of the three aforementioned methods. For each BGP-routable prefix for which the VPs in this
* dataset have RR-reachable measurements, the measurements will be split into test and training data. The training
* set is used to create VP rankings via each of the three approaches. The test set will be used to emulate actual
* RR-pings to never-before-seen destinations. In the process of emulation for each method we can record metrics:
* - NumProbes: the number of probes we had to send before receiving a successful stamping
* - NumHops: the number of hops it takes to reach the destination on a successful probe
* - VPChosen: ???
* - AltVPExists: ???
* With these metrics in hand, we'll be able to make a first-order judgement about the relative effectiveness of each
* ranking method. Results will include:
* - CDF showing the fraction of RR-reachable test-destinations for which k probes were required to receive a stamp;
* legend: ranking methods
* - CDF showing the fraction of RR-reachable test-destinations for which the destination was reached within k hops;
* legend: ranking methods
* - ...

Challenges
*) Large datasets can take a long time to process, so writing efficient fast code was an important consideration
*) ...

aistein / revtr Goto Github PK

revtr's Introduction

revtr's People

Contributors

Watchers

Forkers

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent