
Veritas: Answering Causal Queries from Video Streaming Traces

Veritas is an easy-to-interpret, domain-specific ML model that tackles causal reasoning for video streaming.

Given data collected from real video streaming sessions, a video publisher may wish to answer "what-if" questions, such as understanding the performance if a different Adaptive Bitrate (ABR) algorithm were used, or if a new video quality (e.g., an 8K resolution) were added to the ABR selection. Today, Randomized Controlled Trials (RCTs) are widely used to answer such questions, a task also known as causal reasoning. However, RCTs require active interventions: changing the system and observing the performance of real users. RCTs must therefore be deployed conservatively, since they can be disruptive to the performance of real users.

Veritas tackles causal reasoning using passively collected data, without requiring RCTs. It enables more accurate trace-driven emulations and simulations of a wide range of design alternatives without impacting the performance of live users. For a given video session, Veritas uses the observed data (chunk download times, chunk sizes, TCP states, etc.) to infer the latent Intrinsic Network Bandwidth (INB) during the session. Once INB samples are obtained, the proposed changes can be evaluated directly, answering the what-if query. Further, rather than a single point estimate, Veritas provides a range of potential outcomes, reflecting the inherent uncertainty in the inferences that can be made from the data.

This artifact accompanies the paper: "Veritas: Answering Causal Queries from Video Streaming Traces". Chandan Bothra, Jianfei Gao, Sanjay Rao, and Bruno Ribeiro. In Proceedings of the ACM Special Interest Group on Data Communication, SIGCOMM ’23, New York, NY, USA. If you use this artifact, please cite:

@inproceedings{Veritas_2023,
  author    = {Bothra, Chandan and Gao, Jianfei and Rao, Sanjay and Ribeiro, Bruno},
  title     = {Veritas: Answering Causal Queries from Video Streaming Traces},
  year      = {2023},
  url       = {https://doi.org/10.1145/3603269.3604828},
  doi       = {10.1145/3603269.3604828},
  booktitle = {Proceedings of the ACM Special Interest Group on Data Communication},
  series    = {SIGCOMM '23}
}

Prerequisites:

The following setup has been tested on Ubuntu 22.04.

# Install pip3
sudo apt update
sudo apt upgrade
sudo apt-get install python3-pip

# Install conda (Python 3.10); see
# https://docs.conda.io/projects/conda/en/latest/user-guide/install/linux.html

# Clone the repository, then set up the environment.
cd VeritasML
conda create --name veritas
conda activate veritas
bash environment.sh

Preparing a dataset for Veritas:

For a video sessions dataset collected by the user, create an input directory <input_directory> with the following structure:

 input_directory/
 |_ video_session_streams/
 |_ ground_truth_capacity/
 |_ train_config.yaml
 |_ inference_config.yaml
 |_ full.json
 |_ fhash.json

  • train_config.yaml: Contains the parameters needed to train on a dataset. E.g.: sample_file.
  • inference_config.yaml: Contains the parameters needed for inference on a dataset. E.g.: sample_file.

More details about the parameters used in training & inference and how to choose them are available in the sample configuration files shared above.

  • video_session_streams: Contains the observed data for each video session, such as download time, chunk size, and TCP states (when available). Each line in a video session file describes one chunk payload of the session. The fields are: start time (numpy.datetime64), end time (numpy.datetime64), size (KB), trans_time/download time (ms), cwnd (number), rtt (ms), rto (ms), ssthresh (number), last_snd (s), min_rtt (ms), delivery_rate (-). E.g.: sample_file.
  • ground_truth_capacity: Useful for evaluating the performance of Veritas by comparing the inferred values with ground truth, and for plotting figures. In emulation experiments the INB is known, and Veritas samples aspire to match it. For real-world data the INB is unknown, so a best guess (or dummy values) can be provided. Note that this data is not used by the core Veritas logic; it is only used for comparison when ground truth information is available. Each line in a ground truth capacity file includes the ground truth bandwidth (Mbps) and the start time (numpy.datetime64) for that capacity. E.g.: sample_file.
  • full.json: Contains the list of video session files to be used for evaluation. E.g.: sample_file. This file identifies the sessions used for training, validation, and inference. In our case, we use all sessions for both training and inference, so full.json lists the names of all sessions in the video_session_streams directory. It can be generated with the following script:
    python3 scripts/get_full.py --input_directory <path_to_input_directory>
    
  • fhash.json: Contains a hash value for each file in the video_session_streams and ground_truth_capacity directories. It uniquely identifies the input files and helps in logging the results. E.g.: sample_file. It can be generated with the following script (a sketch of what both helper files contain follows below):
    python3 scripts/get_fhash.py --input_directory <path_to_input_directory>
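For illustration only, the sketch below shows one way such helper files could be produced and the kind of content they hold. The schema (a flat JSON list of session file names for full.json, a file-path-to-SHA-256 map for fhash.json) is an assumption; the bundled scripts/get_full.py and scripts/get_fhash.py remain the authoritative generators.

# Hypothetical sketch, not the artifact's scripts: assumes full.json is a flat list
# of session file names and fhash.json maps file paths to SHA-256 digests.
import hashlib
import json
import os
import sys

input_directory = sys.argv[1]  # e.g. src/data/datasets/Controlled-GT-Cubic-BBA-LMH

# full.json: list every session file in video_session_streams/.
sessions_dir = os.path.join(input_directory, "video_session_streams")
sessions = sorted(os.listdir(sessions_dir))
with open(os.path.join(input_directory, "full.json"), "w") as f:
    json.dump(sessions, f, indent=2)

# fhash.json: hash every file in both data directories.
hashes = {}
for subdir in ("video_session_streams", "ground_truth_capacity"):
    for name in sorted(os.listdir(os.path.join(input_directory, subdir))):
        path = os.path.join(input_directory, subdir, name)
        with open(path, "rb") as f:
            hashes[os.path.join(subdir, name)] = hashlib.sha256(f.read()).hexdigest()
with open(os.path.join(input_directory, "fhash.json"), "w") as f:
    json.dump(hashes, f, indent=2)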
    

For reference, we have shared a dataset used in the paper that contains the files and directories described above. It holds video sessions emulated using the BBA ABR algorithm with a 15s client buffer. More details about the emulation setup are available in the paper.

Using Veritas

The following steps run Veritas for training and inference. We use the dataset above as input, but any user-provided input directory with the structure defined above can be used. Note that all commands are run from the home directory, VeritasML.

  1. Training: The parameters (general, HMM, video sessions, etc.) from the training configuration file in the input directory are used for training. The trained model is saved in the logs/fit/ directory with the name <current_timestamp>:<suffix_in_the_config_file>.
    python3 scripts/train.py --input_directory <path_to_input_directory>
    E.g.: python3 scripts/train.py --input_directory src/data/datasets/Controlled-GT-Cubic-BBA-LMH
    
  2. Inference: The output model from training, together with the parameters (number of samples, duration of samples, etc.) from the inference configuration file in the input directory, is used to infer INB traces.
    python3 scripts/inference.py --input_directory <path_to_input_directory> --trained_model <path_to_trained_model>
    E.g.: python3 scripts/inference.py --input_directory src/data/datasets/Controlled-GT-Cubic-BBA-LMH --trained_model <path_to_trained_model>
    
    The location and contents of the output directory are as follows:
    logs/transform/<current_timestamp>:<suffix_in_the_config_file>
       |_ sample
          |_<session_1>
            |_ sample_full.csv
          |_<session_2>
            |_ sample_full.csv
          ...
          |_<session_1.png>
          |_<session_2.png>
          ...
    

Suppose that, for each video session, we want to sample INB traces of duration <num_sample_seconds> = 300s and obtain <num_random_samples> = 3 samples (both defined in the inference configuration file), and that the transition step size (the duration for which the INB remains constant) set during training is 5s. Then each sample_full.csv contains a header row of sample indices followed by 300/5 = 60 rows, each holding 3 comma-separated inferred INB values for the given session. E.g.: sample_full.csv.

0,1,2
4.5,4.5,4.5
3.5,3.5,3.5
3.0,3.5,3.0
3.0,3.5,3.0
3.0,3.0,3.0
3.5,3.5,3.0
...
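As a quick sanity check, such a file can be loaded and summarized with a few lines of Python. This is a minimal sketch, assuming the layout shown above (a header row of sample indices, one row per transition step); the placeholder path must be replaced with an actual run directory.

# Minimal sketch: load one inferred-INB sample file and summarize it.
import pandas as pd

samples = pd.read_csv("logs/transform/<run>/sample/<session_1>/sample_full.csv")

print(samples.shape)         # (num_steps, num_random_samples), e.g. (60, 3)
print(samples.mean(axis=1))  # per-step mean INB across the random samples
print(samples.quantile([0.1, 0.9], axis=1))  # per-step spread, reflecting uncertainty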

Further, the output directory also contains figures comparing the Ground Truth, the Baseline (defined in the paper), and the Veritas-inferred INB traces. E.g.: sample figure. For more details, please see Fig. 7 in the paper.

Using the inferred INB traces

The video session dataset above used the BBA ABR with a 15s client buffer. Suppose we want to know the performance of BBA if the client buffer size were changed from 15s to 5s. We take the inferred INB traces (from sample_full.csv) and run an emulation with the new settings, i.e., BBA with a 5s buffer, using emulation tools such as Mahimahi; a sketch of converting a trace for Mahimahi is shown below. In the emulation environment, we can then directly evaluate the performance of the proposed change and answer the what-if query.
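The following is an illustrative sketch, not part of the artifact: it converts one inferred INB sample into a Mahimahi packet-delivery trace. It assumes the sampled values are in Mbps (matching the ground-truth capacity files), that each value holds for the 5s transition step used during training, and that each line of a Mahimahi trace is a timestamp (in ms) at which one MTU-sized (1500-byte) packet can be delivered.

# Hypothetical conversion sketch (assumptions noted above).
import pandas as pd

STEP_SECONDS = 5        # transition step size set during training (assumption)
BYTES_PER_PKT = 1500.0  # MTU-sized delivery opportunity
SAMPLE_COLUMN = "0"     # which of the random samples to replay (column name from the header row)

samples = pd.read_csv("sample_full.csv")
lines = []
t_ms = 0.0
for mbps in samples[SAMPLE_COLUMN]:
    step_end = t_ms + STEP_SECONDS * 1000.0
    if mbps <= 0:       # no delivery opportunities during this step
        t_ms = step_end
        continue
    ms_per_pkt = 1.0 / (mbps * 1e6 / 8.0 / BYTES_PER_PKT / 1000.0)
    while t_ms < step_end:
        lines.append(str(int(round(t_ms))))
        t_ms += ms_per_pkt
with open("inb_sample.trace", "w") as f:
    f.write("\n".join(lines) + "\n")

The resulting trace file can then be supplied to Mahimahi's mm-link together with the modified ABR or buffer configuration.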

Veritas parameters

As mentioned above, the details of the parameters used for training and inference are provided in the config files. They are also available by running the following commands from the VeritasML directory:

Training: python3 fit.py -h
Inference: python3 transform.py -h

One of the special parameters used by Veritas is the domain-specific emission model (f). Veritas has the flexibility to use custom functions for the emission models of its High-order Embedded Hidden Markov Model (HoEHMM). The emission functions are passed in the fit.py and transform.py files. These functions take as input the fields described in the video session files (except download time) together with the possible capacity values considered for abduction, and return the estimated throughput. For reference, a few emission functions are included in the fit.py and transform.py files in the VeritasML directory.
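For intuition only, the sketch below illustrates the general shape such an emission function might take. The specific formula (capping a candidate INB by a cwnd/RTT-implied rate) is an assumption made for illustration, not the paper's model; the emission functions shipped in fit.py and transform.py are the reference.

# Hypothetical emission-function sketch (illustrative assumption, not the artifact's model):
# given a candidate INB value and the observed TCP state of a chunk, return the
# throughput the chunk would be expected to achieve under that candidate capacity.
MSS_BYTES = 1500.0

def example_emission(candidate_inb_mbps, chunk):
    """candidate_inb_mbps: one possible INB value considered during abduction.
    chunk: observed fields for one chunk, e.g. {'cwnd': ..., 'rtt': ..., 'min_rtt': ...}.
    Returns the estimated throughput in Mbps."""
    rtt_s = chunk["rtt"] / 1000.0  # rtt is reported in ms
    # Rate implied by the congestion window: cwnd MSS-sized packets per RTT.
    cwnd_rate_mbps = chunk["cwnd"] * MSS_BYTES * 8.0 / rtt_s / 1e6
    # The chunk cannot exceed either the candidate network capacity or the
    # rate the sender's congestion window allows.
    return min(candidate_inb_mbps, cwnd_rate_mbps)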

Other datasets

We have shared other datasets (along with config files) used in our emulation experiments:

  • MPC ABR, 15s buffer: Used to answer the what-if question of changing the ABR from MPC to BBA with the same buffer size.
  • BBA ABR, 15s buffer, low qualities only: Used to answer the what-if question of adding higher qualities in deployment.
  • Puffer (Aug 24, 2020), [all streams: BBA, Bola1, Bola2; slow streams: BBA, Bola1, Bola2]: We use the Puffer data to perform validations on real-world data. "All streams" includes all sessions for the given ABR, while "slow streams" includes only the sessions with a mean delivery rate below 6 Mbps.

We follow the steps described in this section (changing only the input directory path) to infer the INB traces, and then run the counterfactual query to obtain the results.

Contact

Please contact [email protected] for any questions.


Issues

Training script does not work with provided Puffer dataset.

Hi, I'm training Veritas with one of the provided datasets. Concretely, I'm trying to run:

python scripts/train.py --input_directory src/data/datasets/Aug24-Slow-Bola1

Unfortunately, this does not work. The first error is the following:

> src/veritas/frameworks/fit/hmm/stream.py(156)parse()
-> assert self._capmin < self._capunit, "Minimum capacity should be strictly smaller than capacity unit."

After some digging, I discovered that the provided scripts do not set --capacity_min, and Veritas uses the default value of 0.1, which is larger than the capacity unit of 0.05 configured in train_config.yaml.

This issue can be fixed by updating the train scripts to include --capacity_min. I have chosen a value of 0.01 for now, but could you let me know which value was used for the results in the paper?

This is not the only issue, though. After fixing capacity min, Veritas trains successfully for one epoch, then crashes. This is the observed output:

+-------+-------------------------+----------+--------+
|       |                NLL.Mean |          |        |
| Epoch +------------+------------+ Time.Sec | Signal |
|       |      Train |      Valid |          |        |
+-------+------------+------------+----------+--------+
|     0 |        inf |   0.036072 |   23.423 |      ↓ |
Traceback (most recent call last):
 (...)
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.DoubleTensor [298]], which is output 0 of AsStridedBackward0, is at version 2; expected version 1 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).

I am not sure what is going on, but the "inf" value under Train definitely doesn't look right.

According to this section in the README, I should be able to use the other datasets as well, correct?

How can I train Veritas with the provided data?

Thank you for the advice.

Question regarding CausalSim (NoRCT) in the Veritas paper

Hi,

I'm not sure whether this is the right place for it, but I saw in your paper that you retrained CausalSim without RCT data.

Can I find the code and data you used for this somewhere? I'd like to produce plots similar to those in the paper, and it would be helpful to be able to build on your work.

Thanks and best,
Alex

How can I obtain TCP's state information?

In the process of reading this paper, I have gained valuable insights. May I ask the author where the code for obtaining TCP state information is located and how it was obtained?
