
dragonnet's People

Contributors

arose13, claudiashi57, vveitch


dragonnet's Issues

IHDP Dataset query

Hello @claudiashi57,

Could you tell me why there are 50 CSV files for the IHDP dataset? Does it make sense to combine all the CSVs into one big CSV and then make observations on it?

Query about NPCI data

Could you please explain how you used NPCI to generate the 1k data files mentioned in the paper?
I tried to ask the author and he suggested getting back to you: vdorie/npci#2
These files would be useful to me for replicating your results and comparing them with our own approach.

Upgrade for imports and functions

  • The import has been updated from keras.optimizers to tensorflow.keras.optimizers (see the sketch after this list).
  • y_scaler.inverse_transform requires a 2-dimensional array, so reshaping is required.
  • tf.random.set_random_seed() has been updated to tf.random.set_seed().
  • The lr parameter is deprecated for both the Adam and SGD optimizers; it has been replaced with learning_rate.
  • from keras.engine.topology import Layer has been updated to from tensorflow.keras.layers import Layer.
  • The script at src/experiment/run_ihdp.sh has been updated to make it more generic.
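
Collecting these changes, a minimal sketch of the TF2-style usage (assuming TensorFlow 2.x and scikit-learn; variable names and values are illustrative, not taken from the repo):

import numpy as np
import tensorflow as tf
from tensorflow.keras.optimizers import Adam, SGD   # was: from keras.optimizers import Adam, SGD
from tensorflow.keras.layers import Layer           # was: from keras.engine.topology import Layer
from sklearn.preprocessing import StandardScaler

tf.random.set_seed(0)                 # was: tf.random.set_random_seed(0)
optimizer = Adam(learning_rate=1e-3)  # was: Adam(lr=1e-3)

# y_scaler.inverse_transform expects a 2-D array, so reshape 1-D predictions first.
y_scaler = StandardScaler().fit(np.random.randn(100, 1))
y_pred = np.random.randn(100)                        # shape (100,)
y_orig = y_scaler.inverse_transform(y_pred.reshape(-1, 1))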

ihdp data indices

Hi,

Thanks for sharing your interesting work. I am trying to work through some of the results of the paper

I noticed that the column indices mentioned in idhp_data.py:

 binfeats = [6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24]
 contfeats = [i for i in range(25) if i not in binfeats]

They do not match the columns in the CSV files contained in the dat folder, e.g. ihdp_npci_1.csv.

Can you please advise whether these are the correct reference files?
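
For reference, a rough sketch of how I am checking this, assuming the CSVs have no header and the first five columns are treatment/outcome columns followed by the 25 covariates (this layout is my assumption, not taken from the repo):

import pandas as pd

df = pd.read_csv("dat/ihdp_npci_1.csv", header=None)
covariates = df.iloc[:, 5:]  # assumed: columns after the first five are covariates
binary_like = [i for i, c in enumerate(covariates.columns)
               if covariates[c].nunique() <= 2]
print(binary_like)  # compare against binfeats above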

Thanks in advance.

about GPIO with wiringpi

There is no longer a site where I can use wiringpi to upgrade my Raspberry Pi's GPIO. Could you please create the repository again?

Query about the IHDP data folder

According to the original paper (Hill 2011) there are 747 units (139 treated, 608 control). In the dat folder there are 50 CSV files.

Which is the original CSV?
Also, why are there 50 CSVs? Are they simulated?

Correct test_size in train_test_split of ihdp_main.py to reproduce in-sample and out-sample paper results

From documentation:
Note: the default code uses all the data for prediction and estimation. If you want to get the in-sample or out-of-sample error: i) change the train_test_split criteria in ihdp_main.py; ii) rerun the neural net training; iii) run ihdp_ate.py with the appropriate in-sample and out-of-sample data.

From paper:
We randomly split the data into test/validation/train with proportion 63/27/10 and report the in sample and out of sample estimation errors.

Is this split correct (train size 10% vs. test size 63%)?
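
For context, a minimal sketch of the split I would expect from the paper's description, assuming scikit-learn's train_test_split (the fractions and random_state are illustrative):

import numpy as np
from sklearn.model_selection import train_test_split

idx = np.arange(747)  # 747 IHDP units
rest_idx, test_idx = train_test_split(idx, test_size=0.63, random_state=0)               # 63% test
train_idx, val_idx = train_test_split(rest_idx, test_size=0.27 / 0.37, random_state=0)   # ~27% val, ~10% train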

Demo notebook on simulated examples to check correctness of implementation?

Hi,

thanks for putting the code together!

I tried to train dragonnet on toy examples (Kang-Schafer) with a treatment effect of 0. Various variations of hidden layers, batch sizes, etc. yield estimates that are completely off (ATE = -60). Did you test your implementation on toy examples from the literature, or on your own simulated ground-truth data, to verify that it works as intended?

Some example code

import numpy as np
import pandas as pd
import empirical_calibration as ec
from causalml.inference.meta import XGBTRegressor

np.random.seed(123)
simulation = ec.data.kang_schafer.Simulation(size=2000)

t = simulation.treatment.reshape(-1, 1)
x = simulation.transformed_covariates
y = simulation.outcome.reshape(-1, 1)
t.shape, x.shape, y.shape

# Use causalml to show other methods work as intended
def _ks_df(size, seed=None):
    if seed is not None:
        np.random.seed(seed)
    simulation = ec.data.kang_schafer.Simulation(size=size)
    df = pd.DataFrame(
        np.column_stack([
            simulation.treatment, simulation.covariates,
            simulation.transformed_covariates, simulation.outcome
        ]))
    df.columns = [
        "treatment", "z1", "z2", "z3", "z4", "x1", "x2", "x3", "x4", "outcome"
    ]
    return df

df = _ks_df(size=1000)

xg = XGBTRegressor(random_state=42)
te, lb, ub = xg.estimate_ate(df[["x1", "x2", "x3", "x4"]],
                             df["treatment"], df["outcome"])
print('Average Treatment Effect (XGBoost): {:.2f} ({:.2f}, {:.2f})'.format(te[0], lb[0], ub[0]))

When using dragonnet via the acic_main functions, the estimates are entirely off, essentially equal to the very naive method of taking outcome differences (ATE estimates of roughly -20).

test_outputs, train_outputs = acic_main.train_and_predict_dragons(
    t, y, x,
    targeted_regularization=True,
    output_dir="",
    dragon="dragonnet",
    knob_loss=models.dragonnet_loss_binarycross,
    ratio=1.,
    val_split=0.2,
    batch_size=64,
    hidden_size_multiplier=2,
    verbose=False)

Doing this 10 times for a sample size of n=5000, the ATE estimates (by method) are as follows:
[image: ATE estimates by method]

The same happens with targeted_regularization=False. Do you have notebooks or documentation on running this on such toy examples to verify the implementation?

Interested in the Table Results

Hi Claudia,

Your work is interesting! I am a little confused about Table 1 (why are there two TARNet results?), shown below:
[screenshot: Table 1 from the paper]

I noticed that the statistics in the upper section are cited from the original papers, and that your own test results are reported in the bottom section. Since all of the algorithms in the upper section use the data provided with the TARNet paper (which I assume is the widely used simulated IHDP data), while you use your own simulated IHDP dataset, is that why you run a TARNet baseline on your own simulated data, and hence why we see two TARNet results in Table 1?

Thanks for any reply in advance!

Regards,
Hechuan

Precisions concerning the ITE computation

Hi Claudia,

Thanks for your work. Could you please elaborate on how the ITE is computed in the semi-parametric estimation file?

[screenshot: ITE computation snippet from the semi-parametric estimation code]

In particular, I don't understand how the $\hat{\epsilon}$ term relates to the paper. Isn't the targeted regularization part already handled during the training phase?
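
For reference, my reading of the paper's targeted regularization is that the outcome model is perturbed as

$$\tilde{Q}(t, x) = Q^{\mathrm{nn}}(t, x) + \hat{\epsilon}\left(\frac{t}{g^{\mathrm{nn}}(x)} - \frac{1 - t}{1 - g^{\mathrm{nn}}(x)}\right),$$

with $\hat{\epsilon}$ fitted jointly with the network parameters during training, which is why I am unsure what the extra $\hat{\epsilon}$ term in the estimation code corresponds to.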

Regards,

Armand

Multiple treatments

Hi, thank you for your work. It is very interesting. I am currently trying to adapt your work to my problem, but my problem has several possible treatments and I am having difficulty generalizing some of the equations. Could you kindly provide some guidance on this?
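
For concreteness, here is a rough sketch of the direction I am considering (not from the repo; layer sizes and names are illustrative): a softmax propensity head over K treatments plus one outcome head per treatment arm, mirroring the two-headed architecture.

import tensorflow as tf
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

def make_multi_treatment_dragonnet(input_dim, num_treatments):
    # shared representation, as in the original two-treatment architecture
    inputs = Input(shape=(input_dim,))
    z = Dense(200, activation='elu')(inputs)
    z = Dense(200, activation='elu')(z)
    z = Dense(200, activation='elu')(z)
    # propensity head: P(T = k | x) via a softmax over the K treatments
    t_pred = Dense(num_treatments, activation='softmax', name='t_pred')(z)
    # one conditional-outcome head per treatment arm
    y_heads = []
    for k in range(num_treatments):
        h = Dense(100, activation='elu')(z)
        h = Dense(100, activation='elu')(h)
        y_heads.append(Dense(1, name='y{}_pred'.format(k))(h))
    return Model(inputs=inputs, outputs=y_heads + [t_pred])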

Code does not match with description in paper

Thank you for sharing your code-base publicly. The idea presented in the paper is interesting. There are, however, several disparities between this code-base and the paper; these include:

  1. Not only do make_tarnet and make_dragonnet share the same code, but the same objective function is also used to learn the parameters of TARNet and Dragonnet. Therefore, the results must be the same.

  2. It is mentioned in the paper that:

To find the relevant parts of X, first, train a deep net to predict T. Then remove the final (predictive) layer. Finally, use the activation of the remaining net as features for predicting the outcome.

However, the code is implemented such that both the outcome loss and the cross-entropy loss are optimized in the same objective function.
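
To illustrate the distinction, a rough sketch of the two-stage procedure quoted above (this is not the repo's implementation; layer sizes and names are illustrative):

import tensorflow as tf
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

# Stage 1: train a deep net to predict the treatment T from X.
def make_treatment_net(input_dim):
    inputs = Input(shape=(input_dim,))
    h = Dense(200, activation='elu')(inputs)
    h = Dense(200, activation='elu')(h)
    t_out = Dense(1, activation='sigmoid', name='t_pred')(h)
    return Model(inputs, t_out)

# Stage 2: drop the final predictive layer and reuse the remaining
# activations as fixed features for an outcome model, so only the
# outcome loss is optimized in this stage.
def make_outcome_net_from(treatment_net):
    trunk = Model(treatment_net.input, treatment_net.layers[-2].output)
    trunk.trainable = False
    h = Dense(100, activation='elu')(trunk.output)
    y_out = Dense(1, name='y_pred')(h)
    return Model(trunk.input, y_out)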
