ECP-CANDLE Benchmarks
License: MIT License
Please refer to CANDLE Documentation Home for tutorials and the CANDLE Library API.
Not sure how else to handle this.
Need to use the new on_bad_lines argument after Pandas 1.4.0 (it replaces the deprecated error_bad_lines/warn_bad_lines flags).
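A minimal sketch of the version guard this implies, assuming a tab-separated input file; the actual read calls in the Benchmarks may pass different options:

import pandas as pd

# on_bad_lines (pandas >= 1.3) replaces the deprecated error_bad_lines/warn_bad_lines flags.
_PANDAS_GE_1_3 = tuple(int(x) for x in pd.__version__.split(".")[:2]) >= (1, 3)

if _PANDAS_GE_1_3:
    df = pd.read_csv("data.tsv", sep="\t", on_bad_lines="warn")   # warn and skip malformed rows
else:
    df = pd.read_csv("data.tsv", sep="\t", error_bad_lines=False)  # older equivalent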
While running Pilot1 NT3 with the command python nt3_baseline_keras2.py --conf nt3_perf_bench_model.txt,
I ran into this error caused by a missing parameter:
Traceback (most recent call last):
  File "nt3_baseline_keras2.py", line 290, in <module>
    main()
  File "nt3_baseline_keras2.py", line 286, in main
    run(gParameters)
  File "nt3_baseline_keras2.py", line 101, in run
    X_train, Y_train, X_test, Y_test = load_data(train_file, test_file, gParameters)
  File "nt3_baseline_keras2.py", line 70, in load_data
    if gParameters['add_noise']:
KeyError: 'add_noise'
Issue is caused by these lines
Benchmarks/Pilot1/NT3/nt3_baseline_keras2.py
Lines 68 to 82 in a48c85a
It seems like the candle parser being used never includes the parameters being checked here
Benchmarks/Pilot1/NT3/nt3_baseline_keras2.py
Lines 30 to 31 in a48c85a
So, is there a different way to run this (maybe different flags) to avoid the issue? Commenting out lines 68-82 in nt3_baseline_keras2.py obviously works, but I was not sure whether parameters such as 'add_noise' will ever make it through to NT3. If not, maybe commenting out these lines permanently would save others some trouble?
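One defensive alternative to deleting the block (a sketch, assuming gParameters behaves like a plain dict, as the traceback above suggests; every parameter name other than 'add_noise' below is hypothetical):

# Treat the noise option as optional instead of assuming the parser always sets it.
add_noise = gParameters.get('add_noise', False)
if add_noise:
    std_dev = gParameters.get('std_dev', 0.0)   # hypothetical companion parameter
    # ... apply the noise injection done by lines 68-82 here ...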
./Pilot1/P1B3/p1b3.py:import logging
./Pilot1/P1B3/p1b3_baseline_keras2.py:import logging
./Pilot1/Uno_UQ/uno_holdoutUQ_data.py:import logging
./Pilot1/Uno_UQ/uno_inferUQ_keras2.py:import logging
./Pilot1/Uno_UQ/uno_trainUQ_keras2.py:import logging
./Pilot1/Uno_UQ/data_utils_/uno_combined_data_loader.py:import logging
./Pilot1/Uno_UQ/data_utils_/uno.py:import logging
./Pilot1/P1B2/p1b2.py:import logging
./Pilot1/P1B2/p1b2_baseline_neon.py:import logging
./Pilot1/TC1/tc1.py:import logging
./Pilot1/Combo/combo_baseline_keras2.py:import logging
./Pilot1/Combo/combo_dose.py:import logging
./Pilot1/Combo/combo.py:import logging
./Pilot1/UnoMT/utils/data_processing/label_encoding.py:import logging
./Pilot1/UnoMT/utils/data_processing/dataframe_scaling.py:import logging
./Pilot1/UnoMT/utils/data_processing/response_dataframes.py:import logging
./Pilot1/UnoMT/utils/data_processing/drug_dataframes.py:import logging
./Pilot1/UnoMT/utils/data_processing/cell_line_dataframes.py:import logging
./Pilot1/UnoMT/utils/datasets/drug_qed_dataset.py:import logging
./Pilot1/UnoMT/utils/datasets/drug_target_dataset.py:import logging
./Pilot1/UnoMT/utils/datasets/cl_class_dataset.py:import logging
./Pilot1/UnoMT/utils/datasets/drug_resp_dataset.py:import logging
./Pilot1/UnoMT/utils/miscellaneous/file_downloading.py:import logging
./Pilot1/UnoMT/networks/initialization/encoder_init.py:import logging
./Pilot1/UnoMT/unoMT.py:import logging
./Pilot1/P1B1/p1b1.py:import logging
./Pilot1/Uno/uno_mixedprecision_tfkeras.py:import logging
./Pilot1/Uno/uno_baseline_keras2.py:import logging
./Pilot1/Uno/uno_data.py:import logging
./Pilot1/Uno/uno.py:import logging
./Pilot2/P2B1/p2b1_baseline_keras2.py:import logging
./common/candle_keras/__init__.py:from keras_utils import LoggingCallback
./common/generic_utils.py:import logging
./common/default_utils.py:import logging
./common/candle/__init__.py: from keras_utils import LoggingCallback
Is there a distributed implementation of these benchmarks in TensorFlow/Keras?
Thank you.
Belongs here:
ECP-CANDLE/Supervisor#72
Use --get_data_only, then run with local data.
echo $CANDLE_DATA_DIR
/homes/brettin/Singularity/workspace/data_dir
ls $CANDLE_DATA_DIR
uno_input_data.h5
OSError: /homes/brettin/Singularity/workspace/uno_input_data.h5 does not exist
[Global_Params]
train_sources=['CCLE', 'GDSC', 'CTRP', 'ALMANAC']
#export_data='uno_input_data.h5'
use_exported_data='uno_input_data.h5'
test_sources=['train']
cell_types=None
cell_features=['rnaseq']
drug_features=['descriptors']
dense=[1000, 1000, 1000, 1000, 1000]
dense_feature_layers=[1000, 1000, 1000]
activation='relu'
loss='mse'
optimizer='adamax'
scaling='std'
dropout=.1
epochs=1
batch_size=32
val_split=0.2
cv=1
max_val_loss=1.0
learning_rate=0.0001
base_lr=None
agg_dose='AUC'
residual=False
reduce_lr=True
warmup_lr=True
batch_normalization=False
feature_subsample=0
rng_seed=2018
no_gen=False
verbose=False
preprocess_rnaseq='source_scale'
gpus=[0]
use_landmark_genes=True
no_feature_source=True
no_response_source=True
cp=True
save_path='save/uno'
output_dir='output/uno'
single=True
on_memory_loader=True
[Monitor_Params]
timeout=-1
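The transcript above shows uno_input_data.h5 sitting in $CANDLE_DATA_DIR while the loader looks for it in the working directory. A hedged sketch of the kind of path resolution that would cover this case (the real Uno loader may resolve use_exported_data differently):

import os

def resolve_exported_data(path):
    """Return a usable location for use_exported_data, preferring CANDLE_DATA_DIR
    when the configured value is a bare filename."""
    if os.path.isabs(path) or os.path.exists(path):
        return path
    candidate = os.path.join(os.environ.get("CANDLE_DATA_DIR", "."), path)
    return candidate if os.path.exists(candidate) else path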
Do not force big input data to reside inside a git clone. This forces code and big data to reside on the same FS (I use a soft link to get around this), and is likely to trigger git mistakes.
Allow the user to invoke a Benchmark in download-only mode, which simply downloads the input data if it does not already exist. This is necessary on supercomputers. This mode should not import keras or any other modules not required for the data download.
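A minimal sketch of such a download-only entry point, assuming a hypothetical --download_only flag and a plain urllib fetch (the real Benchmarks download data through the candle utilities, and the URL below is a placeholder):

import argparse
import os
import urllib.request

def maybe_download(url, dest_dir):
    """Fetch url into dest_dir only if the file is not already present."""
    os.makedirs(dest_dir, exist_ok=True)
    dest = os.path.join(dest_dir, os.path.basename(url))
    if not os.path.exists(dest):
        urllib.request.urlretrieve(url, dest)
    return dest

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--download_only", action="store_true")
    args = parser.parse_args()
    data_dir = os.environ.get("CANDLE_DATA_DIR", ".")
    maybe_download("https://example.org/placeholder_input_data.h5", data_dir)
    if args.download_only:
        raise SystemExit(0)   # stop before any keras import or model construction
    # heavy imports and training would only happen past this point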
Hello,
I was looking at this baseline: https://github.com/ECP-CANDLE/Benchmarks/blob/release_01/Pilot1/Combo/combo_baseline_keras2.py
I was wondering whether the data are already preprocessed/normalized, because I don't see any preprocessing function.
[1]+ nohup singularity exec --nv ../images/uno-tensorflow:2.8.2-gpu-20220624.sif train.sh 0 $CANDLE_DATA_DIR /homes/brettin/Singularity/workspace/configs/uno_auc_model.txt &
(base) brettin@lambda7:~/Singularity/workspace/top21_uno$ tail -f nohup.out
'timeout': -1,
'train_bool': True,
'train_sources': ['CCLE', 'GDSC', 'CTRP', 'NCI60'],
'use_exported_data': 'top21_uno_v2.h5',
'use_filtered_genes': False,
'use_landmark_genes': True,
'val_split': 0.2,
'verbose': False,
'warmup_lr': True}
Params: {'train_sources': ['CCLE', 'GDSC', 'CTRP', 'NCI60'], 'use_exported_data': 'top21_uno_v2.h5', 'shuffle': True, 'test_sources': ['train'], 'cell_types': None, 'cell_features': ['rnaseq'], 'drug_features': ['descriptors'], 'dense': [1000, 1000, 1000, 1000, 1000], 'dense_feature_layers': [1000, 1000, 1000], 'activation': 'relu', 'loss': 'mse', 'optimizer': 'adamax', 'scaling': 'std', 'dropout': 0.1, 'epochs': 400, 'batch_size': 32, 'val_split': 0.2, 'cv': 1, 'max_val_loss': 1.0, 'learning_rate': 0.0001, 'base_lr': None, 'agg_dose': 'AUC', 'residual': False, 'reduce_lr': True, 'warmup_lr': True, 'batch_normalization': False, 'feature_subsample': 0, 'rng_seed': 2018, 'no_gen': False, 'verbose': False, 'preprocess_rnaseq': 'source_scale', 'gpus': [0], 'use_landmark_genes': True, 'no_feature_source': True, 'no_response_source': True, 'cp': True, 'save_path': 'save/uno', 'output_dir': '/homes/brettin/Singularity/workspace/top21_uno/output/uno/EXP000/RUN000', 'single': True, 'on_memory_loader': True, 'timeout': -1, 'train_bool': True, 'profiling': False, 'experiment_id': 'EXP000', 'run_id': 'RUN000', 'logfile': None, 'ckpt_restart_mode': 'auto', 'ckpt_checksum': False, 'ckpt_skip_epochs': 0, 'ckpt_directory': './save', 'ckpt_save_best': True, 'ckpt_save_best_metric': 'val_loss', 'ckpt_save_weights_only': False, 'ckpt_save_interval': 0, 'ckpt_keep_mode': 'linear', 'ckpt_keep_limit': 1000000, 'by_cell': None, 'by_drug': None, 'cell_subset_path': '', 'drug_subset_path': '', 'drug_median_response_min': -1, 'drug_median_response_max': 1, 'dense_cell_feature_layers': None, 'dense_drug_feature_layers': None, 'use_filtered_genes': False, 'feature_subset_path': '', 'cell_feature_subset_path': '', 'drug_feature_subset_path': '', 'es': False, 'tb': False, 'tb_prefix': 'tb', 'partition_by': None, 'cache': None, 'export_csv': None, 'export_data': None, 'growth_bins': 0, 'initial_weights': None, 'save_weights': None, 'config_file': '/homes/brettin/Singularity/workspace/configs/uno_auc_model.txt', 'data_type': <class 'numpy.float32'>}
/usr/local/Benchmarks/Pilot1/Uno/uno_baseline_keras2.py:14: DeprecationWarning: Please use pearsonr from the scipy.stats namespace, the scipy.stats.stats namespace is deprecated.
  from scipy.stats.stats import pearsonr
WARNING:tensorflow:From /usr/local/Benchmarks/Pilot1/Uno/uno_baseline_keras2.py:269: The name tf.keras.backend.set_session is deprecated. Please use tf.compat.v1.keras.backend.set_session instead.
Traceback (most recent call last):
File "/usr/local/Benchmarks/Pilot1/Uno/uno_baseline_keras2.py", line 676, in
main()
File "/usr/local/Benchmarks/Pilot1/Uno/uno_baseline_keras2.py", line 672, in main
run(params)
File "/usr/local/Benchmarks/Pilot1/Uno/uno_baseline_keras2.py", line 272, in run
loader.load(
File "/usr/local/Benchmarks/Pilot1/Uno/uno_data.py", line 1142, in load
with pd.HDFStore(use_exported_data, "r") as store:
File "/usr/local/lib/python3.8/dist-packages/pandas/io/pytables.py", line 591, in init
self.open(mode=mode, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/pandas/io/pytables.py", line 740, in open
self._handle = tables.open_file(self._path, self._mode, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/tables/file.py", line 300, in open_file
return File(filename, mode, title, root_uep, filters, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/tables/file.py", line 750, in init
self._g_new(filename, mode, **params)
File "tables/hdf5extension.pyx", line 368, in tables.hdf5extension.File._g_new
File "/usr/local/lib/python3.8/dist-packages/tables/utils.py", line 143, in check_file_access
raise OSError(f"{path}
does not exist")
OSError: /homes/brettin/Singularity/workspace/top21_uno/top21_uno_v2.h5 does not exist
(base) brettin@lambda7:~/Singularity/workspace/top21_uno$ echo $CANDLE_DATA_DIR
/homes/brettin/Singularity/workspace/data_dir
(base) brettin@lambda7:~/Singularity/workspace/top21_uno$ ls $CANDLE_DATA_DIR
Pilot1  top21_uno_v2.h5  uno_input_data.h5
We need the API for HPO runs on Polaris - has anyone started on this?
save is defined and assigned a value in uno_default_params.txt
when --save is specified on the command line, it does not override the value in the default_params.txt file
When I hack default_utils.py and add
parser.add_argument('--save', ...
it seems to work.
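A hypothetical illustration of that workaround (the type, default, and help text below are assumptions, not what default_utils.py actually registers): using argparse.SUPPRESS as the default means the flag only enters the namespace when the user passes it, so it can cleanly override the config-file value.

import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--save', type=str, default=argparse.SUPPRESS,
                    help='prefix for output/checkpoint paths (hypothetical help text)')
args, _ = parser.parse_known_args()
if hasattr(args, 'save'):
    print('command line overrides config:', args.save)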
State AE input/output dimension: 24320
Traceback (most recent call last):
File "p2b1_baseline_keras2.py", line 300, in
main()
File "p2b1_baseline_keras2.py", line 297, in main
run(gParameters)
File "p2b1_baseline_keras2.py", line 178, in run
type_feat_vect = fields.keys()[3:8]
TypeError: 'odict_keys' object is not subscriptable
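In Python 3, dict views are not subscriptable, so a straightforward fix (a sketch; the surrounding p2b1 code may slice differently) is to materialize the keys first:

# fields is an OrderedDict; list() makes its keys indexable again under Python 3.
type_feat_vect = list(fields.keys())[3:8]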
It would be nice to have the optimizer string be case insensitive. For example, build_optimizer could accept both 'sgd' and 'SGD' as valid optimizers.
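A minimal sketch of the suggestion (the real build_optimizer in the CANDLE common code likely takes more arguments, such as framework defaults; this only shows the case normalization):

from tensorflow.keras import optimizers

def build_optimizer(name, lr):
    """Case-insensitive optimizer lookup."""
    table = {
        'sgd': optimizers.SGD,
        'adam': optimizers.Adam,
        'adamax': optimizers.Adamax,
        'rmsprop': optimizers.RMSprop,
    }
    key = name.lower()                      # 'SGD', 'sgd', and 'Sgd' all match
    if key not in table:
        raise ValueError(f'unknown optimizer: {name}')
    return table[key](learning_rate=lr)

# build_optimizer('SGD', 0.01) and build_optimizer('sgd', 0.01) now return the same thing.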
Investigate availability of Globus downloads for CANDLE input data. Collaborate with ExaLearn?
When I try to run
python uno_baseline_keras2.py --config_file uno_by_drug_example.txt
I am getting the following error:
Traceback (most recent call last):
File "uno_baseline_keras2.py", line 555, in <module>
main()
File "uno_baseline_keras2.py", line 551, in main
run(params)
File "uno_baseline_keras2.py", line 309, in run
use_exported_data=args.use_exported_data,
File "/home/z1835018/code/uncertainty/Benchmarks/Pilot1/Uno/uno_data.py", line 999, in load
self.save_to_cache(cache, params)
File "/home/z1835018/code/uncertainty/Benchmarks/Pilot1/Uno/uno_data.py", line 698, in save_to_cache
os.mkdir(dirname)
FileNotFoundError: [Errno 2] No such file or directory: ''
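A plausible cause and a hedged fix sketch (it assumes save_to_cache derives the directory with os.path.dirname(cache); the exact code in uno_data.py may differ): when the cache value is a bare filename, os.path.dirname returns '' and os.mkdir('') raises exactly this FileNotFoundError.

import os

def ensure_parent_dir(cache_path):
    """Create the parent directory of cache_path only when there is one to create."""
    dirname = os.path.dirname(cache_path)
    if dirname and not os.path.isdir(dirname):
        os.makedirs(dirname, exist_ok=True)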
Use a JSON file to keep track of checkpoint status.
Cf. https://docs.google.com/document/d/1Z5nE-Y5XsfUAe4MngHD5xmMb25o4Hw5Zawg93kjEeBY
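A minimal sketch of the idea (the field names and file name are assumptions, not an agreed schema):

import json
import time

def write_ckpt_status(path, epoch, val_loss, ckpt_file):
    """Record the latest checkpoint state in a small JSON sidecar file."""
    status = {
        "epoch": epoch,
        "val_loss": val_loss,
        "checkpoint": ckpt_file,
        "timestamp": time.time(),
    }
    with open(path, "w") as f:
        json.dump(status, f, indent=2)

# e.g. write_ckpt_status("save/ckpt-status.json", 12, 0.043, "save/model.h5")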
Hi,
I got the following error running the Uno code. This used to work earlier; I believe changes in Benchmarks/common/candle/__init__.py resulted in this error.
Traceback (most recent call last):
File "uno_baseline_keras2.py", line 8, in <module>
import candle
File "/lus/theta-fs0/projects/datascience/memani/uno-121422/Benchmarks-master/common/candle/__init__.py", line 175, in <module>
raise Exception("No backend has been specified.")
For use by IMPROVE
Developed by @brettin, bringing into Supervisor.
Traceback (most recent call last):
  File "p2b1_baseline_keras2.py", line 298, in <module>
    main()
  File "p2b1_baseline_keras2.py", line 294, in main
    run(gParameters)
  File "p2b1_baseline_keras2.py", line 231, in run
    molecular_model.compile(optimizer=opt, loss=loss_func, metrics=['mean_squared_error', 'mean_absolute_error'])
  File "/nfs/gce/software/custom/linux-ubuntu18.04-x86_64/anaconda3/rolling/envs/candle-tf1/lib/python3.7/site-packages/keras/engine/training.py", line 95, in compile
    self.optimizer = optimizers.get(optimizer)
  File "/nfs/gce/software/custom/linux-ubuntu18.04-x86_64/anaconda3/rolling/envs/candle-tf1/lib/python3.7/site-packages/keras/optimizers.py", line 873, in get
    str(identifier))
ValueError: Could not interpret optimizer identifier: <tensorflow.python.keras.optimizer_v2.adam.Adam object at 0x7f06fa323d90>
This might be due to mixing the tf.keras and standalone Keras APIs in the code.
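A small illustration of the mismatch (a sketch, not the p2b1 code): a tf.keras optimizer object handed to a standalone-Keras model.compile triggers exactly this error; building the model and the optimizer from the same package avoids it.

from tensorflow import keras

# Model and optimizer come from the same family (tf.keras), so compile() accepts it.
model = keras.Sequential([keras.layers.Dense(1, input_shape=(4,))])
opt = keras.optimizers.Adam(learning_rate=1e-3)
model.compile(optimizer=opt, loss='mse',
              metrics=['mean_squared_error', 'mean_absolute_error'])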
Create this variable for issue 5 of candle_lib
Looks like the DARTS example fails when a genotype.json is not created before saving in the results/ directory.
I am assigning myself this issue.
as a regular Benchmark.
The suggestion is to allow the developer to "mask" what command line parameters are displayed when a user uses the --help option. Many of the default command line parameters are not used in a DNN.
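One way this could look with plain argparse (a hypothetical sketch; CANDLE builds its parser in the common code and may need a different hook): setting help=argparse.SUPPRESS keeps an argument parseable but hides it from the --help output.

import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--epochs', type=int, default=10,
                    help='number of training epochs')        # shown by --help
parser.add_argument('--ckpt_keep_mode', type=str, default='linear',
                    help=argparse.SUPPRESS)                   # accepted, but masked from --help
print(parser.parse_args(['--ckpt_keep_mode', 'all']))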
Need a README at the https://github.com/ECP-CANDLE/Benchmarks/tree/develop/Pilot3 level and in the folders inside it.
With the updates in candle_lib, model_name is a required hyperparameter. This breaks Uno and might break other Benchmarks as well.