autonomio / talos
Hyperparameter Experiments with TensorFlow and Keras
Home Page: https://autonom.io
License: MIT License
Condition Check:
[x ] I'm up-to-date with the latest release:
pip install -U talos
[x ] I've confirmed that my Keras model works outside of Talos.
If you still have an error, please submit complete trace and a code with:
You can provide the code in pastebin / gist or any other format you like.
I want to perform hyperparameter optimization on my Keras model. The problem is that the dataset is quite big. Normally in training I use fit_generator to load the data in batches from disk, but Talos only supports the fit method.
I tried to load the whole data to memory, by using this:
train_generator = train_datagen.flow_from_directory(
    original_dir,
    target_size=(img_height, img_width),
    batch_size=train_nb,
    class_mode='categorical')
X_train, y_train = train_generator.next()
But when performing talos.Scan(), the OS kills the process because of the large memory usage. I also tried undersampling my dataset to only 10%, but it's still too big.
I saw that issue #11 is being worked on, but I wonder whether there is any workaround strategy for performing hyperparameter optimization on a large dataset in this case?
Right now performance.py works on the level of the mainline Hyperio program, and outside of Keras. This means it's not available at epoch level and therefore is not included in Keras reporting (the history object), nor is it available for use with EarlyStop or other callbacks. Once the exact same scoring functionality works as a Keras metric, this is resolved :)
Now that a lot of the issues are handled, I think the next big push is on putting the pieces together for the Validator() i.e what happens after Scan() and that leads to the information that is needed to train the production model finally.
I think that in #17 you had more or less nailed the outline of the approach, and I will follow that for now. We already have an objective measure for classification tasks in form of score_model, so I will focus on that use-case (class predictions) first.
For instance, trying to raise an issue...
I would wager that "How to raise an issue" is not a valid issue format! @mikkokotila I can look into this if you'd like. Let me know what you think.
Note that I've opened this issue by clicking "Open a regular issue"!
This is now handled in a way where both accuracy and loss are reported in terms of the peak epoch, which will automatically result in a very high loss because it tends to report the first-epoch loss. As an intermediate solution, we could check the string label in the dictionary: if it contains loss, take the minimum, and otherwise take the maximum. Accuracy is harder, as its label often does not contain the word acc.
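The intermediate solution described above could look roughly like the following. This is a minimal sketch and not the Talos implementation: `best_epoch_values` is a hypothetical helper, and the history dictionary is made up for illustration.

```python
def best_epoch_values(history):
    """Pick one value per metric from a Keras-style history dict.

    Labels containing 'loss' are minimized; everything else is maximized.
    """
    best = {}
    for label, values in history.items():
        if 'loss' in label:
            best[label] = min(values)
        else:
            best[label] = max(values)
    return best


# Made-up per-epoch values, for illustration only.
history = {
    'loss': [0.9, 0.5, 0.4],
    'val_loss': [1.0, 0.6, 0.7],
    'acc': [0.6, 0.8, 0.85],
    'val_acc': [0.5, 0.75, 0.7],
}
print(best_epoch_values(history))
```

Note that this rule would still maximize error-style metrics whose label lacks the word loss (for example a mean absolute error), so it only covers the loss/accuracy case discussed here.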
I have not succeeded in importing hyperio.
I would appreciate it if you could explain how to do it.
I am writing it the same way you have:
sys.path.insert(0, '/Users/mikko/Documents/GitHub/hyperio')
import hyperio as hy
and it didn't work
Thanks
Thanks so much for coming here to raise an issue. Please take a moment to 'check' the below boxes:
I'm up-to-date with the latest release:
pip install -U talos
I've confirmed that my Keras model works outside of Talos.
If you still have an error, please submit complete trace and a code with:
You can provide the code in pastebin / gist or any other format you like.
Needs descriptive doc strings.
https://gist.github.com/guilhermesilveira/5e2f3ae4f60a2929897f8ef4da99a73e
The above example shows a sample usage of lr_normalizer which seems not to be working as expected (I might be wrong, but the issue might be with the comparison itself). The code should return 1 (1000/1000) but returns 1000.
Hi,
When using a continuous metric such as mse, Scan will throw an error when its _run method is called. This happens when it calls epoch_entropy from talos.metrics.entropy, which requires val_acc to be in the history dictionary. That key is only present if accuracy is used as the metric. For example, if the selected metric is mae, val_mean_absolute_error will be in the history dictionary in place of val_acc.
Kind regards,
Harry
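One way to avoid depending on a hard-coded val_acc key is to look up whichever validation metric is actually present. The following is only a sketch under that assumption: `pick_validation_metric` is a hypothetical helper, not part of Talos, and the history values are invented.

```python
def pick_validation_metric(history, preferred='val_acc'):
    """Return (name, values) for a validation metric in a history dict.

    Falls back to the first 'val_' key other than 'val_loss' when the
    preferred key is missing, and to 'val_loss' as a last resort.
    """
    if preferred in history:
        return preferred, history[preferred]
    for name in history:
        if name.startswith('val_') and name != 'val_loss':
            return name, history[name]
    return 'val_loss', history['val_loss']


# History as it might look when mae is the selected metric (made-up values).
history = {
    'loss': [0.30, 0.20],
    'mean_absolute_error': [0.40, 0.35],
    'val_loss': [0.32, 0.25],
    'val_mean_absolute_error': [0.42, 0.37],
}
name, values = pick_validation_metric(history)
print(name, values)
```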
These do not seem to work in Windows.
This is referenced in e6ecae7 but seems not to have been resolved. I also cannot find the commit where Scan.py was changed to remove the additions from that commit. Figured it made sense to open an issue about it.
Current implementation runs successfully:
h = ta.Scan(np.concatenate((x_train, x_dev)),
np.concatenate((y_train, y_dev)),
params=p,
dataset_name='xray',
experiment_no='1',
model=general_model,
grid_downsample=0.0000025, # lots of parameters
val_split=0.2,
save_best_model=True)
# output
# 20 scans will take roughly 160 seconds
# Scan Finished!
But when attempting to load
from keras.models import load_model
mod = load_model('xray_1.h5')
raises
/usr/local/lib/python3.6/dist-packages/keras/models.py in load_model(filepath, custom_objects, compile)
266 model_config = f.attrs.get('model_config')
267 if model_config is None:
--> 268 raise ValueError('No model found in config file.')
269 model_config = json.loads(model_config.decode('utf-8'))
270 model = model_from_config(model_config, custom_objects=custom_objects)
ValueError: No model found in config file.
Has this been resolved already and I'm just missing something?
I am running the Scan() method with the following parameters:
t = talos.Scan( x=x, y=y, model=model, params=p, dataset_name="ECG Classifier", experiment_no='10' )
and it returns the following error:
The model needs to have Return in format "return history, model"
UnboundLocalError Traceback (most recent call last)
in ()
5 params=p,
6 dataset_name="ECG Classifier",
----> 7 experiment_no='10'
8 )
~\Anaconda3\envs\ECG-Research\lib\site-packages\talos\scan.py in __init__(self, x, y, params, dataset_name, experiment_no, model, val_split, shuffle, search_method, save_best_model, reduction_method, reduction_interval, reduction_window, grid_downsample, reduction_metric, talos_log_name, debug)
75 self.result = []
76 while len(self.param_log) != 0:
---> 77 self._null = self._run()
78
79 self = result_todf(self)
~\Anaconda3\envs\ECG-Research\lib\site-packages\talos\scan.py in _run(self)
91 print('The model needs to have Return in format "return history, model"')
92
---> 93 self.epoch_entropy.append(epoch_entropy((_hr_out)))
94 _hr_out = run_round_results(self, _hr_out)
95
UnboundLocalError: local variable '_hr_out' referenced before assignment
From what I can tell, the first error is triggered by a seemingly missing return statement in my model function; however, the model function does have one:
def model(x_train, y_train, x_val, y_val, params):
    mdl = Sequential()
    input_shape = (10*60*fields.get('fs'), 1)
    #input
    mdl.add(Conv1D(params['input_neuron'],
                   input_shape=input_shape,
                   activation=params['activation'],
                   kernel_size=(params['kernel_size'],),
                   kernel_regularizer=l2(params['l2_regularizer']),
                   )
            )
    if params['batch_normalization'] is True:
        mdl.add(BatchNormalization())
    mdl.add(MaxPool1D(1))
    mdl.add(Dropout(params['dropout']))
    hidden_layers(mdl, params['hidden_layers'], params)
    mdl.add(Flatten())
    #output
    mdl.add(Dense(1,
                  kernel_regularizer=l2(params['l2_regularizer']),
                  activation=params['last_activation']
                  )
            )
    mdl.compile(loss=params['losses'],
                optimizer=params['optimizer'](),
                metrics=['accuracy'])
    tb_callback = TensorBoard(log_dir='./logs/log80' + str(9)
                              #+ '/' + str(run_num), write_graph=True, write_images=True
                              )
    reduce_lr = ReduceLROnPlateau(monitor='val_loss',
                                  factor=0.2,
                                  patience=5, min_lr=0.001)
    history = mdl.fit(x_train,
                      y_train,
                      epochs=params['epochs'],
                      #batch_size=18,
                      verbose=1,
                      validation_data=(x_val, y_val),
                      callbacks=
                      [
                          tb_callback,
                          reduce_lr
                      ]
                      )
    #~~~~return method here~~~~
    return history, mdl
I'm not sure what the second error, the _hr_out one, is pointing out.
For reference, the hidden_layers method just adds more layers depending on the respective hyperparameter that I listed in the params dictionary, which looks like:
p = {
'input_neuron':(1,5,1),
'hidden_layers':(1,4,1),
'dropout':[p/10 for p in range(0, 6)],
'optimizer':[Adam, SGD, rmsprop, adagrad],
'losses':[binary_crossentropy],
'activation':[relu, elu],
'last_activation':[sigmoid],
'kernel_size':(75,175,25),
'l2_regularizer':[p/10 for p in range(0, 10)],
'filters':(1,9,1),
'batch_normalization':[True,False],
'epochs':(50,250,50)
}
If I am not making a mistake, the current example does not fit with the latest version. Can you please update the documentation on calling Scan?
Thank you,
import talos as ta
h = ta.Scan(X, Y, params=p, experiment_name='first_test', model=drug2gene_model, grid_downsample=0.5)
TypeError Traceback (most recent call last)
in ()
1 import talos as ta
----> 2 h = ta.Scan(X, Y, params=p, experiment_name='first_test', model=drug2gene_model, grid_downsample=0.5)
TypeError: __init__() got an unexpected keyword argument 'experiment_name'
[ x] I'm up-to-date with the latest release:
pip install --upgrade --user git+https://github.com/autonomio/talos.git@daily-dev
[x ] I've confirmed that my Keras model works outside of Talos.
When I run
h = ta.Scan(X_train, Y_train, params=p, dataset_name="debug", experiment_no="1", model=keras_nn_model_talos, grid_downsample=0.002, talos_log_name="talos.log", reduction_method="spear", reduction_metric="val_fbeta_score_acc")
I get the following error:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-9-b4cbea7ca6f1> in <module>()
8 'second_GRU_layer':[True, False]}
9 h = ta.Scan(X_train, Y_train, x_val=X_dev, y_val=Y_dev, params=p, dataset_name="debug", experiment_no="1",
---> 10 model=keras_nn_model_talos, grid_downsample=0.002, talos_log_name="talos.log", reduction_method="spear", reduction_metric="fbeta_score")
11
12 ## I had to edit a line of ~/anaconda3/envs/tfgpu-keras/lib/python3.6/site-packages/talos/metrics/score_model.py
~/anaconda3/envs/tfgpu-keras/lib/python3.6/site-packages/talos/scan/Scan.py in __init__(self, x, y, params, dataset_name, experiment_no, model, x_val, y_val, val_split, shuffle, search_method, reduction_method, reduction_interval, reduction_window, grid_downsample, reduction_threshold, reduction_metric, round_limit, talos_log_name, debug, seed, clear_tf_session, disable_progress_bar)
140 # input parameters section ends
141
--> 142 self._null = self.runtime()
143
144 def runtime(self):
~/anaconda3/envs/tfgpu-keras/lib/python3.6/site-packages/talos/scan/Scan.py in runtime(self)
145
146 self = scan_prepare(self)
--> 147 self = scan_run(self)
~/anaconda3/envs/tfgpu-keras/lib/python3.6/site-packages/talos/scan/scan_run.py in scan_run(self)
27 disable=self.disable_progress_bar)
28 while len(self.param_log) != 0:
---> 29 self = rounds_run(self)
30 self.pbar.update(1)
31 self.pbar.close()
~/anaconda3/envs/tfgpu-keras/lib/python3.6/site-packages/talos/scan/scan_run.py in rounds_run(self)
59
60 _hr_out = run_round_results(self, _hr_out)
---> 61 self._val_score = get_score(self)
62 write_log(self)
63 self.result.append(_hr_out)
~/anaconda3/envs/tfgpu-keras/lib/python3.6/site-packages/talos/metrics/score_model.py in get_score(self)
15
16 try:
---> 17 y_pred = self.keras_model.predict_classes(self.x_val)
18 # y_pred = self.keras_model.predict(self.x_val)
19 return Performance(y_pred, self.y_val, self.shape, self.y_max).result
AttributeError: 'Model' object has no attribute 'predict_classes'
Which can seemingly be fixed simply by changing
talos/metrics/score_model.py line 17
from y_pred = self.keras_model.predict_classes(self.x_val)
to y_pred = self.keras_model.predict(self.x_val)
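One caveat with this change: predict returns probabilities, whereas the Sequential-only predict_classes returned class labels (argmax for multi-column output, a 0.5 threshold for single-column output). A plain-Python sketch of that conversion is shown below. `probabilities_to_classes` is a hypothetical helper, it only handles flat 2D output (not the time-distributed output of the model in this issue), and whether Talos should apply such a step here is an open question.

```python
def probabilities_to_classes(y_prob, threshold=0.5):
    """Convert predict-style output rows to class labels.

    A row with a single value is treated as a sigmoid probability and
    thresholded; a longer row is treated as softmax output and the index
    of its largest value is used.
    """
    classes = []
    for row in y_prob:
        if len(row) == 1:
            classes.append(int(row[0] > threshold))
        else:
            classes.append(max(range(len(row)), key=row.__getitem__))
    return classes


# Made-up probabilities for illustration.
binary_output = [[0.2], [0.9]]
multiclass_output = [[0.1, 0.7, 0.2], [0.6, 0.3, 0.1]]
print(probabilities_to_classes(binary_output))
print(probabilities_to_classes(multiclass_output))
```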
My params dictionary and model:
p = {'adam_learning_rate': [0.01, 0.001, 0.0001],
'num_filters': [12, 32, 64, 196],
'gru_hidden_units':[32, 64, 128, 196],
'dropout_rate':[0.2,0.5,0.8],
'batch_size': [64, 128, 256],
'epochs': [3],
'second_GRU_layer':[True, False]}
def keras_nn_model_talos(x_train, y_train, x_val, y_val, params):
#https://stackoverflow.com/questions/43547402/how-to-calculate-f1-macro-in-keras
def my_recall_acc(y_true, y_pred):
"""Recall metric.
Only computes a batch-wise average of recall.
Computes the recall, a metric for multi-label classification of
how many relevant items are selected.
"""
true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
possible_positives = K.sum(K.round(K.clip(y_true, 0, 1)))
recall = true_positives / (possible_positives + K.epsilon())
return recall
def my_precision_acc(y_true, y_pred):
"""Precision metric.
Only computes a batch-wise average of precision.
Computes the precision, a metric for multi-label classification of
how many selected items are relevant.
"""
true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
predicted_positives = K.sum(K.round(K.clip(y_pred, 0, 1)))
precision = true_positives / (predicted_positives + K.epsilon())
return precision
def f1_acc(y_true, y_pred):
precision = my_precision_acc(y_true, y_pred)
recall = my_recall_acc(y_true, y_pred)
return 2*((precision*recall)/(precision+recall+K.epsilon()))
X_input = Input(shape = x_train.shape[1:])
# Step 1: CONV layer
X = Conv1D(filters=int(params["num_filters"]), kernel_size=15,strides=4)(X_input) # CONV1D
X = BatchNormalization()(X) # Batch normalization
X = Activation('relu')(X) # ReLu activation
X = Dropout(rate=params["dropout_rate"])(X) # dropout (use 0.8)
# Step 2: First GRU Layer
X = GRU(units = int(params["gru_hidden_units"]), return_sequences = True)(X) # GRU (use 128 units and return the sequences)
X = Dropout(rate=params["dropout_rate"])(X) # dropout (use 0.8)
X = BatchNormalization()(X) # Batch normalization
if params["second_GRU_layer"]:
# Step 3: Second GRU Layer
X = GRU(units = int(params["gru_hidden_units"]), return_sequences = True)(X) # GRU (use 128 units and return the sequences)
X = Dropout(rate=params["dropout_rate"])(X) # dropout (use 0.8)
X = BatchNormalization()(X) # Batch normalization
X = Dropout(rate=params["dropout_rate"])(X) # dropout (use 0.8)
# Step 4: Time-distributed dense layer
X = TimeDistributed(Dense(1, activation = "sigmoid"))(X) # time distributed (sigmoid)
model = Model(inputs = X_input, outputs = X)
opt = Adam(lr=params["adam_learning_rate"], beta_1=0.9, beta_2=0.999, decay=0.01)
model.compile(loss='binary_crossentropy', optimizer=opt, metrics=["acc", my_recall_acc, my_precision_acc, f1_acc])
history = model.fit(x_train, y_train, batch_size = int(params["batch_size"]),
validation_data=(x_val, y_val),
epochs=int(params["epochs"]))
return history, model
X_train.shape, Y_train.shape, X_dev.shape, Y_dev.shape
((1800, 201, 41), (1800, 47, 1), (400, 201, 41), (400, 47, 1))
See Insights -> Community.
For some reason, the CONTRIBUTING.md and ISSUE_TEMPLATE.md files are not showing up in the Talos community profile.
Not sure if this is a big deal, but maybe the project gets more visibility if these things are detected as completed. Should be relatively easy to fix this, but I don't want to overstep anyone if there's a good reason these are not in the community profile.
@mikkokotila thoughts? In principle, we should complete this profile as well.
We can add this as optional code for contributors to download to prevent accidental pushes to master (or even direct commits on master) from the command line. There is a way to install it on the remote, but I don't know how. I've made this mistake before on some of my personal repositories where it didn't matter, but here it will!
I have the code lying around somewhere. When I find it I'll post it and we can decide where to put it if at all. Maybe in the wiki or something?
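Until that code turns up, here is a rough sketch of what a client-side guard could look like as a Git pre-push hook written in Python. It is not the code referred to above. It relies on the documented pre-push input format (one line per ref on standard input: local ref, local sha, remote ref, remote sha); the function names are made up for illustration.

```python
import sys

PROTECTED_REF = 'refs/heads/master'


def blocked_refs(lines):
    """Return the remote refs in pre-push input that target master."""
    blocked = []
    for line in lines:
        parts = line.split()
        # Each line: <local ref> <local sha> <remote ref> <remote sha>
        if len(parts) == 4 and parts[2] == PROTECTED_REF:
            blocked.append(parts[2])
    return blocked


def main(stdin=sys.stdin):
    """Return a non-zero exit status when a push to master is attempted."""
    if blocked_refs(stdin):
        print('Pushing to master is blocked by the pre-push hook.',
              file=sys.stderr)
        return 1
    return 0


# Illustration with made-up input lines instead of real stdin.
print(main(['refs/heads/dev 1111 refs/heads/dev 2222']))
print(main(['refs/heads/master 1111 refs/heads/master 2222']))
```

To use it as an actual hook, the file would be saved as .git/hooks/pre-push, made executable, given a Python shebang line, and ended with sys.exit(main()) in place of the two illustration calls. Direct commits on master would need a separate pre-commit hook, which this sketch does not cover.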
It seems that always when there is the message:
UnboundLocalError: local variable '_hr_out' referenced before assignment
...there is an issue with the model. The problem is that the model-level error is effectively swallowed by this message, so it is indeed related to the way errors are handled. This could be a major annoyance, as the user thinks something is wrong with the backend even though it's something they can resolve themselves.
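The swallowing effect can be reproduced in isolation. The sketch below assumes, based on the tracebacks in these issues, that the model call sits in a try/except that only prints a message; the function names are hypothetical and this is not the Talos source. The second function shows one possible alternative that keeps the original error visible.

```python
def run_round_swallowing(model_fn):
    """Mimic the suspected pattern: the model error is printed away and
    the later use of the never-assigned name fails instead."""
    try:
        _hr_out = model_fn()
    except Exception:
        print('The model needs to have Return in format "return history, model"')
    return _hr_out  # raises UnboundLocalError if model_fn raised


def run_round_reraising(model_fn):
    """Alternative: re-raise with context so the real cause stays visible."""
    try:
        _hr_out = model_fn()
    except Exception as err:
        raise RuntimeError('model function failed: %r' % (err,)) from err
    return _hr_out


def broken_model():
    # Stand-in for a user model that fails, e.g. on a bad input shape.
    raise ValueError('bad input shape')


for runner in (run_round_swallowing, run_round_reraising):
    try:
        runner(broken_model)
    except Exception as err:
        print(runner.__name__, '->', type(err).__name__)
```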
No matter what I change in the parameters, I get
"TypeError: include and exclude must both be non-string sequences"
which I assume is caused by rows containing strings such as
"<class 'keras.optimizers.Adam'>"
Is this due to me using my own Adam optimizer?
Other columns include
kernel_initializer
activations
losses
...and so on.
If the metric name does not have word 'acc' in it, the lowest value will be taken.
The following calls need to have proper docstrings:
It's still referencing to hyperio (the initial name for the package).
It seems that the only thing that needs to change is the way validation split is now handled internally. Two options:
Then the user will have to pass data in to Scan() slightly differently as well. So this needs to be thought about a little.
Also, it seems that Keras fit_generator has a memory (leak-like) issue which has been reported in many instances, so this will have to be looked into as well.
On line 232 in Scan() there is the command K.clear_session(), which as far as I remember is there only for the sake of dealing with the memory leak problem in old versions of TensorFlow. With it in place it is not possible to use word embeddings, so I've removed it and embeddings work just fine.
The other thing is that I'm passing the embeddings as a parameter, which is kind of useful as it allows trying several different embeddings.
This is fixed in dev today I think.
Running the breast_cancer example with search_method='linear' throws the error below.
Pastebin
I believe at handling.py:22
_choice = self.param_log.min()
should be
_choice = min(self.param_log)
since param_log is a list, not a numpy array.
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-13-bc599657bb69> in <module>()
63 params=p,
64 dataset_name='breast_cancer',
---> 65 experiment_no='k')
/usr/local/lib/python3.7/site-packages/talos/scan.py in __init__(self, x, y, params, dataset_name, experiment_no, model, val_split, shuffle, search_method, save_best_model, reduction_method, reduction_interval, reduction_window, grid_downsample, reduction_metric, talos_log_name, debug)
75 self.result = []
76 while len(self.param_log) != 0:
---> 77 self._null = self._run()
78
79 self = result_todf(self)
/usr/local/lib/python3.7/site-packages/talos/scan.py in _run(self)
84 def _run(self):
85
---> 86 round_params(self)
87
88 try:
/usr/local/lib/python3.7/site-packages/talos/parameters/handling.py in round_params(self)
6 '''PICK PARAMETERS FOR ROUND'''
7
----> 8 self, _p = run_param_pick(self)
9 self.params = run_param_todict(self, _p)
10
/usr/local/lib/python3.7/site-packages/talos/parameters/handling.py in run_param_pick(self)
20
21 elif self.search_method == 'linear':
---> 22 _choice = self.param_log.min()
23
24 elif self.search_method == 'reverse':
AttributeError: 'list' object has no attribute 'min'
I keep getting this tedious error:
ValueError: Error when checking input: expected gru_1_input to have 3 dimensions, but got array with shape (35361, 21)
I tried reshaping in every way but the error keeps coming back... In my project I was using a batch generator and fit_generator as the model training method, but this isn't yet supported by this library, so I'm trying to use Talos without them.
p = {'lr': (2, 10, 30),
'first_neuron':[4, 8, 16, 32, 64, 128],
'hidden_layers':[1,2,3,4,5,6],
'batch_size': [2, 3, 4],
'epochs': [300],
'dropout': (0, 0.40, 10),
'optimizer': [Adam],
'losses': [mae],
'activation':[relu, elu],
'last_activation': [sigmoid, softmax]}
def iris_model(x_train, y_train, x_val, y_val, params):
    model = Sequential()
    model.add(GRU(params['first_neuron'], input_shape=(None, x_train.shape[1]), activation=params['activation']))
    model.add(Dropout(params['dropout']))
    model.add(Dense(y_train.shape[1], activation=params['last_activation']))
    model.compile(optimizer=params['optimizer'](),
                  loss=params['losses'],
                  metrics=['acc'])
    history = model.fit(x_train, y_train,
                        batch_size=params['batch_size'],
                        epochs=params['epochs'],
                        verbose=0,
                        validation_data=[x_val, y_val])
    return history, model
#x_train = x_train.reshape(1, x_train.shape[0], x_train.shape[1])
#y_train = y_train.reshape(1, y_train.shape[0], y_train.shape[1])
print(x_train_scaled.shape)
print(y_train_scaled.shape)
h = ta.Scan(x_train,
y_train,
params=p,
dataset_name='first_test',
experiment_no = '1',
model=iris_model,
grid_downsample=0.5,
val_split=0.1,
shuffle = False,
reduction_metric='val_loss')
Shapes output:
X=> (39291, 21)
Y=> (39291, 3)
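The error says the GRU layer expects 3D input (samples, timesteps, features) while the data is 2D (samples, features). One possible fix, assuming each row is meant to be a single timestep, is to add a time axis of length 1 before calling Scan. The sketch below shows the idea with nested lists so it runs without dependencies; `add_time_axis` and `shape` are hypothetical helpers. With NumPy the equivalent would be x_train.reshape(x_train.shape[0], 1, x_train.shape[1]).

```python
def add_time_axis(rows):
    """Turn (samples, features) nested lists into (samples, 1, features)."""
    return [[list(row)] for row in rows]


def shape(nested):
    """Report the shape of a regular nested list."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return tuple(dims)


# Small stand-in for the (39291, 21) input: 5 samples with 21 features.
x = [[0.0] * 21 for _ in range(5)]
x_3d = add_time_axis(x)
print(shape(x), shape(x_3d))
```

Inside the model function, the feature count would then come from x_train.shape[2] rather than x_train.shape[1]. Note that the commented-out reshape(1, n, features) in the post does something different: it produces one sample with n timesteps. Whether a single timestep per sample is meaningful depends on the data.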
See #51 for more details.
Could use some more detailed comments. @mikkokotila if you let me know what TestLoadDatasets actually does, I can comment the rest. I did write the code for it, but only because it was included in the original testing suite. I am still not sure what it does.
Also, are you done refactoring Reporting? I probably should wait until you are before I dig into it right?
Have you guys looked at Bayesian Optimisation? There is already a library you could integrate with
https://github.com/shibuiwilliam/keras_gpyopt
I assume you're just doing grid search, but if you abstracted the optimiser out to an interface so that one could choose Genetic Algorithms or Bayes or X, Talos could become the Keras of hyper-parameter tuning :)
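To make the interface idea concrete, here is a minimal sketch of what such an abstraction might look like. Everything in it (class names, the `candidates` method) is hypothetical and does not reflect how Talos is structured. A Bayesian or genetic strategy would additionally need to receive results back between rounds, which this sketch leaves out.

```python
import itertools
import random
from abc import ABC, abstractmethod


class SearchStrategy(ABC):
    """Hypothetical interface: yields parameter dicts to try."""

    @abstractmethod
    def candidates(self, params):
        """Yield one dict of parameter values per experiment round."""


class GridSearch(SearchStrategy):
    def candidates(self, params):
        keys = list(params)
        # Every combination of the listed values.
        for combo in itertools.product(*(params[k] for k in keys)):
            yield dict(zip(keys, combo))


class RandomSearch(SearchStrategy):
    def __init__(self, n_rounds, seed=None):
        self.n_rounds = n_rounds
        self.rng = random.Random(seed)

    def candidates(self, params):
        # Independent random draws; duplicates are possible.
        for _ in range(self.n_rounds):
            yield {k: self.rng.choice(v) for k, v in params.items()}


p = {'batch_size': [32, 64], 'dropout': [0.1, 0.5]}
for strategy in (GridSearch(), RandomSearch(n_rounds=3, seed=0)):
    print(type(strategy).__name__, len(list(strategy.candidates(p))))
```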
Perhaps tqdm?
After fixing #30, I noticed that the first result is not saved to the .csv file.
The code here gives the following output:
Using TensorFlow backend.
2 scans will take roughly 0 seconds
Scan Finished!
csv output has only 2 lines instead of 3
round_epochs,val_loss,val_acc,val_fmeasure,loss,acc,fmeasure,first_neuron,hidden_layers,batch_size,epochs,dropout,kernel_initializer,optimizer,losses,activation,last_activation
2,0.19118814039648624,0.906432733201144,0.5248427495621798,0.2075809771241854,0.8542713519915863,0.4982654840203386,10,0,30,2,0,uniform,<class 'keras.optimizers.Nadam'>,<function mean_squared_error at 0x10b3162f0>,<function relu at 0x10bc2af28>,<function sigmoid at 0x10bc2d0d0>
I believe the culprit is the logic in results.py:run_round_results, which only returns the header if round_counter==0. I think it should return both the header and the first result, or the logic in scan.py has to be changed.
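The first of the two options could be sketched as follows. `lines_for_round` is a hypothetical stand-in rather than the actual run_round_results function, and the header and rows are shortened, made-up values.

```python
def lines_for_round(round_counter, header, row):
    """Return the lines to record for one round.

    On the first round both the header and the first result are
    returned, so the first result is not dropped.
    """
    if round_counter == 0:
        return [header, row]
    return [row]


header = 'round_epochs,val_loss,val_acc'
rounds = ['2,0.21,0.88', '2,0.19,0.91']  # made-up results for two scans

lines = []
for counter, row in enumerate(rounds):
    lines.extend(lines_for_round(counter, header, row))
print(len(lines))
```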
If one wishes to augment their training data only (which is good practice) then this will just make the process easier.
We could add a data augmentation program into Talos, but since people augment their data in a variety of ways this could take time.
The first option is actually very easy. I will try to implement it.
Edit: I should clarify that this would be an optional input. It's not like you need to get rid of the way it's working now.
Hi!
I have a very small dataset consisting of 59 images of medical data.
I am classifying them with a six-layer Keras model. With such a small dataset, I have to use leave-one-out cross-validation, so my objective function is not the accuracy from a single run of the cross-validation but the mean over all 59 runs (I train on 58 images and test on the first image, and so on with the second, third... 59th image).
Is it possible to optimize my model with Talos when I'm in this situation?
Thank you!
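For reference, the objective described in this question can be written down independently of Talos. The sketch below computes mean leave-one-out accuracy for any train-and-predict callable; `leave_one_out_score` and the toy majority-class predictor are made up for illustration, and whether Scan can use such a score as its objective is not established here.

```python
def leave_one_out_score(samples, labels, train_and_predict):
    """Mean accuracy over len(samples) runs, each holding out one sample."""
    correct = 0
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        prediction = train_and_predict(train_x, train_y, samples[i])
        correct += int(prediction == labels[i])
    return correct / len(samples)


def majority_class(train_x, train_y, test_x):
    """Toy stand-in for a Keras model: predicts the most common label."""
    return max(set(train_y), key=train_y.count)


# Six made-up samples instead of the 59 images.
samples = list(range(6))
labels = [0, 0, 0, 0, 1, 1]
score = leave_one_out_score(samples, labels, majority_class)
print(score)
```

In a real setup, train_and_predict would build and fit a fresh Keras model on the 58 training images for each hold-out, using one fixed set of hyperparameters per score.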
Would be good to have a few benchmarks datasets where we know the "gold standard" result, and then use Talos with a very broad starting boundary and using the reducer approach to show how fast it is to get to the "gold standard" result assuming poor knowledge of hyperparameters as a starting point.
@x94carbone do you have suggestions for such datasets?
If you check out the way shapes works (it's directly from Autonomio codes but made to work without the funny function name handling from strings), it looks like there are definitely some issues. For example, stairs just seems to subtract 1 from the previous layer. So if the first layer is 64, the next is 63, then 62 and 61... and the last is 1. If you check https://coveralls.io/builds/16994055/source?filename=talos/model/shapes.py you can see that many lines are skipped entirely even though that shape is invoked. Maybe there is a hint.
Could you look in to this?
For some reason when the parameters are handled in Windows, they take the type of numpy.str_ as opposed to 'float' in linux based systems. This then creates problems but strangely enough only when hidden_layers is invoked.
import pandas as pd
import talos as ta
from talos.model import hidden_layers
from talos.metrics.keras_metrics import fmeasure_acc
from keras.models import Sequential
from keras.layers import Dense
# load the data
dataset = pd.read_csv('https://raw.githubusercontent.com/jbrownlee/Datasets/master/pima-indians-diabetes.data.csv')
X = dataset.values[:,0:8]
Y = dataset.values[:,8]
# create the model
def diabetes_model(x_train, y_train, x_val, y_val, params):
    model = Sequential()
    model.add(Dense(12, input_dim=8, activation='relu'))
    # this line shows the issue
    print(type(params['dropout']))
    hidden_layers(model, params, 1)
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy', fmeasure_acc])
    history = model.fit(x_train, y_train, verbose=0,
                        validation_data=[x_val, y_val],
                        epochs=params['epochs'], batch_size=params['batch_size'])
    return history, model
p = {
    'dropout':[0.1],
    'batch_size': (2, 50, 10),
    'epochs': [25, 50, 100],
    'hidden_layers':[2],
    'activation':['relu']
}
ta.Scan(x=X,
        y=Y,
        params=p,
        dataset_name='solver_diabetes',
        experiment_no='random_1',
        model=diabetes_model)
Obviously this is something that could be done later on, since it is not exactly something I'd expect the average user to use. However, tensorflow/Keras makes it very easy to implement parallel GPU training when available. I already have the code to do this cleanly. It's just a matter of how (and if) to implement this.
In principle this can only be a good thing, since tensorflow/Keras multi_gpu_model is essentially straightforward (I struggled with it for so long) if you know what you're doing with it. The only level of parallelism is at the batch level, so everything in Talos could work on top of it.
@mikkokotila thoughts on this? Google.colab already accelerates any training I've seen done with Keras massively without any changes to code whatsoever, but to my knowledge you only get one GPU at a time. People with access to more could experience up to an 8-fold decrease in training time with a feature like this.
Just wondering if the test script is supposed/designed to work on anyone's machine in particular.
For instance, running it on mine (just trying to debug a bit before pushing to my fork), I get:
Using TensorFlow backend.
2018-07-25 16:17:54.691759: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU
supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
8 scans will take roughly 0 seconds
[1] 18118 segmentation fault python3 test_script.py
Sorta strange. The CPU info it spits out is normal for my machine. Not used to a seg fault though. I've been looking through the test script to see where this might pop up. It seems to have to do with the number of scans I implement. For instance, implementing 8 scans on the first Iris model seg faults after four runs (meaning four calls to _run). However, changing the list shapes from
'shapes': ['stairs', 'triangle', 'hexagon', 'diamond', 'brick', 'long_funnel', 'rhombus', 'funnel']
to
'shapes': ['stairs', 'triangle', 'hexagon']
causes no exit on the first Iris test (loops 3 times successfully) and then seg faults after the first run on the second Iris model.
Not sure if this is an issue with my machine or the test script, since Talos works fine otherwise. Just thought I'd ask. @mikkokotila feel free to close it if it's a nonissue. Thanks for your input!
An example of this would be when the user is not inputting the number of neurons on the output layer to hidden_layers() dense layer generator.
ValueError: Could not interpret optimizer identifier: <class 'keras.optimizers.Adam'> comes when no lr values are included in the scan, in which case the optimizer taken from params (e.g. params['optimizer']) needs to have the () behind it.
raise IndexError("single positional indexer is out-of-bounds") IndexError: single positional indexer is out-of-bounds > in the case when the grid downsampling reduces the sample to < 1 permutations
axis=self.obj._get_axis_name(axis))) KeyError: 'the label [val_acc] is not in the [index]' > reduction method will reduce to less than 1 in the first round (?!)
KeyError when one of the required parameters (like dropout) is not a single value of 0 in the dictionary.
The general case where something is wrong with the model that is being inputted (comes out as Unbound...)
AttributeError: 'Scan' object has no attribute '_reduce_keys' in the case where reduction_method is not stated or is stated wrongly
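The grid downsampling case in the list above lends itself to an early check. The sketch below assumes, purely for illustration, that downsampling keeps int(total * grid_downsample) permutations; the real Talos logic may differ, and `permutation_counts` is a hypothetical helper.

```python
def permutation_counts(params, grid_downsample):
    """Return (total permutations, permutations kept after downsampling)."""
    total = 1
    for values in params.values():
        total *= len(values)
    # Assumption: the kept count is the truncated product.
    kept = int(total * grid_downsample)
    return total, kept


# Made-up parameter dictionary with 3 * 3 * 4 = 36 permutations.
params = {
    'lr': [0.01, 0.001, 0.0001],
    'batch_size': [64, 128, 256],
    'dropout': [0.2, 0.5, 0.8, 0.9],
}
total, kept = permutation_counts(params, 0.002)
if kept < 1:
    print('grid_downsample leaves no permutations out of %d' % total)
```

A check like this before the first round would allow a descriptive message instead of the IndexError.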
Contributor guidelines are totally missing.
Made some progress and got the scan running. The initial problems were due to giving single integers as parameters while the scan was trying to iterate over a list of parameter values. Now I've run into a new error, which is probably caused by my model declaration (I use model = Model(), whereas the examples use model = Sequential() etc.).
I take predictions from the model this way:
autoencoder.predict(x=[train_x, np.array(train_items, dtype=np.int32).reshape(len(train_items), 1)])
[ x] I'm up-to-date with the latest release:
pip install -U talos
[ x] I've confirmed that my Keras model works outside of Talos.
AttributeError Traceback (most recent call last)
in ()
7 params=p,
8 dataset_name='movie_lens',
----> 9 experiment_no='1')
C:\ProgramData\Anaconda3\envs\DataScienceEnv\lib\site-packages\talos\scan\Scan.py in __init__(self, x, y, params, dataset_name, experiment_no, model, x_val, y_val, val_split, shuffle, search_method, reduction_method, reduction_interval, reduction_window, grid_downsample, reduction_threshold, reduction_metric, round_limit, talos_log_name, debug, seed, clear_tf_session, disable_progress_bar)
140 # input parameters section ends
141
--> 142 self._null = self.runtime()
143
144 def runtime(self):
C:\ProgramData\Anaconda3\envs\DataScienceEnv\lib\site-packages\talos\scan\Scan.py in runtime(self)
145
146 self = scan_prepare(self)
--> 147 self = scan_run(self)
C:\ProgramData\Anaconda3\envs\DataScienceEnv\lib\site-packages\talos\scan\scan_run.py in scan_run(self)
27 disable=self.disable_progress_bar)
28 while len(self.param_log) != 0:
---> 29 self = rounds_run(self)
30 self.pbar.update(1)
31 self.pbar.close()
C:\ProgramData\Anaconda3\envs\DataScienceEnv\lib\site-packages\talos\scan\scan_run.py in rounds_run(self)
59
60 _hr_out = run_round_results(self, _hr_out)
---> 61 self._val_score = get_score(self)
62 write_log(self)
63 self.result.append(_hr_out)
C:\ProgramData\Anaconda3\envs\DataScienceEnv\lib\site-packages\talos\metrics\score_model.py in get_score(self)
15
16 try:
---> 17 y_pred = self.keras_model.predict_classes(self.x_val)
18 return Performance(y_pred, self.y_val, self.shape, self.y_max).result
19AttributeError: 'Model' object has no attribute 'predict_classes'
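The traceback ends in a call to predict_classes, which the functional Model object does not provide. For a classifier, one possible workaround is to derive class indices from the output of predict() instead. The following is only a sketch under that assumption: `predict_class_indices` and `FakeFunctionalModel` are hypothetical names used for illustration, not Talos or Keras code, and the approach does not apply to models whose output is not a class probability (such as an autoencoder).

```python
def predict_class_indices(model, x):
    """Return one class index per sample using only model.predict()."""
    classes = []
    for row in model.predict(x):
        if len(row) == 1:
            # single sigmoid output: threshold the probability at 0.5
            classes.append(int(row[0] > 0.5))
        else:
            # softmax-style output: index of the largest probability
            classes.append(max(range(len(row)), key=lambda i: row[i]))
    return classes


class FakeFunctionalModel:
    """Stand-in for a functional-API model that only offers predict()."""

    def predict(self, x):
        return [[0.1, 0.7, 0.2], [0.8, 0.1, 0.1]]


print(predict_class_indices(FakeFunctionalModel(), x=None))
```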
Although it is true that Python 3 is slowly becoming the new standard (at least in my experience), lots of people still use Python 2. Is there anything in the code here that might not be backwards compatible? Keras is compatible with at worst 2.7.
Note this might be as simple as requesting that Travis build with Python 2.7 and then just debugging from a forked origin/dev branch.
This is really not an issue in the sense that the case where all labels are 0 seems not to be relevant for training a model, but the error might be good to handle in a meaningful way (which warns about all labels being 0).
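One way such a warning could look is sketched below. `warn_if_all_zero` is a hypothetical helper written for illustration; it is not an existing Talos function, and the wording of the message is an assumption.

```python
import warnings


def warn_if_all_zero(y_true):
    """Warn and return True when every label is 0, otherwise return False."""
    labels = [int(v) for v in y_true]
    if labels and not any(labels):
        warnings.warn("All labels are 0, so the score cannot be computed "
                      "in a meaningful way.", UserWarning)
        return True
    return False
```

The scoring step could call this check first and skip or flag the round instead of failing with an unrelated error.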
I'm trying to follow the notebook but I have the following issue:
---------------------------------------------------------------------------
ImportError Traceback (most recent call last)
<ipython-input-54-45124b1b47d4> in <module>()
7 from keras.layers import Conv1D, GlobalMaxPooling1D, Dense, Softmax
8 import hyperio as hy
----> 9 from hyperio import early_stopper
ImportError: cannot import name 'early_stopper'
@x94carbone thanks very much for suggesting this. Any contributions you would like to make in this respect are definitely most welcome.
Do you already have an idea of what signals we would need to feed into an evolutionary algorithm and, if so, in which format those signals are needed out of Talos as part of the Scan() stage?
Hey, I am new to Python and programming and am just trying to learn autoencoders. I am trying to optimize my denoising autoencoder, but I am getting this error:
self.epoch_entropy.append(epoch_entropy((_hr_out)))
UnboundLocalError: local variable '_hr_out' referenced before assignment.
I can't figure out what needs to be done.
The code I am using is:
x_train_analytical, x_test_analytical = train_test_split(data_analytical,
                                                         test_size=0.20,
                                                         random_state=15)
x_train_experimental, x_test_experiment = train_test_split(data_experiment,
                                                           test_size=0.20,
                                                           random_state=15)

def AE(x_train_analytical, x_train_experimental, x_test_analytical, x_test_experiment, params):
    # Denoising Autoencoder
    regulariser_value = 0.00015
    Input_Experiment = Input(x_train_analytical.shape[1])
    encoded = Dense((params['first_neuron']), activation=params['activation'],
                    activity_regularizer=regularizers.l1(regulariser_value))(Input_Experiment)
    decoded = Dense(500, activation=params['last_activation'])(encoded)
    autoencoder = Model(Input_Experiment, decoded)
    encoder = Model(Input_Experiment, encoded)
    encoded_input = Input(shape=(encoding_dim,))
    decoder_layer = autoencoder.layers[-1]
    decoder = Model(encoded_input, decoder_layer(encoded_input))
    model.compile(optimizer='adadelta', loss='mean_squared_error', metrics=['acc'])
    from keras import callbacks
    callback = callbacks.TensorBoard(log_dir=r'xxxxxxxxxxxxxxxx/xxxxxxxxxxx',
                                     histogram_freq=0,
                                     write_graph=True,
                                     write_images=True)
    history = model.fit(x_train_analytical, x_train_experimental, epochs=params['epochs'],
                        batch_size=params['batch_size'],
                        shuffle=True,
                        validation_data=(x_test_analytical, x_test_experiment),
                        callbacks=[callback])
    return history, model

t = ta.Scan(x=x_train_analytical, y=x_train_experimental,
            model=AE, grid_downsample=0.01,
            params=p, dataset_name='model_data', experiment_no='1', shuffle=False)
Can anyone help me out?
This has to do with several things:
This was previously on by default, with no way to turn it off in Talos. It is now a Scan() parameter, clear_tf_session, which is set by default to False.
Regarding the second case, which concerns using a saved model to predict later, there is an active discussion on the topic in the Keras issues.
When I import the metric with from talos.metrics.keras_metrics import fbeta_score
and compile the model with it, then run Talos with the parameter reduction_metric="fbeta_score",
the output CSV seems to list the val_acc of the epoch with the best val_acc, but only the first epoch's value for fbeta_score. This looks wrong: if anything, it should report the fbeta_score of that same epoch, I would have thought.
I am not interested in accuracy because of class imbalance in my system, and the accuracy saturates after a few epochs, so I need Talos to store, for each parameter combination, either:
a) The result of the last epoch
b) Ideally the result with the best fbeta_score
Given that fbeta_score has been implemented, I assume this must be possible but I don't see how.
I am using the latest dev branch, v0.2 (as I have augmented data, I needed the functionality to supply x_val and y_val as parameters). To run this code without errors, I needed to change:
talos/metrics/score_model.py
line 17
from y_pred = self.keras_model.predict_classes(self.x_val)
to y_pred = self.keras_model.predict(self.x_val)
Which might be related to my problem.
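Until this is handled inside Talos, option b) can be approximated by post-processing the per-epoch history that the model function returns. A minimal sketch, assuming a dictionary shaped like Keras' History.history (metric name mapped to a list with one value per epoch); the key name 'val_fbeta_score' and the helper `best_epoch_row` are assumptions for illustration, not Talos API.

```python
def best_epoch_row(history, metric='val_fbeta_score', mode='max'):
    """Return every metric's value at the epoch where `metric` is best."""
    values = history[metric]
    pick = max if mode == 'max' else min
    # index of the best epoch according to the chosen metric
    best = pick(range(len(values)), key=lambda i: values[i])
    row = {name: series[best] for name, series in history.items()}
    row['epoch'] = best + 1  # 1-based epoch number
    return row


# hypothetical per-epoch values: accuracy saturates, fbeta peaks at epoch 2
history = {
    'val_acc': [0.90, 0.91, 0.91],
    'val_fbeta_score': [0.40, 0.55, 0.52],
}
row = best_epoch_row(history)
```

Here `row` holds the val_acc and val_fbeta_score from the same epoch, which is the pairing the output CSV should ideally contain. For option a), indexing each series with -1 gives the last epoch instead.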