lyhue1991 / eat_tensorflow2_in_30_days
Tensorflow2.0 🍎🍊 is delicious, just eat it! 😋😋
License: Apache License 2.0
If you start running the code from section 2, "DNN binary classification model", and run up to this point:
ds_train = tf.data.Dataset.from_tensor_slices((X[0:n*3//4,:],Y[0:n*3//4,:])) \
.shuffle(buffer_size = 1000).batch(20) \
.prefetch(tf.data.experimental.AUTOTUNE) \
.cache()
ds_valid = tf.data.Dataset.from_tensor_slices((X[n*3//4:,:],Y[n*3//4:,:])) \
.batch(20) \
.prefetch(tf.data.experimental.AUTOTUNE) \
.cache()
you get NameError: name 'n' is not defined. I take it the intent is for the training set to be 75% of the data and the test set to be 25%.
So I suggest changing it to:
n = n_positive+n_negative
ds_train = tf.data.Dataset.from_tensor_slices((X[0:n*3//4,:],Y[0:n*3//4,:])) \
.shuffle(buffer_size = 1000).batch(20) \
.prefetch(tf.data.experimental.AUTOTUNE) \
.cache()
ds_valid = tf.data.Dataset.from_tensor_slices((X[n*3//4:,:],Y[n*3//4:,:])) \
.batch(20) \
.prefetch(tf.data.experimental.AUTOTUNE) \
.cache()
In chapter 1-1, the author uses y_test = dftest_raw['Survived'].values, but dftest_raw has no Survived column, so this line raises an error.
Is the test data the official test set, or a portion split off from the training data? Thanks!
Thank you in advance for this very nice repository.
By the way, I found that a link (https://zhuanlan.zhihu.com/p/67466552) in chapter 1-2 is broken.
Could you please fix it?
1-1 runs fine, but 1-3 raises an error.
Software versions:
Ubuntu 18.04
CUDA: 10.0
CuDNN: 7.6.5
TensorFlow-gpu: 2.1.0
Error output:
2020-04-29 10:23:08.233741: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer.so.6'; dlerror: libnvinfer.so.6: cannot open shared object file: No such file or directory
2020-04-29 10:23:08.233797: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer_plugin.so.6'; dlerror: libnvinfer_plugin.so.6: cannot open shared object file: No such file or directory
2020-04-29 10:23:08.233803: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:30] Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
2020-04-29 10:23:08.758262: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-04-29 10:23:08.764938: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-04-29 10:23:08.765330: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce RTX 2070 computeCapability: 7.5
coreClock: 1.71GHz coreCount: 36 deviceMemorySize: 7.79GiB deviceMemoryBandwidth: 417.29GiB/s
2020-04-29 10:23:08.765483: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-04-29 10:23:08.766542: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-04-29 10:23:08.767329: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-04-29 10:23:08.767509: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-04-29 10:23:08.768612: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-04-29 10:23:08.769447: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-04-29 10:23:08.771963: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-04-29 10:23:08.772061: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-04-29 10:23:08.772396: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-04-29 10:23:08.772661: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
2020-04-29 10:23:08.772903: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-04-29 10:23:08.797181: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3600000000 Hz
2020-04-29 10:23:08.797494: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x555c37cbd2a0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-04-29 10:23:08.797521: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2020-04-29 10:23:08.870114: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-04-29 10:23:08.870464: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x555c3850a540 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-04-29 10:23:08.870477: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): GeForce RTX 2070, Compute Capability 7.5
2020-04-29 10:23:08.870591: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-04-29 10:23:08.870863: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce RTX 2070 computeCapability: 7.5
coreClock: 1.71GHz coreCount: 36 deviceMemorySize: 7.79GiB deviceMemoryBandwidth: 417.29GiB/s
2020-04-29 10:23:08.870889: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-04-29 10:23:08.870899: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-04-29 10:23:08.870908: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-04-29 10:23:08.870916: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-04-29 10:23:08.870925: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-04-29 10:23:08.870933: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-04-29 10:23:08.870941: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-04-29 10:23:08.870976: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-04-29 10:23:08.871252: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-04-29 10:23:08.871502: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
2020-04-29 10:23:08.871523: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-04-29 10:23:08.872180: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1096] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-04-29 10:23:08.872188: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] 0
2020-04-29 10:23:08.872192: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 0: N
2020-04-29 10:23:08.872253: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-04-29 10:23:08.872536: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-04-29 10:23:08.872803: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6900 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2070, pci bus id: 0000:01:00.0, compute capability: 7.5)
[b'the', b'and', b'a', b'of', b'to', b'is', b'in', b'it', b'i', b'this', b'that', b'was', b'as', b'for', b'with', b'movie', b'but', b'film', b'on', b'not', b'you', b'his', b'are', b'have', b'be', b'he', b'one', b'its', b'at', b'all', b'by', b'an', b'they', b'from', b'who', b'so', b'like', b'her', b'just', b'or', b'about', b'has', b'if', b'out', b'some', b'there', b'what', b'good', b'more', b'when', b'very', b'she', b'even', b'my', b'no', b'would', b'up', b'time', b'only', b'which', b'story', b'really', b'their', b'were', b'had', b'see', b'can', b'me', b'than', b'we', b'much', b'well', b'get', b'been', b'will', b'into', b'people', b'also', b'other', b'do', b'bad', b'because', b'great', b'first', b'how', b'him', b'most', b'dont', b'made', b'then', b'them', b'films', b'movies', b'way', b'make', b'could', b'too', b'any', b'after', b'characters']
Model: "cnn_model"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
embedding (Embedding) multiple 70000
_________________________________________________________________
conv_1 (Conv1D) multiple 576
_________________________________________________________________
maxpool_1 (MaxPooling1D) multiple 0
_________________________________________________________________
conv_2 (Conv1D) multiple 4224
_________________________________________________________________
maxpool_2 (MaxPooling1D) multiple 0
_________________________________________________________________
flatten (Flatten) multiple 0
_________________________________________________________________
dense (Dense) multiple 6145
=================================================================
Total params: 80,945
Trainable params: 80,945
Non-trainable params: 0
_________________________________________________________________
2020-04-29 10:23:12.802249: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-04-29 10:23:12.968219: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-04-29 10:23:13.365572: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2020-04-29 10:23:13.378753: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2020-04-29 10:23:13.378835: W tensorflow/core/common_runtime/base_collective_executor.cc:217] BaseCollectiveExecutor::StartAbort Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
[[{{node cnn_model/conv_1/conv1d}}]]
[[Nadam/ReadVariableOp_3/_20]]
2020-04-29 10:23:13.378885: W tensorflow/core/common_runtime/base_collective_executor.cc:217] BaseCollectiveExecutor::StartAbort Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
[[{{node cnn_model/conv_1/conv1d}}]]
Traceback (most recent call last):
File "/home/huxiaoyang/PycharmProjects/eat_tensorflow2_in_30_days/1-3_text_data_modeling_process_example/example.py", line 170, in <module>
main()
File "/home/huxiaoyang/PycharmProjects/eat_tensorflow2_in_30_days/1-3_text_data_modeling_process_example/example.py", line 166, in main
train_model(model, ds_train, ds_test, epochs=6)
File "/home/huxiaoyang/PycharmProjects/eat_tensorflow2_in_30_days/1-3_text_data_modeling_process_example/example.py", line 148, in train_model
train_step(model, features, labels)
File "/home/huxiaoyang/miniconda3/envs/tf210/lib/python3.7/site-packages/tensorflow_core/python/eager/def_function.py", line 568, in __call__
result = self._call(*args, **kwds)
File "/home/huxiaoyang/miniconda3/envs/tf210/lib/python3.7/site-packages/tensorflow_core/python/eager/def_function.py", line 632, in _call
return self._stateless_fn(*args, **kwds)
File "/home/huxiaoyang/miniconda3/envs/tf210/lib/python3.7/site-packages/tensorflow_core/python/eager/function.py", line 2363, in __call__
return graph_function._filtered_call(args, kwargs) # pylint: disable=protected-access
File "/home/huxiaoyang/miniconda3/envs/tf210/lib/python3.7/site-packages/tensorflow_core/python/eager/function.py", line 1611, in _filtered_call
self.captured_inputs)
File "/home/huxiaoyang/miniconda3/envs/tf210/lib/python3.7/site-packages/tensorflow_core/python/eager/function.py", line 1692, in _call_flat
ctx, args, cancellation_manager=cancellation_manager))
File "/home/huxiaoyang/miniconda3/envs/tf210/lib/python3.7/site-packages/tensorflow_core/python/eager/function.py", line 545, in call
ctx=ctx)
File "/home/huxiaoyang/miniconda3/envs/tf210/lib/python3.7/site-packages/tensorflow_core/python/eager/execute.py", line 67, in quick_execute
six.raise_from(core._status_to_exception(e.code, message), None)
File "<string>", line 3, in raise_from
tensorflow.python.framework.errors_impl.UnknownError: 2 root error(s) found.
(0) Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
[[node cnn_model/conv_1/conv1d (defined at /PycharmProjects/eat_tensorflow2_in_30_days/1-3_text_data_modeling_process_example/example.py:71) ]]
[[Nadam/ReadVariableOp_3/_20]]
(1) Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
[[node cnn_model/conv_1/conv1d (defined at /PycharmProjects/eat_tensorflow2_in_30_days/1-3_text_data_modeling_process_example/example.py:71) ]]
0 successful operations.
0 derived errors ignored. [Op:__inference_train_step_4773]
Function call stack:
train_step -> train_step
Process finished with exit code 1
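A commonly suggested workaround for "Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR" is to enable GPU memory growth before any GPU work starts, so that TensorFlow does not reserve nearly all device memory up front. Whether that is the cause here is an assumption; the log alone does not confirm it. A minimal sketch, meant to sit at the very top of example.py:

```python
import tensorflow as tf

# Must run before any op touches the GPU; once the GPU is initialized,
# TensorFlow refuses to change this setting.
gpus = tf.config.experimental.list_physical_devices("GPU")
for gpu in gpus:
    # Allocate GPU memory on demand instead of reserving it all at start-up.
    tf.config.experimental.set_memory_growth(gpu, True)
print("GPUs with memory growth enabled:", len(gpus))
```

If the error persists after this, the installed CUDA and cuDNN versions are the next thing worth checking against the TensorFlow build.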
Using a tf.Tensor as a Python bool is not allowed. Use `if t is not None:` instead of `if t:` to test if a tensor is defined, and use TensorFlow ops such as tf.cond to execute subgraphs conditioned on the value of a tensor.
class Model(models.Model):
    def __init__(self):
        super(Model, self).__init__()
    def build(self, input_shape):
        self.dense = tf.keras.layers.Dense(15, 20)
        self.dense = tf.keras.layers.Dense(20, 10)
        self.dense = tf.keras.layers.Dense(10, 1)
        super(Model, self).build(input_shape)
    def call(self, x):
        x = self.dense(x)
        x = tf.nn.relu(x)
        x = self.dense(x)
        x = tf.nn.relu(x)
        x = self.dense(x)
        x = tf.nn.sigmoid(x)
        return (x)
model = Model()
print(model)
model.build(input_shape=(15,))
model.summary()
Error: TypeError: Could not interpret activation function identifier: 20
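A possible fix, sketched under assumptions: the second positional argument of tf.keras.layers.Dense is the activation, so Dense(15, 20) asks for 15 units with 20 as the activation, which would explain the TypeError. The sketch below passes only the output size to each layer, gives every layer its own attribute instead of overwriting self.dense, and creates the layers in `__init__`. The names DNNModel, dense1, dense2 and dense3 are mine, not from the tutorial.

```python
import tensorflow as tf


class DNNModel(tf.keras.Model):
    def __init__(self):
        super().__init__()
        # Dense(units, activation=...): the input size is inferred on first call.
        self.dense1 = tf.keras.layers.Dense(20, activation="relu")
        self.dense2 = tf.keras.layers.Dense(10, activation="relu")
        self.dense3 = tf.keras.layers.Dense(1, activation="sigmoid")

    def call(self, x):
        x = self.dense1(x)
        x = self.dense2(x)
        return self.dense3(x)


model = DNNModel()
# Calling the model on a dummy batch with 15 features creates the weights.
outputs = model(tf.zeros((4, 15)))
print(outputs.shape)
```

Building the model by calling it once on a dummy batch avoids relying on how a particular Keras version interprets `model.build(input_shape=...)`.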
def focal_loss(gamma=2., alpha=.25):
    def focal_loss_fixed(y_true, y_pred):
        pt_1 = tf.where(tf.equal(y_true, 1), y_pred, tf.ones_like(y_pred))
        pt_0 = tf.where(tf.equal(y_true, 0), y_pred, tf.zeros_like(y_pred))
        loss = -tf.sum(alpha * tf.pow(1. - pt_1, gamma) * tf.log(1e-07+pt_1)) \
               -tf.sum((1-alpha) * tf.pow( pt_0, gamma) * tf.log(1. - pt_0 + 1e-07))
        return loss
    return focal_loss_fixed
This raises AttributeError: module 'tensorflow' has no attribute 'sum'. My guess is that it should be corrected to:
def focal_loss(gamma=2., alpha=.25):
    def focal_loss_fixed(y_true, y_pred):
        pt_1 = tf.where(tf.equal(y_true, 1), y_pred, tf.ones_like(y_pred))
        pt_0 = tf.where(tf.equal(y_true, 0), y_pred, tf.zeros_like(y_pred))
        loss = -tf.reduce_sum(alpha * tf.pow(1. - pt_1, gamma) * tf.math.log(1e-07+pt_1)) \
               -tf.reduce_sum((1-alpha) * tf.pow( pt_0, gamma) * tf.math.log(1. - pt_0 + 1e-07))
        return loss
    return focal_loss_fixed
In 3-1 (low-level API demonstration), inside the data pipeline iterator function data_iter(features, labels, batch_size=8), the line
yield tf.gather(X,indexs), tf.gather(Y,indexs)
should probably be written as
tf.gather(features,indexs), tf.gather(labels,indexs)
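To show why the distinction matters, here is a small sketch of a batch iterator that indexes its own arguments rather than module-level X and Y. It uses NumPy fancy indexing as a stand-in for tf.gather, and the toy arrays are made up for illustration.

```python
import numpy as np


def data_iter(features, labels, batch_size=8):
    """Yield shuffled mini-batches taken from the arguments, not from globals."""
    num_examples = len(features)
    indices = np.random.permutation(num_examples)
    for start in range(0, num_examples, batch_size):
        batch_idx = indices[start:start + batch_size]
        # NumPy indexing plays the role of tf.gather(features, indexs) here.
        yield features[batch_idx], labels[batch_idx]


# Toy data: 20 samples with 2 features each, plus one label per sample.
features = np.arange(40, dtype=np.float32).reshape(20, 2)
labels = np.arange(20, dtype=np.float32).reshape(20, 1)
batches = list(data_iter(features, labels, batch_size=8))
print([batch_features.shape for batch_features, _ in batches])
```

With the original `tf.gather(X, indexs)` form, the function would silently ignore whatever arrays were passed in and always read the global X and Y.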
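A question: the native SavedModelBundle and Session classes do not implement the Serializable interface, so calling
val broads = sc.broadcast(bundle)
directly throws
Serialization stack: - object not serializable (class: org.tensorflow.SavedModelBundle, value: org.tensorflow.SavedModelBundle@6a1ebcff)
Adding the Serializable interface myself would mean modifying the source and changing quite a lot of code. How does the article manage this?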
As the title says.
TF Serving prediction returns errors; could you help take a look?
{ "error": "Malformed request: POST /v1/models/linear_model" }
{ "error": "In[0] is not a matrix. Instead it has shape [3]\n\t [[{{node model/outputs/BiasAdd}}]]" }
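Two likely causes, offered as a guess from the error text alone: the first message looks like what TensorFlow Serving's REST API returns when the URL lacks the `:predict` verb, and the second suggests the request body held a flat list of three numbers, whereas a dense layer expects a rank-2 input (a list of instances, each itself a list of features). The sketch below only builds the URL and JSON body. Port 8501, two features per instance, and the feature values are assumptions.

```python
import json

model_name = "linear_model"  # taken from the error message above
# Assumed host and port; adjust to your deployment.
url = "http://localhost:8501/v1/models/%s:predict" % model_name

# Each instance is its own list, so the server sees shape [2, 2], not [4].
instances = [[1.0, 2.0], [5.0, 7.0]]
payload = json.dumps({"instances": instances})

print(url)
print(payload)
```

The payload can then be sent with any HTTP client, for example `curl -d '<payload>' -X POST <url>`.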
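After defining the Linear class, the first time I instantiate it with linear = Linear(units=8), the system raises an error. I compared my code against the original repeatedly and found no differences.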
class Linear(layers.Layer):
    def __init__(self, units=32, **kwargs):
        super(Linear, self).__init__(**kwargs)
        self.units = units
    def build(self, input_shape):
        self.w = self.add_weight('w', shape=(input_shape[-1], self.units),
                                 initializer='random_normal',
                                 trainable=True)
        self.b = self.add_weight('b', shape=(self.units,),
                                 initializer='random_normal',
                                 trainable=True)
        super(Linear, self).build(input_shape)
    @tf.function
    def call(self, inputs):
        return tf.matmul(inputs, self.w) + self.b
    def get_config(self):
        config = super(Linear, self).get_config()
        config.update({'units':self.units})
        return config
linear = Linear(units=8)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-5-efa0d9cd5402> in <module>
----> 1 linear = Linear(units=8)
2 print(linear.built)
3 linear.build(input_shape=(None, 16))
4 print(linear.built)
TypeError: __call__() missing 1 required positional argument: 'inputs'
Thanks!
Is there any example code for calling a TF model from PySpark?
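In 6-3 there is the line gpus = tf.config.list_physical_devices("GPU")
Running it raises an error: module 'tensorflow_core._api.v2.config' has no attribute 'list_physical_devices'
Changing it to tf.config.experimental.list_physical_devices("GPU") fixed it for me. I don't know whether others have hit this, but I suggest updating the text.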
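A version-tolerant way to write that line, as a sketch: look up the stable name first and fall back to the experimental one reported to work above. The assumption is that the stable name is simply missing from the reporter's TensorFlow build.

```python
import tensorflow as tf

# Use tf.config.list_physical_devices when this build has it; otherwise
# fall back to the experimental function from the report above.
list_physical_devices = getattr(
    tf.config,
    "list_physical_devices",
    tf.config.experimental.list_physical_devices,
)
gpus = list_physical_devices("GPU")
print(gpus)
```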
I suggest using a virtual environment for this tutorial.
For one thing, it decouples changes in new releases of TF from the development environment we use. It saves the authors the effort of answering TF-version-related problems, which can be delegated back to the TF developers.
It also saves readers the effort of figuring out missing packages. For example, when I ran 5-1, it told me the package pillow was missing, even though it is not explicitly imported.
Best
Neil
Unknown metric function:AUC
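Because Windows and Unix-like systems represent paths differently, the example code needs to take compatibility into account to run on both. Take "1-2, image data modeling process example": there, tf.strings.regex_full_match(img_path, ".*/automobile/.*") needs to be changed to tf.strings.regex_full_match(img_path, ".*automobile.*"), and logdir = "./data/keras_model/" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S") should be built with os.path.join() rather than hard-coded.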
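A sketch of both suggestions in plain Python. Here `re.fullmatch` stands in for tf.strings.regex_full_match (which uses RE2, so the pattern should carry over, though that is untested here), and the file name 0.jpg is a made-up example.

```python
import datetime
import os
import re

# Build the log directory from components; os.path.join picks the separator.
stamp = datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
logdir = os.path.join("data", "keras_model", stamp)

# Accept either "/" or "\" around the class directory name.
pattern = r".*[/\\]automobile[/\\].*"
posix_path = "./data/cifar2/train/automobile/0.jpg"
windows_path = ".\\data\\cifar2\\train\\automobile\\0.jpg"

print(logdir)
print(bool(re.fullmatch(pattern, posix_path)))
print(bool(re.fullmatch(pattern, windows_path)))
```

Compared with the looser ".*automobile.*", this pattern still requires automobile to be a whole directory name, so a file that merely has "automobile" in its name is not mislabeled.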
In the original document:
Epoch=1,Loss:0.442317516,Accuracy:0.7695,Valid Loss:0.323672801,Valid Accuracy:0.8614
Epoch=2,Loss:0.245737702,Accuracy:0.90215,Valid Loss:0.356488883,Valid Accuracy:0.8554
Epoch=3,Loss:0.17360799,Accuracy:0.93455,Valid Loss:0.361132562,Valid Accuracy:0.8674
Epoch=4,Loss:0.113476314,Accuracy:0.95975,Valid Loss:0.483677238,Valid Accuracy:0.856
Epoch=5,Loss:0.0698405355,Accuracy:0.9768,Valid Loss:0.607856631,Valid Accuracy:0.857
Epoch=6,Loss:0.0366807655,Accuracy:0.98825,Valid Loss:0.745884955,Valid Accuracy:0.854
After I reproduced it:
Epoch=1,Loss:0.679053724,Accuracy:0.55235,Valid Loss:0.572207093,Valid Accuracy:0.717
Epoch=2,Loss:0.467248648,Accuracy:0.7762,Valid Loss:0.491477,Valid Accuracy:0.7588
Epoch=3,Loss:0.349681437,Accuracy:0.8475,Valid Loss:0.514342368,Valid Accuracy:0.7628
Epoch=4,Loss:0.278649092,Accuracy:0.8863,Valid Loss:0.564446032,Valid Accuracy:0.763
Epoch=5,Loss:0.2197005,Accuracy:0.9159,Valid Loss:0.643948495,Valid Accuracy:0.7548
Epoch=6,Loss:0.163983703,Accuracy:0.94135,Valid Loss:0.770707726,Valid Accuracy:0.7524
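You can see that Valid Loss is gradually rising.
Saving the model
model.save('./data/tf_model_savedmodel', save_format="tf")
In my tests, the model can only be saved this way; it cannot be saved in the Keras h5 format.
Loading the model
model_loaded = tf.keras.models.load_model('./data/tf_model_savedmodel')
Error: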
ValueError: Could not find matching function to call loaded from the SavedModel. Got:
Positional arguments (2 total):
* Tensor("x:0", shape=(None, 200), dtype=int32)
* Tensor("training:0", shape=(), dtype=bool)
Keyword arguments: {}
Expected these arguments to match one of the following 4 option(s):
Option 1:
Positional arguments (2 total):
* TensorSpec(shape=(None, 200), dtype=tf.int32, name='input_1')
* True
Keyword arguments: {}
Option 2:
Positional arguments (2 total):
* TensorSpec(shape=(None, 200), dtype=tf.int32, name='x')
* False
Keyword arguments: {}
Option 3:
Positional arguments (2 total):
* TensorSpec(shape=(None, 200), dtype=tf.int32, name='x')
* True
Keyword arguments: {}
Option 4:
Positional arguments (2 total):
* TensorSpec(shape=(None, 200), dtype=tf.int32, name='input_1')
* False
Keyword arguments: {}
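This loads successfully:
load_model = tf.saved_model.load('./data/saved_model')
However, a model loaded this way is not compiled, so the model.xxx methods cannot be used directly.
Current workaround:
Deploy the model in saved_model format with the TensorFlow Serving Docker image.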
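Hello. When building a model by subclassing, I want to embed that model inside another subclassed model.
The first approach is to write the embedded model as a class that inherits from Layer and override the get_config method.
The second approach is to write the embedded model as a class that inherits from Model and override the compute_output_shape method.
Do these two approaches have the same effect, or is there some difference?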
history = model.fit(x_train,y_train,batch_size= 64,epochs= 30, validation_split=0.2) raises an error:
validation_split is only supported for Tensors or NumPy arrays, found following types in the input: [<class 'pandas.core.frame.DataFrame'>]
The time series dataset seems to be missing.
The example code in this chapter does not seem to work with the latest TensorFlow; I run into:
Traceback (most recent call last):
File "demo.py", line 40, in <module>
train_model(model,epochs = 200)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/def_function.py", line 608, in __call__
result = self._call(*args, **kwds)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/def_function.py", line 678, in _call
return self._concrete_stateful_fn._filtered_call(canon_args, canon_kwds) # pylint: disable=protected-access
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/function.py", line 1665, in _filtered_call
self.captured_inputs)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/function.py", line 1746, in _call_flat
ctx, args, cancellation_manager=cancellation_manager))
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/function.py", line 598, in call
ctx=ctx)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/execute.py", line 60, in quick_execute
inputs, attrs, num_outputs)
tensorflow.python.framework.errors_impl.InternalError: 2 root error(s) found.
(0) Internal: No unary variant device copy function found for direction: 1 and Variant type_index: tensorflow::data::(anonymous namespace)::DatasetVariantWrapper
[[{{node while_input_5/_12}}]]
[[Func/while/body/_1/while/cond/then/_78/input/_91/_52]]
(1) Internal: No unary variant device copy function found for direction: 1 and Variant type_index: tensorflow::data::(anonymous namespace)::DatasetVariantWrapper
[[{{node while_input_5/_12}}]]
0 successful operations.
0 derived errors ignored. [Op:__inference_train_model_342]
Function call stack:
train_model -> train_model
I removed the visualization parts of the example code to make the problem easier to reproduce:
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers,losses,metrics,optimizers
n = 400
X = tf.random.uniform([n,2],minval=-10,maxval=10)
w0 = tf.constant([[2.0],[-3.0]])
b0 = tf.constant([[3.0]])
Y = X@w0 + b0 + tf.random.normal([n,1],mean = 0.0,stddev= 2.0)
ds = tf.data.Dataset.from_tensor_slices((X,Y)) \
    .shuffle(buffer_size = 100).batch(10) \
    .prefetch(tf.data.experimental.AUTOTUNE)
model = layers.Dense(units = 1)
model.build(input_shape = (2,))
model.loss_func = losses.mean_squared_error
model.optimizer = optimizers.SGD(learning_rate=0.001)
@tf.function
def train_step(model, features, labels):
    with tf.GradientTape() as tape:
        predictions = model(features)
        loss = model.loss_func(tf.reshape(labels,[-1]), tf.reshape(predictions,[-1]))
    grads = tape.gradient(loss,model.variables)
    model.optimizer.apply_gradients(zip(grads,model.variables))
    return loss
@tf.function
def train_model(model,epochs):
    for epoch in tf.range(1,epochs+1):
        loss = tf.constant(0.0)
        for features, labels in ds:
            loss = train_step(model,features,labels)
        if epoch%50==0:
            tf.print("epoch =",epoch,"loss = ",loss)
    tf.print("w =",model.variables[0])
    tf.print("b =",model.variables[1])
train_model(model,epochs = 200)
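The problem seems to be in the train_model function. If I remove the @tf.function decorator from train_model, there is no problem. Could the reason be that a tf.data dataset cannot be iterated inside a tf.function?
I am using a TensorFlow nightly build. Thanks.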
In 1-1 (structured data modeling process example), why is input_shape=(15,) rather than x_train.shape, that is, input_shape=(891,15)?
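A likely answer, offered as a sketch rather than the author's own words: Keras `input_shape` describes a single sample and leaves out the batch (sample-count) dimension, so for a feature matrix of shape (891, 15) it is (15,). The array below is a zero-filled placeholder with the shape quoted in the question.

```python
import numpy as np

# Placeholder feature matrix: 891 samples, 15 features each.
x_train = np.zeros((891, 15), dtype=np.float32)

# Drop the leading sample dimension to get the per-sample shape.
input_shape = x_train.shape[1:]
print(input_shape)  # (15,)
```

Writing `input_shape=x_train.shape[1:]` in the model definition avoids hard-coding the feature count.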
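Following the linear regression demonstration in the high-level API chapter, the modeling process should be:
(1) build the model with models.Sequential()
(2) add network layers with add()
(3) define the loss, metric and optimizer
(4) configure the training parameters with compile()
(5) train the model with model.fit()
However, the DNN model in the high-level API example is built with build(), and its overall structure is almost identical to the DNN model in the mid-level API example. I suspect the wrong example was given here.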
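The five steps listed above can be sketched as follows. This is a toy illustration and not the tutorial's DNN example: the data is random, and the layer sizes, optimizer and epoch count are arbitrary choices of mine.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

# Hypothetical toy data: 200 samples, 2 features, binary labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2)).astype("float32")
Y = (X[:, 0] + X[:, 1] > 0).astype("float32").reshape(-1, 1)

# (1) build the model, (2) add layers.
model = models.Sequential()
model.add(layers.Dense(8, activation="relu"))
model.add(layers.Dense(1, activation="sigmoid"))

# (3) choose loss, metric and optimizer, (4) configure them with compile().
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# (5) train with fit(); NumPy inputs allow validation_split.
history = model.fit(X, Y, batch_size=20, epochs=2, validation_split=0.2, verbose=0)
print(sorted(history.history.keys()))
```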
Chapter 1-3, text data modeling process example
# Build the vocabulary
def clean_text(text):
    ...
    tf.strings.regex_replace(stripped_html,
        '[%s]' % re.escape(string.punctuation),'')
In the replacement argument that follows re.escape(string.punctuation), should '' be ' ' (a single space)?
Otherwise, we get "himbut" from "him,but".
Additionally, I think we should remove "'" from string.punctuation.
Otherwise, we get "it s a good" from "It's a good".
def clean_text(text):
    # A string include all punctuations which has been escaped by re.
    # Use '\\' for escape of metacharacters.
    escaped_punctuation = re.escape(string.punctuation.replace("'", ""))
    lowercase = tf.strings.lower(text)
    stripped_html = tf.strings.regex_replace(lowercase, '<br />', ' ')
    cleaned_punctuation = tf.strings.regex_replace(stripped_html,
        '[%s]' % escaped_punctuation, ' ')
    return cleaned_punctuation
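The effect of the proposed change can be checked without TensorFlow. The sketch below mirrors the function above with Python's `re` and `str` methods standing in for the tf.strings ops; the name clean_text_py is mine.

```python
import re
import string


def clean_text_py(text):
    """Pure-Python analogue of the proposed clean_text, for quick checks."""
    # Keep apostrophes so contractions such as "it's" survive.
    escaped_punctuation = re.escape(string.punctuation.replace("'", ""))
    lowercase = text.lower()
    stripped_html = lowercase.replace("<br />", " ")
    # Replace punctuation with a space so neighbouring words stay separate.
    return re.sub("[%s]" % escaped_punctuation, " ", stripped_html)


print(clean_text_py("him,but"))
print(clean_text_py("It's a good<br />film"))
```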
The following error is raised:
(0) Internal: No unary variant device copy function found for direction: 1 and Variant type_index: class tensorflow::data::`anonymous namespace'::DatasetVariantWrapper [[{{node while_input_4/_12}}]] (1) Internal: No unary variant device copy function found for direction: 1 and Variant type_index: class tensorflow::data::`anonymous namespace'::DatasetVariantWrapper [[{{node while_input_4/_12}}]] [[Func/while/body/_1/input/_60/_20]]
Some readers may need to read this offline.
`@tf.function
def update_state(self,y_true,y_pred):
    y_true = tf.cast(tf.reshape(y_true,(-1,)),tf.bool)
    y_pred = tf.cast(100*tf.reshape(y_pred,(-1,)),tf.int32)
    for i in tf.range(0,tf.shape(y_true)[0]):
        if y_true[i]:
            self.true_positives[y_pred[i]].assign(
                self.true_positives[y_pred[i]]+1.0)
        else:
            self.false_positives[y_pred[i]].assign(
                self.false_positives[y_pred[i]]+1.0)
    return (self.true_positives,self.false_positives)`
In TF 2.1, the signature should also accept sample_weight=None, and only a single value can be returned.
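Estimator is an important API that has carried over from TF 1 to TF 2. In terms of level it should count as a high-level API, and it can define a model directly.
Would you consider adding this part to the book?
Why was this part left out, and what was the reasoning?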
# Matrix eigenvalues
tf.linalg.eigvalsh(a) returns the eigenvalues of a self-adjoint matrix, not the eigenvalues of a general matrix.
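The difference is easy to see numerically. The sketch below uses NumPy's `eigvals` and `eigvalsh` as stand-ins for the TensorFlow ops, on two small matrices chosen for illustration. On a non-symmetric input, `eigvalsh` reads only one triangle and therefore returns the eigenvalues of a different, symmetric matrix.

```python
import numpy as np

general = np.array([[1.0, 2.0], [3.0, 4.0]])    # not symmetric
symmetric = np.array([[2.0, 1.0], [1.0, 2.0]])  # self-adjoint

# eigvals handles any square matrix; eigvalsh assumes a self-adjoint one.
general_true = np.sort(np.linalg.eigvals(general))
general_wrong = np.sort(np.linalg.eigvalsh(general))
symmetric_a = np.sort(np.linalg.eigvals(symmetric))
symmetric_b = np.sort(np.linalg.eigvalsh(symmetric))

print("general matrix, eigvals :", general_true)
print("general matrix, eigvalsh:", general_wrong)
print("symmetric matrix agree  :", np.allclose(symmetric_a, symmetric_b))
```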
When I try to access the chapters using the links, I get a 404 error.
Example:
https://github.com/lyhue1991/eat_tensorflow2_in_30_days/blob/master/english/Chapter6-3.md
`# Use parallel preprocessing (num_parallel_calls) and prefetching (prefetch) to improve performance
ds_train = tf.data.Dataset.list_files("./data/cifar2/train/*/*.jpg") \
    .map(load_image, num_parallel_calls=tf.data.experimental.AUTOTUNE) \
    .shuffle(buffer_size = 1000).batch(BATCH_SIZE) \
    .prefetch(tf.data.experimental.AUTOTUNE)
ds_test = tf.data.Dataset.list_files("./data/cifar2/test/*/*.jpg") \
    .map(load_image, num_parallel_calls=tf.data.experimental.AUTOTUNE) \
    .batch(BATCH_SIZE) \
    .prefetch(tf.data.experimental.AUTOTUNE) `
After this processing, all the labels I print are identical. I don't know where the problem is.
1-1, structured data modeling process example
plot_metric(history,"AUC")
should be
plot_metric(history,"auc")
In 4-3, this is not a bug, just a question about something I don't understand. Why can we call model.add(Linear(units = 1,input_shape = (2,))) when the __init__ method has no "input_shape = (2,)" parameter?
class Linear(layers.Layer):
    def __init__(self, units=32, **kwargs):
        # super(Linear, self).__init__(**kwargs)
        super().__init__(**kwargs)
        self.units = units
    # The trainable parameters are defined in build method
    # Since we do not need the input_shape except the build function,
    # we do not need to store then in the __init__ function
    def build(self, input_shape):
        self.w = self.add_weight("w",shape=(input_shape[-1], self.units),
                                 initializer='random_normal',
                                 trainable=True) # Parameter named "w" is compulsory or an error will be thrown out
        self.b = self.add_weight("b",shape=(self.units,),
                                 initializer='random_normal',
                                 trainable=True)
        super().build(input_shape) # Identical to self.built = True
    # The logic of forward propagation is defined in call method, and is called by __call__ method
    @tf.function
    def call(self, inputs):
        return tf.matmul(inputs, self.w) + self.b
    # Use customized get-config method to save the model as h5 format, specifically for the model composed through Functional API with customized Layer
    def get_config(self):
        config = super().get_config()
        config.update({'units': self.units})
        return config
tf.keras.backend.clear_session()
model = models.Sequential()
# Note: the input_shape here will be modified by the model, so we don't have to fill None in the dimension representing the number of samples.
model.add(Linear(units = 1,input_shape = (2,)))
print("model.input_shape: ",model.input_shape)
print("model.output_shape: ",model.output_shape)
model.summary()
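One way to read this, as an analogy in plain Python rather than Keras source: `Linear.__init__` accepts `**kwargs` and forwards them to the base class constructor, so a keyword such as `input_shape` never has to appear in `Linear`'s own signature. In TF 2.x Keras, the base Layer is understood to accept `input_shape` among those keywords. `BaseLayer` below is a hypothetical stand-in, not the real class.

```python
class BaseLayer:
    """Hypothetical stand-in for a framework base class that knows input_shape."""

    def __init__(self, input_shape=None, name=None):
        self.input_shape = input_shape
        self.name = name


class Linear(BaseLayer):
    def __init__(self, units=32, **kwargs):
        # Anything Linear does not name itself is passed up to the base class.
        super().__init__(**kwargs)
        self.units = units


layer = Linear(units=1, input_shape=(2,))
print(layer.units, layer.input_shape)
```

A keyword the base class does not recognize would raise a TypeError there, which is why misspelled arguments still get caught.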
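TF2 added extended support for TFX, of which I think tfr is the most likely to be used in engineering practice. Would the author consider adding a tutorial on this part?
A typical scenario:
Its integration with Apache Beam lets us unify the data pipelines for offline training and online serving, so that our model only has to consume the data the pipeline provides.
Running TensorFlow Serving with Docker raises an error.
When using an absolute path on Windows, it needs to be written like
logdir = 'C:\\xx\\autograph\\%s' % stamp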
![image](https://user-images.githubusercontent.com/55381998/79407314-c5b0ac00-7fcb-11ea-8546-54e90495fbf1.png)
All labels are 0, which makes the accuracy in all subsequent training equal to 1.
Train for 100 steps, validate for 20 steps
Epoch 1/10
100/100 [==============================] - 16s 162ms/step - loss: 0.0116 - accuracy: 0.9904 - val_loss: 1.2626e-09 - val_accuracy: 1.0000
Epoch 2/10
100/100 [==============================] - 11s 106ms/step - loss: 5.7853e-09 - accuracy: 1.0000 - val_loss: 1.2602e-09 - val_accuracy: 1.0000
Epoch 3/10
100/100 [==============================] - 11s 105ms/step - loss: 5.7422e-09 - accuracy: 1.0000 - val_loss: 1.2595e-09 - val_accuracy: 1.0000
...
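Today I noticed that the content of 3-1 (low-level API demonstration) differs from eat_tensorflow2_in_30_days/3-1,低阶API示范.md. I guess this is because syncing is delayed; how long does the sync take?
In 3-1 (low-level API demonstration), when preparing the data, there is a comment:
Its exact location is the last line of the first code snippet under "1. Prepare data" in "I. Linear regression model" of 3-1. I have attached a picture, though I am not sure it will display.
However, the TensorFlow API documentation (matmul) says: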
Since python >= 3.5 the @ operator is supported (see PEP 465). In TensorFlow, it simply calls the tf.matmul() function, so the following lines are equivalent:
d = a @ b @ [[10], [11]]
d = tf.matmul(tf.matmul(a, b), [[10], [11]])
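The equivalence quoted above can be checked on small arrays. The sketch below uses NumPy, where `@` and `np.matmul` have the same relationship, as a stand-in for TensorFlow; the matrices a and b are made up.

```python
import numpy as np

a = np.array([[1.0, 2.0], [3.0, 4.0]])
b = np.array([[5.0, 6.0], [7.0, 8.0]])
c = np.array([[10.0], [11.0]])

# The @ operator (PEP 465) chains left to right, like nested matmul calls.
d_operator = a @ b @ c
d_function = np.matmul(np.matmul(a, b), c)

print(d_operator.shape)
print(np.array_equal(d_operator, d_function))
```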