larq / zookeeper Goto Github PK

A small library for managing deep learning models, hyperparameters and datasets

License: Apache License 2.0

Python 100.00%

command-line-interface deep-learning hyperparameter keras machine-learning python tensorflow tensorflow-datasets

zookeeper's Introduction

Larq is an open-source deep learning library for training neural networks with extremely low precision weights and activations, such as Binarized Neural Networks (BNNs).

Existing deep neural networks use 32 bits, 16 bits or 8 bits to encode each weight and activation, making them large, slow and power-hungry. This prohibits many applications in resource-constrained environments. Larq is the first step towards solving this. It is designed to provide an easy to use, composable way to train BNNs (1 bit) and other types of Quantized Neural Networks (QNNs) and is based on the tf.keras interface. Note that efficient inference using a trained BNN requires the use of an optimized inference engine; we provide these for several platforms in Larq Compute Engine.

Larq is part of a family of libraries for BNN development; you can also check out Larq Zoo for pretrained models and Larq Compute Engine for deployment on mobile and edge devices.

Getting Started

To build a QNN, Larq introduces the concept of quantized layers and quantizers. A quantizer defines the way of transforming a full precision input to a quantized output and the pseudo-gradient method used for the backwards pass. Each quantized layer requires an input_quantizer and a kernel_quantizer that describe the way of quantizing the incoming activations and weights of the layer respectively. If both input_quantizer and kernel_quantizer are None the layer is equivalent to a full precision layer.

You can define a simple binarized fully-connected Keras model using the Straight-Through Estimator the following way:

model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(),
        larq.layers.QuantDense(
            512, kernel_quantizer="ste_sign", kernel_constraint="weight_clip"
        ),
        larq.layers.QuantDense(
            10,
            input_quantizer="ste_sign",
            kernel_quantizer="ste_sign",
            kernel_constraint="weight_clip",
            activation="softmax",
        ),
    ]
)

This layer can be used inside a Keras model or with a custom training loop.

Examples

Check out our examples on how to train a Binarized Neural Network in just a few lines of code:

Installation

Before installing Larq, please install:

Python version 3.7, 3.8, 3.9, or 3.10
Tensorflow version 1.14, 1.15, 2.0, 2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 2.7, 2.8, 2.9, or 2.10:
```
pip install tensorflow  # or tensorflow-gpu
```

You can install Larq with Python's pip package manager:

pip install larq

About

Larq is being developed by a team of deep learning researchers and engineers at Plumerai to help accelerate both our own research and the general adoption of Binarized Neural Networks.

zookeeper's People

Contributors

Stargazers

Watchers

Forkers

nschaffner adamhillier sib1 jwfromm lilujunai v-i-s-h michea cuausuarez sfalkena

zookeeper's Issues

Flag syntax for boolean params

At the moment, CLI params are passed with arguments of the form key=value, meaning that boolean parameters are set with e.g. use_bias=True or use_bias=False.

It would be nice if we could have the option of passing boolean arguments with --use-bias or --no-use-bias to set that parameter to True and False respectively.

Fields don't correctly `skip` generations.

I.e. a field with value declared on a component won't be picked up on the grandchild component correctly unless the intermediate child component also declares the type annotation, which should not be necessary.

Zookeeper should pick up fields from annotated @property definitions

The work-around for now is to do:

field_name: SomeType

@property
def field_name(self):
    return ...

ComponentField does not support nested lambda functions

Describe the bug

When one has nested components, it becomes impossible to define attributes of multiple types on the innermost component. Consider the following example:

    model = ComponentField(
        CustomComponent,
        some_attribute=lambda: CustomChildComponent(
            some_strings=lambda: [string_1", "string_2"],
        ),
    )

This will throw TypeError: Field 'some_strings' of component 'SomeClass.model.some_attribute' is annotated with type 'typing.List[str]', which is not satisfied by value <function SomeClass.<lambda>.<locals>.<lambda> at 0x7f16c1850830>.

Changing the outermost lambda to a ComponentField or PartialComponent raises

TypeError: Keyword arguments passed to `PartialComponent` must be either:
- An immutable value (int, float, bool, string, or None).
- A function or lambda accepting no arguments and returning the 
  value that should be passed to the component upon instantiation.
- Another `PartialComponent`.Wrapping non-immutable values in a function / lambda allows the values to be lazily evaluated; they won't be created at all if the partial component is never instantiated.

To Reproduce

See the code above. Sadly I don't have much time right now but I can create some example classes at a later time if necessary, they should be quite simple.

Expected behavior

Nested lambda functions should be correctly resolved (or nested ComponentFields should be supported)

Environment

TensorFlow version: 2.0.0
Zookeeper version: 1.0b7

Remove epochs as explicit argument

Epochs should be part of hparams

Support proper Python syntax highlighting for `str`/`repr`

It would be nicer to use proper Python syntax highlighting. prompt-toolkit has support for this already. Though this is not something we need to look into right now.

Originally posted by @lgeiger in #91 (comment)

Non-components should raise an error when being configured

Describe the bug

A subclass of a component that itself does not have an @component decorator can be initialized with configure without any errors. It will then not act as expected, since it is not a component.

To Reproduce

@component
class BaseComponent:
    value: int = Field(1)

class SubclassedComponent(BaseComponent):
    value = Field(2)
    
component = SubclassedComponent()
configure(component, {})
print(component.value)  # prints 1
assert component.value == 2  # fails

Expected behavior

During the call to configure an error should be thrown that the component being configured is not a component.

Environment

Zookeeper version: 1.0b7

Drop Python 3.6 support

Pytest 7.1.0 (March 2022) has dropped support for Python 3.6, causing our CI to fail. As a result, we haven't updated Pytest since. Perhaps we should also drop Python 3.6 support in general or in CI only?

Any thoughts, @lgeiger?

Support accurate dataset length computation in `TFDSDataset` components in TF2.3

From the TF 2.3 release notes:

tf.data.experimental.cardinality is now a method on tf.data.Dataset.

tf.data.Dataset now supports len(Dataset) when the cardinality is finite.

We should make use of these functions here:

zookeeper/zookeeper/tf/dataset.py

Lines 114 to 117 in b0b9bcf

    
           def num_examples(self, split) -> int: 
        
               """Compute the number of examples in a given split.""" 
        
               return sum(self.splits[s].num_examples for s in base_splits(split))

Factory cannot access its parent component's Field

Describe the bug

In the official factory.py, there is an example which is

    @factory
    class F:
        a: int = Field()
        def build(self):
            return self.a + 4

    @component
    class C:
        a: int = Field(3)
        f: int = ComponentField(F)

    print(C().f)

I test the same code using the latest zookeeper vision

from zookeeper import factory, Field, ComponentField, component

@factory
class F:
    a: int = Field()
    def build(self) -> int:
       return self.a + 4

@component
class C:
    a: int = Field(3)
    f: int = ComponentField(F)

print(C().f)

But it raised an error:

Traceback (most recent call last):
  File "test.py", line 14, in <module>
    print(C().f)
  File "/usr/local/lib/python3.6/dist-packages/zookeeper/core/component.py", line 221, in wrapped_fn
    return result.build()
  File "/usr/local/lib/python3.6/dist-packages/zookeeper/core/factory.py", line 20, in wrapped_fn
    result = fn(factory_instance)
  File "test.py", line 7, in build
    return self.a + 4
  File "/usr/local/lib/python3.6/dist-packages/zookeeper/core/component.py", line 217, in wrapped_fn
    result = base_wrapped_fn(instance, name)
  File "/usr/local/lib/python3.6/dist-packages/zookeeper/core/component.py", line 194, in base_wrapped_fn
    raise e from None
  File "/usr/local/lib/python3.6/dist-packages/zookeeper/core/component.py", line 181, in base_wrapped_fn
    result = field.get_default(instance)
  File "/usr/local/lib/python3.6/dist-packages/zookeeper/core/field.py", line 146, in get_default
    f"Field '{self.name}' has no default or configured value."
AttributeError: Field 'a' has no default or configured value.

It seems that the factory didn't use it's parent component to configure its Field. I don't know what to do.

To Reproduce

Expected behavior

Environment

TensorFlow version: 2.3.1
Zookeeper version: 1.0.4

The ordering of nested component fields can cause configuration to be silently ignored

Describe the bug

Because configuration of the component tree is recursively done depth-first, there are circumstances in which a component instance that lives at two or more levels in the tree (i.e. is inherited somewhere) gets configured not at the top level (where it should be) but at a nested level. This causes any configuration values that were intended for this component, scoped to the top-level, to be silently dropped.

To Reproduce

    @component
    class GrandChild:
        a: int = Field(5)

    @component
    class Child:
        grand_child: GrandChild = ComponentField()

    @component
    class Parent:
        child: Child = ComponentField(Child)
        grand_child: GrandChild = ComponentField(GrandChild)

    p = Parent()
    configure(p, {"grand_child.a": 3})

    assert p.grand_child.a == 3
    assert p.child.grand_child.a == 3

Expected behavior

The above test case should pass.

Environment

TensorFlow version: 2.4.1
Zookeeper version: Current master

ComponentFields passed through the CLI don't get their Fields resolved

Describe the bug

In the code below, let's say it's called script.py, child.some_attribute should be inherited from the Parent class. This works just fine if you execute this with python script.py, but if you instead excute python script.py child=Child, it will throw the following error:

  File "script.py", line 16, in run
    print(f"child attribute: {self.child.some_attribute}")
   ...
  File "<local_path>/zookeeper/core/field.py", line 144, in get_default
    raise AttributeError(
AttributeError: Field 'some_attribute' has no default or configured value.

So the Child is correctly resolved, but its Field never obtains the value from the parent class.

To Reproduce

from zookeeper import ComponentField, Field, cli, component, task


@component
class Child:
    some_attribute: bool = Field()  # inherited from parent


@task
class Parent:
    some_attribute: bool = Field(False)
    child: Child = ComponentField(Child)

    def run(self):
        print(f"Own attribute: {self.some_attribute}")
        print(f"child attribute: {self.child.some_attribute}")


if __name__ == "__main__":
    cli()

Environment

TensorFlow version: 2.2.1
Zookeeper version: 1.1.0

Include some sort of post-configure hook

We got rid of validate_configuration; maybe it can be replaced with __component_configured__ or something similar.

Dictionary fields

Feature motivation

If you have multiple variants of some kind of component, e.g. multiple preprocessing functions, it would be very nice to store them all in a dictionary rather than having to define separate fields (which is error-prone especially as component classes get bigger, subclass other components etc)

(CC @CNugteren)

Feature description

Would be nice to be able to do something like this:

preprocessing: Dict[str, Preprocessing] = {
    "default": ComponentField(),
    "custom": ComponentField(),
}

Of course the fields would still have to be CLI overridable, e.g. MyComponent.preprocessing.default.some_attribute="value"

Feature implementation

I don't think this will necessarily be easy, but I also don't think it should be too difficult. We'll mainly have to make sure the config "paths" are handled correctly and the inheritance works properly. I'd hope this could be done in a day or so.

Error when configuring components with values inherited from factories

The following fails:

    @factory
    class IntFactory:
        def build(self) -> int:
            return 5

    @component
    class Child:
        x: int = ComponentField()

    @component
    class Parent:
        child: Child = ComponentField(Child)
        x: int = ComponentField(IntFactory)

    p = Parent()
    configure(p, {})
    assert p.x == 5
    assert p.child.x == 5

with error AttributeError: 'int' object has no attribute '__component_configured__'.

This happens because configure() tries to recursively configure p.child.x, as it is a ComponentField (shadowing p.x, which is a ComponentField). But p.child.x resolves to an integer. We can solve this by checking that the thing we try and recursively configure is a component instance (this will not break type-checking, which happens separately).

Add shell completion

I don't think there's any reason that this shouldn't be straightforward using the Click support for shell completion.

Attribute typecheck does not work for floats

Describe the bug

When calling configure, passing a bool instead of an float does not throw an error, presumably because python could theoretically convert it to 1.0.

To Reproduce

from zookeeper import Component

class A(Component):
    z: float

a = A()
a.configure({"z": True})  # This works fine, even though z should be a float.
print(a)

Outputs:

A(
    z = True
)

Expected behavior

We might want to throw an error if we receive a bool, since these may not always be interchangeable (or at least that's not what you would expect, so it may make it difficult to debug).

Environment

Python version: 3.7
TensorFlow version: 2.0
Zookeeper version: 1.0.dev2

Let HParams support optional and non-optional parameters

Problem statement & motivation

Let's say I have a class of parameters for which I want to define the structure, but not the values. For example:

class TestParams(HParams):
    value_1: float
    value_2: Optional[float]

In this case, each instance of TestParams should have some property value_1, and potentially a value_2.

HParams currently don't support this:

p = TestParams(value_2=0.5) # This will not throw any errors
print(p.value_1) # Will throw an AttributeNotFoundError

Unexpected behaviour when configuring nested components.

Describe the bug

The parameter value injection means that in the component hierarchy the same component instance can be in multiple places (only one instance exists, but there are pointers to it from multiple sub-components at different levels of nesting). This means there can be unintended side-effects when passing configuration values applied to the same object but scoped to different levels. It's best to understand this with an example:

Minimal reproducable example:

from zookeeper import Component

class A(Component):
    x: int = 5

class B(Component):
    a: A

class Parent(Component):
    a: A
    b: B

p = Parent()

p.configure({
    "a.x": 3,
    "a.b.x": 4,
})

print(p)

Expected (or perhaps intended) behaviour:

Parent(
    a = A(
        x = 3
    ),
    b = B(
        a = A(
            x = 4
        )
    )
)

Actual behaviour:

Parent(
    a = A(
        x = 4
    ),
    b = B(
        a = A(
            x = 4
        )
    )
)

Environment

Zookeeper version: 1.0.dev2

RFC: Caching field values

Feature description and motivation

At the moment, field values on component instances behave much like instance attributes of generic Python class instances. One value exists per instance, and if it is mutable then an access after modification will return the same, modified value, e.g.:

@component
class A:
    foo: List[int] = Field(lambda: [1, 2, 3])

class B:
    def __init__(self):
        self.foo = [1, 2, 3]

a = A()
b = B()

assert a.foo == b.foo == [1, 2, 3]

a.foo.append(4)
b.foo.append(4)

assert a.foo == b.foo == [1, 2, 3, 4]

We could change this behaviour so that field values instead behave much more like @property values, i.e. the value is not 'cached' on the instance and instead re-generated on every access. See discussion here for a motivation of this different behaviour: larq/zoo#148 (comment).

Current implementation

For a full explanation of how components access field values, see the docstring of the _wrap_getattribute method in component.py:

zookeeper/zookeeper/core/component.py

Lines 116 to 144 in 2b8812a

    
               """ 
        
               The logic for this overriden `__getattribute__` is as follows: 
        
               During component instantiation, any values passed to `__init__` are stored 
        
               in a dict on the instance `__component_instantiated_field_values__`. This 
        
               means that a priori the `__dict__` of a component instance is empty (of 
        
               non-dunder attributes). 
        
               Field values can come from three sources, in descending order of priority: 
        
                 1) A value that was passed into `configure` (e.g. via the CLI), which is 
        
                    stored in the `__component_configured_field_values__` dict on the 
        
                    component instance or some parent component instance. 
        
                 2) A value that was passed in at instantiation, which is stored in the 
        
                    `__component_instantiated_field_values__` dict on the current component 
        
                    instance (but not any parent instance). 
        
                 3) A default value obtained from the `get_default` factory method of a 
        
                    field defined on the component class of the current instance if it has 
        
                    one, or otherwise from the factory of the field on the component class 
        
                    of the nearest parent component instance with a field of the same name, 
        
                    et cetera. 
        
               Once we find a field value from one of these three sources, we set the value 
        
               on the instance `__dict__` (i.e. we 'cache' it). 
        
               This means that if we find a value in the instance `__dict__` we can 
        
               immediately return it without worrying about checking the three cases above 
        
               in order. It also means that each look-up other than the first will incur no 
        
               substantial time penalty. 
        
               """

New implementation

It would be straightforward to implement @property-esqe behaviour for default values which are passed into fields, as mutable default values are already generated from lambdas, and there's no issue with immutable default values being cached .

However, it would be much more difficult to implement for values passed in through the CLI. Consider the configuration CLI argument foo=[1, 2, 3]. We receive this as a string, and parse it into a Python value (in this case a list) to be used as the value for the field foo. If we wanted to return a new instance of this list on each access of foo, we would either need to be able to deep-clone generic mutable objects, or we would have to hold on to the configuration value as a string, and re-parse it into a Python value each time.

It's an open question whether we are happy for the behaviour of default values vs cli-overriden values to be different.

Improve help message

It would be great if the help message could show a summary of parameters that can be configured via the CLI.

Add support for passing keyword arguments to tfds.load

tfds.load has gained some useful options in recent versions of tfds so it would be great to be able to pass them in.

In particular supporting read_config which has been introduced in [email protected] would be quite useful to support.

Expose a method to get component values after configuration

It might be good idea to have a method that returns a dictionary of configured values for all components. This is needed for instance when one wants to pass the config to experiment tracking libraries such as wandb or polyaxon.

This can currently be done by accessing internal guts of zookeper:

resolved_config = {k: getattr(exp, k) for k in exp.__component_fields__}

`getattribute` can incorrectly call `build()` on factory instances

Implicit calling of build() is desired behaviour when accessing a field, but the current behaviour is for this to happen always, including e.g. when accessing instance.__component_parent__, as is frequently done internally. This is incorrect behaviour, and causes failures with confusing error messages (and sometimes infinite recursion). The behaviour in question comes from here:

zookeeper/zookeeper/core/component.py

Lines 215 to 220 in c8eaab4

    
           @functools.wraps(base_wrapped_fn) 
        
           def wrapped_fn(instance, name): 
        
               result = base_wrapped_fn(instance, name) 
        
               if utils.is_factory_instance(result): 
        
                   return result.build() 
        
               return result

We need to check that name is a valid field name before calling build() implicitly.

Apologies to @timdebruin for this bug....

Obscure error if assign is used instead of an annotation.

If an assignment is used instead of an annotation for a nested component there is an obscure error.

E.g. if a user defines a model as:

class FancyModel(Model):
    dataset = Dataset
    ...

and runs configure, either directly or through the CLI, the following error is produced: TypeError: configure() missing 1 required positional argument: 'conf'.

The correct line should either be dataset: Dataset or dataset = Dataset(...).

We should detect this mistake and print a warning.

Add usage example

The pull-request label enforcer is broken.

When a PR is updated with a new commit, you have to remove and then re-add any pre-existing label to satisfy the PR label check.

Make `ComponentField` and `PartialComponent` support `self` argument in lambda

Feature motivation

Let's assume I have an extended model that takes an existing model (both models are zookeeper factories) and adds a bunch of layers on top. The most straightforward way to do this would be as follows:

    extended_model: tf.keras.models.Model = ComponentField(
        ExtendedModel, base_model=lambda self: self.model
    )

However, this raises

TypeError: Keyword arguments passed to `PartialComponent` must be either:
- An immutable value (int, float, bool, string, or None).
- A function or lambda accepting no arguments and returning the 
  value that should be passed to the component upon instantiation.
- Another `PartialComponent`.Wrapping non-immutable values in a function / lambda allows the values to be lazily evaluated; they won't be created at all if the partial component is never instantiated.

because the lambda function passed to base_model is not allowed to have any arguments. This is problematic, because it simply needs to access the existing model, which is an attribute of the surrounding task. It is also not possible to define this as a property, because ComponentField cannot be used to decorate properties.

For now, I simply have to make the base model a ComponentField in the ExtendModel class, and rely on zookeeper to properly configure it from the surrounding Experiment class, but it looks a bit confusing.

Feature description

It would be very useful if ComponentField detects whether the argument to the lambda function is self, and passes the correct value if this is the case.

`__pre_configure__` doesn't work with nested components

Describe the bug

When __pre_configure__ is used to modify the configuration dict of a component, the modified config is (incorrectly) not passed to configure when called on its sub-components.

To Reproduce

    @component
    class Child:
        a: int = Field(4)

    @component
    class Parent:
        a: int = Field(2)
        child: Child = ComponentField()

        def __pre_configure__(self, conf):
            if "a" in conf:
                return {"child.a": conf["a"] * 7, **conf}
            return conf

    parent = Parent()
    configure(parent, {"a": 6})
    assert parent.a == 6
    assert parent.child.a == 42

The second assertion fails (the value is 6).

Expected behavior

The value should be 6 * 7 = 42.

Environment

TensorFlow version: N/A
Zookeeper version: v1.3.1

Nicer `str`

At the moment, if a component appears multiple times in the nested component hierarchy, it will get printed multiple times in __str__. It would be nice if it would only be printed once, ideally at the 'highest' level.

ImportError: cannot import name 'print_formatted_text'

Describe the bug

Larq Zoo cannot be imported. The stack trace points to a file in Zookeeper with the following error:

---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
<ipython-input-2-0fabe46a7c79> in <module>()
      3 
      4 import larq
----> 5 import larq_zoo as lq

6 frames
/usr/local/lib/python3.6/dist-packages/larq_zoo/__init__.py in <module>()
----> 1 import larq_zoo.literature as literature
      2 import larq_zoo.sota as sota
      3 from larq_zoo.core.utils import decode_predictions
      4 from larq_zoo.training import datasets
      5 from larq_zoo.training.data import preprocess_input

/usr/local/lib/python3.6/dist-packages/larq_zoo/literature/__init__.py in <module>()
----> 1 from larq_zoo.literature.binary_alex_net import BinaryAlexNet
      2 from larq_zoo.literature.birealnet import BiRealNet
      3 from larq_zoo.literature.densenet import (
      4     BinaryDenseNet28,
      5     BinaryDenseNet37,

/usr/local/lib/python3.6/dist-packages/larq_zoo/literature/binary_alex_net.py in <module>()
      3 import larq as lq
      4 import tensorflow as tf
----> 5 from zookeeper import Field, factory
      6 
      7 from larq_zoo.core import utils

/usr/local/lib/python3.6/dist-packages/zookeeper/__init__.py in <module>()
----> 1 from zookeeper.core import (
      2     ComponentField,
      3     Field,
      4     PartialComponent,
      5     cli,

/usr/local/lib/python3.6/dist-packages/zookeeper/core/__init__.py in <module>()
----> 1 from zookeeper.core.cli import cli
      2 from zookeeper.core.component import component, configure
      3 from zookeeper.core.factory import factory
      4 from zookeeper.core.field import ComponentField, Field
      5 from zookeeper.core.partial_component import PartialComponent

/usr/local/lib/python3.6/dist-packages/zookeeper/core/cli.py in <module>()
      3 import click
      4 
----> 5 from zookeeper.core.utils import convert_to_snake_case, parse_value_from_string
      6 
      7 

/usr/local/lib/python3.6/dist-packages/zookeeper/core/utils.py in <module>()
      6 import click
      7 import typeguard
----> 8 from prompt_toolkit import print_formatted_text, prompt
      9 
     10 

ImportError: cannot import name 'print_formatted_text'

---------------------------------------------------------------------------
NOTE: If your import is failing due to a missing package, you can
manually install dependencies using either !pip or !apt.

To view examples of installing some common dependencies, click the
"Open Examples" button below.
---------------------------------------------------------------------------

To Reproduce

# Installing Larq and Larq-zoo in a Colab notebook
!pip -q install larq
!pip install larq-zoo

import larq
import larq_zoo as lq

Expected behavior

larq_zoo being imported successfully with no other output from the cell in Google Colab

Environment

TensorFlow version: 2.2.0-rc3
Zookeeper version: 1.0.0
Larq zoo: 1.0.1
Larq: 0.9.4

Easier access to factory component attributes after building

Feature motivation

Sometimes factory components have Fields that we need to access from the code. However once the factory component has already built what it should build these attributes are hard to access. The current way to do this would be for instance:

student = self.__base_getattribute__("model").student_model

Feature description

It would be nice to have some syntactic sugar for the above, for example:

student = self._model.student_model

Where the _name notation would be the way to access the factory component that created whatever the factory is tasked with producing (which would be self.model)

Improve docs and readme

Docs are currently really bare-bones and only consist of an example. We should improve that.

Throw error if component has `__post_configure__` but is not configured

Feature motivation

It's currently possible to define a component with some crucial asserts in the __post_configure__, and then use it without configuring it, hence never triggering those asserts.

Feature description

It would be nice to check if a component has a custom __post_configure__, but I'm not sure where the best place to do that would be. It probably gets tricky quite quickly and may need to be done in e.g. __getattribute__.

Docs: add walk-through examples of how to use Zookeeper concepts in experiments

I think maybe it'd be good to have some simple examples of how these Zookeeper concepts (Fields, @components, and ComponentFields) can/should be used in the context of an experiment/@task, perhaps alongside or replacing the more abstract child/parent examples we have in README.md now. I've still learned this mostly from pattern matching what I've seen others do across Zoo and other places. It'd be nice to have a reference of the most Pythonic Zookeperic ways of doing different things.

(From #134.)

Support nested HParams

If we have a model with multiple independent components, e.g. an encoder and a decoder, or a base network and some auxilliary structure, it would be nice if the HParams class could nest other HParams classes, one for each of the components.

This would facilitate easy model modularity, e.g. a build_model function could internally call build_encoder(hparams.encoder_hparams) and build_decoder(hparams.decoder_hparams).

Setting non-existent fields should cause an error.

Currently, key-value pairs set through the CLI (or more generally through configure) will have no effect if the key does not exist on the component (or some child thereof). This could lead to unintentional errors, so we should throw in such cases.

The `plot()` cli command doesn't work

To reproduce: clone the research template using cookie cutter, install the requirements, and then run name_of_project plot cifar10.

This results in the following error:

zookeeper.registry.PreprocessNotFoundError: No preprocessing functions registered for dataset cifar10.

which is definitely an error because there are pre-processing functions defined for cifar10. This happens for every dataset you try. I think the codepaths which register the pre-processing functions aren't being run for some reason.

This is clearly a little-used cli command so I'm not sure how long it hasn't worked. It is odd to me that the plot cli command is defined in zookeeper/cli whereas other commands such as train and netron are defined in name_of_project/train.py -- I suspect that it is this that causes the pre-processing functions not to be registered, but I don't know for sure.

	def num_examples(self, split) -> int:
	"""Compute the number of examples in a given split."""

	return sum(self.splits[s].num_examples for s in base_splits(split))

	"""
	The logic for this overriden `__getattribute__` is as follows:

	During component instantiation, any values passed to `__init__` are stored
	in a dict on the instance `__component_instantiated_field_values__`. This
	means that a priori the `__dict__` of a component instance is empty (of
	non-dunder attributes).

	Field values can come from three sources, in descending order of priority:
	1) A value that was passed into `configure` (e.g. via the CLI), which is
	stored in the `__component_configured_field_values__` dict on the
	component instance or some parent component instance.
	2) A value that was passed in at instantiation, which is stored in the
	`__component_instantiated_field_values__` dict on the current component
	instance (but not any parent instance).
	3) A default value obtained from the `get_default` factory method of a
	field defined on the component class of the current instance if it has
	one, or otherwise from the factory of the field on the component class
	of the nearest parent component instance with a field of the same name,
	et cetera.

	Once we find a field value from one of these three sources, we set the value
	on the instance `__dict__` (i.e. we 'cache' it).

	This means that if we find a value in the instance `__dict__` we can
	immediately return it without worrying about checking the three cases above
	in order. It also means that each look-up other than the first will incur no
	substantial time penalty.
	"""

	@functools.wraps(base_wrapped_fn)
	def wrapped_fn(instance, name):
	result = base_wrapped_fn(instance, name)
	if utils.is_factory_instance(result):
	return result.build()
	return result

larq / zookeeper Goto Github PK

zookeeper's Introduction

Getting Started

Examples

Installation

About

zookeeper's People

Contributors

Stargazers

Watchers

Forkers

zookeeper's Issues

Describe the bug

To Reproduce

Expected behavior

Environment

Describe the bug

To Reproduce

Expected behavior

Environment

Describe the bug

To Reproduce

Expected behavior

Environment

Describe the bug

To Reproduce

Expected behavior

Environment

Describe the bug

To Reproduce

Environment

Feature motivation

Feature description

Feature implementation

Describe the bug

To Reproduce

Expected behavior

Environment

Problem statement & motivation

Suggested solution

Describe the bug

Minimal reproducable example:

Expected (or perhaps intended) behaviour:

Actual behaviour:

Environment

Feature description and motivation

Current implementation

New implementation

Feature motivation

Feature description

Describe the bug

To Reproduce

Expected behavior

Environment

Describe the bug

To Reproduce

Expected behavior

Environment

Feature motivation

Feature description

Feature motivation

Feature description

Recommend Projects

Recommend Topics

Recommend Org