
robot-brain-project's Introduction


Becca is a general learning program for use in any robot or embodied system. When using Becca, a robot learns to do whatever it is rewarded to do, and continues learning throughout its lifetime.

How do I take a quick look at becca?

Pull down the code from Pypi.

pip3 install becca

becca_test, becca_viz, and becca_toolbox are installed automatically along with becca. The install will also pull in numpy, numba, and matplotlib if you don't already have them.

Run it on your local machine.

python3
>>> import becca_test.test as test
>>> test.suite()

How do I install becca for development?

If you want to integrate becca with your robot, simulation, or reinforcement learning benchmark, or you'd like to contribute to the code, you'll need to clone the GitHub repositories and install them locally. Here is the walkthrough.

What can becca do?

Some videos show it in action.

How do I use becca?

A Hello-World example walks you through the process.

What can becca do for me?

Becca aspires to be a brain for any robot, doing anything. It's not there yet, but it's getting closer. It may be able to drive your robot. Hook it up and see. Feel free to shoot me an email ([email protected]) if you'd like to talk it through.

How does becca 10 work?

I owe you this. It's on my To-Do list.

In the meantime, the reinforcement learner is similar to the one from Becca 7 (described in this video) and the unsupervised ziptie algorithm hasn't changed from Becca 6 (described on pages 3-6 of this pdf).

The code is also generously documented. I explain all my algorithmic tricks and justify some of my design decisions. I recommend starting at connector.py and walking through from there.

Next steps.

The good folks at OpenAI have created a playground called Gym for becca and agents like it. Learning on simulated robots of all types and complexities is a great opportunity to show what becca can do. Getting becca integrated with Gym is my next development goal. There are some intermediate steps, and I'll be working through them for the next several months.

Join in

We could use your help! There are several issues tagged entrypoint. These are a fine place to start if you are coming to the project for the first time and want to get your feet wet. They aren't necessarily small tasks, or easy ones, but they don't presuppose a deep understanding of the code.

Questions? Comments? Snide remarks?

Feel free to add or comment on GitHub issues, tag the becca project on Twitter, or send me a personal email ([email protected]), as befits the situation.

robot-brain-project's People

Contributors

brohrer, drfranklin, gitter-badger, markroxor, matt2000, microgold


robot-brain-project's Issues

tests, CI and pep8

@brohrer To avoid introducing breaking changes as we develop Becca, I suggest we write tests for each module and use some sort of continuous integration, such as Travis CI, which requires a .travis.yml file, something like this.

We can use tox to automate our tests.
I also recommend enforcing PEP 8 compliance alongside tox by using something like flake8.

Raising this issue to discuss these three things.

Build basic cerebellum

Add functionality and documentation for modeling the world and predicting features and actions at the next time step.

Clean up trailing decimal points.

Now that becca no longer supports Python 2, there is no need to initialize floats with decimal points to distinguish them from integers. There are a lot of trailing decimal points that are only clutter now.

For example
new_list = [0.]
should be
new_list = [0]

Build basic ganglia

Build basic functionality and documentation for choosing actions based on the valuation of the amygdala and the predictions of the cerebellum.

Build a basic cortex

The cortex will observe feature activities over time and use them to create new features. The created features will be higher order (more abstract) than the ones that contribute to them.

Attend a single feature or transition effect at each time step

In a mechanism that mimics the human ability to attend to features in their environment, BECCA will pay attention to just one feature at each time step. It can also choose to attend to a single transition effect in its long-term memory instead.

BECCA will choose which to attend based on which has the highest salience. The salience of a feature is affected by its magnitude, its change in magnitude, and the time since it was last attended; it may also be affected by surprise. The salience of a transition effect is affected by transition priming, change in priming, time since it was attended, and reward magnitude.
Surprise is calculated from how much feature activity exceeds priming.

Fatigue decreases salience: the more recently a feature was attended, the less salient it is.
The salience factor F for a feature observed t time steps ago is F(t) = 1 - 1/(t/beta + 1), where beta is a positive constant giving the 50% decay time.
The salience factor R for the reward, r, associated with a transition is R(r) = (1 + gamma |r|) / (gamma + 1), where gamma is a positive constant weighting the importance of reward in salience.

Recently attended features are tracked in a way that mimics human short-term learning.
Each attended feature is assigned a magnitude, m, based on the feature activity or transition priming. The magnitude of a previously attended feature is decayed as a function M of the time t since it was attended: M(t) = alpha m / t.
New attention reinforces by summation: the new magnitude, m_new, is the sum of the decayed magnitude of the previous attention instance and the current feature-activity-based magnitude, m: m_new = m + M(t).
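The factors above can be sketched in plain Python. This is a hedged illustration of the formulas as written, not becca's actual implementation; the alpha, beta, and gamma values are placeholders.

```python
def fatigue_factor(t, beta=10.0):
    """Salience factor F for a feature observed t time steps ago.
    F is 0 immediately after attending and rises toward 1; at t = beta
    it reaches 0.5, so beta sets the 50% point."""
    return 1.0 - 1.0 / (t / beta + 1.0)


def reward_factor(r, gamma=3.0):
    """Salience factor R for the reward r associated with a transition.
    gamma weights the importance of reward in salience."""
    return (1.0 + gamma * abs(r)) / (gamma + 1.0)


def decayed_magnitude(m, t, alpha=1.0):
    """Short-term decay M(t) = alpha * m / t of a previously attended
    feature's magnitude. t is clamped to at least 1 time step."""
    return alpha * m / max(t, 1)


def reinforce(m_prev, t, m_now, alpha=1.0):
    """New attention reinforces by summation: m_new = m_now + M(t)."""
    return m_now + decayed_magnitude(m_prev, t, alpha)
```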

Build a basic hippocampus and cingulate

The cingulate will filter features down to one per time step (attention). The hippocampus will use attended features to build a model of the world, make predictions, and evaluate actions.

Incorporate grid_1D_delayed task

Troubleshoot the grid_1D_delayed task. Right now it's producing odd results, but it's something that I think BECCA ought to be able to handle. I want to add it to the set of benchmark tasks.

Evolve the catch world

  • Make it more complex, more intuitive, and more aesthetically appealing.
  • Make the world bigger.
  • Incorporate aspects that require deep learning.

Make often-repeated actions automatic

Daisychains within cogs learn approximate transition probabilities between attended features, which enables them to predict likely next features. When actions are strongly predicted, execute them.

Add logging

There are a lot of "info" and "debug" level print statements in the code. It would be great to log these out to a file, python style, using the logging package.

Add type hints

Python type hints provide a nice bit of built-in documentation and, when an appropriate linter is used, an extra catch for subtle bugs. Adding them would make the code a bit stronger.
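For example, a helper like this hypothetical one (not an actual becca function) could be annotated so that a checker such as mypy can verify callers:

```python
from typing import List


def decay_activities(activities: List[float],
                     decay_rate: float = 0.5) -> List[float]:
    """Return activities reduced by decay_rate. The annotations document
    the expected types and let a linter catch mismatched calls."""
    return [a * (1.0 - decay_rate) for a in activities]
```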

Implement multi-step lookahead

In the goal-selection mechanism, choose whether to select a goal based on a parameter value. The parameter represents the urgency of acting.

The grid_1D_ms world might be a good first world to test this in.

Integrate becca with a simulated robot

This is exactly what becca is built for. Integrating it with a simulated (or physical!) autonomous system of any sort will provide both a great demonstration of its current capabilities and a good way to discover bugs and areas for improvement.

A broad collection of simulated environments that are ripe for this are available through OpenAI's Gym interface. There's already a Gym-specific task (#39).
If you decide to integrate with a new environment or interface, create a new issue specific to your project, and leave this one in place for others to see.

See the wiki for a list of environments and interfaces.
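The shape of such an integration is a sense-act-reward loop. Below is a hedged sketch with a toy stand-in world and a placeholder policy; TinyWorld and brain_step are hypothetical and do not reflect becca's or Gym's actual interfaces.

```python
import random


class TinyWorld:
    """Toy one-dimensional grid world; reward for reaching position 0."""

    def __init__(self, size=5):
        self.size = size
        self.position = size - 1

    def step(self, action):
        # action is -1 (move left) or +1 (move right)
        self.position = min(max(self.position + action, 0), self.size - 1)
        reward = 1.0 if self.position == 0 else 0.0
        return self.position, reward


def brain_step(sensor, reward):
    """Placeholder policy; a learner like becca would improve this
    from the reward signal instead of being hand-coded."""
    return -1 if sensor > 0 else random.choice([-1, 1])


world = TinyWorld()
sensor, reward = world.position, 0.0
total_reward = 0.0
for _ in range(20):
    action = brain_step(sensor, reward)
    sensor, reward = world.step(action)
    total_reward += reward
```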

tune up the hub

Profile BECCA and look for parts of the hub that can be sped up using numba. Write the numba functions to accelerate these and test the results.

Also, I don't think running activity in the hub is doing what it's supposed to. I think it's nearly 0 most of the time.

Record cause-effect feature-to-feature transitions

In a structure mimicking human long term declarative memory, implement cause-effect transitions, each with a single feature cause and a single feature effect. Each also has an expected reward associated with it. The transitions are learned based on current and recent feature activities. The expected reward is learned based on current and near future rewards.

Each transition will also have a priming value associated with it. Priming serves both as a prediction and a planning mechanism. The priming will be a function of current and recent feature activities.
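A hedged sketch of the idea (not becca's actual data structures): a table keyed by (cause, effect) feature pairs, with counts that stand in for priming and a running estimate of the expected reward.

```python
transitions = {}  # (cause, effect) -> {"count": int, "reward": float}


def observe(prev_feature, curr_feature, reward, learning_rate=0.1):
    """Record a cause-effect transition and update its expected reward."""
    entry = transitions.setdefault((prev_feature, curr_feature),
                                   {"count": 0, "reward": 0.0})
    entry["count"] += 1
    # Running estimate of the reward that follows this transition.
    entry["reward"] += learning_rate * (reward - entry["reward"])


def priming(current_feature):
    """Predicted next features: effects of transitions whose cause
    matches the currently active feature."""
    return {effect: entry["count"]
            for (cause, effect), entry in transitions.items()
            if cause == current_feature}
```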

Improve first-touch demo

For a user's first experience with becca the workflow is

pip install becca
python3
>>> import becca_test.test as test
>>> test.suite()

which runs the test suite.

This could be improved in several ways:

  1. Have a more interesting world run, such as a simulated robot.
  2. Have a demo() method available through the becca package, rather than having to import becca_test
  3. Have more verbose console feedback about what's going on, to give the user a sense of what's happening behind the scenes.
  4. Have any output images saved to an easy-to-reach directory, perhaps the pwd() or pwd()/demo.
  5. Have the world create its own static visualizations to illustrate what is happening.
  6. [Gold medal, over-the-top] Have the world serve its own live animation of what is happening.

Finish the visualization

Visualization is currently working well enough to run, but it's still buggy and incomplete. Here is my todo list:

  • Autoscale curiosity and reward values to make them more visible.
  • Verify goal collection.
  • Verify decay and reset of goals.
  • Verify goal value.
  • Show selected goal.
  • Anything else to show from actor?
  • Break actor into a separate frame.
  • Show goals pass back through input filter, and through featurizer.

Meta parameters for attention and goal selection

In a mechanism mimicking human emotion, implement meta parameters for attention and goal selection.

These include:

  • arousal, or fight/flight
  • tendency to look ahead vs. act immediately
  • willingness to explore vs. play it safe
  • sadness? happiness? (recent reward history)
  • anger/fear/anxiety? tendency to act quickly (when a big punishment is expected)
  • anticipation/excitement? tendency to strive (when a big reward is expected)

Limit curiosity to valid features

Increase early learning efficiency by not developing curiosity about unpopulated features.

Perhaps set the number of times tried high and reset when a feature is populated. Or perhaps set a conditional and check whether features are in range.
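The conditional option might look like this hypothetical sketch: curiosity only accrues for features that have been populated.

```python
def update_curiosity(curiosity, populated, increment=0.01):
    """Accrue curiosity only for populated (valid) features; unpopulated
    features keep zero curiosity until they come into existence."""
    return [c + increment if p else c
            for c, p in zip(curiosity, populated)]
```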

Unbreak becca

Neither the master repository nor the version 9 tag are working right now. This is obviously unacceptable for public open source code. My current plan is to forge ahead with version 10 and get it working, then maintain both a working master and version 10 thereafter.

Represent time

BECCA's underlying representation of features doesn't represent time inherently, but its experience is embedded in time.

[becca_viz] Reverse the goals axis in the model visualization

In the visualizations, the inputs and commands in the left pane are shown with inputs on the left (low to high sensors) and commands on the right (low to high actions). This convention gets reversed in the model_viz panel. This task is to reverse the indexing on the goals axis to make these consistent.

Do everything a five year old child can do

Becca was created to be a general purpose learner. It's difficult to evaluate progress on this goal, but human children are the gold standard for general purpose learning. This feature request is incomplete in that it doesn't specify a quantitative comparison method.

Debug performance

Performance on the test worlds is still unsatisfactorily low. Use the brain visualization to debug each piece of the algorithm and make sure it's working well.

Hold off on writing How Becca Works (#32) until this is done.

No tester.py or benchmark.py work with current commits

c:\root\devel\becca\trunk>tester.py
Traceback (most recent call last):
  File "C:\root\devel\becca\trunk\tester.py", line 27, in <module>
    from becca_world_chase_ball.chase import World
ImportError: No module named becca_world_chase_ball.chase

c:\root\devel\becca\trunk>benchmark.py
Traceback (most recent call last):
  File "C:\root\devel\becca\trunk\benchmark.py", line 17, in <module>
    import tester
  File "C:\root\devel\becca\trunk\tester.py", line 27, in <module>
    from becca_world_chase_ball.chase import World
ImportError: No module named becca_world_chase_ball.chase

OS: Windows 7 64-bit, Python 2.7. It used to work before, but I think I had matt2000's release back then.

Becca test error on macos Sierra running Anaconda

Hi everybody,

I'd like to try Becca but can't get past the first test.
I'm on macOS Sierra 10.12.2 with Anaconda 4.2.0. The pip install works well, but I get the following error on import becca_test.test:

python
>>> import becca_test.test

Here is the output:

Python 3.5.2 |Anaconda 4.2.0 (x86_64)| (default, Jul  2 2016, 17:52:12)
[GCC 4.2.1 Compatible Apple LLVM 4.2 (clang-425.0.28)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import becca_test.test
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/DLU/anaconda/lib/python3.5/site-packages/becca_test/test.py", line 31, in <module>
    import becca.connector
  File "/Users/DLU/anaconda/lib/python3.5/site-packages/becca/connector.py", line 7, in <module>
    from becca.brain import Brain
  File "/Users/DLU/anaconda/lib/python3.5/site-packages/becca/brain.py", line 279
    except pickle.PickleError, err:
                             ^
SyntaxError: invalid syntax

Any ideas? Thanks in advance.
David
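The traceback points at Python 2-only except syntax in brain.py; in Python 3 the comma is replaced by "as". A minimal illustration of the difference:

```python
import pickle

# Python 3 form of the failing line: "except ExcType as err" replaces
# the Python 2 "except ExcType, err" syntax.
try:
    raise pickle.PickleError("demo")
except pickle.PickleError as err:
    message = 'Error unpickling world: {0}'.format(err)
```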

Write "How becca works"

Videos and/or posts on becca and its parts would be helpful. Due to big changes in the works, these should wait until after version 10 is out.

Cool App!

Hey Brandon,

I just downloaded and installed becca. This is awesome. Thanks for sharing it! I had to tweak a few things to get it working properly with Python 3.5. Here are my notes, which may help others in the same situation:

conda install numba if not already installed

In brain.py line 6 remove import pickle as CPickle
and change to import pickle

fixed a tabbing issue and syntax error on line 283 of brain.py
remove these lines:
except pickle.PickleError, err:
print('Error unpickling world: {0}'.format(err))
add these lines:
except pickle.PickleError:
print('Error unpickling world: ')

The line import becca_toolbox.feature_tools as ft fails. On line 18 of image_1D.py, comment out:
import becca_toolbox.feature_tools as ft

line 55 of tools.py change long to int (in 3.5 there is only int)
if isinstance(shape, (int, int)):

Refine deep learner

  • Keep training the watch and MNIST worlds.
  • Adjust parameters.
  • Clean, condense, and comment ziptie.
  • When performance is adequate, tag BECCA, watch, and MNIST.
  • Make and publish a video of features generated by each.
