usnistgov / optbayesexpt

Optimal Bayesian Experiment Design

Home Page: http://pages.nist.gov/optbayesexpt

License: Other

Python 100.00%
experimental-settings measurement-settings parametric-model decision-making experiment-design bayesian-inference utility-function

optbayesexpt's Introduction

OptBayesExpt Overview

R. D. McMichael [email protected]
National Institute of Standards and Technology
revision: April 24, 2024

What is it for?

Optimal Bayesian Experiment Design is for making smart setting choices in measurements. The optbayesexpt Python package is for cases with

  • a known parametric model, i.e. an equation that relates unknown parameters and experimental settings to measurement predictions. Fitting functions used in least-squares fitting are good examples of parametric models.
  • an experiment (possibly computational) that uses a set-measure-repeat sequence with opportunities to change settings between measurements.

The benefit of these methods is that they choose settings that have a good chance of making the parameter estimates more precise. This feature is very helpful in situations where the measurements are expensive.
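For concreteness, here is a minimal sketch of such a parametric model. It uses a Lorentzian peak, a typical fitting function, and the (settings, parameters, constants) call signature that the package's demo scripts use; the specific peak shape and values here are illustrative, not part of the package.

```python
import numpy as np

def lorentzian_model(settings, parameters, constants):
    """Predict a measurement value from settings and parameters.

    settings:   (frequency,) -- the experimental 'knob'
    parameters: (center, amplitude, linewidth) -- the unknowns to be learned
    constants:  (background,) -- quantities treated as known
    """
    frequency, = settings
    center, amplitude, linewidth = parameters
    background, = constants
    return background + amplitude / (1.0 + ((frequency - center) / linewidth) ** 2)

# Predicted signal at 2.6 GHz for a unit-amplitude peak centered at 2.5 GHz
y = lorentzian_model((2.6,), (2.5, 1.0, 0.1), (0.0,))
```

The same function that would be handed to a least-squares fitter serves as the model here; the difference is that it will be evaluated over many candidate settings and parameter samples rather than fit once.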

It is not primarily designed for analyzing existing data, but some of the code could be used for Bayesian inference of parameter values.

Note that Bayesian optimization addresses a different problem: finding a maximum or minimum of an unknown function.

What does it do?

It chooses measurement settings "live" based on accumulated data.

The sequential Bayesian experimental design algorithms play the role of an impatient experimenter who monitors data from a running experiment and changes the measurement settings in order to get better, more meaningful data. Note the two steps here. The first step, looking at the data, is really an act of extracting meaning from the numbers, learning something about the system from the existing measurements. The second step, a decision-making step, is using that knowledge to improve the measurement strategy.

In the "looking at the data" role, the method uses Bayesian inference to extract and update information about model parameters as new measurement data arrives. Then, in the "decision making" role, the methods use the updated parameter knowledge to select settings that have the best chance of refining the parameters.
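The two roles can be sketched with a self-contained toy loop. This is a simplified grid-based illustration of the algorithm in pure numpy, not the package's particle-filter implementation or its API; the peak model, noise level, and setting ranges are all made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

true_center = 2.53   # the "unknown" the loop tries to learn
sigma = 0.1          # measurement noise level
settings = np.linspace(2.0, 3.0, 101)            # candidate settings
centers = np.linspace(2.0, 3.0, 401)             # grid of parameter values
weights = np.ones_like(centers) / centers.size   # flat prior

def model(x, c):
    # Lorentzian peak with fixed width 0.1 and unit amplitude
    return 1.0 / (1.0 + ((x - c) / 0.1) ** 2)

for _ in range(30):
    # "Decision making": predict y for every (setting, parameter) pair and
    # pick the setting where parameter uncertainty changes the prediction most.
    y_pred = model(settings[:, None], centers[None, :])
    y_mean = y_pred @ weights
    y_var = ((y_pred - y_mean[:, None]) ** 2) @ weights
    x_best = settings[np.argmax(y_var)]

    # "Looking at the data": simulate a noisy measurement at that setting,
    # then do the Bayesian update with a Gaussian likelihood.
    y_meas = model(x_best, true_center) + rng.normal(0.0, sigma)
    likelihood = np.exp(-0.5 * ((y_meas - model(x_best, centers)) / sigma) ** 2)
    weights = weights * likelihood
    weights = weights / weights.sum()

estimate = centers @ weights   # posterior mean of the peak center
```

Each pass through the loop performs both steps: the variance-of-predictions utility implements the decision, and the likelihood-weighted update implements the inference.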

The most important role is the responsibility of the user. As delivered, OptBayesExpt is ignorant of the world, and it is the user's responsibility to describe the world in terms of a reliable model, reasonable parameters, and reasonable experimental settings. As with most computer programs, the "garbage in, garbage out" rule applies.

What's next?

Documentation is offered at this project's web page. The website includes a manual, a quick start guide, a gallery of demo programs, and the API documentation.

Legal stuff

Disclaimer

Certain commercial firms and trade names are identified in this document in order to specify the installation and usage procedures adequately. Such identification is not intended to imply recommendation or endorsement by the National Institute of Standards and Technology, nor is it intended to imply that related products are necessarily the best available for the purpose.

Terms of Use

This software was developed by employees of the National Institute of Standards and Technology (NIST), an agency of the Federal Government and is being made available as a public service. Pursuant to title 17 United States Code Section 105, works of NIST employees are not subject to copyright protection in the United States. This software may be subject to foreign copyright. Permission in the United States and in foreign countries, to the extent that NIST may hold copyright, to use, copy, modify, create derivative works, and distribute this software and its documentation without fee is hereby granted on a non-exclusive basis, provided that this notice and disclaimer of warranty appears in all copies.

THE SOFTWARE IS PROVIDED 'AS IS' WITHOUT ANY WARRANTY OF ANY KIND, EITHER EXPRESSED, IMPLIED, OR STATUTORY, INCLUDING, BUT NOT LIMITED TO, ANY WARRANTY THAT THE SOFTWARE WILL CONFORM TO SPECIFICATIONS, ANY IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, AND FREEDOM FROM INFRINGEMENT, AND ANY WARRANTY THAT THE DOCUMENTATION WILL CONFORM TO THE SOFTWARE, OR ANY WARRANTY THAT THE SOFTWARE WILL BE ERROR FREE. IN NO EVENT SHALL NIST BE LIABLE FOR ANY DAMAGES, INCLUDING, BUT NOT LIMITED TO, DIRECT, INDIRECT, SPECIAL OR CONSEQUENTIAL DAMAGES, ARISING OUT OF, RESULTING FROM, OR IN ANY WAY CONNECTED WITH THIS SOFTWARE, WHETHER OR NOT BASED UPON WARRANTY, CONTRACT, TORT, OR OTHERWISE, WHETHER OR NOT INJURY WAS SUSTAINED BY PERSONS OR PROPERTY OR OTHERWISE, AND WHETHER OR NOT LOSS WAS SUSTAINED FROM, OR AROSE OUT OF THE RESULTS OF, OR USE OF, THE SOFTWARE OR SERVICES PROVIDED HEREUNDER.

optbayesexpt's People

Contributors

rmcmichael-nist

optbayesexpt's Issues

Sockets on Linux

The demos/instrument_controller.py script runs well under Windows 10 (3 out of 3 machines tested). But on Linux machines there are error messages, etc. See below. Same result for 2 out of 2 machines tested, both Ubuntu.


rdm@Jetson-1:~/Downloads/optbayesexpt/demos/server$ python3 instrument_controller.py
3 runs, each measuring a randomly located peak and specified measurement settings

25 measurement settings between 2 GHz and 3 GHz. Notice the discrete settings
Traceback (most recent call last):
  File "instrument_controller.py", line 261, in <module>
    main()
  File "instrument_controller.py", line 52, in main
    measurement_run(2, 3, 25)
  File "instrument_controller.py", line 92, in measurement_run
    tcpcmd(message)
  File "instrument_controller.py", line 253, in tcpcmd
    connection = connect(default_ip_address, default_port)
  File "instrument_controller.py", line 182, in connect
    connection.connect((ip_address, port))
ConnectionRefusedError: [Errno 111] Connection refused
rdm@Jetson-1:~/Downloads/optbayesexpt/demos/server$
SERVER READY

rdm@Jetson-1:~/Downloads/optbayesexpt/demos/server$ python3 instrument_controller.py
3 runs, each measuring a randomly located peak and specified measurement settings

25 measurement settings between 2 GHz and 3 GHz. Notice the discrete settings
Traceback (most recent call last):
  File "/home/rdm/Downloads/optbayesexpt/demos/server/server_script.py", line 91, in <module>
    nanny = CustomServer(initial_args=args)
  File "/home/rdm/Downloads/optbayesexpt/demos/server/server_script.py", line 65, in __init__
    ip_address=ip_address, port=port)
  File "/usr/local/lib/python3.6/dist-packages/optbayesexpt-1.0.0-py3.6.egg/optbayesexpt/obe_server.py", line 47, in __init__
  File "/usr/local/lib/python3.6/dist-packages/optbayesexpt-1.0.0-py3.6.egg/optbayesexpt/obe_socket.py", line 57, in __init__
OSError: [Errno 98] Address already in use
"X" out to proceed.

Higher resolution now. 200 measurement settings between 2 GHz and 3 GHz.
"X" out to proceed.

A tricky one. 200 Measurement settings from 1 GHz to 4 GHz.
The measurements probably won't extend all the way from 1 to 4, though.
This is because the prior (see server script) only expects center values from 1.5 to 3.5
"X" out to proceed.
rdm@Jetson-1:~/Downloads/optbayesexpt/demos/server$
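Editor's note: "OSError: [Errno 98] Address already in use" typically means a previous server instance is still running, or its socket is lingering in the kernel's TIME_WAIT state after a recent shutdown. A minimal sketch of the standard remedy, SO_REUSEADDR, is below. This is an illustration with the Python standard library only, not a patch to obe_socket.py, and the loopback address/OS-assigned port are assumptions for the example.

```python
from socket import socket, AF_INET, SOCK_STREAM, SOL_SOCKET, SO_REUSEADDR

server = socket(AF_INET, SOCK_STREAM)
# Allow the port to be rebound while an old socket lingers in TIME_WAIT,
# a common cause of "[Errno 98] Address already in use" on Linux.
server.setsockopt(SOL_SOCKET, SO_REUSEADDR, 1)
server.bind(('127.0.0.1', 0))   # port 0: let the OS pick a free port
port = server.getsockname()[1]
server.listen(1)
server.close()
```

A real server would bind a fixed, agreed-upon port rather than port 0; the key line is the setsockopt() call before bind().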

Error when running "py.test" during installation

Hi! I am new to this package, and I am experiencing some issues installing it on my computer (running Ubuntu 18.04.5 LTS) when I try to run the py.test command in the terminal. I receive the following error message from pytest:

======================================== test session starts ========================================
platform linux -- Python 3.8.5, pytest-6.1.1, py-1.9.0, pluggy-0.13.1
rootdir: /home/sam/Dropbox/ETH Quantum Engineering/Semester 3/Semester Project/OptBayesExpt
collected 26 items                                                                                  

tests/test_optbayesexpt.py .....                                                              [ 19%]
tests/test_particlepdf.py ........                                                            [ 50%]
tests/test_server.py ....F                                                                    [ 69%]
tests/test_socket.py ....                                                                     [ 84%]
tests/test_utils.py ....                                                                      [100%]

============================================= FAILURES ==============================================
___________________________________________ test_run_pdf ____________________________________________

    def test_run_pdf():
        # on a non-default port
        # start the script
        cwd = os.getcwd()
        server_script = os.path.join(cwd, "tests\\server_script_61984.py")
        server_pipe = Popen(['python', server_script], cwd=cwd)
    
        sock = Socket('client', port=61984)
    
        right_mean = [1.5, 2.5]
>       reply = sock.tcpcmd({'command': 'getmean'})

tests/test_server.py:150: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
/home/sam/anaconda3/envs/semproj/lib/python3.8/site-packages/optbayesexpt/obe_socket.py:154: in tcpcmd
    self.send(command)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <optbayesexpt.obe_socket.Socket object at 0x7f39e893e8e0>, contents = {'command': 'getmean'}

    def send(self, contents):
        """
        Formats and sends a message
    
        This method formats the :code:`contents` argument into the message
        format, opens a connection and sends the :code:`contents` as a message.
    
        Args:
            contents: Any JSON format-able object. Briefly, python's
                :obj:`str`, :obj:`int`, :obj:`float`, :obj:`list`,
                :obj:`tuple`, and :obj:`dict` objects.
    
        Important:
            json.dumps() is not able to format numpy arrays.  To send numpy
            arrays, the :code:`numpy.tolist()` method is a convenient way to
            list-ify a numpy array.  For example::
    
                mySocket.send(myarray.tolist())
        """
    
        if self.role == 'client':
            self.connection = socket(AF_INET, SOCK_STREAM)
>           self.connection.connect((self.ip_address, self.port))
E           ConnectionRefusedError: [Errno 111] Connection refused

/home/sam/anaconda3/envs/semproj/lib/python3.8/site-packages/optbayesexpt/obe_socket.py:86: ConnectionRefusedError
------------------------------------- Captured stderr teardown --------------------------------------
python: can't open file '/home/sam/Dropbox/ETH Quantum Engineering/Semester 3/Semester Project/OptBayesExpt/tests\server_script_61984.py': [Errno 2] No such file or directory
========================================= warnings summary ==========================================
tests/test_server.py::test_make_obe
  /home/sam/anaconda3/envs/semproj/lib/python3.8/site-packages/numpy/testing/_private/utils.py:703: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray
    x = array(x, copy=False, subok=True)

tests/test_server.py::test_make_obe
  /home/sam/anaconda3/envs/semproj/lib/python3.8/site-packages/numpy/testing/_private/utils.py:704: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray
    y = array(y, copy=False, subok=True)

-- Docs: https://docs.pytest.org/en/stable/warnings.html
====================================== short test summary info ======================================
FAILED tests/test_server.py::test_run_pdf - ConnectionRefusedError: [Errno 111] Connection refused
============================= 1 failed, 25 passed, 2 warnings in 0.30s ==============================

Any help would be much appreciated. Thanks in advance!

User-specified resampling methods

I'm creating this issue to discuss user-specified resamplers. This would be useful for specifying different resamplers for different applications, for example when the posterior is multimodal. It would also help to optimize the resampling methods separately from the OBE classes in order to improve performance.

Thoughts?
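Editor's note: one common choice such a user-specified hook might accept is systematic resampling. The sketch below is a generic implementation in numpy, not tied to ParticlePDF's internals; the (n_dims, n_particles) array layout and the function signature are assumptions for the example.

```python
import numpy as np

def systematic_resample(particles, weights, rng):
    """Systematic resampling: one uniform draw, evenly spaced thresholds.

    particles: (n_dims, n_particles) array of parameter samples
    weights:   (n_particles,) normalized particle weights
    Returns equally weighted particles drawn according to `weights`.
    """
    n = weights.size
    positions = (rng.uniform() + np.arange(n)) / n
    cumulative = np.cumsum(weights)
    cumulative[-1] = 1.0   # guard against floating-point round-off
    indices = np.searchsorted(cumulative, positions)
    return particles[:, indices], np.full(n, 1.0 / n)

rng = np.random.default_rng(1)
particles = rng.normal(size=(2, 1000))
weights = rng.uniform(size=1000)
weights = weights / weights.sum()
resampled, new_weights = systematic_resample(particles, weights, rng)
```

Because the whole resampler reduces to one function of (particles, weights, rng), swapping in a multinomial, residual, or multimodal-aware variant would only require passing a different callable.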

Probabilities containing NaNs when choosing n_samples too small

I'm experiencing an error whose root cause I'm having trouble finding. Apparently some probabilities contain NaN values when the number of samples for the prior(s) (n_samples) is chosen too small. I found that the chance of getting this error increases as n_samples decreases. Please find the error below.

/home/sam/anaconda3/envs/semproj/lib/python3.8/site-packages/optbayesexpt/particlepdf.py:167: RuntimeWarning: invalid value encountered in true_divide
  self.particle_weights = temp / np.sum(temp)
Traceback (most recent call last):
  File "benchmark_protocol_2.8.py", line 468, in <module>
    unknown_res.fit("Unknown resonance", "Simple Bayesian", unknown_res.SimpleBayesian, 
  File "benchmark_protocol_2.8.py", line 163, in fit
    self.save_full_metrics(*protocol_function(sample, *exp_parameters))
  File "benchmark_protocol_2.8.py", line 391, in SimpleBayesian
    single_probe_freq = obe.opt_setting()
  File "/home/sam/anaconda3/envs/semproj/lib/python3.8/site-packages/optbayesexpt/obe_base.py", line 387, in opt_setting
    utility = self.utility()
  File "/home/sam/anaconda3/envs/semproj/lib/python3.8/site-packages/optbayesexpt/obe_base.py", line 364, in utility
    var_p = self.yvar_from_parameter_draws()
  File "/home/sam/anaconda3/envs/semproj/lib/python3.8/site-packages/optbayesexpt/obe_base.py", line 299, in yvar_from_parameter_draws
    paramsets = self.randdraw(self.N_DRAWS).T
  File "/home/sam/anaconda3/envs/semproj/lib/python3.8/site-packages/optbayesexpt/particlepdf.py", line 250, in randdraw
    indices = self.rng.choice(self.n_particles, size=n_draws,\
  File "_generator.pyx", line 644, in numpy.random._generator.Generator.choice
ValueError: probabilities contain NaN

Any help is much appreciated. Thank you in advance.
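Editor's note: the RuntimeWarning at the top of the traceback suggests what is happening. If every particle's likelihood underflows to zero (easy when few particles land near the data), normalizing divides zero by zero and the weights become NaN, which then reaches Generator.choice. A minimal reproduction of the symptom and one possible guard are sketched below; this is an illustration, and the appropriate fix inside particlepdf.py may differ.

```python
import numpy as np

# All-zero likelihoods reproduce the "invalid value encountered in
# true_divide" warning: normalizing gives 0/0 = NaN weights.
likelihood = np.zeros(10)
with np.errstate(invalid='ignore'):
    bad_weights = likelihood / np.sum(likelihood)

# One possible guard: fall back to uniform weights when the total is zero.
total = np.sum(likelihood)
if total > 0:
    weights = likelihood / total
else:
    weights = np.full(likelihood.size, 1.0 / likelihood.size)
```

This also explains the dependence on n_samples: with fewer particles, the chance that none of them has appreciable likelihood for a given measurement grows.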

Help setting up a manual experiment

Hi there! I was recently directed towards this package, and it looks really lovely! I'm hoping to use it to help optimize a micromachining/drilling application: drilling small holes in glass. The parameter space to explore is pretty large, and the cost to evaluate each sample is very high (set up the toolpath manually, run the CNC machine, photograph the results, and quantify the hole quality in ImageJ). I was hoping to use this package to help explore the parameter space a bit more intelligently than the usual brute-force method.

I've attempted to modify the "server" demo but think I'm fundamentally misunderstanding how the evaluation model and settings/parameters interact.

There are three experimental settings to adjust: spindle RPM, how quickly the machine moves (feed rate) and a "pecking" depth when drilling

# RPM 10,000 - 30,000
rpms = np.linspace(10000, 30000, 100)

# feed rates 0.01 - 1.0 inches/s
feedrates = np.linspace(0.01, 1.0, 100)

# pecking depth 0 - 0.002"
pecking = np.linspace(0, 0.002, 100)

There are two parameters which are recorded from each test: the "chipout" area around each hole, and the material removal rate (MRR). From the docs, it seems I should set the priors according to the expected distribution of values:

# Generate a *prior* distribution of parameters
n_samples = 1000

#chipout area ~100k - 400k pixels
chipoutVals = np.random.uniform(100000, 400000, n_samples)

# mrr ~0-100 mm^3/s
mrrVals = np.random.uniform(0, 100, n_samples)


settings = (rpms, feedrates, pecking,)
parameters = (chipoutVals, mrrVals, )
constants = ()

The goal is to minimize chipout area and maximize MRR. The minimum hole size with no chipout is ~100k pixels, so I combine the two in a fairly crude fashion as a first test:

def drilling_model(settings, parameters, constants):
    rpm, feedrate, pecking, = settings
    chipout, mrr = parameters

    # some debug to see what's going on
    print("settings", rpm, feedrate, pecking)
    print("parameters", chipout, mrr)

    chipout = min(100000, chipout)
    return ((100000/chipout)*50) + mrr

And then over in the client, we ask for a new setting, have the user enter the measured chipout area and MRR and send back to the server:

    # get a new recommended setting
    f_setting = tcpcmd({'command': 'goodset', 'pickiness': 6})
    print(f_setting)

    print("Enter Chipout Area:")
    chipout = float(input())

    print("Enter MRR:")
    mrr = float(input())

    # report measurement results to the OBE_Server
    tcpcmd({'command': 'newdat', 'x': (f_setting[0], f_setting[1], f_setting[2]), 'y': (chipout, mrr), 's':0.005})

At least in theory :) The server errors out because it appears the parameters provided to the evaluation function are actually the priors that I defined earlier, instead of the chipout/mrr variables the client is sending in y. I might just have a syntax issue, but I suspect I'm misunderstanding a critical part of how this is all supposed to work. I tweaked things for a while and re-read the docs, but I'm afraid I haven't figured out what I'm doing wrong.

Any help would be greatly appreciated! Thanks! Attached are the full client/server files in case that's easier to see what I'm doing/doing wrong :)

server.zip
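Editor's note: the confusion described above seems to hinge on what the model function returns. In this framework the parameters are the unknowns being inferred, and the model maps (settings, parameters) to a prediction of what will be measured; the measured chipout/MRR values are data, supplied later through the update step, not parameters. A hedged sketch of that shape is below. The linear form and the a0..b2 coefficients are entirely made up to illustrate the structure, not a physically motivated drilling model.

```python
import numpy as np

def drilling_model(settings, parameters, constants):
    """Return *predicted* measurements (chipout, MRR) for given settings.

    The a/b coefficients are hypothetical unknowns of a toy linear model,
    chosen only to show the shape; they are what the priors should sample.
    """
    rpm, feedrate, pecking = settings
    a0, a1, a2, b0, b1, b2 = parameters
    chipout_pred = a0 + a1 * feedrate + a2 * pecking
    mrr_pred = b0 + b1 * rpm + b2 * feedrate
    return chipout_pred, mrr_pred

# Priors would be samples of a0..b2; the measured chipout/MRR then enter
# as the 'y' data in the 'newdat' update, matching this model's output.
prediction = drilling_model((20000, 0.5, 0.001), (1e5, 2e5, 1e7, 0.0, 0.002, 10.0), ())
```

Under this reading, the priors in the post above (uniform samples of chipout and MRR themselves) would need to be replaced by priors over the coefficients of whatever predictive model is chosen.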
