algorithmsbooks / decisionmaking
Algorithms for Decision Making textbook
I tried to run Example 3.1 based on codes in Sec. 2 and Sec.3 as follows:
X = Variable(:x, 2); Y = Variable(:y, 2); Z = Variable(:z, 2)
# Table of X, Y (above) and Y, Z (below) at p.45
φ1 = Factor([X, Y], FactorTable((x=0, y=0) => 0.3, (x=0, y=1) => 0.4, (x=1, y=0) => 0.2, (x=1, y=1) => 0.1))
φ2 = Factor([Y, Z], FactorTable((y=0, z=0) => 0.2, (y=0, z=1) => 0.0, (y=1, z=0) => 0.3, (y=1, z=1) => 0.5))
and I got the result below after applying the Base.:* function:
using PrettyPrint
φ12 = φ1 * φ2;
pprint(φ12.table)
# output is
{
{:y : 0, :z : 1, :x : 0} : 0.0,
{:y : 0, :z : 2, :x : 0} : 0.0,
{:y : 1, :z : 1, :x : 1} : 0.05,
{:y : 1, :z : 2, :x : 0} : 0.0,
{:y : 1, :z : 2, :x : 1} : 0.0,
{:y : 0, :z : 1, :x : 1} : 0.0,
{:y : 0, :z : 2, :x : 1} : 0.0,
{:y : 1, :z : 1, :x : 0} : 0.2,
}
This result did not match what I expected; the result table on p. 45 over X, Y, Z can be obtained by replacing 0 -> 1, 1 -> 2, and 2 -> 3 in our definition. After checking the code of assignments in Algorithm 2.1, I'm thinking that this result comes from the 1:v.m in
function assignments(vars::AbstractVector{Variable})
names = [var.name for var in vars]
return vec([Assignment(n=>v for (n,v) in zip(names, values)) for values in product((1:v.m for v in vars)...)])
end
So, some parts of the examples are a bit confusing for me. I know that values in Julia are usually assumed to be 1-based, but for probabilities we often use {0, 1}, and I'm sorry I have no answer on this point. However, some remarks or comments would be useful for learning deeply from the textbook with Julia.
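For what it's worth, here is a minimal self-contained sketch (a stripped-down Variable plus the assignments function from Algorithm 2.1; the Assignment alias is my assumption, matching the behavior described above) confirming that the enumeration is 1-based, so 0-based keys like (x=0, y=0) never appear:

```julia
using Base.Iterators: product

# Stripped-down stand-in for the book's Variable type.
struct Variable
    name::Symbol
    m::Int  # number of possible values
end
const Assignment = Dict{Symbol,Int}  # assumption about this draft's alias

# The logic of Algorithm 2.1: values are enumerated from 1:v.m, not 0:v.m-1.
function assignments(vars::AbstractVector{Variable})
    names = [var.name for var in vars]
    return vec([Assignment(n => v for (n, v) in zip(names, values))
                for values in product((1:v.m for v in vars)...)])
end

X = Variable(:x, 2); Y = Variable(:y, 2)
as = assignments([X, Y])
# every enumerated value lies in 1:2, so 0-based keys fall back to weight 0.0
```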
In Example 2.8, to determine whether D and B are conditionally independent given F, we check the two paths between them, which have inverted forks. It is stated that "Since F is a descendant of C, there is d-separation along the inverted fork", but this is the opposite of what is prescribed in Section 2.6: namely, because F is a descendant of the fork, there is not d-separation.
Page 13, section 1.5:
improving the safety of aircraft.
Shouldn't it be aircrafts?
I'd suggest the setup of a low-traffic newsletter to notify readers about new releases of the book with the "changelog" of each release. At the same time it's a useful tool to notify the release of the final (print) version.
My two cents
I've been studying Bayesian nets for a while now (I'm very much a beginner here) and something that's never quite clear is: what does an edge actually mean? Why is there an arrow? In 2.5:
Directed edges connect pairs of nodes, with cycles in the graph being prohibited. The arrows
indicate direct probabilistic relationships. Associated with each node Xi is a
conditional distribution P(Xi | Pa( Xi )), where Pa( Xi ) represents the parents of
So, what exactly is that "direct probabilistic relationship"? Causation, correlation, conditional dependence?
Moreover, in 2.6, it is said:
The reason that a Bayesian network can represent a joint distribution with fewer independent parameters than would normally be required is due to the conditional independence assumptions encoded in its graphical structure.
However, conditional independence is not represented in a BN (that is, there is no graphical representation of it); it is deduced from the BN (via d-separation, if I understand correctly). Worse, what is represented with edges is conditional dependence, not independence. So it's hard to understand.
So, unless I simply have no clue about what's going on (which is quite possible), I think the explanation could make things easier if it distinguished between what is represented visually and what meaning it has.
In policy_search.tex you create the MvNormal with mean [0, 1] and covariance I, but then redefine the variable p with:
p = MvNormal([mean(xdomain), mean(ydomain)], Matrix{Float64}(2I, 2, 2))
So the caption in the cross-entropy method figure (10.5) needs to be updated to 𝒩([0, 3], 2I).
Hi guys and thanks for posting your book!
Economists (like myself) solve MDPs all the time and have even created some toolkits: VFIToolkit.m, QuantEcon.jl, etc.
Unfortunately, our toolkits and learning materials are segmented from those of others who also solve these types of problems.
It would be awesome if economists, engineers etc all cooperated on the same packages.
Would it be possible to include a very simple, routine economics problem?
Perhaps the one I posted on POMDPs.jl?
The concept is hard to visualize without a diagram.
In equation 9.7, you have a single n - I think it should be an m
Section 2.1 Degrees of Belief and Probability
The three sentences below might be reorganized: first say what A and B are, then show how to represent A ≻ B and A ∼ B. The flow below feels a bit out of sync.
We would like to be able to represent, for example, that the plausibility of a proposition A is greater than the plausibility of another proposition B. If A represents ‘‘I will wear a coat’’, and B represents ‘‘it will rain’’, then we would write A ≻ B. If we believe A and B are equally plausible, then we write A ∼ B.
Section 2.5: It is not clear what the satellite-monitoring problem is. It seems to be a well-known problem that the authors are referring to; in that case, a reference to where the problem is discussed elsewhere should be provided.
Alternatively, it may help to explain Bayesian networks through the satellite-monitoring example. The mention of a DAG and the function Pa(x), or parent, is a bit too abstract to conceive on its own; it may help to establish them with an example and then discuss the related math.
Any future plans to support EPUB or MOBI so users can read on e-readers?
Currently we can use the Calibre software, but the conversion doesn't come out properly because the book contains code, not just text.
Thanks.
From [2021-01-05 15:52:36-08:00, revision 2f3d6e6]
I am reading Example 7.4, "An example solving a finite-horizon MDP with a linear transition function and quadratic reward." I could not understand the shapes of the ellipses.
The legend says "The blue contour lines show the Gaussian distributions over the state at each iteration". The text says "w is drawn from a zero-mean multivariate Gaussian distribution with covariance 0.1I."
Since the covariance matrix is proportional to the identity matrix, I would have expected a circular shape. It is the case for the first step, but then it seems that the covariance matrix is not diagonal and its [0, 0] and [1, 1] elements are not even equal.
Maybe I have misunderstood something here, but I think you could add an explanation of these shapes and their evolution in the legend.
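If it helps, my current guess is that the ellipses come from propagating the covariance through the linear dynamics. A small sketch under assumptions (the matrix Ts below is hypothetical, not the one from the example): if s′ = Ts·s + w with w ~ 𝒩(0, 0.1I), then the state covariance updates as Σ′ = Ts Σ Tsᵀ + 0.1I, which is generally not proportional to I even when Σ is:

```julia
using LinearAlgebra

Ts = [1.0 1.0; 0.0 1.0]        # hypothetical linear dynamics matrix
Σ  = 0.1 * Matrix(1.0I, 2, 2)  # circular contour at the first step
Σ′ = Ts * Σ * Ts' + 0.1I       # covariance after one step: Σ′ = TsΣTsᵀ + 0.1I
# Σ′ ≈ [0.3 0.1; 0.1 0.2]: off-diagonals appear and the diagonal entries
# differ, so the contour becomes a tilted ellipse rather than a circle
```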
The fourth line in the code example should be
vars = [B, S, E, C, D]
not
vars = [B, S, E, D]
Hey guys, the book looks great :)
In Figure 22.2 it might be neat to include indications of the value estimate, Q(h, a) at each action node, perhaps with color. Then it will be easy to see that the branches with higher value are favored.
It may also be useful to indicate the number of times a branch has been simulated with the thickness of the line.
I'm not sure what the aesthetic consequences of doing this are though :)
I didn't look but I assume there is also a figure in the MDP section that could benefit similarly.
Last line of section 1.4.4:
Algorithms that combine both symbolic and connectionist paradigms remains an active area of research today.
Should remains not be remain?
I sent the PDF to my local Officeworks in Brisbane, Australia to get a hard copy.
The job failed more than once after page 28, according to the staff there.
They were not able to offer more technical details, but I imagine there might be recursion or buffer limits in PDF streams.
Note that Letter paper is not generally available in Australia, so it was being resized for A4.
You should add a mention of NamedTupleTools in the figure text for algorithm 2.1 on page 26.
Further, it's not clear why the functions variablenames and assignments are present, nor where they are used in the chapter.
Aside from that, thanks for the book :)
I'm grateful for a book that explains decision-making in detail, but I'm curious why the first example code in the book doesn't run. I use Julia 1.5.3; which version is suitable for this book?
the code is
...
X = Variable(:x, 2)
Y = Variable(:y, 2)
Z = Variable(:z, 2)
φ = Factor([X, Y, Z], FactorTable(
(x=1, y=1, z=1) => 0.08,
(x=1, y=1, z=2) => 0.31,
(x=1, y=2, z=1) => 0.09,
(x=1, y=2, z=2) => 0.37,
(x=2, y=1, z=1) => 0.01,
(x=2, y=1, z=2) => 0.05,
(x=2, y=2, z=1) => 0.02,
(x=2, y=2, z=2) => 0.07,
))
The output is
ERROR: LoadError: MethodError: Cannot `convert` an object of type NamedTuple{(:x, :y, :z),Tuple{Int64,Int64,Int64}} to an object of type Dict{Symbol,Int64}
Closest candidates are:
convert(::Type{T}, ::T) where T<:AbstractDict at abstractdict.jl:515
convert(::Type{T}, ::AbstractDict) where T<:AbstractDict at abstractdict.jl:517
convert(::Type{T}, ::T) where T at essentials.jl:171
...
Stacktrace:
[1] setindex!(::Dict{Dict{Symbol,Int64},Float64}, ::Float64, ::NamedTuple{(:x, :y, :z),Tuple{Int64,Int64,Int64}}) at ./dict.jl:372
[2] Dict{Dict{Symbol,Int64},Float64}(::Pair{NamedTuple{(:x, :y, :z),Tuple{Int64,Int64,Int64}},Float64}) at ./dict.jl:107
[3] top-level scope at /home/hanxj/code/julia/demo.jl:36
[4] include(::Function, ::Module, ::String) at ./Base.jl:380
[5] include(::Module, ::String) at ./Base.jl:368
[6] exec_options(::Base.JLOptions) at ./client.jl:296
[7] _start() at ./client.jl:506
in expression starting at /home/hanxj/code/julia/demo.jl:36
"...in order to navigate safety to our destination." Replace safety with safely
For the GradientQLearning example 17.3, add the following after the Q(θ,s,a) = ... line in the juliaverbatim block:
∇Q(θ,s,a) = β(s,a)
This is in model-uncertainty/model-free-methods.tex, line 933.
Hey guys, I have just spent quite some time trying to interpret the forward search algorithm 9.2, but it is still not quite clear to me how it is supposed to work. In particular, I don't understand the
U'(s) = forward_search(P, s, d-1, U).u
line. It seems like there should be an s' there and a for loop over states. It might make more sense to implement it more like Alg 4.6 in the DMU book - that one is really clear.
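For what it's worth, here is how I currently read that line, as a runnable toy (the MDP is made up: three states, moves ±1, reward equal to the current state, discount 0.9). U′ is defined as a closure over s and is only ever evaluated at the successor states inside the lookahead, which is where the recursion over s′ effectively happens:

```julia
struct ToyMDP
    S::UnitRange{Int}   # states
    A::Vector{Int}      # actions (moves)
    γ::Float64          # discount factor
end

# hypothetical deterministic dynamics and reward
step_state(P, s, a) = clamp(s + a, first(P.S), last(P.S))
reward(P, s, a) = float(s)

# one-step lookahead: U is a *function*, evaluated at the successor s′
lookahead(P, U, s, a) = reward(P, s, a) + P.γ * U(step_state(P, s, a))

function forward_search(P, s, d, U)
    if d ≤ 0
        return (a=nothing, u=U(s))
    end
    best = (a=nothing, u=-Inf)
    U′(s) = forward_search(P, s, d-1, U).u  # the line in question: a closure
    for a in P.A
        u = lookahead(P, U′, s, a)          # U′ is called at s′ in here
        if u > best.u
            best = (a=a, u=u)
        end
    end
    return best
end

P = ToyMDP(1:3, [-1, 1], 0.9)
res = forward_search(P, 2, 2, s -> 0.0)  # depth-2 search from state 2
```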
In Chapter 2, you say "Universal comparability and transitivity assumptions lead to an ability to represent plausibility by a real-valued function". It might not be immediately obvious why this is true (it is not immediately obvious to me, and I am teaching it tomorrow 😰). In DMU, you have a reference for that statement (so I can point students to it if they ask). I think it would be helpful to also have that reference in this book.
On page xxii (22), the link to github.com/cormullion/juliamono actually points to https://algorithmsbook.com/files/github.com/cormullion/juliamono, which doesn't exist.
Release the LaTeX source code.
Says "suppose we want to find out if A and B are d-separated by C", then talks about X,Y, and C (but not A or B) in the examples. I think A and B should be replaced by X and Y, right?
It worked for me like this:
φ = Factor([X, Y, Z], FactorTable(
Dict(:x=>1, :y=>1, :z=>1) => 0.08, Dict(:x=>1, :y=>1, :z=>2) => 0.31,
Dict(:x=>1, :y=>2, :z=>1) => 0.09, Dict(:x=>1, :y=>2, :z=>2) => 0.37,
Dict(:x=>2, :y=>2, :z=>1) => 0.02, Dict(:x=>2, :y=>2, :z=>2) => 0.07,
))
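For comparison, the NamedTuple syntax from the book's example also works if FactorTable is keyed by NamedTuples rather than Dicts; my guess (an assumption, not verified against the book's source) is that a later draft defines it that way:

```julia
# Assumption: keying the table by NamedTuple, so the book's
# (x=1, y=1, z=1) => 0.08 pair syntax works directly.
const FactorTable = Dict{NamedTuple,Float64}

ft = FactorTable(
    (x=1, y=1, z=1) => 0.08, (x=1, y=1, z=2) => 0.31,
    (x=1, y=2, z=1) => 0.09, (x=1, y=2, z=2) => 0.37,
)
```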
I definitely want to express my appreciation and gratitude to you guys for putting this out for free. This is great stuff! However, for me, nothing beats the UX of actually flipping through pages. I'd be happy to pay a good premium to do so. Fair enough if there aren't enough people like me to justify this, but I'd be curious to know if you guys have kicked the tires on this at all.
Last sentence is missing the word "to". It currently reads: In order represent a probability distribution,...
It should read: In order to represent a probability distribution,...
The fourth line in the code example should be
vars = [B, S, E, C, D]
not
vars = [B, S, E, D]
Figure 2.10 specifies that
" From this distribution, we might use the mean or mode of this distribution to impute the missing values."
The clarification in section 2.3.1 could have been written in section 2.2.1 as well.
For example,
θ6 = 1 - (θ1 +θ2 +...+θ5)
What is written is:
However, we need only five independent parameters to uniquely specify the distribution over the outcomes of the roll because we know that the distribution must sum to 1.
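The die example can be made concrete in a line or two (the five weights below are made up):

```julia
# Five independent parameters for a six-sided die; the sixth is implied by
# the constraint that the distribution must sum to 1.
θ = [0.1, 0.2, 0.15, 0.25, 0.1]  # hypothetical θ1..θ5
θ6 = 1 - sum(θ)                  # θ6 is not a free parameter
```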
In 9.9.1, you say "A common approximation to make U(a1:d) convex is to assume deterministic dynamics." Unless I am mistaken, the dynamics must not only be deterministic but also linear to make the problem convex. This is because the dynamics constraint is an equality constraint, and the only type of equality constraint that is convex is a linear constraint.
Also, later you say "sampling a random transition once for each state-action input pair" is a way to get a deterministic transition, but it is not clear to me how you would do this in practice if you are dealing with uncountably infinite state and action spaces - it does not seem like you could do this in a way that results in a convex problem - in fact it is not clear to me that this would result in a continuous objective function, so even nonlinear solvers could not handle it.
reading [2021-01-05 15:52:36-08:00, revision 2f3d6e6] in 7.5, value iteration, and 7.6, asynchronous value iteration.
Maybe I misunderstand, but it seems that the yellow-highlighted sections do exactly the same job, the rest being identical. One uses a list comprehension while the other uses a for loop, but both actually iterate through the whole state space. In GaussSeidelValueIteration, should backup not be applied only to a subset of the state space in each iteration?
PS: In addition, a small detail: in the GaussSeidelValueIteration algorithm, many variables are instantiated from the P::MDP (S, A, T, R, γ = P.S, P.A, P.T, P.R, P.γ), but only P.S is used. In ValueIteration and PolicyIteration, only what is needed is instantiated, which I think makes it easier to understand.
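To make my point concrete, here is a toy sweep (a made-up two-state chain: state 1 moves to state 2, which is absorbing with reward 1). Both variants visit every state; the only difference I can see is the in-place update, not a restriction to a subset of states:

```julia
# Hypothetical two-state chain for illustration.
T(s) = min(s + 1, 2)               # deterministic successor
R(s) = s == 2 ? 1.0 : 0.0          # reward
γ = 0.9                            # discount factor
backup(U, s) = R(s) + γ * U[T(s)]  # Bellman backup

# standard value iteration: fresh vector per sweep (list comprehension)
U = [0.0, 0.0]
Ubatch = [backup(U, s) for s in 1:2]   # state 1 still sees the old U[2] = 0.0

# Gauss-Seidel: in-place updates (for loop); states updated later in the
# sweep already see the new values
U = [0.0, 0.0]
for s in 2:-1:1
    U[s] = backup(U, s)                # state 1 now sees the updated U[2]
end
```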
I guess the wrong file has been uploaded for actor-critic methods (ch. 13).
In particular, I think referencing the chapters on heuristic search and some of the approximation algorithms would be super useful (maybe in the appendix?)
Mausam and Andrey Kolobov, "Planning with Markov Decision Processes: An AI Perspective," Morgan & Claypool Publishers (2012).
https://www.morganclaypool.com/doi/abs/10.2200/S00426ED1V01Y201206AIM017
it would be nice to have "learning curve" in the index, so I can tell my students to "plot a learning curve" for your RL algorithm and they could go look it up if they didn't know what it meant
I think adding an illustration of a simple 3-step Bayesian network to explain the notation of pa(x) would really help with clarity. I would also suggest introducing the probability expression before starting with the code.
Do you assume any specific versions of libraries (e.g., LightGraphs.jl) for running the code in this book?
This question arose when I read and ran the programs in Sec. 3.5.
As noted in Appendix G, the code in the PDF is compatible with Julia 1.5, but library versions are not mentioned.
The current LightGraphs.jl in my environment is v1.3.5, and the function name is topological_sort_by_dfs, but the LightGraphs.jl repository seems to be updating the function for the next version; I guess the next version will have the function topological_sort.
I suggest putting the versions of the assumed libraries in an appendix to increase reproducibility.
Thank you very much for putting the book on the web - this is absolutely amazing.
It seems algorithm 2.3 either has a bug or might require more explanation. For example, is it possible to compute from the text the probability that nothing is wrong - probability(bn, Assignment(b=1, s=1, e=1, d=1, c=1))?
Thank you very much,
-Simon
page 621
# formally there is no y here
julia> append!(x, [2, 3]) # append y to the end of x
in julia/base/bitarray.jl :
function append!(B::BitVector, items::BitVector)
reading [2021-01-05 15:52:36-08:00, revision 2f3d6e6]
page 48.
First of all, congratulations on your great job!!
Here is a small detail about the legend of this figure.
I am used to seeing density plots where the areas under the plots are 1.
If the area under the blue one is 1, then the areas under the individual red density plots must be much smaller than 1 (which is unusual to see!).
True, you explain it well: "we plot the density of two components scaled by their weights". But in the legend, the red plots are only said to be "components".
To avoid confusion, I would suggest defining the red legend as "scaled components".
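The arithmetic behind the suggestion, with made-up weights: each unscaled component density integrates to 1, so after scaling by the weights the red areas equal the weights themselves, and only their sum is 1:

```julia
w = [0.3, 0.7]                       # hypothetical mixture weights
component_areas = [1.0, 1.0]         # each unscaled density integrates to 1
scaled_areas = w .* component_areas  # areas of the red "scaled components"
total = sum(scaled_areas)            # area under the blue mixture curve
```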
Wouldn't it be great to upload at least some parts of the appendices, such as the Problems part? The reader has no sense of the hex world that is used extensively in the book. The same goes for the Mathematical Concepts appendix for contraction mappings.