algorithmsbooks / decisionmaking
Algorithms for Decision Making textbook
I tried to run Example 3.1 based on codes in Sec. 2 and Sec.3 as follows:
X = Variable(:x, 2); Y = Variable(:y, 2); Z = Variable(:z, 2)
# Table of X, Y (above) and Y, Z (below) at p.45
φ1 = Factor([X, Y], FactorTable((x=0, y=0) => 0.3, (x=0, y=1) => 0.4, (x=1, y=0) => 0.2, (x=1, y=1) => 0.1))
φ2 = Factor([Y, Z], FactorTable((y=0, z=0) => 0.2, (y=0, z=1) => 0.0, (y=1, z=0) => 0.3, (y=1, z=1) => 0.5))
and I got the result below after applying the Base.:* function:
using PrettyPrint
φ12 = φ1 * φ2;
pprint(φ12.table)
# output is
{
{:y : 0, :z : 1, :x : 0} : 0.0,
{:y : 0, :z : 2, :x : 0} : 0.0,
{:y : 1, :z : 1, :x : 1} : 0.05,
{:y : 1, :z : 2, :x : 0} : 0.0,
{:y : 1, :z : 2, :x : 1} : 0.0,
{:y : 0, :z : 1, :x : 1} : 0.0,
{:y : 0, :z : 2, :x : 1} : 0.0,
{:y : 1, :z : 1, :x : 0} : 0.2,
}
This result did not match what I expected; the result table on p. 45 over X, Y, Z can be obtained by replacing 0 -> 1, 1 -> 2, and 2 -> 3 in our definition. After checking the code of assignments in Algorithm 2.1, I'm thinking that this result comes from the 1:v.m in
function assignments(vars::AbstractVector{Variable})
names = [var.name for var in vars]
return vec([Assignment(n=>v for (n,v) in zip(names, values)) for values in product((1:v.m for v in vars)...)])
end
So, some parts of the examples are a bit confusing for me. I know that values in Julia are usually assumed to be 1-based, but for probabilities we often use {0, 1}, and I'm sorry I have no answer on this point. However, some remarks or comments would be useful for learning deeply from the textbook with Julia.
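For what it's worth, here is a minimal self-contained sketch (a stripped-down Variable plus the assignments function from Algorithm 2.1; the Assignment alias is my assumption, matching the behavior described above) confirming that the enumeration is 1-based, so 0-based keys like (x=0, y=0) never appear:

```julia
using Base.Iterators: product

# Stripped-down stand-in for the book's Variable type.
struct Variable
    name::Symbol
    m::Int  # number of possible values
end
const Assignment = Dict{Symbol,Int}  # assumption about this draft's alias

# The logic of Algorithm 2.1: values are enumerated from 1:v.m, not 0:v.m-1.
function assignments(vars::AbstractVector{Variable})
    names = [var.name for var in vars]
    return vec([Assignment(n => v for (n, v) in zip(names, values))
                for values in product((1:v.m for v in vars)...)])
end

X = Variable(:x, 2); Y = Variable(:y, 2)
as = assignments([X, Y])
# every enumerated value lies in 1:2, so 0-based keys fall back to weight 0.0
```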
In Example 2.8, to determine whether D and B are conditionally independent given F, we check the two paths between them, which have inverted forks. It is stated that "Since F is a descendant of C, there is d-separation along the inverted fork", but this is the opposite of what is prescribed in Section 2.6: namely, because F is a descendant of the fork, there is not d-separation.
Page 13, section 1.5:
improving the safety of aircraft.
Shouldn't it be aircrafts?
I'd suggest the setup of a low-traffic newsletter to notify readers about new releases of the book with the "changelog" of each release. At the same time it's a useful tool to notify the release of the final (print) version.
My two cents
I've been studying Bayesian nets for a while now (I'm very much a beginner here) and something that's never quite clear is: what does an edge actually mean? Why is there an arrow? In 2.5:
Directed edges connect pairs of nodes, with cycles in the graph being prohibited. The arrows
indicate direct probabilistic relationships. Associated with each node Xi is a
conditional distribution P(Xi | Pa( Xi )), where Pa( Xi ) represents the parents of
So, what exactly is that "direct probabilistic relationship"? Causation, correlation, conditional dependence?
Moreover, in 2.6, it is said:
The reason that a Bayesian network can represent a joint distribution with fewer independent parameters than would normally be required is due to the conditional independence assumptions encoded in its graphical structure.
However, conditional independence is not represented in a BN (that is, there is no graphical representation of it); it is deduced from the BN (via d-separation, if I understand correctly). Worse, what is represented with edges is conditional dependence, not independence. So it's hard to understand.
So, unless I simply have no clue about what's going on (which is quite possible), I think the explanation could make things easier if it distinguished between what is represented visually and what meaning it has.
In policy_search.tex you create the MvNormal with mean [0, 1] and covariance I, but then redefine the variable p with:
p = MvNormal([mean(xdomain), mean(ydomain)], Matrix{Float64}(2I, 2, 2))
So the caption in the cross-entropy method figure (10.5) needs to be updated to 𝒩([0, 3], 2I).
Hi guys and thanks for posting your book!
Economists (like myself) solve MDPs all the time and have even created some toolkits: VFIToolkit.m, QuantEcon.jl, etc.
Unfortunately, our toolkits and learning materials are segmented from those of others who also solve these types of problems.
It would be awesome if economists, engineers etc all cooperated on the same packages.
Would it be possible to include a very simple, routine economics problem?
Perhaps the one I posted on POMDPs.jl?
The concept is hard to visualize without a diagram.
In equation 9.7, you have a single n - I think it should be an m
Section 2.1 Degrees of Belief and Probability
The three sentences below might be reorganized: first say what A and B are, then show how to represent A ≻ B and A ∼ B. The flow below feels a bit out of sync.
We would like to be able to represent, for example, that the plausibility of a proposition A is greater than the plausibility of another proposition B. If A represents ‘‘I will wear a coat’’, and B represents ‘‘it will rain’’, then we would write A ≻ B. If we believe A and B are equally plausible, then we write A ∼ B.
Section 2.5: It is not clear what the satellite-monitoring problem is. It seems to be a well-known problem that the authors are referring to; in that case, a reference to where the problem is discussed elsewhere should be provided.
Alternatively, it may help to explain Bayesian networks through the satellite-monitoring example. The mention of a DAG and the function Pa(x), or parent, is a bit too abstract to conceive on its own; it may help to establish them with an example and then discuss the related math.
Any future plans to support EPUB or MOBI so users can read on e-readers?
Currently we can use the Calibre software, but the conversion doesn't come out properly because the book contains code, not just text.
Thanks.
From [2021-01-05 15:52:36-08:00, revision 2f3d6e6]
I am reading Example 7.4, "An example solving a finite-horizon MDP with a linear transition function and quadratic reward." I could not understand the shapes of the ellipses.
The legend says "The blue contour lines show the Gaussian distributions over the state at each iteration". The text says "w is drawn from a zero-mean multivariate Gaussian distribution with covariance 0.1I."
Since the covariance matrix is proportional to the identity matrix, I would have expected a circular shape. It is the case for the first step, but then it seems that the covariance matrix is not diagonal and its [0, 0] and [1, 1] elements are not even equal.
Maybe I have misunderstood something here, but I think you could add an explanation of these shapes and their evolution in the legend.
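If it helps, my current guess is that the ellipses come from propagating the covariance through the linear dynamics. A small sketch under assumptions (the matrix Ts below is hypothetical, not the one from the example): if s′ = Ts·s + w with w ~ 𝒩(0, 0.1I), then the state covariance updates as Σ′ = Ts Σ Tsᵀ + 0.1I, which is generally not proportional to I even when Σ is:

```julia
using LinearAlgebra

Ts = [1.0 1.0; 0.0 1.0]        # hypothetical linear dynamics matrix
Σ  = 0.1 * Matrix(1.0I, 2, 2)  # circular contour at the first step
Σ′ = Ts * Σ * Ts' + 0.1I       # covariance after one step: Σ′ = TsΣTsᵀ + 0.1I
# Σ′ ≈ [0.3 0.1; 0.1 0.2]: off-diagonals appear and the diagonal entries
# differ, so the contour becomes a tilted ellipse rather than a circle
```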
The fourth line in the code example should be
vars = [B, S, E, C, D]
not
vars = [B, S, E, D]
Hey guys, the book looks great :)
In Figure 22.2 it might be neat to include indications of the value estimate, Q(h, a) at each action node, perhaps with color. Then it will be easy to see that the branches with higher value are favored.
It may also be useful to indicate the number of times a branch has been simulated with the thickness of the line.
I'm not sure what the aesthetic consequences of doing this are though :)
I didn't look but I assume there is also a figure in the MDP section that could benefit similarly.
Last line of section 1.4.4:
Algorithms that combine both symbolic and connectionist paradigms remains an active area of research today.
Should remains not be remain?
I sent the PDF to my local Officeworks in Brisbane, Australia to get a hard copy.
The job failed more than once after page 28, according to the staff there.
They were not able to offer more technical details, but I imagine there might be recursion or buffer limits in PDF streams.
Note that Letter paper is not generally available in Australia, so it was being resized for A4.
You should add a mention of NamedTupleTools in the figure text for algorithm 2.1 on page 26.
Further, it's not clear why the functions variablenames and assignments are present, nor where they are used in the chapter.
Aside from that, thanks for the book :)
I'm grateful for a book that explains decision-making in detail, but I'm curious why the first example code in the book doesn't run. I use Julia 1.5.3; which version is suitable for this book?
the code is
...
X = Variable(:x, 2)
Y = Variable(:y, 2)
Z = Variable(:z, 2)
φ = Factor([X, Y, Z], FactorTable(
(x=1, y=1, z=1) => 0.08,
(x=1, y=1, z=2) => 0.31,
(x=1, y=2, z=1) => 0.09,
(x=1, y=2, z=2) => 0.37,
(x=2, y=1, z=1) => 0.01,
(x=2, y=1, z=2) => 0.05,
(x=2, y=2, z=1) => 0.02,
(x=2, y=2, z=2) => 0.07,
))
The output is
ERROR: LoadError: MethodError: Cannot `convert` an object of type NamedTuple{(:x, :y, :z),Tuple{Int64,Int64,Int64}} to an object of type Dict{Symbol,Int64}
Closest candidates are:
convert(::Type{T}, ::T) where T<:AbstractDict at abstractdict.jl:515
convert(::Type{T}, ::AbstractDict) where T<:AbstractDict at abstractdict.jl:517
convert(::Type{T}, ::T) where T at essentials.jl:171
...
Stacktrace:
[1] setindex!(::Dict{Dict{Symbol,Int64},Float64}, ::Float64, ::NamedTuple{(:x, :y, :z),Tuple{Int64,Int64,Int64}}) at ./dict.jl:372
[2] Dict{Dict{Symbol,Int64},Float64}(::Pair{NamedTuple{(:x, :y, :z),Tuple{Int64,Int64,Int64}},Float64}) at ./dict.jl:107
[3] top-level scope at /home/hanxj/code/julia/demo.jl:36
[4] include(::Function, ::Module, ::String) at ./Base.jl:380
[5] include(::Module, ::String) at ./Base.jl:368
[6] exec_options(::Base.JLOptions) at ./client.jl:296
[7] _start() at ./client.jl:506
in expression starting at /home/hanxj/code/julia/demo.jl:36
"...in order to navigate safety to our destination." Replace safety with safely
For the GradientQLearning example 17.3, add the following after the Q(θ,s,a) = ... line in the juliaverbatim block:
∇Q(θ,s,a) = β(s,a)
This is in model-uncertainty/model-free-methods.tex, line 933.
Hey guys, I have just spent quite some time trying to interpret the forward search algorithm 9.2, but it is still not quite clear to me how it is supposed to work. In particular, I don't understand the
U'(s) = forward_search(P, s, d-1, U).u
line. It seems like there should be an s' there and a for loop over states. It might make more sense to implement it more like Alg 4.6 in the DMU book - that one is really clear.
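For what it's worth, here is how I currently read that line, as a runnable toy (the MDP is made up: three states, moves ±1, reward equal to the current state, discount 0.9). U′ is defined as a closure over s and is only ever evaluated at the successor states inside the lookahead, which is where the recursion over s′ effectively happens:

```julia
struct ToyMDP
    S::UnitRange{Int}   # states
    A::Vector{Int}      # actions (moves)
    γ::Float64          # discount factor
end

# hypothetical deterministic dynamics and reward
step_state(P, s, a) = clamp(s + a, first(P.S), last(P.S))
reward(P, s, a) = float(s)

# one-step lookahead: U is a *function*, evaluated at the successor s′
lookahead(P, U, s, a) = reward(P, s, a) + P.γ * U(step_state(P, s, a))

function forward_search(P, s, d, U)
    if d ≤ 0
        return (a=nothing, u=U(s))
    end
    best = (a=nothing, u=-Inf)
    U′(s) = forward_search(P, s, d-1, U).u  # the line in question: a closure
    for a in P.A
        u = lookahead(P, U′, s, a)          # U′ is called at s′ in here
        if u > best.u
            best = (a=a, u=u)
        end
    end
    return best
end

P = ToyMDP(1:3, [-1, 1], 0.9)
res = forward_search(P, 2, 2, s -> 0.0)  # depth-2 search from state 2
```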
In Chapter 2, you say "Universal comparability and transitivity assumptions lead to an ability to represent plausibility by a real-valued function". It might not be immediately obvious why this is true (it is not immediately obvious to me, and I am teaching it tomorrow 😰). In DMU, you have a reference for that statement (so I can point students to it if they ask). I think it would be helpful to also have that reference in this book.
On page xxii (22), the link to github.com/cormullion/juliamono actually points to https://algorithmsbook.com/files/github.com/cormullion/juliamono, which doesn't exist.
Release the LaTeX source code.
Says "suppose we want to find out if A and B are d-separated by C", then talks about X,Y, and C (but not A or B) in the examples. I think A and B should be replaced by X and Y, right?
It worked for me like this:
φ = Factor([X, Y, Z], FactorTable(
Dict(:x=>1, :y=>1, :z=>1) => 0.08, Dict(:x=>1, :y=>1, :z=>2) => 0.31,
Dict(:x=>1, :y=>2, :z=>1) => 0.09, Dict(:x=>1, :y=>2, :z=>2) => 0.37,
Dict(:x=>2, :y=>2, :z=>1) => 0.02, Dict(:x=>2, :y=>2, :z=>2) => 0.07,
))
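For comparison, the NamedTuple syntax from the book's example also works if FactorTable is keyed by NamedTuples rather than Dicts; my guess (an assumption, not verified against the book's source) is that a later draft defines it that way:

```julia
# Assumption: keying the table by NamedTuple, so the book's
# (x=1, y=1, z=1) => 0.08 pair syntax works directly.
const FactorTable = Dict{NamedTuple,Float64}

ft = FactorTable(
    (x=1, y=1, z=1) => 0.08, (x=1, y=1, z=2) => 0.31,
    (x=1, y=2, z=1) => 0.09, (x=1, y=2, z=2) => 0.37,
)
```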
I definitely want to express my appreciation and gratitude to you guys for putting this out for free. This is great stuff! However, for me, nothing beats the UX of actually flipping through pages. I'd be happy to pay a good premium to do so. Fair enough if there aren't enough people like me to justify this, but I'd be curious to know if you guys have kicked the tires on this at all.
Last sentence is missing the word "to". It currently reads: In order represent a probability distribution,...
It should read: In order to represent a probability distribution,...
The fourth line in the code example should be
vars = [B, S, E, C, D]
not
vars = [B, S, E, D]
Figure 2.10 specifies that
" From this distribution, we might use the mean or mode of this distribution to impute the missing values."
The clarification in section 2.3.1 could have been written in section 2.2.1 as well.
For example,
θ6 = 1 - (θ1 +θ2 +...+θ5)
What is written is:
However, we need only five independent parameters to uniquely specify the distribution over the outcomes of the roll because we know that the distribution must sum to 1.
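The die example can be made concrete in a line or two (the five weights below are made up):

```julia
# Five independent parameters for a six-sided die; the sixth is implied by
# the constraint that the distribution must sum to 1.
θ = [0.1, 0.2, 0.15, 0.25, 0.1]  # hypothetical θ1..θ5
θ6 = 1 - sum(θ)                  # θ6 is not a free parameter
```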
In 9.9.1, you say "A common approximation to make U(a1:d) convex is to assume deterministic dynamics." Unless I am mistaken, the dynamics must not only be deterministic but also linear to make the problem convex. This is because the dynamics constraint is an equality constraint, and the only type of equality constraint that is convex is a linear constraint.
Also, later you say "sampling a random transition once for each state-action input pair" is a way to get a deterministic transition, but it is not clear to me how you would do this in practice if you are dealing with uncountably infinite state and action spaces - it does not seem like you could do this in a way that results in a convex problem - in fact it is not clear to me that this would result in a continuous objective function, so even nonlinear solvers could not handle it.
reading [2021-01-05 15:52:36-08:00, revision 2f3d6e6] in 7.5, value iteration, and 7.6, asynchronous value iteration.
Maybe I misunderstand, but it seems that the yellow-highlighted sections do exactly the same job, the rest being identical. One uses a list comprehension while the other uses a for loop, but both actually iterate through the whole state space. In GaussSeidelValueIteration, should backup not be applied only to a subset of the state space in each iteration?
PS: In addition, a small detail: in the GaussSeidelValueIteration algorithm, many variables are instantiated from the P::MDP (S, A, T, R, γ = P.S, P.A, P.T, P.R, P.γ), but only P.S is used. In ValueIteration and PolicyIteration, only what is needed is instantiated, which I think makes it easier to understand.
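To make my point concrete, here is a toy sweep (a made-up two-state chain: state 1 moves to state 2, which is absorbing with reward 1). Both variants visit every state; the only difference I can see is the in-place update, not a restriction to a subset of states:

```julia
# Hypothetical two-state chain for illustration.
T(s) = min(s + 1, 2)               # deterministic successor
R(s) = s == 2 ? 1.0 : 0.0          # reward
γ = 0.9                            # discount factor
backup(U, s) = R(s) + γ * U[T(s)]  # Bellman backup

# standard value iteration: fresh vector per sweep (list comprehension)
U = [0.0, 0.0]
Ubatch = [backup(U, s) for s in 1:2]   # state 1 still sees the old U[2] = 0.0

# Gauss-Seidel: in-place updates (for loop); states updated later in the
# sweep already see the new values
U = [0.0, 0.0]
for s in 2:-1:1
    U[s] = backup(U, s)                # state 1 now sees the updated U[2]
end
```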
I guess the wrong file has been uploaded for actor-critic methods (ch. 13).
In particular, I think referencing the chapters on heuristic search and some of the approximation algorithms would be super useful (maybe in the appendix?)
Mausam and Andrey Kolobov, "Planning with Markov Decision Processes: An AI Perspective," Morgan & Claypool Publishers (2012).
https://www.morganclaypool.com/doi/abs/10.2200/S00426ED1V01Y201206AIM017
it would be nice to have "learning curve" in the index, so I can tell my students to "plot a learning curve" for your RL algorithm and they could go look it up if they didn't know what it meant
I think adding an illustration of a simple 3-step Bayesian network to explain the notation of pa(x) would really help with clarity. I would also suggest introducing the probability expression before starting with the code.
Do you assume any specific versions of libraries (e.g., LightGraphs.jl) for running the code in this book?
This question arose when I read and ran the programs in Sec. 3.5.
As noted in Appendix G, the code in the PDF is compatible with Julia 1.5, but library versions are not mentioned.
The current LightGraphs.jl in my environment is v1.3.5, and the function name is topological_sort_by_dfs, but the LightGraphs.jl repository seems to be updating the function for the next version; I guess the next version will have the function topological_sort.
I suggest putting the versions of the assumed libraries in an appendix to increase reproducibility.
Thank you very much for putting the book on the web - this is absolutely amazing.
It seems algorithm 2.3 either has a bug or might require more explanation. For example, is it possible to compute from the text the probability that nothing is wrong - probability(bn, Assignment(b=1, s=1, e=1, d=1, c=1))?
Thank you very much,
-Simon
page 621
# formally there is no y here
julia> append!(x, [2, 3]) # append y to the end of x
in julia/base/bitarray.jl :
function append!(B::BitVector, items::BitVector)
reading [2021-01-05 15:52:36-08:00, revision 2f3d6e6]
page 48.
First of all, congratulations on your great job!!
Here is a small detail about the legend of this figure.
I am used to seeing density plots where the areas under the plots are 1.
If the area under the blue one is 1, then the areas under the individual red density plots must be much smaller than 1 (which is unusual to see!).
True, you explain it well: "we plot the density of two components scaled by their weights". But in the legend, the red plots are only said to be "components".
To avoid confusion, I would suggest defining the red legend as "scaled components".
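The arithmetic behind the suggestion, with made-up weights: each unscaled component density integrates to 1, so after scaling by the weights the red areas equal the weights themselves, and only their sum is 1:

```julia
w = [0.3, 0.7]                       # hypothetical mixture weights
component_areas = [1.0, 1.0]         # each unscaled density integrates to 1
scaled_areas = w .* component_areas  # areas of the red "scaled components"
total = sum(scaled_areas)            # area under the blue mixture curve
```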
Wouldn't it be great to upload at least some parts of the appendices, such as the Problems part? The reader has no sense of the hex world that is used extensively in the book. The same goes for the Mathematical Concepts appendix for contraction mappings.