
Comments (12)

mykelk commented on June 26, 2024

This is a good idea. I'll mention the versions in the Julia appendix.

from decisionmaking.

mykelk commented on June 26, 2024

Ah, we created this. I'll insert in the appendix tomorrow, but the code looks like this:

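using Distributions   # Categorical, pdf
using LinearAlgebra   # norm, normalize

# Discrete distribution over an arbitrary collection of elements
struct SetCategorical{S}
    elements::Vector{S} # Set elements (could be repeated)
    distr::Categorical # Categorical distribution over set elements

    # Uniform distribution over the given elements
    function SetCategorical(elements::AbstractVector{S}) where S
        weights = ones(length(elements))
        return new{S}(elements, Categorical(normalize(weights, 1)))
    end

    # Weighted distribution; falls back to uniform if the weights are degenerate
    function SetCategorical(elements::AbstractVector{S}, weights::AbstractVector{Float64}) where S
        ℓ₁ = norm(weights, 1)
        if ℓ₁ < 1e-6 || isinf(ℓ₁)
            return SetCategorical(elements)
        end
        distr = Categorical(normalize(weights, 1))
        return new{S}(elements, distr)
    end
end

# Sample an index from the categorical distribution, then map it to an element
Distributions.rand(D::SetCategorical) = D.elements[rand(D.distr)]
Distributions.rand(D::SetCategorical, n::Int) = D.elements[rand(D.distr, n)]

# Probability of x: total weight of all (possibly repeated) elements equal to x
function Distributions.pdf(D::SetCategorical, x)
    sum(e == x ? w : 0.0 for (e,w) in zip(D.elements, D.distr.p))
end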
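As a rough illustration of what the `pdf` method above computes when elements repeat, the following sketch reproduces the weight aggregation in plain Julia, without Distributions.jl. The names `normalized_weights`, `set_pdf`, and `set_rand` are hypothetical and not part of the book's code, and the inverse-CDF lookup is only a stand-in for sampling from `Categorical`.

```julia
# Hypothetical, dependency-free stand-in for the SetCategorical idea.
# Elements may repeat; the probability of a value is the total weight of its copies.

# Normalize weights by their L1 norm; fall back to uniform if they are degenerate.
function normalized_weights(weights::AbstractVector{<:Real})
    total = sum(abs, weights)
    if total < 1e-6 || isinf(total)
        return fill(1.0 / length(weights), length(weights))
    end
    return weights ./ total
end

# Probability mass of x: sum the weights of every element equal to x.
function set_pdf(elements, p, x)
    return sum(e == x ? w : 0.0 for (e, w) in zip(elements, p))
end

# Draw one element by inverse-CDF sampling over the weights.
function set_rand(elements, p)
    i = searchsortedfirst(cumsum(p), rand())
    return elements[min(i, length(elements))]  # guard against round-off in cumsum
end

elements = [:left, :right, :left]
p = normalized_weights([1.0, 2.0, 1.0])
```

Here `:left` appears twice, so `set_pdf(elements, p, :left)` adds the two weights of 0.25 together, which is the behavior a particle filter needs when several particles share the same state.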


mykelk commented on June 26, 2024

Ah! Got it. There are some kinds of problems that make additional assumptions about the model (e.g., that the dynamics are linear Gaussian). We first see something like this in Alg. 7.11. For these other kinds of problems, we just assume that P contains the necessary fields. We don't include this in a code block, but basically you can create your own struct or use a named tuple with the various components defined. I'll inject some additional explanation into the book to reduce confusion. Thanks!
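A minimal sketch of the named-tuple approach, assuming a linear Gaussian model: the field names `Ts`, `Ta`, `Os`, `Σs`, and `Σo` follow the ones mentioned in this thread, but the numeric values and the function `kalman_predict` are made up for illustration and are not the book's listing.

```julia
# A problem description as a named tuple; the values are illustrative only.
P = (
    Ts = [1.0 1.0; 0.0 1.0],         # state transition matrix
    Ta = reshape([0.5, 1.0], 2, 1),  # control (action) matrix
    Os = [1.0 0.0],                  # observation matrix (unused in the prediction step)
    Σs = [0.1 0.0; 0.0 0.1],         # process noise covariance
    Σo = fill(0.5, 1, 1),            # observation noise covariance (unused here)
)

# Prediction step of a Kalman filter. P is untyped, so any struct or
# named tuple with fields Ts, Ta, and Σs works.
function kalman_predict(P, μ, Σ, a)
    μp = P.Ts * μ + P.Ta * a
    Σp = P.Ts * Σ * P.Ts' + P.Σs
    return μp, Σp
end

μp, Σp = kalman_predict(P, [0.0, 1.0], [1.0 0.0; 0.0 1.0], [2.0])
```

Because the function only reads the fields it needs, a custom struct with the same field names could be passed in place of the named tuple without changing `kalman_predict`.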


mykelk commented on June 26, 2024

Ah, the two-argument version of findmax, with a function as the first argument, is given in Appendix G.5. The second argument is a collection. This will appear in Julia 1.7.
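To make the two-argument form concrete, here is a sketch of a helper that applies a function to each item of a collection and returns the largest value together with the item that produced it. The name `findmax_element` is hypothetical, chosen to avoid clashing with Base. Note that Base Julia's own `findmax(f, domain)` (1.7 and later) returns the index of the maximizer rather than the element itself, while `argmax(f, domain)` returns the element.

```julia
# Hypothetical helper: return the largest value of f over xs and the element attaining it.
function findmax_element(f, xs)
    best_x = first(xs)
    best_v = f(best_x)
    for x in Iterators.drop(xs, 1)
        v = f(x)
        if v > best_v
            best_v, best_x = v, x
        end
    end
    return best_v, best_x
end

# The second argument is simply the collection to search over.
v, x = findmax_element(x -> -(x - 2)^2, [0, 1, 2, 3])
```

In this call the function peaks at `x = 2`, so `v` is `0` and `x` is `2`.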


NicoMandel commented on June 26, 2024

I have a question along the same lines: In the section on State Uncertainty, specifically the Particle Filter variations, a function called ''SetCategorical'' is used, for which I somehow seem to be unable to find any package or documentation online. Could you please point me to a resource, if there is any online?


NicoMandel commented on June 26, 2024

Thank you very much for the quick response, impressive. I hope you don't mind me asking two more things to double-check on this:
The struct for POMDPs is on p. 373 in Chapter 19, while the first actual use happens on page 401. Is this a correct observation? I was slightly confused by the split between the two, especially since the majority of functions in between take an argument P. After a closer look I realised, though, that the P seems to denote a collection of probability distributions, the transition probability function and the observation probability function. Is this also correct?
What would be the usual procedure for this, would this implementation conform with standard distributions supplied by JuliaStats?


mykelk commented on June 26, 2024

The POMDP struct is used in the very next algorithm (Alg. 19.2). It shows up as the P in the argument list. The P is used throughout the book to define the problem (MDP, POMDP, MG, POMG, DecPOMDP, etc.). To avoid confusion, though, I can add a small note in the caption of 19.2 to clarify what is meant by P. Thanks!
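The pattern of passing the problem as P can be sketched with a discrete belief update. This is not the book's Alg. 19.2: `ToyPOMDP`, `update_belief`, and the two-state example are hypothetical, and the real POMDP struct has more fields than the three assumed here.

```julia
# Hypothetical problem container with only the fields this example needs.
struct ToyPOMDP
    S::Vector{Symbol}   # state space
    T::Function         # T(s, a, s′): transition probability
    O::Function         # O(a, s′, o): observation probability
end

# Discrete Bayes filter: b′(s′) ∝ O(a, s′, o) * Σ_s T(s, a, s′) * b(s)
function update_belief(b::Vector{Float64}, P, a, o)
    predicted(s′) = sum(P.T(s, a, s′) * b[i] for (i, s) in enumerate(P.S))
    b′ = [P.O(a, s′, o) * predicted(s′) for s′ in P.S]
    total = sum(b′)
    # Fall back to a uniform belief if the observation has zero probability.
    return total > 0 ? b′ ./ total : fill(1.0 / length(b), length(b))
end

P = ToyPOMDP(
    [:healthy, :faulty],
    (s, a, s′) -> s == s′ ? 0.9 : 0.1,
    (a, s′, o) -> (o == :alarm) == (s′ == :faulty) ? 0.8 : 0.2,
)
b′ = update_belief([0.5, 0.5], P, :wait, :alarm)
```

The belief `b` is typed as a vector while `P` is left untyped, so the same function accepts any object that exposes `S`, `T`, and `O`.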


mykelk commented on June 26, 2024

I think SetCategorical would adhere to the API of Distributions.jl. They define Categorical, which defines a discrete distribution over 1:n, but not over arbitrary sets. If something like this gets merged into Distributions.jl within the next month, we'd be happy to switch to using that.


NicoMandel commented on June 26, 2024

Thank you for the answer. I see, that makes sense now, especially with the O(a ...) in Algorithm 19.2. I was really confused because b was cast to a specific type, but P was not. I only knew particle filters from Thrun's Probabilistic Robotics, so I assumed that P was meant to represent any collection of state, transition and observation distributions, with the a similar to the robotics u control input. What fully led me to assume this was the use of Σs and Σo as child members of P in Algorithm 19.3, as well as the slight deviation of notation in the Gaussian family to P.Ta, P.Os for the KF or P.fT and P.fO for the UKF and EKF.
Can I just assume that the P (sorry, I am not sure which letter to use in html/markdown, is it Ρ) should be cast to a POMDP struct for all Algorithms of Sections 19 and 20?


NicoMandel commented on June 26, 2024

Sorry to bother you again. I fully appreciate the time you take to get back to me and do not want to be a burden. I am not coming from a Julia background, and looking through the functions and understanding them is a way for me to see how things work. I do not expect you to make changes based on the feedback and questions I pose; I rather see this as the documentation of a field test of what someone from a different programming background might stumble across. Please let me know whether this is helpful or just a nuisance and whether I should continue, move to a new Issue, change the format or just stop.

It appears that function expand() in Algorithm 20.8 on page 409 points to a function which is first defined a few pages further in the next Algorithm, 20.9. It took me some time to find this, since the same Algorithm 20.8 refers to a struct implementation from the MDP section, ValueIteration, which was my first point of call for looking for this function.
(EDIT: Directly after writing this I continued reading and saw the reference to Algorithm 20.9, please excuse that mishap)
I also spent quite a while researching the definition of the syntax function(π::LookaheadAlphaVectorPolicy)(b) from Algorithm 20.4 until I found the (I assume so...) solution in Appendix G.2.3, apparently overloading the call to AlphaVectorPolicy? But why are the returns turned into a named return with u= and a=?

A similar confusion overcame me when looking at the greedy function in Algorithm 20.5 on page 405, which seems to run a through an anonymous function (according to Appendix G.2.2), which takes itself as an argument, and -- according to findmax -- maps it to the dimensions of P.A? Is that anywhere near correct?

Thank you for taking the time to even read the comments, I genuinely appreciate it.


mykelk commented on June 26, 2024

It does appear that the AlphaVectorPolicy and the LookaheadAlphaVectorPolicy have slightly different APIs. I'll look into this and make it consistent, most likely changing function(π::AlphaVectorPolicy)(b) to return an action instead of a named tuple. This might take a couple of days since I'll need to test the changes.
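For readers new to the `function (π::SomePolicy)(b)` syntax, here is a small sketch of Julia's callable structs and of the two return styles discussed here. The types `ThresholdPolicy` and `ScoredPolicy` are hypothetical and exist only to show the mechanism.

```julia
# Hypothetical policy type used only to illustrate callable structs.
struct ThresholdPolicy
    threshold::Float64
end

# Defining a method on the type makes every instance callable like a function.
# This version returns the action directly.
function (π::ThresholdPolicy)(b)
    return b > π.threshold ? :act : :wait
end

# A second hypothetical policy that returns a named tuple instead,
# so the caller can read both the action and a score by name.
struct ScoredPolicy
    threshold::Float64
end

function (π::ScoredPolicy)(b)
    return (a = b > π.threshold ? :act : :wait, u = abs(b - π.threshold))
end

policy = ThresholdPolicy(0.5)
scored = ScoredPolicy(0.5)
action = policy(0.8)   # call the struct instance as if it were a function
result = scored(0.8)   # fields are accessed as result.a and result.u
```

The named tuple makes it possible to return the utility alongside the action, at the cost of callers having to write `result.a`. That is the API difference between the two styles.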

For Alg 20.5, the findmax will return the maximum value and the maximizing element of P.A when lookahead is applied. It should return u as a real number and a as a single element of P.A.
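The behavior described above can be sketched as follows. The toy problem, the `lookahead` definition, and this `greedy` are illustrative assumptions rather than the book's Alg. 20.5. The anonymous function `a -> lookahead(P, U, s, a)` captures `P`, `U`, and `s` from the enclosing scope and is evaluated once for each action in `P.A`, so `P.A` is the collection being searched, not a dimension, and no recursion is involved.

```julia
# Toy problem as a named tuple; all values are made up for illustration.
P = (
    S = [1, 2],
    A = [:stay, :move],
    γ = 0.9,
    R = (s, a) -> a == :move ? 1.0 : 0.0,
    T = (s, a, s′) -> (a == :move) == (s′ != s) ? 1.0 : 0.0,
)
U = [0.0, 10.0]  # assumed utility of each state

# One-step lookahead value of taking action a in state s.
lookahead(P, U, s, a) = P.R(s, a) + P.γ * sum(P.T(s, a, s′) * U[s′] for s′ in P.S)

# Evaluate every action and keep the best one.
function greedy(P, U, s)
    q = a -> lookahead(P, U, s, a)   # anonymous function of a alone
    values = map(q, P.A)             # one value per element of P.A
    u, i = findmax(values)           # array form of findmax: (value, index)
    return (a = P.A[i], u = u)
end

best = greedy(P, U, 1)
```

From state 1, moving earns a reward of 1 and reaches the state with utility 10, so `best.a` is `:move`.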


NicoMandel commented on June 26, 2024

Thank you for the information, I look forward to the updates. The returns of findmax are pretty clear; however, I struggle with the call to the function, especially the a → (EDIT: apparently Unicode does not work in code brackets for markdown...) to the lookahead function, which takes a as an argument itself. Is this a Julia-specific form of recursion, or related to the Appendix G.2.2 anonymous function implementation?
The second argument to findmax adds to the confusion, because I am unable to see how P.A would indicate the dimension along which findmax operates...

