Comments (4)
I think you should say something like "A common approximation to make U(a_{1:d}) amenable to optimization is to assume deterministic dynamics. Moreover, if these deterministic dynamics are linear, and the reward function is convex, the problem is convex."
from decisionmaking.
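For reference, the deterministic problem discussed in the comment above can be written out as follows. This is only a sketch: the discount factor \gamma, the reward R, the matrices T_s and T_a, and the fixed initial state s_1 are assumed notation, not taken from the thread.

```latex
\begin{aligned}
\underset{a_{1:d},\; s_{2:d}}{\text{maximize}} \quad
  & \sum_{t=1}^{d} \gamma^{t-1} R(s_t, a_t) \\
\text{subject to} \quad
  & s_{t+1} = T_s s_t + T_a a_t, \qquad t = 1, \ldots, d-1
\end{aligned}
```

With linear equality constraints like these, the feasible set is convex. Under the usual convention, the maximization is then a convex optimization problem when R is concave in (s, a), which is equivalent to minimizing a convex cost -R.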
Addressing your first comment now. As for sampling a random transition for each state-action input pair, you are right that you can't do that explicitly for an infinite set. However, you can seed the random number generator. That makes the sampled transitions deterministic, but it doesn't necessarily result in a convex formulation. You can also choose the samples deterministically, for example by using the 10th percentile of a Gaussian distribution if you want to perform well even under pessimistic predictions.
from decisionmaking.
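The two ideas in the comment above (seeding the generator, and using a fixed percentile instead of a random draw) might look roughly like the sketch below. Everything in it is an assumption made for illustration: the scalar model s' = s + a + w, the quadratic reward, and the helper names `seeded_noise`, `percentile_noise`, and `rollout_utility` do not come from the book or the thread. Whether the 10th percentile is actually the pessimistic direction depends on how the noise enters the model.

```python
import random
from statistics import NormalDist


def seeded_noise(seed, depth, sigma=1.0):
    """Draw `depth` Gaussian disturbances from a generator with a fixed seed.

    The same seed always gives the same sequence, so a rollout that uses
    it becomes a deterministic function of the action sequence.
    """
    rng = random.Random(seed)
    return [rng.gauss(0.0, sigma) for _ in range(depth)]


def percentile_noise(p, depth, sigma=1.0):
    """Use the p-th quantile of N(0, sigma^2) at every step (no sampling)."""
    w = NormalDist(mu=0.0, sigma=sigma).inv_cdf(p)
    return [w] * depth


def rollout_utility(s0, actions, noise):
    """Utility of an action sequence under a hypothetical scalar model.

    Assumed dynamics: s' = s + a + w.
    Assumed reward:   -(s')**2 - 0.1 * a**2.
    """
    s, total = s0, 0.0
    for a, w in zip(actions, noise):
        s = s + a + w
        total += -(s * s) - 0.1 * a * a
    return total


if __name__ == "__main__":
    actions = [0.5, -0.2, 0.1]
    # Same seed, same disturbances, same utility on every evaluation.
    print(rollout_utility(1.0, actions, seeded_noise(42, len(actions))))
    # Fixed 10th-percentile disturbance at every step.
    print(rollout_utility(1.0, actions, percentile_noise(0.10, len(actions))))
```

In both cases the utility is a deterministic function of the actions, which is what an optimizer needs. Neither choice by itself makes the problem convex; that still depends on the dynamics and the reward.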
Right. Regarding the second part, I just think you should change the "for each state-action input pair" wording, probably by simply referencing 9.9.2 and 9.9.3.
from decisionmaking.
Sounds good. I'll post a new version tonight.
from decisionmaking.