Decision making under uncertainty : theory and application

Kochenderfer, Mykel J.

Decision making under uncertainty : theory and application - 1st ed. - Cambridge : MIT Press, 2015 - xxv, 323 p. : ill. - MIT Lincoln Laboratory Series.

Includes index.

1. Introduction -- 1.1. Decision Making -- 1.2. Example Applications -- 1.2.1. Traffic Alert and Collision Avoidance System -- 1.2.2. Unmanned Aircraft Persistent Surveillance -- 1.3. Methods for Designing Decision Agents -- 1.3.1. Explicit Programming -- 1.3.2. Supervised Learning -- 1.3.3. Optimization -- 1.3.4. Planning -- 1.3.5. Reinforcement Learning -- 1.4. Overview -- 1.5. Further Reading -- References --
2. Probabilistic Models -- 2.1. Representation -- 2.1.1. Degrees of Belief and Probability -- 2.1.2. Probability Distributions -- 2.1.3. Joint Distributions -- 2.1.4. Bayesian Network Representation -- 2.1.5. Conditional Independence -- 2.1.6. Hybrid Bayesian Networks -- 2.1.7. Temporal Models -- 2.2. Inference -- 2.2.1. Inference for Classification -- 2.2.2. Inference in Temporal Models -- 2.2.3. Exact Inference -- 2.2.4. Complexity of Exact Inference -- 2.2.5. Approximate Inference -- 2.3. Parameter Learning -- 2.3.1. Maximum Likelihood Parameter Learning -- 2.3.2. Bayesian Parameter Learning -- 2.3.3. Nonparametric Learning -- 2.4. Structure Learning -- 2.4.1. Bayesian Structure Scoring -- 2.4.2. Directed Graph Search -- 2.4.3. Markov Equivalence Classes -- 2.4.4. Partially Directed Graph Search -- 2.5. Summary -- 2.6. Further Reading -- References --
3. Decision Problems -- 3.1. Utility Theory -- 3.1.1. Constraints on Rational Preferences -- 3.1.2. Utility Functions -- 3.1.3. Maximum Expected Utility Principle -- 3.1.4. Utility Elicitation -- 3.1.5. Utility of Money -- 3.1.6. Multiple Variable Utility Functions -- 3.1.7. Irrationality -- 3.2. Decision Networks -- 3.2.1. Evaluating Decision Networks -- 3.2.2. Value of Information -- 3.2.3. Creating Decision Networks -- 3.3. Games -- 3.3.1. Dominant Strategy Equilibrium -- 3.3.2. Nash Equilibrium -- 3.3.3. Behavioral Game Theory -- 3.4. Summary -- 3.5. Further Reading -- References --
4. Sequential Problems -- 4.1. Formulation -- 4.1.1. Markov Decision Processes -- 4.1.2. Utility and Reward -- 4.2. Dynamic Programming -- 4.2.1. Policies and Utilities -- 4.2.2. Policy Evaluation -- 4.2.3. Policy Iteration -- 4.2.4. Value Iteration -- 4.2.5. Grid World Example -- 4.2.6. Asynchronous Value Iteration -- 4.2.7. Closed- and Open-Loop Planning -- 4.3. Structured Representations -- 4.3.1. Factored Markov Decision Processes -- 4.3.2. Structured Dynamic Programming -- 4.4. Linear Representations -- 4.5. Approximate Dynamic Programming -- 4.5.1. Local Approximation -- 4.5.2. Global Approximation -- 4.6. Online Methods -- 4.6.1. Forward Search -- 4.6.2. Branch and Bound Search -- 4.6.3. Sparse Sampling -- 4.6.4. Monte Carlo Tree Search -- 4.7. Direct Policy Search -- 4.7.1. Objective Function -- 4.7.2. Local Search Methods -- 4.7.3. Cross Entropy Methods -- 4.7.4. Evolutionary Methods -- 4.8. Summary -- 4.9. Further Reading -- References --
5. Model Uncertainty -- 5.1. Exploration and Exploitation -- 5.1.1. Multi-Armed Bandit Problems -- 5.1.2. Bayesian Model Estimation -- 5.1.3. Ad Hoc Exploration Strategies -- 5.1.4. Optimal Exploration Strategies -- 5.2. Maximum Likelihood Model-Based Methods -- 5.2.1. Randomized Updates -- 5.2.2. Prioritized Updates -- 5.3. Bayesian Model-Based Methods -- 5.3.1. Problem Structure -- 5.3.2. Beliefs over Model Parameters -- 5.3.3. Bayes-Adaptive Markov Decision Processes -- 5.3.4. Solution Methods -- 5.4. Model-Free Methods -- 5.4.1. Incremental Estimation -- 5.4.2. Q-Learning -- 5.4.3. Sarsa -- 5.4.4. Eligibility Traces -- 5.5. Generalization -- 5.5.1. Local Approximation -- 5.5.2. Global Approximation -- 5.5.3. Abstraction Methods -- 5.6. Summary -- 5.7. Further Reading -- References --
6. State Uncertainty -- 6.1. Formulation -- 6.1.1. Example Problem -- 6.1.2. Partially Observable Markov Decision Processes -- 6.1.3. Policy Execution -- 6.1.4. Belief-State Markov Decision Processes -- 6.2. Belief Updating -- 6.2.1. Discrete State Filter -- 6.2.2. Linear-Gaussian Filter -- 6.2.3. Particle Filter -- 6.3. Exact Solution Methods -- 6.3.1. Alpha Vectors -- 6.3.2. Conditional Plans -- 6.3.3. Value Iteration -- 6.4. Offline Methods -- 6.4.1. Fully Observable Value Approximation -- 6.4.2. Fast Informed Bound -- 6.4.3. Point-Based Value Iteration -- 6.4.4. Randomized Point-Based Value Iteration -- 6.4.5. Point Selection -- 6.4.6. Linear Policies -- 6.5. Online Methods -- 6.5.1. Lookahead with Approximate Value Function -- 6.5.2. Forward Search -- 6.5.3. Branch and Bound -- 6.5.4. Monte Carlo Tree Search -- 6.6. Summary -- 6.7. Further Reading -- References --
7. Cooperative Decision Making -- 7.1. Formulation -- 7.1.1. Decentralized POMDPs -- 7.1.2. Example Problem -- 7.1.3. Solution Representations -- 7.2. Properties -- 7.2.1. Differences with POMDPs -- 7.2.2. Dec-POMDP Complexity -- 7.2.3. Generalized Belief States -- 7.3. Notable Subclasses -- 7.3.1. Dec-MDPs -- 7.3.2. ND-POMDPs -- 7.3.3. MMDPs -- 7.4. Exact Solution Methods -- 7.4.1. Dynamic Programming -- 7.4.2. Heuristic Search -- 7.4.3. Policy Iteration -- 7.5. Approximate Solution Methods -- 7.5.1. Memory-Bounded Dynamic Programming -- 7.5.2. Joint Equilibrium Search -- 7.6. Communication -- 7.7. Summary -- 7.8. Further Reading -- References --
8. Probabilistic Surveillance Video Search -- 8.1. Attribute-Based Person Search -- 8.1.1. Applications -- 8.1.2. Person Detection -- 8.1.3. Retrieval and Scoring -- 8.2. Probabilistic Appearance Model -- 8.2.1. Observed States -- 8.2.2. Basic Model Structure -- 8.2.3. Model Extensions -- 8.3. Learning and Inference Techniques -- 8.3.1. Parameter Learning -- 8.3.2. Hidden State Inference -- 8.3.3. Scoring Algorithm -- 8.4. Performance -- 8.4.1. Search Accuracy -- 8.4.2. Search Timing -- 8.5. Interactive Search Tool -- 8.6. Summary -- References --
9. Dynamic Models for Speech Applications -- 9.1. Modeling Speech Signals -- 9.1.1. Feature Extraction -- 9.1.2. Hidden Markov Models -- 9.1.3. Gaussian Mixture Models -- 9.1.4. Expectation-Maximization Algorithm -- 9.2. Speech Recognition -- 9.3. Topic Identification -- 9.4. Language Recognition -- 9.5. Speaker Identification -- 9.5.1. Forensic Speaker Recognition -- 9.6. Machine Translation -- 9.7. Summary -- References --
10. Optimized Airborne Collision Avoidance -- 10.1. Airborne Collision Avoidance Systems -- 10.1.1. Traffic Alert and Collision Avoidance System -- 10.1.2. Limitations of Existing System -- 10.1.3. Unmanned Aircraft Sense and Avoid -- 10.1.4. Airborne Collision Avoidance System X -- 10.2. Collision Avoidance Problem Formulation -- 10.2.1. Resolution Advisories -- 10.2.2. Dynamic Model -- 10.2.3. Reward Function -- 10.2.4. Dynamic Programming -- 10.3. State Estimation -- 10.3.1. Sensor Error -- 10.3.2. Pilot Response -- 10.3.3. Time to Potential Collision -- 10.4. Real-Time Execution -- 10.4.1. Online Costs -- 10.4.2. Multiple Threats -- 10.4.3. Traffic Alerts -- 10.5. Evaluation -- 10.5.1. Safety Analysis -- 10.5.2. Operational Suitability and Acceptability -- 10.5.3. Parameter Tuning -- 10.5.4. Flight Test -- 10.6. Summary -- References --
11. Multiagent Planning for Persistent Surveillance -- 11.1. Mission Description -- 11.2. Centralized Problem Formulation -- 11.2.1. State Space -- 11.2.2. Action Space -- 11.2.3. State Transition Model -- 11.2.4. Reward Function -- 11.3. Decentralized Approximate Formulations -- 11.3.1. Factored Decomposition -- 11.3.2. Group Aggregate Decomposition -- 11.3.3. Planning -- 11.4. Model Learning -- 11.5. Flight Test -- 11.6. Summary -- References --
12. Integrating Automation with Humans -- 12.1. Human Capabilities and Coping -- 12.1.1. Perceptual and Cognitive Capabilities -- 12.1.2. Naturalistic Decision Making -- 12.2. Considering the Human in Design -- 12.2.1. Trust and Value of Decision Logic Transparency -- 12.2.2. Designing for Different Levels of Certainty -- 12.2.3. Supporting Decisions over Long Timescales -- 12.3. A Systems View of Implementation -- 12.3.1. Interface, Training, and Procedures -- 12.3.2. Measuring Decision Support Effectiveness -- 12.3.3. Organization Influences on System Effectiveness -- 12.4. Summary -- References -- Index

9780262029254

DIF006610


DECISION SUPPORT SYSTEMS
ARTIFICIAL INTELLIGENCE
INTELLIGENT SYSTEMS

decision making under uncertainty
