![Regret Analysis of Stochastic and Nonstochastic Multi-Armed Bandit Problems by Cesa-Bianchi Nicolo, Sebastien Bubeck](https://558130.bdp32.group/rails/active_storage/representations/redirect/eyJfcmFpbHMiOnsibWVzc2FnZSI6IkJBaHBBMzJlU2c9PSIsImV4cCI6bnVsbCwicHVyIjoiYmxvYl9pZCJ9fQ==--f6ef3326fee8f9646d3a490a542ea6756f7f9a50/eyJfcmFpbHMiOnsibWVzc2FnZSI6IkJBaDdCem9MWm05eWJXRjBTU0lJYW5CbkJqb0dSVlE2RkhKbGMybDZaVjkwYjE5c2FXMXBkRnNIYVFJc0FXa0M5QUU9IiwiZXhwIjpudWxsLCJwdXIiOiJ2YXJpYXRpb24ifX0=--038335c90cf75c275ae4d36968ac417dc4a0a3e3/Regret%20Analysis%20of%20Stochastic%20and%20Nonstochastic%20Multi-Armed%20Bandit%20Problems.jpg)
138 pages
ISBN/UID: 9781601986269
Format: Paperback
Language: English
Publisher: Now Publishers
Publication date: 12 December 2012
Description
A multi-armed bandit problem - or, simply, a bandit problem - is a sequential allocation problem defined by a set of actions. At each time step, a unit resource is allocated to an action and some observable payoff is obtained. The goal is to maxim...