Normal Bayesian two-armed bandits
Abstract
The undiscounted normal two-armed bandit is examined from a Bayesian point of view for independent and singular priors on the mean vector $(\theta_1, \theta_2)$. The well-accepted notion that an apparently inferior source needs to be sampled now and then is quantified. The optimal strategy is defined in terms of the source differential function, $\Delta^n = V_y^n - V_x^n$, where $V_x^n$ and $V_y^n$ are the valuations of sampling the two respective sources. For the independent prior case, bounds and linear approximations for $\Delta^n$ are obtained by recursion. The limiting behavior of $\Delta^n$ is discussed in terms of certain summary parameters of location and information. In the more tractable singular case, the optimal strategy is myopic when the prior information on both sources is equal.
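
To make the terms "information" and "myopic" concrete, the following is a minimal sketch of a one-step (myopic) rule for two sources with independent normal priors. It is an illustration only, not the dissertation's optimal strategy: it assumes observations with known unit variance, treats "information" as posterior precision, and all names (`NormalBelief`, `update`, `myopic_choice`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NormalBelief:
    """Normal belief about one source's mean (hypothetical helper)."""
    mean: float       # current posterior mean of theta
    precision: float  # current posterior precision ("information")

    def update(self, obs: float, obs_precision: float = 1.0) -> "NormalBelief":
        # Standard conjugate update for a normal mean with known
        # observation precision (assumed to be 1.0 by default).
        new_precision = self.precision + obs_precision
        new_mean = (self.precision * self.mean + obs_precision * obs) / new_precision
        return NormalBelief(new_mean, new_precision)


def myopic_choice(x: NormalBelief, y: NormalBelief) -> str:
    # Myopic rule: sample the source with the larger current posterior
    # mean, ignoring the value of information from future draws.
    return "y" if y.mean > x.mean else "x"


# Source y starts with the higher mean, so the myopic rule picks it.
x = NormalBelief(mean=0.0, precision=1.0)
y = NormalBelief(mean=0.5, precision=1.0)
first = myopic_choice(x, y)

# After one favorable observation on x, its posterior mean overtakes y.
x_after = x.update(2.0)
second = myopic_choice(x_after, y)
```

A rule of this kind never deliberately samples the apparently inferior source, which is the behavior the abstract says an optimal strategy must sometimes depart from in the independent prior case.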