Abstract—A primary challenge of agent-based policy learning in complex and uncertain environments is the escalating computational complexity with the size of the task space and the number of agents. Nonetheless, there is ample evidence in the natural world that high-functioning social mammals learn to solve complex problems with ease. This ability to solve computationally intractable problems stems in part from brain circuits for hierarchical representation of state and action spaces and from learned policies arising from these representations. Using such mechanisms for state representation and action abstraction, we constrain state-action choices in reinforcement learning in order to improve learning efficiency and generalization of learned policies within a single-agent herding task. We show that satisficing and generalizable policies emerge, which reduce computational cost and/or memory requirements.
Index Terms—Markov decision process; reinforcement learning; hierarchical state representation; robotic herding.
The authors are with Thayer School of Engineering at Dartmouth College, Hanover, NH 03755, USA (e-mail: email@example.com, firstname.lastname@example.org).
Cite: Tao Mao and Laura E. Ray, "Hierarchical State Representation and Action Abstractions in Q-Learning for Agent-Based Herding," International Journal of Information and Electronics Engineering, vol. 2, no. 4, pp. 538-542, 2012.