Playing Rings and Poles with HTNs
In this tutorial, we’ll see how we can create an agent that uses Hierarchical Task Networks (HTNs) to play the game of Rings and Poles.
We’ll be using the command line interface (CLI) as well as the Python library.
Install and configure the CLI if you haven’t yet. You can also initialize a directory for the sync command if you want to use it. That’s the easiest way to get started creating content and uploading it to the platform. For an example of using the sync command, see the Create and Run a Simple Pipeline tutorial.
We will create three things in this tutorial:
- a package with the action space and HTN methods for the game,
- an agent that uses the package to play the game, and
- a Python script that interacts with the agent to show the game play.
For now, we can put everything that isn’t Python or YAML into the same file.
Let’s call it packages/rings-and-poles.cog.
We can start by declaring a package and domain. A package is a collection of domains designed to work together. A domain is just a collection of actions, events, behaviors, functions, and HTN tasks or methods that deliver a particular capability. How you divide up your code into packages and domains is up to you.
package com.ursafrontier.tutorials
domain games.rings-and-poles
Action Space
The action space is the set of actions and events that we use in the HTN. We can think of actions as things an agent will want to do, and events as the things that happen in the environment in which the agent is running.
When an agent runs, it will plan out a sequence of actions that it will propose. The environment (the Python script mentioned above in the case of this tutorial) will then decide whether to accept or reject the proposed actions, resulting in one or more events that are returned to the agent.
For our game, we want two actions. We can use PlaceRing to put a ring on a pole to set up the game, and MoveRing to move a ring from one pole to another.
Each of these can succeed or fail, so we’ll have four events: RingPlaced, RingFailedToPlace, RingMoved, and RingFailedToMove.
Each of these actions and events has parameters such as which ring and which pole. The parameter types are not enforced by the platform for now and are just for documentation purposes.
Generally, we use imperatives for actions and predicates for events, but this is only a convention and not enforced by the platform.
action PlaceRing {
  integer ring
  string pole
}

event RingPlaced {
  integer ring
  string pole
}

event RingFailedToPlace {
  integer ring
  string pole
}

action MoveRing {
  integer ring
  string from
  string to
}

event RingMoved {
  integer ring
  string from
  string to
}

event RingFailedToMove {
  integer ring
  string from
  string to
}
Behaviors
Behaviors represent the things an agent will react to in the environment. In our example, we can declare an event that acts as a trigger for an agent to start playing the game. In this case, we use an imperative to name the event because it’s acting as a command to the agent.
Each trigger in the behavior is declared using the when keyword followed by an event description. This could just be the event name, as used here, or an event with some of its parameters specified. For example, we could have when PlayGame(rings), rings > 3 { ... } to trigger only when playing the game with more than three rings.
All of the event parameters are available in the body of the trigger as local variables.
In this case, we’re just passing the number of rings to play to the play HTN macro.
event PlayGame {
  integer rings
}

behavior PlayRingsAndPoles {
  when PlayGame() {
    play(rings)
  }
}
HTN Methods
We start by setting up the game and then moving the rings from one pole to another. We assume that the rings with a higher number are larger than rings with a lower number. We can imagine that the ring number is the size of the ring in centimeters.
play(rings) when rings > 0 {
  initialize-game(rings)
  move-rings(rings, "A", "C", "B")
}
This is an HTN macro, which is the same as a method except it’s not attached to a task.
A macro and a method can have preconditions or guards (the part after the when keyword).
If the guard evaluates to false, then the corresponding body is not considered when planning.
If we didn’t have a body (the part between the { and }), then this would be a task.
A task may have preconditions and implications. These are not used by the planner, but they are used to document what the task expects and what any methods should do.
We aren’t using tasks or methods in this tutorial.
Game Initialization
We initialize the game by placing any rings that aren’t already on a pole on the first pole (A). This allows us to set up a game and give that setup to the agent when we create the agent instance, but the HTN we’re developing in this tutorial won’t know how to play the game if the rings don’t begin on the first pole. We leave it as an exercise to the reader to create a more flexible HTN that can handle arbitrary starting states.
There’s nothing to do if there aren’t any rings remaining to place.
initialize-game(rings) when rings == 0 { }
If there are rings remaining, we place the next ring on the first pole as long as it’s not already placed on a pole.
The ring() predicate is true if the ring is on the given pole, and false otherwise.
Predicates are part of the agent state.
The implies ... clause changes the agent state.
Implying a predicate marks it as true.
Once we’ve placed a ring, we recursively call initialize-game with one fewer ring to place the remaining rings.
initialize-game(ring) when ring > 0, not ring(ring, "A"), not ring(ring, "B"), not ring(ring, "C") {
  PlaceRing(ring=ring, pole="A") {
    RingPlaced(ring=ring) is success
    RingFailedToPlace(ring=ring) is failure
  }
  implies ring(ring, "A")
  initialize-game(ring - 1)
}
If the ring is already placed on a pole, we just recursively call initialize-game with one fewer ring to place the remaining rings.
initialize-game(ring) when ring > 0, ring(ring, "A") or ring(ring, "B") or ring(ring, "C") {
  initialize-game(ring - 1)
}
The planner will explore all of the possible ways to achieve a task. This means that if a method or macro has multiple bodies with guards that evaluate to true, the planner will consider all of them rather than the first one that matches.
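To make the role of the guards concrete, here is a toy Python model of guard filtering for the three initialize-game bodies. It is only a sketch and not the platform’s planner: the function name applicable_bodies, the body labels, and the representation of agent state as a set of (ring, pole) tuples are all assumptions made for illustration.

```python
def applicable_bodies(ring, state):
    """Return the labels of the initialize-game bodies whose guards hold.

    `state` is a set of (ring, pole) tuples standing in for the ring()
    predicate; this representation is an assumption for illustration.
    """
    on_a_pole = any((ring, pole) in state for pole in "ABC")
    # One entry per body, in the order the tutorial defines them.
    guards = {
        "nothing to do": ring == 0,
        "place ring on A": ring > 0 and not on_a_pole,
        "skip placed ring": ring > 0 and on_a_pole,
    }
    return [label for label, holds in guards.items() if holds]


print(applicable_bodies(0, set()))
print(applicable_bodies(2, set()))
print(applicable_bodies(2, {(2, "B")}))
```

In this model the three guards never hold at the same time, so there is exactly one candidate body for any non-negative ring number. If two guards overlapped, both bodies would be candidates and, as described above, the planner would consider each of them.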
Game Play
There are a number of ways to play the game.
In this version, we assume that all of the rings begin on pole A and that we want to move them to pole C.
We are using the recursive implementation from the Wikipedia article linked above.
move-rings(ring, source, dest, temp) when ring > 0 {
  move-rings(ring - 1, source, temp, dest)
  MoveRing(ring=ring, from=source, to=dest) {
    RingMoved(ring=ring, from=source, to=dest) is success
    RingFailedToMove(ring=ring, from=source, to=dest) is failure
  }
  implies not ring(ring, source), ring(ring, dest)
  move-rings(ring - 1, temp, dest, source)
}
move-rings(ring, source, dest, temp) when ring == 0 { }
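The same recursion can be written in plain Python to see which moves it produces. This is a sketch for illustration only: the function move_rings and the tuple format are our own and are not part of the platform. It mirrors the two bodies of the macro: do nothing for zero rings, otherwise move the smaller rings out of the way, move the current ring, and move the smaller rings back on top.

```python
def move_rings(ring, source, dest, temp, moves=None):
    """Collect (ring, from, to) moves in the order the macro would propose them."""
    if moves is None:
        moves = []
    if ring == 0:
        # Mirrors the empty body guarded by ring == 0.
        return moves
    # Move the smaller rings onto the spare pole.
    move_rings(ring - 1, source, temp, dest, moves)
    # Move the current ring to its destination.
    moves.append((ring, source, dest))
    # Move the smaller rings from the spare pole onto the current ring.
    move_rings(ring - 1, temp, dest, source, moves)
    return moves


plan = move_rings(3, "A", "C", "B")
for ring, src, dst in plan:
    print(f"MoveRing(ring={ring}, from={src}, to={dst})")
print(len(plan))  # 7, which is 2**3 - 1
```

For n rings the recursion produces 2**n - 1 moves, so the plan grows quickly as rings are added.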
That’s all there is to the HTN definitions. Now we can create an agent that uses these definitions to play the game.
Agent
Defining an agent is pretty simple. We just need to give it a name and a list of packages to use. The agent will have available all of the domains in the listed packages. We can select which behaviors to use from those domains.
Put the following content into a file named agents/rings-and-poles-agent.yaml.
name: com.ursafrontier.player
packages:
- com.ursafrontier.tutorials
behaviors:
- games.rings-and-poles.PlayRingsAndPoles
Playing the Game
Now we can create a Python script that will start up an agent and show the agent playing the game. We’ll use terminedia to draw the game in the terminal. This isn’t as fancy as a web page, but it gets the job done.
Game Mechanics
Let’s create a class that can hold game information. This class provides methods to implement the placement and movement of rings. This is where we put the logic to make sure rules aren’t broken, but we aren’t putting any information about the action space here. This is a general-purpose class for playing the game.
class RingsAndPolesGame:
    def __init__(self, n):
        self.n = n
        # each pole lists its rings from bottom to top
        self.poles = {'A': [], 'B': [], 'C': []}

    def place_ring(self, pole, ring):
        # a ring can only be placed once
        for stack in self.poles.values():
            if ring in stack:
                return False
        if ring > self.n or ring < 1:
            return False
        # if a smaller ring is already on the pole, then we can't place this ring
        for d in self.poles[pole]:
            if d < ring:
                return False
        self.poles[pole].append(ring)
        return True

    def move_ring(self, from_pole, to_pole, ring):
        if ring not in self.poles[from_pole]:
            return False
        # the ring must be the smallest (top) ring on its pole
        for d in self.poles[from_pole]:
            if d < ring:
                return False
        # the ring can't land on top of a smaller ring
        for d in self.poles[to_pole]:
            if d < ring:
                return False
        self.poles[from_pole].remove(ring)
        self.poles[to_pole].append(ring)
        return True
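Before wiring the game up to an agent, it can help to check the rules by hand. The sketch below restates the class in condensed form so that it runs on its own (the any() expressions are our rewording of the loops, not a change in the rules), then tries a few legal and illegal moves.

```python
class RingsAndPolesGame:
    """Condensed restatement of the game class, for a standalone check."""

    def __init__(self, n):
        self.n = n
        self.poles = {'A': [], 'B': [], 'C': []}

    def place_ring(self, pole, ring):
        already_placed = any(ring in stack for stack in self.poles.values())
        if already_placed or not 1 <= ring <= self.n:
            return False
        # A ring can't be placed on top of a smaller ring.
        if any(d < ring for d in self.poles[pole]):
            return False
        self.poles[pole].append(ring)
        return True

    def move_ring(self, from_pole, to_pole, ring):
        if ring not in self.poles[from_pole]:
            return False
        # No smaller ring may sit on the source pole or the target pole.
        if any(d < ring for d in self.poles[from_pole] + self.poles[to_pole]):
            return False
        self.poles[from_pole].remove(ring)
        self.poles[to_pole].append(ring)
        return True


game = RingsAndPolesGame(3)
for ring in (3, 2, 1):            # largest first, as initialize-game does
    game.place_ring('A', ring)

print(game.move_ring('A', 'C', 2))  # False: ring 1 is still on top of ring 2
print(game.move_ring('A', 'C', 1))  # True
print(game.move_ring('A', 'C', 2))  # False: ring 1 is now on pole C
print(game.move_ring('A', 'B', 2))  # True
print(game.poles)                   # {'A': [3], 'B': [2], 'C': [1]}
```

Note that the rings have to be placed largest first. Placing ring 1 on a pole and then ring 2 on the same pole is rejected, because ring 2 would sit on a smaller ring.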
Drawing the Game
Next, let’s create a class that captures how we want to present the game while it’s being played. This class will use terminedia to draw the game in the terminal. We’re keeping it simple and redrawing everything each time rather than erasing and drawing the differences.
class GameViewer:
    def __init__(self, scr, game):
        self.scr = scr
        self.game = game

    def draw(self):
        self.scr.clear()
        self.draw_poles()
        self.draw_rings('A', 20)
        self.draw_rings('B', 40)
        self.draw_rings('C', 60)

    def draw_poles(self):
        n = self.game.n
        for i in [20, 40, 60]:
            self.scr.draw.line((i - n - 1, 20), (i + n + 1, 20))
            self.scr.draw.line((i, 20), (i, 20 - n - 1))

    def draw_rings(self, pole, center):
        rings = self.game.poles[pole]
        for i in range(len(rings)):
            self.scr.draw.line((center - rings[i], 20 - i - 1), (center + rings[i], 20 - i - 1))
Adjudicating Actions
We need to be able to tell the agent whether or not it succeeded in its action. We’ll use a small class that holds the game, takes a list of actions, and returns the events generated by those actions.
This is an example of translating between the action space of an agent and the simulation that the agent might be running in. In this case, the “simulation” is the game.
class GameAdjudicator:
    def __init__(self, game):
        self.game = game

    def adjudicate(self, actions):
        events = []
        for action in actions:
            if action['domain'] != 'games.rings-and-poles':
                continue
            if action['name'] == 'PlaceRing':
                if self.game.place_ring(action['params']['pole'], action['params']['ring']):
                    events.append({'domain': 'games.rings-and-poles', 'name': 'RingPlaced', 'params': action['params']})
                else:
                    events.append({'domain': 'games.rings-and-poles', 'name': 'RingFailedToPlace', 'params': action['params']})
            elif action['name'] == 'MoveRing':
                if self.game.move_ring(action['params']['from'], action['params']['to'], action['params']['ring']):
                    events.append({'domain': 'games.rings-and-poles', 'name': 'RingMoved', 'params': action['params']})
                else:
                    events.append({'domain': 'games.rings-and-poles', 'name': 'RingFailedToMove', 'params': action['params']})
        return events
The GameAdjudicator class takes the actions from the agent and applies them to the game state. It returns the list of events that the agent should receive in response to the actions.
We only care about actions in the games.rings-and-poles domain, so we ignore any other actions. In more recent versions of Python, we could use a match statement to do this.
Event Loop
One last thing before we tie everything together: the event loop that will run the game. This is a simple loop that gets the actions from the agent, adjudicates them, and then draws the game. It runs until the agent has no more actions to propose.
def make_moves(drawer, running_agent, adjudicator, n):
    actions = running_agent.send_event('games.rings-and-poles', 'PlayGame', {'rings': n})
    while actions:
        events = adjudicator.adjudicate(actions)
        drawer.draw()
        actions = running_agent.send_events(events)
We can see that it’s pretty easy to work with the agent. We use send_event to send an event to the agent. This returns a list of actions that the agent wants to take.
We then process the actions and send the resulting events back to the agent with send_events, which returns the next list of actions.
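The loop can be exercised without the platform by swapping in stand-ins for the agent, the adjudicator, and the drawer. Everything except make_moves in the sketch below is hypothetical: ScriptedAgent, EchoAdjudicator, and CountingDrawer are our own names, the Done event is made up, and a real agent may return several actions at once rather than one at a time.

```python
class ScriptedAgent:
    """Hypothetical stand-in for a running agent: proposes one scripted action per call."""

    def __init__(self, actions):
        self.pending = list(actions)
        self.received = []

    def _next(self):
        return [self.pending.pop(0)] if self.pending else []

    def send_event(self, domain, name, params):
        self.received.append({'domain': domain, 'name': name, 'params': params})
        return self._next()

    def send_events(self, events):
        self.received.extend(events)
        return self._next()


class EchoAdjudicator:
    """Hypothetical adjudicator: answers every action with a made-up Done event."""

    def adjudicate(self, actions):
        return [{'domain': a['domain'], 'name': 'Done', 'params': a['params']}
                for a in actions]


class CountingDrawer:
    """Hypothetical drawer: counts frames instead of drawing."""

    def __init__(self):
        self.frames = 0

    def draw(self):
        self.frames += 1


def make_moves(drawer, running_agent, adjudicator, n):
    # Same loop as in the tutorial, repeated so the sketch runs on its own.
    actions = running_agent.send_event('games.rings-and-poles', 'PlayGame', {'rings': n})
    while actions:
        events = adjudicator.adjudicate(actions)
        drawer.draw()
        actions = running_agent.send_events(events)


script = [
    {'domain': 'games.rings-and-poles', 'name': 'PlaceRing', 'params': {'ring': 1, 'pole': 'A'}},
    {'domain': 'games.rings-and-poles', 'name': 'MoveRing', 'params': {'ring': 1, 'from': 'A', 'to': 'C'}},
]
agent = ScriptedAgent(script)
drawer = CountingDrawer()
make_moves(drawer, agent, EchoAdjudicator(), 1)
print(drawer.frames)        # 2: one frame per round of actions
print(len(agent.received))  # 3: the PlayGame event plus two result events
```

The loop ends as soon as the agent returns an empty list, which is how the script knows the agent has nothing left to propose.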
Putting it All Together
Now we have all the parts we need. Let’s put them together in a script that will play the game.
from ursactl.core.project import Project
from terminedia import Screen

# RingsAndPolesGame, GameViewer, GameAdjudicator, and make_moves are the
# definitions from the sections above, placed in this same file.


def enter_number_of_rings_to_use():
    while True:
        try:
            k = input("How many rings do you want to use? (3 to 9) ")
            n = int(k)
        except ValueError:
            continue
        # only accept a value in the range the prompt asks for
        if 3 <= n <= 9:
            return n


def play_game():
    agent = Project().agent('com.ursafrontier.player')
    n = enter_number_of_rings_to_use()
    with Screen() as scr:
        game = RingsAndPolesGame(n)
        drawer = GameViewer(scr, game)
        adjudicator = GameAdjudicator(game)
        drawer.draw()
        with agent.run() as player:
            make_moves(drawer, player, adjudicator, n)


if __name__ == '__main__':
    play_game()