This paper develops a simple interpretation of learning how to learn: it is ordinary learning, but from point sets rather than from individual points. This is an alternative to the Bayesian viewpoint of ``learning a prior'' (Baxter, 1996b). The idea behind learning how to learn is to partition the data into separate learning tasks, learn a model for the tasks, and then apply this model to new tasks. Ordinary learning methods do the same thing, but with individual data points as the ``tasks.'' The partitioning for learning how to learn can be recovered automatically, generalizing the idea of ``task clustering'' (Thrun and O'Sullivan, 1996). Virtually all existing algorithms for learning how to learn fit naturally into this unifying framework, including those that learn a distance metric and those that learn internal representations.