So I’ve got this web-based image gallery. The images in it are organized with simple tags (there’s a many-to-many relationship between images and tags). Every image has at least one tag, and almost all images have several. When a user wants to look up an image in the gallery, they select one or more tags and are shown thumbnails for every image that has at least those tags. At first the thumbnails were simply sorted alphabetically by file name, and this worked fine while the collection was modest, but that didn’t last. Soon the collection grew to the point where most queries returned forty or more thumbnails, and it became pretty challenging to find the image you wanted in the pile. The search was even trickier if you had only a general idea of what you were looking for and couldn’t select enough tags to narrow things down.
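To make the lookup rule concrete, here is a minimal Python sketch of "at least those tags" matching with the original alphabetical ordering. The data, the `IMAGES` dictionary, and the `find_images` function are all hypothetical and only for illustration; the real gallery presumably stores this in a database rather than in memory.

```python
# Hypothetical sample data: file name -> set of tags on that image.
# Each image has at least one tag, and a tag can appear on many images.
IMAGES = {
    "lake.jpg": {"outdoor", "water"},
    "beach.jpg": {"outdoor", "water", "summer"},
    "cabin.jpg": {"outdoor", "winter"},
}


def find_images(images, selected_tags):
    """Return file names of images that carry every selected tag.

    An image matches when the selected tags are a subset of its own tags,
    so extra tags on the image never disqualify it. Results are sorted
    alphabetically by file name, like the gallery's first version.
    """
    selected = set(selected_tags)
    return sorted(name for name, tags in images.items() if selected <= tags)


if __name__ == "__main__":
    print(find_images(IMAGES, ["outdoor", "water"]))
```

Selecting fewer tags can only widen the result set, which is why a vague query ends up with a big pile of thumbnails.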
To improve the usability of this thing I had to come up with a way to arrange the image thumbnails in a particular order such that images with similar characteristics were placed near each other. This would make it much, much easier to visually scan the mass of thumbnails. In this post I’ll elaborate on the technical details of the problem and how I developed the algorithm I’m currently using.