This essay by Jeff Veen (found via Column Two) describes a method for cataloguing a site's static HTML files in preparation for converting to a content management system. The article also includes a downloadable copy of the spreadsheet, which is handy.
Here is an important bit to read to yourself several times before embarking on this kind of project:
After you’ve filled in a couple hundred lines of the spreadsheet, you’ll inevitably start to wonder if there is something – anything! – that can speed this process up. Surely technology can come to the rescue. Sorry. The best we’ve been able to do is enlist the help of a programmer to write us a script that will crawl a Web site and spit out the URLs it finds. And that merely ensures that we don’t miss any pages. Even with this head start, we always go through the pages by hand. A content inventory is a decidedly human task. In fact, we find that the process can often be as valuable as the final spreadsheet. If you invest the time in scouring your Web site and deconstructing every page (or at least a good selection of pages), you will end up as the uncontested expert in how it all goes together. And that’s invaluable knowledge to possess when redesigning your site.
That matches our experience of going through this process during our conversion from static files to a database-driven CMS. It was long and tedious, but you really know your content afterwards.
The spreadsheet we developed for our project also included some columns for mapping each piece of existing content to its new location, since we had redesigned our overall site structure during the conversion.
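For anyone who wants the same head start, here is a minimal sketch of the kind of URL-listing script the quote mentions. It is not Veen's script or ours, just an illustration using only the Python standard library. It follows same-host links from a starting page and writes one CSV row per URL, with a blank field for the new location. The column names, the `crawl` and `write_inventory` function names, and the `max_pages` cap are all assumptions made for the example. The rest of the spreadsheet still has to be filled in by hand.

```python
import csv
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen

# Hypothetical column names; adjust to match your own inventory spreadsheet.
COLUMNS = ["URL", "Title", "Owner", "Notes", "New location"]


class LinkParser(HTMLParser):
    """Collects the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch(url):
    """Default fetcher: download a page and return its text."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(start_url, fetch_page=fetch, max_pages=1000):
    """Breadth-first crawl of same-host links; returns a sorted URL list."""
    host = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([start_url])
    fetched = 0
    while queue and fetched < max_pages:
        url = queue.popleft()
        fetched += 1
        try:
            html = fetch_page(url)
        except Exception:
            continue  # page could not be read: keep it listed, skip its links
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop any #fragment.
            link = urldefrag(urljoin(url, href)).url
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append(link)
    return sorted(seen)


def write_inventory(urls, out):
    """Write a header row, then one row per URL with the other cells blank."""
    writer = csv.writer(out)
    writer.writerow(COLUMNS)
    for url in urls:
        writer.writerow([url] + [""] * (len(COLUMNS) - 1))
```

To use it, call `crawl` with your home page URL and pass the result to `write_inventory` along with a file opened with `newline=""`. The output opens directly in a spreadsheet program, which is where the real work starts.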