There are registries of digital collections that make their content available through OAI-PMH or some other mechanism. But I have yet to see a registry of digital collections that simply make their resources available on the Web the way the Web works. Sitemaps are the main mechanism for listing Web resources for automated crawlers. This is an initial attempt to collect information on digital collections that make sitemaps available.
For more information on why you should open up your digital collection site to crawlers, see the Robots Are Our Friends Campaign page:
http://wiki.code4lib.org/index.php/Robots_Are_Our_Friends
Many collections are still not discoverable on the open Web. I thought that one way to encourage folks to make their collections accessible to crawlers was to start a list of all digital collections that make a sitemap available, and then make that list available to anyone who wants to do research on those digital collections or create a new service around the information.
Some institutions will have to fill out this form more than once because they have more than one sitemap covering their various digital collections. If you have a sitemap index that lists all of those sitemaps, submitting the index once is enough.
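If you are not sure whether the file you have is a sitemap index or a regular sitemap, the root element tells you: an index uses `sitemapindex` and a regular sitemap uses `urlset`, both in the sitemaps.org 0.9 namespace. Here is a rough Python sketch that reports which kind a file is and lists the URLs in its `loc` elements. The function name `read_sitemap` and the example.org URLs are made up for illustration, and the sketch assumes you already have the XML as a string.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol (version 0.9).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def read_sitemap(xml_text):
    """Return (kind, locs) for a sitemap document given as a string.

    kind is the root element name without its namespace, for example
    "sitemapindex" or "urlset". locs is the text of every <loc> element.
    (Hypothetical helper, not part of any library.)
    """
    root = ET.fromstring(xml_text)
    # root.tag looks like "{namespace}sitemapindex"; keep the local name.
    kind = root.tag.split("}")[-1]
    locs = [
        el.text.strip()
        for el in root.iter(f"{{{SITEMAP_NS}}}loc")
        if el.text
    ]
    return kind, locs


# Made-up example of an index that points at two collection sitemaps.
EXAMPLE_INDEX = """<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>http://example.org/collection-a/sitemap.xml</loc></sitemap>
  <sitemap><loc>http://example.org/collection-b/sitemap.xml</loc></sitemap>
</sitemapindex>"""

kind, locs = read_sitemap(EXAMPLE_INDEX)
print(kind)
for loc in locs:
    print(loc)
```

If the kind comes back as `sitemapindex`, that one URL covers every sitemap it lists, so a single form submission should do.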
This data will be made available to anyone who asks for it and may be opened up to the world. It may also be incorporated into a future site that is publicly accessible on the Web. To make it explicit, the data will carry a CC0 license:
http://creativecommons.org/publicdomain/zero/1.0/
If you have any questions about this form or the data, please contact Jason Ronallo (jronallo@gmail.com).