Ever had a site you were about to reference, only to find on checking that it is no longer online? This is an increasingly common problem in a culture that is actively producing digital artefacts. If you have the URL, you can search the Internet Archive, which has been archiving parts of the web since 1996. The archive aims to preserve digital documents and make them available to researchers, historians, and scholars. Alexa Internet has been crawling the web since 1996, and the result is a massive online library.
You can search for sites that are no longer live, as these may have been archived. In some cases you can trace back through a number of versions of a site. Academics and writers can use and reference the Internet Archive as they would any other site. Artists whose sites have been archived can use the archive to identify copyrighted work they claim, should any infringement occur.
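As a rough illustration of searching by URL, the sketch below queries the Wayback Machine availability endpoint at archive.org, which returns the closest archived snapshot as JSON. The function names (build_query, closest_snapshot, lookup) are made up for this example, and the response fields are read defensively in case a site has no snapshot at all.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Public availability endpoint of the Wayback Machine.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def build_query(url, timestamp=None):
    """Build the lookup URL; timestamp is an optional YYYYMMDD string."""
    params = {"url": url}
    if timestamp:
        params["timestamp"] = timestamp
    return AVAILABILITY_ENDPOINT + "?" + urlencode(params)


def closest_snapshot(response):
    """Return the archived URL from a decoded JSON reply, or None."""
    closest = response.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"]
    return None


def lookup(url, timestamp=None):
    """Ask the archive for the snapshot closest to the given date."""
    with urlopen(build_query(url, timestamp), timeout=10) as reply:
        return closest_snapshot(json.load(reply))
```

Calling lookup("example.com", "20060101") would return the address of the snapshot nearest to 1 January 2006, or None if nothing was archived. Repeating the call with different timestamps is one way to trace back through earlier versions of a site.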
You can request that the Alexa bot crawl your site, and usually within 8 weeks of submission your site will be indexed. Although the Internet Archive contains approximately 1 petabyte of data and is growing by 20 terabytes a month, not all sites have been archived. This may be because the automated crawlers missed them or are unaware of the site. More recently, dynamic content has also caused problems: the Internet Archive is an automated system, and some dynamic sites break apart when archived. This happens when a page contains interactive elements that require the originating server. Elements such as server-side image maps, forms, and some JavaScript all cause problems and will prevent the functionality of the site from being preserved. So there is a lesson here. Although the Internet Archive is able to store some dynamic content, for the moment, if you are interested in having your site archived, use HTML: firstly because it is the easiest for automated systems to find, and also because it is the simplest to archive.
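If you want a quick check of your own pages before submitting them, something like the following sketch can flag the elements mentioned above. It uses Python's standard html.parser module; the class and function names (ArchiveRiskScanner, scan) are invented for this example, and the check is deliberately crude. It cannot tell which scripts would actually break in an archived copy, so it simply reports every script tag.

```python
from html.parser import HTMLParser


class ArchiveRiskScanner(HTMLParser):
    """Collects elements that tend to depend on the originating server."""

    def __init__(self):
        super().__init__()
        self.risks = []

    def handle_starttag(self, tag, attrs):
        names = {name for name, _ in attrs}
        if tag == "form":
            # Forms submit back to the original server.
            self.risks.append("form")
        elif tag == "img" and "ismap" in names:
            # The ismap attribute marks a server-side image map.
            self.risks.append("server-side image map")
        elif tag == "script":
            # Only some JavaScript causes trouble; flag it for review.
            self.risks.append("script")


def scan(html):
    """Return the risky elements found in a page, in document order."""
    scanner = ArchiveRiskScanner()
    scanner.feed(html)
    scanner.close()
    return scanner.risks
```

An empty list from scan(page_source) suggests the page is plain HTML and should archive cleanly; anything else is worth a second look.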

