A Darren Hardy
A Michael F. Schwartz
T Essence: A Resource Discovery System Based on Semantic File Indexing
J Proceedings of the USENIX Winter Conference
C San Diego, California
D January 1993
P 361-374
K RD, resource discovery
X Available from ftp://ftp.cs.colorado.edu/pub/cs/techreports/schwartz/Essence.Conf.ps.Z or ftp://ftp.cs.colorado.edu/pub/cs/techreports/schwartz/Essence.Conf.txt.Z
X A revised and extended journal version of this paper is available as "Customized Information Extraction as a Basis for Resource Discovery".
X Note: the Essence prototype is available by anonymous FTP from ftp.cs.colorado.edu, in the directory pub/cs/distribs/essence. You can also retrieve a WAIS "src" file for Essence from this directory, which will allow you to try Essence out.
X Abstract: "Discovering different types of file resources (such as documentation, programs, and images) in the vast amount of data contained within network file systems is useful for both users and system administrators. In this paper we discuss the \fIEssence\fP resource discovery system, which exploits file semantics to index both textual and binary files. By exploiting semantics, Essence extracts keywords that summarize a file, and generates a compact yet representative index. Essence understands nested file structures (such as uuencoded, compressed, ``tar'' files), and recursively unravels such files to generate summaries for them. These features allow Essence to be used in a number of useful settings, such as anonymous FTP archives. We present measurements of our prototype and compare them to related projects, such as the Wide Area Information Servers (WAIS) system and the MIT Semantic File System (SFS). We demonstrate that Essence can index more data types, generate smaller indexes, and in some cases index data faster than these systems. Our prototype generates WAIS-compatible indexes, allowing WAIS users to take advantage of the Essence indexing methods."