TI An Information Retrieval Approach for Automatically Constructing Software Libraries LT CUCS-049-90 OR COLUM YR 1990 AU Yoelle S. Maarek AU Daniel M. Berry AU Gail E. Kaiser AB Although software reuse presents clear advantages for programmer productivity and code reliability, it is not practiced enough. One of the reasons for the only moderate success of reuse is the lack of software libraries that facilitate the actual locating and understanding of reusable components. This paper describes a technology for automatically assembling large software libraries that promote software reuse by helping the user locate the components closest to her/his needs. Software libraries are automatically assembled from a set of unorganized components by using information retrieval techniques. The construction of the library is done in two steps. First, attributes are automatically extracted from natural language documentation by using a new indexing scheme based on the notions of lexical affinities and quantity of information. Then, a hierarchy for browsing is automatically generated using a clustering technique that draws only on the information provided by the attributes. Thanks to the free-text indexing scheme, tools following this approach can accept free-style natural language queries. This technology has been implemented in the {\sc Guru} system, which has been applied to construct an organized library of {\sc Aix} utilities. An experiment was conducted in order to evaluate the retrieval effectiveness of {\sc Guru} as compared to {\sc InfoExplorer}, a hypertext library system for {\sc Aix} 3 on the IBM RISC System/6000 series. We followed the usual evaluation procedure used in information retrieval, based upon recall and precision measures, and determined that our system performs 15\% better on a random test set, while being much less expensive to build than {\sc InfoExplorer}.
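
The recall and precision measures mentioned in the abstract can be illustrated with a minimal sketch. It uses the standard information retrieval definitions (precision is the fraction of retrieved components that are relevant, recall is the fraction of relevant components that are retrieved). The function name `precision_recall` and the sample component names are hypothetical and chosen only for illustration; they are not taken from the {\sc Guru} experiment, and the sketch does not reproduce the report's evaluation procedure or results.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a single query.

    retrieved: components returned by the retrieval system (hypothetical input)
    relevant:  components judged relevant to the query (hypothetical input)
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = len(retrieved_set & relevant_set)  # relevant components that were retrieved

    # Guard against empty sets to avoid division by zero.
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


if __name__ == "__main__":
    # Made-up query result, for illustration only.
    retrieved = ["grep", "egrep", "sed", "ls"]
    relevant = ["grep", "egrep", "fgrep"]
    precision, recall = precision_recall(retrieved, relevant)
    print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this made-up case, two of the four retrieved components are relevant and two of the three relevant components are retrieved. Comparing two systems would then amount to averaging such figures over the same set of test queries.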