Loren Data Corp.


COMMERCE BUSINESS DAILY ISSUE OF DECEMBER 18, 1995 PSA#1492

Advanced Research Projects Agency (ARPA), Contracts Management Office (CMO), 3701 North Fairfax Drive, Arlington, VA 22203-1714

A -- ARPA RESEARCH ON TEXT RETRIEVAL AND UNDERSTANDING (TIPSTER TEXT PHASE III) SOL BAA 96-08 DUE 021396 POC Mr. David Gunning, (Technical) ARPA/ITO, FAX (703) 696-2202.

BROAD AGENCY ANNOUNCEMENT (BAA96-08): The Advanced Research Projects Agency (ARPA) is soliciting proposals for continued Research and Development (R&D) activities and development of an Architecture and Capabilities Platform for the TIPSTER Text Program. The scope of this program includes the following technologies: (1) Document Detection, which includes technologies related to locating relevant texts such as retrospective search & information retrieval, automatic routing, profile development, and selective dissemination of information, (2) Information Extraction, which includes (a) automatic database filling where the data to be extracted may describe entities (people, organizations, places, etc.), events, or relationships, and (b) entity and information tagging, and (3) Summarization and abstracting of free text. The scope of the program includes foreign languages as well as English.

Proposals will be entertained as part of this BAA to meet either the R&D or the Architecture and Capabilities Platform elements of the program (see below). Proposals for R&D will be evaluated separately from proposals for the Architecture and Capabilities Platform. All proposals must be in the context of the TIPSTER program as a whole, which also includes the role of the TIPSTER Systems Engineering/Configuration Management (SE/CM) support contractor, TIPSTER Metrics-based Evaluations, and TIPSTER Demonstration Projects, as well as the elements being procured in this BAA. The program is more fully described in the Proposer Information Packet (PIP).

RESEARCH AND DEVELOPMENT ACTIVITIES.
The primary goal of the Research and Development portion of the TIPSTER program is to continue to dramatically improve the state of the art of document detection, information extraction, and summarization, as defined in the first paragraph above. Innovative approaches that lead to or enable revolutionary advances are encouraged. Proposals which will improve the state of the art by building on existing progress are also encouraged. Specifically excluded is research which primarily results in evolutionary improvement to the existing state of practice or focuses on a specific system or hardware solution. Small individual research projects (perhaps thesis-related) as well as larger efforts are welcome. The TIPSTER Program is interested in research tasks such as the following (no priority is implied by the order, and the list is not meant to exclude proposals for closely related research):

(1) Multilingual capabilities, including: Port tools and techniques working in one language (e.g., English) to work in languages other than English; Build resources (e.g., lexicons, word lists, grammar) in languages other than English; Build tools that assist in porting resources to new languages or in extending language resources; Do query development in English and retrieval against multilingual collections; Summarize in English documents in other languages.

(2) Detection, including: Increase accuracy (including improvement of search algorithms, ranking to distinguish fine-grained differences between topics, cascaded searching (e.g.,
successively more precise searches), interactive searching, etc.); Merge results from different document stores and different search engines; Support fine-tuning to specific domains via indexing or querying; Work on topic clustering (including automated structuring of large text corpora); Apply to search of Internet data; Work on degraded data, such as from OCR, poorly spelled or garbled messages, or transcribed speech.

(3) Extraction, including: Develop a common language for pattern specification (different tools now have different ways of expressing patterns; with a common language, different tools could read each other's patterns); Work to make extraction portable to a new subject domain at low cost (e.g., by developing tools to make it easier for a system administrator or a superuser to tune the system to a new domain); Raise accuracy (e.g., by improving coreference resolution); Automatically compare and fuse information from multiple text sources (e.g., by making use of the existing database that the extraction is filling to assist with the extraction from new documents, or by comparing and fusing information from more than one document); and Work on degraded data, such as from OCR, poorly spelled or garbled messages, or transcribed speech.

(4) Summarizing, including: Produce text document summaries in reasonable English; Produce metadata (e.g.,
concise descriptions of large collections and other summaries of large text corpora); Produce summaries of formatted data on a specific topic; Produce a single summary of multiple documents, including multiple documents on a specific topic.

(5) Cross-Technology, including: Work on identifying duplicate documents and repetitive information; Increase the sharing of information between detection and extraction; Experiment with fusion of extracted information derived from different media and different types of sources; Detect intent, argument and feelings expressed directly or indirectly in text; Work on degraded text (e.g., OCR data, poorly spelled messages or automatically transcribed speech).

DEVELOPMENT OF AN ARCHITECTURE AND CAPABILITIES PLATFORM. Areas of interest under this heading include (no priority is implied by the order, and the list is not meant to exclude proposals for other closely related activities):

(1) Build a CORBA-compliant capabilities platform, instantiating the complete architecture and integrating multiple modules and components. Provide technical support for researchers and developers who may use this platform for testing the compliance of their modules and components or for experimental purposes. Provide a reasonable maintenance plan with the assumption that the platform may be made accessible via Internet.

(2) Take specific modules and components (e.g., a query builder, a name tagger, porting tools, etc.) and make them robust and compliant with the architecture.

(3) Support the extension of the architecture to include ways for TIPSTER technologies to interact with machine translation, speech, OCR, images, and user interfaces, or with larger information systems.
This extension must involve cooperation among researchers and developers of these technologies, government users, the TIPSTER Architecture Committee, the TIPSTER SE/CM support contractor, and other involved parties.

(4) Manage the completion of the TIPSTER Architecture in as-yet-uncompleted areas. These areas include: developing a common language for pattern specification, developing a common general lexicon, and developing a specification for annotations (how all tools will mark parts of speech, etc.), for common applications and users. These developments must involve cooperation among relevant researchers and developers, the TIPSTER Architecture Committee, and the TIPSTER SE/CM.

CONDUCT OF THE PROGRAM. The activities described are to be conducted over a period of three years (FY1997-FY1999) with anticipated funding of $4-5 million per year. The range of awards in R&D is expected to be up to a maximum of $300K per year. Proposals need not cover all three years, and are expected to focus on particular aspects of the program where the proposer has expertise, rather than to try to cover all aspects of the program.

PROPOSAL PREPARATION AND SUBMISSION. Proposals should be prepared in accordance with the instructions and format contained in the Proposer Information Packet (PIP). Non-conforming proposals may be rejected without review. Fourteen (14) copies of each proposal should be addressed to ARPA/ITO, ATTN: BAA 96-08, 3701 N. Fairfax Drive, Arlington, VA 22203-1714. Proposals may be reviewed and acted on as they arrive. However, no proposal will be accepted after 4:00 p.m. local time, Tuesday, February 13, 1996. Restrictive notices notwithstanding, proposals will be handled for administrative purposes by a support contractor, and FFRDC employees may participate in the review process. Any administrative questions or correspondence must be sent to one of the contact addresses below by COB, February 6, 1996.
This CBD notice in conjunction with the PIP constitutes the BAA as contemplated in FAR 6.102(d)(2). No additional written information is available, nor will a formal RFP or other solicitation regarding this announcement be issued. Requests for same will be disregarded. A bidder's conference will be held January 5, 1996. Send names, addresses (including e-mail and fax), phone numbers and affiliation of attendees to the contact address below by COB December 22, 1995. Attendance is limited to one individual per organization without prior arrangement.

CONTACT ADDRESSES. The contact addresses for this BAA are: Electronic mail: baa96-08@arpa.mil. FAX: (703) 696-2202 (Addressed to BAA 96-08). Mail: ATTN: BAA 96-08, ARPA/ITO, 3701 N. Fairfax Drive, Arlington, VA 22203-1714. The PIP is available at http://www.ito.arpa.mil. (Notes: Electronic mail and fax are preferred for administrative questions and correspondence. PROPOSALS SENT BY FAX OR EMAIL WILL BE DISREGARDED). ARPA will use electronic mail and fax preferentially for correspondence regarding BAA96-08.

EVALUATION OF PROPOSALS. Evaluation of proposals will be accomplished through a scientific review process using the following separate criteria for Research and Development proposals and for Architecture and Capabilities Platform proposals, listed in descending order of relative importance. The criteria for Research and Development proposals are: (1) technical merit of proposed solution, including innovativeness of the proposed approach and the potential to advance the state of the art, (2) demonstrated understanding of the text processing problem which is proposed to be solved, (3) offeror's capabilities, including experience solving related problems and (where appropriate) the ability to work within the structure of an existing architecture, (4) cost realism, and (5) relevance to agency mission.
The criteria for Architecture and Capabilities Platform proposals are: (1) technical merit of the proposal, including the work's compliance with the architecture, (2) relationship of the proposed work to published R&D advances made under the TIPSTER Text Program and demonstrated understanding and ability to promote the goals of the TIPSTER Program, (3) offeror's capabilities, including experience integrating complex technologies and working in a collaborative mode, (4) cost realism, including costs for long-term operations, license fees, and the value of continuing use of deliverable software, etc., and (5) relevance to agency mission. Individual proposal evaluations will be based on acceptability or unacceptability without regard to other proposals submitted under the announcement. ARPA will select for award a subset of the acceptable proposals in order to construct a balanced program responsive to its needs. The resulting awards will be handled through agents selected by ARPA. The Government reserves the right to select for award all, some, none or portions of the proposals received. All responsible sources capable of satisfying the Government's needs may submit proposals which shall be considered by ARPA. HBCUs and MIs are encouraged to submit proposals and to join other organizations in submitting proposals. However, no portion of this BAA will be set aside specifically for HBCU and MI participation due to the impracticality of reserving discrete or severable areas of text-processing research for exclusive competition among these entities. Proposals selected for funding may result in a contract, grant or other agreement depending upon the nature of the work proposed, the required degree of interaction between the parties, and other factors. (0348)

Loren Data Corp. http://www.ld.com (SYN# 0001 19951215\A-0001.SOL)

