Loren Data Corp.


COMMERCE BUSINESS DAILY ISSUE OF MAY 2, 2000 PSA#2591

Dahlgren Division, Naval Surface Warfare Center, 17320 Dahlgren Road, Dahlgren, VA 22448-5100

58 -- AUTOMATED INFORMATION EXTRACTION SYSTEM SOL N00178-00-Q-3010 DUE 053100 POC sd13 (540) 653-7765 WEB: Dahlgren Division, Naval Surface Warfare Center, http://www.nswc.navy.mil/supply. E-MAIL: Dahlgren Division, Naval Surface Warfare Center, sd13@nswc.navy.mil. DESCRIPTION: The Naval Surface Warfare Center, Dahlgren Division (NSWCDD), in collaboration with the Joint Warfare Analysis Center (JWAC), is soliciting research and development "white papers" for the use of Natural Language Processing (NLP) technologies to extract automatically the relevant information concerning interactions and relationships among political actors and enter it in a standardized form into a database with minimal human input. To perform their mission to provide analysis of physical and human systems, analysts must sift through enormous amounts of written text to extract the relevant pieces of information. The Joint Warfare Analysis Center (JWAC) requires large amounts of data on political actors, their interactions, and the relationships among them. At present, a large proportion of this information extraction (IE) is done manually. Recent advances in the Natural Language Processing (NLP) field should allow automation of much of the information extraction process. White papers are solicited on the use of NLP technologies to extract automatically the relevant information concerning interactions and relationships among political actors and enter it in a standardized form into a database with minimal human input. The types of information extracted and the format of the output data should fit into the framework discussed below. The texts to be processed will be drawn from news media sources readily available in machine-readable format. Depending on the scope of a particular effort, the relevant political actors might include countries, individuals, organizations, social groups or any other entity that might be considered capable of some politically relevant action. 
Once the relevant actors are identified, we want to know about the interactions (or "events") that occur between them; that is, "who does what to whom, and when." The Kansas Event Data System (KEDS) and the Protocol for the Assessment of Nonviolent Direct Action (PANDA) projects provide examples of the types of "event data" in which we are interested. Project descriptions and sample data can be found at the project websites: http://www.ukans.edu/~keds and http://hdc-www.harvard.edu/cfia/pnscs/panda.htm. We also want to know the relationships among the actors; for example, "John Doe is a member of the SRD party" or "Carol is the supervisor of Ben." Finally, we would like to capture the attributes of these entities, events, and relationships. It is not sufficient for JWAC's purposes, however, simply to extract entities and the phrases that describe an interaction or relationship; we need phrases and sentence structures that have the same meaning to be standardized in our data set (a process we will refer to as normalization in this document). For example, the sentences "Carol is the supervisor of Ben," "Carol is Ben's boss," and "Ben works for Carol" would all be represented identically. Furthermore, a person might be identified in multiple ways in a group of texts (e.g. Boris Yeltsin, President Yeltsin, the Russian President), yet a single code in the data set would represent this person. The extracted entity data should be stored in a manner that allows us to change easily the level of analysis at which an entity is represented in an output data set without losing the information originally extracted from the text. Depending on the task at hand, we might want all references to Yeltsin to be represented in our data set as a unique individual (e.g. Boris Yeltsin), as a member of a specific branch of the Russian government (e.g. Russian Chief Executive), or simply as a country (e.g. Russia). 
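The normalization described above can be illustrated with a minimal Python sketch. It is not a required or proposed design. The entity code PER_YELTSIN, the relation code SUPERVISES, the table layouts, and every function name are assumptions introduced for illustration only, and the hand-written patterns stand in for whatever NLP technology an offeror might actually propose.

```python
import re
from dataclasses import dataclass

# Hypothetical alias table: every surface form maps to one entity code.
ALIASES = {
    "boris yeltsin": "PER_YELTSIN",
    "president yeltsin": "PER_YELTSIN",
    "the russian president": "PER_YELTSIN",
}

# Hypothetical roll-up table: one code, several levels of analysis.
LEVELS = {
    "PER_YELTSIN": {
        "individual": "Boris Yeltsin",
        "office": "Russian Chief Executive",
        "country": "Russia",
    },
}

@dataclass(frozen=True)
class Mention:
    text: str  # surface form as extracted; kept so nothing is lost
    code: str  # normalized entity code

def normalize(text: str) -> Mention:
    """Map a surface form to its entity code, keeping the original text."""
    return Mention(text=text, code=ALIASES[text.strip().lower()])

def represent(mention: Mention, level: str) -> str:
    """Render a mention at the requested level of analysis."""
    return LEVELS[mention.code][level]

# Hypothetical patterns: (regex, (supervisor group, subordinate group)).
PATTERNS = [
    (re.compile(r"^(\w+) is the supervisor of (\w+)$"), (1, 2)),
    (re.compile(r"^(\w+) is (\w+)'s boss$"), (1, 2)),
    (re.compile(r"^(\w+) works for (\w+)$"), (2, 1)),
]

def normalize_relation(sentence: str):
    """Return one canonical (relation, supervisor, subordinate) triple, or None."""
    s = sentence.strip().rstrip(".")
    for pattern, (sup, sub) in PATTERNS:
        match = pattern.match(s)
        if match:
            return ("SUPERVISES", match.group(sup), match.group(sub))
    return None
```

Because each Mention keeps its original text next to the code, the level of analysis can be changed at output time without re-reading the source texts.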
The normalization process will also need to be time-sensitive, allowing different codes to represent an individual as his or her roles change through time. In addition to information about "who does what to whom" or the relationships between two or more entities, JWAC also needs information about the qualities or attributes of those entities, events, and relationships. The range of attribute information will include descriptions of entities, such as age, strength, or titles; the attributes of events would include location or quantities; and the attributes of relationships would include duration, intensity, and affect. In short, the attributes we need to collect could include any information that describes the normalized entities, events, and relationships we define as relevant to our analysis efforts. The attributes themselves must go through a normalization process to ensure that the relevant information is extracted and represented in a consistent manner. Thus, we see three primary tasks in the IE process: entity extraction, event extraction, and relationship extraction. Each of these primary tasks can be divided into three subtasks: a. identification, which refers to identifying a word or phrase as a particular type of entity, event, or relationship; b. normalization, the process of recognizing the multiple ways a single entity, event, or relationship might be designated in the input texts and standardizing those representations in the output database; c. attribution, which is the process of extracting relevant characteristics of the entities, events, and relationships of interest and storing them in a consistent manner. Thus, we have a total of nine distinct tasks within the IE process, any of which, individually or in combination, could be addressed in the proposals solicited here. We are taking a modular approach here to the IE problem, integrating existing technologies, automated processes yet to be developed, and human activity. 
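Two points above lend themselves to a short Python sketch: time-sensitive normalization and the nine-task breakdown. It is an illustration under stated assumptions, not a specification. The role title, the person codes PER_JOHN_DOE and PER_JANE_ROE, and all dates are invented for illustration, and the function and table names are hypothetical.

```python
from datetime import date
from itertools import product
from typing import Optional

# Hypothetical role history: (title, holder code, start, end).
# The end date is exclusive; None means the role is still held.
# All names and dates are invented for illustration.
ROLE_HISTORY = [
    ("srd party leader", "PER_JOHN_DOE", date(1996, 1, 1), date(1999, 1, 1)),
    ("srd party leader", "PER_JANE_ROE", date(1999, 1, 1), None),
]

def resolve_title(title: str, on: date) -> Optional[str]:
    """Return the code of whoever held the title on the text's date."""
    key = title.strip().lower()
    for role, code, start, end in ROLE_HISTORY:
        if role == key and start <= on and (end is None or on < end):
            return code
    return None

# The three primary tasks crossed with the three subtasks give nine tasks.
PRIMARY_TASKS = ("entity", "event", "relationship")
SUBTASKS = ("identification", "normalization", "attribution")
TASKS = [f"{p} {s}" for p, s in product(PRIMARY_TASKS, SUBTASKS)]
```

Resolving a title against the publication date of the source text, rather than the processing date, is what lets the same phrase map to different individuals as roles change.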
Thus, there will be numerous points of interface among source texts, software products, the output databases, analysis tools, and the users. It is important that considerable effort go into planning and design to keep these interface points as user-friendly and flexible as possible. We need to maintain the ability to add or replace modules within the IE process easily as technical progress allows automation of tasks performed by human analysts and the replacement of previously developed modules. White papers should describe the task or tasks within the IE process the offeror proposes to address, existing software products the offeror intends to use to address JWAC's requirements, as well as innovations the offeror intends to incorporate into the delivered IE system. Estimated costs of the task(s) to be performed should also be included. The white papers should include a description of the offeror's experience on similar projects and the qualifications of potential key contractor personnel. The white papers should not exceed ten pages in length. Technical papers may be referenced and attached, but a summary of the essential points should be included in the white paper. We anticipate funding a series of contracts beginning in FY 2001. While proposals may be extensive and broad in scope, they should be divided into tasks that can each be accomplished within a reasonable timeframe (6 to 18 months). We are looking for projects that will improve the efficiency and effectiveness of the JWAC information extraction process. We will evaluate proposals according to their contribution to this goal, the expertise and experience of the offeror, ease of integration with other products, and costs. Offerors may be requested to provide additional information, including more detailed proposals, to facilitate JWAC's evaluation process. "White papers" shall be submitted to the Dahlgren Division, Naval Surface Warfare Center, Attn: Code SD13-2, Bldg. 
183, Room 106, 17320 Dahlgren Road, Dahlgren, VA 22448-5100. "White papers" and all related correspondence should reference Broad Agency Announcement Number N00178-00-Q-3010. The "white paper" should be an UNCLASSIFIED document. If desired, multiple "white papers" addressing different areas of research and development may be submitted. In the interest of equity, papers exceeding the 10-page limitation will not be reviewed for further consideration. Offerors will be requested to submit in-depth proposal(s) should NSWCDD/JWAC deem the "white paper" of scientific and technical merit. An invitation to submit an in-depth proposal does not assure the offeror of a subsequent award. The Government reserves the right to select for award any, all, part, or none of the responses received. Multiple awards are anticipated. Technical and other questions regarding this announcement may be submitted to sd13@nswc.navy.mil. This announcement shall be open through 31 May 2000. The cutoff for the "white papers" is 31 May 2000 at 2:00 p.m. at the aforementioned NSWCDD address. Award(s) may be made at any time after receipt through 30 September 2001. This notice constitutes a BAA for NSWCDD as authorized at FAR 6.102(d)(2) and, as such, solicits the participation of all offerors capable of satisfying the Government's needs. This BAA should not be construed as a commitment or authorization to incur costs in anticipation of a contract; the Government is not bound to make an award. Posted 04/28/00 (W-SN449509). (0119)

Loren Data Corp. http://www.ld.com (SYN# 0185 20000502\58-0003.SOL)

