COMMERCE BUSINESS DAILY ISSUE OF MAY 2, 2000 PSA#2591
Dahlgren Division, Naval Surface Warfare Center, 17320 Dahlgren Road,
Dahlgren, VA 22448-5100 58 -- AUTOMATED INFORMATION EXTRACTION SYSTEM SOL N00178-00-Q-3010 DUE
053100 POC sd13 (540) 653-7765 WEB: Dahlgren Division, Naval Surface
Warfare Center, http://www.nswc.navy.mil/supply. E-MAIL: Dahlgren
Division, Naval Surface Warfare Center, sd13@nswc.navy.mil.
DESCRIPTION: The Naval Surface Warfare Center, Dahlgren Division
(NSWCDD), in collaboration with the Joint Warfare Analysis Center
(JWAC), is soliciting research and development "white papers" for the
use of Natural Language Processing (NLP) technologies to extract
automatically the relevant information concerning interactions and
relationships among political actors and enter it in a standardized
form into a database with minimal human input. To perform their mission
to provide analysis of physical and human systems, analysts must sift
through enormous amounts of written text to extract the relevant pieces
of information. The Joint Warfare Analysis Center (JWAC) requires large
amounts of data on political actors, their interactions, and the
relationships among them. At present, a large proportion of this
information extraction (IE) is done manually. Recent advances in the
Natural Language Processing (NLP) field should allow automation of much
of the information extraction process. White papers are solicited on
the use of NLP technologies to extract automatically the relevant
information concerning interactions and relationships among political
actors and enter it in a standardized form into a database with minimal
human input. The types of information extracted and the format of the
output data should fit into the framework discussed below. The texts to
be processed will be drawn from news media sources readily available in
machine-readable format. Depending on the scope of a particular effort,
the relevant political actors might include countries, individuals,
organizations, social groups or any other entity that might be
considered capable of some politically relevant action. Once the
relevant actors are identified, we want to know about the interactions
(or "events") that occur between them; that is, "who does what to whom,
and when." The Kansas Event Data System (KEDS) and the Protocol for the
Assessment of Nonviolent Direct Action (PANDA) projects provide
examples of the types of "event data" in which we are interested.
Project descriptions and sample data can be found at the project
websites: http://www.ukans.edu/~keds and
http://hdc-www.harvard.edu/cfia/pnscs/panda.htm. We would also want to
know the relationships among the actors; for example, "John Doe is a
member of the SRD party" or "Carol is the supervisor of Ben." Finally,
we would like to capture the attributes of these entities, events, and
relationships. It is not sufficient for JWAC's purposes, however, to
simply extract entities and the phrases that describe an interaction or
relationship; we need phrases and sentence structures which have the
same meaning to be standardized in our data set (a process we will
refer to as normalization in this document). For example, the sentences
"Carol is the supervisor of Ben," "Carol is Ben's boss," and "Ben works
for Carol" would all be represented identically. Furthermore, a person
might be identified in multiple ways in a group of texts (e.g., Boris
Yeltsin, President Yeltsin, the Russian President), yet a single
code in the data set would represent this person. The extracted entity
data should be stored in a manner that allows us to change easily the
level of analysis at which an entity is represented in an output data
set without losing the information originally extracted from the text.
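
The normalization requirements described so far can be illustrated with a
minimal sketch. It is not a proposed design: the alias table, the entity
code PER_YELTSIN, the relation label SUPERVISES, the level names, and the
three regular expressions are all hypothetical, chosen only to mirror the
examples in this announcement. A real system would need far broader
linguistic coverage than fixed patterns.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical alias table: every surface form maps to a single entity code.
ALIASES = {
    "boris yeltsin": "PER_YELTSIN",
    "president yeltsin": "PER_YELTSIN",
    "the russian president": "PER_YELTSIN",
}

# Hypothetical level-of-analysis table. The entity is stored once, at its
# most specific code; coarser representations are derived only on output,
# so the originally extracted information is never discarded.
LEVELS = {
    "PER_YELTSIN": {
        "individual": "Boris Yeltsin",
        "branch": "Russian Chief Executive",
        "country": "Russia",
    },
}

# Hypothetical patterns: (regex, canonical relation, swap arguments?).
RELATION_PATTERNS = [
    (re.compile(r"^(?P<a>\w+) is the supervisor of (?P<b>\w+)$"), "SUPERVISES", False),
    (re.compile(r"^(?P<a>\w+) is (?P<b>\w+)'s boss$"), "SUPERVISES", False),
    (re.compile(r"^(?P<a>\w+) works for (?P<b>\w+)$"), "SUPERVISES", True),
]


@dataclass(frozen=True)
class Relation:
    """A normalized relationship record: relation(source, target)."""
    relation: str
    source: str
    target: str


def normalize_entity(mention: str) -> Optional[str]:
    """Map a textual mention to its entity code, or None if unknown."""
    key = " ".join(mention.lower().split())  # case-fold, collapse whitespace
    return ALIASES.get(key)


def render(code: str, level: str) -> str:
    """Represent a stored entity code at the requested level of analysis."""
    return LEVELS[code][level]


def normalize_relation(sentence: str) -> Optional[Relation]:
    """Return the canonical relation for a sentence, or None if no pattern fits."""
    text = sentence.strip().rstrip(".")
    for pattern, relation, swap in RELATION_PATTERNS:
        match = pattern.match(text)
        if match:
            a, b = match.group("a"), match.group("b")
            if swap:
                a, b = b, a
            return Relation(relation, a, b)
    return None
```

Under these assumptions, the three supervisor sentences quoted earlier all
yield the same Relation record, and a stored code can be rendered at any
level with a call such as render("PER_YELTSIN", "country") without changing
what was extracted.
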
Depending on the task at hand, we might want all references to Yeltsin
to be represented in our data set as a unique individual (e.g. Boris
Yeltsin), as a member of a specific branch of the Russian government
(e.g. Russian Chief Executive), or simply as Russia (e.g. Russia). The
normalization process will also need to be time-sensitive, allowing
different codes to represent an individual as his or her roles change
through time. In addition to information about "who does what to whom"
or the relationships between two or more entities, JWAC also needs
information about the qualities or attributes of those entities,
events, and relationships. The range of attribute information will
include descriptions of entities, such as age, strength, or titles; the
attributes of events would include location or quantities; and the
attributes of relationships would include duration, intensity, and
affect. In short, the attributes we need to collect could include any
information that describes the normalized entities, events, and
relationships we define as relevant to our analysis efforts. The
attributes themselves must go through a normalization process to ensure
that the relevant information is extracted and represented in a
consistent manner. Thus, we see three primary tasks in the IE process:
entity extraction, event extraction, and relationship extraction. Each
of these primary tasks can be divided into three subtasks: a.
identification, which refers to identifying a word or phrase as a
particular type of entity, event or relationship; b. normalization, the
process of recognizing the multiple ways a single entity, event, or
relationship might be designated in the input texts and standardizing
those representations in the output database; c. attribution, which is
the process of extracting relevant characteristics of the entities,
events, and relationships of interest and storing them in a consistent
manner. Thus, we have a total of nine distinct tasks within the IE
process, any of which, individually or in combination, could be
addressed in the proposals solicited here. We are taking a modular
approach here to the IE problem, integrating existing technologies,
automated processes yet to be developed, and human activity. Thus,
there will be numerous points of interface among source texts, software
products, the output databases, analysis tools, and the users. It is
important that considerable effort go into planning and design to keep
these interface points as user-friendly and flexible as possible. We
need to maintain the ability to add or replace modules within the IE
process easily as technical progress allows automation of tasks
performed by human analysts and the replacement of previously developed
modules. White papers should describe the task or tasks within the IE
process the offeror proposes to address, existing software products the
offeror intends to use to address JWAC's requirements, as well as
innovations the offeror intends to incorporate into the delivered IE
system. Estimated costs of the task(s) to be performed should also be
included. The white papers should include a description of the
offeror's experience on similar projects and the qualifications of
potential key contractor personnel. The white papers should not exceed
ten pages in length. Technical papers may be referenced and attached,
but a summary of the essential points should be included in the white
paper. We anticipate funding a series of contracts beginning in FY
2001. While proposals may be extensive and broad in scope, they should
be divided into tasks that can be accomplished within a
reasonable timeframe (6 to 18 months). We are looking for projects that
will improve the efficiency and effectiveness of the JWAC information
extraction process. We will evaluate proposals according to their
contribution to this goal, the expertise and experience of the offeror,
ease of integration with other products, and costs. Offerors may be
requested to provide additional information, including more detailed
proposals, to facilitate JWAC's evaluation process. "White papers"
shall be submitted to the Dahlgren Division, Naval Surface Warfare
Center, Attn: Code SD13-2, Bldg. 183, Room 106, 17320 Dahlgren Road,
Dahlgren, VA 22448-5100. "White Papers" and all related correspondence
should reference Broad Agency Announcement Number N00178-00-Q-3010.
The "white paper" should be an UNCLASSIFIED document. If desired,
multiple "white papers" addressing different areas of research and
development may be submitted. In the interest of equity, papers
exceeding the 10-page limitation will not be reviewed for further
consideration. Offerors will be requested to submit in-depth
proposal(s) should NSWCDD/JWAC deem the "white paper" of scientific and
technical merit. An invitation to submit an in-depth proposal does not
assure the offeror of a subsequent award. The Government reserves the
right to select for award any, all, part, or none of the responses
received. Multiple awards are anticipated. Technical and other
questions regarding this announcement may be submitted to
sd13@nswc.navy.mil. This announcement shall be open through 31 May
2000. Cut off for the "white papers" is 31 May 2000 at 2:00 p.m. at the
aforementioned NSWCDD address. Award(s) may be made at any time after
receipt through 30 September 2001. This notice constitutes a BAA for
NSWCDD as authorized at FAR 6.102(d)(2) and, as such, solicits the
participation of all offerors capable of satisfying the Government's
needs. This BAA should not be construed as a commitment or authorization
to incur cost in anticipation of a contract; the Government is not
bound to make an award. Posted 04/28/00 (W-SN449509). (0119) Loren Data Corp. http://www.ld.com (SYN# 0185 20000502\58-0003.SOL)