|
DGPort Concept Space:
DGPort Concept Space is a "thesaurus" of terms relevant to the
Digital Government domain. It was created by the Artificial
Intelligence Lab of the MIS department at the University of Arizona.
This thesaurus was created using a text mining approach to extract
terminologies and their weighted relationships from a collection
of documents pertaining to Digital Government. The primary
purpose of the DGPort Concept Space is to provide users of the
DGPort search system a list of keywords that are highly related
to the user’s original search term and which may subsequently
assist the user in finding the information he or she is seeking.
The current version of the DGPort Concept Space was created from
a collection of 300,000 web documents. As the DGPort portal expands,
however, the DGPort Concept Space will be broadened to reflect the
new documents added to DGPort. In the future, the DGPort Concept Space
will reflect the relationships between keywords contained in
over 1 million web documents.
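As a rough illustration of how weighted term relationships can be derived from a document collection, the following minimal sketch uses simple document co-occurrence. The weighting formula, the function names (build_concept_space, related_terms), and the sample data are assumptions made for illustration; they are not the AI Lab's actual text mining method.

```python
from collections import Counter
from itertools import permutations


def build_concept_space(docs):
    """Map each term to {related_term: weight} using document co-occurrence.

    Assumed weighting (illustrative only):
    weight(a -> b) = (# docs containing both a and b) / (# docs containing a)
    """
    doc_freq = Counter()
    pair_freq = Counter()
    for terms in docs:
        unique = set(terms)
        doc_freq.update(unique)
        # Ordered pairs, so the relationship a -> b can differ from b -> a.
        pair_freq.update(permutations(sorted(unique), 2))
    space = {}
    for (a, b), n_ab in pair_freq.items():
        space.setdefault(a, {})[b] = n_ab / doc_freq[a]
    return space


def related_terms(space, term, k=5):
    """Return up to k terms most strongly related to `term`."""
    neighbors = space.get(term, {})
    # Highest weight first; ties are broken alphabetically.
    return sorted(neighbors, key=lambda t: (-neighbors[t], t))[:k]


# Hypothetical sample: each document is reduced to its extracted terms.
docs = [
    ["e-government", "privacy", "policy"],
    ["e-government", "privacy"],
    ["e-government", "voting"],
    ["privacy", "policy"],
]
space = build_concept_space(docs)
suggestions = related_terms(space, "e-government", k=2)
print(suggestions)
```

In this sketch, a user who searches for "e-government" would be offered the terms that most often appear in the same documents.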
DGPort Search Engine:
The DGPort search engine is a vertical search engine created specifically
for the domain of Digital Government. The current prototype of the DGPort
search engine is designed to search a collection of over 300,000 quality
web documents pertinent to Digital Government researchers. The current
document collection was generated using the AI Lab’s Search Engine Toolkit
and Meta Search module. In conjunction with these tools, an advanced page
collecting methodology is used to ensure the quality and coverage of the
collection. Furthermore, a content analysis algorithm and link analysis
algorithm are used to rank the search results as well as filter out unrelated
pages. Microsoft SQL Server is used as a backend database server.
Future plans for DGPort include the expansion of the DGPort document
collection to over 1 million documents as well as the inclusion of the
Stanford .gov collection - a collection of almost 17 million web documents
relating to over 5,500 government web sites.
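One way to picture how a content analysis score and a link analysis score might be combined to rank results and filter out unrelated pages is sketched below. The score fields, the weight, the threshold, and the function name rank_results are all hypothetical; the actual DGPort algorithms are not described here.

```python
def rank_results(pages, content_weight=0.7, min_content=0.1):
    """Filter out pages with low content relevance, then rank the rest.

    Each page is a dict with assumed fields:
      "content": content analysis score in [0, 1]
      "link":    link analysis score in [0, 1]
    """
    # Filtering step: drop pages whose content score marks them as unrelated.
    kept = [p for p in pages if p["content"] >= min_content]

    def combined(page):
        # Weighted sum of the two scores (an assumed combination rule).
        return (content_weight * page["content"]
                + (1 - content_weight) * page["link"])

    return sorted(kept, key=combined, reverse=True)


# Hypothetical pages with made-up scores.
pages = [
    {"url": "page-a", "content": 0.9, "link": 0.2},
    {"url": "page-b", "content": 0.5, "link": 0.9},
    {"url": "page-c", "content": 0.05, "link": 1.0},
]
ranked = [p["url"] for p in rank_results(pages)]
print(ranked)
```

Here "page-c" is removed despite its high link score, because its content score falls below the assumed relevance threshold.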
Meta Search:
The DGPort search system supports meta searching
of quality sites related to Digital Government.
Meta searching sends a query to multiple search engines,
literature databases, and online journals, and collates only
the highest-ranking subset from each data source, thus increasing precision.
Meta search also provides a simple, uniform user interface that promises
significant advances in coping with information overload and low
precision.
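The collation step described above can be sketched as follows. The data sources are stand-in functions that return ranked lists of result identifiers; their names, the top_k parameter, and the merging rule are assumptions for illustration, not the DGPort Meta Search module itself.

```python
def meta_search(query, sources, top_k=2):
    """Send `query` to every source and keep only each source's top_k results.

    `sources` maps a source name to a callable returning results ordered
    from best to worst. Duplicates keep their best (lowest) rank.
    """
    best = {}
    for name, search in sources.items():
        for rank, url in enumerate(search(query)[:top_k], start=1):
            if url not in best or rank < best[url]["rank"]:
                best[url] = {"url": url, "source": name, "rank": rank}
    # Merge into one list: best rank first, ties broken by identifier.
    return sorted(best.values(), key=lambda r: (r["rank"], r["url"]))


# Hypothetical stand-ins for a search engine and a literature database.
def engine_a(query):
    return ["a1", "shared", "a3"]


def engine_b(query):
    return ["shared", "b2", "b3"]


sources = {"engine_a": engine_a, "engine_b": engine_b}
results = meta_search("digital government", sources, top_k=2)
urls = [r["url"] for r in results]
print(urls)
```

Results ranked below top_k in a source ("a3" and "b3" here) never reach the merged list, which is how the approach trades recall for precision.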
Document Categorization and Visualization:
An ideal Information Retrieval (IR) system should categorize retrieved
documents automatically and give the user rapid access to various aspects
of the subject of interest. DGPort pursues this goal with two
tools that categorize returned documents: the Document Organizer
and the Self Organizing Map. The Document Organizer classifies
the documents retrieved from a meta search into different
categories based on the occurrence of keywords extracted from the documents.
In addition, the Self Organizing Map visualization tool helps users
interpret the collection of returned documents as a whole.
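A minimal sketch of keyword-based categorization in the spirit of the Document Organizer is shown below. It assumes the keywords have already been extracted and simply groups documents by which keywords occur in them; the function name organize, the "uncategorized" bucket, and the sample documents are hypothetical.

```python
from collections import defaultdict


def organize(documents, keywords):
    """Group document titles by the keywords that occur in their text.

    A document may land in several categories; documents matching no
    keyword go into an assumed "uncategorized" bucket.
    """
    categories = defaultdict(list)
    for title, text in documents.items():
        lowered = text.lower()
        # Simple case-insensitive substring match on each keyword.
        matched = [kw for kw in keywords if kw in lowered]
        for kw in matched:
            categories[kw].append(title)
        if not matched:
            categories["uncategorized"].append(title)
    return dict(categories)


# Hypothetical retrieved documents and extracted keywords.
documents = {
    "doc1": "Online voting and privacy",
    "doc2": "Privacy policy for agencies",
    "doc3": "Budget report",
}
keywords = ["privacy", "voting"]
folders = organize(documents, keywords)
print(folders)
```

Each keyword acts as a folder, so a user can jump directly to the aspect of the topic of interest.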
|