Digital Libraries Projects - DGPort
DGPort - Digital Government Portal
DGPort is an online search system designed to provide efficient and precise searching of web documents that may be relevant to researchers in the Digital Government domain.
See a PDF of the poster presentation.
DGPort Concept Space:
The DGPort Concept Space is a "thesaurus" of terms relevant to the Digital Government domain, created by the Artificial Intelligence Lab of the MIS department at the University of Arizona. The thesaurus was built using a text mining approach that extracts terms and their weighted relationships from a collection of documents pertaining to Digital Government. Its primary purpose is to give users of the DGPort search system a list of keywords that are highly related to their original search term and that may help them find the information they are seeking.
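The idea of mining weighted term relationships from a document collection can be illustrated with a minimal sketch. It assumes each document has already been reduced to a list of terms and uses a simple asymmetric co-occurrence weight; this weighting, the function names, and the sample documents are illustrative assumptions, not the AI Lab's actual algorithm or data.

```python
from collections import Counter
from itertools import combinations


def build_concept_space(docs):
    """Return {term: {related_term: weight}} from co-occurrence counts.

    Assumed weight (for illustration only):
    weight(a -> b) = (# docs containing both a and b) / (# docs containing a)
    """
    doc_freq = Counter()   # number of documents each term appears in
    pair_freq = Counter()  # number of documents each term pair shares
    for terms in docs:
        unique = sorted(set(terms))
        doc_freq.update(unique)
        pair_freq.update(combinations(unique, 2))

    space = {}
    for (a, b), n in pair_freq.items():
        space.setdefault(a, {})[b] = n / doc_freq[a]
        space.setdefault(b, {})[a] = n / doc_freq[b]
    return space


def suggest(space, term, k=3):
    """Return up to k terms most strongly related to `term`."""
    related = space.get(term, {})
    return sorted(related, key=lambda t: (-related[t], t))[:k]


# Hypothetical, hand-made term lists standing in for mined documents.
docs = [
    ["e-government", "privacy", "policy"],
    ["e-government", "privacy"],
    ["e-government", "interoperability"],
    ["privacy", "encryption"],
]
space = build_concept_space(docs)
suggestions = suggest(space, "e-government")
print(suggestions)
```

Because the weight is asymmetric, a rare term such as "policy" points strongly to "e-government" while the reverse link is weaker, which is one way a thesaurus of this kind can favor more specific suggestions.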
The current version of the DGPort Concept Space was created from a collection of 300,000 web documents. As the DGPort portal expands, however, the DGPort Concept Space will be broadened to reflect the new documents added to DGPort. In the future, the DGPort Concept Space will reflect the relationships between keywords contained in over 1 million web documents.
DGPort Search Engine:
The DGPort search engine is a vertical search engine created specifically for the domain of Digital Government. The current prototype searches a collection of over 300,000 high-quality web documents pertinent to Digital Government researchers. This collection was generated using the AI Lab’s Search Engine Toolkit and Meta Search module, together with an advanced page-collecting methodology intended to ensure the quality and coverage of the collection. In addition, a content analysis algorithm and a link analysis algorithm are used to rank the search results and to filter out unrelated pages. Microsoft SQL Server serves as the backend database server.
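As a rough sketch of how content analysis and link analysis might be combined for ranking and filtering, the example below scores each page by its overlap with a domain vocabulary and by its normalized in-link count. The scoring formulas, the `alpha` and `threshold` parameters, and the sample pages are all assumptions made for illustration; the source does not describe the actual algorithms.

```python
from collections import Counter


def content_score(text, domain_terms):
    """Fraction of the domain terms that appear in the page text."""
    words = set(text.lower().split())
    return len(words & domain_terms) / len(domain_terms)


def rank_pages(pages, links, domain_terms, alpha=0.7, threshold=0.2):
    """Rank pages by a weighted mix of content and link scores.

    Pages whose content score falls below `threshold` are treated as
    unrelated and filtered out (an assumed filtering rule).
    """
    # Count in-links per page; each linking page counts once per target.
    inlinks = Counter(t for targets in links.values() for t in set(targets))
    max_in = max(inlinks.values(), default=1)

    scored = []
    for url, text in pages.items():
        c = content_score(text, domain_terms)
        if c < threshold:
            continue  # drop pages unrelated to the domain
        link = inlinks[url] / max_in
        scored.append((alpha * c + (1 - alpha) * link, url))
    return [url for score, url in sorted(scored, key=lambda p: (-p[0], p[1]))]


# Hypothetical pages and link graph.
pages = {
    "a.gov": "digital government services and citizen privacy policy",
    "b.gov": "government portal",
    "c.com": "cheap flights and hotel deals",
}
links = {"a.gov": ["b.gov"], "b.gov": ["a.gov"], "c.com": ["a.gov", "b.gov"]}
domain_terms = {"digital", "government", "privacy", "citizen", "portal"}

ranked = rank_pages(pages, links, domain_terms)
print(ranked)
```

In this toy data both government pages have the same number of in-links, so the content score decides their order, and the off-topic page is removed before ranking.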
Future plans for DGPort include the expansion of the DGPort document collection to over 1 million documents as well as the inclusion of the Stanford .gov collection - a collection of almost 17 million web documents relating to over 5,500 government web sites.
Meta Search:
The DGPort search system supports meta searching of high-quality sites related to Digital Government. The purpose of meta searching is to send queries to multiple search engines, literature databases, and online journals, and to collate only the highest-ranking subset of results from each data source, thus increasing precision. Meta search also provides a simple, uniform user interface that promises significant advances in coping with information overload and low precision.
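The meta search flow described above (query several sources, keep only the top results from each, present one merged list) can be sketched as follows. The three source functions are hypothetical stubs that return fixed, already-ranked URLs rather than calling real services, and the round-robin merge is just one plausible collation strategy, not necessarily the one DGPort uses.

```python
from concurrent.futures import ThreadPoolExecutor


def meta_search(query, sources, top_k=2):
    """Query all sources in parallel and merge each source's top_k hits."""
    with ThreadPoolExecutor() as pool:
        # pool.map preserves the order of `sources`.
        ranked_lists = list(pool.map(lambda search: search(query)[:top_k], sources))

    # Round-robin merge by rank, skipping URLs already seen.
    merged, seen = [], set()
    for rank in range(top_k):
        for hits in ranked_lists:
            if rank < len(hits) and hits[rank] not in seen:
                seen.add(hits[rank])
                merged.append(hits[rank])
    return merged


# Hypothetical stand-ins for a web engine, a literature database,
# and an online journal; each returns URLs in ranked order.
def web_engine(query):
    return ["a.example/egov", "b.example/report", "c.example/misc"]


def literature_db(query):
    return ["d.example/paper", "a.example/egov"]


def journal_site(query):
    return ["e.example/article"]


results = meta_search("digital government", [web_engine, literature_db, journal_site])
print(results)
```

Merging by rank position avoids comparing raw relevance scores across sources, which are generally not on a common scale.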
Document Categorization and Visualization:
An ideal Information Retrieval (IR) system should categorize retrieved documents automatically and give the user rapid access to various aspects of the subject of interest. DGPort pursues this goal with two tools that categorize returned documents: the Document Organizer and the Self Organizing Map. The Document Organizer classifies the documents retrieved from a meta search into categories based on the occurrence of keywords extracted from the documents. In addition, the Self Organizing Map visualization tool helps users make sense of the collection of returned documents as a whole.
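Keyword-based categorization of the kind attributed to the Document Organizer might look like the sketch below: the most frequent keywords across the returned documents become category labels, and each document is filed under every label it contains. The stopword list, the frequency-based label selection, and the sample documents are assumptions for illustration only.

```python
from collections import Counter, defaultdict

# Small illustrative stopword list (an assumption, not DGPort's).
STOPWORDS = {"a", "and", "for", "of", "the"}


def organize(docs, n_categories=2):
    """Group documents under the most frequent keywords they contain."""
    keywords = {
        doc_id: {w for w in text.lower().split() if w not in STOPWORDS}
        for doc_id, text in docs.items()
    }
    # Document frequency of each keyword across the result set.
    freq = Counter(w for kws in keywords.values() for w in kws)
    ordered = sorted(freq.items(), key=lambda p: (-p[1], p[0]))
    labels = [w for w, _ in ordered[:n_categories]]

    categories = defaultdict(list)
    for doc_id, kws in keywords.items():
        for label in labels:
            if label in kws:
                categories[label].append(doc_id)
    return dict(categories)


# Hypothetical search results (document id -> extracted text).
docs = {
    "d1": "privacy policy for portals",
    "d2": "data privacy regulation",
    "d3": "online voting security",
    "d4": "voting machine audit",
    "d5": "privacy and voting law",
}
categories = organize(docs)
print(categories)
```

A document can appear in more than one category, as "d5" does here, which matches the goal of giving rapid access to several aspects of a topic.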




