

|
[Start Simplified Chinese demo]
[Start Traditional Chinese demo]
| Research Goal |
|
 |
The CMedPort
was built to provide medical and health information services
to both researchers and the general public.
It is a prototype for testing whether the integrated
techniques can improve Internet searching and browsing
in Chinese search engines.
Because users in mainland China, Hong Kong, and
Taiwan use different forms of Chinese characters (Simplified
Chinese and Traditional Chinese), the CMedPort provides
two versions of its interface to meet users' needs.
The CMedPort has indexed more than 300,000 medical-related
pages from mainland China, Hong Kong, and Taiwan, using the
spidering toolkit “SpidersRUs” developed by the AI Lab. It also
meta-searches six major search engines from those three
regions. At search time,
the encoding conversion program lets users
search all three regions simultaneously and see the
result list in their familiar form of Chinese characters.
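
The conversion step described above can be sketched roughly as follows. This is a minimal illustration, not the CMedPort program itself: the character table `S2T`, the `REGIONS` mapping (including the choice of the `gb2312` and `big5` codecs), and the function names `convert`, `fan_out`, and `localize` are all assumptions made for the example.

```python
# Toy character table (hypothetical and far from complete). A real
# converter needs a full table and must handle one-to-many mappings.
S2T = {"医": "醫", "学": "學", "药": "藥", "国": "國"}
T2S = {trad: simp for simp, trad in S2T.items()}

# Assumed script form and legacy codec per region; not taken from the source.
REGIONS = {
    "mainland": ("simplified", "gb2312"),
    "hong_kong": ("traditional", "big5"),
    "taiwan": ("traditional", "big5"),
}


def convert(text, form):
    """Map each character to the requested form; unknown characters pass through."""
    table = S2T if form == "traditional" else T2S
    return text.translate(str.maketrans(table))


def fan_out(query):
    """Encode one query for every region, in that region's form and codec."""
    return {
        region: convert(query, form).encode(codec)
        for region, (form, codec) in REGIONS.items()
    }


def localize(titles, user_form):
    """Show result titles in the form the user is familiar with."""
    return [convert(title, user_form) for title in titles]
```

A user of the Simplified Chinese interface would then type a query once, have it sent to all three regions through `fan_out`, and read every returned title through `localize(titles, "simplified")`.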
When the results are returned, the CMedPort provides
summarization and categorization functions to support post-retrieval
analysis. The
Chinese summarizer is adapted from TXTRACTOR, an English
summarizer developed at the AI Lab.
It uses cue phrases and tf*idf to select summary
sentences from the original document.
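
To make the sentence selection concrete, here is a small sketch of extractive summarization that combines tf*idf with a cue-phrase bonus. It is not TXTRACTOR's actual scoring: the cue phrases, the `background_docs` collection used for idf, the smoothing, the `cue_bonus` weight, and the whitespace tokenization are assumptions for illustration. Chinese text would first need word segmentation, which is omitted here.

```python
import math
from collections import Counter

# Hypothetical cue phrases; the real list is not given in the source.
CUE_PHRASES = ("in summary", "in conclusion", "the results show")


def summarize(sentences, background_docs, k=2, cue_bonus=1.0):
    """Return the k highest-scoring sentences, kept in document order."""
    tokenized = [s.lower().split() for s in sentences]
    tf = Counter(tok for sent in tokenized for tok in sent)

    doc_sets = [set(d.lower().split()) for d in background_docs]
    n_docs = len(doc_sets) + 1  # background documents plus this one

    def idf(tok):
        df = 1 + sum(tok in d for d in doc_sets)  # the 1 is this document
        return math.log(n_docs / df) + 1.0

    scores = []
    for sent, toks in zip(sentences, tokenized):
        if not toks:
            scores.append(0.0)
            continue
        # Average tf*idf weight of the sentence's tokens.
        score = sum(tf[t] * idf(t) for t in toks) / len(toks)
        # Sentences containing a cue phrase get a fixed bonus.
        if any(cue in sent.lower() for cue in CUE_PHRASES):
            score += cue_bonus
        scores.append(score)

    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]
```

Averaging over sentence length keeps long sentences from winning on length alone. Whether the original system normalizes this way is not stated.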
The categorizer extracts the key phrases with the highest
frequency from the titles and summaries of the returned documents
and uses those phrases as folder topics, thus giving an overview
of these documents.
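
The folder-building idea can be sketched as below. The source does not say how key phrases are extracted, so this example assumes a hypothetical phrase `lexicon` and simple substring matching. The `categorize` function, the `title` and `summary` field names, and the `n_folders` parameter are likewise illustrative.

```python
from collections import Counter


def categorize(results, lexicon, n_folders=2):
    """Group results under the lexicon phrases that occur in the most results."""
    texts = [(doc["title"] + " " + doc["summary"]).lower() for doc in results]

    # Count each phrase once per result whose title or summary contains it.
    counts = Counter()
    for text in texts:
        for phrase in lexicon:
            if phrase in text:
                counts[phrase] += 1

    # The most frequent phrases become the folder topics.
    topics = [phrase for phrase, _ in counts.most_common(n_folders)]

    # A result may land in several folders if it mentions several topics.
    folders = {topic: [] for topic in topics}
    for doc, text in zip(results, texts):
        for topic in topics:
            if topic in text:
                folders[topic].append(doc["title"])
    return folders
```

Because a document can appear in more than one folder, the folders give an overview of the result set and not a strict partition of it.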
|
|
|