SWATH (Smart Word Analysis for THai) is a word segmentation program for Thai. Swath offers
3 algorithms: Longest Matching, Maximal Matching and Part-of-Speech Bigram.
The algorithms are briefly described in [1] and [2]. The program supports various
input file formats such as HTML, RTF and LaTeX, as well as plain text.
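To illustrate how the first two algorithms differ, here is a minimal Python sketch. It is not Swath's actual code: the function names, the toy dictionary and the handling of unknown characters are assumptions made for this example. Longest Matching scans left to right and always takes the longest dictionary word at the current position. Maximal Matching, as commonly described, considers all dictionary-based segmentations and keeps the one with the fewest words.

```python
def longest_matching(text, dictionary):
    """Greedy segmentation: take the longest dictionary word at each position."""
    max_len = max([1] + [len(w) for w in dictionary])
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first; a single character is the fallback.
        for length in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + length]
            if length == 1 or candidate in dictionary:
                words.append(candidate)
                i += length
                break
    return words


def maximal_matching(text, dictionary):
    """Segmentation with the fewest words, found by dynamic programming.

    Characters not covered by any dictionary word are emitted one by one;
    segmentations with fewer such unknown characters are preferred
    (an assumption of this sketch).
    """
    max_len = max([1] + [len(w) for w in dictionary])
    n = len(text)
    # best[i] = ((unknown_chars, word_count), start_of_last_word) for text[:i]
    best = [((0, 0), 0)]
    for end in range(1, n + 1):
        options = []
        for start in range(max(0, end - max_len), end):
            piece = text[start:end]
            known = piece in dictionary
            if known or end - start == 1:
                (unknown, count), _ = best[start]
                options.append(((unknown + (not known), count + 1), start))
        best.append(min(options))
    # Walk back through the stored start positions to recover the words.
    words = []
    end = n
    while end > 0:
        start = best[end][1]
        words.append(text[start:end])
        end = start
    return words[::-1]


# Hypothetical toy dictionary, for illustration only.
TOY_DICTIONARY = {"ไป", "หา", "หาม", "เห", "สี", "มเหสี"}
SAMPLE = "ไปหามเหสี"

if __name__ == "__main__":
    print(longest_matching(SAMPLE, TOY_DICTIONARY))  # ['ไป', 'หาม', 'เห', 'สี']
    print(maximal_matching(SAMPLE, TOY_DICTIONARY))  # ['ไป', 'หา', 'มเหสี']
```

On this sample the greedy pass commits to the longer word "หาม" and is then forced into four words, while the fewest-words search finds the three-word reading. Ambiguities that both dictionary methods leave open are the kind of case a statistical model such as the Part-of-Speech Bigram option is meant to address.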
Download
Swath2.0.1 (binary)
for Win32 (Compiled on October 7, 2003)
Swath
for Linux
Manual
Open Source
An open source version of Swath is available under the GPL license.
Download:
ftp://linux.thai.net/pub/thailinux/cvs/software/swath
or
Check out from cvs:
cvs -d :pserver:anonymous@linux.thai.net:/home/cvs co software/swath
References
[1] Paisarn Charoenpornsawat. 1999. Feature-based Thai Word Segmentation.
Master's Thesis, Computer Engineering, Chulalongkorn University, Bangkok,
Thailand. (In Thai.)
[2] Surapant Meknavin, Paisarn Charoenpornsawat, and Boonserm Kijsirikul. 1997.
Feature-based Thai Word Segmentation. In Proceedings of the Natural Language
Processing Pacific Rim Symposium 1997 (NLPRS'97), Phuket, Thailand.