JAWJAW (JAva Wrapper for JApanese Wordnet) is a Java API for the Japanese WordNet (wn-ja) database, which also contains Princeton's English WordNet v3.0. It gives access to lexical knowledge about a given word, such as its hypernyms, hyponyms, definitions, and translations (English <--> Japanese).
The API hides the details of the wn-ja DB schema from the programmer. We provide both a simple API for general Java programmers and a more fine-grained API for Natural Language Processing (NLP) application developers.
JAWJAW: Java Wrapper for Japanese WordNet
Last modified: 2013-03-20
Introduction
Simple API
Just call methods in the façade class. You can see the list of available methods here.
Sample code:
public class SimpleDemo {

    private static void run( String word, POS pos ) {
        // Accessing Japanese WordNet from the façade class called JAWJAW
        Set<String> hypernyms = JAWJAW.findHypernyms(word, pos);
        Set<String> hyponyms = JAWJAW.findHyponyms(word, pos);
        Set<String> consequents = JAWJAW.findEntailments(word, pos);
        Set<String> translations = JAWJAW.findTranslations(word, pos);
        Set<String> definitions = JAWJAW.findDefinitions(word, pos);

        // Showing results. (note: polysemies are mixed up here)
        System.out.println( "hypernyms of "+word+" : \t"+ hypernyms );
        System.out.println( "hyponyms of "+word+" : \t"+ hyponyms );
        System.out.println( word+" entails : \t\t"+ consequents );
        System.out.println( "translations of "+word+" : \t"+ translations );
        System.out.println( "definitions of "+word+" : \t"+ definitions );
    }

    public static void main(String[] args) {
        // Showing a demo for "買収"(verb) which means to acquire
        SimpleDemo.run( "買収", POS.v );
    }
}
Output:
API for NLP Application Developers
With this API, you can get the raw content of the DB through DAOs (Data Access Objects).
Data model:
Here's the domain model diagram generated from the Japanese WordNet DB schema. The API provides a data class and a DAO for each entity. The domain attributes "pos", "link" and "lang" are implemented as Enum classes.
Available concept relationships:
Here's a summary of concept relationship "links" stored in the synlink table. (As of wn-ja v0.9)
link | description | count |
also | See also | 2692 |
syns | Synonyms | 0 |
hype | Hypernyms | 89089 |
inst | Instances | 8577 |
hypo | Hyponyms | 89089 |
hasi | Has Instance | 8577 |
mero | Meronyms | 0 |
mmem | Meronyms --- Member | 12293 |
msub | Meronyms --- Substance | 979 |
mprt | Meronyms --- Part | 9097 |
holo | Holonyms | 0 |
hmem | Holonyms --- Member | 12293 |
hsub | Holonyms --- Substance | 797 |
hprt | Holonyms --- Part | 9097 |
attr | Attributes | 1278 |
sim | Similar to | 21386 |
enta | Entails | 408 |
caus | Causes | 220 |
dmnc | Domain --- Category | 6643 |
dmnu | Domain --- Usage | 967 |
dmnr | Domain --- Region | 1345 |
dmtc | In Domain --- Category | 6643 |
dmtu | In Domain --- Usage | 967 |
dmtr | In Domain --- Region | 1345 |
ants | Antonyms | 0 |
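The counts in the table can be cross-checked mechanically. The following is a minimal sketch, not part of JAWJAW: the class name LinkCountCheck and the list of inverse pairs are assumptions made for illustration (for example, that every "hype" link has a matching "hypo" link in the opposite direction). The numbers are copied from the table above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper (not part of JAWJAW): cross-checks the link counts listed above. */
class LinkCountCheck {

    /** Link pairs assumed here to be inverses of each other. */
    static final String[][] INVERSE_PAIRS = {
        {"hype", "hypo"}, {"inst", "hasi"},
        {"mmem", "hmem"}, {"msub", "hsub"}, {"mprt", "hprt"},
        {"dmnc", "dmtc"}, {"dmnu", "dmtu"}, {"dmnr", "dmtr"}
    };

    /** Counts copied from the table above (wn-ja v0.9), paired links only. */
    static Map<String, Integer> counts() {
        Map<String, Integer> c = new LinkedHashMap<>();
        c.put("hype", 89089); c.put("hypo", 89089);
        c.put("inst", 8577);  c.put("hasi", 8577);
        c.put("mmem", 12293); c.put("hmem", 12293);
        c.put("msub", 979);   c.put("hsub", 797);
        c.put("mprt", 9097);  c.put("hprt", 9097);
        c.put("dmnc", 6643);  c.put("dmtc", 6643);
        c.put("dmnu", 967);   c.put("dmtu", 967);
        c.put("dmnr", 1345);  c.put("dmtr", 1345);
        return c;
    }

    /** Returns the assumed inverse pairs whose two counts differ. */
    static List<String> findMismatches() {
        Map<String, Integer> c = counts();
        List<String> mismatches = new ArrayList<>();
        for (String[] pair : INVERSE_PAIRS) {
            if (!c.get(pair[0]).equals(c.get(pair[1]))) {
                mismatches.add(pair[0] + "/" + pair[1]);
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        // prints "pairs with differing counts: [msub/hsub]"
        System.out.println("pairs with differing counts: " + findMismatches());
    }
}
```

Under that assumption, only msub (979) and hsub (797) disagree, so one of the two figures may be worth verifying against the DB itself.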
Total numbers of concepts and words:
- 49,190 concepts (called synsets in WordNet)
- 85,966 words
- 156,684 word definitions (pairs of word and synset)
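As a rough illustration of what these totals imply, the sketch below derives two averages from them. It takes the three figures at face value; the class and method names are hypothetical and not part of JAWJAW.

```java
import java.util.Locale;

/** Hypothetical helper (not part of JAWJAW): averages derived from the totals above. */
class WnJaAverages {
    static final int SYNSETS = 49190;            // concepts
    static final int WORDS = 85966;              // words
    static final int WORD_SYNSET_PAIRS = 156684; // word definitions (word/synset pairs)

    /** Average number of synsets a word belongs to. */
    static double synsetsPerWord() {
        return (double) WORD_SYNSET_PAIRS / WORDS;
    }

    /** Average number of words attached to a synset. */
    static double wordsPerSynset() {
        return (double) WORD_SYNSET_PAIRS / SYNSETS;
    }

    public static void main(String[] args) {
        // prints "avg synsets per word: 1.82"
        System.out.println(String.format(Locale.ROOT, "avg synsets per word: %.2f", synsetsPerWord()));
        // prints "avg words per synset: 3.19"
        System.out.println(String.format(Locale.ROOT, "avg words per synset: %.2f", wordsPerSynset()));
    }
}
```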
Sample code:
public class AdvancedDemo {

    private static void run( String word, POS pos ) {
        // Access the Japanese WordNet DB and process the raw data
        List<Word> words = WordDAO.findWordsByLemmaAndPos(word, pos);
        List<Sense> senses = SenseDAO.findSensesByWordid( words.get(0).getWordid() );
        String synsetId = senses.get(0).getSynset();
        Synset synset = SynsetDAO.findSynsetBySynset( synsetId );
        SynsetDef synsetDef = SynsetDefDAO.findSynsetDefBySynsetAndLang(synsetId, Lang.eng);
        List<Synlink> synlinks = SynlinkDAO.findSynlinksBySynset( synsetId );

        // Showing the result
        System.out.println( words.get(0) );
        System.out.println( senses.get(0) );
        System.out.println( synset );
        System.out.println( synsetDef );
        System.out.println( synlinks.get(0) );
    }

    public static void main(String[] args) {
        // Showing a demo for "自然言語処理"(noun) which means NLP
        AdvancedDemo.run( "自然言語処理", POS.n );
    }
}
Output:
Javadoc
Refer to this page.
Download
How to use
Download the DB from the Japanese WordNet website and put it under the src/main/resources directory, e.g. "src/main/resources/wnjpn.db" (not wnjpn-0.9.db). JAWJAW works on JDK 5 or later. To compile the code and fetch the required libraries (i.e. sqlite-jdbc-3.7.2.jar, junit-4.7.jar), we recommend using Maven2. With the provided pom.xml file, you can compile and resolve dependencies with "mvn compile" and sanity-check the code with "mvn test".
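To confirm that the DB file is visible on the classpath before running the demos, something like the following sketch may help. It is not part of JAWJAW: the class DbSanityCheck and its methods are hypothetical, it assumes a sqlite-jdbc driver on the classpath that supports the "jdbc:sqlite::resource:" URL prefix mentioned in the version history, and it uses try-with-resources, so it needs Java 7 or later. The only table it touches is synlink, which is named above.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

/** Hypothetical helper (not part of JAWJAW): checks that wnjpn.db can be opened. */
class DbSanityCheck {

    /** Builds a sqlite-jdbc URL for a DB file bundled as a classpath resource. */
    static String resourceUrl(String resourceName) {
        return "jdbc:sqlite::resource:" + resourceName;
    }

    /** Counts the rows of one table; the table name must be a trusted literal. */
    static long countRows(String url, String table) throws SQLException {
        try (Connection conn = DriverManager.getConnection(url);
             Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery("SELECT COUNT(*) FROM " + table)) {
            return rs.next() ? rs.getLong(1) : 0L;
        }
    }

    public static void main(String[] args) {
        // Assumes the file was copied to src/main/resources/wnjpn.db as described above.
        String url = resourceUrl("wnjpn.db");
        try {
            System.out.println("synlink rows: " + countRows(url, "synlink"));
        } catch (SQLException e) {
            // Reached when the driver or the DB file is missing from the classpath.
            System.out.println("Could not read " + url + ": " + e.getMessage());
        }
    }
}
```

If the driver or the file is missing, the program reports the problem and exits normally, which makes it easy to tell a setup issue from a bug in application code.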
Version history
- 1.0.2 (2013-03-19) - Very fast initialization even when WordNet DB is in jar (0-1 sec), by using "jdbc:sqlite::resource". Compatible with m2e (m2eclipse deprecated).
- 1.0.0 (2011-10-16) - Released at Project Hosting on Google Code
- 2009-03-23 - initial release
Future work
- Metrics for semantic similarity/distance between two synsets: released as WS4J (WordNet Similarity for Java)
- Command line interface
- Web interface
Contact
Hideki Shima at Carnegie Mellon University
Email: hideki at cs.cmu.edu