World class IT in a world-wide market. Practical results with Emile Marten Trautwein Syllogic B.V.


1 World class IT in a world-wide market

2 Practical results with Emile Marten Trautwein Syllogic B.V.

3 Road map
Introduction myself
Context:
–Text mining tools
Results with Emile

4 Introduction myself
Computer Science at UvA (1986 - 1991)
–Theoretical computer science
  Complexity of Categorial Unification Grammar
  Dr Janssen
PhD Computer Science at UvA (1991 - 1995)
–Theoretical computer science
  Complexity of Unification Grammars
  Dr v. Emde Boas, Dr Janssen, Dr Torenvliet
Syllogic B.V. (1995 -...)
–Research and development
  Text mining

5 Context
Term clustering
TextAnalyst - Microsystems Co. Ltd.
Intelligent Miner for Text - IBM

6 TextAnalyst
Microsystems Co. Ltd.
Megaputer Intelligence Inc (distributor)
Version 2.0
www.megaputer.com

7 TextAnalyst - Features
Functionality includes
–Hierarchical / Structured topics
–Knowledge base formation
–Semantic search
–Abstracting
Languages
–English
–Russian

8 TextAnalyst - Knowledge base

9 TextAnalyst - Summarization

10 Intelligent Miner for Text
IBM Corp.
Version 2.3
December 1998
www-4.ibm.com/software/data/iminer/fortext/

11 IM4Text - Features
Functionality includes
–Clustering
–Categorization
–Search
–Summarization
–WebCrawler
Languages
–English

12 IM4Text - Clustering
[Figure: clustering result; clusters labelled 0 and I through XII]

13 IM4Text - Summarization

14 Other tools
Verity Knowledge Organizer
Autonomy Knowledge Server
GrapeVine
TextWise's DR-LINK, CHESS and CINDOR
Data Junction's Cambio DataSet
Synthema, Italy (IBM Technology Watch)
Semio Corp's SemioMap
Cartia's ThemeScape
Canis' cMap
Inxight's LinguistX and VizControls
Muscat's Empower

15 Emile
Syllogic / University of Amsterdam
Version 3.1

16 Emile - Features
Functionality includes
–Grammar induction
–Knowledge base construction
–Compound term separation
Languages
–Any

17 Emile - Grammar induction
Fragment of Phaistos disk
1 41 40 7.
2 12 4 40 33.
2 12 6 18 *.
2 12 13 1.
2 12 13 1 18.
2 12 27 14 32 18 27.
2 12 27 35 37 21.
2 12 31 26.
2 12 32 23 38.
2 12 41 19 35.
2 27 25 10 23 18.
…
16 14 18.
16 23 18 43.
Fragment of grammar
[0] --> [3].
[3] --> [16] [47]
[14] --> 15 [40]
[14] --> 2 12
[16] --> 2 [57] 25 10 23
[16] --> [14] 13 1
[16] --> 16 14
[40] --> 7
[40] --> 29
[47] --> 18
[47] --> 24 40
[57] --> 27
[57] --> 29

18 Emile - Incomplete data set
(Dutch sample sentences: "Ik kan geen mail ... met ..." means "I cannot ... mail with ..."; lezen = read, schrijven = write, openen = open, verzenden = send.)
Ik kan geen mail lezen met MS-Mail
Ik kan geen mail schrijven met MS-Mail
Ik kan geen mail openen met MS-Mail
Ik kan geen mail verzenden met MS-Mail
Ik kan geen mail lezen met MS-Outlook
Ik kan geen mail schrijven met MS-Outlook
Ik kan geen mail openen met MS-Outlook
Ik kan geen mail verzenden met MS-Outlook
Ik kan geen mail lezen met Mail
Ik kan geen mail schrijven met Mail
Ik kan geen mail openen met Mail
Ik kan geen mail verzenden met Mail
Ik kan geen mail lezen met Outlook
Ik kan geen mail schrijven met Outlook
Ik kan geen mail openen met Outlook
Ik kan geen mail verzenden met Outlook
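
The sixteen sentences on this slide are the full cross product of four verbs and four mail clients, so the data set can be regenerated and thinned out programmatically. The sketch below is an illustration only: the function names, the sample size of 12, and the seed are assumptions, and the split of a sentence into a (left, right) context and an expression is one plausible reading of how a context/expression learner such as Emile views its input, not Emile's actual code.

```python
from itertools import product
import random

VERBS = ["lezen", "schrijven", "openen", "verzenden"]
CLIENTS = ["MS-Mail", "MS-Outlook", "Mail", "Outlook"]

# Client varies slowest, matching the order of the sentences on the slide.
complete = [f"Ik kan geen mail {verb} met {client}"
            for client, verb in product(CLIENTS, VERBS)]


def incomplete_sample(sentences, keep, seed=0):
    """Return `keep` sentences drawn without replacement, in original order.

    Hypothetical helper: the slide does not say how sentences were left out.
    """
    rng = random.Random(seed)
    chosen = set(rng.sample(range(len(sentences)), keep))
    return [s for i, s in enumerate(sentences) if i in chosen]


def context_expression_pairs(sentence):
    """Yield ((left, right), expression) for every contiguous word span."""
    words = sentence.split()
    for start in range(len(words)):
        for end in range(start + 1, len(words) + 1):
            left = " ".join(words[:start])
            right = " ".join(words[end:])
            yield (left, right), " ".join(words[start:end])


subset = incomplete_sample(complete, keep=12)
pairs = list(context_expression_pairs(complete[0]))
```

Each sentence of seven words yields 28 such pairs; the pair with context ("Ik kan geen mail", "met MS-Mail") and expression "lezen" is the one that lets the four verbs be grouped together.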

19 Emile - Variable settings
Default on 12: context support 30%, expression support 30%, total support 50%
Default on 8: context support 40%, expression support 40%, total support 60%
context support 50%, expression support 50%, total support 70%
Generate data set
Generate complete language
Generate data set
Generate 15 out of 16 sentences
Generate complete language
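
To make the three percentages on this slide concrete, the sketch below computes them for one candidate cluster of contexts and expressions. It rests on an assumed reading of the settings: total support is the fraction of all context/expression combinations actually observed, context support is the minimum fraction of expressions seen with any single context, and expression support is the corresponding minimum per expression. The function names and the acceptance rule are hypothetical and are not taken from Emile's source.

```python
def support(contexts, expressions, observed):
    """Return (total, per-context, per-expression) support for a candidate cluster."""
    n_ctx, n_expr = len(contexts), len(expressions)
    hits = {(c, e) for c in contexts for e in expressions if (c, e) in observed}
    total = len(hits) / (n_ctx * n_expr)
    per_context = {c: sum((c, e) in hits for e in expressions) / n_expr
                   for c in contexts}
    per_expression = {e: sum((c, e) in hits for c in contexts) / n_ctx
                      for e in expressions}
    return total, per_context, per_expression


def accept(contexts, expressions, observed,
           context_support=0.3, expression_support=0.3, total_support=0.5):
    """Assumed rule: accept the cluster only if all three thresholds are met."""
    total, per_context, per_expression = support(contexts, expressions, observed)
    return (total >= total_support
            and min(per_context.values()) >= context_support
            and min(per_expression.values()) >= expression_support)


verbs = ["lezen", "schrijven", "openen", "verzenden"]
contexts = [("Ik kan geen mail", f"met {client}")
            for client in ["MS-Mail", "MS-Outlook", "Mail", "Outlook"]]

# 15 of the 16 combinations observed: one sentence is missing.
observed = {(c, v) for c in contexts for v in verbs} - {(contexts[0], "lezen")}
total, per_context, per_expression = support(contexts, verbs, observed)

# A sparse case: each context seen with exactly one verb.
sparse = {(contexts[i], verbs[i]) for i in range(4)}
```

Under this reading, the cluster with one missing sentence passes even the strictest setting on the slide (50% / 50% / 70%), while the sparse case fails the default total support of 50%.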

20 Emile - Induced grammar
[0] --> [2] [18]
[0] --> [31] [29]
[0] --> [42] [15]
[2] --> Ik kan geen mail [12] met
[12] --> openen
[12] --> verzenden
[15] --> met [41]
[15] --> met [18]
[18] --> MS-Mail
[18] --> MS-Outlook
[27] --> verzenden
[27] --> lezen
[29] --> met [30]
[30] --> MS-Outlook
[30] --> Mail
[31] --> Ik kan geen mail [27]
[31] --> Ik kan [45]
[39] --> lezen
[39] --> schrijven
[41] --> Mail
[41] --> Outlook
[42] --> Ik kan [45]
[45] --> geen mail [39]
[45] --> geen mail [12]
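
One way to check what this induced grammar generates is to expand it exhaustively. The sketch below transcribes the rules from the slide into a dictionary (integers are the bracketed types, strings are terminal text) and enumerates every sentence derivable from [0]. The `expand` helper is a hypothetical illustration and assumes the grammar is non-recursive, which holds for the rules shown here.

```python
from itertools import product

# Rules transcribed from the slide: int = bracketed type, str = terminal text.
GRAMMAR = {
    0: [[2, 18], [31, 29], [42, 15]],
    2: [["Ik kan geen mail", 12, "met"]],
    12: [["openen"], ["verzenden"]],
    15: [["met", 41], ["met", 18]],
    18: [["MS-Mail"], ["MS-Outlook"]],
    27: [["verzenden"], ["lezen"]],
    29: [["met", 30]],
    30: [["MS-Outlook"], ["Mail"]],
    31: [["Ik kan geen mail", 27], ["Ik kan", 45]],
    39: [["lezen"], ["schrijven"]],
    41: [["Mail"], ["Outlook"]],
    42: [["Ik kan", 45]],
    45: [["geen mail", 39], ["geen mail", 12]],
}


def expand(symbol, grammar):
    """Return the set of strings derivable from `symbol` (non-recursive grammars only)."""
    if isinstance(symbol, str):
        return {symbol}
    results = set()
    for rhs in grammar[symbol]:
        parts = [expand(s, grammar) for s in rhs]
        for combo in product(*parts):
            results.add(" ".join(combo))
    return results


language = expand(0, GRAMMAR)

# The complete language: every verb combined with every mail client.
expected = {f"Ik kan geen mail {verb} met {client}"
            for verb in ["lezen", "schrijven", "openen", "verzenden"]
            for client in ["MS-Mail", "MS-Outlook", "Mail", "Outlook"]}
```

Expanding [0] yields exactly the sixteen sentences of the complete language: the rule [0] --> [42] [15] alone covers all of them, and the other two rules for [0] produce overlapping subsets.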

21 Emile - Knowledge base
Dictionary Type [35]: K033 k033 K105 k33
Dictionary Type [87]: Vrachtgeb vrachtgeb Vrachtgebouw Vracht
Dictionary Type [89]: CGOADTP6 Printqueue
Dictionary Type [114]: is Userid Password
Dictionary Type [138]: status Error
Dictionary Type [196]: scarlos vrachtbrieven
Dictionary Type [215]: G239 g239
Dictionary Type [237]: enorm ontzettend super
Dictionary Type [290]: pingen benaderen

22 Emile - Knowledge base
[16] --> School of Medicine, University of Washington, Seattle 98195, USA
[16] --> University of Kitasato Hospital, Sagamihara, Kanagawa, Japan
[16] --> Heinrich-Heine-University, Dusseldorf, Germany
[16] --> School of Medicine, Chiba University
[5] --> Department of Urology, [16]
[94] --> Chinese
[94] --> Japanese
[94] --> Polish
[101] --> 32 : Cancer Res 1996 Oct
[101] --> 35 : Genomics 1996 Aug
[101] --> 44 : Cancer Res 1995 Dec
[101] --> 50 : Cancer Res 1995 Feb
[101] --> 54 : Eur J Biochem 1994 Sep
[101] --> 58 : Cancer Res 1994 Mar
[105] --> identified in 13 cases ( 72
[105] --> detected in 9 of 87 informative cases ( 10
[105] --> observed in 5 ( 55
[11] --> LOH was [105] %

23 Emile on Biomed (1)

24 Emile on Biomed (2)

25 Emile on Biomed (3)

26 Merits of Emile
Language independent
Clustering within sentences
Incremental learning
No training phase
Raw text input
Access to source code

27 Improve performance
Start with information-rich text
Bootstrap with substitution patterns

