Published by Simona Boender. Last modified more than 10 years ago.
1
World class IT in a world-wide market
2
Practical results with Emile
Marten Trautwein
Syllogic B.V.
3
Road map
Introduction myself
Context:
–Text mining tools
Results with Emile
4
Introduction myself
Computer Science at UvA (1986 - 1991)
–Theoretical computer science
  Complexity of Categorial Unification Grammar
  Dr Janssen
PhD Computer Science at UvA (1991 - 1995)
–Theoretical computer science
  Complexity of Unification Grammars
  Dr v. Emde Boas, Dr Janssen, Dr Torenvliet
Syllogic B.V. (1995 -...)
–Research and development
  Text mining
5
Context
Term clustering
TextAnalyst - Microsystems Co. Ltd.
Intelligent miner for text - IBM
6
TextAnalyst
Microsystems Co. Ltd.
Megaputer Intelligence Inc (distributor)
Version 2.0
www.megaputer.com
7
TextAnalyst - Features
Functionality includes
–Hierarchical / Structured topics
–Knowledge base formation
–Semantic search
–Abstracting
Languages
–English
–Russian
8
TextAnalyst - Knowledge base
9
TextAnalyst - Summarization
10
Intelligent miner for text
IBM Corp.
Version 2.3
December 1998
www-4.ibm.com/software/data/iminer/fortext/
11
IM4Text - Features
Functionality includes
–Clustering
–Categorization
–Search
–Summarization
–WebCrawler
Languages
–English
12
IM4Text - Clustering
[Figure: cluster diagram with nodes labelled 0 and I to XII]
13
IM4Text - Summarization
14
Other tools
Verity Knowledge Organizer
Autonomy Knowledge Server
GrapeVine
TextWise's DR-LINK, CHESS and CINDOR
Data Junction's Cambio
DataSet
Synthema, Italy (IBM Technology Watch)
Semio Corp's SemioMap
Cartia's ThemeScape
Canis' cMap
Inxight's LinguistX and VizControls
Muscat's Empower
15
Emile
Syllogic / University of Amsterdam
Version 3.1
16
Emile - Features
Functionality includes
–Grammar induction
–Knowledge base construction
–Compound term separation
Languages
–Any
17
Emile - Grammar induction
Fragment of Phaistos disk
1 41 40 7.
2 12 4 40 33.
2 12 6 18 *.
2 12 13 1.
2 12 13 1 18.
2 12 27 14 32 18 27.
2 12 27 35 37 21.
2 12 31 26.
2 12 32 23 38.
2 12 41 19 35.
2 27 25 10 23 18.
…
16 14 18.
16 23 18 43.
Fragment of grammar
[0] --> [3]
[3] --> [16] [47]
[14] --> 15 [40]
[14] --> 2 12
[16] --> 2 [57] 25 10 23
[16] --> [14] 13 1
[16] --> 16 14
[40] --> 7
[40] --> 29
[47] --> 18
[47] --> 24 40
[57] --> 27
[57] --> 29
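To make the relation between the grammar fragment and the sign sequences concrete, here is a minimal sketch that checks whether a sequence can be derived from the rules on this slide. It is not Emile itself and performs no induction; it only reads the rules as a context-free grammar. The names GRAMMAR, match and accepts are chosen for this illustration, and the simple matcher assumes the rules contain no recursion, which holds for this fragment.

```python
# Rules transcribed from the "Fragment of grammar" on the slide.
# Bracketed symbols such as "[16]" are nonterminals (types);
# bare numbers such as "16" are sign numbers from the disk.
GRAMMAR = {
    "[0]": [["[3]"]],
    "[3]": [["[16]", "[47]"]],
    "[14]": [["15", "[40]"], ["2", "12"]],
    "[16]": [["2", "[57]", "25", "10", "23"], ["[14]", "13", "1"], ["16", "14"]],
    "[40]": [["7"], ["29"]],
    "[47]": [["18"], ["24", "40"]],
    "[57]": [["27"], ["29"]],
}


def match(symbol, tokens, pos):
    """Return the set of positions reachable after matching `symbol` at `pos`."""
    if symbol not in GRAMMAR:
        # Terminal: it must equal the next token.
        if pos < len(tokens) and tokens[pos] == symbol:
            return {pos + 1}
        return set()
    ends = set()
    for body in GRAMMAR[symbol]:
        positions = {pos}
        for part in body:
            # Advance every partial match through the next symbol of the rule.
            positions = {end for start in positions
                         for end in match(part, tokens, start)}
        ends |= positions
    return ends


def accepts(sentence):
    """True if the whole sentence can be derived from the start symbol [0]."""
    tokens = sentence.rstrip(".").split()
    return len(tokens) in match("[0]", tokens, 0)


if __name__ == "__main__":
    for line in ["2 12 13 1 18.", "2 27 25 10 23 18.", "16 14 18.", "2 12 13 1."]:
        print(line, accepts(line))
```

Because the slide shows only a fragment of the grammar, some listed sequences (for example `2 12 13 1.`) are not derivable from these rules alone.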
18
Emile - Incomplete data set
(Dutch example sentences of the form "I cannot read / write / open / send mail with ...")
Ik kan geen mail lezen met MS-Mail
Ik kan geen mail schrijven met MS-Mail
Ik kan geen mail openen met MS-Mail
Ik kan geen mail verzenden met MS-Mail
Ik kan geen mail lezen met MS-Outlook
Ik kan geen mail schrijven met MS-Outlook
Ik kan geen mail openen met MS-Outlook
Ik kan geen mail verzenden met MS-Outlook
Ik kan geen mail lezen met Mail
Ik kan geen mail schrijven met Mail
Ik kan geen mail openen met Mail
Ik kan geen mail verzenden met Mail
Ik kan geen mail lezen met Outlook
Ik kan geen mail schrijven met Outlook
Ik kan geen mail openen met Outlook
Ik kan geen mail verzenden met Outlook
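The 16 sentences on this slide are every combination of four verbs and four mail clients. The sketch below rebuilds that list and draws a smaller subset from it. The slide title suggests Emile was given only part of the set, but the slides do not say how the subset was chosen, so the random sampling here (and the helper names full_data_set and incomplete_sample) is purely an assumption for illustration.

```python
import random

VERBS = ["lezen", "schrijven", "openen", "verzenden"]
CLIENTS = ["MS-Mail", "MS-Outlook", "Mail", "Outlook"]


def full_data_set():
    """Build all 16 sentences, in the same order as on the slide."""
    return [f"Ik kan geen mail {verb} met {client}"
            for client in CLIENTS
            for verb in VERBS]


def incomplete_sample(sentences, size, seed=0):
    """Draw `size` distinct sentences (assumed sampling scheme, not from the slides)."""
    rng = random.Random(seed)
    return rng.sample(sentences, size)


if __name__ == "__main__":
    sentences = full_data_set()
    print(len(sentences))
    for sentence in incomplete_sample(sentences, 12):
        print(sentence)
```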
19
Emile - Variable settings
Default on 12
context support: 30%
expression support: 30%
total support: 50%
Default on 8
context support: 40%
expression support: 40%
total support: 60%
context support: 50%
expression support: 50%
total support: 70%
Generate data set
Generate complete language
Generate data set
Generate 15 out of 16 sentences
Generate complete language
20
Emile - Induced grammar
[0] --> [2] [18]
[0] --> [31] [29]
[0] --> [42] [15]
[2] --> Ik kan geen mail [12] met
[12] --> openen
[12] --> verzenden
[15] --> met [41]
[15] --> met [18]
[18] --> MS-Mail
[18] --> MS-Outlook
[27] --> verzenden
[27] --> lezen
[29] --> met [30]
[30] --> MS-Outlook
[30] --> Mail
[31] --> Ik kan geen mail [27]
[31] --> Ik kan [45]
[39] --> lezen
[39] --> schrijven
[41] --> Mail
[41] --> Outlook
[42] --> Ik kan [45]
[45] --> geen mail [39]
[45] --> geen mail [12]
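One way to read this induced grammar is to expand it and look at the sentences it generates. The sketch below parses the rules as transcribed from the slide and enumerates everything derivable from [0], then compares the result with all verb and client combinations. It is an illustration only, not part of Emile; parse_rules and expand are names made up here, and the expansion assumes the rules are not recursive.

```python
from itertools import product

RULES = """\
[0] --> [2] [18]
[0] --> [31] [29]
[0] --> [42] [15]
[2] --> Ik kan geen mail [12] met
[12] --> openen
[12] --> verzenden
[15] --> met [41]
[15] --> met [18]
[18] --> MS-Mail
[18] --> MS-Outlook
[27] --> verzenden
[27] --> lezen
[29] --> met [30]
[30] --> MS-Outlook
[30] --> Mail
[31] --> Ik kan geen mail [27]
[31] --> Ik kan [45]
[39] --> lezen
[39] --> schrijven
[41] --> Mail
[41] --> Outlook
[42] --> Ik kan [45]
[45] --> geen mail [39]
[45] --> geen mail [12]
"""


def parse_rules(text):
    """Turn 'head --> body' lines into {head: [list of body symbols, ...]}."""
    grammar = {}
    for line in text.strip().splitlines():
        head, body = line.split("-->")
        grammar.setdefault(head.strip(), []).append(body.split())
    return grammar


def expand(symbol, grammar):
    """Return all word strings derivable from `symbol` (rules assumed non-recursive)."""
    if symbol not in grammar:
        return {symbol}  # a plain word
    results = set()
    for body in grammar[symbol]:
        parts = [expand(part, grammar) for part in body]
        for combo in product(*parts):
            results.add(" ".join(combo))
    return results


if __name__ == "__main__":
    language = expand("[0]", parse_rules(RULES))
    verbs = ["lezen", "schrijven", "openen", "verzenden"]
    clients = ["MS-Mail", "MS-Outlook", "Mail", "Outlook"]
    expected = {f"Ik kan geen mail {v} met {c}" for v in verbs for c in clients}
    print(len(language), language == expected)
```

With the rules as transcribed, this prints `16 True`: the three [0] rules overlap, and together they produce each verb and client combination.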
21
Emile - Knowledge base
Dictionary Type [35]
K033 k033 K105 k33
Dictionary Type [87]
Vrachtgeb vrachtgeb Vrachtgebouw Vracht
Dictionary Type [89]
CGOADTP6 Printqueue
Dictionary Type [114]
is Userid Password
Dictionary Type [138]
status Error
Dictionary Type [196]
scarlos vrachtbrieven
Dictionary Type [215]
G239 g239
Dictionary Type [237]
enorm ontzettend super
Dictionary Type [290]
pingen benaderen
22
Emile - Knowledge base
[16] --> School of Medicine, University of Washington, Seattle 98195, USA
[16] --> University of Kitasato Hospital, Sagamihara, Kanagawa, Japan
[16] --> Heinrich-Heine-University, Dusseldorf, Germany
[16] --> School of Medicine, Chiba University
[5] --> Department of Urology, [16]
[94] --> Chinese
[94] --> Japanese
[94] --> Polish
[101] --> 32 : Cancer Res 1996 Oct
[101] --> 35 : Genomics 1996 Aug
[101] --> 44 : Cancer Res 1995 Dec
[101] --> 50 : Cancer Res 1995 Feb
[101] --> 54 : Eur J Biochem 1994 Sep
[101] --> 58 : Cancer Res 1994 Mar
[105] --> identified in 13 cases ( 72
[105] --> detected in 9 of 87 informative cases ( 10
[105] --> observed in 5 ( 55
[11] --> LOH was [105] %
23
Emile on Biomed (1)
24
Emile on Biomed (2)
25
Emile on Biomed (3)
26
Merits Emile
Language independent
Clustering within sentences
Incremental learning
No training phase
Raw text input
Access to source code
27
Improve performance
Start with information-rich text
Bootstrap with substitution patterns