Mining Dutch History: researching public debate in the nineteenth century
Dr José de Kruif, Researcher, Research Institute for History and Culture, Utrecht University
Newspaper (1840)
Pamphlet production
Pamphlet, April
Text fragments considered typical

“We gaan naar den grond met die verdraagzaamheid, en verliezen onze eigene vrijheid terwijl wij zoo dolzinnig ijveren voor die van anderen. We zullen er de vruchten van plukken, als de inquisitie regt spreekt op onzen vrijen grond en de schavotten staan opgerigt voor ons en onze kinderen.”
“Tolerance will be our Waterloo. We will lose our freedom whilst devoting ourselves to the freedom of others. We will only recognize the fruits of our ignorance when the inquisition judges on our free soil and the scaffolds will be the fate of ourselves and our children.”

“Bij gevolg kan elk middel, hoe snood, hoe onredelijk, hoe goddeloos ook, aangewend worden: staatkundige verdeeldheid revolutie, burgertwist, inquisitie, brandstapels, vergif, zedeloosheid, koningsmoord,... Ziedaar wapenen in handen der Jezuïten!”
“Every means, however nasty, malicious or blasphemous, can be used: inciting civil war, revolution, inquisition, burning at the stake, poison, murdering the king ... are all weapons in the hands of the Jesuits.”
Digitizing, database
Diagram labels: Scan, OCR, Text, Database, Metadata, Text mining, Results, Documents
Access Database
Extracted results
Synonyms Jesuits
Refining extraction results
Actors
Text Link analysis definitions
Opinions on the pope
The liberal government could count on criticism as well…
Categories of arguments
Text mining node and anomaly
Peer groups & outliers
Group 1: History & civil disorder
Group 2: History & new constitution
Group 3: No history. Civil disorder
Group 4: Very moderate & 3 outliers
C & R Tree: served as a source or not?
Advantages
- Gives insight into a large number of documents. There is no need to work with just a few and run the risk of an unrepresentative sample.
- Combines the advantages of text analysis with statistical techniques. The dictionary of the software can be enriched with specific domain knowledge.
- New approaches become possible.
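Enriching the software's dictionary with domain knowledge can be pictured with a small sketch. This is not the tool used in the project; it is a minimal Python illustration, assuming a hand-made dictionary in which spelling variants of a term map to one concept. The names `CONCEPT_SYNONYMS`, `build_lookup` and `count_concepts` and the dictionary entries are hypothetical; the sample text is taken from the fragments quoted earlier.

```python
import re
import unicodedata
from collections import Counter

# Hypothetical domain dictionary: concept -> spelling variants.
# The entries are illustrative and not taken from the project's dictionary.
CONCEPT_SYNONYMS = {
    "jesuits": ["jezuïten", "jezuiten", "jezuïet", "jezuiet"],
    "inquisition": ["inquisitie"],
}


def normalize(text):
    """Lower-case and NFC-normalize so that accented forms compare equal."""
    return unicodedata.normalize("NFC", text).lower()


def build_lookup(concepts):
    """Invert the dictionary into a surface-form -> concept map."""
    return {
        normalize(form): concept
        for concept, forms in concepts.items()
        for form in forms
    }


def count_concepts(text, lookup):
    """Count how often each concept occurs in a text."""
    tokens = re.findall(r"\w+", normalize(text))
    return Counter(lookup[t] for t in tokens if t in lookup)


sample = (
    "als de inquisitie regt spreekt op onzen vrijen grond. "
    "Ziedaar wapenen in handen der Jezuïten !"
)
counts = count_concepts(sample, build_lookup(CONCEPT_SYNONYMS))
print(dict(counts))
```

Adding a newly found variant to the dictionary is then a one-line change, after which the extraction can be rerun over the whole corpus.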
Setbacks
- The researcher needs some knowledge of the documents and their subject to be able to interpret the results.
- The approach is best suited to broad research on large quantities of text. The more one zooms in, the less relevant the cluster results become.
- Supplementing the lexical universe of the software with specific domain knowledge can be time-consuming.
- The researcher has to be familiar, or become familiar, with a number of statistical techniques (e.g. cluster analysis).
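For readers new to cluster analysis, the idea behind peer groups and outliers can be sketched in a few lines. This is a simplified illustration, not the procedure used by the software in this project: it assumes each document is reduced to counts of argument categories, compares documents by cosine similarity, and groups them by single linkage with an arbitrary threshold. The pamphlet names, counts and the `threshold` value are invented for the example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (dicts)."""
    dot = sum(a.get(k, 0) * b.get(k, 0) for k in set(a) | set(b))
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster(profiles, threshold=0.8):
    """Single-linkage grouping: a document joins every group in which
    at least one member is similar enough; such groups are merged."""
    groups = []
    for name, vec in profiles.items():
        matches = [
            g for g in groups
            if any(cosine(vec, profiles[m]) >= threshold for m in g)
        ]
        merged = [name]
        for g in matches:
            merged.extend(g)
            groups.remove(g)
        groups.append(merged)
    return groups


# Invented category counts per pamphlet, for illustration only.
profiles = {
    "pamphlet_a": {"history": 3, "civil_disorder": 2},
    "pamphlet_b": {"history": 2, "civil_disorder": 2},
    "pamphlet_c": {"history": 3, "constitution": 2},
    "pamphlet_d": {"history": 2, "constitution": 3},
    "pamphlet_e": {"pope": 4},
}

groups = cluster(profiles)
outliers = [g[0] for g in groups if len(g) == 1]
print(groups)
print("outliers:", outliers)
```

Documents that end up alone in a group are the candidates for closer reading, which is where the researcher's knowledge of the sources comes in.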