Tuning CASCOT for improved performance CBS and CASCOT.

Outline of the presentation
- Background
- Developing the index
- Deciding on the input
- Analysing performance and quality
- Using the rules
- Cascot issues

Background: why change our coding process
- Redesign of social surveys
  - CAWI / CATI / CAPI: three modes, one questionnaire
  - Shortening of the interview time
  - Coding system suitable for web-based interviewing
- IT policy
  - No custom-made software applications, only standard tools

Developing the index
Three lists of Dutch occupational job titles coded with ISCO 2008:
- Euroccupations: 1600 job titles
- National classification: job titles
- National classification extended: job titles
Tested with 2 input files:
- Two years of answers to the open question on occupation from respondents of the labour force survey (LFS)
- Top 1000 most frequently occurring job titles

Developing the index

                                  Input 1: top 1000     Input 2: LFS 2004, 2005
                                  n        %            n        %
Index file 1: 1600 job titles
  score [...]                     [...]    [...]        299      1%
  score 70 and higher             337      35%          3814     8%
  score 40 and higher             642      66%          [...]    [...]
  score 0                         [...]    [...]        5028     10%
Index file 2: [...] job titles
  score [...]                     [...]    [...]        715      1%
  score 70 and higher             473      49%          9903     20%
  score 40 and higher             861      88%          [...]    [...]
  score 0                         30       3%           1669     3%
Index file 3: [...] job titles
  score [...]                     [...]    [...]        593      1%
  score 70 and higher             487      50%          [...]    [...]
  score 40 and higher             882      90%          [...]    [...]
  score 0                         23       2%           1378     3%
Total                             [...]                 [...]

Developing the index
- Index twice as large (30 instead of 19 thousand entries): performance increased by only a few percentage points
- Index with 10 times as many entries (19 instead of 1.6 thousand): performance only 2 times higher
- Approximately 5000 job titles were selected for further development:
  - Titles with an exact match to answers of respondents
  - Titles relevant for coding the 1000 most frequently occurring answers
  - Supplementary, more detailed titles for answers that are often too vague to code to ISCO 2008 unit groups: researcher, advisor, engineer, account manager
  - Euroccupations list of 1600 job titles

Deciding on the input to use for automatic coding

Input file 1: occupation
Input file 2: occupation + tasks
Input file 3: occupation + NACE
Input file 4: occupation + NACE + tasks

Performance               Input file 1    Input file 2    Input file 3    Input file 4
  score [...]             [...]           [...]           [...]           [...]
  score 70 and higher     [...]           2250 (4%)       1988 (4%)       [...]
  score 40 and higher     [...]           [...]           [...]           [...]
  score 0                 706 (1%)        43 (0%)         [...]           [...]
  total                   50042

Quality, score 40 and higher
  4 digits correct        7494 (20%)      6534 (25%)      5432 (21%)      5237 (24%)
  3 digits correct        [...]           9021 (34%)      7480 (29%)      7210 (33%)
  total                   [...]

Input for automatic coding
- Adding tasks to the occupational job title improves quality but leads to a decrease in performance
- Adding NACE to job title and tasks does not improve quality compared to just adding tasks
- Develop a process that makes optimal use of the information in the automatic coding steps

Overview of coding process, occupation
Automatic coding, ISCO 2008 unit group level:
- Step 1: Coding based on occupation
- Step 2: Coding based on occupation and main tasks
- Step 3: Coding based on decision rules using occupation, NACE and managerial tasks
Manual coding, ISCO 2008, at all aggregation levels of the classification:
- Step 4: Manual coding of the remaining portion

Developing the index and rules
Aim in further testing:
- Performance: at least 60% coded automatically
- Quality: at most 5% of records coded wrong
Performance was analysed with three input files for each new version of the classification file:
- Input 1: top 4000 most frequently occurring job titles
- Input 2: all job titles collected in 8 years of LFS
- Input 3: all job titles combined with tasks in 8 years of LFS
Quality was analysed on the top 4000 and on a random selection of 4000 records (inputs 2 and 3).
66% of all respondents have a job title belonging to the top 4000, so improvement was focused on the top 4000.
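
The two targets on this slide can be checked with a few lines of code. The following is a minimal sketch, not the procedure used at CBS: the record layout, the function names and the choice to measure errors among automatically coded records only are assumptions made for illustration, and the sample codes are made up.

```python
from dataclasses import dataclass


@dataclass
class CodedRecord:
    score: int       # match score, assumed to run from 0 to 100
    assigned: str    # code assigned automatically
    reference: str   # code an expert would assign (hypothetical gold standard)


def performance(records, threshold):
    """Share of all records coded automatically (score >= threshold)."""
    if not records:
        return 0.0
    return sum(r.score >= threshold for r in records) / len(records)


def error_rate(records, threshold):
    """Share of automatically coded records whose code differs from the reference."""
    accepted = [r for r in records if r.score >= threshold]
    if not accepted:
        return 0.0
    return sum(r.assigned != r.reference for r in accepted) / len(accepted)


def meets_targets(records, threshold, min_performance=0.60, max_error=0.05):
    """Targets from the slide: at least 60% coded, at most 5% coded wrong."""
    return (performance(records, threshold) >= min_performance
            and error_rate(records, threshold) <= max_error)


# Illustrative records only; the codes are not taken from the presentation.
sample = [
    CodedRecord(100, "4110", "4110"),
    CodedRecord(80, "4110", "4120"),
    CodedRecord(45, "3313", "3313"),
    CodedRecord(10, "", "4110"),
]
print(performance(sample, 40), error_rate(sample, 40), meets_targets(sample, 40))
```

Raising the threshold usually lowers the error rate at the cost of performance, which is the trade-off the score classes in the analysis are meant to expose.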

Analysing quality and performance, top 4000
Classification version, step 1: coding based on occupation, top 4000 most frequent titles.
[Table, including and excluding score 0. Performance columns per score class: number of respondents, percentage of respondents, cumulative number and cumulative percentage of respondents. Quality columns per score class: number coded wrong, cumulative number coded wrong, cumulative percentage coded wrong of the total, cumulative percentage coded wrong of the number coded, percentage coded wrong per score class. Cell values not recoverable.]
[Charts comparing both versions: cumulative percentage coded wrong of respondents with a valid ISCO code (excluding unknown and default); percentage coded wrong per score class.]

Using the rules to improve performance and quality
- Abbreviations
- Replacements
- Alternatives
- Conclusions
- Default coding rules

Top 20 most frequently occurring answers

Administratief medewerker (office clerk): input for automatic coding

Count  Text
7094   ADMINISTRATIEF MEDEWERKER
6160   ADMINISTRATIEF MEDEWERKSTER
1746   ADMINISTRATIEF
1193   ADMINISTRATIE
401    ADM MEDEWERKSTER
380    ADM MEDEWERKER
242    ADMINISTRATIEFMEDEWERKSTER
210    ADMINISTRATIEFMEDEWERKER
152    ADM. MEDEWERKER
140    ADMINSTRATIEF MEDEWERKER
117    ADM MEDEW
116    ADM.MEDEWERKER
115    ADM.MEDEWERKSTER
115    ADMINSTRATIEF MEDEWERKSTER
114    ADM. MEDEWERKSTER
89     ADMINISTRATIEF MEDEW
86     ADMINISTRATIE MEDEWERKER
77     ADM MED
69     ADMINSTRATIEF
65     ADMIN MEDEWERKER
64     ADMINISTRATIEF WERK
53     ADMINISTATIEF MEDEWERKER
52     ADMIN MEDEWERKSTER
52     ADMINSTRATIE
51     ADM. MEDEW.
46     ADM
46     ADMINISTARTIEF MEDEWERKER
46     ADMINISTRATIEVE KRACHT
45     ADMINISTRATIE MEDEWERKSTER
44     ADMINISTARTIEF MEDEWERKSTER
40     ADMINISTRATIEF MEDWERKER
36     ADMINISTRATIEF MEDEWEKSTER
32     ADMINISTATIEF MEDEWERKSTER
32     ADMINISTRATIEF MEDWERKSTER
31     ADMISTRATIEF MEDEWERKER
30     ADM.MEDEW.
29     ADMINISTATIE
26     ADM MDW
26     ADMINISTRATIEF MED
26     ADMINISTRATIEVE MEDEWERKER
25     ADMINISTATIEF
25     ADMINISTRATIEF MEDEWERKER
25     ADMINISTRATIEF MEDEW.
24     ADM MEDW
24     ADMIN. MEDEWERKER
24     ADMINISTRATIEVE MEDEWERKSTER
23     ADMINISTRTIEF MEDEWERKER
22     ADMINISTRATIEVE FUNCTIE
21     ADMINISTRATIEF MEDE
21     ADMINISTRATIEF MEDEWEKER
21     ADMINISTRTIEF MEDEWERKSTER
20     ADMIN
20     ADMINISTRATIEF MEDEWERSTER
19     ADMINISTRATIEF MEDERWERKER
17     ADMINISTRATIEVE WERKZAAMHEDEN
16     ADMINISTARTIEF
16     ADMINISTRAIEF MEDEWERKER
16     ADMINISTRATIEF MEDERWERKSTER
16     ADMINISTRATIEF MEDEWRKSTER

Administratief medewerker: abbreviations

Administratief medewerker: replacements
- Order within the replacement rules
- Order between the rules:
  1. Abbreviations
  2. Replacements
  3. Alternatives
  4. Default coding
- The text that is substituted in should be written the same way in the rules that follow (mind the spaces!)
- The text that is substituted in should also be the text used in the index (mind the spaces!)
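
The effect of applying abbreviations before replacements can be illustrated in a few lines. This is only a sketch under assumptions: the rule tables and the function name are hypothetical, the entries are picked from the variants listed earlier, and the code does not reproduce Cascot's own rule syntax or matching behaviour.

```python
# Hypothetical rule tables for illustration; not Cascot's rule file format.
ABBREVIATIONS = {
    "ADM": "ADMINISTRATIEF",
    "MEDEW": "MEDEWERKER",
    "MDW": "MEDEWERKER",
}
# Replacements are applied in list order, after the abbreviations.
REPLACEMENTS = [
    ("ADMINISTRATIEFMEDEWERKER", "ADMINISTRATIEF MEDEWERKER"),
    ("ADMINSTRATIEF", "ADMINISTRATIEF"),
]


def normalise_title(text):
    """Expand abbreviations word by word, then apply replacements in order."""
    # Treat full stops as separators so "ADM.MEDEW." becomes two words.
    words = text.upper().replace(".", " ").split()
    expanded = " ".join(ABBREVIATIONS.get(word, word) for word in words)
    for old, new in REPLACEMENTS:
        expanded = expanded.replace(old, new)
    # Collapse whitespace so later rules and the index see one spelling.
    return " ".join(expanded.split())


for answer in ["adm. medew.", "ADM.MEDEWERKER", "ADMINISTRATIEFMEDEWERKER",
               "ADMINSTRATIEF MEDEWERKER"]:
    print(answer, "->", normalise_title(answer))
```

Because every rule writes the same spelling, with the same spaces, the later rules and the index only need to know one form of the title.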

Administratief medewerker: conclusions
Step 1: Coding based on occupation
Step 2: Coding based on occupation and main tasks
  - All records with score <40
  - All records that cannot conclude

Word alternatives

Step 3: default coding rules -> decision rules
Step 1: Coding based on occupation
Step 2: Coding based on occupation and main tasks
  - All records with score <40
  - All records that cannot conclude
Step 3: Coding based on decision rules using occupation, NACE and managerial tasks
  - All records with decision code
  - All records with score <70 and decision code in step 1 or 2
Manual coding
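
One possible reading of this step diagram can be written as a small routing function. It is a sketch under stated assumptions, not the CBS implementation: the `StepResult` fields and the stage names are hypothetical, and where the slide lists both "all records with decision code" and "score <70 and decision code" for step 3, the sketch takes the narrower condition.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StepResult:
    score: int                           # match score from the current step
    concluded: bool = True               # False if the conclusion rules could not conclude
    decision_code: Optional[str] = None  # set when a default coding rule fired


def next_stage(step, result):
    """Decide where a record goes after automatic step 1 or 2."""
    # Assumption: a decision code with a score below 70 sends the record to step 3.
    if result.decision_code is not None and result.score < 70:
        return "step 3"
    # Thresholds from the slide: score below 40, or no conclusion, moves on.
    if result.score < 40 or not result.concluded:
        return "step 2" if step == 1 else "manual"
    return "accept"


print(next_stage(1, StepResult(35)))
print(next_stage(1, StepResult(60, decision_code="D1")))
print(next_stage(2, StepResult(20)))
```

Records that step 3 cannot resolve would then fall through to manual coding, which the sketch leaves out.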

Adjustments to facilitate manual coding
- No conclusions and default coding rules
- ISCO-08 code as an index entry: fewer clicks are needed to look up the correct ISCO unit group in the tree. Now: enter the code, then accept.
- Coding experts' wish: always show the ancillary content of the input record instead of only after clicking the button; they want to see the information for each title.
- Coding at a more aggregated level of ISCO-08 (structure and index file)
- Index entries at a more aggregated level

Cascot, issues for further development
- Index and rules: in Dutch, two (or more) words describing an occupation are often combined without a space, though there are exceptions. Cascot appeared sensitive to spaces in the rules and index, sometimes leading to unexpected results. Separating the words with a space consistently throughout index and rules was beneficial for performance and quality.
- Rules: ‘if the text’ contains/is ‘the word’ or ‘the phrase’. Maybe another option, ‘part of a word’, could be included to cope with the spelling rules with regard to spaces.
- Equivalent word ends: would it be possible to create sets of word ends, such as machine/apparaat and wagen/auto? Not all words ending with ‘machine/apparaat’ should be considered equal to words ending with ‘auto/wagen’.
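
The consistent use of spaces described in the first point could be prepared outside Cascot, for example when building the index and cleaning the input. The sketch below is an assumption-laden illustration: the list of head words and the function names are hypothetical, and a real list would need review because not every word ending in one of these strings is a compound.

```python
# Hypothetical list of word ends that should always be written as a separate word.
HEAD_WORDS = ("MEDEWERKER", "MEDEWERKSTER", "MANAGER")


def split_compound(token, heads=HEAD_WORDS):
    """Insert a space before a known head word at the end of a longer token."""
    for head in sorted(heads, key=len, reverse=True):
        if token.endswith(head) and len(token) > len(head):
            return f"{token[:-len(head)]} {head}"
    return token


def normalise_spaces(text):
    """Apply the same spacing to index entries, rules and respondent answers."""
    return " ".join(split_compound(token) for token in text.upper().split())


print(normalise_spaces("administratiefmedewerker"))
print(normalise_spaces("ACCOUNTMANAGER"))
```

Running the same function over index entries, rule texts and answers keeps the three in step, which is the consistency the slide reports as beneficial.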

Thank you for your attention!
Sue Westerman