Evaluation a. Why / when b. Evaluation representations and techniques

Outline
a. Why / when
b. Evaluation representations and techniques: user-based; (expert) knowledge-based; analytical; norms and standards; technical
c. Summary

a. Why evaluate and test?
Usability according to ISO:
- Effectiveness: does it work for prospective users?
- Efficiency: how much (time, effort) does it cost them?
- Satisfaction: their subjective reaction
Evaluation improves the design:
- User-centered: is this web site useful and usable for the intended users?
- It is the cheapest way to repair errors: the earlier, the better
- Involving users and clients promotes acceptance of the product

Why evaluate and test early?
Cost of fixing an error, by phase (source: Hawksmere, ISO seminar material):
Analysis & Design: $1,000
Implementation: $6,000
Maintenance: $60,000

When to evaluate?
[Figure: evaluation activities along the project phases Discovery, Analysis, Elaboration, Construction, Transition and Maintenance. Labels: Target Group Analysis, Focus Group Sessions, Concept Testing, Intermediate Usability Testing, Active Usability Testing, User Involvement, Remote Usability Testing, Expert Involvement, Expert Review, Surveys, Continuous Usability Evaluation.]
Notes: Evaluation is continuous. Different kinds of evaluation suit different steps in the design process. They are not described in detail here, but evaluation is an essential part of designing. It can range from asking a colleague to look at your web site to an extensive empirical observation, and it is always useful.

When to evaluate?
Early in the design process:
- Conceptual (goal, tasks, type of user, concept of the web site, etc.)
- No web site-specific tasks yet
Later:
- Specific tasks are known, so they can be tested
- Too late for conceptual errors
- Web site-specific tasks: navigation, labels, walking through screens, etc.

b. Evaluation representations and techniques
Evaluation is based on representations (models of the system):
- Formal representations, to be used by the design team: CCT, ETAG, GOMS, NUAN, …
- Representations for users, clients, and expert colleagues: scenario; simulation and mock-up; interactive prototype

Evaluation in design phases
- Scenario and simulation: claims analysis
- Prototype: cognitive walkthrough
- Prototype and implemented system: heuristic evaluation; objective observation (usability lab); subjective usability evaluation; mental representation and activity (hermeneutic techniques)
- Implemented system: standards (ISO), performance measures

Types of evaluation techniques
1. User-based (the user)
2. Knowledge-based (experience and knowledge)
3. Analytical (statistical data)
4. Norms and standards
5. Technical (code, implementation): not elaborated here ("engineering expertise")
Notes: There are many kinds of evaluation. This is one categorization, but the categories often overlap. For example, a combination of questionnaire and interview over a large group falls under both 3 and 1.

1. User-based
User-centered design: involve the user in the design, in different ways:
- Interview (individual)
- Focus group (8-10 participants)
- Observation (individual)
Notes: Interview: you already use interviews in the conceptual phase, as the basis for tasks, goals and types of users. Afterwards you can conduct interviews again to get feedback. Focus group: a group discussion around a particular topic or a concrete result. Not the web site itself, because that is impractical, but for example a paper prototype that you can easily copy. Observation: put someone in front of the web site or an interactive prototype. More on this shortly.

1. User-based: what do you evaluate?
Intermediate results:
- Information architecture (e.g. card sorting)
- Wireframes
- Graphic design
- Screenshots
- Etc.
Prototype:
- Paper
- Interactive mock-up (e.g. clickable PowerPoint)
- Working web site

1. User-based, example: focus group

1. User-based: observation
Types of observation:
Assignments with pre-selected tasks (in a usability laboratory):
+/- Controlled environment
+ Specific procedure
+ Easy to record
- User feels "watched"
- Users give up less quickly
"Normal" use (field study)
Notes: Empirical: more control, more focused on quantitative data. Observational: more natural, more qualitative. Quantitative/qualitative: what do you note or measure? Formal: strict procedure, specific feedback from the user. Informal: freer, more input from the user. Empirical studies are often more formal and observational studies often more informal, but not necessarily.

A typical usability lab
[Figure: floor plan with an Observation Room and a Study Room. Labels: AV, mobile devices, dual display, DigiTV, one-way mirror, video camera mounted on ceiling, sound-proof walls.]

The observation room

The user room

1. User-based: observation
Based on tasks that are typical for the user (cf. scenarios and flowcharts).
Qualitative: what kinds of problems does the user run into? Also record opinions and remarks.
- Is the task feasible?
- How long does the user take?
- If it does not go right the first time, where does the user look next?
- Which words does the user not understand?
- Which elements stand out immediately and which do not?
- What does the user click on?
- How is the scroll bar used?
Quantitative: usability metrics per task (time, number of errors, number of steps, number of tasks, etc.)
Notes: Qualitative problems: poor navigation, unclear labels or menus, poor readability, etc. Usability metrics: try to find units of measurement, i.e. variables that say something about how good your web site is. You can then use these for statistics.
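The per-task usability metrics named on this slide (time, number of errors, number of steps) could be tabulated along the following lines. This is only a sketch: the `TaskObservation` record, its field names and the `summarize_task` helper are hypothetical, and the completion rate is an extra measure that the slide does not list.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class TaskObservation:
    """One participant's attempt at one test task (hypothetical record layout)."""
    participant: str
    task: str
    seconds: float   # time on task
    errors: int      # number of errors observed
    steps: int       # number of steps (clicks, page views) used
    completed: bool  # assumption: the observer notes whether the task succeeded


def summarize_task(observations, task):
    """Return simple per-task usability metrics for the named task."""
    rows = [o for o in observations if o.task == task]
    if not rows:
        raise ValueError(f"no observations for task {task!r}")
    return {
        "participants": len(rows),
        "completion_rate": sum(o.completed for o in rows) / len(rows),
        "mean_seconds": mean(o.seconds for o in rows),
        "mean_errors": mean(o.errors for o in rows),
        "mean_steps": mean(o.steps for o in rows),
    }


observations = [
    TaskObservation("p1", "find price", 40.0, 1, 6, True),
    TaskObservation("p2", "find price", 60.0, 3, 10, False),
]
summary = summarize_task(observations, "find price")
```

With only a handful of participants, such averages are best read as indications alongside the qualitative notes rather than as firm statistics.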

1. User-based: field study
Types of observation:
Assignments with pre-selected tasks (in a usability laboratory):
+/- Controlled environment
+ Specific procedure
+ Easy to record
- User feels "watched"
- Users give up less quickly
"Normal" use (field study):
+ Natural setting and natural motivation
+/- Unforeseen events occur
+ Freer procedure
- Harder to record
- Little room for observers

Test tasks
Tasks, not functionalities:
GOOD: "Where can you buy the new Harry Potter book?"
WRONG: "Search the legislation section for the rent-subsidy conditions in the housing regulations"
A question, not an instruction:
Example (web site): "How much does this product cost?" Not: "Find the product information"
Give the user freedom in how to carry out the task.
Tasks must be realistic and typical (cf. scenarios).
Tasks must "cover" the product reasonably well: different aspects / parts / functionality.
Usually 10-15 tasks (45 minutes).

2. Knowledge-based evaluation
Based on the knowledge and experience of designers:
- Cognitive walkthrough
- Heuristic evaluations
- Checklists
Notes: Experience and knowledge sit in the designers' heads, and designers are often not aware of them.

Expert evaluation: Cognitive walkthrough
Definition: "finding usability problems in a user interface by having a small set of evaluators examine the interface and give an opinion for each step in the dialogue for a selected set of scenarios"
Evaluators: user interface specialists, not from the design team

Cognitive walkthrough
Specify scenarios for possibly problematic interactions, at the level of single user and system actions.
Ask the evaluator to answer a small set of standard questions for each step.
Example question set:
- What would a normal user do in this situation?
- Why (based on what information or knowledge)?
- What would the user expect the system to do next?

Cognitive walkthrough
Problems:
- Not possible to consider all possible scenarios
- No information on recovery from errors
- The time aspect is not considered
Benefits:
- Very early indications of problems with the representation of information and with consistency

Cognitive walkthrough
A systematic method for walking through the site. Carry out a typical task on the site (or prototype) and check whether an "average" user could carry out all the steps involved. Can be carried out by one person (the designer).

It consists of a number of steps:
1. Define the target group for the test
2. Create realistic scenarios
3. Walk through the scenarios with "the four questions"
4. Analyze each scenario and propose design improvements
Steps 1 and 2 have already been done in the task analysis.

The four questions for analyzing each step of the scenarios:
1. What does the user want to achieve as the next step in this situation?
2. What does the user think they must do now?
3. Why does the user think this is the right action?
4. What system response does the user expect?
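To apply the four questions consistently, an evaluator could generate a plain-text worksheet that repeats them for every step of a scenario. The sketch below is illustrative only: the function name `walkthrough_worksheet`, the worksheet layout and the example scenario are assumptions, not part of the method as presented.

```python
# The four walkthrough questions, asked at every step of a scenario.
FOUR_QUESTIONS = (
    "What does the user want to achieve as the next step in this situation?",
    "What does the user think they must do now?",
    "Why does the user think this is the right action?",
    "What system response does the user expect?",
)


def walkthrough_worksheet(scenario, steps):
    """Return a plain-text worksheet listing the four questions under each step."""
    lines = [f"Scenario: {scenario}"]
    for number, step in enumerate(steps, start=1):
        lines.append(f"Step {number}: {step}")
        for q_number, question in enumerate(FOUR_QUESTIONS, start=1):
            lines.append(f"  Q{q_number}. {question}")
    return "\n".join(lines)


# Hypothetical scenario with two steps.
sheet = walkthrough_worksheet(
    "Buy a book",
    ["Open the search page", "Type the title and submit"],
)
print(sheet)
```

The evaluator fills in an answer under each question. Steps where the answers are hesitant or contradict the design are the candidates for design improvements.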

Heuristic evaluation
Heuristic = rule of thumb. Heuristics guarantee basic usability in most cases.
Based on particular aspects and principles, e.g. functionality, dialogue, representation, …
- Can be done by one usability specialist
- Can be done with a group
- Several people provide complementary insights
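When several people evaluate the same design, their findings have to be combined somehow. One possible way to do that is sketched below. The data layout (a dictionary from evaluator name to a list of `(heuristic, problem)` pairs), the function name `merge_findings` and the example findings are all hypothetical.

```python
from collections import defaultdict


def merge_findings(findings_by_evaluator):
    """Map each (heuristic, problem) pair to the sorted list of evaluators who reported it."""
    merged = defaultdict(set)
    for evaluator, findings in findings_by_evaluator.items():
        for heuristic, problem in findings:
            merged[(heuristic, problem)].add(evaluator)
    return {key: sorted(names) for key, names in merged.items()}


# Hypothetical reports from two evaluators.
reports = {
    "anna": [("Consistency", "two different labels for the shopping cart")],
    "ben": [
        ("Consistency", "two different labels for the shopping cart"),
        ("Error prevention", "no confirmation before deleting an order"),
    ],
}
merged = merge_findings(reports)
```

Problems found by only one evaluator stay in the merged list, since each evaluator may notice something the others missed.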

Heuristic Evaluation (Nielsen)
Visibility of system status: The system should always keep users informed about what is going on, through appropriate feedback within reasonable time.
Match between system and the real world: The system should speak the users' language, with words, phrases and concepts familiar to the user, rather than system-oriented terms. Follow real-world conventions, making information appear in a natural and logical order.
User control and freedom: Users often choose system functions by mistake and need a clearly marked "emergency exit" to leave unwanted states without having to go through an extended dialogue.
Notes: These are a number of principles; I will not read them all out. Brinck's book lists a large number of them.

Heuristic Evaluation
Consistency and standards: Users must not wonder whether different words, situations, or actions mean the same thing.
Error prevention: Even better than good error messages is a careful design which prevents a problem from occurring in the first place.
Recognition rather than recall: Make objects, actions, and options visible. The user should not have to remember information from one part of the dialogue to another. Instructions for use of the system should be visible or easily retrievable whenever appropriate.
Flexibility and efficiency of use: Accelerators, unseen by novices, may speed up interaction for experts so that systems can cater to both inexperienced and experienced users. Let users tailor frequent actions.
Notes: Consistency came up earlier, with graphic design and navigation. Identical elements must also have the same function and meaning.

Heuristic Evaluation
Help users recognise, diagnose, and recover from errors: Express error messages in plain language (no codes), precisely indicate the problem, and constructively suggest a solution.
Help and documentation: Even though systems are best used without documentation, it may be necessary to provide help. This help should not be too large, and it should be easy to search, focused on user tasks, and list concrete steps to be carried out.
Aesthetic and minimalist design: Dialogues should not contain information that is irrelevant or rarely needed. Every extra unit of information in a dialogue competes with relevant units of information and diminishes their relative visibility.

Expert evaluation: Heuristic evaluation checklist Roe & Arnold

Heuristic evaluation: make errors caused by system limitations self-explanatory

Heuristic evaluation

3. Analytical evaluation
Quantitative, based on numbers.
Questionnaires:
- Send to as many people as possible
- Subjective!
Hit logs:
- An extensive site meter
- Page hits + transfer rates: which pages are visited most, and from where do visitors go where?
- Interpretation is speculative
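The two hit-log questions (which pages are visited most, and from where do visitors go where) can be answered with simple counting. The sketch below assumes the log has already been split into sessions, each an ordered list of page paths. That input format, the function name `summarize_hits` and the example paths are hypothetical, since real server logs need parsing first.

```python
from collections import Counter


def summarize_hits(sessions):
    """Count page hits and page-to-page transitions over all sessions.

    `sessions` is a list of visits, each an ordered list of page paths.
    """
    page_hits = Counter()
    transitions = Counter()
    for pages in sessions:
        page_hits.update(pages)
        # Pair each page with the page visited directly after it.
        transitions.update(zip(pages, pages[1:]))
    return page_hits, transitions


# Hypothetical sessions reconstructed from a log.
sessions = [
    ["/", "/products", "/cart"],
    ["/", "/products"],
    ["/about"],
]
page_hits, transitions = summarize_hits(sessions)
```

Here `page_hits.most_common()` ranks the pages by visits and `transitions.most_common()` ranks the from-to pairs. The counts show what happened, not why, so interpretation stays speculative.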

Subjective evaluation techniques
Not less reliable than objective techniques. Examples:
- SUMI: software usability
- SMEQ: mental effort
- ESPRIT MUSIC project
- ISA: mental load (instantaneous self assessment)

SUMI (licence needed) http://www.megataq.mcg.gla.ac.uk/sumi.html
- 50 statements on a software system, 5 sub-scales
- For experienced users, in standard working conditions
- Diagnosis of usability problems requires at least 10 users
- Sub-scales: Efficiency; Affect; Helpfulness; Control; Learnability
- Global score: perceived usability

SUMI
- Scoring through "stencils"
- Standard scores, based on large samples of industrial product evaluation
- Reliable interpretation requires a sample of at least 10 users who "know" the product in its normal context of use
- Diagnosed: < action needed > acceptable software > good software
- For individual users or individual questions, see the manual

Analytical evaluation: SUS
System Usability Scale (SUS): measuring web site usability. Digital Equipment Corporation, 1986, John Brooke. A quick and valid tool, based on ISO and the European Community ESPRIT project "MUSiC".

SUS
Originally aimed at "software systems", to be used after users got to "know" the system in a real-life context.
Later adapted to web sites.
Validity: correlates well with well-established, more time-consuming general usability scales (e.g. SUMI).

SUS scoring
Items 1, 3, 5, 7, 9: strongly disagree = 0, etc., up to strongly agree = 4
Items 2, 4, 6, 8, 10: strongly disagree = 4, etc., down to strongly agree = 0
Add the scores and multiply the total by 2.5: total score range 0-100
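The scoring rule can be written as a short function. This sketch assumes each respondent's answers are recorded on the usual 1-5 agreement scale (1 = strongly disagree, 5 = strongly agree). That coding is an assumption about how the questionnaire was captured, and the name `sus_score` is illustrative.

```python
def sus_score(responses):
    """Return the SUS score (0-100) for one respondent.

    `responses` holds the ten answers in questionnaire order, each coded
    1 (strongly disagree) to 5 (strongly agree); this coding is an assumption.
    """
    if len(responses) != 10:
        raise ValueError("SUS needs exactly 10 item responses")
    total = 0
    for item, answer in enumerate(responses, start=1):
        if not 1 <= answer <= 5:
            raise ValueError(f"item {item}: answer must be between 1 and 5")
        if item % 2 == 1:
            # Items 1, 3, 5, 7, 9: strongly disagree scores 0, strongly agree 4.
            total += answer - 1
        else:
            # Items 2, 4, 6, 8, 10: strongly disagree scores 4, strongly agree 0.
            total += 5 - answer
    return total * 2.5


example = sus_score([4, 2, 5, 1, 4, 2, 4, 1, 5, 2])  # 85.0
```

To judge a web site, the per-respondent scores are then averaged over all participants.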

SUS reliability
At least 15 users who have used the web site for some realistic tasks in "natural" conditions will lead to repeatable results.

SUS: examples of tasks
Task 1: Your digital camera uses SmartMedia cards. Find the least expensive external reader (USB) for your PC that will read them.
Task 2: You do lots of hiking. Find the least expensive personal GPS with map capability and at least 8 MB of memory.

For these websites: Finance.yahoo.com

SMEQ (no license needed) http://www.megataq.mcg.gla.ac.uk/smeq.html
- Cognitive workload for single tasks performed with a system
- For experienced users, under standard conditions of use
- Sample size 10 or more
- Very simple scoring, very reliable (r = .82)
ISA (no license needed)
- An easier way to measure, with simple rating buttons, though less reliable?

[Figure: SMEQ and ISA rating scales. Tick labels 50, 100 and 150, with anchors from top to bottom: exceptional, very strong, strong, fair, reasonable, somewhat, a little, hardly, not at all.]

SMEQ Dutch version

4. Norms & standards
ISO 9241: colors; non-keyboard input devices; usability principles; information presentation; user guidance; menus; command interfaces; direct manipulation; form filling; natural language interfaces

c. Summary
Testing/evaluating is an essential and important part of the design process:
- It yields an improved, and therefore good, design
- It prevents errors
- It promotes acceptance
Testing can take place both early and late in the process.
There are different kinds of tests you can use.
Use the right tests at the right moment.
