Probability Theory and Sampling Theory

1 Probability Theory and Sampling Theory
Pieter van Gelder, TU Delft. IVW course, 16 September 2003

2 The basics of probability theory as the foundation for the course; estimation of distribution parameters; sampling theory, working both with and without prior information (Bayesian versus classical sampling); dependencies between variables and risks.

3 Inspection in Civil Engineering

4

5 Stochastic variables

6 Outline: What is a stochastic variable? Probability distributions. Fast characteristics. Distribution types. Two stochastic variables. Closure.

7 Stochastic variable: a quantity that cannot be predicted exactly (uncertainty) because of natural variation, shortage of statistical data, or schematizations. Examples: strength of concrete, water level above a tunnel, lifetime of a chisel, throw of a die.

8 Relation to events: express uncertainty in terms of probability
Probability theory is related to events, so connect the value of the variable to an event. E.g. the probability that the stochastic variable X is less than x, is greater than x, is equal to x, is in the interval $[x, x+\Delta x]$, etc.

9 Probability distribution
Probability distribution function = probability $P(X \le x)$: $F_X(x) = P(X \le x)$
[Figure: $F_X(x)$ plotted against x, vertical axis from 0 to 1]

10 Probability density
The familiar form of a probability 'distribution' is the probability density function.

11 Probability density
Differentiation of F with respect to x: $f_X(x) = dF_X(x)/dx$, where f is the probability density function, so that $f_X(x)\,dx = P(x < X \le x+dx)$.
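The relation between density and interval probability can be checked numerically. The sketch below assumes a standard normal variable (the slide does not fix a distribution, so this choice and the function names are illustrative) and compares the difference quotient of F with the density.

```python
import math

def normal_cdf(x, mu=0.0, sigma=1.0):
    """Distribution function F_X(x) = P(X <= x) of a normal variable."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density f_X(x) = dF_X(x)/dx of a normal variable."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def interval_probability(x, dx):
    """P(x < X <= x + dx), computed from the distribution function."""
    return normal_cdf(x + dx) - normal_cdf(x)

x, dx = 0.5, 1e-4
approx_density = interval_probability(x, dx) / dx   # P(x < X <= x+dx) / dx
print(approx_density, normal_pdf(x))                # the two values nearly agree
```

For small dx the two printed numbers agree to several decimals, which is what $f_X(x)\,dx = P(x < X \le x+dx)$ states.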

12 [Figure: upper panel $F_X(x)$ with $P(X \le x)$ marked at x and $P(x < X \le x+dx)$ as the increase between x and x+dx; lower panel $f_X(x)$ with $P(x < X \le x+dx)$ as the area between x and x+dx]

13 [Figure: upper panel $F_X(x)$ with $P(X \le x)$ marked at x; lower panel $f_X(x)$]

14 Discrete and continuous
Discrete variable: probabilities $p_X(x)$ and (cumulative) probability distribution $F_X(x)$ for x = 1, 2, ..., 6.
Continuous variable: probability density $f_X(x)$ and (cumulative) probability distribution $F_X(x)$.
[Figure: four panels, probability density on the left and (cumulative) probability distribution on the right]

15 Fast characteristics
$\mu_X$: mean, indication of location. $\sigma_X$: standard deviation, indication of spread.
[Figure: density $f_X(x)$ with $\mu_X$ and $\sigma_X$ marked]

16 Fast characteristics
Mean $\ne$ location of maximum (mode).
[Figure: skewed density $f_X(x)$ with $\mu_X$ and $\sigma_X$ marked]

17 Fast characteristics
Mean (centre of gravity), variance, standard deviation, coefficient of variation.
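The formulas belonging to this slide are not in the transcript. The standard textbook definitions for a continuous variable are given below; the symbol $V_X$ for the coefficient of variation is a notation chosen here, not taken from the slides.

```latex
\begin{align*}
\mu_X      &= E(X) = \int_{-\infty}^{\infty} x\, f_X(x)\, dx
            && \text{mean (centre of gravity)} \\
\sigma_X^2 &= E\big[(X-\mu_X)^2\big] = \int_{-\infty}^{\infty} (x-\mu_X)^2 f_X(x)\, dx
            && \text{variance} \\
\sigma_X   &= \sqrt{\sigma_X^2}
            && \text{standard deviation} \\
V_X        &= \sigma_X / \mu_X
            && \text{coefficient of variation}
\end{align*}
```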

18 Normal distribution
Completely determined by mean and standard deviation.
[Figure: normal densities $f_X(x)$ with $\mu_X$ and $\sigma_X$ marked]

19 Normal distribution Probability density function
Standard normally distributed variable (often denoted by u):
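The formulas of this slide are missing from the transcript. The usual textbook forms, written with the mean $\mu_X$ and standard deviation $\sigma_X$, are:

```latex
\begin{align*}
f_X(x) &= \frac{1}{\sigma_X \sqrt{2\pi}}
          \exp\!\left[-\frac{(x-\mu_X)^2}{2\sigma_X^2}\right] \\
u      &= \frac{X-\mu_X}{\sigma_X}, \qquad
f_U(u)  = \frac{1}{\sqrt{2\pi}} \exp\!\left(-\tfrac{1}{2}u^2\right)
\end{align*}
```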

20 Normal distribution: why so popular?
Central limit theorem: the sum of many independent variables with arbitrary distributions is (almost) normally distributed. Convenient in structural reliability calculations.
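The central limit theorem can be illustrated with a small simulation. The sketch below sums 30 uniform variables (the number of terms, the uniform distribution and the seed are assumptions made for illustration) and checks how much of the standardized sum falls within one and two standard deviations.

```python
import random

def standardized_sum(n_terms, rng):
    """Sum of n_terms uniform(0, 1) variables, shifted and scaled to mean 0, std 1."""
    total = sum(rng.random() for _ in range(n_terms))
    mean = n_terms * 0.5                  # mean of a uniform(0, 1) variable is 1/2
    std = (n_terms / 12.0) ** 0.5         # variance of a uniform(0, 1) variable is 1/12
    return (total - mean) / std

def fraction_within(samples, k):
    """Fraction of samples with absolute value below k."""
    return sum(1 for s in samples if abs(s) < k) / len(samples)

rng = random.Random(2003)                 # fixed seed so the run is repeatable
samples = [standardized_sum(30, rng) for _ in range(20000)]

# A standard normal variable has about 68.3% within 1 and 95.4% within 2.
print(fraction_within(samples, 1.0), fraction_within(samples, 2.0))
```

Although each term is uniformly distributed, the printed fractions come out close to the normal values.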

21 Two stochastic variables
joint probability density function

22 Contour map probability density
[Figure: contour lines of the joint probability density in the x-y plane]

23 Two stochastic variables
Relation to events
[Figure: area element $d\eta\, d\xi$ in the plane of the two variables]

24 Example
Health survey. Measurements of: length, weight.
[Figure: probability density (1/m) against length (m); probability density (1/kg) against weight (kg)]

25 Logical contour map?
[Figure: contour map of the joint density, weight (kg) against length (m)]

26 Dependency
[Figure: contour map of the joint density, weight (kg) against length (m)]

27 Fast characteristics
Location: $\mu_X$, $\mu_Y$ (means). Spread: $\sigma_X$, $\sigma_Y$ (standard deviations). Dependency: $cov_{XY}$ (covariance) and $\rho_{XY} = cov_{XY} / (\sigma_X \sigma_Y)$ (correlation, between -1 and 1).
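As an illustration of these characteristics, the sketch below computes the sample covariance and correlation for a made-up length and weight survey. All numbers in the data generation are hypothetical and only serve to produce dependent data.

```python
import math
import random

def mean(values):
    return sum(values) / len(values)

def covariance(xs, ys):
    """Sample covariance cov_XY (divides by n - 1)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

def std(values):
    """Sample standard deviation, the square root of cov_XX."""
    return math.sqrt(covariance(values, values))

def correlation(xs, ys):
    """rho_XY = cov_XY / (sigma_X * sigma_Y), always between -1 and 1."""
    return covariance(xs, ys) / (std(xs) * std(ys))

# Hypothetical survey: weight depends on length plus independent scatter.
rng = random.Random(1)
lengths = [rng.gauss(1.75, 0.10) for _ in range(5000)]                 # metres
weights = [75.0 + 60.0 * (l - 1.75) + rng.gauss(0.0, 8.0) for l in lengths]  # kg

print(correlation(lengths, weights))
```

With these assumed numbers the theoretical correlation is 0.6, and the sample value lands near it.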

28 Independent variables
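The content of this slide is not in the transcript. The standard definition of independence, and its consequence for the dependency measures, reads:

```latex
\begin{align*}
f_{XY}(x,y) &= f_X(x)\, f_Y(y) \quad \text{for all } x, y \\
\Rightarrow\quad cov_{XY} &= 0, \qquad \rho_{XY} = 0
\end{align*}
```

The reverse does not hold in general: zero correlation does not imply independence.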

29 Closure of the short introduction to stochastics
What is a stochastic variable? Probability distributions. Fast characteristics. Distribution types. Two stochastic variables.

30 Parameter estimation methods
Given a dataset x1, x2, …, xn and a distribution type F(x|A,B,…), how can the unknown parameters A, B, … be estimated from the data?

31 List of estimation methods
MoM (method of moments), ML (maximum likelihood), LS (least squares), Bayes

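32 MoM: distribution moments = sample moments, $\int x^k f(x)\,dx = \frac{1}{n}\sum_{i=1}^{n} x_i^k$ for k = 1, 2, …
Example: for $F(x) = 1 - \exp[-(x-A)/B]$ the mean is A + B and the standard deviation is B, so $B_{MOM} = std(x)$ and $A_{MOM} = mean(x) - std(x)$.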
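A minimal sketch of the method of moments for the shifted exponential distribution $F(x) = 1 - \exp[-(x-A)/B]$. This distribution has mean A + B and standard deviation B, so equating the first two moments gives B = std(x) and A = mean(x) - std(x). The true parameter values and the seed below are hypothetical and only generate test data.

```python
import random
import statistics

def mom_shifted_exponential(data):
    """Moment estimates (A, B) for F(x) = 1 - exp(-(x - A) / B), x >= A.

    The distribution has mean A + B and standard deviation B, so matching
    the first two moments gives B = std(x) and A = mean(x) - std(x).
    """
    m = statistics.mean(data)
    s = statistics.stdev(data)
    return m - s, s

# Hypothetical true parameters, used only to generate a dataset.
A_TRUE, B_TRUE = 2.0, 3.0
rng = random.Random(42)
data = [A_TRUE + rng.expovariate(1.0 / B_TRUE) for _ in range(20000)]

a_mom, b_mom = mom_shifted_exponential(data)
print(a_mom, b_mom)   # estimates of A and B
```

The estimates should land near the values used to generate the data, with scatter that shrinks as the dataset grows.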

33 Binomial distribution
X~Bin(N,p). The binomial distribution gives the discrete probability distribution of obtaining exactly n successes out of N Bernoulli trials (where the result of each Bernoulli trial is true with probability p and false with probability q = 1-p). The binomial distribution is therefore given by $f_X(n) = \binom{N}{n} p^n q^{N-n}$, n = 0, 1, ..., N.

34 E(X) = Np; var(X)=Npq
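The mean and variance can be verified directly from the probability mass function. The values of N and p in this sketch are example choices.

```python
import math

def binomial_pmf(n, N, p):
    """P(X = n) for X ~ Bin(N, p): C(N, n) * p**n * q**(N - n)."""
    q = 1.0 - p
    return math.comb(N, n) * p**n * q**(N - n)

def binomial_moments(N, p):
    """Mean and variance computed directly from the probability mass function."""
    probs = [binomial_pmf(n, N, p) for n in range(N + 1)]
    mean = sum(n * pr for n, pr in enumerate(probs))
    var = sum((n - mean) ** 2 * pr for n, pr in enumerate(probs))
    return mean, var

N, p = 20, 0.41                  # example values, chosen for illustration
mean, var = binomial_moments(N, p)
print(mean, N * p)               # both equal Np up to rounding
print(var, N * p * (1 - p))      # both equal Npq up to rounding
```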

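35 MoM-estimator of p: $p_{MOM} = \sum x_i / N$
    % MATLAB simulation of the sampling distribution of pMOM.
    % N, M and p are not given on the slide; the values below are example assumptions.
    N = 100; M = 1000; p = 0.41;
    pMOM = zeros(1, M);
    for j = 1:M
        x = zeros(1, N);                   % reset the sample for each repetition
        for i = 1:N
            if rand(1) < p, x(i) = 1; end  % Bernoulli trial with success probability p
        end
        y(j) = sum(x);                     % number of successes in repetition j
        pMOM(j) = y(j) / N;                % moment estimate of p
    end
    hist(pMOM)                             % histogram of the M estimates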

36 Case Study Webtraffic statistics The number of pageviews on websites

37 Statistics on usage of screen sizes
Is it necessary to collect the screen size of every user? Or is it sufficient to inspect the screen size of just N users and still obtain reliable percentages of the screen sizes in use?

38 Assume 41% of the complete population uses size 1024x768
Take an inspection sample of size N = 100, 1000, … and simulate the results by generating the usage from a binomial distribution. Theoretical analysis of the coefficient of variation: $cov = \sqrt{1/p - 1}\; N^{-1/2}$
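The simulation described on this slide can be sketched as follows. The number of repetitions and the seed are assumptions; p = 0.41 and the sample sizes come from the slide.

```python
import math
import random

def theoretical_cov(p, n_users):
    """Coefficient of variation of the estimated fraction: sqrt(1/p - 1) / sqrt(N)."""
    return math.sqrt(1.0 / p - 1.0) / math.sqrt(n_users)

def simulated_cov(p, n_users, n_repeats, rng):
    """Repeat the inspection n_repeats times and return std/mean of the estimates."""
    estimates = []
    for _ in range(n_repeats):
        hits = sum(1 for _ in range(n_users) if rng.random() < p)
        estimates.append(hits / n_users)
    mean = sum(estimates) / n_repeats
    var = sum((e - mean) ** 2 for e in estimates) / (n_repeats - 1)
    return math.sqrt(var) / mean

rng = random.Random(7)
p = 0.41
for n_users in (100, 1000):
    print(n_users, simulated_cov(p, n_users, 2000, rng), theoretical_cov(p, n_users))
```

Each printed line shows the sample size, the simulated coefficient of variation, and the theoretical value, which should nearly coincide.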

39 Coefficients of variation (as a function of p and N)
p        N = 100   N = 1000   N = 10 000   N = 10^6
41.4%    11.75%    3.7%       1.2%         0.1%
39.8%    12.3%     3.9%       1.3%
6.2%     38.9%                             0.4%
5.4%     41.8%     13.2%      4.2%
3.2%     55.0%     17.4%      5.5%         0.55%
(Blank cells are missing in the transcript.)

40 Optimisation of the inspection sample size
Assume the cost of getting screen size information from one user is A, and the cost of a larger cov-value is B per unit of cov. Then $TC(N) = A N + B \sqrt{1/p - 1}\; N^{-1/2}$. The optimal sample size follows from $TC'(N) = A - \tfrac{1}{2} B \sqrt{1/p - 1}\; N^{-3/2} = 0$, giving $N^* = \left(B/(2A)\right)^{2/3} (1/p - 1)^{1/3}$. For this choice of N, $cov = \left(2A/B \cdot (1/p - 1)\right)^{1/3}$.
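The optimum can be checked numerically. Setting the derivative of $TC(N) = A N + B\sqrt{1/p - 1}\,N^{-1/2}$ to zero gives $N^* = (B/(2A))^{2/3}(1/p - 1)^{1/3}$. The cost figures A and B in this sketch are hypothetical.

```python
import math

def total_cost(n, a, b, p):
    """TC(N) = A*N + B*sqrt(1/p - 1)*N**(-1/2)."""
    return a * n + b * math.sqrt(1.0 / p - 1.0) / math.sqrt(n)

def optimal_sample_size(a, b, p):
    """Root of TC'(N) = A - (B/2)*sqrt(1/p - 1)*N**(-3/2) = 0."""
    return (b / (2.0 * a)) ** (2.0 / 3.0) * (1.0 / p - 1.0) ** (1.0 / 3.0)

def cov_at_optimum(a, b, p):
    """Coefficient of variation at N*: (2A/B * (1/p - 1))**(1/3)."""
    return (2.0 * a / b * (1.0 / p - 1.0)) ** (1.0 / 3.0)

# Hypothetical cost figures: A per inspected user, B per unit of cov.
A, B, P = 1.0, 1.0e5, 0.41
n_star = optimal_sample_size(A, B, P)
print(n_star, cov_at_optimum(A, B, P))
```

Evaluating `total_cost` slightly below and above `n_star` gives a higher cost on both sides, which confirms that the stationary point is a minimum.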

41 Case study: container inspection
Admissible 'escape probability' p = 1/1,000 containers. Population consists of containers. Inspection consists of checking containers. Suppose that 1 container from this sample is rejected. Then pMOM = 0.001, but std(pMoM) = 0.0316. If std(pMoM) < 0.001, then inspection of the full population (after all, std(pMoM) = sqrt(pq) sqrt(1/N)).

42 Inspection of the full population (for small p-values)
Inspection costs must be recovered from the revenue from fines. Inspection costs: x K/C. Revenue without inspection: NI (Negative Impact). Revenue with inspection: p x x fine – x K/C. Condition: p x x fine – x K/C > NI.

43 Bayesian statistics
$P(A|B) = P(A \text{ and } B)/P(B)$, hence $P(A|B) = P(B|A)\,P(A)/P(B)$, with A = parameters and B = data.
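A minimal sketch of Bayes' rule with A = parameter and B = data, applied to the fraction p of a binomial distribution on a grid of candidate values. The flat prior, the grid resolution and the data (41 of 100) are assumptions made for illustration.

```python
import math

def grid_posterior(successes, trials, prior, grid):
    """Posterior P(A|B) on a grid of p-values: P(B|A) * P(A) / P(B)."""
    likelihood = [
        math.comb(trials, successes) * p**successes * (1 - p) ** (trials - successes)
        for p in grid
    ]
    joint = [lik * pr for lik, pr in zip(likelihood, prior)]   # P(B|A) * P(A)
    evidence = sum(joint)                                       # P(B)
    return [j / evidence for j in joint]

# Grid of candidate values for the parameter p (midpoints of 200 equal bins).
grid = [(i + 0.5) / 200 for i in range(200)]
prior = [1.0 / len(grid)] * len(grid)      # flat prior: an assumption for illustration

# Hypothetical data: 41 of 100 inspected users have the screen size of interest.
posterior = grid_posterior(41, 100, prior, grid)
posterior_mean = sum(p * w for p, w in zip(grid, posterior))
print(posterior_mean)
```

With a flat prior the exact posterior is a Beta(42, 60) distribution with mean 42/102, so the grid result should be close to that value. An informative prior would replace the `prior` list and shift the posterior towards the prior information.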

