Suppose we have a classification problem with classes labeled 1, ..., c and an additional "doubt" category labeled c + 1. Let r : R^d → {1, ..., c + 1} be a decision rule. Define the loss function


Answer:

The answer and explanation to this question are attached.


Answer:

Explanation:

First, let us simplify the risk under the given loss function. If f(x) = i, where i is not the doubt class (i ≠ c + 1), then the risk is

R(f(x) = i|x) = ∑_{j=1}^{c} L(f(x) = i, y = j) P(Y = j|x)        (2)
              = 0 · P(Y = i|x) + λc ∑_{j≠i} P(Y = j|x)          (3)
              = λc (1 − P(Y = i|x))                              (4)

When f(x) = c + 1, meaning that doubt is chosen, the risk is

R(f(x) = c + 1|x) = ∑_{j=1}^{c} L(f(x) = c + 1, y = j) P(Y = j|x)   (5)
                  = λd ∑_{j=1}^{c} P(Y = j|x)                       (6)
                  = λd                                              (7)

because ∑_{j=1}^{c} P(Y = j|x) = 1, since P(Y = ·|x) is a proper probability distribution.
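The two risk formulas above can be checked numerically. The following is a minimal sketch, not part of the original answer: the function name `conditional_risks`, the example posterior, and the values chosen for λc and λd are all illustrative assumptions.

```python
def conditional_risks(posterior, lam_c, lam_d):
    """Return the conditional risk of each decision 1..c, followed by doubt (c + 1).

    posterior: list of P(Y = j | x) for j = 1..c (must sum to 1).
    lam_c: loss for deciding a wrong class; lam_d: loss for choosing doubt.
    """
    if abs(sum(posterior) - 1.0) > 1e-9:
        raise ValueError("posterior must sum to 1")
    # Deciding class i costs lam_c * (1 - P(Y = i | x)).
    class_risks = [lam_c * (1.0 - p) for p in posterior]
    # Deciding doubt costs lam_d, whatever the posterior is.
    return class_risks + [lam_d]


# Hypothetical example: three classes, lam_c = 1.0, lam_d = 0.25.
risks = conditional_risks([0.7, 0.2, 0.1], lam_c=1.0, lam_d=0.25)
print(risks)
```

The last entry of the returned list is the risk of doubt, so comparing it with the smallest class risk shows directly when doubt is the cheaper decision.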

Now let fopt : R^d → {1, ..., c + 1} be the decision rule that implements (R1)–(R3). We want to show that, in expectation, the rule fopt is at least as good as an arbitrary rule f. Let x ∈ R^d be a data point that we want to classify. Let's examine all the possible scenarios where fopt(x) and another arbitrary rule f(x) might differ:

Case 1: Let fopt(x) = i where i ≠ c + 1.

Case 1a: f(x) = k where k ≠ i. Then we get with (R1) that

R(fopt(x) = i|x) = λc (1 − P(Y = i|x))
                 ≤ λc (1 − P(Y = k|x))
                 = R(f(x) = k|x).

Case 1b: f(x) = c + 1. Then we get with (R1) that

R(fopt(x) = i|x) = λc (1 − P(Y = i|x))
                 ≤ λc (1 − (1 − λd/λc))
                 = λd
                 = R(f(x) = c + 1|x).

Case 2: Let fopt(x) = c + 1 and f(x) = k where k ≠ c + 1. Then:

R(f(x) = k|x) = λc (1 − P(Y = k|x))
R(fopt(x) = c + 1|x) = λd
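The case analysis can also be sanity-checked by brute force. The sketch below is only an illustration under stated assumptions: rules (R1)–(R3) are not reproduced in this text, so `f_opt` assumes the rule "pick the most probable class if its posterior is at least 1 − λd/λc, otherwise choose doubt", which is the threshold used in Case 1b. The helper names, the number of classes, and the loss values are hypothetical.

```python
import random


def risk(decision, posterior, lam_c, lam_d):
    """Conditional risk of a decision in 1..c+1, where c + 1 means doubt."""
    c = len(posterior)
    if decision == c + 1:
        return lam_d
    return lam_c * (1.0 - posterior[decision - 1])


def f_opt(posterior, lam_c, lam_d):
    """Assumed rule: most probable class if confident enough, else doubt."""
    c = len(posterior)
    best = max(range(1, c + 1), key=lambda i: posterior[i - 1])
    if posterior[best - 1] >= 1.0 - lam_d / lam_c:
        return best
    return c + 1


def random_posterior(c, rng):
    """Draw c positive weights and normalise them to sum to 1."""
    weights = [rng.random() + 1e-12 for _ in range(c)]
    total = sum(weights)
    return [w / total for w in weights]


rng = random.Random(0)  # fixed seed so the check is repeatable
lam_c, lam_d = 1.0, 0.3
for _ in range(1000):
    post = random_posterior(4, rng)
    chosen = f_opt(post, lam_c, lam_d)
    # Compare against every alternative decision, including doubt.
    for k in range(1, len(post) + 2):
        assert risk(chosen, post, lam_c, lam_d) <= risk(k, post, lam_c, lam_d) + 1e-12
print("f_opt was never beaten on the sampled posteriors")
```

A passing run is not a proof; it only confirms that the assumed rule behaves as the case analysis predicts on the sampled posteriors.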
